> For the complete documentation index, see [llms.txt](https://upsolver.gitbook.io/content/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://upsolver.gitbook.io/content/reference-1/sql-commands/jobs/create-job/transformation/insert.md).

# INSERT

An `INSERT` job defines a query that pulls in a set of data based on the given `SELECT` statement and inserts it into the designated target. This query is then run periodically based on the `RUN_INTERVAL` defined within the job.

By default the `INSERT` job simply appends the new data to the target table. For data lake tables, those created using `CREATE TABLE` in Upsolver and managed by AWS Glue Data Catalog, when a primary key is defined the job will perform an upsert operation (update if PK exist, insert if it does not).

When writing to non-data lake targets, such as Snowflake and Redshift, the `INSERT` job will not recognize a primary key, if defined on the target table, and will only perform an `INSERT` operation. If you want to upsert, refer to the `MERGE` command.

Additionally, if two jobs writing to the same table rewrite the same primary key, it is nondeterministic as to which job's data ends up in the table and is thus not recommended.

## Syntax

```sql
CREATE [SYNC] JOB <job_identifier>
    [ COMMENT = '<comment>' ]
    { job_options }
AS INSERT INTO { target_definition }
        [ { (<column_name> [ , ... ] ) 
          | MAP_COLUMNS_BY_NAME [ EXCEPT <column_name> [, ...] ] } ]
        { <select_statement> 
        | UNNEST(<select_statement>) };
```

#### Jump to

* [Job identifier](#job-identifier)
* [Job options](#job-options)
* [Target definition](#target-definition)
* [`MAP_COLUMNS_BY_NAME`](#map_columns_by_name)
* [`EXCEPT`](#except)
* [`SELECT` statement](/content/reference-1/sql-commands/jobs/create-job/transformation/select.md)&#x20;
* [`UNNEST`](/content/reference-1/sql-commands/jobs/create-job/transformation/insert/unnest.md)

## Job identifier

Valid identifiers match the following format:

```sql
identifier = "([^"]|"")*"|[A-Za-z_][A-Za-z0-9_]*;
```

## Job options

#### Target location

* [Amazon Redshift](/content/reference-1/sql-commands/jobs/create-job/transformation/job-options/amazon-redshift.md)
* [Amazon S3](/content/reference-1/sql-commands/jobs/create-job/transformation/job-options/amazon-s3.md)
* [Data Lake Tables](/content/reference-1/sql-commands/jobs/create-job/transformation/job-options/data-lake-tables.md)
* [Elasticsearch](/content/reference-1/sql-commands/jobs/create-job/transformation/job-options/elasticsearch.md)
* [PostgreSQL](/content/reference-1/sql-commands/jobs/create-job/transformation/job-options/postgresql.md)
* [Snowflake](/content/reference-1/sql-commands/jobs/create-job/transformation/job-options/snowflake.md)

## Target definition

<pre class="language-sql"><code class="lang-sql"><strong>{ &#x3C;table_identifier>
</strong>| SNOWFLAKE &#x3C;catalog_name>.&#x3C;schema_name>.&#x3C;table_name>
| REDSHIFT &#x3C;catalog_name>.&#x3C;schema_name>.&#x3C;table_name>
| S3 BUCKET = &#x3C;bucket_name>
| ELASTICSEARCH 
} 
</code></pre>

## `MAP_COLUMNS_BY_NAME`

The [`MAP_COLUMNS_BY_NAME`](/content/reference-1/sql-commands/jobs/create-job/transformation/insert/map_columns_by_name.md) keyword maps columns from the [`SELECT`](/content/reference-1/sql-commands/jobs/create-job/transformation/select.md) statement to the table by the names of the returned columns in the query. Columns listed after [`EXCEPT`](#except) are excluded from the final table.

When `MAP_COLUMNS_BY_NAME` is set, the `SELECT` statement must provide an alias for each field or they must be a simple column reference or a star expression. The columns are then mapped to the fields of the same name in the table.

Otherwise columns are mapped by ordinal positions with reference to the provided column list or the order of the columns in the table.

When using a star query in the `SELECT` statement, `MAP_COLUMNS_BY_NAME` is required. Additionally, all primary key and partition columns must be mapped.

## `EXCEPT`

Columns listed after the `EXCEPT` keyword are not written to your target table.

***

## Examples

### Amazon S3

```sql
CREATE SYNC JOB insert_into_s3
    START_FROM = BEGINNING
    FILE_FORMAT = (type = CSV)
    COMPRESSION = GZIP
    DATE_PATTERN = 'yyyy-MM-dd-HH-mm'
    RUN_INTERVAL = 1 MINUTE
    COMPUTE_CLUSTER = "Default Compute"
AS INSERT INTO S3 s3_output_connection 
    LOCATION = 's3://your-bucket-name/path/to/folder/'
    -- Use the SELECT statement to choose columns from the source and 
    -- implement your business logic transformations.
    SELECT 
      column1 AS s3_column1, 
      MD5(column2) AS s3_column2 -- hash or mask columns using built-in functions
    FROM default_glue_catalog.your_schema.your_raw_data_table
    WHERE time_filter();
```

### Data lake table

```sql
CREATE SYNC JOB transform_orders_and_insert_into_athena
    START_FROM = BEGINNING
    ADD_MISSING_COLUMNS = true	
    RUN_INTERVAL = 1 MINUTE
AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data 
    MAP_COLUMNS_BY_NAME
    -- Use the SELECT statement to choose columns from the source and 
    -- implement your business logic transformations.
    SELECT 
      -- rename columns
      orderid AS order_id, 
      -- hash or mask columns using built-in functions
      MD5(customer_email) AS customer_id, 
      -- computed field defined later in the query
      customer_name,  
      nettotal AS total, 
      -- populate the partition column with the processing time of the event, 
      -- and automatically cast to DATE type
      $commit_time AS partition_date 
    FROM default_glue_catalog.upsolver_samples.orders_raw_data
    -- create a computed column
    LET customer_name = customer_firstname || ' ' || customer_lastname 
    WHERE ordertype = 'SHIPPING' 
    AND time_filter();
```

### Elasticsearch

```sql
CREATE SYNC JOB insert_into_elasticsearch_index
   RUN_INTERVAL = 1 MINUTE
   START_FROM = BEGINNING
   COMMENT = 'insert data into Elasticsearch index'
AS INSERT INTO ELASTICSEARCH your_elasticsearch_connection 
   PREFIX = 'orders'
      SELECT customer_id,
             COUNT(DISTINCT order_id) AS num_orders,
             SUM(net_total) AS total_spent,
             MIN(order_date) AS first_purchase,
             MAX(order_date) AS last_purchase
      FROM default_glue_catalog.your_database.staging_table
      WHERE time_filter()
      GROUP BY customer_id;
```

### PostgreSQL

```sql
CREATE JOB load_data_to_postgres
    START_FROM = BEGINNING
AS INSERT INTO POSTGRES your_postgres_connection.your_schema.target_table 
    MAP_COLUMNS_BY_NAME            
      SELECT orderid AS app_name
      FROM default_glue_catalog.your_database.orders
      WHERE time_filter();
```

### Snowflake

```sql
CREATE SYNC JOB load_data_to_snowflake
   RUN_INTERVAL = 1 MINUTE
   START_FROM = BEGINNING
   COMMENT = 'insert data into snowflake target table'
AS INSERT INTO SNOWFLAKE your_snowflake_connection.your_schema.target_table 
   MAP_COLUMNS_BY_NAME
      SELECT customer_id,
             COUNT(DISTINCT order_id) AS num_orders,
             SUM(net_total) AS total_spent,
             MIN(order_date) AS first_purchase,
             MAX(order_date) AS last_purchase
      FROM default_glue_catalog.your_database.staging_table
      WHERE time_filter()
      GROUP BY customer_id;
```


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://upsolver.gitbook.io/content/reference-1/sql-commands/jobs/create-job/transformation/insert.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
