> For the complete documentation index, see [llms.txt](https://upsolver.gitbook.io/content/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://upsolver.gitbook.io/content/quickstarts-1/jobs/ingestion/job-basics/output-to-a-target-table.md).

# Output to a Target Table

### Creating an output table for refined data in Amazon Athena

To start transforming your data, choose from two types of [transformation jobs](/content/reference-1/sql-commands/jobs/create-job/transformation.md): [`INSERT`](/content/reference-1/sql-commands/jobs/create-job/transformation/insert.md) or [`MERGE`](broken://spaces/WKMq8oT1OPM3KjP8vlg2/pages/FoViNkMvl4MA1WrT7Bsn). To take full advantage of these jobs, we recommend that you explicitly create your target table, which enables you to define partitions, primary keys, and more. The following example creates a target table in the data lake.

**Here's the code to create a target table in the data lake:**

```sql
// Syntax
CREATE TABLE default_glue_catalog.<DB_NAME>.<TRANSFORMED_TABLE_NAME> (
     partition_date date)
    PARTITIONED BY partition_date;
    
// Example
CREATE TABLE default_glue_catalog.upsolver_samples.orders_transformed_data(
        partition_date date)
        PARTITIONED BY partition_date;
```

You don’t need to define all of your output columns, as Upsolver automatically adds any missing columns. If you do want to control which columns are added, you can define them in the `CREATE TABLE` statement and then set the `ADD_MISSING_COLUMNS` job property to **FALSE** in the job options.

### Prepare your data for the refined zone

Now that you have created the target table, let's create a transformation job to read data from the staging table, transform it, and write the results into the target table inside your refined zone.

{% hint style="info" %}
You must have data in your staging table to run this job.
{% endhint %}

**Create a transformation job to read from staging and write to the target:**

```sql
// Syntax
CREATE JOB <TRANSFORM_JOB_NAME>
    START_FROM = BEGINNING
    ADD_MISSING_COLUMNS = TRUE	
    RUN_INTERVAL = 1 MINUTE
AS INSERT INTO default_glue_catalog.<DB_NAME>.<TRANSFORMED_TABLE_NAME> 
  MAP_COLUMNS_BY_NAME
    SELECT 
      <source_column_name> as <target_column_name>,
      MD5(<source_column_name>) as <target_column_name>,
      <new_variable> as <target_column_name>,
      <source_column_name> as <target_column_name>,
      $commit_time AS partition_date
    FROM default_glue_catalog.<DB_NAME>.<STAGING_TABLE_NAME>
    WHERE $commmit_time BETWEEN run_start_time() AND run_end_time()
    AND <column_name> = '<filter_by_value>';

// Example
CREATE JOB transform_orders_and_insert_into_athena
    START_FROM = BEGINNING
    ADD_MISSING_COLUMNS = true	
    RUN_INTERVAL = 1 MINUTE
AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data 
  MAP_COLUMNS_BY_NAME
    -- Use the SELECT statement to choose columns from the source and 
    -- implement your business logic transformations.
    SELECT 
        -- rename columns
        orderid AS order_id,
        -- hash or mask columns using built-in functions
        MD5(buyeremail) AS customer_id, 
        nettotal AS total, 
        -- populate the partition column with the processing time of the event, 
        -- and automatically cast to DATE
        $commit_time AS partition_date 
    FROM default_glue_catalog.upsolver_samples.orders_raw_data
    WHERE eventtype = 'ORDER' 
    AND $commit_time BETWEEN run_start_time() AND run_end_time();
```

Let's walk through this job:&#x20;

```sql
CREATE JOB transform_orders_and_insert_into_athena
    START_FROM = BEGINNING
    ADD_MISSING_COLUMNS = true	
AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data 
    MAP_COLUMNS_BY_NAME
```

First, we create a job named **transform\_orders\_and\_insert\_into\_athena**. The `START_FROM` option instructs Upsolver to include all data going back in time, and `ADD_MISSING_COLUMNS` ensures columns added in the future are added to the target table. In this example, the `RUN_INTERVAL` specifies that the job executes every **1 minute**.&#x20;

Then, in the `AS INSERT INTO` command, you set the insertion table as the target table you created in the previous step. The `MAP_COLUMNS_BY_NAME` command tells Upsolver to match the source and target columns based on name.

In the `SELECT` statement, you can perform transformations on your data:

```sql
SELECT 
      -- rename columns
      orderid AS order_id, 
      -- hash or mask columns using built-in functions
      MD5(buyeremail) AS customer_id, 
      nettotal AS total, 
      -- populate the partition column with the processing time of the event, 
      -- and automatically cast to DATE
      $commit_time AS partition_date 
```

Examples of the transformations you can apply to your data include renaming columns, such as changing **orderid** to **order\_id**, or populating a new column with data from a different one. In the above example, **buyeremail** is masked using the `MD5` function, and a new column, **customer\_id**, is created.

Lastly, you specify the table from which you are copying your data, usually your staging table:

```sql
 FROM default_glue_catalog.upsolver_samples.orders_raw_data
     WHERE eventtype = 'ORDER' 
     AND $commit_time BETWEEN run_start_time() AND run_end_time();
```

### Query your data

Finally, to view your pipeline results, use a `SELECT` statement to query your table.

**Here's the code to view your pipeline data:**

```sql
SELECT * 
FROM default_glue_catalog.upsolver_samples.orders_transformed_data 
LIMIT 100;
```

Including the `LIMIT` option instructs the Upsolver query engine to limit the result set to **100** rows.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://upsolver.gitbook.io/content/quickstarts-1/jobs/ingestion/job-basics/output-to-a-target-table.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
