> For the complete documentation index, see [llms.txt](https://upsolver.gitbook.io/content/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://upsolver.gitbook.io/content/reference-1/sql-commands/iceberg-tables/external-iceberg-tables/create-external-iceberg-table.md).

# CREATE EXTERNAL ICEBERG TABLE

## Syntax

```sql
CREATE EXTERNAL ICEBERG TABLE <table_identifier>
   [{ table_options }];
```

#### Jump to

* [Table identifier](#table-identifier)
* [Table options](#table-options)

## Table identifier

Table identifiers are provided in the following format:

```sql
<catalog_name>.<schema_name>.<table_name>
```

Note that only metastore connection types are accepted for the catalog name.

Valid table names match the following identifier format:

```sql
identifier = "([^"]|"")*"|[A-Za-z_][A-Za-z0-9_]*;
```

## Table options

```sql
[ COMMENT = '<comment>']
[ OPTIMIZE = { TRUE | FALSE } ]
[ RETENTION_DATE_PARTITION = <column_name> ]
[ TABLE_DATA_RETENTION = <integer> DAYS ]
```

**Jump to**

* [COMMENT](#comment-editable)
* [OPTIMIZE](#optimize-editable)
* [RETENTION\_DATE\_PARTITION](#retention_date_partition)
* [TABLE\_DATA\_RETENTION](#table_data_retention-editable)

#### `COMMENT` — editable

Type: `text`

(Optional) A description or comment regarding this table.

#### `OPTIMIZE` — editable

Type: `Boolean`

Default: `false`

(Optional) When `false`, the tables will be not optimized by Upsolver. When `true`, files are compacted to reduce storage costs and increase query scan performance.

#### `RETENTION_DATE_PARTITION`

Type: `identifier`

Default: The only partition column of type `date`

(Optional) This configures the partition column to be used to determine whether the retention period has passed for a given record.

This option is required if you have more than one date partition column.

#### `TABLE_DATA_RETENTION` — editable

Value: `<integer> DAYS`

(Optional) When set, data in partitions that have passed the retention period are deleted from the table. Number of days can range between 1 and 9999.

This option is not a deterministic mechanism that deletes data when it immediately surpasses the defined threshold. This mechanism is closer to the lifecycle policies on common blob storage services, such as Amazon S3, and is designed to save storage costs, not to delete data based on a specific time. Therefore when data passes the retention period, it will be deleted at some point in the future, and can no longer be relied on to exist, though Upsolver aims to delete it within a reasonable timeframe.

You should be aware that transformation job that reads from a table with a defined data retention may or may not read data that has surpassed the retention threshold.

For example, if the current time is 2023-02-23 12:30:00 UTC, and you have defined `TABLE_DATA_RETENTION = 2 days`, you can expect data written during 2023-02-23, 2023-02-22, and 2023-02-21 to exist in the table. The retention threshold truncates data to the nearest day, so when the time changes to 2023-02-24 00:00:00 UTC, you can no longer expect data from 2023-02-21 to be present in the table, although it might be there for a while.

Note that you need at least one date partition column for this option to work.

***

## Examples

### Basic table

The example below creates an external Iceberg table that Upsolver manages for you:

{% code overflow="wrap" %}

```sql
CREATE EXTERNAL ICEBERG TABLE
    default_glue_catalog.my_database.my_external_iceberg_table
    OPTIMIZE = TRUE;
```

{% endcode %}

### Table with data retention

The following example creates an external Iceberg table with data retention. First, we create a table named **orders** in our Glue catalog. The `PARTITIONED BY` option specifies the partition column as **order\_date**.&#x20;

Next, we create an external Iceberg column in Upsolver, with the data retention based on the **order\_date** column. The data will be retained for **2 days**:

```sql
// CREATE ICEBERG TABLE IN GLUE
CREATE TABLE my_glue_catalog.demo.orders
(
    order_id string,
    order_date date,
    customer_email string,
    net_total bigint, 
    num_items bigint
)
    PARTITIONED BY (order_date)
    LOCATION 's3://upsolver-demo/order_data/'
    TBLPROPERTIES ('table_type' = 'ICEBERG');

// CREATE EXTERNAL ICEBERG TABLE IN UPSOLVER
CREATE EXTERNAL ICEBERG TABLE default_glue_catalog.demo.orders
    RETENTION_DATE_PARTITION = order_date
    TABLE_DATA_RETENTION = 2 days;
```


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://upsolver.gitbook.io/content/reference-1/sql-commands/iceberg-tables/external-iceberg-tables/create-external-iceberg-table.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
