# CDC Known Limitations

This article describes the known limitations of using Upsolver for Change Data Capture (CDC) to replicate data into data lakes or Snowflake. Understanding these limitations helps you manage data workflows and preserve data integrity.

### Limitations

#### 1. Empty Tables

Empty tables (tables with no rows) are not replicated. If a table must be replicated, make sure it contains data.

#### 2. Null-Value Columns

Columns populated entirely by null values will not be replicated. This may cause schema discrepancies between the source and target environments.
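
As a pre-flight check for the two limitations above, the following minimal sketch flags tables with no rows and columns that contain only nulls. It works on in-memory sample rows; the function name `replication_gaps` and the sample data are hypothetical and not part of Upsolver.

```python
def replication_gaps(tables):
    """Find empty tables and all-null columns.

    `tables` maps a table name to a list of row dicts (sampled source rows).
    Returns (empty_tables, all_null_columns).
    """
    empty_tables = []
    all_null_columns = {}
    for name, rows in tables.items():
        if not rows:
            empty_tables.append(name)
            continue
        # Collect every column name that appears in any sampled row.
        columns = {column for row in rows for column in row}
        null_only = sorted(
            column for column in columns
            if all(row.get(column) is None for row in rows)
        )
        if null_only:
            all_null_columns[name] = null_only
    return empty_tables, all_null_columns


# Hypothetical sample data, for illustration only.
sample = {
    "audit_log": [],
    "users": [{"id": 1, "nickname": None}, {"id": 2, "nickname": None}],
}
empty, null_columns = replication_gaps(sample)
```

Here `empty` lists `audit_log` and `null_columns` reports `nickname` for `users`, which are the objects you would expect to be missing from the target.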

#### 3. Conflicting Data Types Across Tables

In scenarios where multiple tables have columns with the same name but different data types, conflicts can occur during replication:

* If both tables are updated at the same time and the column is `date` in one table and `int` in the other, the column is replicated as `date` in the target. The same conflict applies to columns of types `long` and `timestamp`.
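
One way to spot such conflicts before replication is to compare column names and types across the source schemas. The sketch below assumes you have already collected each table's schema as a plain dictionary; the helper `find_type_conflicts` and the example schemas are hypothetical.

```python
from collections import defaultdict


def find_type_conflicts(schemas):
    """Return columns that share a name across tables but differ in type.

    `schemas` maps table name -> {column name -> source type name}.
    The result maps column name -> {type name -> sorted list of tables}.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for table, columns in schemas.items():
        for column, column_type in columns.items():
            seen[column][column_type.lower()].append(table)
    return {
        column: {type_name: sorted(tables) for type_name, tables in types.items()}
        for column, types in seen.items()
        if len(types) > 1  # more than one distinct type for the same name
    }


# Hypothetical source schemas, for illustration only.
schemas = {
    "orders": {"id": "int", "created": "date"},
    "events": {"id": "int", "created": "int"},
}
conflicts = find_type_conflicts(schemas)
```

In this example only `created` is reported, because it is `date` in `orders` and `int` in `events`, while `id` has the same type in both tables.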

#### 4. Nested Column Type Limitations

Nested data types (such as JSON) come with several limitations:

* **Missing Fields**: JSON fields that are null are omitted during replication.
* **Nulls in Arrays**: Null values within JSON arrays are skipped.
* **Empty Arrays**: Empty arrays, or arrays that contain only nulls, are treated as null.
* **Type Casting in Arrays**: Arrays containing elements of different types will be cast to `varchar` (e.g., `[1, 'str']` becomes `['1', 'str']`).
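
To anticipate how a nested value might look in the target, the four rules above can be approximated in a few lines. This is an illustrative simulation, not Upsolver's implementation: the function `normalize` is hypothetical, and treating any two distinct Python types as "different types" is an assumption.

```python
def normalize(value):
    """Approximate the nested-value rules above on a parsed JSON value."""
    if isinstance(value, dict):
        result = {}
        for key, item in value.items():
            item = normalize(item)
            if item is not None:  # Missing Fields: null fields are omitted
                result[key] = item
        return result
    if isinstance(value, list):
        items = [normalize(item) for item in value]
        items = [item for item in items if item is not None]  # Nulls in Arrays
        if not items:  # Empty Arrays: empty or all-null arrays become null
            return None
        if len({type(item) for item in items}) > 1:
            # Type Casting in Arrays: mixed element types become strings
            # (assumption: scalar elements; nested elements are not modeled).
            items = [str(item) for item in items]
        return items
    return value


record = {"a": None, "b": [1, None, "str"], "c": [], "d": [None]}
replicated = normalize(record)
```

For this record, only `b` survives, as `['1', 'str']`: field `a` is null, and both `c` and `d` collapse to null and are then omitted.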

#### 5. TOAST Values

For PostgreSQL sources, fields stored using TOAST (PostgreSQL's storage mechanism for large field values) are replicated only if the table's replica identity is set to `FULL`.
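
In PostgreSQL, replica identity is set per table with `ALTER TABLE ... REPLICA IDENTITY FULL`. The sketch below only builds those statements for a list of tables so they can be reviewed before a database administrator runs them; the helper name and the example table are hypothetical. Note that `FULL` makes PostgreSQL log the old values of all columns for updates and deletes, so it is worth checking the effect on write-ahead log volume before applying it broadly.

```python
def replica_identity_full_sql(tables):
    """Build one ALTER TABLE statement per (schema, table) pair."""

    def quote(identifier):
        # Double-quote identifiers and escape embedded double quotes.
        return '"' + identifier.replace('"', '""') + '"'

    return [
        f"ALTER TABLE {quote(schema)}.{quote(table)} REPLICA IDENTITY FULL;"
        for schema, table in tables
    ]


# Hypothetical table, for illustration only.
statements = replica_identity_full_sql([("public", "orders")])
```

The resulting list contains the statement `ALTER TABLE "public"."orders" REPLICA IDENTITY FULL;`.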

#### 6. Non-Replication of Default Values

Default values defined in database schemas are not replicated. This can affect how data appears in the target system if defaults are relied upon.

#### 7. Data Type Upcasting

Upsolver converts original data types to a set of supported primitive types:

* **Integer Types**: All are mapped to `bigint`.
* **Floating Point and Decimal Types**: All are mapped to `double`.

The primitive types that Upsolver supports are:

* String
* Bigint
* Double
* Boolean
* Date
* Timestamp (millisecond precision)
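
The numeric upcasting rules above can be expressed as a small lookup, which is handy when estimating what a target schema will look like. The source type names in the two sets are illustrative assumptions, not a list published by Upsolver, and `upcast` is a hypothetical helper. The last lines show one consequence of mapping decimals to `double` that is worth checking: a double cannot always hold a decimal value exactly.

```python
from decimal import Decimal

# Assumed source type names, for illustration only.
INTEGER_TYPES = {"tinyint", "smallint", "int", "integer", "bigint"}
FRACTIONAL_TYPES = {"float", "real", "double", "decimal", "numeric"}


def upcast(source_type):
    """Return the target type for a numeric source type.

    Returns None when the rules above do not cover the type.
    """
    base = source_type.lower().split("(")[0].strip()  # drop precision and scale
    if base in INTEGER_TYPES:
        return "bigint"
    if base in FRACTIONAL_TYPES:
        return "double"
    return None


# A decimal with more significant digits than a double can represent exactly.
exact = Decimal("1234567890.123456789")
as_double = float(exact)
lossy = Decimal(as_double) != exact
```

With these assumptions, `upcast("decimal(18, 4)")` yields `double`, and `lossy` is true because the value changes when it passes through a double.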

#### 8. Unsupported Truncate Events

Truncate operations, which delete all rows in a table, are not supported in CDC replication.

#### 9. Commit Synchronization

There is no mechanism to synchronize commits between target tables, so at a given moment different target tables may reflect different points in time in the source.

#### 10. Handling Changes in Column Types

A change in a source column's data type results in the creation of a new column in the target. For example, if a column changes from `bigint` to `varchar`, a new column with a suffix (e.g., `col_string`) will be created for the `varchar` values.

See [Schema Evolution](/content/articles-1/data/schema-evolution.md) for more details.

#### 11. Dropped and Renamed Columns

* **Dropped Columns**: Columns dropped in the source are not dropped in the target, potentially leading to outdated schema representations.
* **Renamed Columns**: Renamed columns will not be renamed in the target; instead, an additional column with the new name will be created.
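
Both rules amount to add-only schema evolution: the target gains columns but never loses or renames them. The following sketch models that behavior with sets to show which target columns stop receiving data after a source change; the function `evolve_target` and the column names are hypothetical.

```python
def evolve_target(target_columns, source_columns):
    """Return the target column set after a source schema change.

    Columns are only ever added: dropped or renamed source columns
    remain in the target under their old names.
    """
    return set(target_columns) | set(source_columns)


target = {"id", "name", "email"}
# Hypothetical change: the source renames `name` to `full_name` and drops `email`.
source = {"id", "full_name"}

evolved = evolve_target(target, source)
# Columns still present in the target but no longer present in the source.
stale = evolved - source
```

After this change the target holds `id`, `name`, `email`, and `full_name`, and `stale` identifies `name` and `email` as columns that downstream queries should stop relying on.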

#### 12. Column Order

The order of the created columns in the target tables is not guaranteed and may not match the order in the source tables.

### Conclusion

These limitations highlight the challenges and considerations when using Upsolver for CDC with data lakes or Snowflake. Planning and understanding these constraints is essential for effective data management and integration strategies, ensuring that the replicated data is accurate and consistent with business needs.

