> For the complete documentation index, see [llms.txt](https://upsolver.gitbook.io/content/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://upsolver.gitbook.io/content/how-to-guides-1/connectors/enable-cdc/postgresql.md).

# PostgreSQL

This article guides you through the process of enabling CDC on your PostgreSQL database.

## Prerequisites for PostgreSQL

### **Database configuration**

CDC uses logical replication to get change events from the Write-Ahead Log (WAL), so you should ensure this is enabled in advance.

You can check the replication method of your database by running:

```sql
SHOW wal_level;
```

This query should return `logical`. If it returns any other value, please change the replication method:

* For self-hosted PostgreSQL:

```sql
ALTER SYSTEM SET wal_level = logical;
```

* For RDS / Aurora PostgreSQL: set the `logical_replication` parameter in AWS RDS console to **1**.

The database must be restarted for this change to take effect.

### Permissions

In order for Upsolver to read the WAL and the initial state of the database, the CDC data source requires:

1. A user with replication and select permissions
2. A PostgreSQL [publication](https://www.postgresql.org/docs/current/logical-replication-publication.html) for the replicated tables

#### Creating a CDC user

If you are using an AWS RDS PostgreSQL instance please see the instructions below on how to create a CDC user. If you are using a self-hosted instance, please refer to[ Debezium's documentation](https://debezium.io/documentation/reference/connectors/postgresql.html#postgresql-permissions).

Upsolver recommends creating a separate user for the CDC process and granting that user replication and select permissions.&#x20;

Replication permission can be granted using:

* For self-hosted PostgreSQL:

```sql
ALTER ROLE <user> WITH REPLICATION;
```

* For RDS / Aurora PostgreSQL:

```sql
GRANT rds_replication TO <user>;
```

SELECT permissions can be granted either by giving permission to the entire schema, or for each table individually:

```sql
 GRANT SELECT ON ALL TABLES IN SCHEMA <schema_name> TO <user>; // For all tables
 GRANT SELECT ON <table_name> TO <user>; // For a specific table
```

### Creating a publication

Upsolver requires the *publication* of replicated tables in order to be able to subscribe to change events. The publication can be either on all tables or on specific tables. &#x20;

{% hint style="warning" %}
If the database contains tables without primary keys, adding those tables to the publication will cause PostgreSQL to block all updates and deletes for that table. This can be prevented in one of the following ways:

* Manage the publication tables manually and don't include such tables.&#x20;
* Change the `REPLICA IDENTITY` of that table to `FULL` or `INDEX`. More information can be found in the [PostgreSQL documentation](https://www.postgresql.org/docs/current/logical-replication-publication.html).&#x20;
* Add a primary key to the table.&#x20;
  {% endhint %}

Creating a publication of all tables is the simplest method and will automatically include all tables created in the future. However, this can lead to issues if tables without primary keys are created without having a `REPLICA IDENTITY` configured.

A publication of all tables can be created by a superuser using the following command:&#x20;

```sql
CREATE PUBLICATION upsolver FOR ALL TABLES
```

Creating a publication on specific tables will not automatically include newly created tables. For instructions on how to create and manage publications for specific tables, see the [PostgreSQL documentation](https://www.postgresql.org/docs/current/sql-createpublication.html).&#x20;

More information about permissions can be found in the [Debezium documentation](https://debezium.io/documentation/reference/connectors/postgresql.html#postgresql-permissions.).

### Setting up a Heartbeat Table

The WAL in a PostgreSQL instance is shared among all tables and databases on that instance. When at least one replication slot exists, PostgreSQL will wait until all replication slots confirm that a particular WAL event has been consumed before deleting it. Consequently, a replication slot that doesn't confirm any event for a long time might lead to the WAL growing excessively.

Upsolver can confirm an event only when receiving one. This means that when a data source doesn't contain any frequently changing tables, Upsolver might not get an opportunity to confirm any events, thereby leading to high disk usage (by the WAL).

To prevent excessive growth, *heartbeats* can be placed in the replication stream so that Upsolver can periodically confirm events. A *Heartbeat Table* sets up the heartbeats. After Upsolver is set up to use the heartbeat table, heartbeats will be tracked and confirmed by the Upsolver data source.

Follow these steps to create a heartbeat table:

1. Create a heartbeat table:

```sql
CREATE TABLE IF NOT EXISTS <schema_name>.<table_name> 
   (key int primary key, value timestamp)
```

2\. Grant the necessary permissions for Upsolver to use the heartbeat table:

```sql
GRANT INSERT, UPDATE, SELECT on table <table_name> to <user>
```

3\.  If the publication used in the data source was created for specific tables (not all tables), add the heartbeat table to the publication. See the [PostgreSQL documentation](https://www.postgresql.org/docs/current/sql-alterpublication.html).

4\. Add the heartbeat table to the data source in the data source creation wizard or in the data source properties.

### Known Limitations

1. PostgreSQL omits unchanged TOASTed values unless they are explicitly part of the table's replica identity. See [Handle PostgreSQL TOAST Values](/content/how-to-guides-1/jobs/database-replication/handle-postgresql-toast-values.md) page for suggested solutions and workarounds.
2. Columns with [generated values](https://www.postgresql.org/docs/current/ddl-generated-columns.html) are not included in the PostgreSQL replication log. Generated columns will be replicated during snapshotting but will not be included in streaming changes. Consider reproducing the the generation expression in the target system or in the replication job.&#x20;


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://upsolver.gitbook.io/content/how-to-guides-1/connectors/enable-cdc/postgresql.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
