Configuring PostgreSQL CDC Replication Slots and Publications

5 Minutes to read

Print
Share
Dark
Light
PDF

Configuring PostgreSQL CDC Replication Slots and Publications

5 Minutes to read

Print
Share
Dark
Light
PDF

Article summary

Did you find this summary helpful?

Thank you for your feedback!

Introduction

Replication is a key feature in database management systems that allows the synchronization of data between multiple database instances. It is commonly used for various purposes, such as high availability, disaster recovery, and data distribution. In some replication setups, publication slots are used to control the flow of changes between the master and replica databases.

This document guides you through the process of configuring replication slots and publications.

Please Note:

Replication slot and publication names in PostgreSQL are case-sensitive.
Having superuser privileges is crucial for obtaining data from a Replication slot. In the case of Amazon RDS, this means having superuser access, while on the Google Cloud Platform, you would need the cloudsqlsuperuser role to acquire the required REPLICATION permission.
In order to create a Publication, it is necessary to have superuser (or cloudsqlsuperuser) privileges, which will grant you the required CREATE permission.

Replication Slots Configuration

Activating replication slots allows for efficient and reliable data replication in PostgreSQL. By authorizing the Rivery user with the REPLICATION role, you enable the necessary permissions for managing replication slots.
These slots play a crucial role in the replication process by reserving space on the server to store replicated data before it is consumed. Proper configuration of replication slots is crucial for maintaining this setup.

To configure replication slots, the Rivery user must be granted the REPLICATION role. If the server does not already have a REPLICATION role, it can be created using the provided SQL commands in your PostgreSQL:

CREATE ROLE <replication_role_name> REPLICATION LOGIN;

Please ensure to replace <replication_role_name> with a role name of your preference.

Once the role has been created, execute the following SQL command to create a new login for Rivery's Log-Based extraction if it doesn't already exist:

CREATE USER rivery WITH <replication_role_name>;

In case the Rivery user already exists, use the following SQL command to assign the new role to the user:

ALTER USER rivery WITH <replication_role_name>;

Configure Replica Identity to Each Non-Keyed or Unique Table

To ensure that the previous values of a table are retrieved from the Write-Ahead Log (WAL) in Postgresql during DELETE events, it is necessary to configure the replication identity of the tables. Postgres offers 4 different types of REPLICA IDENTITY that dictate how events are written to the logical replication stream:

DEFAULT: This setting includes the previous values of a table's primary key columns only in UPDATE and DELETE events. In the case of UPDATE, it includes only the primary columns with changed values.
NOTHING: With this setting, the UPDATE and DELETE events do not contain any information about the previous values of any table column.
FULL: The previous values of all columns in the table are stored in the UPDATE and DELETE events when this setting is used.
INDEX: UPDATE and DELETE events include the previous values of the columns defined in the index named "index name". However, in the case of UPDATE, only the indexed columns with changed values are included.

For additional details regarding DELETE events, please refer to our Databases overview page.

Choose the appropriate replica identity option based on your requirements. For non-keyed or unique tables, the FULL option is often used to ensure accurate replication:

ALTER TABLE <schema>.<table> REPLICA IDENTITY FULL;

Replace <schema> and <table> with the actual names of the schema and table you want to modify.

Publication Slots Configuration

Publication slots configuration involves using the Create Replication Slot command in PostgreSQL to establish a slot dedicated to publishing data. This command serves the purpose of reserving a specific space within the primary database, exclusively intended for the designated publication slot. By creating such a slot, users gain the ability to efficiently and securely publish data to the primary database while ensuring seamless replication and synchronization with other databases.

The basic syntax to create a publication slot is as follows:

SELECT pg_create_logical_replication_slot('slot_name', 'plugin');

Replace 'slot_name' with a unique name for the publication slot, and 'plugin' with the name of the logical decoding plugin you intend to use for replication.

Custom Configurations

Once you have granted the appropriate permissions, Rivery will automatically generate replication slots and publications in a Postgresql database as part of its default configuration. However, you have the option to create dedicated replication slots or publications for specific Rivers according to your preferences.

Custom Replication Slots

With Rivery, you have the flexibility to set a custom Replication slot name. This name refers to the PostgreSQL logical decoding slot that is generated to stream changes from a particular plug-in for a specific database/schema.

Please Note:

If you do not provide a replication slot, a default one will be automatically created: Rivery_<account id>_<river id>_source.

Run this to view the replication slots and the associated data within them:

SELECT * FROM pg_replication_slots;

To create a replication, please perform the following commands in your database:

Access the PostgreSQL client by logging in with superuser credentials.
Execute the following command:

SELECT pg_create_logical_replication_slot('rivery_pgoutput', 'pgoutput');

Assign the replication slot name to the River.

Custom Publication Slots

You can create a publication for specific tables in your database, where only changes from those tables will be replicated by Rivery. To perform this action, you will need to have the CREATE privileges.

Please Note:
If a specific name is not provided, the default name for the following publication will be created: rivery_<account id>_<river id>_source.

In such cases, Rivery automatically executes the following command during the initial run and creates a publication for ALL TABLES:

 CREATE PUBLICATION rivery_<account id>_<river id>_source FOR ALL TABLES;

To create a publication for specific tables within your database, please perform the following commands:

Log in to a PostgreSQL client as a superuser.
Create publication for your tables.

CREATE PUBLICATION rivery_publication FOR TABLE movies, films, actors;

To remove or update the list of tables under a publication:

ALTER PUBLICATION rivery_publication ADD/DROP TABLE my_table;

Optionally, you can also choose to create a publication for specific operations on particular tables:

CREATE PUBLICATION rivery_insert_only FOR TABLE movies WITH (publish = 'INSERT');

Ensure that your tables are included in the publication.

SELECT * FROM pg_publication_tables;

Once you have created your publication, assign its name to the River.
To ensure the accurate insertion of the publication, please execute the following command:

SELECT pubname, puballtables, pubinsert, pubupdate, pubdelete
FROM pg_publication
WHERE pubname = ‘rivery_insert_only’;

Partitioned Tables in PostgreSQL

Rivery supports PostgreSQL partitioned tables in both Standard Extraction and Change Data Capture (CDC).

Was this article helpful?

What's Next

PostgreSQL Walkthrough

Table of contents

Introduction
Replication Slots Configuration
Publication Slots Configuration
Custom Configurations
Partitioned Tables in PostgreSQL