PostgreSQL CDC Troubleshooting Guide
  • 2 Minutes to read
  • Dark
    Light
  • PDF

PostgreSQL CDC Troubleshooting Guide

  • Dark
    Light
  • PDF

Article summary

Introduction

This guide provides step-by-step instructions to troubleshoot Change Data Capture (CDC) with PostgreSQL in Rivery. It covers LSN checks, RDS settings, PostgreSQL configuration, replication slots, publications, permissions, logs, and data capture validation.

Step 1: Verify AWS RDS Logical Replication

Check if Logical Replication is Enabled

SHOW rds.logical_replication;

Expected: on
If off, enable via AWS Parameter Groups:

  • Go to AWS Console → RDS → Parameter Groups

  • Create or modify a custom group

  • Set:

    rds.logical_replication = 1
    
  • Apply to your RDS instance and restart it.

Step 2: Confirm Required Extensions

Check for pgoutput Extension

SELECT * FROM pg_available_extensions WHERE name = 'pgoutput';

If not installed:

CREATE EXTENSION IF NOT EXISTS pgoutput;

Step 3: Validate Replication Slot

Check Existing Slots

SELECT * FROM pg_replication_slots;

Create Logical Replication Slot (if needed)

SELECT * FROM pg_create_logical_replication_slot('rivery_slot', 'pgoutput');

Verify Slot Status

SELECT slot_name, plugin, active FROM pg_replication_slots;

Expected:
rivery_slot | pgoutput | f

Step 4: Validate WAL Decoding

Peek Changes from Slot

SELECT * FROM pg_logical_slot_peek_binary_changes(
  'rivery_slot', NULL, NULL,
  'publication_names', 'rivery publication',
  'proto_version', '1'
);

Expected output includes WAL entries such as BEGIN, INSERT, COMMIT.

Step 5: Check WAL Level and Slot Config

Show WAL Settings

SHOW wal_level;
SHOW max_replication_slots;
SHOW max_wal_senders;

Expected:

  • wal_level = logical

  • Adequate values for slots/senders (e.g., 5)

Update via parameter group if needed:

wal_level = logical  
max_replication_slots = 5  
max_wal_senders = 5

Step 6: Ensure Table Has a Primary Key

Check Primary Key

SELECT a.attname, format_type(a.atttypid, a.atttypmod)
FROM pg_index i
JOIN pg_attribute a ON a.attrelid = i.indrelid AND a.attnum = ANY(i.indkey)
WHERE i.indrelid = 'public.customers'::regclass AND i.indisprimary;

Add Primary Key (if missing)

ALTER TABLE public.customers ADD PRIMARY KEY (id);

Step 7: Monitor WAL Retention

Check WAL Retention Settings

SHOW wal_keep_size;
SHOW max_wal_size;
SHOW max_slot_wal_keep_size;

Recommended Minimums

wal_keep_size = 4GB  
max_wal_size = 20GB  
max_slot_wal_keep_size = 10GB

Update as needed:

ALTER SYSTEM SET wal_keep_size = '4GB';
ALTER SYSTEM SET max_wal_size = '20GB';
ALTER SYSTEM SET max_slot_wal_keep_size = '10GB';

Restart PostgreSQL after changes.

Estimate Daily WAL Generation

SELECT pg_size_pretty(wal_bytes / (extract(epoch from now() - stats_reset) / 86400)) AS daily_wal
FROM pg_stat_wal;

Step 8: Monitor Replication Lag

View Retained WAL

SELECT 
  slot_name,
  pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal,
  pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) / (wal_segment_size * 1024^2) AS segments_retained
FROM pg_replication_slots;

Step 9: Use REPLICA IDENTITY FULL for Full Row Data

Set Replica Identity

ALTER TABLE your_table_name REPLICA IDENTITY FULL;

Example:

ALTER TABLE orders REPLICA IDENTITY FULL;

This ensures all columns, including unchanged TOAST values, are emitted during CDC events.

Final Checklist

  • Logical Replication enabled in RDS

  • WAL level set to logical

  • Sufficient WAL retention

  • Logical replication slot exists and is active

  • Publication includes necessary tables

  • All CDC tables have a PRIMARY KEY

  • Rivery user has SELECT and REPLICATION permissions

  • No replication lag

  • REPLICA IDENTITY FULL set where needed


Was this article helpful?