- Getting Started
- Demo Videos
- Blueprint
- Rivers
- Rivers List & Groups
- Source to Target River
- Logic River
- Action River
- Settings Tab
- Sources
- Applications
- Reconnecting to an Existing OAuth2 Connection in Rivery
- Active Campaigns
- Adaptv
- Adaptive Planning
- Adjust
- Adobe Ads
- Adobe Analytics
- Adroll
- Adtelligent (Verta Media)
- Airtable
- Amplitude
- Anaplan
- Apple Search Ads
- AppNexus
- Appsflyer
- Appstore Connect - Sales and Trends
- Appstore Connect - Apps Analytics
- Avantlink
- AWS Billing
- Bing Ads
- BlueSnap
- Branch
- Brandwatch
- Bronto
- CallRail
- Cellxpert
- Content.Ad
- Coupa
- CreativeX (Picasso Labs)
- Crimson Hexagon
- Criteo
- Criteo V2023 -10
- Customer Gauge
- Currency Service
- Dear Systems
- Delta Projects
- Digital Turbine
- Display & Video 360
- Docebo
- DoubleClick DCM
- Ex-Co
- Facebook Ads
- Facebook Audience Network
- Facebook Social
- Freshservice
- Genesys
- GMP365
- Google Ad Manager (DoubleClick for Publishers)
- Google Ads
- Google Analytics
- Google Play Store
- Google Search Console
- GothamAds
- Greenhouse
- Harvest
- Hibob
- HiveStack
- HubSpot
- Impact Radius
- Index Exchange
- Innovid
- Iron Source
- Instagram Social
- Intercom
- Jira
- JWPlayer
- JWPlayer Analytics
- Kantata (Mavenlink)
- Keepcon
- Klaviyo
- Kochava
- LinkedIn Ads
- Linkedin Social
- LKQD
- Magento
- Marketo
- Maxio Chargify
- Mediamath
- MixPanel
- Moat Analytics
- MobileAction
- Monday
- MyAffiliates
- NetSuite APIs
- Nium
- Nuviad
- Optimove
- Outbrain
- PayPal Reports
- Pendo
- Pepperjam
- Quickbooks
- Recharge
- Recurly
- Reddit Ads
- Revcontent
- Sage Intacct
- Sailthru
- Salesforce
- Salesforce Audience
- Salesforce Marketing Cloud
- Salesforce Pardot
- SAP Concur
- Search Ads 360 Reporting
- Sendgrid
- Sensor Tower
- Servicenow
- ShareASale
- ShipHero
- Shopify
- Skai
- Snapchat Marketing
- Socialbakers
- Social Studio
- Spotx
- Sprinklr
- StackAdapt
- Streamrail
- Stripe
- SuccessFactors
- SurveyGizmo
- SurveyMonkey
- Taboola
- The Trade Desk
- Tiger Pistol
- Tiktok
- TrendKite
- Tune (HasOffers)
- UXCam
- Verizon Media (Oath)
- X (Twitter)
- Yahoo Ads
- Yandex
- Yotpo
- YouTube
- Zendesk
- Zendesk Chat
- Zendesk Talk
- Databases
- Databases Overview
- Database Connectivity Options
- Database River Modes
- Automatic Schema Drift Management
- Database Table Configuration Options
- CDC 'Point in Time' Position
- BigQuery
- Datastore
- ElasticSearch
- MariaDB
- Microsoft SQL Server
- MongoDB
- MySQL
- Oracle
- PostgreSQL
- Snowflake as a Source
- Teradata
- Vertica
- Events
- Storage & Files
- Rest API
- Applications
- Targets
- Administration
- Security
- REST API
- Support
- Troubleshooting
- Error Messages
- Adaptive Insights
- Activate River
- Adobe Ads
- Adobe Analytics
- Airtable
- Amplitude
- Anaplan
- Apple Search Ads
- RVR-ASA-001
- RVR-ASA-002
- RVR-ASA-100
- RVR-ASA-101
- RVR-ASA-102
- RVR-ASA-103
- RVR-ASA-104
- RVR-ASA-105
- RVR-ASA-106
- RVR-ASA-200
- RVR-ASA-201
- RVR-ASA-202
- RVR-ASA-203
- RVR-ASA-204
- RVR-ASA-205
- RVR-ASA-206
- RVR-ASA-207
- RVR-ASA-208
- RVR-ASA-209
- RVR-ASA-210
- RVR-ASA-211
- RVR-ASA-212
- RVR-ASA-300
- RVR-ASA-301
- RVR-ASA-302
- RVR-ASA-303
- RVR-ASA-304
- AppsFlyer
- RVR-APPSFLYER-001
- RVR-APPSFLYER-002
- RVR-APPSFLYER-003
- RVR-APPSFLYER-004
- RVR-APPSFLYER-101
- RVR-APPSFLYER-102
- RVR-APPSFLYER-103
- RVR-APPSFLYER-104
- RVR-APPSFLYER-105
- RVR-APPSFLYER-106
- RVR-APPSFLYER-107
- RVR-APPSFLYER-108
- RVR-APPSFLYER-201
- RVR-APPSFLYER-202
- RVR-APPSFLYER-203
- RVR-APPSFLYER-204
- RVR-APPSFLYER-205
- RVR-APPSFLYER-206
- App Store Apps
- Athena
- Azure Blob Storage
- Azure SQL
- RVR-MSS-720
- RVR-MSS-714
- RVR-MSS-803
- RVR-MSS-823
- RVR-MSS-806
- RVR-MSS-721
- RVR-MSS-819
- RVR-MSS-709
- RVR-MSS-710
- RVR-MSS-711
- RVR-MSS-713
- RVR-MSS-717
- RVR-MSS-708
- RVR-MSS-804
- RVR-MSS-822
- RVR-MSS-820
- RVR-MSS-824
- RVR-MSS-826
- RVR-MSS-821
- RVR-MSS-827
- RVR-MSS-825
- RVR-MSS-802
- RVR-MSS-805
- RVR-MSS-719
- RVR-MSS-718
- RVR-MSS-716
- RVR-MSS-801
- RVR-MSS-704
- RVR-MSS-712
- RVR-MSS-701
- RVR-MSS-715
- RVR-MSS-703
- RVR-MSS-704
- RVR-MSS-500
- RVR-MSS-501
- RVR-MSS-702
- RVR-LTMS-606
- Azure Synapse
- RVR-SYNP-100
- RVR-SYNP-200
- RVR-SYNP-201
- RVR-SYNP-202
- RVR-SYNP-203
- RVR-SYNP-300
- RVR-SYNP-301
- RVR-SYNP-302
- RVR-SYNP-303
- RVR-SYNP-304
- RVR-SYNP-305
- RVR-SYNP-306
- RVR-SYNP-307
- RVR-SYNP-308
- RVR-SYNP-309
- RVR-SYNP-310
- RVR-SYNP-311
- RVR-SYNP-312
- RVR-SYNP-313
- RVR-SYNP-314
- RVR-SYNP-315
- RVR-SYNP-316
- RVR-SYNP-317
- RVR-SYNP-318
- RVR-SYNP-319
- RVR-SYNP-320
- RVR-SYNP-321
- RVR-SYNP-403
- Base REST API
- Google BigQuery
- Bing Ads
- Blob Storage
- Google BigQuery as a Source
- CellExpert
- Conversion Service
- Coupa
- Criteo V2024-07
- DataFrame
- Deployment
- Docebo
- DoubleClick Publisher
- Display & Video 360
- RVR-DC-001
- RVR-DC-100
- RVR-DC-216
- RVR-DC-207
- RVR-DV-219
- RVR-DV-218
- RVR-DC-210
- RVR-DV-303
- RVR-DV-003
- RVR-DV-223
- RVR-DV-209
- RVR-DV-217
- RVR-DV-222
- RVR-DV-206
- RVR-DV-201
- RVR-DC-300
- RVR-DC-211
- RVR-DC-202
- RVR-DV-304
- RVR-DV-210
- RVR-DC-205
- RVR-DV-205
- RVR-DV-211
- RVR-DV-207
- RVR-DV-212
- RVR-DV-102
- RVR-DV-204
- RVR-DV-208
- RVR-DV-202
- RVR-DV-213
- RVR-DV-100
- RVR-DV-221
- RVR-DV-301
- RVR-DV-302
- RVR-DV-300
- RVR-DV-220
- RVR-DV-216
- RVR-DV-215
- RVR-DV-002
- RVR-DV-224
- RVR-DV-225
- RVR-DV-214
- RVR-DV-200
- RVR-DV-203
- RVR-DV-001
- RVR-DC-302
- RVR-DV-101
- RVR-DC-209
- RVR-DC-216
- RVR-DC-203
- RVR-DC-208
- RVR-DC-301
- RVR-DC-206
- RVR-DC-214
- RVR-DC-200
- RVR-DC-101
- RVR-DC-215
- RVR-DC-201
- RVR-DC-212
- RVR-DC-213
- RVR-DC-204
- Elasticsearch
- Facebook Social
- Firebolt
- General
- Genesys
- Google Play
- Harvest
- HasOffers
- HiBob
- IP Allowlist Connection
- iTunes Reporter
- Jira
- Klaviyo v3
- Kochava
- Logic
- Logicode
- Mixpanel
- Moat
- MongoDB
- Microsoft SQL Server
- MySQL
- NetSuite
- NetSuite Analytics
- NetSuite RESTlets
- Pepperjam
- Pipedrive
- PostgreSQL
- Python
- QuickBooks
- RDBMS
- RVR-SF-RDBMS-315
- RVR-SF-RDBMS-317
- RVR-SF-RDBMS-316
- RVR-SF-RDBMS-305
- RVR-SF-RDBMS-307
- RVR-RDBMS-403
- RVR-RDBMS-401
- RVR-RDBMS-305
- RVR-RDBMS-202
- RVR-SF-RDBMS-318
- RVR-SF-RDBMS-314
- RVR-SF-RDBMS-313
- RVR-SF-RDBMS-312
- RVR-SF-RDBMS-311
- RVR-SF-RDBMS-308
- RVR-SF-RDBMS-310
- RVR-SF-RDBMS-309
- RVR-SF-RDBMS-302
- RVR-SF-RDBMS-304
- RVR-SF-RDBMS-306
- RVR-SF-RDBMS-303
- RVR-SF-RDBMS-301
- RVR-RDBMS-312
- RVR-RDBMS-313
- RVR-RDBMS-314
- RVR-RDBMS-311
- RVR-RDBMS-309
- RVR-RDBMS-310
- RVR-RDBMS-402
- RVR-RDBMS-304
- RVR-RDBMS-308
- RVR-RDBMS-201
- RVR-RDBMS-101
- RVR-RDBMS-307
- RVR-RDBMS-306
- RVR-RDBMS-301
- RVR-RDBMS-303
- RVR-LTBQ-102
- RDBMS API
- Recurly
- Amazon Redshift
- REST Action
- Rivers
- Rivery Configuration Error
- Amazon S3
- SAP Concur
- Search Ads 360 Reporting
- SendGrid
- ServiceNow
- Salesforce Marketing Cloud
- Snapchat Ads
- Snowflake
- SSH
- SSH Connection
- SSL Connection
- StackAdapt
- Storages
- RVR-DTBR-100
- RVR-LSNF-103
- RVR-SNFLGC-001
- RVR-SNFLGC-400
- RVR-FTP-100
- RVR-SFTPConn-100
- RVR-SFTPConn-101
- RVR-SFTPConn-102
- RVR-SFTPConn-103
- RVR-SFTPConn-104
- RVR-SFTPStor-100
- RVR-SFTPStor-101
- RVR-SFTPStor-102
- RVR-SFTPStor-103
- RVR-SFTPStor-104
- RVR-SFTP-100
- RVR-SFTP-101
- RVR-BSPTOR-100
- RVR-BSPTOR-101
- RVR-BSPTOR-102
- RVR-BSPTOR-103
- RVR-BSPTOR-104
- RVR-BSPTOR-105
- RVR-BSPTOR-106
- RVR-BSPTOR-107
- RVR-BSPTOR-108
- RVR-BSPTOR-109
- RVR-BSPTOR-110
- RVR-BSPTOR-111
- RVR-BSPTOR-113
- RVR-BSPTOR-114
- RVR-BSPTOR-115
- RVR-BSPTOR-116
- RVR-BSFEEDER-101
- RVR-BSFEEDER-103
- RVR-BSFEEDER-204
- RVR-BSFEEDER-205
- RVR-BSFEEDER-206
- Stripe
- SuccessFactors
- SurveyGizmo
- The Trade Desk
- RVR-TTD-100
- RVR-TTD-101
- RVR-TTD-102
- RVR-TTD-103
- RVR-TTD-120
- RVR-TTD-121
- RVR-TTD-122
- RVR-TTD-123
- RVR-TTD-124
- RVR-TTD-125
- RVR-TTD-126
- RVR-TTD-130
- RVR-TTD-200
- RVR-TTD-201
- RVR-TTD-202
- RVR-TTD-203
- RVR-TTD-204
- RVR-TTD-220
- RVR-TTD-221
- RVR-TTD-222
- RVR-TTD-250
- RVR-TTD-401
- RVR-TTD-402
- RVR-TTD-410
- RVR-TTD-411
- RVR-TTD-412
- RVR-TTD-413
- RVR-TTD-414
- TikTok
- Verta Media
- Yotpo
- YouTube Reporting
- Release Notes
- 2025
- 2024
- December 2024 Release Notes
- November 2024 Release Notes
- October 2024 Release Notes
- September 2024 Release Notes
- August 2024 Release Notes
- July 2024 Release Notes
- June 2024 Release Notes
- May 2024 Release Notes
- April 2024 Release Notes
- March 2024 Release Notes
- February 2024 Release Notes
- January 2024 Release Notes
- 2023
- December 2023 Release Notes
- November 2023 Release Notes
- October 2023 Release Notes
- September 2023 Release Notes
- August 2023 Release Notes
- July 2023 Release Notes
- June 2023 Release Notes
- May 2023 Release Notes
- April 2023 Release Notes
- March 2023 Release Notes
- February 2023 Release Notes
- January 2023 Release Notes
- 2022
- December 2022 Release Notes
- November 2022 Release Notes
- October 2022 Release Notes
- September 2022 Release Notes
- August 2022 Release Notes
- July 2022 Release Notes
- June 2022 Release Notes
- May 2022 Release Notes
- April 2022 Release Notes
- March 2022 Release Notes
- February 2022 Release Notes
- January 2022 Release Notes
- 2021
- API Upgrades Changelog
- API Upgrade Overview
- 2025
- 2024
- 2024-12 Campaign Manager 360 Fields Update
- 2024-10 Shopify API Upgrade v2024-10
- 2024-10 Monday API Upgrade v2024-10
- 2024-08 Display & Video 360 Upgrade v3
- 2024-08 Google Ads Upgrade v17
- 2024-08 LinkedIn Social Upgrade v202406
- 2024-08 LinkedIn Ads Upgrade v202406
- 2024-08 Google Ad Manager Upgrade v202405
- 2024-07 Pendo Upgrade
- 2024-05 Adjust API Upgrade
- 2024-05 Search Ads 360 API Upgrade
- 2024-05 Klaviyo Legacy v1 and v2 to v2024-02-15 API Migration
- 2024-03 Facebook Social Deprecated Metrics
- 2024-02 Shopify API Upgrade v2024-01
- 2024-02 Google Ads API Upgrade V15
- 2024-02 Airtable API Keys Deprecation
- 2024-01 Instagram Social Upgrade to Graph API v18
- 2024-01 Criteo API Upgrade v2023-10
- 2024-01 Google Ad Manager upgrade to v202311
- 2024-01 Monday API Upgrade v2024-01
- 2024-01 Facebook Ads - Marketing API Upgrade v18
- 2023
PostgreSQL Change Data Capture (CDC) Overview
What is Change Data Capture Extraction?
Rivery's Change Data Capture (CDC) extraction mode provides a real-time stream of any changes made to the databases and tables configured, eliminating the need to implement and maintain incremental fields or retrieve data via select queries. It also allows you to retrieve deleted rows and schema changes from the database.
How Does Change Data Capture Extraction Work?
The Change Data Capture (CDC) mechanism is based on aWrite-Ahead Log (WAL) provided by the PostgreSQL database. The logical decoding feature of PostgreSQL is available in versions 10 and higher.
In PostgreSQL, the logical decoding mechanism creates a consistent and easy-to-understand format in one place. The logical decoding mechanism in PostgreSQL, on the other hand, is based on converting the data written to the WAL using an output plugin that converts the WAL transaction rows into a dedicated format. To be able to consume data from the database, this output plugin must be installed on the PostgreSQL server and the logical decoding configuration must be enabled. See the PostgreSQL configuration docs for more information on setting up a PostgreSQL database for Rivery CDC extraction.
Please Note:
Rivery offers CDC extraction mode for PostgreSQL on both Google Cloud SQL and Amazon RDS / Aurora.
Rivery uses the output plugin to continuously pull new rows in the WAL in order to pull data using the CDC architecture. In order to get historical data from the database, Rivery cannot rely on the entire log history already existing prior to setting up a river. This is because PostgreSQL normally keeps track of the WAL and purges it after a certain amount of time.
Rivery uses the Overwrite loading mode to create a full snapshot (or migration) of the chosen table(s) in order to align the data and metadata as it was on the first run. After the migration is complete, Rivery takes the existing WAL records and performs an Upsert-merge to the target table(s), while continuing to fetch new records from the log as they are created.
Rivery's PostgreSQL connector reads WAL records and generates change events in the FileZone files for row-level INSERT, UPDATE, and DELETE commands. Each file represents a set of database operations performed over a period of time. The data from the log is continuously streamed into the FileZone path determined in the river (with timeframes of no more than 15 minutes) and pushed into the target by the river's scheduled frequency. This method saves the data first in the FileZone, and then it can be pushed into the target DWH at any time.
FileZone is covered in further detail in the Target documentation.
CDC Point in Time Position Feature
The CDC "Point in Time" Position feature allows users to gain deeper insights into the operational details of a River's streaming process. This functionality is essential for data recovery and synchronization, enabling users to locate and retrieve data from a specific point in history using the exact information stored in the CDC log position. For additional information, refer to our documentation.
A Sequence Change Data Capture Deployment
Discrepancies in transaction records can arise when two users simultaneously execute identical transactions, causing conflicts in the timestamp field.
Recognizing this challenge, Rivery has implemented a "sequence" Change Data Capture (CDC) mechanism to tackle this issue.
Rivery has enhanced each emitted record from the database by incorporating two extra metadata fields: '__transaction_id' and '__transaction_order'.
The '__transaction_id' field serves as a unique identifier for each transaction, ensuring that no two transactions share the same identifier. This uniqueness allows for precise identification and differentiation between transactions, mitigating conflicts arising from identical timestamps.
Furthermore, the '__transaction_order' field denotes the order in which the transactions were emitted from the database. By incorporating this field, the sequencing of transactions can be accurately maintained, enabling downstream systems such as Apache Kafka or AWS Kinesis to process and order transactions correctly.
The inclusion of these metadata fields guarantees that the ordering of transactions is preserved throughout the River. As a result, smooth and accurate transaction flows can be achieved, resolving the discrepancies that previously arose from transactions with identical timestamps.
The additional fields are depicted in this table:
For further information about Change Data Capture (CDC) Metadata Fields, please refer to our Database Overview document.
Partitioned Tables in PostgreSQL
Rivery supports PostgreSQL partitioned tables in both Standard Extraction and Change Data Capture (CDC).
For PostgreSQL versions 13 and later, partitioned tables can be replicated using logical replication. More details on logical replication limitations can be found in PostgreSQL's documentation.
Configuring Logical Replication for Partitioned Tables
To enable logical replication, you must create a publication for your tables. Publications allow selective replication, meaning you can add or remove tables dynamically. Only tables included in a publication will be replicated to Rivery. Each database can have multiple distinct publications.
Steps to Create a Publication
Ensure you have CREATE privileges or higher.
Create a publication for all tables using the following command:
CREATE PUBLICATION rivery_pub FOR ALL TABLES WITH (publish_via_partition_root=true);
Please Note:
The publication name 'rivery_pub' is provided as an example. The actual publication name must be unique for each database and cannot start with a number.
Partitioning Method Support
Rivery supports only declarative partitioning using 'CREATE TABLE ... PARTITION BY' command. This approach is required for proper change tracking in PostgreSQL version 13 and later.
Legacy inheritance-based partitioning is not supported.
Troubleshooting
Verifying Logical Replication
Before enabling CDC Streaming in Rivery, verify that logical replication is working correctly by running the following query:
SELECT *
FROM pg_logical_slot_peek_binary_changes('rivery_slot', null, null, 'proto_version', '1', 'publication_names', 'rivery_pub');
If CDC streaming is already enabled on a River that reads from the same slot, the query will fail with the following error:
ERROR: replication slot "rivery_slot" is active for PID 12345
To resolve this issue:
- Disable CDC streaming on the relevant River.
- Run the query again to verify replication.
How to Enable Change Data Capture Extraction?
After you've established a connection, go to the Source tab and do the following:
- Choose the Multi-Tables as the River mode.
- Select the Log Based as the extraction mode.
- A brief reminder appears, encouraging you to check your connection and set up your Source and Target, which will happen next. Select 'Got It' to proceed.
- Choose Log Based Version from the Advanced Options drop-down menu.
- Set the name of the custom replication slot (optional).
- Set a custom publication name (optional).
- Turn the 'Enable Log' toggle to true at the bottom of the page.
- A new window pops up, guiding you on the following steps. Select 'Target'.
- You'll be directly sent to the Target tab, where you can choose your data target.
- Define your Target Connection, then choose a Database and a Schema to load into.
- To make sure the operation is complete, click the Enable Log toggle once more.
- To navigate to the Schema tab, click Schema.
- Click 'Show Me My Schemas'.
- A minimum of one Schema must be chosen.
- To use CDC, the table must contain a key, which is defined as the Source Column - id.
- Navigate to the 'Enable Log' toggle and select the number of hours you want to run the River, then click 'I'm Sure'.
- Wait for the Initiation Log to complete its processing.
- You're all set to run your log-based River now.
- Following the completion of the River Run, return to the Schema tab and check that the Table you selected is Streaming.
Type Mapping in CDC
While extracting your data, we match PostgreSQL data types to Rivery data types.
The mapping ofPostgreSQL data types to Rivery-compatible types is shown in the table below:
PostgreSQL Type | Rivery Type |
BOOLEAN | BOOLEAN |
BYTEA | STRING |
TOAST | -Not Supported- |
CHAR | STRING |
VARCHAR | STRING |
TEXT | STRING |
JSON | JSON |
JSONB | JSON |
DATE | STRING |
FLOAT4 | STRING |
FLOAT8 | STRING |
INT2 | INTEGER |
INT4 | INTEGER |
INT8 | INTEGER |
INTERVAL | STRING |
NUMERIC | FLOAT |
TIME | STRING |
TIMESTAMP | STRING |
TIMESTAMPTZ | STRING |
UUID | STRING |
XML | STRING |
Limitations
- The 'TOAST' file format is not supported.
- Any special character will be replaced with an underscore. As a result, if you wish to edit the table name, go to:
1. The 'Schema' tab
2. Select a Table
3. Click 'Table Settings'
4. Choose 'Edit' to change the table name manually.
Current Stream Position in Your Database
To confirm the Stream position, run the following command on the server:
SELECT pg_current_wal_lsn() - '0/0' AS lsn_position;
Table of contents
- What is Change Data Capture Extraction?
- How Does Change Data Capture Extraction Work?
- CDC Point in Time Position Feature
- A Sequence Change Data Capture Deployment
- Partitioned Tables in PostgreSQL
- How to Enable Change Data Capture Extraction?
- Type Mapping in CDC
- Limitations
- Current Stream Position in Your Database