- 3 Minutes to read
SQL Server Change Tracking
- 3 Minutes to read
Change Tracking captures the fact that rows in a table have changed, but does not capture the data that was changed or how many times a row has been change. This enables applications to determine which rows have been changed and when that change occurred. Therefore, Change Tracking is more limited in the historical questions it can answer compared to Change Data Capture. However, for those applications that do not require historical information, there is far less storage overhead because the changed data itself is not captured.
Change Tracking saves the change information by saving the primary key column of each row that was changed. It saves a version number for each INSERT/UPDATE/DELETE operation. To obtain the latest data for changed rows, an application can use the primary key column values to join the source table with the tracked table. In each execution, the application can track the latest changes by version numbers.
How does Change Tracking Extraction work in Rivery?
In order to align the data and the metadata in the first run, Rivery makes a migration of the chosen table(s) using the Overwrite loading mode. After the initial migration completes successfully, Rivery saves the database's current change tracking version in the river settings. The following river executions will extract data from this version until the newest version at the current execution time, and load the data using the Upsert-Merge loading mode to the target table(s).
Enable Change Tracking for Tables
To track changes, change tracking must first be enabled for the database, and then enabled for the tables that you want to track within that database. The table definition does not have to be changed in any way, and no triggers are created.
To enable change tracking for a DB, use the following ALTER operation:
ALTER DATABASE AdventureWorks2012 SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON)
To enable change tracking for a specific table, use the following ALTER operation:
ALTER TABLE Person.Contact ENABLE CHANGE_TRACKING
You can find more information regarding change tracking settings in Microsoft SQL Server docs.
Using Change Tracking
Select the Change Tracking option in the Source tab:
For the first river execution, make sure that all tables are checked as "Initiate Migration" in the Mapping tab:
This will extract all the source table's data into your target table and synchronize the river with the DB last change tracking version.
The following executions will extract data from the last synchronization version until the database's current version.
In the Activities tab you can find which versions were extracted for each execution:
Microsoft-SQL Change Tracking with Snowflake target rivers features our 'Include Deleted Rows' functionality.
This option allows you to keep track of data that has been deleted from a source table.
To activate "Include Deleted Rows", follow these steps:
1. In Microsoft SQL Server river, click on Schema.
2. Choose a dedicated Source Table.
3. Select Table settings.
4. Check Initiate Migration, and then Overwrite Table Migration to match the existing database in SQL Server with the Snowflake table (optional).
Skip this step if you want to keep working with the data in Snowflake that has been marked as deleted.
After the first Initiate Migration is completed, The checkbox is de-checked.
5. Click Include Deleted Rows.
If 'Include Deleted Rows' is selected, the river will retrieve all changes from the Change Tracking table, including deleted rows.
It also adds a '__DELETED' column to indicate if the row was deleted or not from the source table.
6. Enable Remove Deleted Rows (optional).
This option is permitted only by selecting 'Upsert - Merge' in Loading Mode, as seen here:
By enabling 'Remove Deleted Rows', all rows having the status 'True' (Deleted) will be eliminated and removed from Snowflake target table.
7. Click Run.
8. Connect to Snowflake.
When any database restore operation is made for a table, please check the “Initiate Migration” checkbox to initialize the table after the restore operation. This will load all the data from the source table into the target table to make sure it's synchronized and to prevent data loss.
Cleanup During River Extraction
While the data is being extracted, a database cleanup process may remove change tracking data that is older than the designated retention period.
By the time the changes are obtained, the most recent synchronization version might no longer be valid. Therefore, an error will pop up - "Min valid version is greater than the last synchronization version. Please reinitialize the table".
To re-initialize the table, navigate to your River's Schema and choose "Set initiate migration for all tables" from the 3 dots menu.
All of the tables you've chosen should have the status "Waiting For Migration," which denotes that your synchronization version has been reset.