MariaDB Walkthrough
  • 6 Minutes to read
  • Dark
    Light
  • PDF

MariaDB Walkthrough

  • Dark
    Light
  • PDF

This guide will walk you through the steps needed to integrate data from a MariaDB database (on-prem or managed service) into a cloud target using Rivery.

The following topics will be covered:

Note

This connector can also be used for AWS Aurora (with MariaDB compatibility).

Configure MariaDB For Rivery

Note

The following connection instructions will allow Rivery to connect to MariaDB for standard extraction. For log-based extraction, follow the configuration guidelines here.

Create a User for Rivery

In order to extract data, it's recommended to create a new user in the database for Rivery.
Replace the <RiveryUsername> and <password> with your own values.

Note

The Password may contain the following special characters: @.,/:+-!=~*$^=
If you wish to use % as well, make sure it will not end up in the following way (represented as regex template):
%[8-9a-fA-F][0-9a-fA-F]
Each [..] represents one character. Here are a few examples you cannot use as a part of the password:
%80, %81,..,%89, %8a,..,%8f, %8A,..%8A,..,%a0,..,%FF
Possible solution: Add a valid character between % and the first [..], e.g., %180.

  • For MariaDB versions <= 8.0.3:
  CREATE USER  ' < RiveryUsername > '@'%' IDENTIFIED BY  '< password >';
  • For MariaDB versions >= 8.0.4:
CREATE USER '< *RiveryUsername* >'@'localhost' IDENTIFIED WITH  mariadb_native_password  BY '< password >';

Now, grant your username with the following privileges in your MariaDB Database:

GRANT SELECT, SHOW DATABASES, SHOW VIEW, EXECUTE ON *.* TO '< RiveryUsername >';

Open Rivery IPs in MariaDB Database Firewall

In order to make a successful connection, there is a need to open Rivery IPs in the MariaDB (or SSH Tunnel) server firewall.

Open the IPs for the wanted MariaDB port, if it's a direct connection from Rivery to the database server, or use the wanted SSH Tunnel ports if the connection to the DB runs over an ssh tunneling.

The IPs that need to be opened are:



Here's the most recent information about whitelisting Rivery IPs.

Pull MariaDB Data into a Target

Using Rivery, you can pull data from your MariaDB tables and send that data into your target database.

First, select 'Create New River' from the top right of the Rivery screen.

image

Choose Data Source to Target as your river type.

In the 'General Info' tab, name your river and give it a description. Next, navigate to the 'Source' tab.

Find MariaDB in the list of data sources and select it:

Define a Source Connection (this will be the connection created earlier in the process). If you do not yet have a MariaDB connection in your Rivery account, you can create a new connection here by clicking 'Create New Connection.'

Next, choose your River mode.

image

  • Multi-Tables : Load multiple tables simultaneously from MariaDB to your target.
  • Custom Query : Create a custom query and load it into your target.
  • Union Tables : Merge several tables with the same metadata into a single table.
  • Legacy River : Choose a single source table to load into a single target.

Database Migration

For a detailed walk-through of Multi-Tables mode, please refer to the Rivery Database Migration docs.

Pulling Data Using a Custom Query

You may also use the Custom Query in order to define a query data pull from Rivery. You may use any query that works in the source, using a specific SELECT query without any other statements. Rivery isn’t compatible with using multi-statement or SQL Script in the custom query field.

image

Extract Method:

Using Rivery, you can pull your data incrementally or pull the entirety of the data that exists in the table:

image

  • All: Fetch all data in the table using Chunks.
  • Incremental: Gives you the option to run over a column in a table or your custom query top SELECT. You can run over it by filtering Start and End dates or Epoch times

Moreover, you may choose to run over the date using Daily, Monthly, Weekly, or Yearly chunks.

Please define the incremental Field to be used in the Incremental Field section. After choosing the incremental field, choose the Incremental Type and the dates/values you would like to fetch.

Note: Rivery will manage the increments over the runs using the Maximum value in the data. This means you will always get the entire data since the last run, which prevents data holes. You just need to configure your river once.

Recommended: Define your incremental field in Rivery over a field with an Index or Partitions key in the table.

Special note with a use case: Rivery updates the Start Date after a river is finished to be as the previous End Date.
Setting the river to increment in this case will run until the last increment value (included if you checked the "include end value"), but the Start Date will be the one before the last increment value so the next river run will pick up the last record along with other records.
E.g., If you have one record each day, and you execute a river 7 days ago until today, the new Start Date will be yesterday (where the last increment is today).

image

Limit and Auto Mapping:

After defining the extract method, you may choose a limit of top N rows to fetch. Rivery will set your Schema using the Auto Mapping feature. You can also choose fields you want to fetch in the Mapping table and add fields on your own.

image.png

Union Tables Mode

For like-named tables with identical schemas, the Union Tables river mode can be used to ingest these tables in a single process. Simply enter the table prefix using '*' as a wildcard and then

image

You can click 'Search Tables' below to return a list of tables in the database.

image

Next, choose your extraction method and map the columns by clicking 'Auto Mapping.' Note - the column names and data types are expected to be identical in all tables that are selected to load through the Union Tables mode.

image

Legacy River Mode

This river mode allows for a load of a single source table into a single target table.

image

In the above screenshot, there are steps to define the source table to pull (Rivery will auto-detect available schemas and tables), the extraction method to use, and options for filters or row limits on the data pull.

Multi-Table Mode

Load multiple tables simultaneously from SQL Server to your target. There are two Default Extraction Modes: Standard Extraction and System Versioning. The system versioning method enhances the standard extraction by getting the latest version of each record in a given time span.

image.png
On the 'Table Settings' tab you are able to edit the following:

  • Change the loading mode
  • Change the extraction method. If 'Incremental' is selected, you can then define which field will be used to define the increment.
  • Filter by the expression that will be used as a WHERE clause to fetch the selected data from the table.

Schedule the River

Once the creation of the river is complete, navigate to the 'Schedule' tab and click 'Schedule Me.'

image

Choose the frequency at which to schedule the river.

image

To notify certain users about a river failure or warning, enable notifications below:

image

You can edit your {Mail_Alert_Group} in the Variables page (find this in the left-hand pane of the browser).

Monitor the River

During the river run, or after the run has completed, you can monitor the river in its 'Activities' tab.

image

In this tab, you can monitor the status of the current river run. For the Multi-Table mode, you can monitor at the table level (see above).

By toggling between 'Run View' and 'Target View' you can see the river results grouped either by time of run or by target location.

Check out the Targets section to find out how to load the data to your target data warehouse.

System Versioning

This extraction method is only available on system versioned tables. When selecting default extract method = system versioning and checking a table it will automatically detect if this table is system versioned and will set the 'system versioning' extract method. If it's not a system versioned table the extraction method will be initialized to 'all'.

In the following example checking all tables results in the default extract method = system versioning for test_table1 and test_table2 (which are system versioned) and test_table3 is not system versioned:

More information about system versioning can be found here. In general, we check if the table is versioned by checking the field TABLE_TYPE in INFORMATION_SCHEMA.TABLES.


Was this article helpful?