SAP Hana Walkthrough
This guide will walk you through the steps needed to integrate data from an SAP Hana database (on-prem or managed service) into a cloud target using Rivery.
The following topics will be covered:
- Configure SAP Hana for Rivery
- Connect to SAP Hana from Rivery
- Pull SAP Hana data into a Target
- Schedule the River
- Monitor the River
CONFIGURE SAP HANA FOR RIVERY
Create a User for Rivery:
To extract data, we recommend creating a dedicated database user for Rivery:
CREATE USER <myUsername> PASSWORD <password>;
GRANT CREATE SESSION TO <myUsername>;
Switch to the database you want to extract data from, replacing <database> with your existing database name:
USE DB <database>;
Now, grant the new Rivery user permissions on the table or schema you want to extract data from. Use the commands below, replacing <schema> and <table> with your existing schema and table names.
- To grant SELECT operation to all of the tables in the database, use:
GRANT SELECT ANY OBJECT TO <myUsername>;
- To grant a SELECT operation to a specific schema and table, use:
GRANT SELECT ON "<schema>"."<table>" TO <myUsername>;
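If Rivery needs access to many schemas or tables, the grant statements above can be generated programmatically. A minimal sketch, assuming you maintain a list of (schema, table) pairs; the `grant_statements` helper and the object names are illustrative, not part of Rivery or SAP Hana:

```python
def grant_statements(user, objects):
    """Build GRANT SELECT statements for the given (schema, table) pairs.

    A (schema, None) pair grants SELECT on the whole schema.
    """
    stmts = []
    for schema, table in objects:
        if table is None:
            stmts.append(f'GRANT SELECT ON SCHEMA "{schema}" TO {user};')
        else:
            stmts.append(f'GRANT SELECT ON "{schema}"."{table}" TO {user};')
    return stmts

# Example: grant the Rivery user access to one schema and one specific table.
for stmt in grant_statements("RIVERY_USER", [("SALES", None), ("SALES", "ORDERS")]):
    print(stmt)
```

Run the generated statements in your SAP Hana SQL console as an administrator.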
Connect to SAP Hana from Rivery
Connect to SAP Database in Rivery:
To connect to your SAP Hana Database, you must first set up your database credentials.
Define your SAP Hana connection:
- In Rivery's main menu, go to Connections -> Create Connection, or create the connection directly from the source by pressing Create New Connection in the River wizard.
- In the pop-out window, define your credential details. Make sure to whitelist Rivery's IPs in your database firewall.
Please note - on the EU server there are 4 IPs to whitelist in the database firewall: 52.14.86.20, 34.254.56.182, 13.58.140.165 and 52.14.192.86.
Define your Host, Port, Database (SAP Hana schema), Username, and Password.
Recommended:
Use SSH tunneling to connect your on-premise database server to Rivery.
Rivery can connect to on-prem servers using a hybrid method of data processing; we recommend the SSH tunneling method.
To use SSH tunneling with Rivery, configure an SSH tunnel server in your network. This server must accept SSH connections from Rivery's IPs and be able to reach your database server via its internal IP or address. Rivery connects to that SSH server using a private key.
After you've configured SSH tunneling, you can set it up in the Rivery Connection:
Under the SSH Options section of the SAP Hana connection pop-up, you can define the following:
- SSH Hostname
- SSH Port (usually 22)
- The username Rivery should use to connect to the SSH server
- Password (optional)
- SSH Pem Key (recommended)
- Pem Password, if the key file has one
Test the connection and Save.
Pull SAP Hana Data Into Your Target
Using Rivery, you can pull data from your SAP Hana tables and send that data into your target database.
First, select 'Create New River' from the top right of the Rivery screen.
Choose 'Data Source to Target' as your river type.
In the 'General Info' tab, name your river and give it a description. Next, navigate to the 'Source' tab.
Find SAP Hana in the list of data sources and select it:
Define a Source Connection (this will be the connection created earlier in the process). If you do not yet have an SAP Hana connection in your Rivery account, you can create a new connection here by clicking 'Create New Connection.'
Next, choose your River mode.
- Multi-Tables : Load multiple tables simultaneously from SAP Hana to your target.
- Custom Query : Create a custom query and load it into your target.
- Legacy River : Choose a single source table to load into a single target.
Multi-Table Mode:
Make sure 'Multi-Tables' is selected in the Source tab. Navigate to the Target tab.
Select your desired Target:
Depending on what Target you select, the Target tab will look a little bit different. For relational database targets, in the Target tab you will identify the location where the data is to land (database and schema). You also have the option to define a table prefix.
Next, you can define the default loading mode for your multi-table migration.
For storage targets (Google Cloud Storage, Amazon S3, Azure Blob Storage) the Target tab will allow you to identify the bucket and path for your data to land.
Once your desired target location is configured, navigate to the 'Mapping' tab.
Select 'Show Me My Schemas' to pull the metadata of your SAP Hana database.
Once the metadata has been detected, you'll see a list of schemas:
Click into the desired schema to list the tables within that schema.
Checkboxes allow you to pull in specific tables, or to select all tables by clicking the checkbox next to the 'Table Name' header.
Although a default loading mode is defined in the Target tab, the loading mode for any individual table can be changed by clicking the 'Edit' button on the far right.
Once you click 'Edit', the metadata of the corresponding table will be displayed with additional options to update.
On the 'Columns' tab you are able to edit the following:
- Select/deselect columns to bring into the target
- Rename the field in the 'Target column'
- Cast the field to a different data type (by double-clicking the current data type under 'Type')
- Change the mode of the field under 'Mode'
- Add a field to be a Cluster Key by checking the 'Cluster Key' checkbox (this is used for partitioning).
- Define a field as the key for Upsert-Merge by clicking the key to the left of the field name to highlight it
On the 'Table Settings' tab you are able to edit the following:
- Change the loading mode
- Change the extraction method. If 'Incremental' is selected, you can then define which field will be used to define the increment.
- Filter by an expression that will be used as a WHERE clause to fetch the selected data from the table.
If a date field is chosen as the increment, you'll be prompted for a start date. If an integer field is chosen, you'll be prompted for a starting value.
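Conceptually, the incremental field, its starting value, and the filter expression all combine into a single WHERE clause on the extraction query. A sketch of how such a query might be assembled, assuming illustrative names throughout (`build_extract_query` is a hypothetical helper; Rivery composes the real query internally):

```python
def build_extract_query(schema, table, columns, incremental_field=None,
                        start_value=None, filter_expr=None):
    """Compose a SELECT with optional incremental and filter WHERE clauses."""
    conditions = []
    if incremental_field is not None and start_value is not None:
        # Incremental extraction: only rows at or past the starting value.
        conditions.append(f'"{incremental_field}" >= {start_value!r}')
    if filter_expr:
        # User-supplied filter expression, applied as an extra condition.
        conditions.append(f"({filter_expr})")
    sql = f'SELECT {", ".join(columns)} FROM "{schema}"."{table}"'
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql

print(build_extract_query("SALES", "ORDERS", ["ID", "AMOUNT"],
                          incremental_field="UPDATED_AT",
                          start_value="2023-01-01",
                          filter_expr="STATUS = 'CLOSED'"))
```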
Once all desired table edits have been made, you can navigate back to the schema level of your 'Mapping' tab to make more edits.
Before moving on to the 'Schedule' tab to schedule your river, try clicking the 'Run' button at the bottom of the screen to test it first.
Once you've started running your river, you'll see a status bar appear at the bottom of the screen.
Custom Query Mode:
Pulling data from a custom query:
You may also use Custom Query mode to define the data pull with your own query. You may use any query that works in the source, as long as it is a single SELECT statement with no other statements; Rivery does not support multi-statement queries or SQLScript in the custom query field.
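Because the field accepts only a single SELECT, a naive pre-check like the one below can catch multi-statement queries before you paste them in. This is an illustrative helper, not a Rivery feature, and it deliberately ignores edge cases such as semicolons inside string literals:

```python
def is_single_select(sql):
    """Return True if sql looks like one SELECT statement with no extras."""
    stripped = sql.strip().rstrip(";")
    if not stripped.upper().startswith("SELECT"):
        return False
    # A remaining semicolon suggests a second statement follows.
    return ";" not in stripped

print(is_single_select("SELECT * FROM ORDERS"))          # True
print(is_single_select("DELETE FROM ORDERS; SELECT 1"))  # False
```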
Extract Method:
Using Rivery, you can pull your data incrementally or pull the entirety of the data that exists in the table:
- All: Fetch all data in the table using Chunks.
- Incremental: Run over a column in the table (or in the top-level SELECT of your custom query), filtering by Start and End dates or Epoch times.
Moreover, you may choose to run over the date using Daily, Monthly, Weekly, or Yearly chunks.
Please define the incremental Field to be used in the Incremental Field section. After choosing the incremental field, choose the Incremental Type and the dates/values you would like to fetch.
Note: Rivery manages the increments across runs using the maximum value in the data. This means each run fetches all data since the previous run, which prevents data holes. You only need to configure your river once.
Recommended: Define your incremental field in Rivery over a field with an Index or Partitions key in the table.
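The maximum-value bookkeeping described in the note above can be sketched as follows. This is a simplified illustration of the idea, not Rivery's implementation; `daily_chunks` and `run_increment` are hypothetical names, shown here with daily chunks over a date column:

```python
from datetime import date, timedelta

def daily_chunks(start, end):
    """Yield each day from start to end inclusive (one chunk per day)."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)

def run_increment(rows, last_max):
    """Fetch rows newer than last_max and advance the stored maximum."""
    new_rows = [r for r in rows if r["updated"] > last_max]
    new_max = max((r["updated"] for r in new_rows), default=last_max)
    return new_rows, new_max

rows = [{"id": 1, "updated": date(2023, 1, 1)},
        {"id": 2, "updated": date(2023, 1, 3)}]
fetched, last_max = run_increment(rows, date(2022, 12, 31))
print(last_max)  # 2023-01-03 -- the next run starts from here
```

Because the next run filters on the stored maximum rather than on wall-clock time, late-arriving rows between runs are never skipped.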
Limit and Auto Mapping:
After defining the extract method, you may choose a limit of top N rows to fetch. Rivery will set your Schema using the Auto Mapping feature. You can also choose fields you want to fetch in the Mapping table and add fields on your own.
Legacy River Mode:
This river mode loads a single source table into a single target table.
Here, define the source table to pull (Rivery will auto-detect available schemas and tables), the extraction method to use, and any filters or row limits on the data pull.
SCHEDULE THE RIVER
Once the creation of the river is complete, navigate to the 'Schedule' tab and click 'Schedule Me.'
Choose the frequency at which to schedule the river.
To notify certain users about a river failure or warning, enable notifications below:
You can edit your {Mail_Alert_Group} in the Variables page (find this in the left-hand pane of the browser).
MONITOR THE RIVER
Check out the Targets section to find out how to load the data to your target data warehouse.