Email Walkthrough
A guide to getting data from any email into Rivery.
Prerequisites
Create an Email Connection.
How to pull data from Email using Rivery
First, select 'Create New River' from the top right of the Rivery screen.
Choose 'Data Source to Target' as your river type.
In the 'General Info' tab, name your river, describe it and choose a group.
Next, navigate to the 'Source' tab.
Find Email in the list of data sources (under Organization) and select it.
1. Under Source Connection, select the connection you created, or create a new one.
2. Date range -
Pulls data in the date range between the start and end date provided, including the end date.
You must select a start date.
If you leave the end date empty, data is pulled up to the current time of the river's run.
Please Note:
- Rivery pulls emails from the specified date range; if multiple runs occur on the same day, only the emails received since the last run are pulled.
The 'Entire Day' checkbox overrides this default behavior and pulls the entire day whether or not there are new emails, so duplicates may occur.
- The Start Date won't be advanced if a River run is unsuccessful.
If you don't want this default setting, click More Options and check the box to advance the start date even if the River run is unsuccessful (Not recommended).
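The date range rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not Rivery's implementation; `resolve_window` and `in_window` are hypothetical names, and treating the end date as inclusive up to the last moment of that day is an assumption.

```python
from datetime import date, datetime, time
from typing import Optional, Tuple


def resolve_window(start: date, end: Optional[date],
                   run_time: datetime) -> Tuple[datetime, datetime]:
    """Return the inclusive time window for one run (hypothetical helper)."""
    window_start = datetime.combine(start, time.min)
    if end is None:
        # Empty end date: pull up to the time of the river's run.
        window_end = run_time
    else:
        # The end date itself is included, so extend to the end of that day.
        window_end = datetime.combine(end, time.max)
    return window_start, window_end


def in_window(received: datetime, window: Tuple[datetime, datetime]) -> bool:
    """True if an email received at `received` falls inside the window."""
    lower, upper = window
    return lower <= received <= upper
```

For example, with a start date of 2023-01-01 and an end date of 2023-01-03, an email received late on January 3rd is inside the window, while one received at midnight on January 4th is not.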
3. Add the desired filter attributes:
To see how those filters work, we advise you to try it in your Gmail account as follows:
Click the downward arrow next to the search mail box.
Select Advanced, and set the filters you want.
When you finish, click Search to check that the mail you would like to extract with Rivery shows up in your Gmail.
Testing filename filter in Gmail's UI:
This filter's full detail is shown below:
Gmail may fail to recognize a pattern with * after a non-alphanumeric character. In the following example no files were detected:
Another example that worked with multiple asterisks (*):
So it is best to play around with Gmail's search to filter your results before using it in Rivery.
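When experimenting in Gmail, it can help to assemble the search string from the same filters you plan to set in Rivery. The sketch below uses Gmail's standard search operators (`from:`, `label:`, `subject:`, `filename:`); the function name `build_gmail_query` and the mapping of Rivery's filter fields to these operators are assumptions made for illustration.

```python
from typing import Optional


def build_gmail_query(sender: Optional[str] = None,
                      label: Optional[str] = None,
                      subject: Optional[str] = None,
                      has_words: Optional[str] = None,
                      filename: Optional[str] = None) -> str:
    """Build a string to paste into Gmail's search box (hypothetical helper)."""
    parts = []
    if sender:
        parts.append(f"from:{sender}")
    if label:
        parts.append(f"label:{label}")
    if subject:
        # Parentheses group several subject words under one operator.
        parts.append(f"subject:({subject})")
    if filename:
        parts.append(f"filename:{filename}")
    if has_words:
        # Free-text words are searched without an operator.
        parts.append(has_words)
    return " ".join(parts)


query = build_gmail_query(sender="reports@example.com",
                          subject="daily report",
                          filename="report.csv")
print(query)
```

Paste the printed string into Gmail's search box; if the expected emails appear, the same values are a reasonable starting point for the Rivery filters.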
Once you can find your emails, set the same filters in Rivery's Gmail source:
- From - The email address from which the messages were received.
- Label - Choose a label from a list of optional labels.
- Subject - The entire subject or a few words that the subject contains.
- Has the word - Word that is contained in the mail content.
- Body - Words that are contained in the mail content.
- File Name - The filename that you want to pull.
- File Type - Choose the file type you want to pull.
- Specify whether the file/sheet has a header. If there is a header, specify its position.
- If the email contains a link to download the file from, check the checkbox to download the report:
DD/MM/YYYY Timestamp
To work with DD/MM/YYYY timestamps, check the box at the bottom of the river page:
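To see why this setting matters, consider how the same string reads under a day-first and a month-first format. The snippet below is a generic Python illustration, not Rivery's parsing code; the month-first parser is included only for contrast and is not a statement about Rivery's default.

```python
from datetime import datetime


def parse_day_first(value: str) -> datetime:
    """Parse a DD/MM/YYYY string."""
    return datetime.strptime(value, "%d/%m/%Y")


def parse_month_first(value: str) -> datetime:
    """Parse an MM/DD/YYYY string (shown only for contrast)."""
    return datetime.strptime(value, "%m/%d/%Y")


raw = "03/04/2021"
print(parse_day_first(raw).date())    # 3 April 2021
print(parse_month_first(raw).date())  # 4 March 2021
```

A value such as "25/12/2021" can only be parsed day-first; the month-first parser raises a `ValueError` for it.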
Source Auto Mapping
The last step in the source page is the Auto Mapping feature.
Rivery will automatically set your mapping based on the data returned from the files/sheets you've selected.
You can also modify the generated fields as follows:
- Choose which fields you want to fetch in the Mapping table, add fields of your own, or remove unwanted fields.
- Check the "fill empty" box if you want empty cells to take their value from the closest cell above.
- Define each field as source or static. A source field comes from your file/sheet, and a static field lets you add an expression of your own.
- Static puts the same repeated value in every cell under the respective field alias, and source does a regex comparison to the field name. Several examples of such expressions (remove the quotation marks " "):
- The expression: ".*?\d+" copies everything in a given cell up to and including the first run of digits. ('id123abc' -> 'id123')
- The expression: "\w+ \w+" will take the first 2 words. ('first second third' -> 'first second')
- Other expressions supported by Python's re.match are described here: https://docs.python.org/2/library/re.html
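The mapping options above can be tried out locally before setting them in Rivery. The following is a minimal sketch, assuming that a source expression is applied to a cell value with Python's `re.match` and that "fill empty" copies the closest value above; `apply_source_expression` and `fill_empty` are hypothetical helper names, not part of Rivery.

```python
import re
from typing import List, Optional


def apply_source_expression(pattern: str, cell: str) -> Optional[str]:
    """Return the part of `cell` matched from its start, or None if no match."""
    match = re.match(pattern, cell)
    return match.group(0) if match else None


def fill_empty(column: List[Optional[str]]) -> List[Optional[str]]:
    """Replace empty cells with the closest non-empty value above them."""
    filled = []
    last_value = None
    for value in column:
        if value is None or value == "":
            filled.append(last_value)
        else:
            last_value = value
            filled.append(value)
    return filled


print(apply_source_expression(r"\w+ \w+", "first second third"))  # first second
print(apply_source_expression(r".*?\d+", "order 7 shipped"))      # order 7
print(fill_empty(["a", "", None, "b", ""]))  # ['a', 'a', 'a', 'b', 'b']
```

Because `re.match` anchors at the start of the string, an expression that cannot match from the first character returns no value at all.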
Common Issues
If no data is found for a day within the chosen time range, no "warning notification" is sent to the user. To still be notified when an empty response is received, manually add the field "email_date" in the automapping stage.
In addition, you can add the email_date field, of type TIMESTAMP, to get the date on which the file was received in your Gmail:
Then perform a new automapping to get the new meta-data field:
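Once email_date is available in the target table, days with no data can also be spotted after the fact. The sketch below assumes the email_date values have already been read back from the target as Python `datetime` objects; `days_without_emails` is a hypothetical helper and not a Rivery feature.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def days_without_emails(email_dates: Iterable[datetime],
                        start: date, end: date) -> List[date]:
    """List every day in [start, end] that has no email_date value."""
    seen = {stamp.date() for stamp in email_dates}
    missing = []
    day = start
    while day <= end:
        if day not in seen:
            missing.append(day)
        day += timedelta(days=1)
    return missing


loaded = [datetime(2023, 1, 1, 9, 0), datetime(2023, 1, 3, 14, 30)]
print(days_without_emails(loaded, date(2023, 1, 1), date(2023, 1, 3)))
```

In this example the only day reported is 2 January 2023, since emails were loaded for the 1st and the 3rd.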