Blueprint Components and Configuration
Blueprint Now Available in Private Preview
We are excited to announce that our new Blueprint engine is now available to a select group of preview customers!
If you're interested in joining the private preview, click here to request access.
Understanding the Blueprint YAML Structure
Blueprints in Rivery provide a structured way to define custom connectors and data workflows using YAML. This YAML configuration consists of various components that define how data is fetched, processed, and stored. Each Blueprint is composed of:
- Interface Parameters – Define user-configurable parameters, such as authentication and date filters.
- Connector Configuration – Specifies the API or system being connected to, including authentication and endpoints.
- Variable Metadata – Defines variables for storing and processing data.
- Storage Configuration – Determines where processed data and temporary variables are stored.
- Steps – Define the workflow, including API requests, data transformations, and error handling.
- Retry Strategies – Improve reliability by automatically retrying failed API calls.
By understanding these core components, users can efficiently build scalable, reusable custom connectors within Rivery.
How to Create a River Using Blueprint
Blueprints offer a powerful and flexible way to create and manage Rivers in Rivery. This guide provides a detailed flow to help you create a River using Blueprints, with an emphasis on defining and using interface parameters.
Step-by-Step Flow for Creating a River
1. Add Interface Parameters
Interface parameters define the dynamic inputs for your River. These parameters will be exposed on the River that uses the Blueprint as its source. They essentially represent the configuration that the API expects to receive in order to retrieve data correctly. These parameters can include authentication details, domain names, or custom date ranges.
YAML Example:
interface_parameters:
  section:
    source:
      - name: "your_domain"
        type: "string"
        value: "rivery-jira"
      - name: "connectToAPI"
        type: "authentication"
        auth_type: "basic_http"
        fields:
          - name: "username"
            type: "string"
            value: "your_email@example.com"
          - name: "password"
            type: "string"
            value: "your_api_token"
      - name: "time_period"
        type: "date_range"
        period_type: "date"
        format: "YYYY-MM-DD"
        fields:
          - name: "start_date"
            value: "2024-01-01"
          - name: "end_date"
            value: "2024-01-31"
Explanation of Interface Parameters:
String Parameters:
- Define constant values like the domain name or specific strings.
- Example: your_domain specifies the domain for API calls.
Authentication Parameters:
- Use the connectToAPI parameter to configure authentication, specifying the type (basic_http, oauth, etc.) and credentials.
Date Range Parameters:
- Specify date ranges dynamically for tasks like fetching data for a specific period.
- Example: time_period defines start_date and end_date.
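To make the basic_http authentication type more concrete, the sketch below shows how a username and API token are conventionally combined into an HTTP Basic Authorization header (RFC 7617). It illustrates the standard scheme only and makes no claim about how Rivery builds the header internally; the function name basic_auth_header is hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build a standard HTTP Basic Authorization header (RFC 7617).

    Hypothetical helper for illustration; not part of Rivery.
    """
    # Credentials are joined with a colon, UTF-8 encoded, then base64 encoded.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder values taken from the YAML example above.
headers = basic_auth_header("your_email@example.com", "your_api_token")
print(headers)
```

Because the encoded value is trivially reversible, credentials like these are best supplied through the authentication parameter rather than hard-coded elsewhere in the Blueprint.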
2. Configure the Connector
The next step in creating a River using Blueprint is configuring the connector, which defines the API or system you will interact with.
YAML Example:
connector:
  name: JiraConnector
  type: rest
  base_url: "https://{your_domain}.atlassian.net/rest/api/2"

- name: Identifies the connector (e.g., JiraConnector).
- type: Specifies the type of connector, such as rest.
- base_url: The base endpoint for API requests.
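The base_url in this example contains a {your_domain} placeholder that matches the your_domain interface parameter. As a rough illustration of that substitution, assuming plain name-for-value replacement, the Python sketch below fills the placeholder from a dictionary of parameter values. The helper name resolve_base_url is hypothetical and this is not Rivery's actual resolution logic.

```python
def resolve_base_url(template: str, params: dict[str, str]) -> str:
    """Fill {name} placeholders in a base_url template from parameter values.

    Hypothetical helper for illustration only.
    """
    # format_map raises KeyError when the template names a parameter
    # that was not supplied, which surfaces configuration mistakes early.
    return template.format_map(params)


base_url = resolve_base_url(
    "https://{your_domain}.atlassian.net/rest/api/2",
    {"your_domain": "rivery-jira"},
)
print(base_url)  # https://rivery-jira.atlassian.net/rest/api/2
```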
3. Configure Variable Metadata
Declare variables that will be used in your workflow to store and process data.
YAML Example:
variables_metadata:
  task_data:
    storage_name: results_memory
    format: json
  processed_output:
    storage_name: results_file
    format: json
- task_data: Holds intermediate results from API calls.
- processed_output: Stores the final processed data.
4. Set Up Variable Storage
Define where variables and outputs will be stored. Rivery supports multiple storage types, including in-memory and file system storage.
YAML Example:
variables_storages:
  - name: results_memory
    type: memory
  - name: results_file
    type: file_system
    path: storage/results/filesystem
- results_memory: Temporary storage for intermediate variables.
- results_file: Permanent storage for final results.
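The difference between the memory and file_system storage types can be pictured with a small Python analogy: one store keeps values in a dictionary that disappears with the process, the other writes JSON files that survive it. The class names MemoryStorage and FileSystemStorage and their read/write methods are hypothetical and only model the concept; they are not Rivery's storage implementation.

```python
import json
import tempfile
from pathlib import Path


class MemoryStorage:
    """Keeps variables in a dict; contents are lost when the process ends."""

    def __init__(self) -> None:
        self._data: dict[str, object] = {}

    def write(self, name: str, value: object) -> None:
        self._data[name] = value

    def read(self, name: str) -> object:
        return self._data[name]


class FileSystemStorage:
    """Persists each variable as a JSON file under a base directory."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._path.mkdir(parents=True, exist_ok=True)

    def write(self, name: str, value: object) -> None:
        (self._path / f"{name}.json").write_text(json.dumps(value), encoding="utf-8")

    def read(self, name: str) -> object:
        return json.loads((self._path / f"{name}.json").read_text(encoding="utf-8"))


memory = MemoryStorage()
memory.write("task_data", [{"key": "JIRA-1"}])

# A temporary directory stands in for the configured storage path.
with tempfile.TemporaryDirectory() as tmp:
    files = FileSystemStorage(Path(tmp) / "storage" / "results" / "filesystem")
    files.write("processed_output", memory.read("task_data"))
    print(files.read("processed_output"))
```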
5. Build the Steps
Steps define the core logic for your River, including data retrieval and processing.
Example Step: Fetching Issues from Jira
steps:
  - name: RetrieveIssues
    description: "Fetch Jira issues within the specified date range"
    endpoint: "{{%BASE_URL%}}/search"
    http_method: GET
    query_params:
      jql: "created >= {{%start_date%}} AND created <= {{%end_date%}}"
      maxResults: 100
    variables_output:
      - variable_name: task_data
        response_location: data
        variable_format: json
- Query Parameters: Dynamically fetch data based on the defined date range.
- Output Variables: Store API response data for further processing.
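To see what request this step describes, the Python sketch below substitutes the {{%name%}} placeholders and URL-encodes the query parameters. It assumes that {{%BASE_URL%}} resolves to the connector's base_url and that placeholders are replaced by simple text substitution; both are assumptions made for illustration, and the render helper is hypothetical rather than part of Rivery.

```python
import re
from urllib.parse import urlencode

# Matches placeholders written as {{%name%}}.
PLACEHOLDER = re.compile(r"\{\{%(\w+)%\}\}")


def render(template: str, values: dict[str, str]) -> str:
    """Replace each {{%name%}} placeholder with its entry in `values`.

    Hypothetical helper; raises KeyError for an unknown placeholder name.
    """
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


# Assumed values: BASE_URL mirrors the connector's base_url after
# your_domain has been filled in; the dates come from time_period.
values = {
    "BASE_URL": "https://rivery-jira.atlassian.net/rest/api/2",
    "start_date": "2024-01-01",
    "end_date": "2024-01-31",
}

endpoint = render("{{%BASE_URL%}}/search", values)
query = {
    "jql": render("created >= {{%start_date%}} AND created <= {{%end_date%}}", values),
    "maxResults": 100,
}
url = f"{endpoint}?{urlencode(query)}"
print(url)
```

Encoding the query with urlencode matters here because the JQL expression contains spaces and comparison operators that are not valid in a raw URL.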
Putting it all together
interface_parameters:
  section:
    source:
      - name: "your_domain"
        type: "string"
        value: "rivery-jira"
      - name: "connectToAPI"
        type: "authentication"
        auth_type: "basic_http"
        fields:
          - name: "username"
            type: "string"
            value: "your_rivery_mail"
          - name: "password"
            type: "string"
            value: "your_api_token"
      - name: "time_period"
        type: "date_range"
        period_type: "date"
        format: "YYYY-MM-DD"
        fields:
          - name: "start_date"
            value: "2024-12-01"
          - name: "end_date"
            value: "2024-12-12"
connector:
  base_url: https://{your_domain}.atlassian.net
  default_headers: {}
  default_retry_strategy: {}
  name: jiraConnector
  variables_metadata:
    final_output_file:
      format: json
      storage_name: results dir
      variable_name: final_output_file
    pagination_token:
      format: json
      storage_name: results dir
      variable_name: pagination_token
  variables_storages:
    - name: results dir
      path: hadas/storage/results/filesystem
      type: file_system
steps:
  - endpoint: "{{%BASE_URL%}}/rest/api/3/search"
    expected_status_codes:
      - 200
    headers: {}
    http_method: GET
    name: GetOpenIssuesBetweenDates
    pagination:
      break_conditions:
        - condition:
            type: json_items_result_count
            value: 0
          name: BreakIfNoMoreIssues
          variable: '{{%pagination_token%}}'
      location: qs
      parameters:
        - increment_by: 50
          name: startAt
          value: 0
        - name: maxResults
          value: 50
      type: offset
    query_params:
      jql: "created>={time_period.start_date} AND created<={time_period.end_date}"
    retry_strategy: {}
    sleep_between_requests: 0.2
    stream: false
    type: rest
    validate_certificate: true
    variables_output:
      - overwrite_storage: false
        response_location: data
        transformation_layers:
          - from_type: json
            json_path: $.[*]
            require_full_file: true
            transformation_type: extract_json
            type: extract_json
        variable_format: json
        variable_name: final_output_file
      - overwrite_storage: true
        response_location: data
        transformation_layers:
          - from_type: json
            json_path: $.issues[*]
            require_full_file: true
            transformation_type: extract_json
            type: extract_json
        variable_format: json
        variable_name: pagination_token
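
The pagination block above reads as an offset scheme: startAt begins at 0 and grows by 50, maxResults stays at 50, and the loop stops once a page contains no issues. The Python sketch below models that reading with a stand-in fetch function instead of a real API call. The names paginate_by_offset and fake_fetch are hypothetical, and the mapping of increment_by and json_items_result_count to this loop is an interpretation of the configuration, not a description of Rivery's engine.

```python
from typing import Callable, Iterator

Page = dict  # a decoded JSON response such as {"issues": [...]}


def paginate_by_offset(
    fetch_page: Callable[[int, int], Page],
    page_size: int = 50,
    start_at: int = 0,
) -> Iterator[dict]:
    """Yield issues page by page until a page comes back empty."""
    offset = start_at
    while True:
        # Conceptually: GET .../search?startAt=<offset>&maxResults=<page_size>
        page = fetch_page(offset, page_size)
        issues = page.get("issues", [])
        if len(issues) == 0:
            # Assumed equivalent of the json_items_result_count == 0 break condition.
            break
        yield from issues
        # Assumed equivalent of increment_by: 50 on the startAt parameter.
        offset += page_size


# Stand-in for the real API: 120 fake issues served in slices.
FAKE_ISSUES = [{"id": i} for i in range(120)]


def fake_fetch(start_at: int, max_results: int) -> Page:
    return {"issues": FAKE_ISSUES[start_at:start_at + max_results]}


issues = list(paginate_by_offset(fake_fetch))
print(len(issues))  # 120
```

With 120 fake issues the loop requests offsets 0, 50, 100, and 150; the last request returns an empty page and ends the iteration.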
Best Practices for Blueprint Rivers
- Plan Your Variables: Clearly define the variables needed to store data and ensure they are declared in the metadata.
- Use Interface Parameters: Make your River reusable and dynamic by leveraging interface parameters for authentication, domain configuration, and custom inputs.
- Validate Steps: Test each step to ensure it retrieves and processes data correctly.
- Include Error Handling: Implement retry strategies to handle intermittent API issues.
- Optimize Storage: Choose the appropriate storage type based on the nature of your data (e.g., temporary vs. permanent).
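As a generic illustration of what a retry strategy does, the Python sketch below retries a failing call with exponential backoff. It does not reproduce the Blueprint retry_strategy schema, which is left empty in the example above; call_with_retries and its max_attempts and base_delay parameters are hypothetical names chosen for this sketch.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    action: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`, retrying on any exception with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# A fake API call that fails twice before succeeding.
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# The sleep function is replaced with a no-op so the demo runs instantly.
print(call_with_retries(flaky, sleep=lambda _: None))  # ok
```

In practice a retry policy usually limits itself to transient failures such as timeouts or rate limiting, since retrying a request that fails for a permanent reason only delays the error.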