Blueprint Components and Configuration

Blueprint Now Available in Private Preview

We are excited to announce that our new Blueprint engine is now available to a select group of preview customers!

If you're interested in joining the private preview, click here to request access.

Understanding the Blueprint YAML Structure

Blueprints in Rivery provide a structured way to define custom connectors and data workflows in YAML. The configuration is made up of components that together define how data is fetched, processed, and stored. Each Blueprint is composed of:

  1. Interface Parameters – Define user-configurable inputs, such as authentication and date filters.
  2. Connector Configuration – Specifies the API or system being connected to, including authentication and endpoints.
  3. Variable Metadata – Declares the variables used to store and process data.
  4. Storage Configuration – Determines where processed data and temporary variables are stored.
  5. Steps – Define the workflow, including API requests, data transformations, and error handling.
  6. Retry Strategies – Improve reliability by automatically retrying failed API calls.

By understanding these core components, users can efficiently build scalable, reusable custom connectors within Rivery.

How to Create a River Using Blueprint

Blueprints offer a powerful and flexible way to create and manage Rivers in Rivery. This guide provides a detailed flow to help you create a River using Blueprints, with an emphasis on defining and using interface parameters.


Step-by-Step Flow for Creating a River

1. Add Interface Parameters

Interface parameters define the dynamic inputs for your River. These parameters will be exposed on the River that uses the Blueprint as its source. They essentially represent the configuration that the API expects to receive in order to retrieve data correctly. These parameters can include authentication details, domain names, or custom date ranges.

YAML Example:

interface_parameters:
  section:
    source:
      - name: "your_domain"
        type: "string"
        value: "rivery-jira"
        
      - name: "connectToAPI"
        type: "authentication"
        auth_type: "basic_http"
        fields:
          - name: "username"
            type: "string"
            value: "your_email@example.com"
          - name: "password"
            type: "string"
            value: "your_api_token"
            
      - name: "time_period"
        type: "date_range"
        period_type: "date"
        format: "YYYY-MM-DD"
        fields:
          - name: "start_date"
            value: "2024-01-01"
          - name: "end_date"
            value: "2024-01-31"

Explanation of Interface Parameters:

  1. String Parameters:

    • Define constant values like the domain name or specific strings.
    • Example: your_domain specifies the domain for API calls.
  2. Authentication Parameters:

    • Use the connectToAPI parameter to configure authentication, specifying the type (basic_http, oauth, etc.) and credentials.
  3. Date Range Parameters:

    • Specify date ranges dynamically for tasks like fetching data for a specific period.
    • Example: time_period defines start_date and end_date.
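To make the three parameter types more concrete, here is a minimal Python sketch of how such parameters could be flattened into a lookup table and how a basic_http credential pair maps to a standard HTTP Basic Authorization header. The dictionary layout mirrors the YAML above, but the helper names (flatten_parameters, basic_auth_header) and the "parent.field" naming are assumptions made for illustration, not Rivery's internal implementation.

```python
import base64

# Hypothetical in-memory copy of the interface parameters shown above.
interface_parameters = [
    {"name": "your_domain", "type": "string", "value": "rivery-jira"},
    {
        "name": "connectToAPI",
        "type": "authentication",
        "auth_type": "basic_http",
        "fields": [
            {"name": "username", "type": "string", "value": "your_email@example.com"},
            {"name": "password", "type": "string", "value": "your_api_token"},
        ],
    },
    {
        "name": "time_period",
        "type": "date_range",
        "fields": [
            {"name": "start_date", "value": "2024-01-01"},
            {"name": "end_date", "value": "2024-01-31"},
        ],
    },
]


def flatten_parameters(parameters):
    """Return a flat name -> value lookup; nested fields become 'parent.field'."""
    flat = {}
    for param in parameters:
        if "fields" in param:
            for field in param["fields"]:
                flat[f"{param['name']}.{field['name']}"] = field["value"]
        else:
            flat[param["name"]] = param["value"]
    return flat


def basic_auth_header(username, password):
    """Build a standard HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


values = flatten_parameters(interface_parameters)
headers = basic_auth_header(
    values["connectToAPI.username"], values["connectToAPI.password"]
)
```

The flat lookup keeps string, authentication, and date range values addressable by a single key, which is convenient when later steps need to reference them.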

2. Configure the Connector

The next step in creating a River using Blueprint is configuring the connector, which defines the API or system you will interact with.

YAML Example:

connector:
  name: JiraConnector
  type: rest
  base_url: "https://{your_domain}.atlassian.net/rest/api/2"

In this example:

  • name: Identifies the connector (e.g., JiraConnector).
  • type: Specifies the type of connector, such as rest.
  • base_url: The base endpoint for API requests.
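The base_url above contains a {your_domain} placeholder that is filled from an interface parameter. The following Python sketch shows one way such a substitution could work, including a check for placeholders that have no matching parameter. The function name resolve_base_url is hypothetical, and the use of Python string formatting is an assumption that stands in for whatever the Blueprint engine does internally.

```python
from string import Formatter


def resolve_base_url(template, parameters):
    """Fill {name} placeholders in the template from the given parameters."""
    # Collect every placeholder name that appears in the template.
    needed = {field for _, field, _, _ in Formatter().parse(template) if field}
    missing = needed - set(parameters)
    if missing:
        raise ValueError(f"Missing interface parameters: {sorted(missing)}")
    return template.format(**parameters)


base_url = resolve_base_url(
    "https://{your_domain}.atlassian.net/rest/api/2",
    {"your_domain": "rivery-jira"},
)
print(base_url)  # https://rivery-jira.atlassian.net/rest/api/2
```

Failing early on a missing parameter gives a clearer error than sending a request to a URL that still contains a raw placeholder.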

3. Configure Variable Metadata

Declare variables that will be used in your workflow to store and process data.

YAML Example:

variables_metadata:
  task_data:
    storage_name: results_memory
    format: json
  processed_output:
    storage_name: results_file
    format: json

In this example:

  • task_data: Holds intermediate results from API calls.
  • processed_output: Stores the final processed data.

4. Set Up Variable Storage

Define where variables and outputs will be stored. Rivery supports multiple storage types, including in-memory and file system storage.

YAML Example:

variables_storages:
  - name: results_memory
    type: memory
  - name: results_file
    type: file_system
    path: storage/results/filesystem

In this example:

  • results_memory: Temporary storage for intermediate variables.
  • results_file: Permanent storage for final results.
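Each variable's storage_name must match the name of a declared storage. A small pre-flight check such as the Python sketch below can catch a mismatch before the River runs. The dictionaries mirror the YAML from steps 3 and 4; the function find_undeclared_storages is a hypothetical helper, not part of Rivery.

```python
# Python copies of the variables_metadata and variables_storages examples.
variables_metadata = {
    "task_data": {"storage_name": "results_memory", "format": "json"},
    "processed_output": {"storage_name": "results_file", "format": "json"},
}
variables_storages = [
    {"name": "results_memory", "type": "memory"},
    {
        "name": "results_file",
        "type": "file_system",
        "path": "storage/results/filesystem",
    },
]


def find_undeclared_storages(metadata, storages):
    """Return {variable: storage_name} for variables whose storage is not declared."""
    declared = {storage["name"] for storage in storages}
    return {
        variable: meta["storage_name"]
        for variable, meta in metadata.items()
        if meta["storage_name"] not in declared
    }


problems = find_undeclared_storages(variables_metadata, variables_storages)
print(problems)  # {} because both variables point at declared storages
```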

5. Build the Steps

Steps define the core logic for your River, including data retrieval and processing.

Example Step: Fetching Issues from Jira

steps:
  - name: RetrieveIssues
    description: "Fetch Jira issues within the specified date range"
    endpoint: "{{%BASE_URL%}}/search"
    http_method: GET
    query_params:
      jql: "created >= {{%start_date%}} AND created <= {{%end_date%}}"
      maxResults: 100
    variables_output:
      - variable_name: task_data
        response_location: data
        variable_format: json

In this example:

  • Query Parameters: Dynamically fetch data based on the defined date range.
  • Output Variables: Store API response data for further processing.
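To show what the step above amounts to at request time, the Python sketch below renders the {{%name%}} placeholders and URL-encodes the query parameters. The regular expression, the helper names render and build_request_url, and the values dictionary are assumptions for illustration; the Blueprint engine performs its own substitution.

```python
import re
from urllib.parse import urlencode

# Matches placeholders of the form {{%name%}} as used in the step above.
PLACEHOLDER = re.compile(r"\{\{%(\w+)%\}\}")


def render(template, values):
    """Replace each {{%name%}} placeholder with its value."""
    return PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), template)


def build_request_url(step, values):
    """Render the endpoint and query parameters into a full request URL."""
    endpoint = render(step["endpoint"], values)
    params = {
        key: render(value, values) if isinstance(value, str) else value
        for key, value in step["query_params"].items()
    }
    return f"{endpoint}?{urlencode(params)}"


step = {
    "endpoint": "{{%BASE_URL%}}/search",
    "query_params": {
        "jql": "created >= {{%start_date%}} AND created <= {{%end_date%}}",
        "maxResults": 100,
    },
}
values = {
    "BASE_URL": "https://rivery-jira.atlassian.net/rest/api/2",
    "start_date": "2024-01-01",
    "end_date": "2024-01-31",
}
url = build_request_url(step, values)
print(url)
```

Encoding the JQL expression matters here because it contains spaces and comparison operators that are not valid in a raw query string.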

Putting it all together

interface_parameters:
  section:
    source:
      - name: "your_domain"
        type: "string"
        value: "rivery-jira"
      - name: "connectToAPI"
        type: "authentication"
        auth_type: "basic_http"
        fields:
          - name: "username"
            type: "string"
            value: "your_email@example.com"
          - name: "password"
            type: "string"
            value: "your_api_token"
      - name: "time_period"
        type: "date_range"
        period_type: "date"
        format: "YYYY-MM-DD"
        fields:
          - name: "start_date"
            value: "2024-12-01"
          - name: "end_date"
            value: "2024-12-12"

connector:
  base_url: https://{your_domain}.atlassian.net
  default_headers: {}
  default_retry_strategy: {}
  name: jiraConnector
  variables_metadata:
    final_output_file:
      format: json
      storage_name: results dir
      variable_name: final_output_file
    pagination_token:
      format: json
      storage_name: results dir
      variable_name: pagination_token
  variables_storages:
  - name: results dir
    path: storage/results/filesystem
    type: file_system

steps:
- endpoint: "{{%BASE_URL%}}/rest/api/3/search"
  expected_status_codes:
  - 200
  headers: {}
  http_method: GET
  name: GetOpenIssuesBetweenDates
  pagination:
    break_conditions:
    - condition:
        type: json_items_result_count
        value: 0
      name: BreakIfNoMoreIssues
      variable: '{{%pagination_token%}}'
    location: qs
    parameters:
    - increment_by: 50
      name: startAt
      value: 0
    - name: maxResults
      value: 50
    type: offset
  query_params:
    jql: "created>={time_period.start_date} AND created<={time_period.end_date}"
  retry_strategy: {}
  sleep_between_requests: 0.2
  stream: false
  type: rest
  validate_certificate: true
  variables_output:
  - overwrite_storage: false
    response_location: data
    transformation_layers:
    - from_type: json
      json_path: $.[*]
      require_full_file: true
      transformation_type: extract_json
      type: extract_json
    variable_format: json
    variable_name: final_output_file
  - overwrite_storage: true
    response_location: data
    transformation_layers:
    - from_type: json
      json_path: $.issues[*]
      require_full_file: true
      transformation_type: extract_json
      type: extract_json
    variable_format: json
    variable_name: pagination_token

Best Practices for Blueprint Rivers

  1. Plan Your Variables: Clearly define the variables needed to store data and ensure they are declared in the metadata.
  2. Use Interface Parameters: Make your River reusable and dynamic by leveraging interface parameters for authentication, domain configuration, and custom inputs.
  3. Validate Steps: Test each step to ensure it retrieves and processes data correctly.
  4. Include Error Handling: Implement retry strategies to handle intermittent API issues.
  5. Optimize Storage: Choose the appropriate storage type based on the nature of your data (e.g., temporary vs. permanent).
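As an illustration of the error-handling practice, the Python sketch below retries a failing call with exponential backoff. It is a generic pattern under stated assumptions: the function call_with_retries, its parameters, and the choice of ConnectionError as the retryable error are all hypothetical and do not describe the fields of Rivery's retry_strategy.

```python
import time


def call_with_retries(operation, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run operation, retrying on ConnectionError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1.0s, 2.0s, ...


# Simulated flaky request: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"status": 200}


delays = []  # record the delays instead of actually sleeping
result = call_with_retries(flaky_request, sleep=delays.append)
print(result, delays)  # {'status': 200} [0.5, 1.0]
```

Passing the sleep function as a parameter keeps the example fast to run and makes the backoff schedule easy to inspect.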
