Blueprint Components and Configuration

Blueprint and Copilot Now Available in Private Preview

We are excited to announce that our new Copilot and Blueprint features are now available in private preview for a select group of beta customers. This is your chance to explore and test these powerful capabilities before their official release.

If you're interested in joining the beta program, click here to request access.


How to Configure a Connector

This guide provides a step-by-step walkthrough for setting up a YAML source configuration using GitHub's REST API as an example.


Connector Section

Connector Details

connector:
  name: GitHubConnector 
  type: rest
  • name: Identifies the connector, e.g., GitHubConnector.
  • type: Specifies the connector type as rest for RESTful API calls.

Base URL

base_url: 'https://api.github.com'
  • base_url: The root endpoint for all API requests.

Default Headers

default_headers:
  Authorization: 'Basic {YOUR_AUTH}'
  X-GitHub-Api-Version: '{YOUR_X_GITHUB_API_VERSION}'
  User-Agent: '{YOUR_USER_AGENT}'
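  • Authorization: Credentials sent with every request. The example uses the HTTP Basic scheme, so {YOUR_AUTH} stands for Base64-encoded credentials rather than a raw token.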
  • X-GitHub-Api-Version: API version to use.
  • User-Agent: Identifies the client making the request.
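
The Authorization header above uses the Basic scheme. As a minimal Python sketch, assuming {YOUR_AUTH} is the standard HTTP Basic value (the Base64 encoding of username:secret), the header value could be produced as follows. The function name and the sample credentials are hypothetical and are not part of the connector.

```python
import base64


def basic_auth_value(username: str, secret: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    # HTTP Basic joins the two parts with a colon, then Base64-encodes them.
    raw = f"{username}:{secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Hypothetical credentials, for illustration only.
header_value = basic_auth_value("user", "pass")
print(header_value)
```

The part after "Basic " is what would replace {YOUR_AUTH} in default_headers under this assumption.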

Variables Storage and Metadata

Variables Storage

variables_storages:
  - name: results dir
    type: file_system
    path: storage/results/filesystem
  - name: results memory
    type: memory
  • results dir: Stores data in the filesystem.
  • results memory: Temporary in-memory storage.

Variables Metadata

variables_metadata:
  page_number:
    storage_name: results memory
    format: json
  last_result:
    storage_name: results memory
    format: json
  final_output_file:
    storage_name: results dir
    format: json
  • page_number: Tracks the current page for pagination.
  • last_result: Holds the last API response data.
  • final_output_file: Stores final processed data in a file.

Steps Section

Main Loop Step

- name: WhileLoopOverCommits
  description: "Get all commits from a GitHub repository"
  type: loop
  loop:
    type: while
    variable_name: page_number
    value: 1
    while_settings:
      operation_to_perform: add
      value_to_perform: 1
      max_iterations: 10000
    break_conditions:
      - name: BreakIfOutOfCommits
        condition:
          type: string_equal
          value: "[]"
        variable: "{{%last_result%}}"
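  • Implements a while loop to paginate through commits: page_number starts at 1 and increases by 1 on each iteration.
  • Break Condition: Stops when last_result equals the string "[]", meaning the API returned an empty page.
  • Safety: max_iterations caps the loop at 10,000 iterations to prevent an infinite loop.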
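
The control flow of this loop can be sketched in Python. This is an illustration of the logic only, not the connector's implementation: fetch_page is a hypothetical stand-in for the API call, and the sketch assumes the break condition is checked after each request.

```python
import json
from typing import Callable, List


def paginate(fetch_page: Callable[[int], str], start: int = 1,
             step: int = 1, max_iterations: int = 10000) -> List[dict]:
    """Collect items page by page until an empty page ("[]") comes back."""
    results: List[dict] = []
    page_number = start                    # value: 1
    for _ in range(max_iterations):        # max_iterations safety cap
        last_result = fetch_page(page_number)
        if last_result == "[]":            # string_equal break condition
            break
        results.extend(json.loads(last_result))
        page_number += step                # operation_to_perform: add
    return results


# Hypothetical data: two pages of results, then an empty page.
fake_pages = {1: '[{"sha": "a1"}, {"sha": "b2"}]', 2: '[{"sha": "c3"}]'}


def fake_fetch(page: int) -> str:
    return fake_pages.get(page, "[]")


commits = paginate(fake_fetch)
print(len(commits))
```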

Pagination Step

steps:
  - name: Pagination
    description: Retrieve GitHub repository commits
    endpoint: "{{%BASE_URL%}}/repos/{YOUR_REPOSITORY_OWNER}/{YOUR_REPOSITORY}/commits"
    http_method: GET
    query_params:
      page: "{{%page_number%}}"
    variables_output:
      - variable_name: last_result
        response_location: data
        variable_format: json
        overwrite_storage: true
      - variable_name: final_output_file
        response_location: data
        transformation_layers:
          - type: extract_json
            json_path: $.[*].commit
            from_type: json
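  • Retrieves commits from a specific GitHub repository.
  • Implements pagination using the page query parameter, which takes its value from page_number.
  • Stores each page's response in last_result, overwriting the previous page, so the loop's break condition can inspect it.
  • Extracts the commit object from each item with the JSON path $.[*].commit and writes it to final_output_file.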
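
To illustrate what the extract_json layer with json_path $.[*].commit selects, here is a rough Python equivalent. It is a sketch under the assumption that the response body is a JSON array of objects that each carry a commit field. The function name and sample payload are hypothetical.

```python
import json
from typing import List


def extract_commits(response_body: str) -> List[dict]:
    """Return the "commit" object of every item in a JSON array."""
    items = json.loads(response_body)
    # Equivalent in spirit to the JSON path $.[*].commit
    return [item["commit"] for item in items if "commit" in item]


# Hypothetical, heavily trimmed response body.
sample = (
    '[{"sha": "a1", "commit": {"message": "Initial commit"}},'
    ' {"sha": "b2", "commit": {"message": "Fix typo"}}]'
)
print(extract_commits(sample))
```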

Single Loop Example

steps:
- name: LoopOverCommits
  description: "Process each SHA commit"
  loop:
    variable_name: "sha_commits"
    item_name: "sha_commit"
    type: "data"
  steps:
    - name: GetCommitDetails
      endpoint: "{{%BASE_URL%}}/repos/RiveryIO/rivery-connector-executor/commits/{{%sha_commit%}}"
      http_method: GET
      type: rest
      variables_output:
        - response_location: "data"
          variable_name: "final_output_file"
          variable_format: "json"
  • Iterates over the items in sha_commits, exposing the current item as sha_commit.
  • Requests the details of each commit and writes the response to final_output_file.

Nested Loops Example

steps:
- name: LoopOverRepositories
  description: "Process each repository"
  loop:
    variable_name: "repositories"
    item_name: "repository"
    type: "data"
  steps:
    - name: GetBranchesForRepository
      endpoint: "{{%BASE_URL%}}/repos/{{%repository%}}/branches"
      http_method: GET
      type: rest
      variables_output:
        - response_location: "data"
          variable_name: "branches"
          variable_format: "json"
    - name: LoopOverBranches
      loop:
        variable_name: "branches"
        item_name: "branch"
        type: "data"
      steps:
        - name: GetCommitsForBranch
          endpoint: "{{%BASE_URL%}}/repos/{{%repository%}}/commits?sha={{%branch%}}"
          http_method: GET
          type: rest
          variables_output:
            - response_location: "data"
              variable_name: "sha_commits"
              variable_format: "json"
  • The outer loop iterates over repositories and stores the branches of the current repository in branches.
  • The inner loop iterates over branches and fetches the commits for each branch of the current repository.

Retry Strategy

steps:
  - name: FetchPosts
    description: "Fetch all posts from the API"
    endpoint: "{{%BASE_URL%}}/posts"
    http_method: GET
    expected_status_codes:
      - 200
    retry_strategy:
      400:
        max_attempts: 2
      429:
        max_attempts: 5
  • Retries the request automatically, with max_attempts set to 2 for 400 responses and 5 for 429 (Too Many Requests) responses.
  • Treats 200 as the only expected status code.
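
The retry behavior can be sketched in Python as follows. This is an illustration of the idea, not the connector's implementation. It assumes max_attempts counts the total number of attempts that may end in a given status, and that a status with no entry in retry_strategy fails immediately. The send callable is a hypothetical stand-in for the HTTP request.

```python
from typing import Callable, Dict, Iterable


def call_with_retry(send: Callable[[], int], expected: Iterable[int],
                    max_attempts: Dict[int, int]) -> int:
    """Call send() until it returns an expected status or attempts run out."""
    expected_codes = set(expected)
    attempts: Dict[int, int] = {}          # attempts seen per status code
    while True:
        status = send()
        if status in expected_codes:
            return status
        attempts[status] = attempts.get(status, 0) + 1
        # Statuses without a configured limit are not retried (assumption).
        if attempts[status] >= max_attempts.get(status, 1):
            raise RuntimeError(f"giving up after status {status}")


# Hypothetical sequence of responses: rate limited twice, then success.
responses = iter([429, 429, 200])
final_status = call_with_retry(lambda: next(responses), [200], {400: 2, 429: 5})
print(final_status)
```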


Full Multi-Loop YAML Example

This full example demonstrates a nested-loop setup in Rivery that combines the components described earlier: authentication, variable storage, and output handling. It retrieves a list of repositories, iterates over their branches, and fetches the commits for each branch.

interface_params:
  section:
    source:
      - name: "connectToAPI"
        type: "authentication"
        auth_type: "basic_http"
        fields:
          - name: "username"
            type: "string"
            value: "your_username"
          - name: "password"
            type: "string"
            value: "your_password"

connector:
  name: GitHubMultiLoop
  type: rest
  base_url: "https://api.github.com"
  default_headers:
    Authorization: "Basic {YOUR_AUTH}"
    User-Agent: "Rivery-GitHub-Connector"
  variables_storages:
    - name: results_memory
      type: memory
    - name: results_dir
      type: file_system
      path: storage/results/filesystem
  variables_metadata:
    repositories:
      storage_name: results_memory
      format: json
    branches:
      storage_name: results_memory
      format: json
    sha_commits:
      storage_name: results_memory
      format: json
    final_output_file:
      storage_name: results_dir
      format: json

steps:
  - name: FetchRepositories
    description: "Retrieve all repositories for the user"
    endpoint: "{{%BASE_URL%}}/user/repos"
    http_method: GET
    type: rest
    variables_output:
      - variable_name: repositories
        response_location: data
        variable_format: json
        transformation_layers:
          - type: extract_json
            json_path: $.[*].full_name
            from_type: json
        overwrite_storage: true

  - name: LoopOverRepositories
    description: "Loop through each repository"
    loop:
      variable_name: "repositories"
      item_name: "repository"
      type: "data"
    steps:
      - name: FetchBranches
        description: "Fetch branches for a repository"
        endpoint: "{{%BASE_URL%}}/repos/{{%repository%}}/branches"
        http_method: GET
        type: rest
        variables_output:
          - variable_name: branches
            response_location: data
            variable_format: json
            transformation_layers:
              - type: extract_json
                json_path: $.[*].name
                from_type: json
            overwrite_storage: true

      - name: LoopOverBranches
        description: "Loop through branches for a repository"
        loop:
          variable_name: "branches"
          item_name: "branch"
          type: "data"
        steps:
          - name: FetchCommits
            description: "Fetch commits for each branch"
            endpoint: "{{%BASE_URL%}}/repos/{{%repository%}}/commits?sha={{%branch%}}"
            http_method: GET
            type: rest
            variables_output:
              - variable_name: sha_commits
                response_location: data
                variable_format: json
                transformation_layers:
                  - type: extract_json
                    json_path: $.[*].commit
                    from_type: json
                overwrite_storage: false
              - variable_name: final_output_file
                response_location: data
                variable_format: json
                overwrite_storage: false

Explanation of the Multi-Loop YAML

1. Authentication

  • Uses basic_http authentication for GitHub API access.
  • Username and password (or token) are configured in the interface_params section.

2. Steps

  1. FetchRepositories:

    • Retrieves the list of repositories associated with the user account.
    • Extracts the full_name of each repository using the JSON path.
  2. LoopOverRepositories:

    • Iterates over the list of repositories and processes each one individually.
  3. FetchBranches:

    • Fetches all branches for the current repository.
    • Extracts branch names.
  4. LoopOverBranches:

    • Iterates over each branch and processes it.
  5. FetchCommits:

    • Retrieves commits for the current repository and branch.
    • Extracts the commit object from each item into sha_commits and writes the response data to final_output_file.

3. Storage

  • Results are stored in:
    • results_memory for temporary variables (repositories, branches, sha_commits).
    • results_dir for saving the final output file.

4. Output Handling

  • Uses transformation layers (extract_json) to extract specific data:
    • $.[*].full_name for repository names.
    • $.[*].name for branch names.
    • $.[*].commit for commit details.

How It Works

  1. Fetch all repositories for the user.
  2. Loop over each repository to fetch its branches.
  3. Loop over each branch to fetch commits.
  4. Save the final output containing all commits for all branches in all repositories.
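
The four steps above can be sketched in Python as three nested loops. This is an illustration of the data flow only. The three callables are hypothetical stand-ins for the FetchRepositories, FetchBranches, and FetchCommits requests, and the sample data is invented.

```python
from typing import Callable, Dict, Iterable, List


def collect_commits(list_repos: Callable[[], Iterable[str]],
                    list_branches: Callable[[str], Iterable[str]],
                    list_commits: Callable[[str, str], Iterable[str]]) -> List[Dict[str, str]]:
    """Gather every commit of every branch of every repository."""
    final_output: List[Dict[str, str]] = []
    for repository in list_repos():                        # LoopOverRepositories
        for branch in list_branches(repository):           # LoopOverBranches
            for commit in list_commits(repository, branch):
                final_output.append(
                    {"repository": repository, "branch": branch, "commit": commit}
                )
    return final_output


# Hypothetical in-memory data standing in for API responses.
repos = {
    "octo/alpha": {"main": ["c1", "c2"], "dev": ["c3"]},
    "octo/beta": {"main": ["c4"]},
}
rows = collect_commits(
    lambda: list(repos),
    lambda repo: list(repos[repo]),
    lambda repo, branch: repos[repo][branch],
)
print(len(rows))
```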

This YAML is an end-to-end example of nested loops that retrieve data across multiple API endpoints, passing the current item of each loop into the next request through variables. The same pattern suits workflows that iterate over multi-level data, such as repositories, their branches, and the commits on each branch.

Summary

  1. Connector Configuration: Set up authentication, base URLs, and storage.
  2. Steps: Build pagination, loops, and retry mechanisms.
  3. Single and Nested Loops: Dynamically process data across multiple levels.
  4. Retry Strategies: Ensure workflows recover gracefully from API errors.

This guide equips you with the tools to configure a robust and scalable YAML-based connector for complex API workflows.

