Blueprint Components and Configuration
Blueprint and Copilot Now Available in Private Preview
We are excited to announce that our new Copilot and Blueprint features are now available in private preview for a select group of beta customers. This is your chance to explore and test these powerful capabilities before their official release.
If you're interested in joining the beta program, click here to request access.
How to Configure a Connector
This guide provides a step-by-step walkthrough for setting up a YAML source configuration using GitHub's REST API as an example.
Connector Section
Connector Details
connector:
  name: GitHubConnector
  type: rest
- name: Identifies the connector, e.g., GitHubConnector.
- type: Specifies the connector type as rest for RESTful API calls.
Base URL
base_url: 'https://api.github.com'
- base_url: The root endpoint for all API requests.
Default Headers
default_headers:
  Authorization: 'Basic {YOUR_AUTH}'
  X-GitHub-Api-Version: '{YOUR_X_GITHUB_API_VERSION}'
  User-Agent: '{YOUR_USER_AGENT}'
- Authorization: Credentials for HTTP Basic authentication. Replace {YOUR_AUTH} with the Base64-encoded credentials.
- X-GitHub-Api-Version: API version to use.
- User-Agent: Identifies the client making the request.
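Assembled into one block, the connector fragments above might look like the following. This is a sketch, not a confirmed layout: it assumes base_url and default_headers nest under connector, as the ordering in the full example later in this guide suggests. The API version and User-Agent values are illustrative placeholders, and {YOUR_AUTH} is left for you to fill in.

```yaml
# Sketch only: nesting everything under `connector` is an assumption.
connector:
  name: GitHubConnector
  type: rest
  base_url: 'https://api.github.com'
  default_headers:
    # HTTP Basic credentials: Base64 of "username:password" (RFC 7617).
    Authorization: 'Basic {YOUR_AUTH}'
    # Illustrative value; use the API version you target.
    X-GitHub-Api-Version: '2022-11-28'
    # Illustrative value; any string that identifies your client.
    User-Agent: 'my-github-connector'
```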
Variables Storage and Metadata
Variables Storage
variables_storages:
  - name: results dir
    type: file_system
    path: storage/results/filesystem
  - name: results memory
    type: memory
- results dir: Stores data in the filesystem.
- results memory: Temporary in-memory storage.
Variables Metadata
variables_metadata:
  page_number:
    storage_name: results memory
    format: json
  last_result:
    storage_name: results memory
    format: json
  final_output_file:
    storage_name: results dir
    format: json
- page_number: Tracks the current page for pagination.
- last_result: Holds the last API response data.
- final_output_file: Stores final processed data in a file.
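Each storage_name in variables_metadata appears to refer to the name of an entry in variables_storages. A minimal sketch of that pairing, reusing only the keys shown above; the comments are interpretation, not statements from the product documentation.

```yaml
variables_storages:
  - name: results memory          # referenced below through storage_name
    type: memory
  - name: results dir
    type: file_system
    path: storage/results/filesystem

variables_metadata:
  page_number:
    storage_name: results memory  # short-lived loop counter, kept in memory
    format: json
  final_output_file:
    storage_name: results dir     # written under the filesystem path above
    format: json
```

A storage_name that matches no declared storage would presumably leave the variable with nowhere to live, so it is worth keeping the two sections in sync whenever a storage is renamed.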
Steps Section
Main Loop Step
- name: WhileLoopOverCommits
  description: "Get all commits from a GitHub repository"
  type: loop
  loop:
    type: while
    variable_name: page_number
    value: 1
    while_settings:
      operation_to_perform: add
      value_to_perform: 1
    max_iterations: 10000
    break_conditions:
      - name: BreakIfOutOfCommits
        condition:
          type: string_equal
          value: "[]"
          variable: "{{%last_result%}}"
- Implements a while loop to paginate through commits.
- Break Condition: Stops if last_result is empty ("[]").
- Safety: Prevents infinite loops using max_iterations.
Pagination Step
steps:
  - name: Pagination
    description: Retrieve GitHub repository commits
    endpoint: "{{%BASE_URL%}}/repos/{YOUR_REPOSITORY_OWNER}/{YOUR_REPOSITORY}/commits"
    http_method: GET
    query_params:
      page: "{{%page_number%}}"
    variables_output:
      - variable_name: last_result
        response_location: data
        variable_format: json
        overwrite_storage: true
      - variable_name: final_output_file
        response_location: data
        transformation_layers:
          - type: extract_json
            json_path: $.[*].commit
            from_type: json
- Retrieves commits from a specific GitHub repository.
- Implements pagination using the page query parameter.
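The loop and the Pagination step are shown separately above. A sketch of how they presumably combine, assuming the paginated request sits in a steps list nested under the loop step, the same layout the data loops in this guide use. The placeholders {YOUR_REPOSITORY_OWNER} and {YOUR_REPOSITORY} are kept as in the original.

```yaml
steps:
  - name: WhileLoopOverCommits
    description: "Get all commits from a GitHub repository"
    type: loop
    loop:
      type: while
      variable_name: page_number   # starts at 1, incremented by 1 per iteration
      value: 1
      while_settings:
        operation_to_perform: add
        value_to_perform: 1
      max_iterations: 10000        # hard stop if the break condition never fires
      break_conditions:
        - name: BreakIfOutOfCommits
          condition:
            type: string_equal
            value: "[]"            # an empty page means no commits are left
            variable: "{{%last_result%}}"
    # Assumption: child steps nest here, alongside `loop`.
    steps:
      - name: Pagination
        endpoint: "{{%BASE_URL%}}/repos/{YOUR_REPOSITORY_OWNER}/{YOUR_REPOSITORY}/commits"
        http_method: GET
        query_params:
          page: "{{%page_number%}}"
        variables_output:
          - variable_name: last_result   # inspected by the break condition
            response_location: data
            variable_format: json
            overwrite_storage: true
```

With this layout, each iteration requests one page and stores it in last_result; the loop ends once GitHub returns an empty array for a page past the last commit.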
Single Loop Example
steps:
  - name: LoopOverCommits
    description: "Process each SHA commit"
    loop:
      variable_name: "sha_commits"
      item_name: "sha_commit"
      type: "data"
    steps:
      - name: GetCommitDetails
        endpoint: "{{%BASE_URL%}}/repos/RiveryIO/rivery-connector-executor/commits/{{%sha_commit%}}"
        method: GET
        type: rest
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
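The loop above assumes sha_commits already holds a list of commit SHAs. One way to populate it, sketched with the same keys as the Pagination step: a preceding request whose extract_json layer keeps only the sha field of each item in GitHub's list-commits response. The step name FetchCommitShas is hypothetical, and sha_commits would also need an entry in variables_metadata.

```yaml
steps:
  # Hypothetical step; intended to run before LoopOverCommits.
  - name: FetchCommitShas
    description: "Collect commit SHAs to iterate over"
    endpoint: "{{%BASE_URL%}}/repos/RiveryIO/rivery-connector-executor/commits"
    http_method: GET
    variables_output:
      - variable_name: sha_commits
        response_location: data
        variable_format: json
        overwrite_storage: true
        transformation_layers:
          - type: extract_json
            json_path: "$.[*].sha"   # keep only each commit's SHA
            from_type: json
```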
Nested Loops Example
steps:
  - name: LoopOverRepositories
    description: "Process each repository"
    loop:
      variable_name: "repositories"
      item_name: "repository"
      type: "data"
    steps:
      - name: GetBranchesForRepository
        endpoint: "{{%BASE_URL%}}/repos/{{%repository%}}/branches"
        method: GET
        type: rest
        variables_output:
          - response_location: "data"
            variable_name: "branches"
            variable_format: "json"
      - name: LoopOverBranches
        loop:
          variable_name: "branches"
          item_name: "branch"
          type: "data"
        steps:
          - name: GetCommitsForBranch
            endpoint: "{{%BASE_URL%}}/repos/{{%repository%}}/commits?sha={{%branch%}}"
            method: GET
            type: rest
            variables_output:
              - response_location: "data"
                variable_name: "sha_commits"
                variable_format: "json"
Retry Strategy
steps:
  - name: FetchPosts
    description: "Fetch all posts from the API"
    endpoint: "{{%BASE_URL%}}/posts"
    http_method: GET
    expected_status_codes:
      - 200
    retry_strategy:
      400:
        max_attempts: 2
      429:
        max_attempts: 5
- Automatically retries the request up to 2 times for 400 errors and up to 5 times for 429 errors.
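The same retry_strategy keys could be attached to the paginated GitHub request from earlier in this guide. A sketch, assuming retry_strategy is accepted on any rest step; the attempt count is illustrative rather than a recommendation.

```yaml
steps:
  - name: Pagination
    description: Retrieve GitHub repository commits
    endpoint: "{{%BASE_URL%}}/repos/{YOUR_REPOSITORY_OWNER}/{YOUR_REPOSITORY}/commits"
    http_method: GET
    query_params:
      page: "{{%page_number%}}"
    expected_status_codes:
      - 200
    retry_strategy:
      429:               # rate limited: a later attempt may succeed
        max_attempts: 5
```

A 400 response usually signals a malformed request, so repeating it unchanged rarely helps; status codes tied to rate limiting or transient failures tend to be the more useful retry candidates.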
Full Multi-Loop YAML Example
This example demonstrates nested loops in Rivery, combining the components described earlier: authentication, storage, loops, and output handling. It retrieves a list of repositories, iterates over their branches, and fetches the commits for each branch.
interface_params:
  section:
    source:
      - name: "connectToAPI"
        type: "authentication"
        auth_type: "basic_http"
        fields:
          - name: "username"
            type: "string"
            value: "your_username"
          - name: "password"
            type: "string"
            value: "your_password"
connector:
  name: GitHubMultiLoop
  type: rest
  base_url: "https://api.github.com"
  default_headers:
    Authorization: "Basic {YOUR_AUTH}"
    User-Agent: "Rivery-GitHub-Connector"
  variables_storages:
    - name: results_memory
      type: memory
    - name: results_dir
      type: file_system
      path: storage/results/filesystem
  variables_metadata:
    repositories:
      storage_name: results_memory
      format: json
    branches:
      storage_name: results_memory
      format: json
    sha_commits:
      storage_name: results_memory
      format: json
    final_output_file:
      storage_name: results_dir
      format: json
steps:
  - name: FetchRepositories
    description: "Retrieve all repositories for the user"
    endpoint: "{{%BASE_URL%}}/user/repos"
    http_method: GET
    type: rest
    variables_output:
      - variable_name: repositories
        response_location: data
        variable_format: json
        transformation_layers:
          - type: extract_json
            json_path: $.[*].full_name
            from_type: json
        overwrite_storage: true
  - name: LoopOverRepositories
    description: "Loop through each repository"
    loop:
      variable_name: "repositories"
      item_name: "repository"
      type: "data"
    steps:
      - name: FetchBranches
        description: "Fetch branches for a repository"
        endpoint: "{{%BASE_URL%}}/repos/{{%repository%}}/branches"
        http_method: GET
        type: rest
        variables_output:
          - variable_name: branches
            response_location: data
            variable_format: json
            transformation_layers:
              - type: extract_json
                json_path: $.[*].name
                from_type: json
            overwrite_storage: true
      - name: LoopOverBranches
        description: "Loop through branches for a repository"
        loop:
          variable_name: "branches"
          item_name: "branch"
          type: "data"
        steps:
          - name: FetchCommits
            description: "Fetch commits for each branch"
            endpoint: "{{%BASE_URL%}}/repos/{{%repository%}}/commits?sha={{%branch%}}"
            http_method: GET
            type: rest
            variables_output:
              - variable_name: sha_commits
                response_location: data
                variable_format: json
                transformation_layers:
                  - type: extract_json
                    json_path: $.[*].commit
                    from_type: json
                overwrite_storage: false
              - variable_name: final_output_file
                response_location: data
                variable_format: json
                overwrite_storage: true
Explanation of the Multi-Loop YAML
1. Authentication
- Uses basic_http authentication for GitHub API access.
- Username and password (or token) are configured in the interface_params section.
2. Steps
FetchRepositories:
- Retrieves the list of repositories associated with the user account.
- Extracts the full_name of each repository using the JSON path.
LoopOverRepositories:
- Iterates over the list of repositories and processes each one individually.
FetchBranches:
- Fetches all branches for the current repository.
- Extracts branch names.
LoopOverBranches:
- Iterates over each branch and processes it.
FetchCommits:
- Retrieves commits for the current repository and branch.
- Extracts commit details (commit) into sha_commits and writes the response data to final_output_file.
3. Storage
- Results are stored in:
  - results_memory for temporary variables (repositories, branches, sha_commits).
  - results_dir for saving the final output file.
4. Output Handling
- Uses transformation layers to extract specific data:
  - $.[*].full_name for repository names.
  - $.[*].name for branch names.
  - $.[*].commit for commit details.
How It Works
- Fetch all repositories for the user.
- Loop over each repository to fetch its branches.
- Loop over each branch to fetch commits.
- Save the final output containing all commits for all branches in all repositories.
This YAML is an end-to-end example of nested loops that retrieve data across multiple API endpoints, passing each loop's current item (repository, then branch) into the next request. The same pattern suits workflows that need multi-level iteration over API results.
Summary
- Connector Configuration: Set up authentication, base URLs, and storage.
- Steps: Build pagination, loops, and retry mechanisms.
- Single and Nested Loops: Dynamically process data across multiple levels.
- Retry Strategies: Ensure workflows recover gracefully from API errors.
This guide equips you with the tools to configure a robust and scalable YAML-based connector for complex API workflows.