Custom File Zone

Introduction

A Custom File Zone is a data storage configuration that gives organizations control over where and how their data is stored. Rivery's default option, the Managed File Zone, requires no setup. A Custom File Zone instead keeps the data in storage that the organization owns and configures.

Additionally, organizations retain the authority to define their own data retention policies for information stored within the Custom File Zone. In contrast, Rivery's default Managed File Zone retains data for a fixed 48-hour duration.

Rivery lets you manage your data in your own Amazon S3 or Azure Blob Storage service. Your data is stored securely in your designated bucket (called a container in Azure). A bucket is a repository for objects, where each object is a file together with its associated metadata.
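As an illustration of a self-defined retention policy, the sketch below builds an Amazon S3 lifecycle configuration that expires objects after a chosen number of days. The function name, the rule ID, and the 7-day value are assumptions made for this example, not Rivery requirements. Check how long your pipelines need their files before applying a rule like this.

```python
import json


def build_expiration_rule(days, prefix=""):
    """Return an S3 lifecycle configuration that expires objects after `days` days."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # an empty prefix matches every object
                "Expiration": {"Days": days},
            }
        ]
    }


# Example: keep files for 7 days (an assumed value, chosen only for illustration).
lifecycle = build_expiration_rule(7)
print(json.dumps(lifecycle, indent=2))
```

The resulting document can be applied to the bucket as its lifecycle configuration through the AWS console, CLI, or an SDK.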


Prerequisites


Amazon S3 Bucket

Create a Bucket

A bucket is an object container. To store data in Amazon S3, you must first create a bucket and specify a bucket name as well as an AWS Region. Then you upload your data as objects to that bucket in Amazon S3. Each object has a key (or key name) that serves as the object's unique identifier within the bucket.
Let's begin by logging into AWS and searching for Buckets:

Note:
This is a tour of the console. Please hover over the rippling dots and read the notes attached to follow through.
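Before creating the bucket, it can help to check the name you plan to use. The sketch below checks a subset of the general-purpose S3 bucket naming rules: 3 to 63 characters, lowercase letters, digits, dots, and hyphens only, starting and ending with a letter or digit, no adjacent dots, and not shaped like an IP address. The function name and the sample names are hypothetical, and AWS enforces further rules (for example, reserved prefixes) that are not covered here.

```python
import re

# Lowercase letters, digits, dots and hyphens; 3-63 characters in total;
# first and last character must be a letter or digit.
_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
# Names formatted like an IPv4 address are not allowed.
_IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def is_plausible_bucket_name(name):
    """Check a subset of the S3 bucket naming rules (not exhaustive)."""
    if not _NAME.match(name):
        return False
    if ".." in name:  # adjacent dots are not allowed
        return False
    if _IPV4.match(name):
        return False
    return True


# Hypothetical names, used only to exercise the checks.
for candidate in ["rivery-file-zone", "Rivery_FileZone", "ab", "192.168.1.10"]:
    print(candidate, is_plausible_bucket_name(candidate))
```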

Add a Policy

A bucket policy is a resource-based policy that allows you to grant access permissions to your bucket and the objects contained within it.
Now that you've created a bucket, let's create a policy to grant the necessary permissions.

Please Note:
Make sure to replace <RiveryFileZoneBucket> with the name of your S3 bucket.


Here's the policy's code:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RiveryManageFZBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketCORS",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::<RiveryFileZoneBucket>"
    },
    {
      "Sid": "RiveryManageFZObjects",
      "Effect": "Allow",
      "Action": [
        "s3:ReplicateObject",
        "s3:PutObject",
        "s3:GetObjectAcl",
        "s3:GetObject",
        "s3:PutObjectVersionAcl",
        "s3:PutObjectAcl",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::<RiveryFileZoneBucket>/*"
    },
    {
      "Sid": "RiveryHeadBucketsAndGetLists",
      "Effect": "Allow",
      "Action": "s3:ListAllMyBuckets",
      "Resource": "*"
    }
  ]
}
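If you prefer to generate the policy rather than edit the placeholder by hand, the following minimal sketch fills in the bucket name and prints the finished JSON. The function name `render_policy` and the bucket name `my-file-zone` are hypothetical. The statements themselves are copied from the policy above.

```python
import json


def render_policy(bucket_name):
    """Return the policy above as JSON text with the bucket name filled in."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RiveryManageFZBucket",
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketCORS",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                ],
                "Resource": bucket_arn,  # bucket-level actions
            },
            {
                "Sid": "RiveryManageFZObjects",
                "Effect": "Allow",
                "Action": [
                    "s3:ReplicateObject",
                    "s3:PutObject",
                    "s3:GetObjectAcl",
                    "s3:GetObject",
                    "s3:PutObjectVersionAcl",
                    "s3:PutObjectAcl",
                    "s3:ListMultipartUploadParts",
                ],
                "Resource": f"{bucket_arn}/*",  # object-level actions
            },
            {
                "Sid": "RiveryHeadBucketsAndGetLists",
                "Effect": "Allow",
                "Action": "s3:ListAllMyBuckets",
                "Resource": "*",
            },
        ],
    }
    return json.dumps(policy, indent=2)


# "my-file-zone" is a placeholder bucket name for this example.
policy_text = render_policy("my-file-zone")
print(policy_text)
```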

Create a Rivery User in AWS

Next, to connect to the Amazon S3 Source and Target (described in the following section) in the Rivery console, you must first create a Rivery user in AWS.

Connect to Amazon S3

To connect to Amazon S3, see our Amazon S3 Connection documentation.

Once you've finished creating the bucket and connecting to Amazon S3 in Rivery, continue to 'Configure Custom File Zone in Rivery' below.


Azure Blob Storage Container

  1. Follow the Microsoft documentation to create a Standard Azure Account.
    Please Note: Only Standard Azure accounts can use Azure Blob Storage Containers (Custom File Zones) with Rivery. Make sure to choose Standard in the Performance section.
  2. Make sure all of the settings are correct before clicking Create. The creation of a Blob account may take a few minutes.
  3. Click on Go to resource.
  4. Choose Containers (alternatively, scroll down the main menu to Blob Service and select Containers).
  5. In the upper left corner, click on +Containers.
  6. Give the container a Name.
  7. From the Public access level drop-down menu, select Container.
  8. Click OK.
  9. Go to Access Keys in the storage account menu.
  10. Copy and save your keys.
    You will use them when connecting to Azure Blob Storage in Rivery.

Connect to Azure Blob Storage

  1. Type in your Connection Name.
  2. Fill out the Account Name and Account Key.
  3. Enter your SAS Token.
    Note: This is mandatory when using Blob Storage as a Custom File Zone (it is optional only when Blob Storage is used as a source).
    To create a SAS Token, consult the Microsoft documentation.
  4. Use the Test Connection function to verify that the connection works.
    If the test succeeds, you can use this connection in Rivery.
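Since the SAS Token is mandatory here, an expired token is a likely cause of a failed connection test. The sketch below reads the `se` (signed expiry) field from a SAS token's query string and reports whether it has passed. The helper names and the sample token are made up for illustration, and the code only inspects the token locally. It does not contact Azure or validate the signature.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs

# Expiry formats a SAS token may use for the "se" field (UTC).
_EXPIRY_FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%MZ", "%Y-%m-%d")


def sas_expiry(token):
    """Return the expiry time stored in the token's 'se' field as a UTC datetime."""
    fields = parse_qs(token.lstrip("?"))
    if "se" not in fields:
        raise ValueError("token has no 'se' (expiry) field")
    raw = fields["se"][0]
    for fmt in _EXPIRY_FORMATS:
        try:
            return datetime.strptime(raw, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized expiry format: {raw!r}")


def sas_is_expired(token, now=None):
    """Return True if the token's expiry time is at or before `now` (default: current time)."""
    now = now or datetime.now(timezone.utc)
    return sas_expiry(token) <= now


# A fabricated token with a placeholder signature, for illustration only.
sample_token = "?sv=2022-11-02&ss=b&srt=sco&sp=rwl&se=2030-01-01T00:00:00Z&sig=EXAMPLE"
print(sas_expiry(sample_token).isoformat())
```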


Once you've finished creating the container and connecting to Azure Blob Storage in Rivery, continue to 'Configure Custom File Zone in Rivery' below.


Configure Custom File Zone in Rivery

  1. Go to Connections -> +New Connection and search for your Target warehouse.

  2. Type in your Connection Details and Credentials.

  3. Toggle the Custom File Zone to true.

  4. Pick your cloud storage option.

  5. Click File Zone Connection and select the File Zone connection you configured earlier.

  6. Choose a Default Bucket (Container) from the drop-down list.

  7. Use the Test Connection function to verify that the connection works.
    If the test succeeds, click Save.


Please Note:

  • Snowflake and Databricks support Custom File Zones in both Amazon S3 and Azure Blob Storage.
  • For Databricks, additional settings are required before Azure Blob Storage can serve as a Custom File Zone. Add the following configuration to the SQL Warehouse in Databricks:
spark.hadoop.fs.azure.account.auth.type.rivery.dfs.core.windows.net SAS
spark.hadoop.fs.azure.sas.token.provider.type.rivery.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider
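The two settings above follow the usual ABFS pattern in which the key contains the storage account host name. Assuming the `rivery` segment stands for your own storage account name (an assumption worth confirming for your environment), this small sketch produces the lines for a given account. The function name is hypothetical.

```python
def databricks_sas_config(storage_account):
    """Build the two SQL Warehouse settings for a given storage account name."""
    # Assumption: the account name replaces the "rivery" segment of the keys.
    host = f"{storage_account}.dfs.core.windows.net"
    return [
        f"spark.hadoop.fs.azure.account.auth.type.{host} SAS",
        f"spark.hadoop.fs.azure.sas.token.provider.type.{host} "
        "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider",
    ]


config_lines = databricks_sas_config("rivery")
for line in config_lines:
    print(line)
```

Each printed line is a key and a value separated by a single space, which is the form the settings above use.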
