Fivetran Community

adamrees · 04-21-2023

Subject

Automate the deployment of pipelines across multiple environments following CI/CD best practices using the Fivetran API.

Multi-environment setup: Dev > Staging > Prod

A multi-environment setup is typical for an efficient deployment process. These isolated environments often include three or more stages. [1]

Development: Developers deploy their code here first and test newly created features. Issues and bugs are fixed and retested until the code is ready for the next stage of testing.
Staging: Also referred to as a QA or test environment. Developers run manual and automated tests here. The most rigorous testing should happen in this stage to ensure the product is stable, and any bugs or enhancements are reported back to developers for resolution. The QA environment can be complex and serves as a mixing ground for reviewing changes. Once a change is deemed stable, it can move to the next stage.
Production: This is where end users interact with the product and where value is created for customers and the business. Production is widely available and affects the company’s reputation and brand, so it should be as free of issues and bugs as possible.

Multi-environment Setup Components & Common Scenarios:

Data Sources

Production data copies
Read Replicas
Sandboxes (Input random data)

Organization Adoption

A desire to test Fivetran features, History Mode, schema drift, DDL changes, etc.
Multiple parties needing confidence in the Product
To set your organization up to scale
A workflow that allows the organization to feed Fivetran
People in control of specific aspects of the Modern Data Stack (MDS)

Warehouse Segmentation

Same source but you want to test in different locations in your destination
Possibly needed based on Warehouse user roles
Good for building Analytical models (example from BI Tool Looker)

Key Benefits:

Reliable testing data for developers.
Standardize development practice across multiple departments (Marketing, Finance, Ops, IT, etc.).
Promotes RBAC across the Fivetran account.
Process consolidation and transparency through the Fivetran UI and the Fivetran Log connector.

Proposed Scenarios:

We have several workflows that customers have used to set up a Multi-Stage environment. Here are the most common ways customers approach this.

Promote + Parallel
Promote + Pause
Sandbox + Pause

The following proposals can all be implemented using a single Fivetran account.

Resources: API Interactive Resources | Postman Collection | Python | New Connector Free Use Period

Promote + Parallel:

Organizations utilizing a Multi-stage environment setup to keep a staging environment fresh with production data can orchestrate that synchronization via the Fivetran API. Developers can test their solutions in an isolated environment with simulated production loads before risking any impact on a production system. Following Continuous Delivery best practices, this process will amount to X connectors running in parallel across Y destinations in Fivetran. Here is an example that breaks out the necessary steps with code snippets.

How to set up:

Create and Configure a connector in your Staging environment (Vital Data Source)
Sync schedule: Every day or once a week at the close of business.
Tables: Limited and pertinent to 80% of developer projects.
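The staging setup above could be expressed as a request body for the connector-creation endpoint. This is a minimal sketch, not a definitive implementation: `build_staging_payload` is a hypothetical helper, the service name, group ID, and config values are placeholders, and the `sync_frequency` (in minutes) and `daily_sync_time` fields are used as we understand them from the Fivetran REST API, so verify them against the API reference for your connector type.

```python
def build_staging_payload(service, group_id, config, daily_sync_time="18:00"):
    """Return a request body for a staging connector that syncs once a day.

    Hypothetical helper. Field names are assumptions to check against the
    Fivetran API reference before use.
    """
    return {
        "service": service,
        "group_id": group_id,          # group ID of the Staging destination
        "config": dict(config),        # connector-specific settings
        "sync_frequency": 1440,        # minutes; 1440 = once every 24 hours
        "daily_sync_time": daily_sync_time,  # assumed close-of-business hour
        "paused": False,
    }


# Placeholder values for illustration only.
payload = build_staging_payload(
    service="postgres",
    group_id="<staging_group_id>",
    config={"schema_prefix": "staging_pg"},
)
print(payload)
```

The resulting dictionary can be passed as `json=payload` to a `requests.post` call against `https://api.fivetran.com/v1/connectors`. Table selection is not covered here.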

How to promote:

Clone the staging connector configuration.

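import requests
from requests.auth import HTTPBasicAuth

# Assumed placeholders: replace with your own API key, API secret, and connector ID
a = HTTPBasicAuth("<api_key>", "<api_secret>")
connectorId = "<staging_connector_id>"

# Connector to mirror
mu = "https://api.fivetran.com/v1/connectors/"
session = requests.Session()
u_0 = mu + "{}"
response = session.get(url=u_0.format(connectorId), auth=a).json()
data_list = response['data']
# Validate the connector data to migrate
# print(data_list)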

POST the new connector to the Production destination.
Sync schedule: Every 5 minutes to once every 24 hours. All end-user requirements must be met.
Tables: 100% of downstream analytics, customer-facing, and internal solution data.

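# Format the response data to create the connector
d = "<production_group_id>"  # assumed placeholder: group ID of the Production destination
u_1 = "https://api.fivetran.com/v1/connectors"
c = {"service": data_list['service'],
     "group_id": d,
     "trust_certificates": True,
     "run_setup_tests": True,
     "paused": False,
     # etc.: add the remaining fields your connector requires
     }
# Create the connector and review the results (a is the auth object used for the GET request)
x = requests.post(u_1, auth=a, json=c)
z = x.json()
resp = z['data']
print(x.text + " ***Connector Created***")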
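The formatting step in the snippet above can also be isolated into a small function, which makes it easier to test before anything is sent to the API. This is a sketch under stated assumptions: `build_promotion_payload` is a hypothetical helper, the sample response is invented for illustration, and it assumes that sensitive config values come back masked from the GET request and therefore need to be supplied again for the production connector.

```python
import copy


def build_promotion_payload(staging_data, prod_group_id, secrets=None):
    """Build a create-connector body for Production from a staging connector.

    staging_data is the 'data' object returned when fetching the staging
    connector. secrets holds config values to re-supply (assumption: the
    API masks sensitive fields in its responses).
    """
    config = copy.deepcopy(staging_data.get("config", {}))
    config.update(secrets or {})
    return {
        "service": staging_data["service"],
        "group_id": prod_group_id,
        "config": config,
        "trust_certificates": True,
        "run_setup_tests": True,
        "paused": False,
    }


# Invented sample response for illustration only.
staging_data = {
    "id": "stg_conn",
    "service": "postgres",
    "group_id": "stg_group",
    "config": {"host": "db.example.com", "password": "******"},
}
payload = build_promotion_payload(staging_data, "prod_group", {"password": "s3cret"})
print(payload)
```

Because the helper copies the config rather than mutating it, the same staging response can be reused to build payloads for several destinations.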

Promote + Parallel Benefits:

Access to reliable data for impact analysis and bug fixes will expedite the developer’s time to value.
Standardize impact analysis across every data source.
Transparency into the cost of each environment down to the table level via the Fivetran Logs.

Promote + Pause:

Following continuous integration best practices, this process has four steps: set up a connector that syncs to a development landing destination, test that it works correctly, ‘promote’ the source to a production destination, and pause the development connector. Only one connector for the source syncs through Fivetran at a time, but you can still analyze the pipeline before syncing to production.

How to Set Up:

Create and Configure a connector in your Staging environment (Vital Data Source).
Fine-tune which tables you are interested in syncing.
This connector can also be created manually within the Fivetran UI.

How to Promote:

This is a one-off promotion: use the API to capture and store the current configuration, then use that information to create a new connector in your production destination. Here is a snippet of code to use.

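import requests
from requests.auth import HTTPBasicAuth

# Assumed placeholders: replace with your own API key, API secret, and connector ID
a = HTTPBasicAuth("<api_key>", "<api_secret>")
connectorId = "<staging_connector_id>"

# Connector to mirror
mu = "https://api.fivetran.com/v1/connectors/"
session = requests.Session()
u_0 = mu + "{}"
response = session.get(url=u_0.format(connectorId), auth=a).json()
data_list = response['data']
# Validate the connector data to migrate
# print(data_list)

# Format the response data to create the connector
d = "<production_group_id>"  # assumed placeholder: group ID of the Production destination
u_1 = "https://api.fivetran.com/v1/connectors"
c = {"service": data_list['service'],
     "group_id": d,
     "trust_certificates": True,
     "run_setup_tests": True,
     "paused": False,
     # etc.: add the remaining fields your connector requires
     }
# Create the connector and review the results
x = requests.post(u_1, auth=a, json=c)
z = x.json()
resp = z['data']
print(x.text + " ***Connector Created***")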
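The last step of this scenario, pausing the development connector once the production connector exists, is not covered by the snippet above. A minimal sketch follows, assuming the connector can be paused by sending a PATCH request with `"paused": true` to the same connectors endpoint. `pause_connector` is a hypothetical helper, and the session and auth objects are passed in so it can be exercised without network access.

```python
API_BASE = "https://api.fivetran.com/v1/connectors/"


def pause_connector(session, connector_id, auth):
    """Pause one connector and return the parsed JSON response.

    session is expected to behave like requests.Session (it needs a
    .patch method); auth is whatever that session accepts for basic auth.
    """
    response = session.patch(
        API_BASE + connector_id,
        auth=auth,
        json={"paused": True},  # assumed request body for pausing
    )
    response.raise_for_status()  # surface HTTP errors instead of ignoring them
    return response.json()
```

With the `requests` library this could be called as `pause_connector(requests.Session(), "<staging_connector_id>", a)`, where `a` is the same auth object used to create the production connector.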

Promote + Pause Benefits:

Ability to safely test the data in a separate location before syncing to production.
Does not require duplicate MAR with a continuous testing source.
Automation potential via the API.

Sandbox + Pause:

Similar to the scenario above, this process follows continuous integration practices. The steps are to test with a development source, pause that connector, and then connect to the main production system. With a dev source, whether a database with sample data or a sandbox application, you can see how Fivetran handles schema changes and data processing before connecting to the main production source.

How to Set Up:

Create and Configure a connector in your Staging environment.
This connector can also be created manually within the Fivetran UI.

How to Promote:

Since this connector will be using a different source, you will not have a direct promotion.
You can automatically pause the connector through the API once the 14-day free trial completes. Set the `pause_after_trial` field to true, either when creating the connector or later when modifying it.
You can then create your production connector once you are satisfied with the testing connectors.
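The two ways of setting `pause_after_trial` described above can be sketched as request bodies. This is an illustration under assumptions rather than a definitive implementation: both helper functions are hypothetical, the service name, IDs, and config values are placeholders, and the exact set of required fields depends on the connector type.

```python
CONNECTORS_URL = "https://api.fivetran.com/v1/connectors"


def creation_body(service, group_id, config):
    """Request body that sets pause_after_trial when the connector is created."""
    return {
        "service": service,
        "group_id": group_id,
        "config": dict(config),
        "pause_after_trial": True,
    }


def modification_request(connector_id):
    """URL and body for setting pause_after_trial on an existing connector."""
    return CONNECTORS_URL + "/" + connector_id, {"pause_after_trial": True}


# Placeholder values for illustration only.
body = creation_body("postgres", "<sandbox_group_id>", {"schema_prefix": "sandbox_pg"})
url, patch_body = modification_request("<sandbox_connector_id>")
print(body)
print(url, patch_body)
```

The creation body would be sent with a POST request to `CONNECTORS_URL`, and the modification body with a PATCH request to the returned URL.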

Sandbox + Pause Benefits:

Separate source allows for more testing on the source side without concerns about impacting the production system.
Ability to pause test connectors after the trial avoids usage of MAR for testing resources.

How to Monitor your Multi-stage environment setup:

Once you have set up a multi-stage environment, you may want additional monitoring on your connectors.

Account level Fivetran Log Connector
MAR usage, Schema Changelog and audit tables are available via Quickstart data models.
Fivetran log sample queries.
Calculate MAR by table.
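To illustrate the MAR-by-table idea, here is a small sketch that totals active rows per table from rows exported out of the Fivetran Log connector. The column names `schema_name`, `table_name`, and `incremental_rows` are assumptions about the log schema, the sample rows are invented, and `mar_by_table` is a hypothetical helper. In practice this aggregation would more likely run as a SQL query in the warehouse.

```python
from collections import defaultdict


def mar_by_table(rows):
    """Sum active rows per (schema, table) from log-connector rows.

    Each row is a dict; the key names are assumptions about the log schema.
    """
    totals = defaultdict(int)
    for row in rows:
        key = (row["schema_name"], row["table_name"])
        totals[key] += row["incremental_rows"]
    return dict(totals)


# Invented sample rows for illustration only.
sample_rows = [
    {"schema_name": "staging_pg", "table_name": "orders", "incremental_rows": 120},
    {"schema_name": "staging_pg", "table_name": "orders", "incremental_rows": 30},
    {"schema_name": "prod_pg", "table_name": "orders", "incremental_rows": 500},
]
totals = mar_by_table(sample_rows)
print(totals)
```

Keying on the schema as well as the table keeps staging and production copies of the same table separate, which is what makes the per-environment cost comparison possible.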

Written by Adam Rees and Elijah Davis with assistance from Srinivas Belide

Multi-environment Deployment Framework
