02-12-2024 11:03 AM
Hi Community,
I'm using Fivetran to sync data from my Snowflake data warehouse back into our Postgres application database - here's the complete life cycle of the data:
Postgres -> Fivetran -> Snowflake (ingestion db) -> dbt -> Snowflake (reporting db) -> Fivetran -> Postgres (different/dedicated schema)
The use case here is that we're cleaning up and aggregating data from various sources (including our Postgres database) using dbt, and then pushing it back into Postgres purely for reporting.
The data in my destination tables always appears to be fully replaced rather than having individual rows updated. I end up with a very large number of rows with `_fivetran_deleted = false`.
I also noticed that my destination tables all have a `_fivetran_id` column, which is strange because the source dbt models have a `unique_key` configured.
Can you please point me in the right direction to troubleshoot?
Solved! Go to Solution.
02-12-2024 12:16 PM
Hi @george_durzi ,
When you say "destination tables", which destination are you referring to? I would assume the Snowflake reporting db?
Do you mind sharing the logic of your models that aren't giving you expected results?
Thanks
02-12-2024 12:31 PM
Hi @jacklowery thanks for your response.
I can see how that isn't clear. By "destination tables", I'm referring to the tables that get created in Postgres.
I have 2 Fivetran connectors:
There's nothing special about the models; here's an example:
```sql
{{ config(unique_key='id') }}

select a.*
from {{ ref('stg_attorney_all') }} a
```
In this case, `stg_attorney_all` is an intermediate model doing a select from my `attorney` table and doing some cleanup / formatting.
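As an aside for anyone hitting the same thing: dbt only applies `unique_key` when the model is materialized as incremental, so a table-materialized model silently ignores it. A minimal incremental sketch of the same model might look like this (the `updated_at` column is just an assumption for illustration):

```sql
{{ config(materialized='incremental', unique_key='id') }}

select a.*
from {{ ref('stg_attorney_all') }} a

{% if is_incremental() %}
  -- on incremental runs, only pick up rows changed since the last build
  -- (assumes the staging model exposes an updated_at timestamp)
  where a.updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```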
02-12-2024 03:54 PM
Hi @george_durzi ,
I think I am tracking. Here is what I would do if I were you.
Once your Postgres data lands in Snowflake (ingest db), make sure to use dbt's contracts (https://docs.getdbt.com/reference/resource-configs/contract) and constraints (https://docs.getdbt.com/reference/resource-properties/constraints) in order to actually transform your raw data and assign primary keys to the tables that land in your reporting db.
Fivetran is assigning a unique key because it doesn't see one (https://fivetran.com/docs/databases/snowflake#tableswithaprimarykey). You can confirm whether there is actually a key by running a describe query on your output models:
desc table database.schema_something.table_name;
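As a quicker check, Snowflake can also list declared key constraints directly (same placeholder table name as above):

```sql
show primary keys in table database.schema_something.table_name;
```

If this returns no rows, Fivetran has nothing to key on and will fall back to `_fivetran_id`.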
Here is an example config you can add to a schema.yml file:
```yaml
models:
  - name: pk_table_test_two
    config:
      contract:
        enforced: true
    columns:
      - name: id
        data_type: integer
        description: hello
        constraints:
          - type: not_null
          - type: primary_key
        tests:
          - unique
          - not_null
      - name: other_column_name
        data_type: date
```
02-13-2024 08:40 AM
Thank you so much @jacklowery, that did the trick.
The only thing I'd add is that I was prompted to add `warn_unenforced: false` on this constraint to suppress the following warning.
[WARNING]: The constraint type primary_key is not enforced by snowflake. The constraint will be included in this model's DDL statement, but it will not guarantee anything about the underlying data. Set 'warn_unenforced: false' on this constraint to ignore this warning.
I struggled to find the correct syntax for `warn_unenforced: false`, so here's a working example:
```yaml
models:
  - name: master_attorney
    config:
      contract:
        enforced: true
    columns:
      - name: attorney_id
        constraints:
          - type: not_null
          - type: primary_key
            warn_unenforced: false
```
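One small follow-up: after adding the contract, you can rebuild and test just that model from the dbt CLI to confirm the key is emitted in the DDL:

```shell
# run the model and its tests in dependency order
dbt build --select master_attorney
```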