
dbt models have unique_key configured but destination tables have _fivetran_id and many deleted rows

george_durzi
New Contributor II

Hi Community, 

I'm using Fivetran to sync data from my Snowflake data warehouse back into our Postgres application database - here's the complete life cycle of the data:

Postgres -> Fivetran -> Snowflake (ingestion db) -> dbt -> Snowflake (reporting db) -> Fivetran -> Postgres (different/dedicated schema) 

The use case here is that we're cleaning up and aggregating data from various sources (including our Postgres database) using dbt, and then pushing it back into Postgres purely for reporting. 

The data in my destination tables appears to be replaced wholesale on every sync rather than updated in place, and I end up with a very large number of rows with `_fivetran_deleted = true`.

I also noticed that my destination tables all have a `_fivetran_id` column, which is strange because the source dbt models have a `unique_key` configured.

Can you please point me in the right direction to troubleshoot? 

1 ACCEPTED SOLUTION

Hi @george_durzi ,

 

I think I am tracking. Here is what I would do if I were you.

Once your Postgres data lands in Snowflake (ingestion db), use dbt's contracts (https://docs.getdbt.com/reference/resource-configs/contract) and constraints (https://docs.getdbt.com/reference/resource-properties/constraints) to assign primary keys to the tables that land in your reporting db.

Fivetran is assigning a surrogate `_fivetran_id` key because it doesn't see a primary key (https://fivetran.com/docs/databases/snowflake#tableswithaprimarykey). You can confirm whether a key actually exists by running a describe query on your output models:

desc table database.schema_something.table_name;
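If you'd rather see the key constraints directly instead of scanning the describe output, Snowflake also has a dedicated command for it (standard Snowflake syntax; the qualified table name below just mirrors the placeholder above):

```sql
-- List any primary key constraints defined on the table;
-- an empty result means Fivetran will fall back to _fivetran_id
show primary keys in table database.schema_something.table_name;
```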

Here is an example config you can add to a schema.yml file:

models:
  - name: pk_table_test_two
    config:
      contract:
        enforced: true
    columns:
      - name: id
        data_type: integer
        description: hello
        constraints:
          - type: not_null
          - type: primary_key
        tests:
          - unique
          - not_null
      - name: other_column_name
        data_type: date
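For context on what this changes: with the contract enforced, dbt stops emitting a plain `create table as select` and instead declares the columns and constraints explicitly, so the reporting table lands with roughly the following DDL (a sketch only; the database/schema names are placeholders, and the inner select stands in for the model's compiled SQL):

```sql
-- Approximate DDL dbt emits for the contracted model above
create or replace table reporting_db.some_schema.pk_table_test_two (
    id integer not null primary key,
    other_column_name date
)
as (
    -- the model's compiled select goes here
    select id, other_column_name
    from reporting_db.some_schema.some_upstream_table
);
```

Because the primary key is now part of the table's DDL, Fivetran can pick it up and merge on it instead of appending rows keyed by `_fivetran_id`.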


5 REPLIES

jacklowery
Fivetranner

Hi @george_durzi ,

When you say "destination tables", which destination are you referring to? I would assume the Snowflake reporting db?

Do you mind sharing the logic of your models that aren't giving you expected results?

 

Thanks

Hi @jacklowery thanks for your response. 

I can see how that isn't clear. By "destination tables", I'm referring to the tables that get created in Postgres. 

I have 2 Fivetran connectors:

  • Postgres to Snowflake (ingestion)
    • My dbt Cloud project runs automatically after this connector runs, and populates the Snowflake reporting tables
  • Snowflake (reporting) to Postgres 

There's nothing special about the models, here's an example:

{{ config(unique_key='id') }}

select a.*
from   {{ ref('stg_attorney_all') }} a

In this case, `stg_attorney_all` is an intermediate model that selects from my `attorney` table and does some cleanup / formatting.
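Worth noting for anyone hitting the same symptom: in dbt, `unique_key` only takes effect for incremental (or snapshot) materializations; on a plain table materialization the model is rebuilt on every run and the config is silently ignored, which matches the "replaced rather than updated" behavior described above. If incremental behavior is actually what's wanted, the model would look roughly like this (a sketch; the `updated_at` column is assumed, not taken from the thread):

```sql
{{ config(materialized='incremental', unique_key='id') }}

select a.*
from   {{ ref('stg_attorney_all') }} a

{% if is_incremental() %}
  -- only process rows newer than what's already in the target
  -- (assumes the source carries an updated_at timestamp)
  where a.updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

Even so, this only changes how dbt writes the Snowflake table; the Fivetran merge behavior downstream still depends on the table exposing a real primary key, which is what the accepted solution addresses.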


Thank you so much @jacklowery, that did the trick.

The only thing I'd add is that I was prompted to add `warn_unenforced: false` on this constraint to suppress the following warning:

[WARNING]: The constraint type primary_key is not enforced by
snowflake. The constraint will be included in this model's DDL statement, but it
will not guarantee anything about the underlying data. Set 'warn_unenforced:
false' on this constraint to ignore this warning.

I struggled to find the correct syntax for `warn_unenforced: false`, so here's a working example:

models:
  - name: master_attorney
    config:
      contract:
        enforced: true
    columns:
      - name: attorney_id
        constraints:
          - type: not_null
          - type: primary_key
            warn_unenforced: false