10-11-2023 07:00 AM
Hi,
We are replicating between SQL Server and Snowflake using HVR 6 / LDP.
One consideration impacting reporting against both environments is the different behaviour when it comes to handling varchar data with trailing spaces.
SQL Server (intuitively..?) trims/ignore trailing spaces when performing JOIN/WHERE clause operations. Whereas, Snowflake is more strict on comparisons and will include SPACEs as valid data to compare etc.
The "right-thing" is to fix the data in source - but this will not a be quick task - and will impact deadlines etc.
Spotted the "TrimWhiteSpace" table property action - which we could apply on the TARGET (Snowflake) location for ALL tables etc.
Does anybody have experiences to share on this feature? Initial concerns around performance and impact against integration would be my initial reaction..
Cheers
Terry
Solved! Go to Solution.
10-11-2023 09:34 AM
Hi Terry,
Every function will have a performance impact, but I believe this will have the lowest. Otherwise you can ofcourse add the functions to the join statement with probably has a higher impact. If you handle it during replication you have to pay the price only once.
Cheers,
Richard
10-11-2023 08:33 AM
Hi Terry,
What do you want to accomplish with the trimwhitespaces? Do you want to remove the trailing spaces for the compare functionality?
I would say it makes mostly your life easier. The other solution would be to create integrate and capture expressions. Which is more cumbersome and requires the concept of context.
Although I didn;t tested it I would say that the removal of trailing whitespaces will be minimal it is only a function on the value and will not force the database to do full table scans or ordering which are costly operations on a database.
Cheers,
Richard
10-11-2023 08:40 AM
Hey Richard, thanks for response.
The main issue is that Snowflake reports are producing incorrect results with the same SQL statements run against SQL Server (source). We are finding more examples as testing is progressing!
The reason - is trailing spaces in the data - as Snowflake will include spaces in comparisons.
We could try and change the Snowflake queries with TRIM commands etc - but that will impact query performance.
The 'TrimWhiteSpace' option looked like a good "catch all" option - but assume there will be a performance cost to pay.
Thanks
Terry
10-11-2023 09:34 AM
Hi Terry,
Every function will have a performance impact, but I believe this will have the lowest. Otherwise you can ofcourse add the functions to the join statement with probably has a higher impact. If you handle it during replication you have to pay the price only once.
Cheers,
Richard
10-17-2023 12:06 PM
Hi,
Thought I would share my findings.... We ran some Refresh tests against a test table ( ~100 mil rows) loading into Snowflake. We ran repeated refreshes, with settings :-
Hope this is useful.
Cheers
Terry