Hi Guys,
First of all, thank you for creating this package. Your efforts are much appreciated.
I noticed the hash adapter converts the md5 bytes into a varchar:
{% macro sqlserver__hash(field) %}
convert(varchar(50), hashbytes('md5', {{field}}), 2)
{% endmacro %}
If we leave the hash as varbinary, it uses only 16 bytes of data, but as varchar it uses 32. I suspect it's probably also slower to do joins on 32 character varchar than on a varbinary.
Would it be possible to take out that conversion?
I have been using varbinary in my own adapter for several months without problems. The only issue is that Power Bi doesn't support relationships on varbinary columns.
To solve that problem I created a separate macro that does the conversion when it's needed:
{% macro cast_hash_to_str(col) -%}
convert(varchar(32), {{ col }}, 2)
{%- endmacro %}
And in my final model that is being consumed by PBI I just do something like...
final as (
select
{{ cast_hash_to_str('url_key') }} as UrlKey,
...
from whatever
)
This way the hashes use less space in the DB, perform faster joins and when necessary, such as in Power Bi they are converted to strings.
Hi Guys,
First of all, thank you for creating this package. Your efforts are much appreciated.
I noticed the hash adapter converts the md5 bytes into a varchar:
If we leave the hash as varbinary, it uses only 16 bytes of data, but as varchar it uses 32. I suspect it's probably also slower to do joins on 32 character varchar than on a varbinary.
Would it be possible to take out that conversion?
I have been using varbinary in my own adapter for several months without problems. The only issue is that Power Bi doesn't support relationships on varbinary columns.
To solve that problem I created a separate macro that does the conversion when it's needed:
And in my final model that is being consumed by PBI I just do something like...
This way the hashes use less space in the DB, perform faster joins and when necessary, such as in Power Bi they are converted to strings.