Description
📝 Feature Description
Support dbt models without a database/catalog (Lakehouse pattern).
That is, support two-part references (schema and table) instead of always requiring the three-part naming the code currently expects.
🤔 Problem Statement
We want to support dbt models built with the Lakehouse adapter (FabricSpark), which use the 'spark' dialect. Spark is well supported by the 'sqlglot' library.
The issue is that the current code expects at least three-part naming (db/catalog, schema, table).
Without all three parts it throws an error like 'NoneType' has no attribute 'meta'.
Even after providing a default value for the database, the table-linking code for lineage still seems to search for tables using the Catalog.Schema.Table pattern, and in our case it looks in the dbt manifest for "..". The leading "." forces the lineage to be created as '_HARDCODED__REF' even though the table already exists in manifest.json.
🚀 Proposed Solution
This can be supported by applying two fixes to the Python source code:
- When parsing the schema from manifest.json (created by dbt docs), check whether the database is null and, if so, fall back to "default" to avoid the 'NoneType' runtime error.
- When linking tables, if no catalog is provided, remove the leading "." from the lookup key so that lineage links to the correct table. Without this fix, lineage links to a HARDCODED table without any column information.
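The two fixes above could be sketched roughly as follows. This is only an illustration of the idea, not the actual dbt-colibri code: the helper names `node_database` and `table_key`, the `known_tables` index, and the `sales.orders` example are all hypothetical.

```python
from typing import Optional

# Assumption: fallback value used when a manifest node has no database.
DEFAULT_DATABASE = "default"


def node_database(node: dict) -> str:
    """Fix 1: return the node's database, falling back when it is missing or null."""
    return node.get("database") or DEFAULT_DATABASE


def table_key(catalog: Optional[str], schema: Optional[str], table: str) -> str:
    """Fix 2: join only the non-empty name parts, so no leading '.' is produced."""
    return ".".join(part for part in (catalog, schema, table) if part)


# Hypothetical index of relations known from manifest.json.
known_tables = {"sales.orders": "model.my_project.orders"}

# Joining an empty catalog naively yields a key with a leading ".",
# which does not match any entry in the index.
naive_key = ".".join(["", "sales", "orders"])

# Skipping the empty catalog yields a key that matches the manifest entry.
fixed_key = table_key(None, "sales", "orders")

print(naive_key in known_tables)
print(fixed_key in known_tables)
print(node_database({"database": None}))
```

With the second helper, a two-part reference resolves to the existing manifest entry instead of falling through to a hardcoded reference.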
📋 Alternatives
🙋 Contribution Interest
- I’d like to help implement this feature
✅ Additional Context
We appreciate dbt-colibri; it looks quite promising for our use case.