Subquery Partial Wildcard expansion breaks the column lineage path #612

@kkozhakin

Describe the bug
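A CASE expression selected alongside t.* in a CTE breaks the wildcard column lineage path: the chain stops at t3.* instead of continuing back through t2 and t1 to t0.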

SQL

create temporary table result_table on commit drop as
with t1 as (
    select
        t.*
    from t0 as t
), t2 as (
    select
        t.*
    from t1 as t
),
 t3 as (
    select
        case
            when FALSE
            then f1
            else null
        end as f2,
        t.*
    from t2 as t
), t4 as (
    select
    *
    from t3
)

select
    *
from t4;


INSERT INTO "schema_1"."table_1"
SELECT * FROM "result_table";

To Reproduce
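The SQL from the previous step is assumed to be saved in a file named test.sql.

from sqllineage.runner import LineageRunner

# Load the SQL from the previous step.
with open("test.sql") as f:
    sql = f.read()

lr_sqlfluff = LineageRunner(
    sql, dialect='greenplum', silent_mode=True
)

# Pass False so subquery-related paths are not excluded; print each path target-first.
for path in lr_sqlfluff.get_column_lineage(False):
    print(" <- ".join(str(col) for col in reversed(path)))

Actual output: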
schema_1.table_1.* <- <default>.result_table.* <- t4.* <- t3.*
t2.* <- t1.* <- <default>.t0.*
t3.f1 <- t2.f1
t3.f2 <- t2.f1

Expected behavior
The wildcard lineage should form a single unbroken path from schema_1.table_1.* back to <default>.t0.*:

schema_1.table_1.* <- <default>.result_table.* <- t4.* <- t3.* <- t2.* <- t1.* <- <default>.t0.*
t3.f1 <- t2.f1
t3.f2 <- t2.f1
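
The gap between the actual and expected output can be checked mechanically. The following is a minimal sketch in plain Python that does not call sqllineage at all: it parses the printed `a <- b <- c` lines into edges and tests whether `<default>.t0.*` can reach `schema_1.table_1.*`. The helper names `parse_paths` and `reaches` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def parse_paths(lines):
    """Turn printed 'a <- b <- c' lines into a set of (source, target) edges."""
    edges = set()
    for line in lines:
        nodes = [n.strip() for n in line.split("<-")]
        # In the printed form, each node is fed by the node to its right.
        for target, source in zip(nodes, nodes[1:]):
            edges.add((source, target))
    return edges


def reaches(edges, source, target):
    """Depth-first search: is there a directed path from source to target?"""
    graph = defaultdict(set)
    for s, t in edges:
        graph[s].add(t)
    seen, stack = set(), [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph[node])
    return False


# Lines copied from the actual output reported in this issue.
actual = [
    "schema_1.table_1.* <- <default>.result_table.* <- t4.* <- t3.*",
    "t2.* <- t1.* <- <default>.t0.*",
    "t3.f1 <- t2.f1",
    "t3.f2 <- t2.f1",
]

# Lines copied from the expected output in this issue.
expected = [
    "schema_1.table_1.* <- <default>.result_table.* <- t4.* <- t3.* <- t2.* <- t1.* <- <default>.t0.*",
    "t3.f1 <- t2.f1",
    "t3.f2 <- t2.f1",
]

actual_ok = reaches(parse_paths(actual), "<default>.t0.*", "schema_1.table_1.*")
expected_ok = reaches(parse_paths(expected), "<default>.t0.*", "schema_1.table_1.*")
print("actual connected:", actual_ok)
print("expected connected:", expected_ok)
```

In the actual output the only link missing is t3.* <- t2.*, so the source table t0 never connects to the final target; the expected output contains that link and the check passes.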

Python version (available via python --version)

  • 3.10

SQLLineage version (available via sqllineage --version):

  • 1.5.3

Labels: bug