Skip to content

[Bug] Casing not preserved in seeds #178

@gwenwindflower

Description

@gwenwindflower

Is this a new bug?

  • I believe this is a new bug
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

dbt appears to successfully create new seeds from case sensitive file names e.g. mySeed but then errors with an 'approximate match' error later on because the seed created actually follows the default casing of the warehouse (lower on Redshift, UPPER on Snowflake, etc). Having a cased seed in your project essentially results in an unclear error if you're using an adapter that defaults to casing one way or the other.

Expected Behavior

I would expect one of two things:

  1. dbt throws an error or warning about cased seeds
  2. dbt preserves the casing of the seed

Steps To Reproduce

  1. On Redshift or Snowflake (any adapter that has a default lower or UPPER casing policy will do)
  2. Create a seed called 'mySeed.csv' with whatever data
  3. dbt seed.
  4. dbt seed again or dbt build, anything that would involve referencing that seed.

Relevant log output

/home/circleci/project
[0m14:35:31  [[33mWARNING[0m]: Deprecated functionality

User config should be moved from the 'config' key in profiles.yml to the 'flags' key in dbt_project.yml.
[0m14:35:31  Running with dbt=1.8.0-b2
[0m14:35:31  Installing ../
[0m14:35:31  Installed from <local @ ../>
[0m14:35:31  Installing dbt-labs/dbt_utils
[0m14:35:31  Installed from version 1.1.1
[0m14:35:31  Up to date!
[0m14:35:33  [[33mWARNING[0m]: Deprecated functionality

User config should be moved from the 'config' key in profiles.yml to the 'flags' key in dbt_project.yml.
[0m14:35:33  Running with dbt=1.8.0-b2
[0m14:35:33  Registered adapter: redshift=1.8.0-b2
[0m14:35:33  Unable to do partial parsing because config vars, config profile, or config target have changed
[0m14:35:33  Unable to do partial parsing because env vars used in profiles.yml have changed
[0m14:35:33  Unable to do partial parsing because a project dependency has been added
[0m14:35:36  Found 6 models, 23 data tests, 4 seeds, 2 sources, 617 macros
[0m14:35:58  [[33mWARNING[0m]: Deprecated functionality

User config should be moved from the 'config' key in profiles.yml to the 'flags' key in dbt_project.yml.
[0m14:35:58  Running with dbt=1.8.0-b2
[0m14:35:59  Registered adapter: redshift=1.8.0-b2
[0m14:35:59  Found 6 models, 23 data tests, 4 seeds, 2 sources, 617 macros
[0m14:35:59  
[0m14:36:01  Concurrency: 1 threads (target='redshift')
[0m14:36:01  
[0m14:36:01  1 of 4 START seed file codegen_integration_tests_redshift_Raw_Data_Case_Sensitive.data__Case_Sensitive  [RUN]
[0m14:36:01  1 of 4 ERROR loading seed file codegen_integration_tests_redshift_Raw_Data_Case_Sensitive.data__Case_Sensitive  [[31mERROR[0m in 0.03s]
[0m14:36:01  2 of 4 START seed file codegen_integration_tests_redshift_raw_data.data__a_relation  [RUN]
[0m14:36:02  2 of 4 OK loaded seed file codegen_integration_tests_redshift_raw_data.data__a_relation  [[32mCREATE 2[0m in 0.70s]
[0m14:36:02  3 of 4 START seed file codegen_integration_tests_redshift_raw_data.data__b_relation  [RUN]
[0m14:36:02  3 of 4 OK loaded seed file codegen_integration_tests_redshift_raw_data.data__b_relation  [[32mCREATE 2[0m in 0.60s]
[0m14:36:02  4 of 4 START seed file codegen_integration_tests_redshift_raw_data.data__campaign_analytics  [RUN]
[0m14:36:03  4 of 4 OK loaded seed file codegen_integration_tests_redshift_raw_data.data__campaign_analytics  [[32mCREATE 3[0m in 0.63s]
[0m14:36:03  
[0m14:36:03  Finished running 4 seeds in 0 hours 0 minutes and 3.81 seconds (3.81s).
[0m14:36:03  
[0m14:36:03  [31mCompleted with 1 error and 0 warnings:[0m
[0m14:36:03  
[0m14:36:03    Compilation Error in seed data__Case_Sensitive (seeds/data__Case_Sensitive.csv)
  When searching for a relation, dbt found an approximate match. Instead of guessing 
  which relation to use, dbt will move on. Please delete "ci"."codegen_integration_tests_redshift_raw_data_case_sensitive"."data__case_sensitive", or rename it to be less ambiguous.
  Searched for: "ci"."codegen_integration_tests_redshift_Raw_Data_Case_Sensitive"."data__Case_Sensitive"
  Found: "ci"."codegen_integration_tests_redshift_raw_data_case_sensitive"."data__case_sensitive"
  
  > in macro materialization_seed_default (macros/materializations/seeds/seed.sql)
  > called by seed data__Case_Sensitive (seeds/data__Case_Sensitive.csv)
[0m14:36:03  
[0m14:36:03  Done. PASS=3 WARN=0 ERROR=1 SKIP=0 TOTAL=4

Environment

- OS: Alpine in Docker 
- Python: 3.9.9
- dbt-adapter: redshift=1.8.0-b2

Additional Context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    feature:quotingIssues related to dbt's quoting behaviortype:bugSomething isn't working as documented

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions