Fusion static analysis is getting a new default: baseline #1404
dataders
announced in
Announcements
Based on extensive community feedback from dbt Fusion's beta and preview periods, we're changing how static analysis works to optimize for the first-run experience when an existing dbt project upgrades to Fusion.
We're introducing a new level of static analysis called `baseline`, which will become the new default. In baseline mode, Fusion still validates your SQL, but it raises issues as warnings instead of blocking errors, so your runs can complete even if Fusion detects problems that would have stopped execution in strict mode (the previous default).
This is especially helpful when upgrading from dbt Core to Fusion for the first time: if your project already runs successfully in production, `baseline` lets you transition smoothly without the new static analysis checks unexpectedly breaking established workflows.
Now, teams can start getting benefits right away, then adopt stricter validation incrementally over time, which is conceptually similar to what some languages call "gradual typing".
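As a rough sketch of what incremental adoption could look like, the config below keeps a project on `baseline` and opts a single folder into `strict`. The project name `my_project` and the `marts` folder are hypothetical, and the `+static_analysis` placement assumes the usual `dbt_project.yml` config conventions; check the docs for the exact form.

```yaml
# dbt_project.yml (sketch; `my_project` and `marts` are hypothetical names)
models:
  my_project:
    # Project-wide level. baseline is described as the new default, so this
    # line is only here to make the starting point explicit.
    +static_analysis: baseline
    marts:
      # Opt one folder into stricter validation once it compiles cleanly.
      +static_analysis: strict
```

Widening the set of folders on `strict` over time is one way to read the "gradual typing" comparison.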
If you want to go deeper on what a SQL compiler even is, check out this blog post: The Levels of SQL Comprehension.
TL;DR
What's changing?
The `static_analysis` config is evolving to the following set of modes: `baseline`, `strict` (formerly `on`), `off`, `unsafe`.
What do I need to do?
Update the `static_analysis` config on your models:
`off` `baseline` `on` `strict` `unsafe` `strict`
`strict` `on` to `baseline`, your project will drop to baseline and lose some DevEx benefits (see below table).
Why baseline?
When we first shipped Fusion, we treated a clean compile with `static_analysis: strict` (formerly `on`) as the entry point for most of Fusion's new features and experiences. In practice, many real-world dbt projects that run successfully in dbt Core don't meet that bar on day one.
Some scenarios that consistently cause strict compilation to fail out of the gate:
`SELECT` from all upstream sources
Our updated goal is to make Fusion upgrades feel smooth for existing dbt Core projects, while still providing a clear path to adopt stronger guarantees over time. That's why we're introducing `static_analysis: baseline` as the new default.
What is baseline mode?
The below-linked docs page is a great resource, but it's worth going over the key concepts here.
Baseline mode is a less strict level of static analysis that:
`strict`
Additionally, it:
Feature availability by `static_analysis` config
Other things worth noting
Introspection consequences
Previously, the system assumed local schemas of your compiled models would be available, since users were compiling with `strict`. In baseline, we can no longer assume the full local schema is available and complete, so baseline uses the remote database as the source of truth, similar to dbt Core.
The practical result is that the Fusion compiler may sometimes flag queries that render incorrectly because an introspective query came back empty. If you encounter this, you can:
`warn_error_options` to disable the warning
For example, consider this query using the `dbt_utils.unpivot` macro:
If the introspection query fails or returns no results, this renders to:
This is invalid SQL and would normally produce a static analysis error. However, in baseline mode, the error is downgraded to a warning (see below). This behavior allows your project to continue running while still alerting you to potential issues with introspective queries.
Downloading of source schema
When `static_analysis: on` was the default, Fusion would automatically download the column types of all your source tables. Doing so gave Fusion the ability to:
In baseline mode, Fusion will not automatically download source schemas. This improves compile performance and reduces complexity, but at the expense of the above features.
We're exploring ways to reintroduce this metadata in baseline without bringing back the complexity of automatic schema downloads (see "What's next" later in this doc).
Unit tests
Unit tests now behave identically to dbt Core in a few ways:
`static_analysis` is `off` or `baseline`
`compile` does not render a unit test's SQL into `target/`
What's next? We want your feedback!
Beyond "any questions?", here are some areas we're thinking about:
Beyond asking the warehouse, how might Fusion get column data type information?
First and foremost, we intend to add column-level lineage analysis to baseline mode in the near future. We know how we'll do this technically, but how it should behave and surface diagnostics when results are incomplete is where we need your input.
With automatic source schema downloads disabled in baseline, we don't have column names and types for sources at compile time. Given that, we see two paths to reintroduce this metadata:
`select *` projections are opaque, so we can't guarantee 100% coverage. When we tested this on Jaffle Shop, about 70% of columns were successfully deduced. The open question: would you want best-effort analysis done automatically, or explicitly opted into?
Warnings or errors
Once source schemas are back in place for baseline, we can layer on gradual schema and type analysis, which raises the question:
Is configuring on the node enough?
We also intend to introduce `warn_error_options` configurations into Fusion within the next month, which makes me think:
Lastly, today you'd need to apply `static_analysis` across all node types in your `dbt_project.yml` (e.g. `models`, `seeds`, `analyses`, etc.). Should we introduce a top-level `static_analysis` key? For example:
Beyond that, could you imagine having more fine-tuned control over how your static analysis checks work? Is that something you're interested in?
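For context on the per-node-type repetition mentioned above, here is a rough sketch of setting the same level for two resource types today. The project name `my_project` is hypothetical, and the `+static_analysis` placement assumes the usual `dbt_project.yml` config conventions; it is not a rendering of the proposed top-level key.

```yaml
# dbt_project.yml (sketch; `my_project` is a hypothetical project name)
models:
  my_project:
    +static_analysis: strict
seeds:
  my_project:
    # The same value has to be repeated for each resource type;
    # any other node types would need their own entry as well.
    +static_analysis: strict
```

A single project-wide setting would remove this repetition, which is the motivation behind the question.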