Conversation

@lennartkats-db (Contributor)

Changes

This PR prepares a generic default template that I want to use as the basis for default-python, lakeflow-pipelines, and (likely) default-sql:

  • This template revises and replaces lakeflow-pipelines
  • The template now uses a "src" layout, with job/pipeline environments pointing to the directory of the pyproject.toml file
  • Jobs in this template now pass catalog and schema parameters, since the template asks a question about catalog/schema (see the sketch after this list)
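
For illustration, a job definition following these conventions might look roughly like the sketch below. Only the `--editable ${workspace.file_path}` dependency line is taken verbatim from this PR; the resource names, keys, and variable references are hypothetical:

```yaml
# Hypothetical sketch of resources/sample_job.job.yml
resources:
  jobs:
    sample_job:
      name: sample_job
      parameters:
        # Propagate the catalog/schema answers from the template prompts
        - name: catalog
          default: ${var.catalog}
        - name: schema
          default: ${var.schema}
      environments:
        - environment_key: default
          spec:
            client: "1"
            dependencies:
              # Install the project from the bundle root, where pyproject.toml lives
              - --editable ${workspace.file_path}
```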

To support the notion of a "generic" template, the template schema format now supports a `template_dir` argument. This allows us to have multiple `databricks_template_schema.json` files that point to one template directory.
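
As a sketch, a schema file could then reference a shared template directory like this; the `template_dir` field is the one introduced in this PR, while the path and the remaining fields are illustrative:

```json
{
  "template_dir": "../default",
  "welcome_message": "Welcome to the Lakeflow pipelines template!",
  "properties": {
    "project_name": {
      "type": "string",
      "description": "Unique name for this project",
      "default": "my_project"
    }
  }
}
```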

Out of scope: this PR does not yet update default-python. For early testing, a preview version is available as experimental-default-python. To keep the diff cleaner, I removed the acceptance tests for this template; re-adding them is left for a follow-up PR.
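
The preview template can be tried out with the standard `bundle init` command; the template name is taken from this PR:

```
databricks bundle init experimental-default-python
```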

Why

  • We want to follow the Lakeflow conventions in templates
  • Our templates are hard to maintain and have many inconsistencies. I'd like to move to one shared template for at least default-python and the Lakeflow template.

Tests

  • Standard template testing methodology.

@lennartkats-db changed the title from *[DRAFT] Prepare a new "generic" template, use "src layout" for Lakeflow template* to *Prepare a new "generic" template, use "src layout" for Lakeflow template* on Sep 29, 2025
Comment on lines +37 to +38
{{- /* We avoid a relative path here to work around https://github.com/databricks/cli/issues/3674 */}}
- --editable ${workspace.file_path}
@lennartkats-db (Contributor Author)

N.B. this is a workaround for #3674; `-e ..` doesn't work in the current CLI.

@eng-dev-ecosystem-bot (Collaborator) commented Sep 30, 2025

Run: 18458136252

| Env | 🔄​ flaky | ✅​ pass | 🙈​ skip |
| --- | ---: | ---: | ---: |
| ✅​ aws linux | | 322 | 545 |
| ✅​ aws windows | | 323 | 544 |
| ✅​ aws-ucws linux | | 438 | 441 |
| ✅​ aws-ucws windows | | 439 | 440 |
| ✅​ azure linux | | 322 | 544 |
| ✅​ azure windows | | 323 | 543 |
| ✅​ azure-ucws linux | | 438 | 440 |
| ✅​ azure-ucws windows | | 439 | 439 |
| 🔄​ gcp linux | 3 | 318 | 546 |
| ✅​ gcp windows | | 322 | 545 |
| Test Name | gcp linux |
| --- | --- |
| TestAccept/bundle/deployment/bind/job/job-abort-bind | 🔄​ flaky |
| TestAccept/bundle/deployment/unbind/permissions | 🔄​ flaky |
| TestGenerateFromExistingPipelineAndDeploy | 🔄​ flaky |

@lennartkats-db (Contributor Author)

Scheduling merge post-bugbash.

@lennartkats-db added this pull request to the merge queue Oct 13, 2025
Merged via the queue into main with commit c30c456 Oct 13, 2025
22 of 23 checks passed
@lennartkats-db deleted the add-default-template branch October 13, 2025 08:09
deco-sdk-tagging bot added a commit that referenced this pull request Oct 16, 2025
## Release v0.273.0

### Notable Changes

* (via Terraform v1.92.0) DABs will no longer try to update pipeline permissions upon pipeline deletion. This fixes a `PERMISSION_ERROR` on 'bundle destroy'
  for pipelines that have the `run_as` setting enabled (described in https://community.databricks.com/t95/data-engineering/dab-dlt-destroy-fails-due-to-ownership-permissions-mismatch/td-p/132101).
  The downside is that if the `permissions:` block is removed from the resource, DABs will no longer try to restore permissions to just the owner of the pipeline (see the sketch below).
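
For reference, the `permissions:` block in question is the one attached to a pipeline resource; a minimal sketch, with hypothetical resource and user names:

```yaml
resources:
  pipelines:
    my_pipeline:
      name: my_pipeline
      permissions:
        # Removing this block no longer resets permissions to just the owner
        - level: CAN_MANAGE
          user_name: someone@example.com
```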

### CLI

* Add the `--configure-serverless` flag to `databricks auth login` to configure Databricks Connect to use serverless (usage sketch below).
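
A sketch of usage; the flag name comes from this release note, while the host URL is illustrative:

```
databricks auth login --configure-serverless --host https://my-workspace.cloud.databricks.com
```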

### Dependency updates
* Upgrade Go SDK to 0.82.0 ([#3769](#3769))
* Upgrade TF provider to 1.92.0 ([#3772](#3772))

### Bundles
* Updated the internal lakeflow-pipelines template to use a "src" layout ([#3671](#3671)).
* Fix for pip flags with equal sign being incorrectly treated as local file names ([#3766](#3766))
github-merge-queue bot pushed a commit that referenced this pull request Nov 4, 2025
…3712)

## Changes

This updates the `default-python` template according to the latest
Lakeflow conventions as established in
#3671. Notably, the new template
moves away from the use of notebooks for pipeline source code.

The new layout looks as follows when the user chooses to include both the
sample job and the sample pipeline:

```
📁 resources
├── sample_job.job.yml
└── sample_etl.pipeline.yml
📁 src
├── 📁 my_project         — shared source code for use in jobs and/or pipelines
│   ├── __init__.py
│   └── main.py
└── 📁 my_project_etl     — source code for the sample_etl pipeline
    ├── __init__.py
    ├── 📁 transformations
    │   ├── __init__.py
    │   ├── sample_zones_my_project.py
    │   └── sample_trips_my_project.py
    ├── 📁 explorations   — exploratory notebooks
    │   ├── __init__.py
    │   └── sample_exploration.ipynb
    └── README.md
📁 tests                  — unit tests
📁 fixtures               — fixtures
databricks.yml
pyproject.toml
README.md
```

Fixture files can now be used with
[`load_fixture`](https://github.com/databricks/cli/blob/af524bb993eaffe059d65f93854d544a162fc6ef/acceptance/bundle/templates/default-python/serverless/output/my_default_python/fixtures/.gitkeep).

The template prompts have been updated to match this structure.
Notably, they include a new prompt for the catalog and schema used
by the project. These settings are propagated to both the job and the
pipeline:

```
Welcome to the default Python template for Databricks Asset Bundles!

Answer the following questions to customize your project.
You can always change your configuration in the databricks.yml file later.

Note that https://e2-dogfood.staging.cloud.databricks.com is used for initialization.
(For information on how to change your profile, see https://docs.databricks.com/dev-tools/cli/profiles.html.)

Unique name for this project [my_project]: my_project
Include a Lakeflow job that runs a notebook: yes
Include an ETL pipeline: yes
Include a sample Python package that builds into a wheel file: yes
Use serverless compute: yes
Default catalog for any tables created by this project [main]: main
Use a personal schema for each user working on this project.
(This is recommended. Your personal schema will be 'main.lennart_kats'.): yes

✨ Your new project has been created in the 'my_project' directory!

To get started, refer to the project README.md file and the documentation at https://docs.databricks.com/dev-tools/bundles/index.html.
```
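
The catalog/schema answers end up as bundle-level settings. A hypothetical sketch of how the generated databricks.yml might record them; the variable names and the `${workspace.current_user.short_name}` expression are assumptions, not taken from this PR:

```yaml
variables:
  catalog:
    description: Default catalog for any tables created by this project
    default: main
  schema:
    # Assumed: the "personal schema" option derives the schema from the user name
    description: Schema used by each user working on this project
    default: ${workspace.current_user.short_name}
```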

## Testing

* Standard unit testing, acceptance testing
* AI exercised the templates with all permutations of options,
deploying/testing/running/inspecting the result
* Bug bash of the original `lakeflow-pipelines` template from
#3671

---------

Co-authored-by: Claude <[email protected]>
Co-authored-by: Julia Crawford (Databricks) <[email protected]>