@@ -11,7 +12,9 @@ There are a few ways to contribute:
We are pretty chill about the format of the contribution, although opening an issue first to discuss the change is always a good idea. Feel free to deliver a PR early; we can always iterate on it.

## Writing Code

There are a few guidelines that you should follow when writing code:

- All new features should be covered by tests.
- All new features should be documented.
- The pull request should contain a detailed description of the changes made, as well as the reasoning behind them.
@@ -31,11 +34,11 @@ We use Makefile to build the code, which contains a set of commands to lint, tes
- If you get errors related to a missing DuckDB library, try running `make duck-db-static-lib` and retrying.
Bruin is a data pipeline tool that brings together data ingestion, data transformation with SQL, Python & R, and data quality into a single framework. It works with all the major data platforms and runs on your local machine, an EC2 instance, or GitHub Actions.
docs/assets/dashboard.md
16 additions & 3 deletions
@@ -21,6 +21,7 @@ Bruin supports the following dashboard tools as assets:
- Tableau: `tableau`

## Definition Schema

Dashboard assets are defined using the extension `{asset_name}.asset.yml`. Here is an example of the schema:

```yaml
# ...
```

@@ -41,96 +42,108 @@ tags:
## Supported Dashboard Tools

### Amazon QuickSight

```yaml
name: myschema.asset_name
type: quicksight
```

### Apache Superset

```yaml
name: myschema.asset_name
type: superset
```

### Domo

```yaml
name: myschema.asset_name
type: domo
```

### Good Data

```yaml
name: myschema.asset_name
type: gooddata
```

### Grafana

```yaml
name: myschema.asset_name
type: grafana
```

### Looker

```yaml
name: myschema.asset_name
type: looker
```

### Looker Studio

```yaml
name: myschema.asset_name
type: looker_studio
```

### Metabase

```yaml
name: myschema.asset_name
type: metabase
```

### Mode BI

```yaml
name: myschema.asset_name
type: modebi
```

### Power BI

```yaml
name: myschema.asset_name
type: powerbi
```

### Qlik Sense

```yaml
name: myschema.asset_name
type: qliksense
```

### Qlik View

```yaml
name: myschema.asset_name
type: qlikview
```

### Redash

```yaml
name: myschema.asset_name
type: redash
```

### Sisense

```yaml
name: myschema.asset_name
type: sisense
```

### Tableau

Tableau assets allow you to both define and refresh Tableau dashboards, workbooks, and worksheets. See [Tableau assets](./tableau-refresh) for more information.
docs/assets/definition-schema.md
34 additions & 6 deletions
@@ -1,9 +1,11 @@
# Asset Definition

Assets are defined in a YAML format in the same file as the asset code.
This enables the metadata to be right next to the code, reducing the friction when things change and encapsulating the relevant details in a single file.
The definition includes all the details around an asset from its name to the quality checks that will be executed.

Here's an example asset definition:

```bruin-sql
/* @bruin
...
```

@@ -51,57 +53,75 @@ Assets that are defined as YAML files have to have file names as `<name>.asset.y
:::

## `name`

The name of the asset, used for many things including dependencies, materialization and more. Corresponds to the `schema.table` convention.
Must consist of letters and the dot (`.`) character.

- **Type:** `String`

## `uri`

We use `uri` (Uniform Resource Identifier) as another way to identify assets. URIs must be unique across all your pipelines and can be used to define [cross pipeline dependencies](../cloud/cross-pipeline).

- **Type:** `String`

## `type`

The type of the asset determines how execution will happen. Must be one of the types listed in <a href="https://github.com/bruin-data/bruin/blob/main/pkg/pipeline/pipeline.go#L31">pkg/pipeline/pipeline.go</a>.

- **Type:** `String`

## `owner`

The owner of the asset. It has no functional implications in Bruin CLI as of today; it documents ownership information. On [Bruin Cloud](https://getbruin.com), it is used to analyze ownership and appears in governance reports and ownership lineage.

- **Type:** `String`

## `tags`

Tags applied to the asset. These tags can then be used to select assets when running, e.g.:

```bash
bruin run --tag client1
```

- **Type:** `String[]`

## `domains`

Business domains that the asset belongs to. This is used for organizing and categorizing assets by business function or domain.

- **Type:** `String[]`

## `meta`

Additional metadata for the asset stored as key-value pairs. This can be used to store custom information about the asset that doesn't fit into other predefined fields.

- **Type:** `Object`
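
As a rough illustration, the descriptive fields above (`owner`, `tags`, `domains`, `meta`) could appear together in an asset definition like this. Every value is a made-up placeholder; only the keys and their types come from the sections above.

```yaml
# Hypothetical values throughout; only the keys and their types are documented above.
name: myschema.asset_name
owner: data-team@example.com   # String
tags:                          # String[]
  - client1
domains:                       # String[]
  - finance
meta:                          # Object of key-value pairs
  cost_center: "1234"
```

With a definition along these lines, `bruin run --tag client1` would be expected to pick up this asset.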

## `depends`

The list of assets this asset depends on. This list determines the execution order.
In other words, the asset will be executed only when all of the assets in the `depends` list have succeeded.
The items of this list can be just a `String` with the name of an asset in the same pipeline, or an `Object` which can contain the following attributes:

- `asset`: The name of the asset. Must be in the same pipeline.
- `uri`: The URI of the upstream asset. This is used in [cloud](../cloud/overview.md) when you want to have an upstream on a different pipeline. See [uri](#uri) above.
- `mode`: Can be `full` (a normal dependency) or `symbolic`. The latter only shows lineage; the downstream does not actually depend on the upstream or wait for it to run.

```yaml
- asset: asset_name
  mode: symbolic
```
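
A slightly fuller sketch, mixing the forms described above in one `depends` list. The asset names and the URI value are hypothetical placeholders, and the exact URI format is not specified in this section.

```yaml
depends:
  # Plain string: the name of an asset in the same pipeline.
  - myschema.upstream_one
  # Object form with an explicit mode; `full` is a normal dependency.
  - asset: myschema.upstream_two
    mode: full
  # Symbolic: shown in lineage, but this asset does not wait for it.
  - asset: myschema.reference_table
    mode: symbolic
  # Upstream in a different pipeline, identified by its URI (placeholder value).
  - uri: "<uri-of-upstream-asset>"
```

Under these assumptions, the asset would run only after `myschema.upstream_one` and `myschema.upstream_two` have succeeded, while `myschema.reference_table` would only show up in lineage.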

## `start_date`

The start date for the asset, used when running with full refresh (`--full-refresh`). When specified, the asset will process data starting from this date during full refresh runs (overrides the pipeline's `start_date`).

- **Type:** `String` (YYYY-MM-DD format)
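
For instance, an asset that should reprocess data from the start of 2024 on a full refresh might declare the following. The date is an arbitrary example; quoting it keeps YAML from parsing the value as anything other than a string.

```yaml
# Arbitrary example date in YYYY-MM-DD format.
start_date: "2024-01-01"
```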
103
122
104
123
## `interval_modifiers`
124
+
105
125
Controls how the processing window is adjusted by shifting the start and end times. Requires the `--apply-interval-modifiers` flag when running the pipeline.
See [interval modifiers](./interval-modifiers) for more details.
142
+
122
143
- **Type:** `Object`
123
144
124
145
## `rerun_cooldown`
146
+
125
147
Set a delay (in seconds) between retry attempts for failed assets. This helps prevent overwhelming downstream systems during failures and allows for temporary issues to resolve. If not specified, the asset inherits the pipeline's `rerun_cooldown` setting.
126
148
127
149
```yaml
128
150
rerun_cooldown: 300 # Wait 5 minutes between retries
129
151
```
130
152
131
153
**Special values:**
154
+
132
155
- `0`: No delay between retries (inherit from pipeline if not specified)
133
156
- `> 0`: Wait the specified number of seconds before retrying
134
157
- `-1`: Disable retry delays completely
135
158
136
159
When deploying to Airflow, this is automatically translated to `retries_delay` for compatibility.
160
+
137
161
- **Type:** `Integer`
138
162
139
163
## `materialization`
164
+
140
165
This option determines how the asset will be materialized. Refer to the docs on [materialization](./materialization) for more details.

## `hooks`

Hooks let you run SQL snippets before and/or after the main asset query. This is useful for setup or cleanup, such as loading extensions, attaching databases, or writing run logs.

```yaml
hooks:
  # ...
  post:
    - query: "SET s3_region=''"
```

Hooks are currently supported for SQL assets. Each hook entry supports a single `query` field, and entries are executed in order. A trailing `;` on a query is optional.

Hooks can also be set as pipeline defaults (see [pipeline defaults](/getting-started/pipeline#default-pipeline-level-defaults)). Assets inherit default `pre` and `post` hooks independently: defining only `pre` hooks on an asset will still inherit default `post` hooks.
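
To make the inheritance rule concrete, here is a sketch of an asset that defines only `pre` hooks. It assumes `pre` takes the same list-of-`query` shape shown for `post`; the queries are illustrative DuckDB statements, not taken from the Bruin docs.

```yaml
hooks:
  pre:
    # Entries run in order before the main asset query.
    - query: "INSTALL httpfs"
    - query: "LOAD httpfs;"   # a trailing `;` is allowed
# No `post` key is set here, so any pipeline-default `post` hooks would still apply.
```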
@@ -161,6 +188,7 @@ Hooks can also be set as pipeline defaults (see [pipeline defaults](/getting-sta
This is a list that contains all the columns defined with the asset, along with their quality checks and other metadata. Refer to the [columns](./columns.md) documentation for more details.

## `custom_checks`

This is a list of custom data quality checks that are applied to an asset. These checks are written in SQL, which lets you encode any business logic that needs more expressive power into a quality check.