2 changes: 1 addition & 1 deletion Makefile
@@ -1,7 +1,7 @@
NAME="github.com/odpf/meteor"
VERSION=$(shell git describe --always --tags 2>/dev/null)
COVERFILE="/tmp/app.coverprofile"
-PROTON_COMMIT := "2d2177aa02ee885bae094af283ff79a1d800791a"
+PROTON_COMMIT := "5267e1fdf3abc8d9a06938290e202efdd060f665"
.PHONY: all build clean test

all: build
11 changes: 7 additions & 4 deletions agent/agent.go
@@ -3,6 +3,7 @@ package agent
 import (
 	"context"
 	"fmt"
+	"runtime/debug"
 	"sync"
 	"time"

@@ -170,8 +171,10 @@ func (r *Agent) Run(ctx context.Context, recipe recipe.Recipe) (run Run) {
 	// while stream is listening via stream.Listen().
 	go func() {
 		defer func() {
-			if r := recover(); r != nil {
-				run.Error = fmt.Errorf("agent run: close stream: panic: %s", r)
+			if rcvr := recover(); rcvr != nil {
+				r.logger.Error("panic recovered")
+				r.logger.Info(string(debug.Stack()))
+				run.Error = fmt.Errorf("agent run: close stream: panic: %s", rcvr)
 			}
 			stream.Close()
 		}()
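
Review note: the rename from `r` to `rcvr` matters because the old name shadowed the `*Agent` receiver `r` inside the `if` block, so `r.logger` could not be reached there. The recover-and-capture-stack pattern from this hunk can be sketched in isolation as below. This is a minimal sketch under assumptions, not the project's code: `runGuarded` is a hypothetical helper, it returns the stack trace instead of logging it because Meteor's logger interface is not shown in this diff, and it formats the recovered value with `%v` rather than `%s`.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"sync"
)

// runGuarded is a hypothetical helper (not part of Meteor). It runs fn in its
// own goroutine and, if fn panics, returns the panic as an error together
// with the stack trace captured inside the deferred function.
func runGuarded(fn func()) (stack string, err error) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done() // deferred first, so it runs after the recover handler
		defer func() {
			// The recovered value gets its own name so that it cannot
			// shadow an outer identifier such as a method receiver.
			if rcvr := recover(); rcvr != nil {
				stack = string(debug.Stack())
				err = fmt.Errorf("run: panic: %v", rcvr)
			}
		}()
		fn()
	}()
	wg.Wait()
	return stack, err
}

func main() {
	stack, err := runGuarded(func() { panic("boom") })
	fmt.Println(err)            // run: panic: boom
	fmt.Println(len(stack) > 0) // true
}
```

Calling `debug.Stack()` inside the deferred function is what makes the trace useful: at that point the panicking frames are still on the goroutine's stack.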
@@ -280,8 +283,8 @@ func (r *Agent) setupSink(ctx context.Context, sr recipe.PluginRecipe, stream *s
 		return err
 	}, defaultBatchSize)

-	//TODO: the sink closes even though some records remain unpublished
-	//TODO: once fixed, file sink's Close needs to close *File
+	// TODO: the sink closes even though some records remain unpublished
+	// TODO: once fixed, file sink's Close needs to close *File
 	stream.onClose(func() {
 		if err = sink.Close(); err != nil {
 			r.logger.Warn("error closing sink", "sink", sr.Name, "error", err)
2 changes: 1 addition & 1 deletion agent/retrier.go
@@ -40,7 +40,7 @@ func (r *retrier) retry(ctx context.Context, operation func() error, notify func
 			return err
 		}
 		// if err is RetryError, returns err directly to retry
-		if errors.Is(err, plugins.RetryError{}) {
+		if errors.As(err, &plugins.RetryError{}) {
 			return err
 		}
 		// if err is not RetryError, wraps error to prevent retrying
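
Review note: `errors.Is` compares each error in the chain against a specific value, so `errors.Is(err, plugins.RetryError{})` only matches an error equal to the zero-value `RetryError` (unless the type defines its own `Is` method, which this diff does not show). `errors.As` matches by type instead. The difference can be sketched as below, assuming a value-receiver error type. The `RetryError` here is a hypothetical stand-in whose fields are invented for illustration; it is not the real `plugins.RetryError`.

```go
package main

import (
	"errors"
	"fmt"
)

// RetryError is a hypothetical stand-in for plugins.RetryError. The real
// type's fields and receiver kind are not visible in this diff.
type RetryError struct {
	Err error
}

func (e RetryError) Error() string { return "retry: " + e.Err.Error() }

func (e RetryError) Unwrap() error { return e.Err }

// sampleErr builds a wrapped RetryError, the shape a sink might return.
func sampleErr() error {
	return fmt.Errorf("sink: %w", RetryError{Err: errors.New("timeout")})
}

// matchesWithIs mirrors the removed check: it compares by value, so a
// populated RetryError never equals the zero value RetryError{}.
func matchesWithIs(err error) bool {
	return errors.Is(err, RetryError{})
}

// matchesWithAs mirrors the added check: it matches any RetryError in the
// chain by type, whatever its field values are.
func matchesWithAs(err error) bool {
	var target RetryError
	return errors.As(err, &target)
}

func main() {
	err := sampleErr()
	fmt.Println(matchesWithIs(err)) // false
	fmt.Println(matchesWithAs(err)) // true
}
```

One caveat worth checking against the plugins package: `errors.As` panics unless its target is a non-nil pointer to a type that implements `error` (or to an interface type). If the real `RetryError` implements `error` on a pointer receiver, the target would need to be a pointer to a `*plugins.RetryError` variable rather than `&plugins.RetryError{}`.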
17 changes: 12 additions & 5 deletions docs/docs/reference/extractors.md
@@ -64,11 +64,17 @@ Meteor currently supports metadata extraction on these data sources. To perform
|:------------------------------------|:----------|:----------|:------------|:-------|
| [`caramlstore`][caramlstore-readme] | ✗ | ✅ | ✗ | ✅ |

-### Service
+### Application

-| Type | Ownership | Upstreams | Downstreams | Custom |
-|:-------------------------------|:----------|:----------|:------------|:-------|
-| [`service_yaml`][service-yaml] | ✅ | ✅ | ✅ | ✅ | ✅ |
+| Type | Ownership | Upstreams | Downstreams | Custom |
+|:----------------------------------------------|:----------|:----------|:------------|:-------|
+| [`application_yaml`][application-yaml-readme] | ✅ | ✅ | ✅ | ✅ | ✅ |
+
+### Machine Learning Model
+
+| Type | Ownership | Upstreams | Downstreams | Custom |
+|:--------------------------|:----------|:----------|:------------|:-------|
+| [`merlin`][merlin-readme] | ✅ | ✅ | ✗ | ✅ | ✅ |

<!--- Not using relative links because that breaks the docs build -->

@@ -96,4 +102,5 @@ Meteor currently supports metadata extraction on these data sources. To perform
[gcs-readme]: https://github.com/odpf/meteor/tree/main/plugins/extractors/gcs/README.md
[optimus-readme]: https://github.com/odpf/meteor/tree/main/plugins/extractors/optimus/README.md
[caramlstore-readme]: https://github.com/odpf/meteor/tree/main/plugins/extractors/caramlstore/README.md
-[service-yaml]: https://github.com/odpf/meteor/tree/main/plugins/extractors/service_yaml/README.md
+[application-yaml-readme]: https://github.com/odpf/meteor/tree/main/plugins/extractors/application_yaml/README.md
+[merlin-readme]: https://github.com/odpf/meteor/tree/main/plugins/extractors/merlin/README.md
113 changes: 85 additions & 28 deletions docs/docs/reference/metadata_models.md
@@ -1,45 +1,79 @@
# Meteor Metadata Model

-We have a set of defined metadata models which define the structure of metadata that meteor will yield.
-To visit the metadata models being used by different extractors please visit [here](../reference/extractors.md).
-We are currently using the following metadata models:
+We have a set of defined metadata models which define the structure of metadata
+that meteor will yield. To visit the metadata models being used by different
+extractors please visit [here](extractors.md). We are currently using the
+following metadata models:

-- [Bucket](https://github.com/odpf/proton/blob/main/odpf/assets/bucket.proto):
-Used for metadata being extracted from buckets. Buckets are the basic containers in google cloud services, or Amazon S3, etc that are used fot data storage, and quite popular because of their features of access management, aggregation of usage and services and ease of configurations.
-Currently, Meteor provides a metadata extractor for the buckets mentioned [here](../reference/extractors.md)
+- [Bucket][proton-bucket]: Used for metadata being extracted from buckets.
+Buckets are the basic containers in google cloud services, or Amazon S3, etc
+that are used for data storage, and quite popular because of their features of
+access management, aggregation of usage and services and ease of
+configurations. Currently, Meteor provides a metadata extractor for the
+buckets mentioned [here](extractors.md#bucket)

-- [Dashboard](https://github.com/odpf/proton/blob/main/odpf/assets/dashboard.proto):
-Dashboards are an essential part of data analysis and are used to track, analyze and visualize.
-These Dashboard metadata model includes some basic fields like `urn` and `source`, etc and a list of `Chart`.
-There are multiple dashboards that are essential for Data Analysis such as metabase, grafana, tableau, etc.
-Please refer to the list of Dashboards meteor currently supports [here](../reference/extractors.md).
+- [Dashboard][proton-dashboard]: Dashboards are an essential part of data
+analysis and are used to track, analyze and visualize. These Dashboard
+metadata model includes some basic fields like `urn` and `source`, etc and a
+list of `Chart`. There are multiple dashboards that are essential for Data
+Analysis such as metabase, grafana, tableau, etc. Please refer to the list of
+'Dashboard' extractors meteor currently
+supports [here](extractors.md#dashboard).

-- [Chart](https://github.com/odpf/proton/blob/main/odpf/assets/chart.proto):
-Charts are included in all the Dashboard and are the result of certain queries in a Dashboard.
-Information about them includes the information of the query and few similar details.
+- [Chart][proton-dashboard]: Charts are included in all the Dashboard and are
+the result of certain queries in a Dashboard. Information about them
+includes the information of the query and few similar details.

-- [User](https://github.com/odpf/proton/blob/main/odpf/assets/user.proto):
-This metadata model is used for defining the output of extraction on Users accounts.
-Some of these sources can be GitHub, Workday, Google Suite, LDAP.
-Please refer to the list of user meteor currently supports [here](../reference/extractors.md).
+- [User][proton-user]: This metadata model is used for defining the output of
+extraction on User accounts. Some of these sources can be GitHub, Workday,
+Google Suite, LDAP. Please refer to the list of 'User' extractors meteor
+currently supports [here](extractors.md#user).

-- [Table](https://github.com/odpf/proton/blob/main/odpf/assets/table.proto):
-This metadata model is being used by extractors based around `databases` or for the ones that store data in tabular format.
-It contains various fields that include `schema` of the table and other access related information.
+- [Table][proton-table]: This metadata model is being used by extractors based
+around databases, typically for the ones that store data in tabular format. It
+contains various fields that include `schema` of the table and other access
+related information. Please refer to the list of 'Table' extractors meteor
+currently supports [here](extractors.md#table).

-- [Job](https://github.com/odpf/proton/blob/main/odpf/assets/job.proto):
-Most of the data is being streamed as queues by kafka or other stack in DE pipeline.
-And hence Job is a metadata model built for this purpose.
+- [Job][proton-job]: A job can represent a scheduled or recurring task that
+performs some transformation in the data engineering pipeline. Job is a
+metadata model built for this purpose. Please refer to the list of 'Job'
+extractors meteor currently supports [here](extractors.md#table).

-`Proto` has been used to define these metadata models.
-To check their implementation please refer [here](https://github.com/odpf/proton/tree/main/odpf/assets).
+- [Topic][proton-topic]: A topic represents a virtual group for logical group of
+messages in message bus like kafka, pubsub, pulsar etc. Please refer to the
+list of 'Topic' extractors meteor currently
+supports [here](extractors.md#topic).
+
+- [Machine Learning Feature Table][proton-featuretable]: A Feature Table is a
+table or view that represents a logical group of time-series feature data as
+it is found in a data source. Please refer to the list of 'Feature Table'
+extractors meteor currently
+supports [here](extractors.md#machine-learning-feature-table).
+
+- [Application][proton-application]: An application represents a service that
+typically communicates over well-defined APIs. Please refer to the list of
+'Application' extractors meteor currently
+supports [here](extractors.md#application).
+
+- [Machine Learning Model][proton-model]: A Model represents a Data Science
+Model commonly used for Machine Learning(ML). Models are algorithms trained on
+data to find patterns or make predictions. Models typically consume ML
+features to generate a meaningful output. Please refer to the list of 'Model'
+extractors meteor currently
+supports [here](extractors.md#machine-learning-model).
+
+`Proto` has been used to define these metadata models. To check their
+implementation please refer [here][proton-assets].

## Usage

+[//]: # (@formatter:off)
+
```golang
import(
-"github.com/odpf/meteor/models/odpf/assets/v1beta1"
-"github.com/odpf/meteor/models/odpf/assets/facets/v1beta1"
+assetsv1beta1 "github.com/odpf/meteor/models/odpf/assets/v1beta1"
+"github.com/odpf/meteor/models/odpf/assets/facets/v1beta1"
)

func main(){
@@ -64,3 +98,26 @@ func main(){
}
}
```
+
+[//]: # (@formatter:on)
+
+
+[proton-bucket]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2/bucket.proto
+
+[proton-dashboard]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2/dashboard.proto
+
+[proton-user]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2/user.proto
+
+[proton-table]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2/table.proto
+
+[proton-job]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2/job.proto
+
+[proton-topic]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2/topic.proto
+
+[proton-featuretable]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2/feature_table.proto
+
+[proton-application]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2/application.proto
+
+[proton-model]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2/model.proto
+
+[proton-assets]: https://github.com/odpf/proton/tree/main/odpf/assets/v1beta2
34 changes: 20 additions & 14 deletions models/odpf/assets/README.md
@@ -1,18 +1,24 @@
# Metadata Models

-Metadata models are structs in which metadata of a certain kind will be extracted in order to mainatain the integrity across similar data sources.
-For e.g, MySQL and Postgres are supposed to provide similar struct for metadata since both are SQL based databases.
-Currently meteor provides the extracted metadata as one of the following metadata models:
+Metadata models are structs in which metadata of a certain kind will be
+extracted in order to maintain the integrity across similar data sources. For
+e.g, MySQL and Postgres are supposed to provide similar struct for metadata
+since both are SQL based databases. Currently meteor provides the extracted
+metadata as one of the following metadata models:

-* [Bucket](bucket.pb.gp)
-* [Chart](chart.pb.go)
-* [Dashboard](dashboard.pb.go)
-* [Group](group.pb.go)
-* [Job](job.pb.go)
-* [Table](table.pb.go)
-* [Topic](topic.pb.go)
-* [User](user.pb.go)
+* [`Bucket`](bucket.pb.go)
+* [`Chart`](chart.pb.go)
+* [`Dashboard`](dashboard.pb.go)
+* [`Group`](group.pb.go)
+* [`Job`](job.pb.go)
+* [`Table`](table.pb.go)
+* [`Topic`](topic.pb.go)
+* [`User`](user.pb.go)
+* [`FeatureTable`](feature_table.pb.go)
+* [`Application`](application.pb.go)
+* [`Model`](model.pb.go)

-While adding an extractor one needs to provide metadata supported by these models.
-If you want some other data model added to the list feel free to raise a issue.
-Please refer [docs](../../../docs/data%20models/README.md) for easier reference of how data models are being used.
+While adding an extractor one needs to provide metadata supported by these
+models. If you want some other data model added to the list feel free to raise an
+issue. Please refer [docs](../../../docs/docs/reference/metadata_models.md) for
+easier reference of how data models are being used.
47 changes: 34 additions & 13 deletions models/odpf/assets/v1beta2/feature_table.pb.go

Some generated files are not rendered by default.