Merged
Changes from 3 commits
42 changes: 15 additions & 27 deletions docs/docs/concepts/overview.md → docs/docs/concepts/asset.mdx
@@ -1,22 +1,13 @@
# Overview
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

Compass has three major concepts when it comes to data ingestion: Asset, Type, and Service.
# Asset

An Asset is essentially an arbitrary JSON object that represents the metadata of a specific service with a specific type.
In Compass, every piece of metadata that you input is called an Asset. Your tables, dashboards, topics, and jobs are all examples of assets.

Type defines the type of an asset and is pre-defined. There are currently 4 supported types in Compass: `table`, `job`, `dashboard`, and `topic`.

Service defines the name of the application that the asset comes from, for example `bigquery`, `postgres`, etc. If you wanted to push data for `bigquery` dataset\(s\) to Compass, you would first need to define the ‘`bigquery`’ service in Compass.
<Tabs>
<TabItem value="table" label="Table View">

Some features that Compass has:
* Asset Tagging
* User
* Discussion
* Starring

## Asset

An Asset is a JSON document that describes a metadata. Asset has a schema:
| Field | Required | Type | Description |
|---|---|---|---|
| id | false | string | compass' auto-generated uuid |
@@ -29,7 +20,10 @@ An Asset is a JSON document that describes a metadata. Asset has a schema:
| labels | false |json | labels of metadata, written in key-value string |
| owners | false | []json | array of json, where each json contains `email` field |

```text
</TabItem>
<TabItem value="json" label="JSON">

```json
{

"urn": "topic/order-log",
@@ -57,9 +51,13 @@ An Asset is a JSON document that describes a metadata. Asset has a schema:
}
```

</TabItem>
</Tabs>


Every asset that is pushed SHOULD have the required fields: `urn`, `type`, `service`, `name`. The value of these fields MUST be string, if present.

The asset ingestion API \(/v1beta1/assets\) uses the HTTP PATCH method, and its behaviour is similar to how PATCH works. It is possible to patch a single field in an asset by sending only the updated field to the ingestion API. This also works for data in the dynamic `data` field. The combination of `urn`, `type`, and `service` is the identifier used to patch an asset.
The asset ingestion API (`/v1beta1/assets`) uses the HTTP PATCH method, and its behaviour is similar to how PATCH works. It is possible to patch a single field in an asset by sending only the updated field to the ingestion API. This also works for data in the dynamic `data` field. The combination of `urn`, `type`, and `service` is the identifier used to patch an asset.
If the `urn` does not exist, the asset ingestion PATCH API \(/v1beta1/assets\) creates a new asset.
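
The patch behaviour described above can be sketched in Python as follows. This is only an illustration of how a partial update body might be assembled: the helper name `build_patch_payload`, the `kafka` service value, and the `asset` wrapper around the body are assumptions, not something this page confirms. Since `name` is listed among the required fields, a real request may need to carry it as well.

```python
import json


def build_patch_payload(urn, asset_type, service, **changed_fields):
    """Build a PATCH body with the identifying fields plus only the changed ones.

    Hypothetical helper: the `asset` wrapper is an assumption about the request shape.
    """
    asset = {"urn": urn, "type": asset_type, "service": service}
    asset.update(changed_fields)
    return {"asset": asset}


# Update only the `environment` label; every other field is left out of the body.
payload = build_patch_payload(
    "topic/order-log",
    "topic",
    "kafka",  # hypothetical service name, used for illustration only
    labels={"environment": "production"},
)
print(json.dumps(payload, indent=2))
```

Keeping the identifying fields separate from the changed fields makes it explicit which part of the body selects the asset and which part modifies it.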

## Lineage
@@ -146,13 +144,3 @@ If there is an update to the `environment` in the asset labels, here is the asse
## Tagging an Asset
Compass allows users to tag a specific asset. To tag a new asset, one first needs to create a tag template. A tag template defines the set of tag fields that can be applied to each field in an asset.
Once a template is created, each field in an asset can be tagged by calling the `/v1beta1/tags` API. More detail about [Tagging](../guides/tagging.md).

## User
The current version of Compass does not have user management. Compass expects an external instance to manage users. Compass consumes user information from the configurable identity uuid header in every API call. The default name of the header is `Compass-User-UUID`.
Compass does not make any assumption about the identity format being used. The `uuid` could be in any form (e.g. email, UUIDv4, etc.) as long as it is universally unique.
Currently, Compass adds a new user if the user information consumed from the header does not exist in Compass' database. More detail about [User](./user.md).
## Discussion
Compass supports a discussion feature. Users can drop comments in each discussion. Currently, there are three types of discussions: `issues`, `open ended`, and `question and answer`. Depending on the type, a discussion could have multiple possible states. In the current version, all types have only two states: `open` and `closed`. A newly created discussion is always assigned the `open` state. More detail about [Discussion](../guides/discussion.md).

## Starring
Compass allows a user to star an asset. This bookmarking functionality is introduced to help users get to information faster. There is also an API to see which users have starred an asset (stargazers). More detail about [Starring](../guides/starring.md).
24 changes: 24 additions & 0 deletions docs/docs/concepts/overview.mdx
@@ -0,0 +1,24 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Overview

Compass has three major concepts when it comes to data ingestion: Asset, Type, and Service.

An Asset is essentially an arbitrary JSON object that represents the metadata of a specific service with a specific type.

Type defines the type of an asset and is pre-defined. There are currently 4 supported types in Compass: `table`, `job`, `dashboard`, and `topic`.

Service defines the name of the application that the asset comes from, for example `bigquery`, `postgres`, etc. If you wanted to push data for `bigquery` dataset\(s\) to Compass, you would first need to define the ‘`bigquery`’ service in Compass.

Some features that Compass has:
* [Asset Tagging](./asset#tagging-an-asset)
* [User](./user.md)
* [Discussion](../guides/discussion.md)
* [Starring](../guides/starring.md)

## Discussion
Compass supports a discussion feature. Users can drop comments in each discussion. Currently, there are three types of discussions: `issues`, `open ended`, and `question and answer`. Depending on the type, a discussion could have multiple possible states. In the current version, all types have only two states: `open` and `closed`. A newly created discussion is always assigned the `open` state. More detail about [Discussion](../guides/discussion.md).

## Starring
Compass allows a user to star an asset. This bookmarking functionality is introduced to help users get to information faster. There is also an API to see which users have starred an asset (stargazers). More detail about [Starring](../guides/starring.md).
3 changes: 3 additions & 0 deletions docs/docs/concepts/type.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# Type

TBD
File renamed without changes.
96 changes: 96 additions & 0 deletions docs/docs/tour/1-my-first-asset.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,96 @@
# 1. My First Asset

Before starting the tour, make sure you have a running Compass instance. You can refer to this [installation guide](../installation).

## 1.1 Introduction

In Compass, every piece of metadata that you input is called an [Asset](../concepts/asset). Your tables, dashboards, topics, and jobs are all examples of assets.

In this section, we will help you build your first Asset, which should give you a clear idea of what an Asset is in Compass.

## 1.2 Hello, ~~World~~ Asset!

Let's imagine we have a `postgres` instance that we keep referring to as our `main-postgres`. Inside it there is a database called `my-database` that has plenty of tables. One of the tables is named `orders`, and below is how you represent that `table` as a Compass Asset.

```json
{
"urn": "main-postgres:my-database.orders",
"type": "table",
"service": "postgres",
"name": "orders",
"data": {
"database": "my-database",
"namespace": "main-postgres"
}
}
```

- **urn** is a unique name you assign to an asset. You need to make sure there are no duplicate URNs across all of your assets, because Compass treats `urn` as the identifier of your asset. For this example, we use the format `{NAMESPACE}:{DB_NAME}.{TABLE_NAME}` to make sure our URN is unique. (More info about URN generation can be found [here](../guides/urn-generation).)

- **type** is your Asset's type. The value for type has to be recognizable by Compass. More info about Asset's Type can be found [here](../concepts/type).

- **service** can be seen as the source of your asset. `service` can be anything; in this case, since our `orders` table resides in `postgres`, we can just put `postgres` as the service.

- **name** is the name of your asset, and it does not have to be unique. We don't need to worry about mix-ups if other tables share the same name: `urn` is the main identifier for your asset, which is why it needs to be unique across all of your assets.

- **data** can hold your asset's extra details, if there are any. In the example, we use it to store the **database name** and the **alias/namespace** that we use when referring to the postgres instance.
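
The field descriptions above can be sketched in Python as below. This is a minimal illustration, not part of Compass: the helper names `make_table_urn` and `missing_or_invalid_fields` are hypothetical, and the check only mirrors the rule that `urn`, `type`, `service`, and `name` should be present as strings.

```python
REQUIRED_FIELDS = ("urn", "type", "service", "name")


def make_table_urn(namespace, db_name, table_name):
    """Compose a URN using the {NAMESPACE}:{DB_NAME}.{TABLE_NAME} format from the tour."""
    return f"{namespace}:{db_name}.{table_name}"


def missing_or_invalid_fields(asset):
    """Return the required fields that are absent, empty, or not strings."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not isinstance(asset.get(field), str) or not asset[field]
    ]


asset = {
    "urn": make_table_urn("main-postgres", "my-database", "orders"),
    "type": "table",
    "service": "postgres",
    "name": "orders",
    "data": {"database": "my-database", "namespace": "main-postgres"},
}

print(asset["urn"])
print(missing_or_invalid_fields(asset))
```

Generating the URN from its parts, rather than typing it by hand, is one way to keep it consistent with the values stored in `data`.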

## 1.3 Sending your first asset to Compass

Here is the asset that we built in the previous section.

```json
{
"urn": "main-postgres:my-database.orders",
"type": "table",
"service": "postgres",
"name": "orders",
"data": {
"database": "my-database",
"namespace": "main-postgres"
}
}
```
Let's send this to Compass so that it becomes discoverable.

As of now, Compass supports ingesting assets via `gRPC` and `http`. In this example, we will use `http` to send your first asset to Compass.
Compass exposes an API `[PATCH] /v1beta1/assets` to upload your asset.

```bash
curl --location --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Content-Type: application/json' \
--header 'Compass-User-UUID: [email protected]' \
--data-raw '{
"asset": {
"urn": "main-postgres:my-database.orders",
"type": "table",
"service": "postgres",
"name": "orders",
"data": {
"database": "my-database",
"namespace": "main-postgres"
}
}
}'
```

There are a few things to notice here:
1. The HTTP method used is `PATCH`. This is because Compass does not have dedicated `Create` or `Update` APIs; it uses a single API to `Patch / Create` an asset instead. So when updating or patching your asset, you can use the same API.

2. Compass requires `Compass-User-UUID` header to be in the request. More information about the identity header can be found [here](../concepts/user). To simplify this tour, let's just use `[email protected]`.

3. When sending our asset to Compass, we need to put our asset object inside an `asset` field as shown in the sample curl above.
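
The three points above can be sketched with Python's standard library as follows. The request is only constructed, not sent, so the snippet runs without a Compass instance. The helper name `build_upsert_request` and the `user@example.com` identity value are placeholders chosen for illustration; the URL, path, method, and header name are taken from the curl sample.

```python
import json
import urllib.request

COMPASS_URL = "http://localhost:8080"  # host and port from the curl sample
USER_UUID = "user@example.com"  # placeholder identity, not a value from the tour


def build_upsert_request(asset, base_url=COMPASS_URL, user_uuid=USER_UUID):
    """Build (but do not send) a PATCH request that creates or patches an asset."""
    body = json.dumps({"asset": asset}).encode("utf-8")  # asset goes inside `asset`
    return urllib.request.Request(
        url=f"{base_url}/v1beta1/assets",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Compass-User-UUID": user_uuid,
        },
    )


asset = {
    "urn": "main-postgres:my-database.orders",
    "type": "table",
    "service": "postgres",
    "name": "orders",
    "data": {"database": "my-database", "namespace": "main-postgres"},
}

request = build_upsert_request(asset)
print(request.get_method(), request.full_url)

# Sending requires a running Compass instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```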

On a successful insertion, you will receive the response below:

```json
{ "id": "cebeb793-8933-434c-b38f-beb6dbad91a5" }
```

**id** is an identifier of your asset. Unlike `urn` which is provided by you, `id` is auto generated by Compass.
Review comment (Contributor):

Suggested change
**id** is an identifier of your asset. Unlike `urn` which is provided by you, `id` is auto generated by Compass.
**id** is an identifier of your asset. Unlike `urn` which is provided by you, `id` is auto generated by Compass if there was no asset found with the given URN.


## Summary

Now that you have successfully ingested your asset to Compass, we can now search and find it via Compass.

In the next section, we will see how Compass can help you in searching and discovering your assets.
182 changes: 182 additions & 0 deletions docs/docs/tour/2-querying-assets.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,182 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# 2. Querying your Assets

In this section, we will learn how we can find and search our assets using the following approaches:
- [Using URN](#21-using-urn)
- [Using Search API](#23-using-search-api)

## 2.1 Using URN

Using the URN returned when [you uploaded your asset](./1-my-first-asset.md#13-sending-your-first-asset-to-compass), you can easily find your asset as shown below:

```bash
curl --location --request GET 'http://localhost:8080/v1beta1/assets/cebeb793-8933-434c-b38f-beb6dbad91a5' \
Review comment (Contributor):

Suggested change
curl --location --request GET 'http://localhost:8080/v1beta1/assets/cebeb793-8933-434c-b38f-beb6dbad91a5' \
curl 'http://localhost:8080/v1beta1/assets/cebeb793-8933-434c-b38f-beb6dbad91a5' \

--header 'Content-Type: application/json' \
--header 'Compass-User-UUID: [email protected]'
```

It will return
```json
{
"data": {
"id": "cebeb793-8933-434c-b38f-beb6dbad91a5",
"urn": "main-postgres:my-database.orders",
"type": "table",
"service": "postgres",
"name": "orders",
"description": "",
"data": {
"database": "my-database",
"namespace": "main-postgres"
},
"labels": null,
"owners": [],
"version": "0.2",
"updated_by": {
"uuid": "[email protected]"
},
"changelog": [],
"created_at": "2021-03-22T22:45:11.160593Z",
"updated_at": "2021-03-22T22:45:11.160593Z"
}
}
```

## 2.2 Adding more assets

Before we try other APIs let's first add **5 additional assets** to Compass.

<Tabs>
<TabItem value="product" label="Product Table">

```bash
curl --location --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Content-Type: application/json' \
--header 'Compass-User-UUID: [email protected]' \
--data-raw '{
"asset": {
"urn": "main-postgres:my-database.products",
"type": "table",
"service": "postgres",
"name": "products",
"data": {
"database": "my-database",
"namespace": "main-postgres"
}
}
}
'
```

</TabItem>
<TabItem value="different-database" label="Different Database">

```bash
curl --location --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Content-Type: application/json' \
--header 'Compass-User-UUID: [email protected]' \
--data-raw '{
"asset": {
"urn": "main-postgres:temp-database.invoices",
"type": "table",
"service": "postgres",
"name": "invoices",
"data": {
"database": "temp-database",
"namespace": "main-postgres"
}
}
}
'
```

</TabItem>
<TabItem value="mysql" label="MySQL">

```bash
curl --location --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Content-Type: application/json' \
--header 'Compass-User-UUID: [email protected]' \
--data-raw '{
"asset": {
"urn": "userdb:identity.users",
"type": "table",
"service": "mysql",
"name": "users",
"data": {
"database": "identity",
"namespace": "userdb"
}
}
}
'
```

</TabItem>
<TabItem value="dashboard-type" label="Dashboard Type">

```bash
curl --location --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Content-Type: application/json' \
--header 'Compass-User-UUID: [email protected]' \
--data-raw '{
"asset": {
"urn": "mymetabase:collections/123",
"type": "dashboard",
"service": "metabase",
"name": "My Profit Dashboard",
"data": {
"collection_id": 123,
"charts": [
"Income Chart",
"Outcome Chart"
]
}
}
}
'
```

</TabItem>
</Tabs>
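
As an alternative to repeating the curl command per asset, the request bodies for the assets shown in the tabs above could be generated in a loop. The following Python sketch only builds the JSON bodies and does not contact Compass; the helper name `to_request_body` is hypothetical.

```python
import json

# The assets from the tabs above, expressed as plain dictionaries.
ASSETS = [
    {
        "urn": "main-postgres:my-database.products",
        "type": "table",
        "service": "postgres",
        "name": "products",
        "data": {"database": "my-database", "namespace": "main-postgres"},
    },
    {
        "urn": "main-postgres:temp-database.invoices",
        "type": "table",
        "service": "postgres",
        "name": "invoices",
        "data": {"database": "temp-database", "namespace": "main-postgres"},
    },
    {
        "urn": "userdb:identity.users",
        "type": "table",
        "service": "mysql",
        "name": "users",
        "data": {"database": "identity", "namespace": "userdb"},
    },
    {
        "urn": "mymetabase:collections/123",
        "type": "dashboard",
        "service": "metabase",
        "name": "My Profit Dashboard",
        "data": {"collection_id": 123, "charts": ["Income Chart", "Outcome Chart"]},
    },
]


def to_request_body(asset):
    """Wrap an asset in the `asset` field and serialise it as a JSON request body."""
    return json.dumps({"asset": asset})


bodies = [to_request_body(asset) for asset in ASSETS]
for asset, body in zip(ASSETS, bodies):
    print(asset["urn"], len(body), "bytes")
```

Each body can then be sent with `PATCH /v1beta1/assets` in the same way as the single-asset request.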

## 2.3 Using Search API

Search API is the preferred way when browsing through your assets in Compass. Let's see how powerful Compass is for discovering your assets.

Now that we have added more assets to Compass [here](#22-adding-more-assets), let's try to search for our newly added `products` table. To use Search API, we just need to provide a query/text/term.

Let's search for our `products` table using a typo query `"podcts"`.

```bash
curl --location --request GET 'http://localhost:8080/v1beta1/search?text=podcts' \
--header 'Compass-User-UUID: [email protected]'
```

Search results:
```json
{
"data": [
{
"id": "7c0759f4-feec-4b5e-bf26-bf0d0b1236b1",
"urn": "main-postgres:my-database.products",
"type": "table",
"service": "postgres",
"name": "products",
"description": ""
}
]
}
```

Compass Search API supports fuzzy search, so even when you give `"podcts"`, it will still be able to fetch your `products` table.
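
The search call above can also be sketched in Python. The snippet builds the search URL and reads URNs out of a response shaped like the sample shown earlier. It does not perform the fuzzy matching itself, which happens inside Compass; the helper names `build_search_url` and `extract_urns` are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_search_url(text, base_url="http://localhost:8080"):
    """Return the Search API URL for a query, URL-encoding the `text` parameter."""
    return f"{base_url}/v1beta1/search?{urlencode({'text': text})}"


def extract_urns(response_body):
    """Collect the URN of every asset listed under `data` in a search response."""
    return [item["urn"] for item in json.loads(response_body)["data"]]


# Response body copied from the search result sample above.
sample_response = """
{
  "data": [
    {
      "id": "7c0759f4-feec-4b5e-bf26-bf0d0b1236b1",
      "urn": "main-postgres:my-database.products",
      "type": "table",
      "service": "postgres",
      "name": "products",
      "description": ""
    }
  ]
}
"""

print(build_search_url("podcts"))
print(extract_urns(sample_response))
```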

## Summary
Review comment (Contributor):

Rename to 'Conclusion'? Summary is typically provided at the start.


Search API is a really powerful discovery tool that you can leverage when storing your assets. It has lots of feature like `fuzzy search` which we just see, you can also easily filter through asset's type, service and much more.
Review comment (Contributor):

which we just {see => saw}


Up to this point, you have learnt how to create assets, inserting assets and querying them. Using those features only you can start leveraging Compass to be your Metadata Discovery Service.
Review comment (Contributor):

Using {those => these} features only...


Next we will see how you can use Compass to build a Lineage between your assets.