4 changes: 4 additions & 0 deletions greenery/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@

target/
dbt_packages/
logs/
15 changes: 15 additions & 0 deletions greenery/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
Welcome to your new dbt project!

### Using the starter project

Try running the following commands:
- dbt run
- dbt test


### Resources:
- Learn more about dbt [in the docs](https://docs.getdbt.com/docs/introduction)
- Check out [Discourse](https://discourse.getdbt.com/) for commonly asked questions and answers
- Join the [chat](https://community.getdbt.com/) on Slack for live discussions and support
- Find [dbt events](https://events.getdbt.com) near you
- Check out [the blog](https://blog.getdbt.com/) for the latest news on dbt's development and best practices
54 changes: 54 additions & 0 deletions greenery/README_project_answers.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
1] How many users do we have?
- We have 130 distinct users
select count(distinct user_id)
from DEV_DB.DBT_KLYMPERIFLEXPORTCOM.STG_USERS

2] On average, how many orders do we receive per hour?
- On average, we receive 7.5 orders per hour
with
cte_orders_per_hour as (
select
date_trunc('hour', created_at) as order_hour,
count(distinct order_id) as total_hourly_orders
from DEV_DB.DBT_KLYMPERIFLEXPORTCOM.stg_orders
group by 1
)
select
round(avg(total_hourly_orders),1) as avg_hourly_orders
from cte_orders_per_hour;
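
A minimal Python sketch of the same two-step logic (bucket by hour, then average the hourly counts), on made-up sample data. The function name and the sample orders are hypothetical and not part of the project. It illustrates one property of the query: hours with no orders produce no group, so they are left out of the average rather than counted as zero.

```python
from collections import defaultdict
from datetime import datetime


def avg_orders_per_hour(orders):
    """orders: iterable of (order_id, created_at) pairs (hypothetical input shape)."""
    per_hour = defaultdict(set)
    for order_id, created_at in orders:
        # Equivalent of date_trunc('hour', created_at)
        hour = created_at.replace(minute=0, second=0, microsecond=0)
        # A set mirrors count(distinct order_id)
        per_hour[hour].add(order_id)
    counts = [len(ids) for ids in per_hour.values()]
    if not counts:
        return None  # avg() over zero rows is NULL
    return round(sum(counts) / len(counts), 1)


# Made-up data: three orders in the 09:00 hour, none at 10:00, one at 11:00
sample_orders = [
    ("a", datetime(2021, 2, 10, 9, 5)),
    ("b", datetime(2021, 2, 10, 9, 40)),
    ("c", datetime(2021, 2, 10, 9, 59)),
    ("d", datetime(2021, 2, 10, 11, 15)),
]
print(avg_orders_per_hour(sample_orders))
```

Only two hours have orders here, so the average is taken over two buckets, not three. If idle hours should count as zero, the query would need an hour spine to join against.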

3] On average, how long does an order take from being placed to being delivered?
- On average, an order takes 3.9 days from being placed to being delivered
select
round(avg(datediff(days, created_at, delivered_at)),1) as days_to_delivery
from DEV_DB.DBT_KLYMPERIFLEXPORTCOM.stg_orders
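
One caveat on this number: in Snowflake, `datediff` with a day part counts the calendar-day boundaries crossed, not elapsed 24-hour periods, and `avg` skips orders whose `delivered_at` is null. The Python sketch below contrasts the two ways of measuring on a made-up order; the function names and timestamps are hypothetical.

```python
from datetime import datetime


def boundary_days(start, end):
    """Calendar-day boundaries crossed, which is how datediff(days, ...) counts."""
    return (end.date() - start.date()).days


def elapsed_days(start, end):
    """Elapsed time expressed in fractional 24-hour days."""
    return (end - start).total_seconds() / 86400


# Made-up order: placed late on the 10th, delivered early on the 12th (26 hours later)
placed = datetime(2021, 2, 10, 23, 0)
delivered = datetime(2021, 2, 12, 1, 0)

print(boundary_days(placed, delivered))           # two midnights crossed
print(round(elapsed_days(placed, delivered), 2))  # just over one elapsed day
```

If elapsed time is what matters, taking the difference in hours and dividing by 24 would be one way to get a fractional-day average.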

4] How many users have only made one purchase? Two purchases? Three+ purchases?
- In our historical data, 25 users placed 1 order, 28 users placed 2 orders, and 71 users placed 3 or more.
with order_count as (
select
user_id,
count(order_id) as total_orders
from DEV_DB.DBT_KLYMPERIFLEXPORTCOM.stg_orders
group by 1)

select
sum(case when total_orders = 1 then 1 else 0 end) as nr_users_1_order,
sum(case when total_orders = 2 then 1 else 0 end) as nr_users_2_order,
sum(case when total_orders >= 3 then 1 else 0 end) as nr_users_3_order
from order_count;
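
The bucketing can be sketched in Python as follows, using made-up user IDs; the function name and sample data are hypothetical. As in the query, the last bucket is "three or more", and users who never ordered do not appear in any bucket, because the counts are built from the orders table only.

```python
from collections import Counter


def purchase_buckets(order_user_ids):
    """order_user_ids: one user_id per order row (hypothetical input shape)."""
    orders_per_user = Counter(order_user_ids)  # user_id -> total_orders
    buckets = {"1_order": 0, "2_orders": 0, "3_plus_orders": 0}
    for total_orders in orders_per_user.values():
        if total_orders == 1:
            buckets["1_order"] += 1
        elif total_orders == 2:
            buckets["2_orders"] += 1
        else:  # total_orders >= 3
            buckets["3_plus_orders"] += 1
    return buckets


# Made-up data: u1 has 1 order, u2 has 2, u3 has 3, u4 has 4
sample = ["u1", "u2", "u2", "u3", "u3", "u3", "u4", "u4", "u4", "u4"]
print(purchase_buckets(sample))
```

The three reported buckets sum to 124 users, against 130 users in question 1, which is consistent with some users having no orders at all.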

5] On average, how many unique sessions do we have per hour?
- We average 16.3 unique sessions per hour
with
cte_hourly_sessions as (
select
date_trunc('hour', created_at) as hourly_session,
count(distinct session_id) as total_sessions
from DEV_DB.DBT_KLYMPERIFLEXPORTCOM.stg_events
group by 1)

select
round(avg(total_sessions),1) as avg_hourly_sessions
from cte_hourly_sessions;

Empty file added greenery/analyses/.gitkeep
Empty file.
37 changes: 37 additions & 0 deletions greenery/dbt_project.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@

# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'greenery'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'greenery'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets: # directories to be removed by `dbt clean`
- "target"
- "dbt_packages"


# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models

# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
models:
  greenery:
    # Config indicated by + and applies to all files under models/example/
    example:
      +materialized: view
Empty file added greenery/macros/.gitkeep
Empty file.
6 changes: 6 additions & 0 deletions greenery/macros/lbs_to_kgs.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
{% macro lbs_to_kgs(column_name, precision=2) %}
ROUND(
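{# Snowflake's bare NUMERIC means NUMBER(38, 0), which would drop the decimals before ROUND; keep an explicit scale. #}
(CASE WHEN {{ column_name }} = -99 THEN NULL ELSE {{ column_name }} END / 2.205)::NUMERIC(38, 6),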
{{ precision }}
)
{% endmacro %}
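
For readers less familiar with Jinja macros, here is a rough Python equivalent of what the expression generated by `lbs_to_kgs` is meant to compute. This is a sketch under assumptions, not the macro itself: the Python function is hypothetical, and rounding of exact ties may differ between Python floats and the warehouse.

```python
def lbs_to_kgs(value, precision=2):
    """Convert pounds to kilograms, treating -99 as a 'missing' sentinel."""
    # The macro maps the sentinel -99 to NULL; NULL inputs stay NULL in SQL
    if value is None or value == -99:
        return None
    # Same divisor the macro uses (1 kg is roughly 2.205 lbs)
    return round(value / 2.205, precision)


print(lbs_to_kgs(10))               # default precision of 2 decimals
print(lbs_to_kgs(10, precision=0))  # precision is configurable per call
print(lbs_to_kgs(-99))              # sentinel becomes None
```

In a model, the macro would be called inside a select list, for example `{{ lbs_to_kgs('weight_lbs') }} as weight_kgs`, where `weight_lbs` is a hypothetical column name.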
8 changes: 8 additions & 0 deletions greenery/macros/positive_values.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
{% test positive_values(model, column_name) %}


select *
from {{ model }}
where {{ column_name }} < 0

{% endtest %}
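
A dbt generic test passes when its query returns zero rows, so `positive_values` selects the rows that violate the rule. The Python sketch below mimics that selection on made-up rows; the function name, the row shape, and the `order_total` values are hypothetical. Despite the test's name, only strictly negative values fail: zero passes, and nulls pass too, because `NULL < 0` is not true in SQL.

```python
def positive_values_failures(rows, column_name):
    """Return the rows the test query would select, i.e. the failing rows."""
    failures = []
    for row in rows:
        value = row.get(column_name)
        # Mirrors `where column < 0`: NULL never satisfies the comparison
        if value is not None and value < 0:
            failures.append(row)
    return failures


# Made-up rows: only the negative total should be reported
sample_rows = [
    {"order_id": "a", "order_total": 12.5},
    {"order_id": "b", "order_total": 0},
    {"order_id": "c", "order_total": None},
    {"order_id": "d", "order_total": -3.0},
]
failing = positive_values_failures(sample_rows, "order_total")
print(failing)  # the test passes only when this list is empty
```

If nulls should fail as well, a separate `not_null` test on the same column covers that case.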
27 changes: 27 additions & 0 deletions greenery/models/example/my_first_dbt_model.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@

/*
Welcome to your first dbt model!
Did you know that you can also configure models directly within SQL files?
This will override configurations stated in dbt_project.yml
Try changing "table" to "view" below
*/

{{ config(materialized='table') }}

with source_data as (

select 1 as id
union all
select null as id

)

select *
from source_data

/*
Uncomment the line below to remove records with null `id` values
*/

-- where id is not null
6 changes: 6 additions & 0 deletions greenery/models/example/my_second_dbt_model.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@

-- Use the `ref` function to select from other models

select *
from {{ ref('my_first_dbt_model') }}
where id = 1
21 changes: 21 additions & 0 deletions greenery/models/example/schema.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@

version: 2

models:
  - name: my_first_dbt_model
    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        tests:
          - unique
          - not_null

  - name: my_second_dbt_model
    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        tests:
          - unique
          - not_null
45 changes: 45 additions & 0 deletions greenery/models/staging/postgres/_postgres__models.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
version: 2

models:
  - name: stg_addresses
    description: Addresses on the platform with geographical dimensions
    columns:
      - name: address_id
        tests:
          - unique
  - name: stg_events
    description: All events occurring on the platform, with URL, event type, user, and timestamp
    columns:
      - name: event_id
        tests:
          - unique
  - name: stg_order_items
    description: Quantity of each product in a specific order
    columns:
      - name: order_id
        tests:
          # order_id repeats once per product in an order, so `unique` would fail here
          - not_null
  - name: stg_orders
    description: All platform orders with metadata such as user, costs, and delivery times
    columns:
      - name: order_id
        tests:
          - unique
  - name: stg_products
    description: Products on the platform with name, price, and inventory availability
    columns:
      - name: product_id
        tests:
          - unique
  - name: stg_promos
    description: Promotion codes and discounts
    columns:
      - name: promo_id
        tests:
          - unique
  - name: stg_users
    description: Personal information of the platform's users
    columns:
      - name: user_id
        tests:
          - unique
23 changes: 23 additions & 0 deletions greenery/models/staging/postgres/_postgres__sources.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
version: 2

sources:

  - name: postgress # this data originates from Postgres
    schema: public # this is the schema our raw data lives in
    database: raw # this is the name of the database that our source data lives in

    quoting:
      database: false
      schema: false
      identifier: false

    tables:
      # The source tables with raw data from Postgres
      - name: events
      - name: users
      - name: addresses
      - name: orders
      - name: products
      - name: order_items
      - name: promos
15 changes: 15 additions & 0 deletions greenery/models/staging/postgres/stg_addresses.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
-- The snowflake_warehouse env_var demonstrates another config we can apply. In our case we only have one size of
-- transformer_dev warehouse, but we could have several and, depending on model requirements,
-- choose to run/deploy against a bigger warehouse.
{{ config(
materialized='table',
snowflake_warehouse=env_var('SNOWFLAKE_WAREHOUSE_XS'))
}}

select
address_id,
address,
zipcode,
state,
country
from {{ source ('postgress', 'addresses') }}
18 changes: 18 additions & 0 deletions greenery/models/staging/postgres/stg_events.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
-- The snowflake_warehouse env_var demonstrates another config we can apply. In our case we only have one size of
-- transformer_dev warehouse, but we could have several and, depending on model requirements,
-- choose to run/deploy against a bigger warehouse.
{{ config(
materialized='table',
snowflake_warehouse=env_var('SNOWFLAKE_WAREHOUSE_XS'))
}}

select
event_id,
session_id,
user_id,
page_url,
created_at,
event_type,
order_id,
product_id
from {{ source ('postgress', 'events') }}
13 changes: 13 additions & 0 deletions greenery/models/staging/postgres/stg_order_items.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
-- The snowflake_warehouse env_var demonstrates another config we can apply. In our case we only have one size of
-- transformer_dev warehouse, but we could have several and, depending on model requirements,
-- choose to run/deploy against a bigger warehouse.
{{ config(
materialized='table',
snowflake_warehouse=env_var('SNOWFLAKE_WAREHOUSE_XS'))
}}

select
order_id,
product_id,
quantity
from {{ source ('postgress', 'order_items') }}
23 changes: 23 additions & 0 deletions greenery/models/staging/postgres/stg_orders.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
-- The snowflake_warehouse env_var demonstrates another config we can apply. In our case we only have one size of
-- transformer_dev warehouse, but we could have several and, depending on model requirements,
-- choose to run/deploy against a bigger warehouse.
{{ config(
materialized='table',
snowflake_warehouse=env_var('SNOWFLAKE_WAREHOUSE_XS'))
}}

select
order_id,
user_id,
promo_id,
address_id,
created_at,
order_cost,
shipping_cost,
order_total,
tracking_id,
shipping_service,
estimated_delivery_at,
delivered_at,
status
from {{ source ('postgress', 'orders') }}
14 changes: 14 additions & 0 deletions greenery/models/staging/postgres/stg_products.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
-- The snowflake_warehouse env_var demonstrates another config we can apply. In our case we only have one size of
-- transformer_dev warehouse, but we could have several and, depending on model requirements,
-- choose to run/deploy against a bigger warehouse.
{{ config(
materialized='table',
snowflake_warehouse=env_var('SNOWFLAKE_WAREHOUSE_XS'))
}}

select
product_id,
name,
price,
inventory
from {{ source ('postgress', 'products') }}
13 changes: 13 additions & 0 deletions greenery/models/staging/postgres/stg_promos.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
-- The snowflake_warehouse env_var demonstrates another config we can apply. In our case we only have one size of
-- transformer_dev warehouse, but we could have several and, depending on model requirements,
-- choose to run/deploy against a bigger warehouse.
{{ config(
materialized='table',
snowflake_warehouse=env_var('SNOWFLAKE_WAREHOUSE_XS'))
}}

select
promo_id,
discount,
status
from {{ source ('postgress', 'promos') }}
18 changes: 18 additions & 0 deletions greenery/models/staging/postgres/stg_users.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
-- The snowflake_warehouse env_var demonstrates another config we can apply. In our case we only have one size of
-- transformer_dev warehouse, but we could have several and, depending on model requirements,
-- choose to run/deploy against a bigger warehouse.
{{ config(
materialized='table',
snowflake_warehouse=env_var('SNOWFLAKE_WAREHOUSE_XS'))
}}

select
user_id,
first_name,
last_name,
email,
phone_number,
created_at,
updated_at,
address_id
from {{ source ('postgress', 'users') }}