
Covid19-Data-Engineering-Project

Dataset URL: https://aws.amazon.com/blogs/big-data/a-public-data-lake-for-analysis-of-covid-19-data/

Steps:

Explored the datasets to understand their structure and contents.
Uploaded the data to S3.
Built crawlers using AWS Glue.
Verified that the crawled data can be queried in Athena.
Built a Data Model.
Built a Dimensional model using star schema.
Created Dimension and Fact Tables in Python (pandas).
Loaded data into those tables in Python (pandas).
Saved the resulting CSV files to S3.
Wrote an AWS Glue job as a Python shell script.
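
The dimension and fact table steps above can be sketched in pandas roughly as follows. This is a minimal illustration and not the notebook's actual code: the column names (`fips`, `county`, `state`, `date`, `cases`, `deaths`), the table names, and the sample rows are all hypothetical stand-ins for the real COVID-19 datasets.

```python
import io

import pandas as pd

# Hypothetical raw extract; the real datasets have different columns and rows.
raw = pd.DataFrame(
    {
        "fips": [1001, 1001, 1003],
        "county": ["Autauga", "Autauga", "Baldwin"],
        "state": ["Alabama", "Alabama", "Alabama"],
        "date": ["2020-04-01", "2020-04-02", "2020-04-01"],
        "cases": [10, 12, 20],
        "deaths": [0, 1, 1],
    }
)

# Dimension table: one row per region, keyed by the (assumed) fips code.
dim_region = (
    raw[["fips", "county", "state"]].drop_duplicates().reset_index(drop=True)
)

# Dimension table: one row per calendar date, with derived attributes.
dim_date = pd.DataFrame({"date": pd.to_datetime(raw["date"].unique())})
dim_date["year"] = dim_date["date"].dt.year
dim_date["month"] = dim_date["date"].dt.month
dim_date["is_weekend"] = dim_date["date"].dt.dayofweek >= 5

# Fact table: the measures plus the keys that point at the dimensions.
fact_covid = raw[["fips", "date", "cases", "deaths"]].copy()
fact_covid["date"] = pd.to_datetime(fact_covid["date"])


def to_csv_text(df: pd.DataFrame) -> str:
    """Serialize a DataFrame to CSV text without the pandas index."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)
    return buffer.getvalue()


# CSV payloads, held in memory here instead of being uploaded to S3.
csv_outputs = {
    "dim_region": to_csv_text(dim_region),
    "dim_date": to_csv_text(dim_date),
    "fact_covid": to_csv_text(fact_covid),
}
```

Each CSV string could then be uploaded with boto3's `put_object`; the bucket and key names are left out here because the README does not state them.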

Connected to Redshift.
Created Dimension and Fact Table schemas.
Loaded data from the CSV files in S3 into Redshift.
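
The three Redshift steps above might look like the sketch below, which uses the `redshift_connector` package shipped in this repository. It is an assumption-laden outline, not the project's actual Glue job: the table name, column types, S3 URI, and IAM role ARN are hypothetical placeholders, and authorizing `COPY` through an IAM role is an assumed choice.

```python
def create_table_sql(table, columns):
    """Build a CREATE TABLE statement from (column name, SQL type) pairs."""
    cols = ",\n    ".join(f"{name} {sqltype}" for name, sqltype in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} (\n    {cols}\n);"


def copy_sql(table, s3_uri, iam_role_arn):
    """Build a Redshift COPY statement for a CSV file that has a header row."""
    return (
        f"COPY {table} FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS CSV IGNOREHEADER 1;"
    )


def run_statements(conn_params, statements):
    """Connect to Redshift and execute the statements in one transaction."""
    # Imported lazily so the SQL builders above work without the driver.
    import redshift_connector

    conn = redshift_connector.connect(**conn_params)
    try:
        cursor = conn.cursor()
        for statement in statements:
            cursor.execute(statement)
        conn.commit()
    finally:
        conn.close()


# Hypothetical schema and locations, for illustration only.
DIM_REGION_COLUMNS = [
    ("fips", "INTEGER"),
    ("county", "VARCHAR(100)"),
    ("state", "VARCHAR(100)"),
]

statements = [
    create_table_sql("dim_region", DIM_REGION_COLUMNS),
    copy_sql(
        "dim_region",
        "s3://example-bucket/output/dim_region.csv",
        "arn:aws:iam::123456789012:role/example-redshift-role",
    ),
]
```

Calling `run_statements({"host": ..., "database": ..., "user": ..., "password": ...}, statements)` with real cluster details would create the table and load it. Building the SQL separately from the connection code keeps the statements easy to inspect before anything runs against the cluster.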

Together, these steps build a data warehouse in Redshift.
