We will work with a CSV file that features some fun data about the top 50 best selling books on Amazon from 2009 to 2019 (provided by Kaggle).

azxigen/Analyze-Best-Selling-Amazon-Books-with-Pandas

📊 Amazon Best Selling Books Analysis (with Pandas)

📌 Project Overview

This project analyzes the Amazon Best Selling Books dataset using Python (Pandas & NumPy).
The goal was to explore, clean, and analyze spreadsheet data to identify trends in bestselling books over the years.

It's a beginner-friendly project that shows how to handle real-world CSV data with Python.


✨ Features

  • ✔ Load and explore dataset with Pandas
  • ✔ Data cleaning (remove duplicates, handle missing values)
  • ✔ Basic exploratory data analysis (EDA)
  • ✔ Sorting & filtering by ratings, reviews, and prices
  • ✔ Aggregating data (e.g., most popular authors, categories)
  • ✔ Exporting cleaned results back to CSV
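
The cleaning, filtering, and export steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`Name`, `Author`, `User Rating`, `Reviews`, `Price`), the `clean_books` helper, and the price threshold are assumptions, and the rows are made-up placeholders rather than real dataset values.

```python
import io

import pandas as pd

# Made-up placeholder rows; the column names are assumptions, not the real file.
raw = pd.DataFrame(
    {
        "Name": ["Book A", "Book A", "Book B", "Book C", "Book D"],
        "Author": ["Author X", "Author X", "Author Y", "Author Z", "Author Y"],
        "User Rating": [4.7, 4.7, 4.2, None, 4.9],
        "Reviews": [1200, 1200, 300, 50, 8000],
        "Price": [8, 8, 15, 22, 11],
    }
)


def clean_books(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop exact duplicate rows and rows with no rating."""
    cleaned = df.drop_duplicates()
    cleaned = cleaned.dropna(subset=["User Rating"])
    return cleaned.reset_index(drop=True)


cleaned = clean_books(raw)

# Filter to books under an arbitrary price threshold,
# then sort by rating and review count, highest first.
cheap_and_good = cleaned[cleaned["Price"] < 12].sort_values(
    ["User Rating", "Reviews"], ascending=False
)

# Export to an in-memory buffer here; pass a file path to to_csv to write to disk.
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Dropping rows with a missing rating is only one way to handle missing values; filling them with `fillna` may suit some analyses better.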

🛠 Tech Stack

Language: Python 3
Libraries: Pandas, NumPy


🚀 How to Run

Clone this repository:

git clone https://github.com/azxigen/amazon-books-analysis.git  
cd amazon-books-analysis

📂 Dataset

Amazon Bestselling Books (2009–2019). The dataset contains details such as:

  • Book Title
  • Author
  • Genre (Fiction / Non-Fiction)
  • User Rating
  • Reviews
  • Price
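
Loading the file and checking that the expected fields are present might look like the sketch below. The header names in `EXPECTED_COLUMNS` (including a `Year` column) and the `load_books` helper are assumptions for illustration, and the CSV text is a made-up stand-in for the real file.

```python
import io

import pandas as pd

# Assumed header names; adjust them to match the actual CSV.
EXPECTED_COLUMNS = ["Name", "Author", "User Rating", "Reviews", "Price", "Year", "Genre"]

# Placeholder CSV text standing in for the real file (rows are made up).
sample_csv = io.StringIO(
    "Name,Author,User Rating,Reviews,Price,Year,Genre\n"
    "Book A,Author X,4.7,1200,8,2016,Non-Fiction\n"
    "Book B,Author Y,4.2,300,15,2011,Fiction\n"
)


def load_books(source) -> pd.DataFrame:
    """Hypothetical helper: read the CSV and fail early if a column is missing."""
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return df


books = load_books(sample_csv)
print(books.dtypes)  # column types inferred by Pandas
print(books.head())  # first few rows
```

With the real dataset, a file path can be passed to `load_books` in place of the in-memory buffer.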

🔍 Sample Insights

  • Top authors with multiple bestselling books
  • Price trends across years
  • Most common categories in bestsellers
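
Each of these insights maps onto a short Pandas aggregation. The sketch below uses made-up placeholder rows and assumed column names (`Name`, `Author`, `Genre`, `Price`, `Year`); it illustrates the kind of query involved and does not reproduce the project's results.

```python
import pandas as pd

# Made-up placeholder rows; column names are assumptions.
books = pd.DataFrame(
    {
        "Name": ["Book A", "Book B", "Book C", "Book D", "Book E"],
        "Author": ["Author X", "Author X", "Author Y", "Author Z", "Author Y"],
        "Genre": ["Fiction", "Fiction", "Non-Fiction", "Non-Fiction", "Non-Fiction"],
        "Price": [10, 14, 20, 8, 12],
        "Year": [2009, 2010, 2009, 2010, 2010],
    }
)

# Authors with more than one distinct title on the list.
titles_per_author = books.groupby("Author")["Name"].nunique()
repeat_authors = titles_per_author[titles_per_author > 1].sort_values(ascending=False)

# Mean price per year, as a simple view of price trends.
mean_price_by_year = books.groupby("Year")["Price"].mean()

# How often each genre appears, most common first.
genre_counts = books["Genre"].value_counts()

print(repeat_authors)
print(mean_price_by_year)
print(genre_counts)
```

Counting distinct titles with `nunique` rather than rows avoids inflating an author's total when the same book appears in several years.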

Note: Because the project does not use Matplotlib, results are presented as Pandas tables and CSV exports.


🔮 Future Improvements

  • 📊 Add visualizations (Matplotlib / Seaborn)
  • 🖥 Build interactive dashboard with Streamlit
  • 📈 Compare trends with more recent Amazon data

✅ Conclusion

This project demonstrates how spreadsheet-style data can be analyzed using Python and Pandas.
It provides a strong foundation for data analysis, which can be extended with visualizations and machine learning.
