This project analyzes the Amazon Best Selling Books dataset using Python (Pandas & NumPy).
The goal was to explore, clean, and analyze spreadsheet data to identify trends in bestselling books over the years.
It's a beginner-friendly project that shows how to handle real-world CSV data with Python.
- Load and explore the dataset with Pandas
- Data cleaning (remove duplicates, handle missing values)
- Basic exploratory data analysis (EDA)
- Sorting & filtering by ratings, reviews, and prices
- Aggregating data (e.g., most popular authors, categories)
- Exporting cleaned results back to CSV
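
The load, clean, sort, filter, and export steps listed above could look roughly like the sketch below. It is not the project's actual script: the column names (`Name`, `Author`, `User Rating`, `Reviews`, `Price`, `Year`, `Genre`), the sample rows, and the `clean_books` helper are assumptions made for illustration, and the data is read from an in-memory string so the example runs without the real CSV file.

```python
import io

import pandas as pd

# Hypothetical sample rows; the column names are assumed, not taken from the project.
SAMPLE_CSV = """Name,Author,User Rating,Reviews,Price,Year,Genre
Book A,Author X,4.7,17350,8,2016,Non Fiction
Book A,Author X,4.7,17350,8,2016,Non Fiction
Book B,Author Y,4.6,2052,22,2011,Fiction
Book C,Author Z,,18979,15,2018,Non Fiction
Book D,Author Y,4.8,21424,6,2019,Fiction
"""


def clean_books(df):
    """Drop exact duplicate rows and rows with a missing rating (hypothetical helper)."""
    df = df.drop_duplicates()
    df = df.dropna(subset=["User Rating"])
    return df.reset_index(drop=True)


# Load: with the real dataset this would be pd.read_csv("<path to the CSV>")
raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
cleaned = clean_books(raw)

# Sort by rating, then by review count, highest first
ranked = cleaned.sort_values(["User Rating", "Reviews"], ascending=False)

# Filter: highly rated books at or below a price ceiling
affordable = ranked[(ranked["User Rating"] >= 4.7) & (ranked["Price"] <= 10)]

# Export: returns CSV text here; pass a file path to write to disk instead
csv_text = ranked.to_csv(index=False)

print(ranked[["Name", "User Rating", "Reviews", "Price"]])
```

Dropping rows with a missing rating is only one way to handle missing values; filling them with a default or a column median is an equally reasonable choice, depending on the analysis.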
 
Language: Python 3
Libraries: Pandas, NumPy
Clone this repository:

    git clone https://github.com/azxigen/amazon-books-analysis.git
    cd amazon-books-analysis

The dataset, Amazon Bestselling Books (2009–2019), contains details such as:
- Book Title
- Author
- Genre (Fiction / Non-Fiction)
- User Rating
- Reviews
- Price
 
- Top authors with multiple bestselling books
- Price trends across years
- Most common categories in bestsellers
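
As a rough illustration of how these three questions can be answered with Pandas aggregations, here is a minimal sketch. The rows are invented and the column names (`Name`, `Author`, `Genre`, `Price`, `Year`) are assumptions, so the numbers it prints say nothing about the real dataset.

```python
import pandas as pd

# Hypothetical rows; column names are assumed for illustration only.
books = pd.DataFrame({
    "Name": ["Book A", "Book B", "Book C", "Book D", "Book E"],
    "Author": ["Author X", "Author Y", "Author X", "Author Z", "Author Y"],
    "Genre": ["Fiction", "Non Fiction", "Fiction", "Non Fiction", "Non Fiction"],
    "Price": [10, 20, 14, 8, 11],
    "Year": [2009, 2009, 2010, 2010, 2010],
})

# Top authors: count distinct titles per author, keep those with more than one
titles_per_author = books.groupby("Author")["Name"].nunique()
top_authors = titles_per_author[titles_per_author > 1].sort_values(ascending=False)

# Price trend: average price per year
avg_price_by_year = books.groupby("Year")["Price"].mean()

# Most common categories: number of rows per genre, largest first
genre_counts = books["Genre"].value_counts()

print(top_authors)
print(avg_price_by_year)
print(genre_counts)
```

Counting distinct titles with `nunique()` rather than rows is a deliberate choice here: if the same book appears in several yearly lists, it is still counted once per author.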
 
Note: The project does not use Matplotlib, so results are presented as Pandas tables and CSV exports.
- Add visualizations (Matplotlib / Seaborn)
- Build an interactive dashboard with Streamlit
- Compare trends with more recent Amazon data
 
This project demonstrates how spreadsheet-style data can be analyzed using Python and Pandas.
It provides a strong foundation for data analysis, which can be extended with visualizations and machine learning.