Movie recommendation system with Python. Implements content-based filtering (TF-IDF + cosine similarity), collaborative filtering with matrix factorization (TruncatedSVD), and a hybrid approach. Evaluates with Precision@K, Recall@K, and NDCG. Includes rating distribution plots, top movies, and sample recommendations.
- Synthetic ratings dataset with users, movies, and multi-label genres
- Content-based: TF‑IDF on genres + item–item cosine similarity
- Collaborative: user–item matrix factorization (TruncatedSVD)
- Hybrid: weighted combination of content & collaborative scores
- Evaluation: Precision@K, Recall@K, NDCG@K (per-user & macro)
- Visuals: rating distribution, top movies, sample recommendations
- Outputs: metrics JSON, recommendation files, and charts
movie-recommendation-system/
├─ README.md
├─ LICENSE
├─ requirements.txt
├─ data/
│  └─ generate_ratings.py
├─ src/
│  ├─ build_recommender.py
│  └─ utils.py
└─ outputs/
   └─ figures & reports (auto-created)
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
pip install -r requirements.txtpython data/generate_ratings.py --users 800 --movies 1200 --seed 42 --outdir data
# Produces: data/ratings.csv, data/movies.csvpython src/build_recommender.py --ratings data/ratings.csv --movies data/movies.csv --outdir outputs --k 10 --alpha 0.6 --seed 42Outputs
- outputs/metrics.json– Precision@K, Recall@K, NDCG@K (content, collab, hybrid)
- outputs/recs_user_*.txt– sample recommendations for a few users
- outputs/ratings_hist.png
- outputs/top_movies.png
| Method | Precision@10 | Recall@10 | NDCG@10 | 
|---|---|---|---|
| Collaborative | 0.0024 | 0.0107 | 0.0059 | 
| Content-based | 0.0026 | 0.0129 | 0.0073 | 
| Hybrid (α=0.6) | 0.0024 | 0.0107 | 0.0060 | 
➡️ On this sparse synthetic dataset, content-based slightly outperforms collaborative and hybrid.
 
 
User 1
Movie 959    41.08
Movie 541    36.28
Movie 952    34.83
Movie 95     34.81
Movie 40     33.30
User 2
Movie 915    47.02
Movie 122    38.27
Movie 77     38.18
Movie 1097   37.65
Movie 952    37.04
User 3
Movie 800    52.00
Movie 410    45.77
Movie 9      45.59
Movie 738    41.34
Movie 713    41.00
- Positive items are defined as ratings ≥ 4 for evaluation.
- The hybrid weight alphacontrols emphasis on collaborative (α→1) vs. content (α→0).