Welcome to my submission for the Kaggle Dogs vs. Cats Redux classification challenge!
This project demonstrates how to build progressively better deep learning models to distinguish between images of cats and dogs. The final submission predicts the probability that each image contains a dog, and Kaggle scores it by log loss.
Develop a binary classifier that assigns a probability score to each test image, indicating whether it's a dog (1.0) or a cat (0.0).
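
Since the submission is scored by log loss, it helps to see how that metric treats confident mistakes. Below is a minimal sketch in plain Python, assuming labels of 1 for dog and 0 for cat as described above. The `eps` clipping value and the sample numbers are illustrative assumptions, not values taken from this project.

```python
import math

def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary cross-entropy; labels are 1 for dog and 0 for cat."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip so that log(0) never occurs (eps is an assumed guard value).
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Illustrative labels and predictions (made up for this example).
labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]
one_bad = [0.9, 0.1, 0.8, 0.99]  # last image is a cat predicted as a near-certain dog

print(log_loss(labels, confident))
print(log_loss(labels, one_bad))
```

A single confidently wrong prediction raises the average sharply, which is why well-calibrated probabilities matter more here than raw accuracy.
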
- Custom ConvNet trained from scratch
- Data augmentation: horizontal flip, zoom
- 5-epoch training with early stopping
- Achieved ~0.25 log loss on validation
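
The baseline bullets above (a ConvNet trained from scratch, horizontal flip and zoom augmentation, 5 epochs with early stopping) could look roughly like the following Keras sketch. The image size, layer widths, zoom factor, and patience are assumptions chosen for illustration; the actual architecture in the notebooks may differ.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_baseline_cnn(image_size=(128, 128)):
    """Small ConvNet with built-in augmentation (all sizes are assumptions)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=image_size + (3,)),
        # Augmentation layers are only active during training.
        layers.RandomFlip("horizontal"),
        layers.RandomZoom(0.2),
        layers.Rescaling(1.0 / 255),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),
        layers.Dense(64, activation="relu"),
        # Sigmoid output is the predicted probability of "dog".
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Stop when validation loss stops improving (patience is an assumed value).
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=2, restore_best_weights=True)

# Hypothetical usage, given train_ds and val_ds datasets of (image, label) batches:
# model = build_baseline_cnn()
# model.fit(train_ds, validation_data=val_ds, epochs=5, callbacks=[early_stop])
```

Binary cross-entropy is the same quantity as the competition's log loss, so the validation loss reported during training is directly comparable to the leaderboard metric.
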
- Pretrained embeddings from EfficientNet
- Classical neural net on top of frozen features
- Faster convergence, better generalization
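
For the transfer-learning model, one way to read "a neural net on top of frozen features" is a small dense classifier trained on precomputed EfficientNet embeddings. The sketch below assumes the embeddings are already extracted as arrays of shape `(n_images, 1280)`; 1280 matches the pooled output of EfficientNet-B0, but this README does not say which variant was used, and the hidden width and dropout rate are likewise assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

EMBEDDING_DIM = 1280  # assumption: pooled EfficientNet-B0 feature size

def build_head(embedding_dim=EMBEDDING_DIM):
    """Dense classifier over frozen, precomputed image embeddings."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(embedding_dim,)),
        layers.Dense(256, activation="relu"),   # assumed hidden width
        layers.Dropout(0.3),                    # assumed dropout rate
        layers.Dense(1, activation="sigmoid"),  # probability of "dog"
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Hypothetical usage, given train_emb / val_emb arrays and 0/1 label arrays:
# head = build_head()
# head.fit(train_emb, train_labels, validation_data=(val_emb, val_labels), epochs=5)
```

Because the backbone is never updated, each image only needs to pass through EfficientNet once, and only the small head is trained.
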
- Source: Provided by Kaggle
- Training images: 25,000 JPEGs named in the format `cat.123.jpg` or `dog.456.jpg`
- Test images: Unlabeled set used for leaderboard submission
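
Because the training labels live in the filenames, they can be recovered with a few lines of standard-library Python. This is a minimal sketch; the `train/` directory in the usage comment is a hypothetical path, and the helper name is made up for illustration.

```python
from pathlib import Path

def label_from_filename(path):
    """Return 1 for dog.*.jpg and 0 for cat.*.jpg."""
    # The class name is everything before the first dot, e.g. "cat" in cat.123.jpg.
    prefix = Path(path).name.split(".", 1)[0].lower()
    if prefix == "dog":
        return 1
    if prefix == "cat":
        return 0
    raise ValueError(f"Unexpected training filename: {path}")

# Hypothetical usage over a local copy of the training set:
# labels = {p.name: label_from_filename(p) for p in Path("train").glob("*.jpg")}
```
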
- Python, TensorFlow/Keras
- EfficientNet-PyTorch
- Matplotlib, Seaborn, OpenCV
- Jupyter Notebooks, Kaggle Kernels
| Model | Log Loss | Accuracy (Val) |
|---|---|---|
| Baseline CNN | ~0.25 | ~90% |
| EfficientNet Model | ~0.18 | ~93% |
- Data augmentation helps mitigate overfitting on small image datasets.
- Transfer learning (with EfficientNet) improves performance and shortens training time.
- Simple binary classification pipelines can still be highly effective with the right preprocessing.
Made with ❤️ by Justin Varghese
Feel free to fork, star, or reach out if you liked this repo!