A machine learning project that uses CatBoost to classify iris flower species based on their physical characteristics. This implementation demonstrates gradient boosting for multiclass classification on the classic Iris dataset.
This project implements a CatBoost classifier to predict iris species (Setosa, Versicolor, and Virginica) based on four features:
- Sepal length
- Sepal width
- Petal length
- Petal width
The model achieves high accuracy using gradient boosting techniques with minimal code and configuration.
- Dataset Loading: Uses the built-in Iris dataset from scikit-learn
- Data Inspection: Displays dataset shape, features, and class distribution
- Train/Test Split: 80/20 split for model validation
- CatBoost Implementation: Gradient boosting classifier with multiclass support
- Performance Evaluation: Accuracy metrics on test data
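A minimal sketch of the loading, inspection, and split steps, assuming the standard scikit-learn and pandas calls (variable names here are illustrative, not necessarily those used in `iris_classifier.py`):

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the built-in Iris dataset and convert it to a pandas DataFrame
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)  # 150 rows x 4 features
y = pd.Series(iris.target)                               # species labels 0, 1, 2

# Inspect shape and class distribution
print(X.shape, y.shape)   # (150, 4) (150,)
print(y.value_counts())   # 50 samples per class

# 80/20 train/test split for validation (random_state assumed for reproducibility)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```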
- pandas
- scikit-learn
- catboost

- Clone this repository:
```bash
git clone https://github.com/javsan77/CatBoost-Iris-Classifier.git
cd catboost-iris-classifier
```

- Install dependencies:
```bash
pip install pandas scikit-learn catboost
```

Run the script:

```bash
python iris_classifier.py
```

Expected output:

```text
--- Dataset Inspection ---
X shape (rows, columns): (150, 4)
y shape (samples): (150,)
First 5 rows of X (Features):
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
...
Value counts of y (Target Species 0, 1, 2):
0 50
1 50
2 50
--------------------------
Starting CatBoost model training on the CPU...
Training complete!
Model accuracy on the test set: 1.0000
```
CatBoost Classifier Parameters:
- iterations: 100 trees
- learning_rate: 0.1
- depth: 6 (tree depth)
- loss_function: MultiClass (for 3 species)
- random_seed: 42 (for reproducibility)
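These options map directly onto the `CatBoostClassifier` constructor. A sketch, reusing the `X_train`/`y_train` split from the snippet above (`verbose=False` is not in the parameter list and is added here only to keep the training log quiet):

```python
from catboost import CatBoostClassifier

model = CatBoostClassifier(
    iterations=100,              # number of boosted trees
    learning_rate=0.1,
    depth=6,                     # tree depth
    loss_function="MultiClass",  # three iris species
    random_seed=42,              # reproducibility
    verbose=False,               # assumption: silence per-iteration output
)
model.fit(X_train, y_train)
```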
Iris Dataset:
- Samples: 150 (50 per species)
- Features: 4 numeric attributes
- Classes: 3 (Setosa, Versicolor, Virginica)
- Split: 120 training samples, 30 test samples
The model typically achieves ~97-100% accuracy on the test set, demonstrating CatBoost's effectiveness for this classification task.
- Data Loading: Imports the Iris dataset and converts to pandas DataFrame
- Data Inspection: Prints dataset dimensions and distribution
- Data Splitting: Divides data into training (80%) and testing (20%) sets
- Model Training: Trains CatBoost classifier with 100 iterations
- Prediction: Makes predictions on unseen test data
- Evaluation: Calculates and displays accuracy score
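A sketch of the prediction and evaluation steps, continuing from the model and split above (CatBoost's `predict` can return a column vector, hence the `ravel`):

```python
from sklearn.metrics import accuracy_score

# Predict on the held-out 20% and compute accuracy
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred.ravel())
print(f"Model accuracy on the test set: {accuracy:.4f}")
```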
- High Performance: Often outperforms other gradient boosting libraries
- Easy to Use: Minimal hyperparameter tuning required
- Built-in Categorical Support: Handles categorical features automatically
- Fast Training: Optimized for CPU training
- Robust: Less prone to overfitting
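The built-in categorical support is not needed for Iris (all four features are numeric), but as a hypothetical illustration unrelated to this project, CatBoost accepts raw string columns directly when they are declared via `cat_features`:

```python
import pandas as pd
from catboost import CatBoostClassifier

# Toy data (not part of this project): one categorical and one numeric column
df = pd.DataFrame({
    "color":  ["red", "blue", "red", "green", "blue", "red"],
    "length": [5.1, 4.9, 6.0, 5.5, 4.7, 5.8],
    "label":  [0, 1, 0, 1, 1, 0],
})

clf = CatBoostClassifier(iterations=10, verbose=False)
# No one-hot encoding required; just name the categorical column(s)
clf.fit(df[["color", "length"]], df["label"], cat_features=["color"])
```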
This project helps you understand:
- Gradient boosting for classification
- CatBoost library implementation
- Multiclass classification problems
- Model evaluation techniques
- Data preprocessing with pandas
Contributions are welcome! Feel free to:
- Report bugs
- Suggest improvements
- Add new features
- Improve documentation
This project is licensed under the MIT License - see the LICENSE file for details.
Javier Sanchez Ayte - https://www.linkedin.com/in/javier-sanchez-ayte/
- Built with CatBoost
- Dataset from scikit-learn
- Inspired by gradient boosting techniques
⭐ If you found this helpful, please give it a star!