Real-time network anomaly detection system using Kafka for data streaming, MongoDB for persistence, and Python ML models for behavioral analytics.
NetSage ML is a real-time system that detects network anomalies using machine learning only — no IDS or packet capture tools. It consumes streaming flow/telemetry data via Kafka, stores enriched results in MongoDB, and visualizes alerts and trends in a React dashboard.
- Streaming: Apache Kafka
- Backend: FastAPI (Python)
- Database: MongoDB
- ML Engine: scikit-learn + PyOD + TensorFlow (for optional AutoEncoder)
- Frontend: React + Tailwind CSS + Chart.js
- Visualization: Grafana-ready APIs
- Python 3.9+
- Node.js 16+
- MongoDB running on localhost:27017
- Kafka running on localhost:9092
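Before starting the pipeline, it can help to confirm that both services are actually listening. The following is a minimal sketch using only the Python standard library; the helper names `is_port_open` and `check_services` are hypothetical and not part of the repository, and the hosts and ports are the ones listed above.

```python
import socket

# Hosts and ports taken from the prerequisites list above.
SERVICES = {
    "MongoDB": ("localhost", 27017),
    "Kafka": ("localhost", 9092),
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(services: dict = SERVICES) -> dict:
    """Map each service name to whether its TCP port accepts connections."""
    return {
        name: is_port_open(host, port)
        for name, (host, port) in services.items()
    }


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the service is healthy or correctly configured.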
- Clone and set up the Python environment:
cd netsage-ml
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
- Set up environment variables:
cp .env.example .env
# Edit .env with your configuration
- Train initial models:
python scripts/train_iforest.py
- Start MongoDB and Kafka (if not already running)
- Start the FastAPI backend:
python -m uvicorn api.main:app --reload --port 8000
- Start Kafka consumer (ML pipeline):
python kafka/consumer.py
- Start Kafka producer (mock data):
python kafka/producer.py
- Set up and start the React dashboard:
cd dashboard
npm install
npm start
netsage-ml/
├── kafka/
│   ├── producer.py           # Simulates flow events
│   └── consumer.py           # Feeds ML pipeline
├── ml_engine/
│   ├── feature_extractor.py  # Feature engineering
│   ├── anomaly_detector.py   # ML model wrapper
│   ├── models/               # Trained model files
│   └── train_model.py        # Training script
├── api/
│   ├── main.py               # FastAPI app
│   ├── routes/               # API endpoints
│   └── models/               # Pydantic models
├── dashboard/                # React frontend
├── scripts/                  # Utility scripts
└── .env.example
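To illustrate how the producer and feature extractor in the layout above might fit together, here is a minimal sketch that builds a synthetic flow record and turns it into a numeric feature vector. The field names (`bytes`, `packets`, `duration_s`, and so on) and both function names are assumptions made for illustration; the actual schema used by `producer.py` and `feature_extractor.py` may differ.

```python
import math
import random
import time


def make_mock_flow(rng: random.Random) -> dict:
    """Build one synthetic flow record (field names are illustrative only)."""
    return {
        "ts": time.time(),
        "src_ip": f"10.0.0.{rng.randint(1, 254)}",
        "dst_ip": f"10.0.1.{rng.randint(1, 254)}",
        "dst_port": rng.choice([22, 53, 80, 443, 8080]),
        "protocol": rng.choice(["TCP", "UDP"]),
        "bytes": rng.randint(40, 1_500_000),
        "packets": rng.randint(1, 1000),
        "duration_s": round(rng.uniform(0.001, 30.0), 3),
    }


def extract_features(flow: dict) -> list[float]:
    """Map a flow record to a fixed-length numeric vector for an ML model."""
    duration = max(flow["duration_s"], 1e-6)  # guard against division by zero
    packets = max(flow["packets"], 1)
    return [
        math.log1p(flow["bytes"]),      # log scale tames heavy-tailed byte counts
        math.log1p(flow["packets"]),
        flow["bytes"] / packets,        # mean packet size
        flow["bytes"] / duration,       # throughput in bytes per second
        1.0 if flow["protocol"] == "TCP" else 0.0,
        1.0 if flow["dst_port"] < 1024 else 0.0,  # well-known port flag
    ]


if __name__ == "__main__":
    flow = make_mock_flow(random.Random(0))
    print(flow)
    print(extract_features(flow))
```

Keeping the vector length fixed lets the same features feed any of the detectors without per-model changes.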
- Isolation Forest: Fast unsupervised anomaly detection
- DBSCAN: Clustering-based anomaly detection
- AutoEncoder: Deep learning anomaly detector (optional)
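As a rough illustration of the first two models, the sketch below runs scikit-learn's `IsolationForest` and `DBSCAN` on synthetic two-dimensional data with three injected outliers. The data, the hyperparameters (`contamination=0.01`, `eps=0.5`, `min_samples=5`), and the choice to standardize features first are assumptions for this example, not the project's trained configuration. The optional AutoEncoder is not shown.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

# Synthetic data: 500 "normal" points plus three obvious outliers at the end.
rng = np.random.default_rng(42)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.5], [10.0, -8.0]])
X = np.vstack([normal, outliers])

# Standardize so both features contribute equally to distances.
X_scaled = StandardScaler().fit_transform(X)

# Isolation Forest: fit_predict returns -1 for anomalies and 1 for inliers.
iforest = IsolationForest(n_estimators=100, contamination=0.01, random_state=42)
iforest_labels = iforest.fit_predict(X_scaled)

# DBSCAN: points that belong to no dense cluster get the noise label -1.
dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_scaled)

print("IsolationForest anomalies:", int((iforest_labels == -1).sum()))
print("DBSCAN noise points:", int((dbscan_labels == -1).sum()))
```

Note the practical difference: a fitted `IsolationForest` can score new events with `predict`, while scikit-learn's `DBSCAN` has no `predict` method and must be re-run on a batch, which matters for a streaming pipeline.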
✅ 100% ML-based anomaly detection (no IDS or DPI)
✅ Real-time Kafka streaming ingestion
✅ MongoDB storage for alerts & flows
✅ Modular ML architecture (swap models easily)
✅ React dashboard for real-time visualization
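One way a swappable model layer could look is a small shared interface that every detector satisfies. The sketch below is an assumption about the design, not the code in `anomaly_detector.py`: the `Detector` protocol, both detector classes, and `score_stream` are hypothetical names, and the two simple statistical detectors stand in for the real ML models so the example needs only the standard library.

```python
from statistics import mean, pstdev
from typing import Protocol, Sequence


class Detector(Protocol):
    """Interface any pluggable detector is assumed to satisfy."""

    def fit(self, values: Sequence[float]) -> None: ...

    def is_anomaly(self, value: float) -> bool: ...


class ZScoreDetector:
    """Flags values more than `threshold` standard deviations from the mean."""

    def __init__(self, threshold: float = 3.0) -> None:
        self.threshold = threshold
        self.mu = 0.0
        self.sigma = 1.0

    def fit(self, values: Sequence[float]) -> None:
        self.mu = mean(values)
        self.sigma = pstdev(values) or 1.0  # avoid dividing by zero

    def is_anomaly(self, value: float) -> bool:
        return abs(value - self.mu) / self.sigma > self.threshold


class RangeDetector:
    """Flags values outside the training range widened by a relative margin."""

    def __init__(self, margin: float = 0.5) -> None:
        self.margin = margin
        self.low = 0.0
        self.high = 0.0

    def fit(self, values: Sequence[float]) -> None:
        span = max(values) - min(values)
        self.low = min(values) - self.margin * span
        self.high = max(values) + self.margin * span

    def is_anomaly(self, value: float) -> bool:
        return value < self.low or value > self.high


def score_stream(detector: Detector, values: Sequence[float]) -> list[bool]:
    """Apply any fitted detector to a sequence of incoming values."""
    return [detector.is_anomaly(v) for v in values]


if __name__ == "__main__":
    training = [10, 11, 9, 10, 12, 8, 10]
    for detector in (ZScoreDetector(), RangeDetector()):
        detector.fit(training)
        print(type(detector).__name__, score_stream(detector, [10.5, 50]))
```

Because `score_stream` depends only on the protocol, swapping one detector for another requires no change to the calling code.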