corticalstack/Clustering

🔍 Network Intrusion Detection Clustering

A Python implementation of clustering techniques applied to the KDD Cup 1999 Intrusion Detection System (IDS) dataset, demonstrating both 2D and 3D visualizations with and without Principal Component Analysis (PCA).

📚 Description

This repository contains code for analyzing and visualizing network intrusion detection data using K-means clustering. It demonstrates how different clustering approaches can help identify patterns in network traffic that may indicate various types of attacks. The implementation showcases:

  • Data preprocessing and cleaning techniques
  • Attack categorization and classification
  • Dimensionality reduction using PCA
  • 2D and 3D visualization of clustering results

🧮 Dataset

The project uses the KDD Cup 1999 Intrusion Detection System dataset, which contains a wide variety of simulated intrusions in a military network environment. The dataset includes:

  • Normal connections
  • Four main categories of attacks:
    • Denial of Service (DoS)
    • User to Root (U2R)
    • Remote to Local (R2L)
    • Probing

Each connection in the dataset is represented by 41 features and labeled as either normal or a specific type of attack.
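As a rough illustration of how a specific label can be grouped into the categories listed above, here is a minimal sketch. The helper name `categorize` and the lowercase category keys are hypothetical and not taken from this repository's code. The label-to-category grouping follows the commonly documented KDD Cup 1999 training-set attack types, and the raw labels are assumed to carry a trailing period (for example `smurf.`).

```python
# Hypothetical grouping of raw KDD Cup 1999 labels into broad categories.
# The names below are illustrative and not taken from this repository.
ATTACK_CATEGORIES = {
    "dos": ["back", "land", "neptune", "pod", "smurf", "teardrop"],
    "u2r": ["buffer_overflow", "loadmodule", "perl", "rootkit"],
    "r2l": ["ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"],
    "probe": ["ipsweep", "nmap", "portsweep", "satan"],
}

# Invert the grouping into a flat lookup: raw label -> category.
LABEL_TO_CATEGORY = {
    label: category
    for category, labels in ATTACK_CATEGORIES.items()
    for label in labels
}
LABEL_TO_CATEGORY["normal"] = "normal"


def categorize(raw_label):
    """Map a raw label such as 'smurf.' to one of five categories."""
    # Assumption: raw labels end with a period, which is stripped first.
    label = raw_label.strip().rstrip(".")
    return LABEL_TO_CATEGORY.get(label, "unknown")
```

Labels outside the table fall back to `"unknown"` rather than raising, which may be convenient when a test split contains attack types that are absent from the training data.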

🔧 Prerequisites

To run this code, you'll need:

  • Python 3.x
  • The following Python libraries:
    • pandas
    • numpy
    • scikit-learn
    • matplotlib

🚀 Setup

  1. Clone this repository:
     git clone https://github.com/username/network-intrusion-clustering.git
     cd network-intrusion-clustering
  2. Install the required dependencies:
     pip install pandas numpy scikit-learn matplotlib
  3. Ensure the KDD Cup dataset file (kddcup.data_10_percent) is in the root directory of the project.
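Once the file is in place, it could be loaded along these lines. This is only a sketch: the function name `load_kdd` and the generic column names (`f0` to `f40` plus `label`) are assumptions for illustration, and the script in this repository may name or load the columns differently. It relies on the raw file being comma-separated with no header row.

```python
import pandas as pd

N_FEATURES = 41  # feature count stated in the Dataset section


def load_kdd(path="kddcup.data_10_percent"):
    """Load the raw KDD file into a DataFrame (hypothetical helper)."""
    # The raw file has no header row, so generic column names are
    # assigned here; they are placeholders, not the official names.
    columns = [f"f{i}" for i in range(N_FEATURES)] + ["label"]
    return pd.read_csv(path, header=None, names=columns)
```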

💻 Usage

Run the main script to perform clustering and visualization:

python main.py

This will:

  1. Load and preprocess the KDD Cup dataset
  2. Apply K-means clustering with 5 clusters
  3. Generate four visualization plots:
    • 2D clustering without PCA
    • 2D clustering with PCA
    • 3D clustering without PCA
    • 3D clustering with PCA
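The clustering and plotting steps above can be sketched roughly as follows. This is not the code in main.py: the function names, the synthetic stand-in data, the feature scaling step, and the choice to use the first raw columns for the "without PCA" view are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend: save figures instead of showing them
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def cluster(X, n_clusters=5, seed=0):
    """Scale the features and assign each row to a K-means cluster."""
    X_scaled = StandardScaler().fit_transform(X)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(X_scaled), X_scaled


def project(X_scaled, n_components=2, use_pca=True, seed=0):
    """Reduce to 2 or 3 plotting axes, with or without PCA."""
    if use_pca:
        pca = PCA(n_components=n_components, random_state=seed)
        return pca.fit_transform(X_scaled)
    # Assumption: without PCA, simply plot the first raw (scaled) columns.
    return X_scaled[:, :n_components]


def plot_clusters(points, labels, path):
    """Save a 2D or 3D scatter plot colored by cluster label."""
    fig = plt.figure()
    if points.shape[1] == 3:
        ax = fig.add_subplot(projection="3d")
        ax.scatter(points[:, 0], points[:, 1], points[:, 2], c=labels, s=8)
    else:
        ax = fig.add_subplot()
        ax.scatter(points[:, 0], points[:, 1], c=labels, s=8)
    fig.savefig(path)
    plt.close(fig)


# Synthetic stand-in for the preprocessed KDD feature matrix.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))
labels, X_scaled = cluster(X_demo)
points_2d_pca = project(X_scaled, n_components=2, use_pca=True)
points_3d_raw = project(X_scaled, n_components=3, use_pca=False)
```

Calling `plot_clusters(points_2d_pca, labels, "clusters_2d_pca.png")` would then write one of the four plots to disk; the file name is arbitrary.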

✨ Features

  • Data Preprocessing: Removes duplicates, handles outliers, and encodes categorical features
  • Attack Categorization: Classifies connections into five categories (normal, DoS, U2R, R2L, probe)
  • Flexible Sampling: Supports adjustable dataset sampling for testing and development
  • Multiple Visualization Options: Provides both 2D and 3D visualizations with and without dimensionality reduction
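A minimal sketch of the preprocessing and sampling features might look like the following. The function name `preprocess`, its parameters, and the use of one-hot encoding are assumptions for illustration. Outlier handling is left out because this README does not say which method the code uses.

```python
import pandas as pd


def preprocess(df, categorical_cols, label_col="label",
               sample_frac=1.0, seed=0):
    """Deduplicate, optionally subsample, and encode categorical columns.

    Hypothetical helper: parameter names are illustrative only.
    """
    df = df.drop_duplicates()
    if sample_frac < 1.0:
        # Work on a random fraction of rows for faster experiments.
        df = df.sample(frac=sample_frac, random_state=seed)
    df = df.reset_index(drop=True)
    labels = df[label_col]
    features = df.drop(columns=[label_col])
    # One-hot encode categorical columns so K-means sees only numbers.
    features = pd.get_dummies(features, columns=categorical_cols, dtype=float)
    return features, labels
```

One-hot encoding is used here because K-means relies on distances, and integer codes for categories such as protocol type would imply an ordering that does not exist.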

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Resources
