AI Intrusion Detection System

Hybrid anomaly detection with explainable AI using SHAP and LIME on the CIC IDS 2017 dataset.

PROJECT OVERVIEW

This system combines deep learning (an Autoencoder) with traditional machine learning (One-Class SVM and Random Forest) to detect network intrusions. The hybrid architecture pairs anomaly-based detection with supervised, signature-based classification, and uses SHAP and LIME to explain individual model decisions as predictions are made.
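As a rough illustration of how a hybrid verdict could be formed, the sketch below merges per-model attack scores into one label with a confidence value and a per-model breakdown. The weighted-average rule, the weights, the threshold, and the names `Verdict`, `fuse_scores`, and `WEIGHTS` are assumptions made for this example, not the project's actual fusion logic.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    label: str          # "attack" or "benign"
    confidence: float   # confidence in the chosen label, in [0, 1]
    breakdown: dict     # per-model attack scores, kept for the decision breakdown


def fuse_scores(scores, weights, threshold=0.5):
    """Combine per-model attack scores (each in [0, 1]) by weighted average."""
    total_weight = sum(weights[name] for name in scores)
    attack_score = sum(scores[name] * weights[name] for name in scores) / total_weight
    if attack_score >= threshold:
        return Verdict("attack", attack_score, dict(scores))
    return Verdict("benign", 1.0 - attack_score, dict(scores))


# Hypothetical weights: the supervised model counts as much as both anomaly detectors.
WEIGHTS = {"autoencoder": 0.25, "one_class_svm": 0.25, "random_forest": 0.5}

# Hypothetical scores for a single network flow.
verdict = fuse_scores(
    {"autoencoder": 0.9, "one_class_svm": 0.6, "random_forest": 0.8},
    WEIGHTS,
)
print(verdict.label, round(verdict.confidence, 3), verdict.breakdown)
```

Keeping the per-model scores in the result is what makes a decision breakdown possible: the final label can be shown next to the score each detector contributed.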

FEATURES & IMPLEMENTATION

• Preprocessing pipeline with normalization, label encoding, and feature selection
• Hybrid IDS using Autoencoder, One-Class SVM, and Random Forest
• Real-time prediction with confidence scores and model decision breakdowns
• SHAP and LIME integration for explainable AI visualization
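The normalization and label-encoding steps of the preprocessing pipeline can be sketched in plain Python as follows. This is a simplified stand-in for illustration (roughly what scikit-learn's MinMaxScaler and LabelEncoder do), not the project's actual pipeline; the function names and toy values are hypothetical, and feature selection is left out.

```python
def min_max_normalize(column):
    """Scale a numeric column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in column]
    return [(value - lo) / span for value in column]


def encode_labels(labels):
    """Map each distinct label to an integer, assigned in sorted label order."""
    mapping = {label: index for index, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping


# Hypothetical toy values standing in for one feature column and its labels.
flow_duration = [10.0, 20.0, 40.0]
labels = ["BENIGN", "DDoS", "BENIGN"]

scaled = min_max_normalize(flow_duration)
encoded, mapping = encode_labels(labels)
print(scaled, encoded, mapping)
```

In practice the scaling range and the label mapping would be fitted on the training split only and reused at prediction time, so that new traffic is transformed the same way as the data the models were trained on.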

TECHNOLOGIES USED

Python, Pandas, NumPy, Scikit-learn, TensorFlow, Keras, SHAP, LIME, Matplotlib, CIC IDS 2017 dataset

CHALLENGES

• Managing highly imbalanced datasets
• Balancing sensitivity and specificity across models
• Making model outputs understandable for non-technical users
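Two of these challenges can be made concrete with a small sketch: weighting classes inversely to their frequency, and measuring sensitivity and specificity separately instead of relying on accuracy. The weighting formula is the same one scikit-learn uses for class_weight="balanced"; the function names and toy predictions are hypothetical and are not results from this project.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def sensitivity_specificity(y_true, y_pred, positive="attack"):
    """Return (true positive rate, true negative rate) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Hypothetical toy data: 90 benign flows, 10 attacks.
y_true = ["benign"] * 90 + ["attack"] * 10
# Hypothetical predictions: 3 false alarms, 6 of 10 attacks caught.
y_pred = ["benign"] * 87 + ["attack"] * 3 + ["attack"] * 6 + ["benign"] * 4

weights = balanced_class_weights(y_true)
sens, spec = sensitivity_specificity(y_true, y_pred)
print(weights, round(sens, 3), round(spec, 3))
```

On this toy data accuracy would be 93%, yet 4 of the 10 attacks are missed. That gap is why the two rates are tracked separately when attacks are rare.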

LEARNINGS & IMPACT

This project sharpened my skills in designing interpretable machine learning systems. I learned how to build trustworthy models that are transparent, scalable, and effective for security-critical applications.

SCREENSHOTS & DIAGRAMS

Figure 1. General structure of IDS types, showing where the hybrid model fits.
Figure 2. Confusion matrix, Autoencoder + One-Class SVM: significant false negatives on imbalanced data.
Figure 3. Confusion matrix, Random Forest: strong classification on supervised, signature-based data.
Figure 4. Confusion matrix, Hybrid Model: excellent performance, combining the strengths of both approaches.
Figure 5. LIME output showing a local explanation of a single anomaly prediction.
Figure 6. SHAP summary plot highlighting feature importance and impact on predictions.