AI Intrusion Detection System

Hybrid anomaly detection with explainable AI using SHAP and LIME on the CIC IDS 2017 dataset.

PROJECT OVERVIEW

This system combines deep learning and traditional machine learning to detect network intrusions with high accuracy. The hybrid architecture pairs anomaly-based detection (Autoencoder and One-Class SVM) with signature-based detection (Random Forest), and uses SHAP and LIME to explain individual model decisions at prediction time.

FEATURES & IMPLEMENTATION

• Preprocessing pipeline with normalization, label encoding, and feature selection
• Hybrid IDS using Autoencoder, One-Class SVM, and Random Forest
• Real-time prediction with confidence scores and per-model decision breakdowns
• SHAP and LIME integration for explainable AI visualization
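
A minimal sketch of how the preprocessing and hybrid prediction steps above could fit together. It is an illustration under stated assumptions, not the project's actual code: the data is synthetic (not CIC IDS 2017), the Autoencoder branch is left out to keep the example light, and the OR-style fusion rule, the 0.5 threshold, and the name predict_hybrid are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for network flow features (NOT CIC IDS 2017 data).
benign = rng.normal(0.0, 1.0, size=(300, 4))
attack = rng.normal(4.0, 1.0, size=(60, 4))
X_raw = np.vstack([benign, attack])
labels = np.array(["BENIGN"] * 300 + ["ATTACK"] * 60)

# Preprocessing: label encoding and normalization.
encoder = LabelEncoder()
y = encoder.fit_transform(labels)
attack_id = int(encoder.transform(["ATTACK"])[0])
benign_id = int(encoder.transform(["BENIGN"])[0])
scaler = StandardScaler()
X = scaler.fit_transform(X_raw)

# Anomaly-based branch: learns the shape of benign traffic only.
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
ocsvm.fit(X[y == benign_id])

# Signature-based branch: supervised classifier on labeled traffic.
# class_weight="balanced" is one common way to offset class imbalance.
forest = RandomForestClassifier(
    n_estimators=50, class_weight="balanced", random_state=0
)
forest.fit(X, y)


def predict_hybrid(x_raw, threshold=0.5):
    """Score one flow and report each branch's decision (hypothetical fusion rule)."""
    x = scaler.transform(np.asarray(x_raw, dtype=float).reshape(1, -1))
    # OneClassSVM.predict returns -1 for outliers and +1 for inliers.
    is_anomaly = bool(ocsvm.predict(x)[0] == -1)
    attack_col = list(forest.classes_).index(attack_id)
    attack_proba = float(forest.predict_proba(x)[0, attack_col])
    # Assumed rule: raise an alert if either branch objects.
    is_attack = is_anomaly or attack_proba >= threshold
    return {
        "label": "ATTACK" if is_attack else "BENIGN",
        "anomaly_flag": is_anomaly,
        "attack_probability": attack_proba,
    }


print(predict_hybrid([4.0, 4.0, 4.0, 4.0]))
print(predict_hybrid([0.0, 0.0, 0.0, 0.0]))
```

Returning the per-branch values alongside the final label is what makes a decision breakdown possible: a reader can see whether an alert came from the anomaly detector, the classifier, or both.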

TECHNOLOGIES USED

Python, Pandas, NumPy, Scikit-learn, TensorFlow, Keras, SHAP, LIME, Matplotlib, CIC IDS 2017 dataset

CHALLENGES

• Managing highly imbalanced datasets
• Balancing sensitivity and specificity across models
• Making model outputs understandable for non-technical users
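
The first two challenges can be made concrete with a small sketch. It shows the standard "balanced" class-weight heuristic (n_samples / (n_classes * class_count)) and how sensitivity and specificity are computed from predictions. The label names, counts, and predictions are made-up illustration values, not results from this project, and the function names are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def sensitivity_specificity(y_true, y_pred, positive="ATTACK"):
    """Return (sensitivity, specificity), treating `positive` as the attack class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1  # missed attack
        else:
            if pred == positive:
                fp += 1  # false alarm
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Made-up toy data: 90 benign flows and 10 attacks.
y_true = ["BENIGN"] * 90 + ["ATTACK"] * 10
# Made-up predictions: 2 false alarms, 6 of 10 attacks caught.
y_pred = ["ATTACK"] * 2 + ["BENIGN"] * 88 + ["ATTACK"] * 6 + ["BENIGN"] * 4

weights = balanced_class_weights(y_true)
sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
print(weights)
print(round(sensitivity, 3), round(specificity, 3))
```

With data this skewed, a model can score high specificity while still missing many attacks, which is why both numbers need to be tracked per model rather than accuracy alone.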

LEARNINGS & IMPACT

This project sharpened my skills in designing interpretable machine learning systems. I learned how to build trustworthy models that are transparent, scalable, and effective for security-critical applications.

SCREENSHOTS & DIAGRAMS

General structure of IDS types and hybrid integration

Figure 1. Diagram outlining IDS types and hybrid model placement

Confusion matrix: Autoencoder + One-Class SVM

Figure 2. Autoencoder + One-Class SVM: Significant false negatives with imbalanced data

Figure 3. Random Forest: Strong classification on supervised signature-based data

Figure 4. Hybrid Model: Excellent performance, combining the strengths of the anomaly-based and signature-based models

LIME explanation for a single prediction

Figure 5. LIME output showing local explanation of anomaly prediction

SHAP summary plot for important features

Figure 6. SHAP summary plot highlighting feature importance and impact on predictions