Abstract
Intrusion detection systems (IDSs) must balance detection quality with operational transparency. We present a deterministic, leakage-free comparison of three classical classifiers: Naïve Bayes (NB), Logistic Regression (LR), and Linear Discriminant Analysis (LDA). We also propose a hybrid pipeline (AE+LR) that trains LR on autoencoder (AE) embeddings. Experiments use NSL-KDD and CICIDS2017 under two regimes: with and without the Synthetic Minority Oversampling Technique (SMOTE), which is applied only to the training data. All preprocessing (one-hot encoding, scaling, and imputation) is fitted on the training split; fixed seeds and deterministic TensorFlow settings ensure exact reproducibility. We report a complete metric set: Accuracy, Precision, Recall, F1, Area Under the Curve (AUC), and False Alarm Rate (FAR). We also release a replication package (code, preprocessing artifacts, and saved prediction scores) that regenerates all reported tables and metrics. On NSL-KDD, AE+LR yields the highest AUC (≈0.904) and the strongest F1 among the evaluated models (e.g., 0.7583 with SMOTE), while LDA slightly outperforms LR on Accuracy and F1. NB attains very high Precision (≈0.98) but low Recall (≈0.24), which gives it the weakest F1 but also a low FAR, owing to its conservative decisions. On CICIDS2017, LR delivers the best Accuracy and F1 (0.9878 and 0.9752 without SMOTE), with AE+LR close behind; both approach ceiling AUC (≈0.996). SMOTE provides modest gains on NSL-KDD and limited benefits on CICIDS2017. Overall, LR and LDA remain strong, interpretable baselines, while AE+LR improves separability (AUC) without sacrificing a simple, auditable decision layer for practical IDS deployment.
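To make the threshold-based metrics named above concrete, the following is a minimal sketch of how they can be derived from a binary confusion matrix. It is not taken from the replication package: the function name `binary_ids_metrics`, the label convention (1 = attack, 0 = benign), and the toy labels are assumptions made for illustration, and FAR is taken in its usual sense of the false positive rate, FP / (FP + TN). AUC is omitted because it requires continuous scores rather than hard predictions.

```python
def binary_ids_metrics(y_true, y_pred):
    """Confusion-matrix metrics for a binary IDS.

    Assumed label convention (hypothetical): 1 = attack, 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        # False Alarm Rate: share of benign traffic flagged as an attack.
        "far": ratio(fp, fp + tn),
    }


# Toy data (invented): a conservative detector that flags one of four attacks
# and never raises an alarm on benign traffic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = binary_ids_metrics(y_true, y_pred)
```

On this toy input the detector shows the same qualitative pattern the abstract reports for NB: Precision is high and FAR is low because alarms are rare, while Recall and therefore F1 are low because most attacks are missed.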
| Original language | English |
|---|---|
| Article number | 749 |
| Journal | Algorithms |
| Volume | 18 |
| Issue number | 12 |
| State | Published - Dec 2025 |
Bibliographical note
Publisher Copyright: © 2025 by the authors.
Keywords
- AUC
- autoencoder
- CICIDS2017
- false alarm rate
- intrusion detection system (IDS)
- Linear Discriminant Analysis
- Logistic Regression
- Naïve Bayes
- NSL-KDD
- SMOTE