Machine Learning Approach Based on Synthetic Minority Over-Sampling Technique and Isolation Forest for Insider Threat Detection
Keywords:
Imbalanced data, Insider threat detection, Machine Learning (ML), Synthetic minority over-sampling technique
Abstract
Detecting insider threats is challenging because
insiders' deep familiarity with networks and
security protocols allows them to bypass
traditional security measures. While various
methods combat insider threats, creating
effective detection systems remains difficult.
Prior research advocates Machine Learning
(ML) techniques, but class imbalance in the
datasets reduces their accuracy. To tackle this,
this paper presents "SMOTE-IForest," which
combines the Synthetic Minority Over-sampling
Technique (SMOTE) with Isolation Forest
(IForest) for insider threat
detection. Testing on the CERT r6.2 dataset
achieved 80.0% accuracy in classifying user
behaviour. The model also reached a 63.4%
detection rate with a 67.0% false positive rate,
an AUC of 96.0%, 93.30% precision, and an
88.80% F-measure. This model targets the
accuracy, detection rate, and false positive
rate limitations of existing approaches.
SMOTE improves dataset
balance by creating synthetic samples from
the minority class, which enhances
classification accuracy. IForest isolates
anomalies directly and handles
high-dimensional data efficiently without
complex tuning, making it well suited to
insider threat detection. The "SMOTE-IForest"
model strengthens insider threat
detection systems by overcoming dataset
imbalance and enhancing accuracy. Its
precision and F-measure indicate that it
distinguishes between normal and anomalous
behaviour, helping to address the
shortcomings of existing studies in accuracy,
detection rate, and false positive rate.
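
The two building blocks named above can be illustrated with a minimal sketch. The abstract does not state how SMOTE and IForest are combined or which implementation was used, so the code below only shows each idea in isolation, using the Python standard library. All function names, parameters (k, n_trees, sample_size), and the toy data are assumptions made for illustration and should not be read as the paper's method.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (needs >= 2 samples)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Isolation tree: split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:  # cannot split further on this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    """Depth at which `point` lands, adjusted for unsplit leaf contents."""
    if "feature" not in tree:
        return depth + c(tree["size"])
    side = "left" if point[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[side], point, depth + 1)


def anomaly_scores(train, queries, n_trees=100, sample_size=64, seed=0):
    """Scores in (0, 1]; higher means easier to isolate (more anomalous).
    Assumes len(train) >= 2."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(train))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(train, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    scores = []
    for q in queries:
        mean_depth = sum(path_length(t, q) for t in trees) / n_trees
        scores.append(2 ** (-mean_depth / c(sample_size)))
    return scores


# Toy data (hypothetical): a dense "normal" cluster and a few rare points.
data_rng = random.Random(42)
normal = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
rare = [[8.0, 8.0], [8.5, 7.5], [7.5, 8.2], [8.1, 8.4]]

extra_rare = smote(rare, n_new=20, k=2)  # oversample the minority class
scores = anomaly_scores(normal + rare, [[8.0, 8.0], [0.0, 0.0]])
```

In this sketch, each SMOTE sample lies on the line segment between two existing minority points, and the isolation score rises for points that random splits separate after only a few cuts. In practice, maintained implementations such as those in imbalanced-learn and scikit-learn would normally be preferred over hand-written versions.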