This project focuses on building a machine learning classification model to enhance the efficiency of Security Operation Centers (SOCs). Using the comprehensive GUIDE dataset, the model predicts the triage grade of cybersecurity incidents (True Positive, Benign Positive, or False Positive).

Avijit-Jana/Classifying_Cybersecurity_Incidents


👨‍💻Classifying Cybersecurity Incidents👨‍💻

📖Project Description

This is a data science project that aims to enhance the efficiency of Security Operation Centers (SOCs) by developing a machine learning model that accurately predicts the triage grade of cybersecurity incidents. Using the GUIDE dataset, the goal is to build a classification model that categorizes incidents as true positive (TP), benign positive (BP), or false positive (FP) based on historical evidence and customer responses. The model should be robust enough to support guided response systems in giving SOC analysts precise, context-rich recommendations, ultimately improving the overall security posture of enterprise environments.

📁Data Set Overview

The data is organized into three hierarchical levels: (1) evidence, (2) alert, and (3) incident. At the bottom level, evidence supports an alert. For example, an alert may be associated with multiple pieces of evidence such as an IP address, email, and user details, each containing specific supporting metadata. Above that, alerts consolidate multiple pieces of evidence to signify a potential security incident. These alerts provide broader context by aggregating related evidence to present a more comprehensive picture of the potential threat. At the highest level, incidents encompass one or more alerts, representing a cohesive narrative of a security breach or threat scenario.
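The three levels can be pictured as nested records. The following is a minimal sketch for illustration only: the class names, field names, and sample values are assumptions and do not mirror the dataset's actual columns.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Evidence:
    # Hypothetical fields: an entity type (e.g. "Ip", "User") plus supporting metadata.
    entity_type: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Alert:
    # An alert consolidates one or more pieces of evidence.
    alert_id: int
    evidence: List[Evidence] = field(default_factory=list)


@dataclass
class Incident:
    # An incident encompasses one or more alerts.
    incident_id: int
    alerts: List[Alert] = field(default_factory=list)

    def evidence_count(self) -> int:
        """Total number of evidence items across all alerts of this incident."""
        return sum(len(alert.evidence) for alert in self.alerts)


# Illustrative values only (203.0.113.7 is a documentation-range IP address).
incident = Incident(
    incident_id=1,
    alerts=[
        Alert(alert_id=10, evidence=[Evidence("Ip", {"address": "203.0.113.7"}), Evidence("User")]),
        Alert(alert_id=11, evidence=[Evidence("Email")]),
    ],
)
print(incident.evidence_count())  # prints 3
```

In the dataset itself this hierarchy is flattened into rows, so a single incident typically spans several evidence and alert rows.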

The dataset is already divided into two parts: a train set containing 70% of the data and a test set containing the remaining 30%. It provides 45 features, labels, and unique identifiers across 1M triage-annotated incidents. The split is stratified by triage grade ground truth, OrgId, and DetectorId. All rows belonging to an incident are kept together in either the train set or the test set, so that the evidence and alert rows stay relevant to their incident.

  • You can download the datasets here: Datasets
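The dataset ships pre-split, so no splitting is needed to use it. Still, the idea of keeping every row of an incident on the same side of the split can be sketched as below. This is a simplified illustration, not the procedure used to build the dataset: the `IncidentId` key and the `split_by_incident` helper are assumptions, and the sketch does not stratify by triage grade, OrgId, or DetectorId.

```python
import random
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def split_by_incident(
    rows: List[Row], test_fraction: float = 0.3, seed: int = 0
) -> Tuple[List[Row], List[Row]]:
    """Split rows so that all rows sharing an IncidentId land in the same set."""
    # Work at the incident level, not the row level.
    incident_ids = sorted({row["IncidentId"] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(incident_ids)

    n_test = round(len(incident_ids) * test_fraction)
    test_ids = set(incident_ids[:n_test])

    train = [row for row in rows if row["IncidentId"] not in test_ids]
    test = [row for row in rows if row["IncidentId"] in test_ids]
    return train, test


# Toy data: 10 incidents with 3 rows each (hypothetical values).
rows = [{"IncidentId": i, "RowIndex": j} for i in range(10) for j in range(3)]
train_rows, test_rows = split_by_incident(rows)

train_ids = {row["IncidentId"] for row in train_rows}
test_ids = {row["IncidentId"] for row in test_rows}
print(len(train_ids), len(test_ids), train_ids.isdisjoint(test_ids))
```

Splitting on incident identifiers rather than on individual rows avoids leaking evidence from one incident into both sets.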

🚩Approach

Developed By - Avijit Jana
