Analysis and Classification for Threat Detection

As part of the analysis and detection process, the Cybereason platform’s CMC Engine combines artificial intelligence, machine learning models, and behavioral detection rules to determine, with a high level of accuracy, whether activity in your organization’s environment and network is malicious.

An event’s final “classification” involves a combination of sources:

  • Behavioral analysis and comparison of the behavior against platform detection rules

  • Enrichment from threat intelligence sources

  • Platform classification

  • Endpoint classification when possible

This section provides some details on classification techniques.

Behavioral analysis

The Cybereason platform takes a behavioral approach to threat detection. In addition to analyzing indicators of compromise (IOCs) such as URLs and hash values, the Cybereason platform identifies and analyzes tactics, techniques, and procedures (TTPs). Identifying a threat’s TTPs is more valuable than focusing on IOCs because TTPs take attackers longer to develop and are unlikely to change over the course of an attack.

The Cybereason platform uses a number of malicious activity models, which are predictive models that help the Cybereason platform look for Malops during the life cycle of a hacking campaign. These models are based on continuous research and evaluation of attacker behavior over time.

The Cybereason platform’s malicious activity models include:

  • Malware models: Look for tell-tale signs of known and unknown malware, malicious tools, and zero-day exploits that attackers use to get an initial foothold in your environment.

  • Privilege escalation models: Examine user and process behavior to identify an attacker’s attempt to gain a higher level of access to resources in your environment.

  • Lateral movement models: Identify attackers trying to expand their foothold in your environment by using legitimate tools, a method that traditional security programs cannot detect.

  • Command and control models: Detect network traffic between your environment and an attacker’s command and control servers. These models identify behaviors such as the use of domain generation algorithms (DGAs).

  • Data exfiltration models: Identify the attacker’s attempt to exfiltrate data or cause other types of damage in your environment.

  • Ransomware models: Identify malware that encrypts files in an attempt to extort users.

The platform continues to evaluate and add models as new techniques and attacker behaviors are found.
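To make the DGA behavior mentioned for the command and control models more concrete, the following is a minimal sketch of one common public heuristic: algorithmically generated domain names tend to be long and to have high character entropy. This is not the Cybereason platform’s model. The function names, the threshold, and the minimum length are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Return the Shannon entropy, in bits per character, of a string."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_generated(domain: str, threshold: float = 3.5, min_length: int = 10) -> bool:
    """Flag a domain whose leftmost label is both long and high in entropy.

    The threshold and minimum length are illustrative assumptions, not tuned values.
    """
    label = domain.lower().split(".")[0]
    return len(label) >= min_length and shannon_entropy(label) >= threshold


if __name__ == "__main__":
    for name in ("example.com", "xk2j9qpl7vbz4mtw.net"):
        print(name, looks_generated(name))
```

A heuristic like this produces false positives on legitimate random-looking hostnames (for example, content delivery network names), which is one reason a behavioral model would combine it with other signals rather than rely on it alone.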

Once the CMC Engine analyzes the behavior, the engine compares the behavior patterns found against the platform’s proprietary detection rules to generate evidence, suspicions, and Malops that you can investigate and remediate.
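The platform’s detection rules are proprietary, so the following is only a conceptual sketch of what comparing observed behavior against rules can look like. The `Rule` type, the rule names, the event fields, and the two finding levels used here are all hypothetical. The sketch stops at labeling findings and does not attempt to model how a Malop is generated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    level: str  # "evidence" or "suspicion"; hypothetical labels for this sketch
    matches: Callable[[Dict], bool]


# Hypothetical rules; real detection rules are far richer than single predicates.
RULES = [
    Rule("unsigned_binary", "evidence", lambda e: not e.get("signed", True)),
    Rule(
        "office_spawns_shell",
        "suspicion",
        lambda e: e.get("parent") == "winword.exe" and e.get("name") == "powershell.exe",
    ),
]


def evaluate(event: Dict, rules: List[Rule] = RULES) -> List[Tuple[str, str]]:
    """Return a (level, rule name) pair for every rule the event matches."""
    return [(rule.level, rule.name) for rule in rules if rule.matches(event)]


if __name__ == "__main__":
    event = {"name": "powershell.exe", "parent": "winword.exe", "signed": True}
    print(evaluate(event))
```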

Threat intelligence enrichment

For events found throughout your organization’s environment, the Cybereason platform must determine a number of things, including (but not limited to):

  1. Is the process/file/module/service associated with this event malicious or benign?

  2. If the event is malicious, is it a known item (such as known malware or a known malicious framework)?

  3. What else is this event/Element doing?

How does the Cybereason platform reach these conclusions?

For items that are known, such as known malware or known malicious files, a key element in the Cybereason platform’s decision is the real-time querying of global threat intelligence sources. The Cybereason Global Threat Intel server receives reports from sources such as VirusTotal. These reports are useful because many security vendors contribute to them, so they represent an aggregation of intelligence from a wide range of sources. If a majority of vendors determine a file to be malicious, it most likely is, and security products must make use of this information.

However, some reports might contain conflicting information, and different vendors may disagree.
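As a simple illustration of the majority rule and of why conflicting reports are a problem, the sketch below takes a strict-majority vote over vendor verdicts and returns an inconclusive result when no label wins more than half of the votes. The function name, the input shape (a mapping from vendor name to verdict label), and the `inconclusive` label are assumptions for this example, not the format of any real threat intelligence report.

```python
from collections import Counter
from typing import Dict


def majority_verdict(verdicts: Dict[str, str]) -> str:
    """Return the label held by a strict majority of vendors, else 'inconclusive'."""
    if not verdicts:
        return "inconclusive"
    label, votes = Counter(verdicts.values()).most_common(1)[0]
    return label if votes > len(verdicts) / 2 else "inconclusive"


if __name__ == "__main__":
    agreed = {"vendor_a": "malicious", "vendor_b": "malicious", "vendor_c": "benign"}
    split = {"vendor_a": "malicious", "vendor_b": "benign"}
    print(majority_verdict(agreed))
    print(majority_verdict(split))
```

A plain vote treats every vendor as equally reliable and cannot resolve a split, which is the gap a more nuanced classification approach has to fill.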

Platform classification through artificial intelligence

In addition to information from threat intelligence sources, the Cybereason platform uses a statistical machine learning classification algorithm that is trained over large sets of data and analyst feedback. The algorithm gives scores to the information contained in threat intel reports, enabling the Global Threat Intel server to reach a confident classification, even when vendors disagree.

The algorithm includes a set of complex calculations that:

  • Assess the accuracy of previous classifications

  • Address the strengths and weaknesses of specific vendors

  • Perform sophisticated analysis of text tokens within the report in order to understand the meaning behind each vendor’s verdict string. For example, if the report text includes the string ‘Bad Rabbit’, this increases the likelihood that the file will be classified as ransomware.

  • Use additional metadata from intel reports apart from vendor verdicts

The result is an intelligent classification into a number of categories, such as:

  • Malware

  • Hack tool

  • Ransomware

  • Unwanted (PUP)

  • Indifferent

This classification model provides a more detailed and nuanced description of malicious events to reflect real-world complexity, as opposed to a simple ‘Good/Bad’ identification.
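The general idea of weighting vendors and reading tokens in verdict strings can be sketched as follows. This is a deliberately simplified stand-in for the platform’s trained algorithm, not a description of it: the vendor weights, the token-to-category table, the default weight, and the fallback to `indifferent` when no token matches are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical per-vendor reliability weights (a real system would learn these).
VENDOR_WEIGHT = {"vendor_a": 0.9, "vendor_b": 0.5, "vendor_c": 0.3}
DEFAULT_WEIGHT = 0.5

# Hypothetical mapping from verdict-string tokens to categories.
TOKEN_HINTS = {
    "bad rabbit": "ransomware",
    "ransom": "ransomware",
    "mimikatz": "hack tool",
    "adware": "unwanted",
    "trojan": "malware",
}


def classify(report: Dict[str, str]) -> str:
    """Score categories by summing the weights of the vendors whose verdict text hints at them."""
    scores = defaultdict(float)
    for vendor, verdict in report.items():
        weight = VENDOR_WEIGHT.get(vendor, DEFAULT_WEIGHT)
        text = verdict.lower()
        # A set ensures one vendor contributes at most once per category.
        categories = {cat for token, cat in TOKEN_HINTS.items() if token in text}
        for category in categories:
            scores[category] += weight
    if not scores:
        return "indifferent"  # assumed fallback when no verdict carries a known token
    return max(scores, key=scores.get)


if __name__ == "__main__":
    report = {
        "vendor_a": "Ransom.Win32.Bad Rabbit",
        "vendor_b": "Trojan.Generic",
        "vendor_c": "Trojan.Agent",
    }
    print(classify(report))
```

In this example two lower-weight vendors report a generic trojan, but the single higher-weight vendor’s ransomware verdict scores higher, so the weighted result differs from what a simple majority vote would return.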

The accuracy of the classification improves over time, and the algorithm is updated on a periodic basis based on new data. In some cases, this means that the Cybereason platform retroactively re-classifies a process that ran previously, based on new data.

Classification on the endpoint

On an endpoint machine, the Cybereason platform’s NGAV component uses artificial intelligence to classify file hashes as malicious or benign. This machine learning algorithm analyzes file properties and additional metadata to determine the likelihood that the file is a new, unknown type of malware that has not yet been discovered in global threat intel sources.

The NGAV component’s AI model is trained using the results of the machine learning algorithm that classifies threat intel reports. In this way, the platform’s threat intelligence and NGAV components work hand in hand, using AI to prevent malware before it can execute.