Employing machine learning in a security environment

(Image credit: Geralt / Pixabay)

No matter where you look in the security world today, you’ll see the terms machine learning and artificial intelligence (AI). There’s been a great deal of interest in these technologies as organisations look for better ways to improve their security posture and fight against advancing cyberattacks. Both machine learning and AI are offering breakthroughs in solving problems in many other areas of our lives, so it’s only natural to try to use them to make similar breakthroughs in the field of security. Unfortunately, there’s a lot of hype and misinformation surrounding what machine learning and AI can do to improve security. 

Machine learning has been around for decades, but until recently, it wasn’t feasible for most organisations for two reasons. Firstly, machine learning needs an incredible amount of computational power in order to apply its algorithms to data and get reliable results quickly. Secondly, machine learning requires vast stores of data to mine. As the cost of storage has fallen, however, building large repositories such as data warehouses or data lakes has become increasingly affordable, and processing power has continued to grow rapidly. These advances in technology have come together to make machine learning practical and accessible.

The need for automation

Machine learning and AI have subsequently become frequent buzzwords in the security space. Security teams have an urgent need for more automated methods for detecting threats and malicious user behaviour—and this need is driving increased interest in these topics. Automation is vital for overwhelmed security teams. This is because prevention measures are not infallible, and many of today’s detection methods rely on manual investigation and decision making to find advanced threats, malicious user behaviour, and other serious issues. 

Security analysts encounter huge numbers of false positives and negatives. Indeed, the threat surface has increased exponentially due to the expansion of mobile devices, cloud storage, and the Internet of Things, all of which only increase the number of false positives. Security teams are buried in alarm fatigue. They can’t keep up with the activity that needs to be analysed, and they struggle to identify emerging threats and focus on the real ones.

Improving detection means improving accuracy and efficiency, and that requires figuring out how to make detection technologies smarter. That’s where AI and machine learning come in. Machine learning is far better than humans at recognising and predicting certain types of patterns. Security technologies can use machine learning to identify patterns in their data, enabling them to make decisions and to help humans make decisions faster and more accurately. With machine learning, security technologies can also move beyond rules-based approaches that require prior knowledge of known patterns. For example, machine learning can learn the typical patterns of activity within a network environment and recognise deviations from them. These deviations may indicate threats, which allows those threats to be identified earlier in the Cyber Attack Lifecycle.
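To make the idea of learning typical activity and recognising deviations concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular security product works: it uses a simple z-score against a historical baseline in place of a trained model, and the function name, threshold, and connection counts are all hypothetical.

```python
from statistics import mean, stdev

def find_deviations(baseline, observed, threshold=3.0):
    """Return (index, value, z_score) for observed values far from the baseline.

    baseline: historical measurements assumed to represent normal activity
    observed: new measurements to check against that baseline
    threshold: how many standard deviations count as a deviation (assumed value)
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value counts as a deviation.
        return [(i, v, float("inf")) for i, v in enumerate(observed) if v != mu]
    deviations = []
    for i, value in enumerate(observed):
        z = (value - mu) / sigma
        if abs(z) >= threshold:
            deviations.append((i, value, z))
    return deviations

# Hypothetical hourly outbound connection counts for a single host.
baseline_counts = [48, 52, 50, 47, 53, 49, 51, 50]
todays_counts = [51, 49, 180, 50]

flagged = find_deviations(baseline_counts, todays_counts)
for index, value, z in flagged:
    print(f"hour {index}: {value} connections (z-score {z:.1f})")
```

Only the third hour is flagged here. Note that the sketch says nothing about whether that spike is malicious; it only reports that the activity departs from the learned baseline.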

It's worth noting though that the effectiveness of machine learning relies on having access to large sets of high-quality, rich, structured data capturing network activities across numerous endpoints. The old phrase “garbage in/garbage out” perfectly explains this situation. If machine learning algorithms ingest data sets that aren’t accurate, clear, well-organised, and comprehensive, they’re not going to produce the desired results. In other words, just because there are machine learning algorithms in place doesn’t necessarily mean what they learn is intelligent and useful. If you teach the algorithms the wrong lessons, they’re going to deliver the wrong answers.

The hype and the reality

In a perfect world, machine learning would be the silver bullet for defeating your organisation’s security challenges. It would enable full automation of security operations, eliminating the need for human involvement. It would learn what every user, system, and application does in incredible detail, enabling immediate identification and handling of user impersonation, malicious intent, and other issues. However, this isn’t realistic.

Applying AI to security via machine learning is frequently presented as an easy solution. It’s not. Contrary to many claims, no product can deliver a silver bullet for this today. It will take considerable time and advancement to achieve this effectively. Consider the similarities between a Security Operations Centre (SOC) that identifies and responds to security incidents and a fraud department that uses fraud analytics techniques to identify and respond to credit card misuse. Even though analysis and identification may be automated, humans are still needed to respond and recover (e.g., deciding an issue is a false positive, communicating with the affected people, and coordinating actions with other organisations). Today’s security products cannot fully automate the SOC and completely eliminate the need for security analysts, incident responders, and other SOC staff. 

While it is not a silver bullet, there is a tremendous amount of value in applying machine learning to security challenges. Achieving AI would significantly reduce the mundane work performed by highly skilled and highly paid people. It would also make incident response much faster and more effective, efficient, and accurate. However, instead of striving for the unrealistic goal of having AI today, we need to make incremental progress. For instance, applying machine learning pattern recognition to automatically link a threat model from six weeks ago to a similar one today is a realistic goal.

Today, machine learning is most helpful in threat detection by learning the patterns of normal activities and recognising anomalies: the introduction or prediction of a new pattern, a change in an existing pattern, or the removal of a pattern. Given the sheer volume of activities occurring in today’s systems and applications, machine learning’s pattern recognition and predictive capabilities have become incredibly important. 
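The three kinds of anomaly named above (a new pattern, a changed pattern, a removed pattern) can be sketched as a comparison between a learned baseline and current activity. This is a simplified illustration rather than a real detection pipeline: the patterns are hypothetical (user, resource) pairs with daily access counts, the `change_ratio` cut-off is an assumed value, and the predictive side of pattern recognition is not covered.

```python
def compare_patterns(baseline, current, change_ratio=2.0):
    """Classify differences between baseline and current activity patterns.

    baseline, current: dicts mapping a (user, resource) pair to a daily access count
    change_ratio: factor by which a count must grow or shrink to count as changed
                  (an assumed value for illustration)
    """
    new = sorted(k for k in current if k not in baseline)
    removed = sorted(k for k in baseline if k not in current)
    changed = sorted(
        k for k in baseline
        if k in current and (
            current[k] >= baseline[k] * change_ratio
            or current[k] * change_ratio <= baseline[k]
        )
    )
    return {"new": new, "changed": changed, "removed": removed}

# Hypothetical learned baseline and today's observed activity.
baseline = {
    ("alice", "mail"): 40,
    ("alice", "crm"): 10,
    ("bob", "build"): 25,
}
current = {
    ("alice", "mail"): 42,
    ("alice", "crm"): 55,
    ("carol", "hr-db"): 3,
}

report = compare_patterns(baseline, current)
for kind, patterns in report.items():
    print(kind, patterns)
```

In this example, carol's access to the HR database is a new pattern, alice's CRM usage has changed, and bob's build activity has disappeared.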

There is a shortcoming to machine learning, however. Alone, it lacks the understanding of security context to recognise the importance or unimportance of each anomaly. Machine learning can identify that a user is acting in an atypical manner, but atypical behaviour is not necessarily good or bad. For example, a user connecting to a server for the first time might be an anomaly, but is it a malicious act? 

In business analytics and other fields, machine learning works well on its own because it looks at anomaly-free data and needs no additional context to predict trends. In the security field, there are many benign anomalies, so the ability to identify anomalies, while important, can’t possibly provide the whole explanation of what’s happened and enable accurate predictions of what will happen. It’s best to think of machine learning’s anomaly recognition capabilities as one of the tools in your toolbox. Imagine that 80–85 percent of threats are known or recognisable by your security information and event management (SIEM) platform. If this is true, 15–20 percent of threats are unknown, and therefore unrecognisable, by your SIEM. This is where machine learning comes in. You need the right tool for the right job, and this is Next Generation SIEM. 

To effectively detect threats, you need to employ the correct algorithm for that threat type. The rest of your tools provide the security context and relevancy. A NextGen SIEM solution can integrate and correlate information from many tools, such as human resources (HR) systems, identity management solutions, vulnerability scanners, and asset management systems. When used together, machine learning and the other tools generate the risk information needed to prioritise human actions. Without prioritisation, there are so many anomalies that it’s impossible to examine them all and find the truly important ones.
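As a rough sketch of how anomaly output and security context might be combined to prioritise human attention, the Python example below multiplies an anomaly score by contextual factors. Everything in it is an assumption made for illustration: the `Anomaly` fields, the scoring formula, the privilege weighting, and the sample data are hypothetical and do not describe any specific SIEM product.

```python
from dataclasses import dataclass

@dataclass
class Anomaly:
    description: str
    anomaly_score: float      # 0..1, how unusual the activity is (from the ML model)
    asset_criticality: float  # 0..1, context from an asset management system
    user_privileged: bool     # context from an identity management solution

def risk_score(anomaly):
    """Combine anomaly strength with context into a 0..1 risk score (assumed formula)."""
    privilege_factor = 1.5 if anomaly.user_privileged else 1.0
    raw = anomaly.anomaly_score * anomaly.asset_criticality * privilege_factor
    return min(1.0, raw)

def prioritise(anomalies):
    """Return anomalies ordered from highest to lowest risk."""
    return sorted(anomalies, key=risk_score, reverse=True)

# Hypothetical anomalies raised during one day.
anomalies = [
    Anomaly("first-time login to test server", 0.9, 0.1, False),
    Anomaly("bulk download from finance DB", 0.7, 0.9, True),
    Anomaly("off-hours VPN session", 0.6, 0.5, False),
]

ranked = prioritise(anomalies)
for a in ranked:
    print(f"{risk_score(a):.2f}  {a.description}")
```

The most unusual event (the first-time login) ends up at the bottom of the list because the asset involved is low risk, which is the point of adding context: the strength of an anomaly alone does not determine its priority.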

Applying machine learning to User and Entity Behaviour Analytics

Threat prediction and detection is a critical area of security that can benefit from machine learning. Consider the challenges in performing user and entity behaviour analytics (UEBA). Gartner defines UEBA as “profiling and anomaly detection based on a range of analytics approaches, usually using a combination of basic analytics methods (e.g., rules that leverage signatures, pattern matching and simple statistics) and advanced analytics (e.g., supervised and unsupervised machine learning)”.

UEBA is a perfect application for machine learning as long as the necessary security context is available for understanding the significance of each anomaly. Machine learning can make UEBA considerably more effective for the following reasons: 

  • It can handle the volumes of data to be analysed and the environment to be understood. This includes being able to incorporate many types of data sets, from network traffic patterns and application data to records of user authentication attempts and user access to sensitive data
  • Machine learning-driven UEBA is well suited for identifying “qualified” threats—those that are legitimate and require action. It can take many more factors into consideration than humans can when looking at potential threats, and it can do so in near real time
  • Machine learning-driven UEBA can identify the threats that are hardest to find, such as insider threats, privileged account takeovers, and unknown threats, by recognising shifts in behaviour
  • An organisation can use machine learning to dynamically identify asset risks, and then leverage that risk information to identify new activity that conflicts with expected patterns, such as a low-risk user suddenly connecting to a high-risk system and transferring large amounts of data from it to a laptop

Machine learning offers a great deal of promise in improving security by greatly reducing human effort and lowering the time to detect, respond to, and recover from incidents. When used effectively, machine learning can help organisations detect hidden threats, minimise false positives, accelerate incident response, and streamline SOC operations to reduce mean time to detect and respond to threats. The technology essentially enables security teams and technology to be better, smarter, and faster by putting advanced analytics at their fingertips to solve real problems quickly, such as detecting user-based threats through UEBA.

Ross Brewer, Vice President and Managing Director of EMEA at LogRhythm

Ross Brewer
Ross Brewer has more than 20 years of experience in the information security sector. At LogRhythm, he leads the EMEA team, where he helps deliver consistent, rapid growth in the region.