We’ve been losing the war on cybercrime for some time. Research firm Forrester reports over a billion accounts stolen in 2016 alone, and data breaches are going up, not down. Security teams are wading through ever more incident data, and they cannot keep up. Could machine learning help solve the problem?
For years, researchers hoped that artificial intelligence would produce human-like machines. Now, they focus on a subset of AI that can solve more realistic and useful challenges. Machine learning cannot do everything a human can, but it doesn’t have to. Instead, we can train it to be good at narrowly-defined tasks – even better at them than humans, in some cases.
These algorithms are good at recognising things using a technique called supervised learning. For example, say you want a computer to recognise pictures of cars. Show a machine learning algorithm a collection of car pictures. These could be sports cars, jeeps or anything else that fits the description. Then show it a set of non-car pictures, such as flowers, dogs, or something tricky like bikes. You have already classified and tagged the pictures, so the algorithm knows when it’s seeing a car and when it isn’t.
The machine learning algorithm uses the pictures to build a statistical model. The model recognises car-like things in images. In this way, it can tag cars in new pictures that it hasn't seen before. The better the data used to train it, the better its guess will be.
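To make the idea concrete, here is a toy sketch of supervised learning in the spirit described above. A real image model learns from pixels; in this simplified illustration each “picture” is a hand-made feature vector, and the features, labels, and nearest-centroid approach are all invented for demonstration, not a production technique.

```python
# Toy supervised learning: a nearest-centroid classifier.
# Each "picture" is a made-up feature vector for illustration:
# [wheel_count, metallic_surface_ratio, size_in_metres]

def centroid(samples):
    """Average each feature across the labelled samples."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]

def train(cars, non_cars):
    """Build the 'statistical model': one centroid per class."""
    return {"car": centroid(cars), "not_car": centroid(non_cars)}

def predict(model, sample):
    """Tag a new, unseen sample with the label of the nearest centroid."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: dist(model[label], sample))

# Pre-classified training data: car pictures vs. flowers, dogs, bikes.
cars     = [[4, 0.9, 4.2], [4, 0.8, 4.8], [3, 0.7, 3.0]]  # incl. a three-wheeler
non_cars = [[0, 0.0, 0.3], [0, 0.1, 0.6], [2, 0.5, 1.8]]  # flower, dog, bike

model = train(cars, non_cars)
print(predict(model, [4, 0.85, 4.5]))  # a new car-like sample -> "car"
```

The point of the sketch is the one made in the text: no explicit “what is a car” rules appear anywhere, and better or more varied training samples would shift the centroids and improve the guesses.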
Machine learning frees programmers from creating rules to describe a car. That’s a good thing, because those rules get complicated. Is a car still a car if it’s red? What about if it’s a different shade of red, or green, or a three-wheeled Reliant Robin? Or a convertible?
These new algorithms can solve other problems that are hard to program explicitly. That's why machine learning is transforming the messy, non-deterministic world of cybersecurity. Some are exploring its use for classifying malware. Others are using it to make sense of the many security alerts that they see every day.
Users: the messiest problem of all
Another area of cybersecurity that could benefit from machine learning is access management: deciding how to grant someone system access under different conditions. Two things make machine learning ideal for this process.
First, if there is one thing that it’s difficult to code explicit rules for, it’s people. They are unpredictable. They access different applications from different locations, and at different times.
Technology makes people more unpredictable because it gives them more choices. It isn’t only C-suite executives who access data and applications from the road these days. Companies are using technology to create a better working environment, enabling lower-level staff to work remotely from different devices.
It’s difficult to create explicit rules for each individual person. A parent in marketing needs to access data from their mobile on the school run. Their behaviour is unlikely to match that of the head of sales, who makes regular trips to Dubai.
Second, the whole process of access authorisation needs revising. Passwords should be long dead, but they are still the most common access mechanism. Security companies help shore things up with two-factor authentication. Attackers respond with new techniques, and the rules keep changing. Biometrics are expensive and hard to use in remotely-accessed systems.
These security flaws fuel one of today's biggest cybersecurity problems: account hijacking. Insecure but valuable account credentials are the Internet’s new currency, and stolen usernames and passwords are ubiquitous on the dark web. Attackers can use them to log in and impersonate a user, bypassing data protections. Encryption doesn’t matter if a user’s stolen account can see the data in plain text.
Machine learning to the rescue
Machine learning is a promising technology to change all that. It adds another layer of defence to identity and access management (IAM) systems. Companies already use IAM to assign different privileges to users’ accounts. This prevents Jim in the mailroom from using the same applications and data as Julie, the CEO. Now, we can prove that the right people are using those accounts.
Machine learning systems can use historical IAM data to populate their statistical models. They can then identify normal access patterns for individual employees. Jim may only ever access his account from the mailroom on the single PC housed down there, between 9am and 5pm. Julie may hop around the world, logging in at different times from different devices.
A machine learning model would understand these patterns. When Jim’s access looks normal, the security controls may be minimal. This lets him do his job with no fuss, while keeping up with the fast-paced daily sorting process.
A machine learning system might know that Tokyo is a regular access location for Julie. It may grant access but impose a mid-level authentication process. But if Julie starts accessing from Ukraine on a device never seen before, it’s better to get her to call in and speak to a supervisor before allowing her in.
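The Jim-and-Julie logic above can be sketched in a few lines. To keep the example self-contained, a simple frequency-based profile stands in for a trained statistical model, and every field name, threshold, and response tier is invented for illustration; a real IAM product would weigh far more signals.

```python
# Hypothetical sketch: score a login attempt against a user's history.
from collections import Counter

def build_profile(history):
    """Learn 'normal' from historical IAM data: how often each
    location and device appears in a user's past logins."""
    return {
        "locations": Counter(e["location"] for e in history),
        "devices":   Counter(e["device"] for e in history),
    }

def risk_score(profile, attempt):
    """0 = looks like normal behaviour; higher = more anomalous."""
    score = 0
    if attempt["location"] not in profile["locations"]:
        score += 2  # never-before-seen location
    if attempt["device"] not in profile["devices"]:
        score += 2  # unrecognised device
    return score

def decide(score):
    if score == 0:
        return "allow"            # Jim at his usual mailroom PC
    if score <= 2:
        return "step-up auth"     # one unusual signal: mid-level check
    return "call supervisor"      # new location AND new device

# Jim only ever logs in from the mailroom PC.
jim = build_profile([{"location": "UK-office", "device": "mailroom-pc"}] * 50)

print(decide(risk_score(jim, {"location": "UK-office", "device": "mailroom-pc"})))
print(decide(risk_score(jim, {"location": "Ukraine", "device": "unknown-laptop"})))
```

Running it, Jim’s normal login is allowed with no fuss, while the never-seen location on a never-seen device escalates to a supervisor call, matching the tiered responses described above.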
It would be hard for IT professionals to program explicit rules like this for every employee. Updating them to reflect changing business conditions would be even harder. By updating its own statistical model, machine learning frees them from that burden and instead creates 'living' security policies that reflect the latest working practices.
A machine learning system also makes decisions in real time. This closes the gap between an account theft and someone noticing that it's gone. That’s not a window you want to leave open.
Let’s not elevate machine learning beyond its limitations. It won’t fix cybercrime. It can help ease some specific pressure points, though, and account theft is one of them. We need every weapon in our arsenal to ensure that the right people are logging in. Why not enlist machine learning’s help?
Barry Scott, CTO, Centrify EMEA