Cyber attackers are maturing. Attacks are getting more complex. This is especially true when it comes to cyberwar, so much so that government-sponsored attacks have been bolstered by research investments that approach military proportions.
Just look at the recent report published by the US State Department, which said that strategies for stopping cyber attacks need to be fundamentally reconsidered in light of complex cyber threats posed by rival states.
In order to detect and stop these attacks, innovation is required.
I say that because anomaly detection based on traditional correlation rules often results in too many false positives and more events than can reasonably be reviewed manually. As security threats are growing faster than security teams and budgets can keep up, there is an urgent need for more and better automation when it comes to anomaly detection.
‘Deep Learning’ is the innovative technology that should bring us back in the game. While machine learning and deep learning neural networks are not new technology, the reduction in cost of storage and computing resources has helped in their rebirth, and in turn they have become highly relevant in our bid to keep networks secure.
But the most important factor for re-examining its use is the availability of huge amounts of data, crowdsourced by the hyper-cloud giants who have pushed the research forward into what we have today. It’s the vast volumes of data that have helped us understand how deep learning can be applied to the scenarios we face today.
Of course, deep learning requires massive amounts of good data. Feeding the model with bad or poisoned data will lead to biased models and ultimately to false negatives. It means we need to train the model in a "clean" environment or need access to huge amounts of scrubbed data that matches our network. So where should security specialists start?
One could synthetically generate data, but by definition this data will be correlated, and correlation will have an adverse effect on performance. Studies in adversarial machine learning are ongoing; their goal is to find better ways to learn in the presence of adversaries and to create models that are more resistant to noise and wrongly labelled data.
For example, look at the poisoning of Tay, the Microsoft Twitter bot that was supposed to behave as a teenage girl but had to be taken offline within 24 hours because a group of Internet trolls turned the cute little Twitter bot into a sex-crazed, Nazi-loving, Trump supporter. It illustrates how far we still have to go.
The role of honeypots
Deep learning systems are also not good at handling changing and dynamic environments. As new applications, devices and protocols are added to our networks, the deep learning system will at some point require retraining.
Every source of data is useful for increasing the prediction quality and probability of generating a correct output. In that sense, honeypots deployed within the network provide a good source of information. All information collected from honeypots can be labelled as “bad” or “attack” as there is nearly no use-case where events generated from honeypots can be considered good behaviour.
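As a rough illustration of that labelling idea, the sketch below tags any event aimed at a honeypot as "attack" and leaves everything else unlabelled. The `Event` fields, the `HONEYPOT_NETS` range and the function names are all hypothetical, chosen only for this example; a real deployment would take them from its own logging pipeline.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical honeypot address range; replace with the addresses of your own decoys.
HONEYPOT_NETS = [ip_network("10.99.0.0/24")]


@dataclass
class Event:
    """A simplified network event (illustrative fields only)."""
    src_ip: str
    dst_ip: str
    dst_port: int


def label_event(event):
    """Return "attack" for traffic aimed at a honeypot, else None (unlabelled)."""
    dst = ip_address(event.dst_ip)
    if any(dst in net for net in HONEYPOT_NETS):
        return "attack"
    return None


def build_training_set(events):
    """Pair each honeypot-bound event with its label; skip events we cannot label."""
    labelled = []
    for event in events:
        label = label_event(event)
        if label is not None:
            labelled.append((event, label))
    return labelled
```

Events that never touch a honeypot are deliberately left unlabelled here rather than marked "good", since the absence of honeypot contact says nothing about whether they are benign.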
So deep learning is basically a black box that we program with data. Nobody fully understands how it comes to a specific answer or solves modern-day problems; all we know is that its decision making is based on a concept of generalisation.
It should not come as a surprise that a small amount of distortion in the input can lead to a different or wrong classification. These are evasion attacks, in which an attacker applies small perturbations to the input in order to cause an incorrect classification.
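To make the idea of an evasion attack concrete, here is a minimal sketch against a toy linear classifier rather than a deep network. It nudges each input feature by a small amount in the direction that lowers the "attack" score, in the spirit of the fast gradient sign method. The weights, bias, sample and `epsilon` are made-up values for illustration only.

```python
def score(weights, bias, x):
    """Linear decision score: positive means the model leans towards "attack"."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights, bias, x):
    return "attack" if score(weights, bias, x) > 0 else "benign"


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def evade(weights, x, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input is
    simply the weight vector, so stepping against sign(weight) is the most
    damaging change under a per-feature budget of epsilon.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Made-up model parameters and a sample the model correctly flags as an attack.
weights = [0.8, -0.5, 0.3]
bias = -0.2
sample = [0.6, 0.4, 0.3]

adversarial = evade(weights, sample, epsilon=0.15)
print(classify(weights, bias, sample))       # attack
print(classify(weights, bias, adversarial))  # benign
```

No feature moves by more than 0.15, yet the verdict flips. Deep networks are not linear, but the same principle applies: many small, well-aimed changes add up to a large change in the output.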
Unfortunately, as with all innovative technologies for automating cyber defence, hackers will find ways to leverage them and abuse them for attacks. There has always been an imbalance between success rates for attack and defence: the defence has to continuously plug all holes and vulnerabilities, while the offence only has to find a single vulnerability or hole to be successful.
No CISO ever got decorated for stopping hundreds of attack attempts, but they will be immediately blamed if a single attempt gets through their defences. The same goes for applications of AI. For the defence, there is zero tolerance for error, while the offence can work with an AI that spits out faulty results most of the time but by luck generates a single good output that results in a breach.
Social engineering takes hold
Big opportunities for hackers also lie in automating social engineering and turning spear-phishing into massive, automated campaigns. We’re talking campaigns that automatically scrape the Internet for personal data and learn from it to produce the ultimate email or text that will trick a person into opening an attachment or clicking a malicious link.
In the area of bot detection and distinguishing between good humans, bad humans, good bots and bad bots, CAPTCHA has been an annoying but effective way to differentiate humans from bots and scripts. That was until 2016, when researchers designed deep learning systems that can solve Google reCAPTCHA with 98% accuracy - far better than most of us humans can solve these things.
So what can we do?
Until we find new ways to make deep learning work with limited amounts of data, or we can make it effective for learning in the presence of adversaries, there will still be the need to find huge sources of non-synthetic data matching our unique network posture. But the good and bad data need to be labelled correctly, or the good data should at least far outweigh the bad data in order to prevent bias in our model and, as a consequence, faulty predictions.
For now, in my opinion, the best place for deep neural network-based anomaly detection is in crowdsourcing and global communities, where a large number of enterprises and networks are contributing the data. Bad data will be present, but because of the diversity in members and scale in numbers, the good data will outweigh the bad by a considerable amount - enough to make deep learning work.
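A toy example shows why the ratio of good to bad data matters. The sketch below is not deep learning: it assumes a deliberately simple one-feature detector that places its cut-off halfway between the average "benign" and average "attack" value. All numbers and names are invented for illustration. Attack samples mislabelled as benign stand in for poisoned data.

```python
from statistics import mean


def train_threshold(samples):
    """Learn a cut-off halfway between the mean 'benign' and mean 'attack' value."""
    benign = [value for value, label in samples if label == "benign"]
    attack = [value for value, label in samples if label == "attack"]
    return (mean(benign) + mean(attack)) / 2


def classify(value, threshold):
    return "attack" if value > threshold else "benign"


def make_dataset(n_good, n_poisoned):
    """Build n_good genuine benign points, n_poisoned attack points mislabelled
    as benign, and a fixed set of 20 correctly labelled attack points."""
    good = [(1.0, "benign")] * n_good
    poisoned = [(9.0, "benign")] * n_poisoned
    attacks = [(9.0, "attack")] * 20
    return good + poisoned + attacks


# Good data far outweighs the poison: the cut-off barely moves.
diluted = train_threshold(make_dataset(n_good=100, n_poisoned=5))
# Poison outweighs the good data: the cut-off drifts towards the attack values.
swamped = train_threshold(make_dataset(n_good=10, n_poisoned=30))

print(classify(7.5, diluted))  # attack
print(classify(7.5, swamped))  # benign
```

With the poison diluted, a suspicious value of 7.5 is still flagged. Once the mislabelled samples dominate, the same value slips through as a false negative.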
But there is still a need for continuous intervention by human experts to ensure the deep learning system produces no false positives, and to regularly evaluate its performance. In fact, one could ask if the technology is worth it, since deep neural networks still require a considerable amount of maintenance and experts to validate and tune the model.
The truth is that while rule-based systems can detect a great number of anomalies automatically, their performance still very much depends on input from human intellect. Having said that, deep neural networks are able to find associations and very complex, deep correlations in data that no human will be able to discover. That is a huge leap forward in the cyber security world.
For me, this technology is the future of cyber defence. Even in its current form, while not fully automatable, it provides the level of intelligence and processing of data that will be required for stopping future attacks.
The question remains whether the incremental advancements in deep learning, combined with adversarial studies, will ultimately lead to the next generation of fully automated cyber defensive solutions. Otherwise it could be a case of needing another breakthrough in machine learning and neural networks to achieve the ultimate goal of fully autonomous cyber defence.
Pascal Geenens, EMEA Security Evangelist at Radware