Last year, mobile operators lost $38 billion (£28bn) of their revenue to fraud, according to the Communications Fraud Control Association’s 2015 survey. International crime rings are successfully and profitably using highly sophisticated techniques to bulldoze through phone companies’ anti-fraud defences. However, emerging big data and machine learning applications are beginning to turn the tide. Padraig Stapleton, vice president of engineering at Argyle Data, provides insights on how mobile operators are deploying big data and AI to protect themselves and their consumers.
1. What is the financial impact of mobile operator fraud on the mobile economy and on the consumer?
Mobile operators face an increasingly complex battle against sophisticated global cybercriminals. Operators must continue to expand into new markets and innovate to win new customers, but this constant evolution increases the potential for fraud, negative margin, and arbitrage. The effect of fraud isn’t confined to the operator’s bottom line: it also has a negative impact on customer service costs, subscriber churn and brand damage.
Recent coverage of phone scams in the UK ensnared several mobile providers whose customers found themselves hit with vastly inflated mobile phone bills. This happens all the time, often on a less noticeable scale to the consumer. For example, millions of users are charged premium rates for ‘wangiri’ or call-back scams – you call a number that pops up on your phone and find yourself connected to a premium line. Big data, machine learning and AI are a powerful combination in combating this type of fraud.
2. What types of fraud and threats are the most common and why do they slip past fraud analyst defences?
We are seeing a huge spike in international revenue share/roaming fraud – it rose by 500 per cent between 2013 and 2015. Theft also saw an unhealthy 104 per cent increase in the same period. But the reality is that most major attacks are mixtures, which we call ‘fraud cocktails’.
There are many reasons why operators have not been able to detect fraud, but the chief cause is that rules-based traditional systems can only capture known attack types and require manual review and intervention. Many scams pass unnoticed simply because they are unknown, extremely rapidly executed, and highly sophisticated.
3. Why do modern cybercriminals out-innovate traditional fraud systems?
Criminals don’t abide by rules. Just as with enterprise cybersecurity breaches, it’s impossible to predict where the next attack will come from. A cyber gang can set up, go to work, and disappear in 24 hours or less – before an operator knows the attack is happening. Modern cyberattacks mutate, evolve, and arbitrage faster than an analyst can write rules to detect them. This type of mutating attack has a cloak of invisibility that is impervious to detection with traditional methods.
This is made more complicated by the fact that each subscriber usually has many devices, and each mobile service provider is constantly introducing new offers, service plans and incentives, all of which open different areas of vulnerability. And operators keep their data in silos – billing, credit, service history, etc. It’s big data, but it’s not combined into a form that gives a total picture of each user. Without this 360-degree perspective, fraud managers can’t use big data to distinguish normal traffic from anomalous activity. Everything points to the need for a different way to detect new fraud and protect customers and carriers.
4. How can big data be leveraged to protect against fraud and threats and how can the limitations of big data silos be overcome?
Big data is commonly described in terms of the ability to handle 'volume, velocity, and variety'. How does a service provider get to the point where they can quickly identify anomalous behaviour? You need to start by combining big data from silos into a vast data lake that provides a total view of each user. So, for example, if a user whose entire call history has been in France suddenly begins to make volumes of calls to and from Latvia, it’s easy to match up the normal against the abnormal.
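The France and Latvia example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of Argyle Data's platform: the record layout, the subscriber IDs, the `min_calls` threshold and the function names are all hypothetical, and the call records are assumed to have already been combined from the silos into one list per period.

```python
from collections import Counter, defaultdict

# Hypothetical call records: (subscriber_id, destination_country).
history = [("sub-1", "FR")] * 200 + [("sub-2", "FR")] * 50 + [("sub-2", "DE")] * 50
recent = [("sub-1", "LV")] * 40 + [("sub-1", "FR")] * 5 + [("sub-2", "DE")] * 10


def build_baseline(records):
    """Count historical calls per destination country for each subscriber."""
    baseline = defaultdict(Counter)
    for subscriber, country in records:
        baseline[subscriber][country] += 1
    return baseline


def unusual_destinations(baseline, records, min_calls=20):
    """Return {subscriber: {country: calls}} for destinations that never appear
    in the subscriber's history but receive at least `min_calls` recent calls."""
    recent_counts = defaultdict(Counter)
    for subscriber, country in records:
        recent_counts[subscriber][country] += 1

    flagged = {}
    for subscriber, counts in recent_counts.items():
        known = baseline.get(subscriber, Counter())
        new = {c: n for c, n in counts.items() if c not in known and n >= min_calls}
        if new:
            flagged[subscriber] = new
    return flagged


flagged = unusual_destinations(build_baseline(history), recent)
print(flagged)
```

Here only the subscriber whose history is entirely French and who suddenly places 40 calls to Latvia is flagged. A fixed threshold like this is itself a simple rule, so it only illustrates the baseline idea; it does not stand in for the machine learning discussed later.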
Hadoop is the ideal -- in fact, essential -- storage platform for data lakes. Fraud detection applications written on native Hadoop have a distinct edge over hybrids or ‘Hadoop-compatible’ options when it comes to performance and ROI. In our case, Argyle Data’s big data analytics platform runs on top of our data lake architecture.
5. What are the advantages of the immune system approach using adversarial machine learning and graph theory, and is this approach used in other enterprise/industries?
Artificial Intelligence (AI) / machine learning is key to beating mobile fraud. It is not feasible to have fraud analysts manually inspect high volumes of individual calls, analyse figures, and build graph visualisations of fraud attempts in order to detect fraud. This approach simply doesn't scale.
The combination of a native Hadoop architecture with real-time data ingestion, analytics, and machine learning provides a thorough defence against revenue fraud. Graphical representations enable fraud analysts to easily see attacks as they happen – a more effective approach than an Excel workbook full of numbers – which traditional systems would struggle to approximate. Graph visualisations can make fraud, profit, and SLA threats very obvious in ways that other techniques can’t. This approach can be applied across any industry or organisation needing to apply analytics to detect issues ranging from mechanical breakdown to cyberthreats.
6. Why is machine learning/AI a key technology for fighting evolving fraud?
Detecting fraud or revenue drain as they happen – rather than identifying suspect traffic and waiting two or three days for accurate analysis – is a key tool in the fight against fraud. Crime rings that are thwarted on a few successive attempts will halt those attacks and move to a more vulnerable target, i.e. any operator or organisation whose defences don’t stand up to modern-day threats.
The only way to detect anomalies -- identifying both known and emerging fraud types -- is to apply AI/machine learning at massive scale in real time. This was not possible until recently, but now there are systems available that utilise big data and Hadoop to do just that. When you have enough data, you have access to that data in real time, you use AI/machine learning, and you can present data to fraud analysts in an instantly readable fashion, you can detect and prevent fraud in real time.
7. Why is visualisation a vital tool for fraud analysts in machine learning and analytics systems?
Visualisations present anomalies in a way that makes fraud identification simple both for machines and humans. For instance, imagine a chart with a cluster of calls in the bottom left corner and one number that sticks out like a sore thumb in the top left corner, having made over 21,000 calls while consuming almost zero seconds. This is obviously 'anomalous behaviour' – both to a human and a machine. Visualisations bring fraud to life and make it beautifully obvious to a human.
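To show how a machine might pick out the same 'sore thumb', here is a minimal Python sketch. It swaps in a simple robust statistic (a modified z-score based on the median absolute deviation) rather than the machine learning described in this interview, and the numbers, the 3.5 cut-off and the `robust_z` helper are assumptions made purely for illustration.

```python
from statistics import median

# Hypothetical per-number aggregates: number -> (call_count, total_seconds).
usage = {
    "num-A": (12, 1500),
    "num-B": (8, 900),
    "num-C": (15, 2100),
    "num-D": (10, 1300),
    "num-E": (21000, 40),  # the 'sore thumb': huge volume, almost no talk time
}


def robust_z(value, values):
    """Modified z-score based on the median absolute deviation (MAD)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return 0.0
    return 0.6745 * (value - med) / mad


counts = [calls for calls, _ in usage.values()]
avg_secs = [secs / calls for calls, secs in usage.values()]

# Flag numbers with an unusually high call count AND unusually short calls.
flagged = [
    number
    for number, (calls, secs) in usage.items()
    if robust_z(calls, counts) > 3.5 and robust_z(secs / calls, avg_secs) < -3.5
]
print(flagged)
```

The median-based score is used here because a single extreme value such as 21,000 calls would distort an ordinary mean and standard deviation.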
Another example could be a call-based volume and velocity attack: a Wangiri attack (the fraud method) used to misappropriate revenue from international mobile calls through international revenue share fraud, or IRSF (the fraud type). Assume that a real-time machine learning algorithm has alerted you via your dashboard scenario map to a Wangiri attack from Cuba. The attack type and method become far clearer when you visualise it.
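The 'volume and velocity' signature of such an attack can also be expressed as a small check. The Python sketch below assumes the common one-ring pattern, in which a single origin places very short calls to many different subscribers within a short window. The record layout, the phone numbers, the thresholds and the `wangiri_suspects` function are hypothetical, and a fixed-threshold check like this only stands in for the real-time machine learning described above.

```python
from collections import defaultdict

# Hypothetical inbound call records:
# (timestamp_seconds, origin_number, recipient, duration_seconds).
calls = [(i // 2, "+53 5550100", f"sub-{i}", 1) for i in range(100)]
calls += [(10, "+33 5550101", "sub-1", 120), (500, "+33 5550101", "sub-2", 95)]


def wangiri_suspects(calls, window=60, min_recipients=50, max_ring_seconds=2):
    """Return origins that placed very short calls to at least `min_recipients`
    distinct recipients inside a single `window`-second bucket."""
    buckets = defaultdict(set)
    for ts, origin, recipient, duration in calls:
        if duration <= max_ring_seconds:
            buckets[(origin, ts // window)].add(recipient)
    return sorted(
        {origin for (origin, _), recipients in buckets.items()
         if len(recipients) >= min_recipients}
    )


suspects = wangiri_suspects(calls)
print(suspects)
```

In this made-up data, the origin that rings 100 different subscribers for one second each within a minute is reported, while the origin making two ordinary calls is not.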
8. How does this approach prevent mobile fraud?
This approach enables mobile providers and other organisations to detect and shut down revenue attacks as they occur. We have already shown in deployment that crime rings will quickly desist from unprofitable or unsuccessful attacks. The trick is to catch them early, and stop them fast.
Fraud is big business driven by attack and defence/detection phases. Facebook, Google, and LinkedIn have pioneered big data and machine learning approaches to protecting their subscribers and gaining insight on vast amounts of data. We believe that communications providers can learn from the big data approaches taken and apply them to the mobile industry to detect and analyse fraud and create data lakes for new applications.
Padraig Stapleton, vice president of engineering at Argyle Data