How banks are keeping ahead of criminals with supercharged graph analytics

International finance has become hugely complex. The increased velocity of trading, the development of highly sophisticated instruments and the growth of stringent regulation have been matched by the demand for far more elaborate security, surveillance and reporting.

Just as the nature of the financial world has become more complicated, the activities of criminals and fraudsters have also evolved. The integrity of a financial institution’s operations is now at risk from insiders with specialised knowledge, their collusion with crooks at trading partners, the activities of experienced global money launderers and, increasingly, the skills of cyber criminals.

The volume of data and constantly changing variables that have to be monitored and investigated in order to maintain security against these fast-developing threats is huge. The work involves spotting suspicious links and patterns among vast amounts of very different kinds of data, and it is a task that conventional relational databases are incapable of performing well.

It is only the deployment of graph analytics run on a supercomputing platform that allows connections to be made and anomalies flagged up rapidly and accurately with remarkably low levels of time-consuming false positives.

Why graph?

Sceptics (or the ignorant) may question why graph analytics is so suitable in this field. The simple answer is that relational analytics techniques come to a standstill when an enterprise such as a bank or insurance company has to query very large volumes of structured and unstructured data.

If surveillance only involved data in tables, relational techniques would suffice. But in the real world, detection depends on establishing suspicious links and connections from all kinds of information in many different formats. Faced with these challenges, even Hadoop, the distributed storage and processing framework, will not deliver the magic that is often ascribed to it.

Thriving on complexity

Graph, by contrast, thrives on high levels of complexity and interconnectedness and has no rival in discerning significant relationship patterns between variegated data types. What might cause conventional analytics to explode, graph analytics can accomplish in seconds.

In a simple example, an investment bank concerned about insider trading may wish to find all employees who have used instant messaging to contact a third party who is a friend on Facebook with someone else who has access to the back-office settlements system. For graph, this is a simple matter of three hops, unlike conventional methods that require three sets of data to be joined together.
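The three-hop query described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the edge list, the relationship names (`MESSAGED`, `FACEBOOK_FRIEND`, `HAS_ACCESS`) and the people are all invented, and a production graph engine would use its own query language and storage rather than in-memory dictionaries.

```python
from collections import defaultdict

# Hypothetical edge list: (source, relationship, target). All names are invented.
# Friendship is stored in one direction only to keep the sketch short.
EDGES = [
    ("alice", "MESSAGED", "tom"),
    ("bob", "MESSAGED", "uma"),
    ("tom", "FACEBOOK_FRIEND", "carol"),
    ("uma", "FACEBOOK_FRIEND", "dave"),
    ("carol", "HAS_ACCESS", "settlements"),
    ("dave", "HAS_ACCESS", "payroll"),
]


def build_index(edges):
    """Index edges as {(source, relationship): set of targets} so each hop is a lookup."""
    index = defaultdict(set)
    for src, rel, dst in edges:
        index[(src, rel)].add(dst)
    return index


def three_hop_matches(edges, employees, system):
    """Return (employee, contact, friend) paths that end at someone with access to `system`."""
    index = build_index(edges)
    matches = []
    for emp in sorted(employees):
        # Hop 1: employee -> third party contacted by instant message
        for contact in sorted(index.get((emp, "MESSAGED"), ())):
            # Hop 2: third party -> Facebook friend
            for friend in sorted(index.get((contact, "FACEBOOK_FRIEND"), ())):
                # Hop 3: friend -> system they can access
                if system in index.get((friend, "HAS_ACCESS"), ()):
                    matches.append((emp, contact, friend))
    return matches


paths = three_hop_matches(EDGES, {"alice", "bob"}, "settlements")
```

Each hop is a direct lookup from one node to its neighbours, which is why the question maps so naturally onto a graph. The relational equivalent would join a messaging table, a friendship table and an access-rights table.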

Snuffing out cyber threats

Equally, when protecting financial institutions from cyber-attack, a graph engine will draw on data from a dozen or more sources to determine whether a pattern of activity represents a suspicious anomaly that has to be countered immediately. An entire network infrastructure and its links to third parties can be represented in graph, establishing connections with patterns of previous cyber security incidents and with technical information on government security databases.

This is a level of complexity that only graph can handle, given that the data volumes required for cyber detection can be huge, including weblogs, telemetry, emails, firewall and IP data. In a large enterprise this can easily amount to 20 terabytes per day, some of it structured in tables, but much of it only semi-structured.

Graph’s capacity to cope with complexity on this level is behind the growth in new cyber reconnaissance and analytics services that build a high-resolution image of each organisation’s cyber landscape from the criminal or unscrupulous rival’s perspective.

Cyber analytics, using graph’s ability to join together pieces of knowledge at vast scale, gives users insights far more frequently, leaving conventional signature-based security trailing in its wake. Multiple analytics workloads can be run concurrently on a single platform, exploiting the speed of supercomputing to identify relationships and look for behavioural patterns in data that is now generated and stored at a much faster rate than it can be analysed. Without this protection, malicious content has the space to hide and operate undetected.

Once an organisation sees its vulnerability from an adversary’s perspective, it can position its resources to have the biggest impact on boosting security.

Fraud prevention

This capacity to determine links and connections from raw data also makes graph supreme in finding new patterns of fraud. It can protect an organisation by creating a new set of rules that are pushed out to operational systems, determining when an alert should be triggered, immediately flagging up suspicious chains of events.

For example, the chain may be that a bank trader phones a colleague in IT and then, at the close of trading, the door security technology indicates they have walked out within a minute of each other, followed by another data source showing the IT employee quickly purchasing shares. In addition to establishing patterns, graph’s ability to explore hidden corners is vital – illuminating fraud, for example, by drawing on data already in the public domain, such as an employee or contractor’s friendship on social media with a CFO.
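As a rough illustration of how such a chain of events might be turned into an alert rule, the Python sketch below checks for a call, two door exits within a minute of each other, and a later share purchase. The event records, field names and the one-hour purchase window are assumptions made for the example. The article only says the purchase happens "quickly", and a real deployment would draw these events from separate source systems.

```python
from datetime import datetime, timedelta

# Hypothetical events from three separate feeds (phone logs, door security, trading).
EVENTS = [
    {"type": "call", "time": datetime(2024, 1, 5, 14, 2), "from": "trader_1", "to": "it_7"},
    {"type": "exit", "time": datetime(2024, 1, 5, 17, 0, 10), "who": "trader_1"},
    {"type": "exit", "time": datetime(2024, 1, 5, 17, 0, 50), "who": "it_7"},
    {"type": "purchase", "time": datetime(2024, 1, 5, 17, 20), "who": "it_7"},
]


def suspicious_chains(events, exit_gap=timedelta(minutes=1), buy_window=timedelta(hours=1)):
    """Flag call -> near-simultaneous exits -> purchase by the person who was called.

    exit_gap follows the "within a minute" condition in the text;
    buy_window is an assumed stand-in for "quickly".
    """
    calls = [e for e in events if e["type"] == "call"]
    exits = [e for e in events if e["type"] == "exit"]
    buys = [e for e in events if e["type"] == "purchase"]
    alerts = []
    for call in calls:
        caller, callee = call["from"], call["to"]
        for a in exits:
            if a["who"] != caller or a["time"] < call["time"]:
                continue
            for b in exits:
                if b["who"] != callee or b["time"] < call["time"]:
                    continue
                if abs(b["time"] - a["time"]) > exit_gap:
                    continue
                last_exit = max(a["time"], b["time"])
                for buy in buys:
                    if buy["who"] == callee and last_exit <= buy["time"] <= last_exit + buy_window:
                        alerts.append((caller, callee, buy["time"]))
    return alerts


alerts = suspicious_chains(EVENTS)
```

The nested loops are acceptable for a toy data set. At production volumes the same rule would be expressed as a pattern match inside the graph engine rather than as a scan over event lists.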

In insurance fraud, a graph engine has the power to expose collusion where real identities are being recycled or manipulated to create fake evidence. A single social connection from among thousands can unravel an entire plot, saving large amounts of money.

Reducing the cost of AML

In anti-money laundering (AML) operations, supercomputer-powered graph analytics can also take a scythe to costs. Conventional AML can involve many thousands of staff at a large multinational investment bank and often requires the expensive blocking of transactions while investigations are conducted. With graph, the time such investigations take can be slashed from, typically, three to four hours to a mere 20 minutes.

Graph analytics is fast and effective in handling these challenges because it does not integrate data; it takes the feeds from the systems and goes straight to work as a complementary technology.

Totally scalable

Graph analytics on highly connected data runs entirely in memory, operating on all the nodes and edges at the same time. This can be a problem for larger use cases, such as banks with millions of accounts and transactions, because no single compute node is big enough to hold all the memory needed. Firms have handled this lack of graph scalability by partitioning their data unnaturally across many compute nodes. That means making assumptions about which questions can be asked, so that only a single node is involved in answering each one. The problem is that the complexity of today's or future data sources cannot be fully exploited.
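A toy Python sketch shows why partitioning constrains the questions that can be asked. The account names, the transfers and the assignment of accounts to two partitions are all invented for illustration. The point is simply that any path crossing a partition boundary can no longer be answered by a single compute node.

```python
# Hypothetical assignment of accounts to compute-node partitions.
PARTITION = {"acc_a": 0, "acc_b": 0, "acc_c": 1, "acc_d": 1}

# Hypothetical transfers between accounts: (source, destination).
TRANSFERS = [("acc_a", "acc_b"), ("acc_b", "acc_c"), ("acc_c", "acc_d")]


def cross_partition_edges(transfers, partition):
    """Return the transfers whose two endpoints live on different partitions."""
    return [(src, dst) for src, dst in transfers if partition[src] != partition[dst]]


def partitions_touched(path, partition):
    """Return the set of partitions a traversal along `path` would have to visit."""
    return {partition[node] for node in path}


crossing = cross_partition_edges(TRANSFERS, PARTITION)
full_chain = partitions_touched(["acc_a", "acc_b", "acc_c", "acc_d"], PARTITION)
local_only = partitions_touched(["acc_a", "acc_b"], PARTITION)
```

Here a question about `acc_a` and `acc_b` stays on one partition, while following the money along the full chain touches both. Keeping every query on one node therefore means deciding in advance which chains matter.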

Powered by supercomputing technologies, graph engines can create a single memory space that uses the fast interconnect spanning many compute nodes, making this the most scalable graph technology available. It means the platform can expand to meet evolving needs, without having to store data in a way that makes assumptions about the questions to be answered and the relationships between the nodes.

It is not necessary to “normalise” the data in order to achieve the desired outcome, merely to add a new set of nodes and relations between the nodes.
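The idea of extending a graph simply by adding nodes and relations, rather than redesigning a schema, can be sketched as follows. The `Graph` class, node kinds and relationship names are hypothetical and exist only for this example. They are not the API of any particular graph engine.

```python
class Graph:
    """Minimal in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}  # node id -> {"kind": ..., plus arbitrary properties}
        self.edges = []  # (source id, relationship, target id)

    def add_node(self, node_id, kind, **props):
        self.nodes[node_id] = {"kind": kind, **props}

    def add_edge(self, src, rel, dst):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges.append((src, rel, dst))

    def neighbours(self, node_id, rel):
        """Return targets reachable from node_id over relationship rel."""
        return [d for s, r, d in self.edges if s == node_id and r == rel]


g = Graph()
g.add_node("emp_1", "employee", desk="rates")
g.add_node("acct_9", "account")
g.add_edge("emp_1", "OWNS", "acct_9")

# A new data feed arrives later: a new kind of node and a new relationship
# are added without touching anything already stored.
g.add_node("dev_3", "device", os="linux")
g.add_edge("emp_1", "LOGGED_IN_FROM", "dev_3")
```

Nothing about the existing employee and account data had to change to accommodate the device feed. In a relational design the same step would typically require new tables and revised joins.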

Beefed-up security and big ROI

Indeed, across the full range of use cases, the combination of graph analytics and supercomputing delivers substantial return on investment for financial institutions of all sizes with remarkable rapidity, saving time and costly man-hours, offering a shortcut to a level of expertise that would otherwise be inaccessible.

Without it, banks and finance houses that rely on relational databases and more conventional approaches risk floundering in the dark, sustaining severe damage from disasters that could have been foreseen and prevented long ago.

Phil Filleul, financial services global lead, Cray Inc.
