In the financial trading sector, every second counts, and a minuscule delay can be the difference between success and failure. Billions are spent on new technologies to shave milliseconds off everything from international trading to fraud prevention. Financial services technology has in many cases outpaced regulation, which has created a looming legal minefield for many financial companies as regulators start to catch up.
For example, many banks are sitting on vast amounts of customer data, such as call recordings, which could put them at risk of breaching data-privacy laws on customer consent. Some financial institutions are also using data-powered ‘robo-advice’ to automate customer service, but this could leave them at risk of breaching anti-discrimination laws because some speech-recognition systems contain hidden racial and gender biases. Fraudulent mis-selling is also a growing problem, with Prudential recently fined by the Financial Conduct Authority for mis-selling retirement products.
In response, many financial sector companies are adopting AI to combat both staff and customer fraud. Banks already use AI to detect and prevent payments fraud and employ image-recognition systems for security. What is less widely known is that some companies are also now successfully using AI to comb call records for GDPR breaches, or even to monitor live calls and flag mis-selling and rogue trading in real time.
Among the variety of applications of AI in the financial sector is speech recognition, which offers numerous possibilities, including voice-based account servicing, robo-advice, autonomous analysis of audio archives and live ‘sentiment analysis’ of customer calls as well as the real-time transcription of any audio feed to allow instant decisions to be made.
Giants such as Deloitte are now using AI to help enforce compliance and mine their audio data for additional business insights. For instance, automated speech recognition (ASR) technology in audio monitoring can set live triggers on chosen keywords, such as those in major financial announcements and other statements that can have an impact on share prices. This monitoring capability can also detect potential issues, signs of insider trading and patterns of misconduct such as rogue trading. ASR technology can also pick up sentiment, emotion and ‘in-house’ industry jargon, so that calls can be monitored and evaluated for customer satisfaction and compliance in real time.
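To make the idea of live keyword triggers concrete, here is a minimal sketch in Python. It assumes an ASR engine is already delivering the transcript as a stream of text segments; the `Alert` class, the function names and the example watch list are all hypothetical, and a production compliance system would need far more than literal phrase matching.

```python
import re
from dataclasses import dataclass


@dataclass
class Alert:
    keyword: str        # the watch-list phrase that fired (lower-cased)
    segment_index: int  # position of the transcript segment in the stream
    text: str           # the full segment, kept for a human reviewer


def build_matcher(keywords):
    """Compile one case-insensitive, whole-word pattern for the watch list."""
    # Longer phrases first, so "profit warning" is preferred over "profit".
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def scan_segments(segments, keywords):
    """Yield an Alert for every watch-list hit, in the order segments arrive.

    `segments` can be any iterable of strings, including a generator fed by
    a live transcription stream (an assumption of this sketch).
    """
    matcher = build_matcher(keywords)
    for index, text in enumerate(segments):
        for match in matcher.finditer(text):
            yield Alert(match.group(0).lower(), index, text)
```

Because `scan_segments` is a generator, alerts are produced as each segment arrives rather than after the call has ended, which is the property a live trigger depends on.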
Is discrimination going to be an issue?
Automation is creating concerns over discrimination. For example, some of the speech-recognition language models behind ‘robo-advisers’ could have been trained on data from people with a certain accent or dialect, and so unwittingly discriminate against English speakers of other nationalities. Some pioneering financial firms are responding by using voice-recognition systems that are trained across all dialects and accents.
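One way a firm might check for this kind of bias is to compare word error rate (WER) across accent groups on a labelled test set. The sketch below is illustrative only: the group labels, function names and sample format are assumptions, and WER is just one of several measures that could be used.

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Return (edit distance in words, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Standard Levenshtein distance over words, one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_word != hyp_word),  # substitution
            ))
        previous = current
    return previous[-1], len(ref)


def wer_by_group(samples):
    """Compute WER per group from (group, reference, hypothesis) tuples."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        err, count = word_errors(reference, hypothesis)
        errors[group] += err
        words[group] += count
    # Groups with no reference words are skipped to avoid dividing by zero.
    return {g: errors[g] / words[g] for g in errors if words[g] > 0}
```

A large gap between groups on comparable audio would be a signal that the model needs broader training data before it is trusted with customer-facing decisions.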
Machine learning (ML) systems are even being deployed to analyse advice provided by financial institutions, driven by the need to combat increased mis-selling of financial products such as loans and mortgages, which is a recognised issue in the sector.
ML systems are capable of autonomously transcribing and analysing videos and calls in real time, helping institutions reduce the risk of fines and prosecutions. For example, many banks are aware of evidence of PPI mis-selling in customer calls but cannot release the recordings or analyse them with their existing software because they contain sensitive data. As a result, they have generally opted to accept the fine for PPI mis-selling over the fines associated with breaching regulations such as GDPR. With ML-enabled, AI-powered speech recognition systems, banks can isolate the relevant parts of those calls without exposing customers’ sensitive information. ML also makes fraud-focused speech recognition systems more effective because it allows the system to keep learning, evolving in tandem with the nature of the threat.
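As a rough illustration of how the relevant parts of a call could be kept while sensitive details are masked, the sketch below redacts two kinds of identifier from a transcript. The patterns (card-like digit runs and UK-style sort codes) and the placeholder labels are assumptions chosen for the example; real deployments typically combine many more patterns with trained entity-recognition models.

```python
import re

# Hypothetical patterns for this sketch. Order matters: the longer
# card-number pattern runs before the shorter sort-code pattern.
PATTERNS = {
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
    # UK-style sort code written as three hyphenated digit pairs.
    "SORT_CODE": re.compile(r"\b\d{2}-\d{2}-\d{2}\b"),
}


def redact(text):
    """Replace each sensitive match with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

The redacted transcript keeps the surrounding conversation intact, so an analyst or a downstream model can still review what was said about the product without seeing the customer's account details.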
As a technology, speech recognition has endless potential and is in great shape: accuracy levels are good and improving all the time. Accuracy is no longer limited to the easy scenarios; the technology is now being used for noisier, harder conversational use cases, making it practical for real-world applications. This is supported by the ability to deploy the technology in scalable ways that meet business needs, with on-premises models as well as public cloud options.
Building a risk picture
The way speech recognition is consumed is getting easier too. It can support multi-accent and multi-dialect models, which avoids the challenge of managing separate deployments for the diverse world that we live and operate in. Speech technology is not just for English either; it also supports native speakers of a growing range of languages. Its capabilities are ever increasing, enabling businesses to operate globally with the same scale and support that they would have in the English-speaking world.
For the future, we predict that analysis of single streams of data (voice, video or audio) will be combined to create holistic AI regulators that analyse everything from image to audio in real time. This will lead to financial organisations using AI to build a complete real-time ‘risk picture’ across all contact channels, including in-person interactions, internet and phone banking and even ATMs.
Jeff Palmer, VP, Speechmatics