ML and the advent of the human HMI


Mobile phones are no longer simply communication devices – our smartphones are digital hubs that enable us to order consumer goods, monitor our health, and control our smart homes from many miles away. They are an intrinsic part of our daily lives, and most of us are rarely without them. It’s hard to imagine a more impactful device, and that success is due in no small part to the smartphone’s human machine interface (HMI). However, it wasn’t always this way. If you recall, the first mobile phone you ever owned likely had physical buttons and a small, black-and-white screen – and the ability to send a text message wasn’t always guaranteed. The user interface was basic because the device was basic, but we were just impressed that we could make a phone call on the move.

The introduction of touchscreen technology represented a major shift in the way we interacted with our phones. Omitting physical buttons left space for a bigger screen, enabling media consumption and paving the way for more sophisticated interaction. Executing commands via touchscreen is highly intuitive, and even the slightest movements give users access to a range of features and functionality.

While touchscreen technology revolutionised the way we use our devices, the onus fell on us to learn how to interact with them. Machine learning (ML) is now transforming that relationship, shifting the responsibility for learning from humans to machines.

The arrival of the future

Around four years ago, I wrote a blog about HMIs, outlining my predictions for the next generation of interfaces, and how they would affect device interaction:

“Looking into the near future, I can see the next phase of HMI arriving in the form of our devices ‘reaching out’ to interact with us. Why should I have to remember a PIN sequence to unlock my device? Why can’t I be the key? This trend is at the beginning of its lifecycle with facial recognition capabilities becoming standard in mobile devices and starting to be used for unlocking phones. As another example, why do we still have to find the controller every time we wish to change the channel or volume on the TV? Why can’t we control TVs directly via gesture or voice control? Why can’t the TV control itself, reaching out to see if anyone is watching it or whether the content is suitable for the audience (for example if there are children in the room)?”

These predictions are becoming a reality. When I call the British tax office, my voice is the ‘key’ that gives access to my account: the machine has learned my accent and inflection, and can identify my voice in such detail that it is now considered unique and secure. We can ask our personal assistants to turn up the volume, change the film or order a pizza. This ‘new wave’ of HMIs is now part of our infrastructure, exchanging information with an increasingly wide range of devices, from notebooks and tablets to smart watches, smart cameras and health monitors. Yet despite their ubiquity, these developments only scratch the surface of the way ML is revolutionising device interactions.

The age of approximation

Biometrics, such as face and iris recognition – typically based on traditional algorithms – have long been used as additional security layers on edge devices. Historically, they’ve been less secure than pattern, PIN, or password unlocking … not to mention the many stories of face-recognition tools being fooled by photographs or models of an eye.

However, using these same biometrics in combination with machine learning achieves significantly greater accuracy and fewer false positives. With voice recognition, ML can identify subtleties that significantly increase positive identification, bringing accuracy close to Andrew Ng’s often-cited, game-changing figure of 99 per cent.

Machine learning enables approximation; until its introduction, precise parameters were necessary to produce an effective response (e.g. “if the temperature in the room reaches 21°C, then switch off the heating”). Machine learning enables devices to detect and execute tasks from imprecise, incomplete input (e.g. identifying a cat with a high degree of accuracy because it has learned the ‘ingredients’ that constitute a cat). Almond eyes, narrow face, whiskers? Probably a cat. Previously, we gave computers lots of rules and widened the parameters to make them seem as though they could handle approximation, but ML can do it for real.

Approximation is key when dealing with human interaction. For example, when a person repeats their name, the sounds they produce are rarely identical. If a person’s voice is the access point to their tax account, the ability for devices to learn and recognise subtle tone variations is crucial.
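To make the contrast concrete, here is a minimal sketch in Python. It is purely illustrative: the fixed-threshold function mirrors the heating rule quoted above, while the voice-matching function stands in for a learned model, accepting any sample sufficiently similar to an enrolled profile rather than demanding an exact repeat. The feature vectors and the 0.95 threshold are assumptions made up for the example.

```python
import math

def rule_based_heating(temperature_c: float) -> bool:
    """Pre-ML logic: one precise, hand-set parameter with no tolerance.
    Returns True when the heating should be switched off."""
    return temperature_c >= 21.0  # the exact rule from the text

def cosine_similarity(a: list, b: list) -> float:
    """How alike two feature vectors are, independent of exact values."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def voice_matches(sample: list, enrolled: list, threshold: float = 0.95) -> bool:
    """ML-style approximation: a new utterance never repeats the enrolled
    one exactly, so we accept anything sufficiently similar to the profile."""
    return cosine_similarity(sample, enrolled) >= threshold

# Hypothetical feature vectors standing in for real voice embeddings.
enrolled_profile = [0.82, 0.10, 0.55, 0.31]
todays_sample = [0.80, 0.12, 0.57, 0.30]  # same speaker, slightly different tone
print(voice_matches(todays_sample, enrolled_profile))  # True: close enough
```

The difference lies in what the threshold means: the heating rule compares raw input against a hard-coded value, while the voice check compares against a learned representation, tolerating exactly the kind of variation described above.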

The role of patterning

Patterning based on ML has huge implications for HMIs. Mobile device usage, for instance, is unique to each individual – from the speed and rhythm at which we tap and scroll on a screen, to the way we hold the device – creating an individualised relationship between a human and their device. Devices can learn these “relationships” and recognise anomalous patterns, enabling ML to deliver more fluid and natural device experiences.
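As a thumbnail of how such patterning can work, the sketch below builds a toy behavioural profile from the gaps between a user’s screen taps and flags sessions whose rhythm falls far outside the learned distribution. The class name, the numbers and the simple z-score test are all illustrative assumptions, not a description of any real product.

```python
import statistics

class TapRhythmProfile:
    """Toy behavioural biometric: learns a user's typical interval
    between screen taps and flags sessions that deviate from it."""

    def __init__(self):
        self.intervals = []  # observed tap-to-tap gaps, in seconds

    def observe(self, session_intervals):
        """Accumulate tap gaps from a session known to be the owner's."""
        self.intervals.extend(session_intervals)

    def is_anomalous(self, session_intervals, z_cutoff=3.0):
        """Flag a session whose average rhythm sits far outside the
        learned distribution (a deliberately simple z-score test)."""
        mean = statistics.mean(self.intervals)
        spread = statistics.stdev(self.intervals) or 1e-9  # avoid divide-by-zero
        session_mean = statistics.mean(session_intervals)
        return abs(session_mean - mean) / spread > z_cutoff

profile = TapRhythmProfile()
profile.observe([0.42, 0.38, 0.45, 0.40, 0.43, 0.39])  # the owner's rhythm
print(profile.is_anomalous([0.41, 0.44, 0.40]))  # False: matches the owner
print(profile.is_anomalous([1.90, 2.10, 1.75]))  # True: someone else entirely
```

A production system would learn far richer features (pressure, swipe curvature, grip) and use a proper model rather than a single statistic, but the principle is the same: the profile is learned from ordinary use, never enrolled explicitly.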

This patterning has a humanising effect on our interaction with machines, but it also has significant implications for the security of our devices. As our reliance on digital technology grows – and our devices are increasingly used for shopping and banking – our personal data becomes more valuable, and more attractive to cybercriminals. From phishing and ransomware to botnets, hackers have an ever-increasing array of attack vectors to exploit vulnerabilities in our devices’ hardware and software, and to gain access to our data, our money and our reputation.

ML-based patterning, however, is fast becoming an invaluable tool in the fight against cybercrime, revolutionising security through technologies such as context-based recognition and behavioural identification. Potentially applicable to almost any situation, it has the power to identify even subtle deviations, gradually building up a precise understanding of typical scenarios and delivering alerts when potential security issues are identified.

This essentially creates an optimised experience for the user, with the combined benefits of increased security and increasingly personalised interaction – shifting responsibility from humans to machines and revolutionising the way we communicate with our devices.

Ultimately, this could transport us to a time in which HMIs – as we know them – become obsolete; a time when our devices do the learning, educating themselves to interact with us on our terms … and we, the users, become the next-generation HMIs.

Dennis Laudick, VP of Marketing, Machine Learning, Arm.