
The new decade and the rise of AutoML

(Image credit: Shutterstock/TechnoVectors)

In 2019, the World Economic Forum forecast that data analysts would be in high demand by 2020, and so far this year we’re seeing the prediction become a reality. The fact is, as much as companies would love to hire dozens or even hundreds of highly trained data scientists - even in today’s challenging economic climate - the skill set is so highly sought after that it can be both difficult and costly to find and integrate the right people.

This is where the role of the data analyst comes in. Many companies have invested in automated machine learning (AutoML), which has enabled them to automate the process of applying machine learning to solve business challenges. What this means is that a wider variety of data analysts, who are not necessarily highly trained data scientists and who may have broader business skill sets, can access and use data more freely.

The move to AutoML is also being driven by a growing recognition that organisations using AI cannot improve the business-led insight generated from that AI without improving access to it. More people need access to data sources, to the models being fed by that data, and to data-driven analytics.

Data needs to be democratised. We’re past a point where it’s acceptable for data access to be restricted only to highly trained data scientists well-versed in manipulating it. If we want to see the mass business benefits of data-driven analytics, data in all its various guises needs to make it outside of the confines of the data science lab and into the hands of a new generation of data analysts and business users.

In this article, we discuss how AutoML and new business operational models are influencing and accelerating the rise of the data analyst in this new decade.

Technology is transforming data access and insight

AutoML now has a broader scope to help democratise data science in general, which makes it easier for data analysts to get involved in the data-to-insights pipeline. While AutoML is not going to replace data scientists, it does mean that data analysts can be self-guided through feature creation, feature selection, model creation and comparison, and even operationalisation. In other words, AutoML drives self-serve, augmented analytics, which can add efficiency to large swaths of the data pipeline.
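To make one of those automated steps concrete, the following is a minimal sketch of feature selection, assuming a simple approach in which candidate columns are ranked by the strength of their linear correlation with the target. The column names and figures are hypothetical illustration data, and real AutoML products may well use quite different selection criteria.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def rank_features(columns, target, keep=2):
    """Return the names of the `keep` columns most correlated with the target."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    return ranked[:keep]


# Hypothetical data: one informative column, one noisy column, one constant column.
columns = {
    "ad_spend": [1, 2, 3, 4, 5, 6],
    "day_of_month": [5, 1, 4, 2, 6, 3],
    "region_code": [7, 7, 7, 7, 7, 7],
}
sales = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

selected = rank_features(columns, sales, keep=1)
print(selected)
```

An analyst would only need to supply the table and the target column; the ranking itself needs no modelling expertise, which is the point of automating the step.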

At a very high level, AutoML is about automating the process of applying machine learning. Early on, AutoML was almost exclusively used for the automatic selection of the best-performing algorithms for a given task and for tuning the hyperparameters of said algorithms.
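That early form of AutoML can be sketched in a few lines. The example below is an illustration under stated assumptions rather than a description of any particular product: it uses two toy classifiers, a small grid of hyperparameter values for each, and synthetic one-feature data, then picks the combination with the best cross-validated accuracy. All names (`auto_select`, `CANDIDATES` and so on) are hypothetical.

```python
import random
from statistics import mean


def knn_predict(train, x, k):
    """Majority vote among the k training rows whose feature is closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def threshold_predict(train, x, cutoff):
    """Predict class 1 when the feature exceeds a fixed cutoff (ignores train)."""
    return 1 if x > cutoff else 0


def cv_accuracy(predict, param, data, folds=4):
    """Average accuracy of one algorithm/hyperparameter pair over interleaved folds."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        hits = sum(predict(train, x, param) == y for x, y in test)
        scores.append(hits / len(test))
    return mean(scores)


def auto_select(data, candidates):
    """Try every algorithm and hyperparameter value; return the best (score, name, param)."""
    results = []
    for name, (predict, grid) in candidates.items():
        for param in grid:
            results.append((cv_accuracy(predict, param, data), name, param))
    return max(results, key=lambda result: result[0])


# Each candidate: (prediction function, hyperparameter values to try).
CANDIDATES = {
    "k_nearest_neighbours": (knn_predict, [1, 3, 5, 7]),   # odd k avoids tied votes
    "fixed_threshold": (threshold_predict, [0.5, 1.0, 1.5, 2.0, 2.5]),
}

# Synthetic data: class 0 centred on 0, class 1 centred on 3.
rng = random.Random(0)
data = [(rng.gauss(0, 1), 0) for _ in range(40)]
data += [(rng.gauss(3, 1), 1) for _ in range(40)]
rng.shuffle(data)

best_score, best_name, best_param = auto_select(data, CANDIDATES)
print(f"best: {best_name} with parameter {best_param} (accuracy {best_score:.2f})")
```

The search here is exhaustive because the grid is tiny; with larger search spaces, AutoML tools typically rely on smarter strategies, but the principle of scoring candidates on held-out data and keeping the winner is the same.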

While this has been very helpful for data scientists, until recently it hadn’t improved data access or insights for data analysts or business users, who may still rely on data scientists to build machine-learning-based models in code. More recently, however, the emphasis of AutoML has shifted to making machine learning more accessible by automatically building models without the help of data scientists.

Business is empowering data collaboration and new roles

In the last two years of the previous decade, one of the biggest operational shifts that became apparent in technology-driven businesses was the continued convergence of data science and business intelligence. It was certainly a far cry from more traditional operational models, where organisations employed separate teams for standard business intelligence (dashboards, reports, data visualisation, SQL) and data science (statistical models, R/Python).

The reasoning behind this convergence is logical: by bringing data science and business intelligence practices together, companies effectively form real-time, centralised access to what may previously have been disparate sources of data. This growing convergence of, or closer collaboration between, data science and analytics teams has empowered more people to become data analysts, often referred to as ‘citizen data scientists.’

But don’t let the term fool you: citizen data scientists come in many forms, and their data analysis skills are empowering business insight in very important ways. Their roles can include the Data Translator, who is bridging the technical expertise of data engineers and data scientists with the operational expertise of marketing, supply chain, manufacturing, risk, and other industry domains.

We are also seeing Data Explorers, who focus on identifying and connecting to new data sources, merging and preparing data, and building production-ready data pipelines. Data Modellers are responsible for building predictive models and generating either a product or a service from those models, and then implementing them.

Regardless of the nature of these new roles, there is a common theme: unlike the data scientists of the previous decade, analysts don’t need to master all the intricacies of advanced machine learning and feature engineering. What they bring to the table is an intimate knowledge of the problems at hand and the business questions that need to be answered.

Users are demanding self-service AI/ML

Heads of business units have traditionally had a harder time accessing data analytics, having to request reports and analysis from data scientists on a case-by-case basis. The next evolution will be for machine learning itself to become more self-service. Deployment and maintenance of models will become increasingly easy and automated, as will many analytic tasks.

By integrating self-service machine learning into their core business strategies, innovative companies are enabling data analysts to use real-time data at scale to make better and faster decisions throughout their organisations.

It’s clear that AI maturity, and the data-driven insight that results from it, cannot improve without expanding the range of people who have access to and work with data on a day-to-day basis. It’s exciting to see companies prioritise a shift toward a data-driven culture and recognise the economic imperative of data insights. As the new decade progresses, we’re set to see this continue as one of the more powerful analytics trends already transforming business in 2020.

Alexis Fournier, Director of AI Strategy, Dataiku