Artificial intelligence (AI) is expected to provide enterprises with the knowledge they need to generate new revenue, streamline business processes and deliver superior customer experiences. While there is a great deal of debate over where to begin and which use case is most critical to profitability, operational issues are often left to the end of the planning process. Machine learning (ML) models need to work efficiently to generate meaningful insights, and the only way to make sure this happens is to tackle production issues from the beginning.
Algorithms are required to process large volumes of data efficiently to generate timely insights. But often models fail to execute as intended in production, because of data bottlenecks and architectural complexities that were not foreseen in the early planning stages.
When building an AI strategy, enterprises should define the optimal way to build their ML pipelines, from research, training and validation through to deployment in production, monitoring and retraining. Building the feature vector requires transforming data from different sources, with the ability to aggregate, correlate and enrich that data. Every phase in production must run at the required speed and scale to ensure that the model meets business goals and requirements.
Here are three tips for enterprises to successfully move their models into production.
Manage the product lifecycle with MLOps
More and more AI-based machine learning models are running into roadblocks as they scale up to manage larger volumes of data, often without data and governance standards in place.
According to Gartner, “While many organizations have experimented with AI proofs of concept, there are still major blockers to operationalizing its development. IT leaders must strive to move beyond the POC to ensure that more projects get to production and that they do so at scale to deliver business value.”
Machine Learning Operations (MLOps) is a practice designed to integrate, deploy and monitor ML models in production. It covers data preparation, model training and validation, packaging and deployment, as well as monitoring model accuracy and closing the loop for retraining.
Adopting an MLOps approach can help enterprises close the loop between insights and actionable business value.
MLOps can help models scale up to accommodate large volumes of real-time operational and historical data from different sources, as well as a large number of concurrent users, by ensuring that resources are available and that bottlenecks are identified and handled automatically. The ML data pipeline should be monitored with tools that track performance metrics and system alerts, so that small problems are caught before they affect overall performance. Model accuracy should also be measured in production so that problem areas are identified immediately. Where data quickly becomes stale, continuous monitoring and retraining of the model are essential to maintain accuracy.
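As a minimal sketch of measuring model accuracy in production, the class below keeps a rolling window of recent predictions whose true labels have since arrived and flags the model for retraining when accuracy falls below a threshold. The class name, the window size and the threshold are all hypothetical choices for illustration, not values taken from any particular MLOps product.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent labelled predictions."""

    def __init__(self, window_size=100, min_accuracy=0.9):
        # Both settings are illustrative assumptions, not recommended values.
        self.outcomes = deque(maxlen=window_size)
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        """Store whether one prediction matched the label that arrived later."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Return accuracy over the window, or None if nothing is recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """Flag the model only once the window is full and accuracy is too low."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy
```

Waiting for a full window before raising the flag is one way to avoid triggering retraining on a handful of early mistakes. A real deployment would likely also have to account for labels that arrive late or never arrive.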
Companies like LinkedIn, Airbnb, and Uber have spent millions of dollars developing their own MLOps systems to smooth out production issues. In December 2019, Cloudera called on the industry to create open standards to improve machine learning operations. These standards included defined processes for monitoring how ML models are running, and tools for predicting skew, drift and accuracy and for determining when models need to be retrained.
Utilizing an MLOps approach enables enterprises to avoid common hiccups, while ensuring machine learning models are successfully transforming insight into actionable business value.
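One common way to quantify drift in a single numeric feature is the population stability index (PSI), which compares how a training sample and a production sample are distributed across the same buckets. The sketch below is an illustrative implementation using only the standard library. It is not drawn from the standards mentioned above, it assumes both samples are non-empty, and the bucket count and the small floor for empty buckets are assumptions to tune.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """Compare two samples of one numeric feature; larger values mean more drift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the training range into the edge buckets.
            idx = int((v - lo) / width)
            counts[max(0, min(idx, bins - 1))] += 1
        # A small floor avoids log(0) for empty buckets.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bucket_shares(expected), bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Identical samples score zero, and the score grows as production data moves away from the training distribution. The level at which a score should trigger an alert or retraining is a business decision rather than something the formula provides.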
Integrate data for impact
One of the largest hurdles to deploying advanced data analytics using AI is not just a shortage of skilled workers, but also the difficulty of accessing the relevant data assets. To garner the timely and accurate insights required to optimize business processes and services, enterprises need to consolidate data from multiple real-time and historical sources, including CRM, MES, ERP, and financial systems, and feed it to ML models.
Data silos often prevent these models from accessing the data they need for timely and accurate results. These silos exist for different reasons:
- Projects are deployed within specific areas of the business which use different data management vendors according to their specific needs, without considering the use of the data in a broader business context.
- Companies grow through acquisitions, mergers, and different rounds of leadership resulting in multiple incompatible systems.
- Operational data is separate from historical data which is used for batch analytics.
The ability to access data from multiple sources, and to blend and enrich streaming data with historical data, is required to feed machine learning models the input they need. Data integration initiatives, like all IT projects, also need to keep an eye on ROI. It is advisable to begin with the use case that can have the biggest impact on profitability, such as online loan approvals, fraud analysis for digital payments or the automation of time-consuming processes. Harvest the data needed for these insights first, and along the way define the standards that will eventually enable all of the data to be shared with additional machine learning models.
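To make the idea of blending streaming data with historical data concrete, here is a minimal sketch for the digital payments fraud use case. It enriches one incoming payment event with aggregates computed from that customer's past payments. The in-memory `HISTORY` dictionary, the field names and the chosen features are all hypothetical; in practice the history would come from a data store rather than a Python dictionary.

```python
from statistics import mean

# Hypothetical historical store: past payment amounts per customer.
HISTORY = {
    "c-100": [20.0, 30.0, 40.0],
    "c-200": [500.0, 450.0],
}


def build_features(event, history=HISTORY):
    """Blend one streaming payment event with that customer's history."""
    past = history.get(event["customer_id"], [])
    avg_amount = mean(past) if past else 0.0
    return {
        "amount": event["amount"],
        "past_payment_count": len(past),
        "avg_past_amount": avg_amount,
        # Ratio of this payment to the customer's norm; 0.0 when no history.
        "amount_vs_avg": event["amount"] / avg_amount if avg_amount else 0.0,
    }
```

A payment of 240.0 from customer `c-100` would produce an `amount_vs_avg` of 8.0, the kind of signal a fraud model could only see if the live event and the historical record are accessible together.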
Be prepared for a complex cloud environment
In today’s data-driven world, the deployment of a wide range of AI solutions is coinciding with the spread of multi-cloud and hybrid cloud environments whose data stores include Oracle DBs, IBM DB2, Cloudera, Amazon S3, Azure Data Lake Storage and many more.
Enterprises are increasingly utilizing multiple clouds for a number of reasons:
- Organizations want to avoid vendor lock-in, where they become fully dependent on a particular cloud provider's infrastructure, add-on services and pricing model.
- Business units are selecting cloud infrastructures independently of the IT department.
- Different cloud providers are used for data locality to ensure high performance.
- Regulations such as the EU’s GDPR can also dictate that customer data be kept in specific regions.
- A multi-cloud strategy can also reduce the risk of downtime for mission-critical applications.
This is adding to the complexity of leveraging machine learning models and requires a deeper dive to understand how the data is stored and shared between the different cloud vendors.
Hybrid cloud is another common environment that needs to be factored into the system architecture of machine learning models. Hybrid cloud, the combination of private cloud, public cloud and on-premises infrastructure, allows organizations to leverage existing investments and avoid new deployment strategies for mission-critical services that are already running on premises, while deploying new applications in the cloud. Organizations with a need for higher security can maintain their on-premises infrastructure and use a cloud instance to host non-sensitive data. The cloud also offers the major benefit of elastic capacity, since workloads can expand into a public cloud during demand spikes such as end-of-year reporting.
However, data management in hybrid cloud and multi-cloud environments can be very complex. It calls for an intelligent method of data replication that can aggregate, mask, encrypt and compress the data according to business needs. Enterprises should understand, plan for and manage the complexities and associated costs of mixed environments when deploying machine learning models into operation.
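As a rough illustration of preparing data before it is replicated between environments, the sketch below masks sensitive fields and compresses the batch using only the Python standard library. The field list and function names are hypothetical, and salted hashing is just one simple masking approach that may be too weak for low-entropy values. Encryption is deliberately left out because the standard library does not provide it; a real pipeline would add it with a vetted cryptography library.

```python
import hashlib
import json
import zlib

SENSITIVE_FIELDS = {"name", "email"}  # hypothetical masking policy


def mask(value, salt):
    """Replace a sensitive value with a salted SHA-256 digest (one-way)."""
    return hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()


def prepare_for_replication(records, salt):
    """Mask sensitive fields, then serialize and compress the batch."""
    masked = [
        {k: mask(v, salt) if k in SENSITIVE_FIELDS else v for k, v in r.items()}
        for r in records
    ]
    payload = json.dumps(masked, sort_keys=True).encode("utf-8")
    return zlib.compress(payload)


def restore(blob):
    """Decompress and parse a replicated batch on the receiving side."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Masking before the data leaves its home environment means the receiving side never holds the original values, which is one way to keep non-sensitive analytics in a public cloud while the sensitive records stay on premises.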
In summary, AI strategies need to take into account the complexities of data management and processing in deployment to ensure successful operationalization of ML. Start with the practice of MLOps, leverage infrastructure that can blend relevant data from disparate data stores with streaming and operational data, and implement solutions that support hybrid and multi-cloud environments. It’s never too soon to start thinking about production. In the end, a model can only deliver meaningful insight if it can run accurately at the speed and scale required by the business.
Karen Krivaa, GigaSpaces