The Covid-19 pandemic may have created economic uncertainty, but it’s a testament to the incredible excitement surrounding AI innovation that investments in the space largely weathered the storm: Just 7 percent of investments decreased, and 16 percent were temporarily suspended in 2020, while 47 percent remained unchanged and 30 percent were set to increase.
Over the past few years, AI products have gained momentum for many reasons, but two in particular stand out. First, the data being generated continues to grow exponentially, even as it takes on new forms and emerges from a wider geographical footprint. More data, and critically more diverse data, helps an AI model avoid bias and accomplish all kinds of goals.
The other factor contributing to AI’s success is the sheer number of tools available to CIOs and CTOs to assist with implementations, from data collection and enrichment tools to annotation search engines, machine learning model developers, and ML model testers.
Just a few years ago, technology leaders hoping to take on AI development needed to start by building a platform to label or annotate training data — a process that might take almost 80 percent of the total development time. That's a tough pill to swallow. Now, though, sophisticated platforms are readily available to streamline the development process significantly.
Coping with challenges
Despite the accelerating pace of AI development, real challenges persist. Model bias is one of the biggest barriers to AI adoption, and it's a complex problem to diagnose because bias can creep in from multiple sources simultaneously.
For example, when U.S. states including Florida used the COMPAS tool to predict rates of recidivism after prisoners were released, the tool produced false positives for Black prisoners at roughly twice the rate it did for white prisoners. While some of the blame surely falls on the AI model, the team also failed to account for the bias already present in many other aspects of the prison system.
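Disparities like the one COMPAS exhibited are typically surfaced by comparing error rates across groups. The sketch below, using entirely hypothetical data, shows the basic idea: compute the false positive rate separately for each cohort and compare.

```python
# Minimal sketch of a per-group false-positive-rate audit, the kind of
# disparity check that exposed the COMPAS issue. All data here is
# hypothetical and for illustration only.

def false_positive_rate(records):
    """FPR = false positives / actual negatives.

    Each record is a (predicted_positive, actually_positive) pair.
    """
    fp = sum(1 for predicted, actual in records if predicted and not actual)
    negatives = sum(1 for _, actual in records if not actual)
    return fp / negatives if negatives else 0.0

# (predicted_reoffend, actually_reoffended) pairs, split by cohort
predictions_by_group = {
    "group_a": [(True, False), (True, False), (False, False), (True, True)],
    "group_b": [(True, False), (False, False), (False, False), (True, True)],
}

for group, records in predictions_by_group.items():
    print(f"{group}: FPR = {false_positive_rate(records):.2f}")
```

A gap between the groups' rates, as in this toy data, is the signal that warrants investigation; equal overall accuracy can mask exactly this kind of imbalance.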
Data itself can also be an obstacle. Although it's being generated at an incredible rate, that doesn't mean acquiring it is easy — far from it. Many organizations hesitate to share data because of legitimate concerns about privacy and misuse, and sharing data internationally requires adhering to many complex protocols. A universal data regulation could someday facilitate data transfer while protecting consumer privacy, but such a development is not likely anytime soon.
In the meantime, AI leaders focus on the following three priorities when developing the next generation of groundbreaking AI products:
1. Create goals with the end-user in mind
Developing AI solutions can be expensive, so it's critical to take steps throughout the process to protect your investment. It's not uncommon to spend huge sums and commit untold hours to developing an AI tool, only for it to produce erroneous results and exhibit obvious bias. The most important measure you can take to avoid this outcome is to begin by clearly defining goals that have the user's best interest in mind.
With clear goals established, you can create a road map to achieve them — but don’t be afraid to look for qualitative as well as quantitative feedback along the journey. Give users a chance to use the tool, and collect their candid feedback about what works well and what to change. When the folks at Amazon set out to create a solution for rapidly scanning résumés, for example, they achieved that goal — at the clear expense of women applicants. Training a tool with diverse data is vital, but it’s also critical to collect feedback from as diverse a set of user groups as possible.
2. Avoid information gaps between teams
If stakeholders don’t have a vivid understanding of the product in development, they can easily misunderstand the process, timeline, cost, and other key components. These communication gaps only lengthen the development timeline and detract from product quality, which is why you should take deliberate steps to avoid them.
The most successful AI projects come from seamless collaboration. Data teams must remain in contact with technology teams, who receive input from customers to help develop the products that meet their needs. Getting all these groups on the same page can be difficult, but diversification can help. Try cycling personnel through different roles to enable colleagues to speak other groups' languages and to break down collaborative barriers.
3. Don’t compromise on quality
Data is omnipresent, but not all data is created equal. Because the quality of an AI solution is inextricably linked with the quality of the data used to train it, you need to separate the wheat from the chaff. Gartner has predicted that through 2022, about 85 percent of AI projects will deliver erroneous outcomes because of bias in the data, the algorithms, or the teams managing them. Avoid becoming the next AI disaster by deploying data experts and quality assurance professionals to validate your methods.
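Validating training data doesn't have to wait for a dedicated QA team; even lightweight automated checks catch common problems early. The sketch below, a hypothetical quality gate rather than any particular vendor's tooling, flags three issues that routinely degrade labeled datasets: missing values, duplicate rows, and severe label imbalance.

```python
# Hypothetical pre-training quality gate for a labeled dataset,
# represented as a list of dicts. Flags missing values, exact
# duplicates, and heavy concentration in a single label class.
from collections import Counter

def quality_report(rows, label_key="label", imbalance_threshold=0.9):
    issues = []
    seen = set()
    labels = Counter()
    for i, row in enumerate(rows):
        # Missing or empty fields make a row unusable for training.
        if any(v is None or v == "" for v in row.values()):
            issues.append(f"row {i}: missing value")
        # Exact duplicates inflate apparent dataset size and skew metrics.
        key = tuple(sorted(row.items()))
        if key in seen:
            issues.append(f"row {i}: duplicate")
        seen.add(key)
        labels[row.get(label_key)] += 1
    # A dataset dominated by one class invites a biased model.
    if labels:
        top_share = labels.most_common(1)[0][1] / sum(labels.values())
        if top_share > imbalance_threshold:
            issues.append(f"label imbalance: {top_share:.0%} in one class")
    return issues
```

Running a gate like this on every new batch of labeled data, and blocking training when it reports issues, is a cheap way to enforce the quality bar the experts set.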
The Cognilytica Data Annotation 2019 report found that the market for third-party data labeling solutions is expected to exceed $1 billion by 2023. For every dollar spent on third-party data labeling, five are spent on internal labeling efforts — over $750 million in 2018, growing to more than $2 billion by the end of 2023. The most common data labeling efforts involve autonomous vehicles, object and image recognition, and text and image annotation. Within the next two years, machine learning-augmented intelligence is expected to be a core part of every competitive data-preparation tool.
When you're trying to build a groundbreaking AI solution, cutting costs is a false economy. Some teams take on the construction of an annotation platform and then, in the name of savings, rely on their own data scientists and engineers to clean and annotate data. But building a viable annotation platform is an enormous amount of work, and your experts are your most valuable asset. These efforts will almost always cost you more in the long run, whether in time wasted or lower-quality results for the end-user.
AI innovations are picking up speed thanks to secure data availability and the proliferation of AI development tools, and investors have never been more bullish on the technology's prospects. By keeping the customer involved, communicating effectively between teams, and keeping quality front and center, you can ensure your product development process yields AI models that achieve your customers' goals time and time again.
Vatsal Ghiya, CEO and co-founder, Shaip