As long as AI was confined to public or private R&D laboratories, the risks of its use and its potential impact remained largely theoretical research topics. But the rise of AI and its progressive use in businesses and in the daily lives of citizens around the world have made its impact much more real. Responsible AI should now be a foundational principle for an organization’s holistic AI efforts.
What is Responsible AI, and how can we define it? Responsible AI is a framework that aligns the outcomes of an AI system with an organization’s intentions and desired impacts beyond the business. This means that systems should focus on the people most impacted by AI products, and organizations need to integrate the right methods and practices into the development and deployment of their AI pipelines. Every organization will have its own guiding principles or intended impacts of AI, but the real challenge is how to integrate Responsible AI into your organization’s existing AI lifecycle.
A guiding framework: Define the ethics of your company
The first step for a company or group that decides to deploy AI is to define a framework of ethics and values in which their AI should operate. Senior leaders and executives should outline their broader impact with AI and communicate these goals widely. The definition and clarification of criteria, linked to the activity and the values of the company, have two important benefits: ensuring a clear position of the company on all of its principles and facilitating the communication of these elements to all the teams.
Then, define governance and empowerment
The other two pillars of a guiding framework, governance and empowerment, are as important as defining ethics. Governance is about creating checkpoints in your process and creating processes that allow for tracing. It's not just about how well the model is performing over time or how robust a system is; it’s about asking the important questions. Who is accessing the data, who is it for, what specific purposes is it being used for, where are these models being deployed and used, and who made the decisions about when they were deployed or pulled from the system?
Empowerment is about educating decision makers about the right methods to stay on track with your overall goal and providing the right tools for those involved in the pipeline so that they can actually build responsible products upfront. This also means providing mechanisms for AI developers to alert the team to any potential pitfalls or deviations.
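The governance questions above (who accessed what, for which purpose, and who decided to deploy or pull a model) can be made traceable with something as simple as an append-only audit log. The following is a minimal sketch, not a prescribed implementation: the class names `AuditEvent` and `AuditLog`, the action labels, and the example actors and resources are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One traceable checkpoint: who did what, to which resource, and why."""
    actor: str      # person or service performing the action
    action: str     # hypothetical labels, e.g. "data_access", "deploy", "rollback"
    resource: str   # dataset or model the action applies to
    purpose: str    # stated reason for the action
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class AuditLog:
    """In-memory, append-only log (a real system would persist this)."""

    def __init__(self):
        self._events = []

    def record(self, actor, action, resource, purpose):
        event = AuditEvent(actor, action, resource, purpose)
        self._events.append(event)
        return event

    def history(self, resource):
        """Return every recorded event for one dataset or model, oldest first."""
        return [e for e in self._events if e.resource == resource]


# Hypothetical usage: names below are illustrative only.
log = AuditLog()
log.record("analyst_a", "data_access", "applicants_dataset", "bias review")
log.record("lead_b", "deploy", "screening_model_v1", "pilot in one region")
log.record("lead_b", "rollback", "screening_model_v1", "drift detected")

model_history = log.history("screening_model_v1")
for event in model_history:
    print(event.actor, event.action, event.purpose)
```

Keeping the purpose alongside the actor and action is the design choice that matters here: it lets a reviewer answer "why" as well as "who" and "when" after the fact.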
Examine the data
To prioritize Responsible AI, companies need to include people who are representative of those who will use the algorithms. An AI algorithm is built on pre-existing data, and this data has a fundamental impact on the behavior of future AI. For example, over the past few years there have been cases of biased data in AI-based job application screening, where companies ended up with more examples of successful male candidates than of successful female candidates. Due to historic hiring practices, the algorithms defaulted to screening out women.
This shows that data frequently contains biases. The example of women automatically screened out in recruitment processes is largely linked to past biases. Indeed, if recruiters provide the algorithm with data shaped by their own criteria, the AI will reproduce the same flaws. In working to achieve Responsible AI, companies need to examine their choices when it comes to modeling and validation techniques and to be conscious that each step in the creation of an AI is likely to introduce bias and drift.
There are a few different ways to explore biases in datasets before beginning model development. One is Exploratory Data Analysis (EDA), which, in terms of Responsible AI, can be a good way to narrow your search for underlying biases in data. EDA may help teams to better understand and summarize their data samples and to come to more concrete conclusions about the underlying population a dataset represents. It can also help to visualize the structure of a dataset.
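As a rough illustration of what such an exploratory check might look like, the sketch below summarizes, for each group in a dataset, its share of the rows and its rate of positive outcomes. It is a minimal example under stated assumptions: the function name `group_summary`, the field names `gender` and `hired`, and the toy records are all hypothetical and do not come from any real dataset.

```python
from collections import Counter, defaultdict


def group_summary(records, group_key, outcome_key):
    """Return {group: (share_of_rows, positive_outcome_rate)}.

    `records` is a list of dicts; `outcome_key` must map to a truthy
    value for a positive outcome (for example, a successful application).
    """
    counts = Counter(r[group_key] for r in records)
    positives = defaultdict(int)
    for r in records:
        if r[outcome_key]:
            positives[r[group_key]] += 1
    total = len(records)
    return {
        group: (n / total, positives[group] / n)
        for group, n in counts.items()
    }


# Hypothetical toy data, for illustration only.
applications = [
    {"gender": "male", "hired": True},
    {"gender": "male", "hired": True},
    {"gender": "male", "hired": False},
    {"gender": "female", "hired": False},
]

summary = group_summary(applications, "gender", "hired")
for group, (share, rate) in summary.items():
    print(f"{group}: {share:.0%} of rows, {rate:.0%} positive outcomes")
```

A large gap in either number between groups does not prove bias on its own, but it flags where a team should look more closely before training a model on the data.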
Data can also be explored with datasheets in order to outline issues in data and to acknowledge any gaps that exist. Creating a datasheet requires the creator to document answers to questions about the dataset's motivation, composition, collection, labelling, pre-processing, intended use, distribution and maintenance. This helps creators to carefully consider how the data was collected, lending transparency to possible sources of bias.
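One lightweight way to keep a datasheet next to the data is a small structured record with one field per question area listed above. The sketch below is only an assumption about how this could be organized: the `Datasheet` class, its `unanswered` helper, and the example answers are hypothetical, and a real datasheet would contain far more detailed questions under each heading.

```python
from dataclasses import dataclass, fields


@dataclass
class Datasheet:
    """One free-text answer per question area; empty means not yet documented."""
    motivation: str = ""
    composition: str = ""
    collection: str = ""
    labelling: str = ""
    preprocessing: str = ""
    intended_use: str = ""
    distribution: str = ""
    maintenance: str = ""

    def unanswered(self):
        """Return the names of sections that are still blank."""
        return [
            f.name for f in fields(self)
            if not getattr(self, f.name).strip()
        ]


# Hypothetical example: only two sections have been filled in so far.
sheet = Datasheet(
    motivation="Train a screening model for job applications.",
    collection="Exported from a hypothetical applicant tracking system.",
)

gaps = sheet.unanswered()
print("Sections still to document:", gaps)
```

Listing the blank sections explicitly makes the gaps in documentation visible, which is the point of the exercise: an unanswered question about collection or labelling is often where an unexamined source of bias sits.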
Shifting to the Responsible AI mindset
Today, a significant portion of companies that have embarked on data science have done so semi-organically. It's not uncommon to see several or even a dozen different teams developing AI within a large group, each using different technologies and data. Enterprise-wide governance can make it possible to track ongoing projects, and to visualize what data is being used and how it is being used, but it’s important to remember that companies need to create a culture and mindset that values Responsible AI across the board. Accountability is all about making AI human-centered and getting the right people in the room to discuss how models will do what they are meant to do.
There is no silver bullet when it comes to Responsible AI, and the work towards responsible systems needs constant evaluation. However, if companies make a dedicated effort to articulate their ethics, establish governance and empowerment, and test their data, they will be in the best possible position to integrate Responsible AI across the enterprise.
Triveni Gandhi is Senior Industry Data Scientist, Life Sciences and Responsible AI, Dataiku