The tricky task of using data to tackle Covid-19

(Image credit: Geralt / Pixabay)

Health workers continue to battle Covid-19 in hospitals, care homes and on the streets, while emergency services react as best they can in the circumstances. In order to protect these heroes, governments are relying on science to guide their plans and responses. But taking direction from data isn’t as simple as it sounds.

Analysing data to a high standard is crucial to interpreting it and gleaning insights that can help stem the flow of the pandemic. Our way out of this crisis hinges on the tricky task of collecting data, using it to understand more about how the virus spreads, and harnessing those insights to prevent or limit the effects of future pandemics.

Gathering the data and putting it to use

Collaboration and rapid information sharing are essential to have the best chance of predicting, preventing, responding to and recovering from infectious disease outbreaks. Public health and scientific data must be shared freely and rapidly with stakeholders and key decision makers so they can act. Events like the Covid-19 pandemic require public and private sectors to work closely together and share data to limit disease spread and save lives.

While there is no centralised data collection and sharing initiative on a global scale, there are many open source data sets and models online that are being used around the world to share and analyse data. The more data people have about case counts, incidence and mortality rates, how a disease spreads and how contagious it is, the better decisions they can make to limit, prevent and treat the disease.

Governments hold much of the critical data needed to understand current conditions during an outbreak, but analytics makes it possible to combine this data with non-health data (such as social indicators) and non-governmental data, and to draw more insight from the unified data set. Analytics can provide insights about the spread of a disease and the effectiveness of public health action, which can improve the response.

Certainly, both private and public companies must uphold data privacy laws. There are valuable projects, however, which do not require exposure of private data. For example, anonymised communications data can be used to quantify the rate of travel between municipalities within a region, which may be an indicator of future risk. On the other hand, if you personally had been exposed in a public place or while travelling, would you appreciate a tip suggesting that you quarantine rather than risk infecting others? This is a balance that health officials and policy leaders need to manage together.
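To make the travel example concrete, here is a minimal sketch of how anonymised trip records might be aggregated into travel counts between municipalities. The record format, the municipality labels and the `min_count` suppression threshold are all assumptions made for illustration; they do not come from any specific data set or privacy standard.

```python
from collections import Counter


def travel_counts(trips, min_count=5):
    """Count trips per (origin, destination) pair.

    `trips` is an iterable of (origin, destination) tuples that carry
    no personal identifiers. Pairs seen fewer than `min_count` times
    are dropped so that very small groups are not exposed
    (an illustrative threshold, not a legal standard).
    """
    # Ignore records that start and end in the same municipality.
    counts = Counter((o, d) for o, d in trips if o != d)
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Made-up records for three hypothetical municipalities A, B and C.
trips = [("A", "B")] * 12 + [("B", "A")] * 9 + [("A", "C")] * 2 + [("C", "C")] * 30
flows = travel_counts(trips)
print(flows)
```

Only the aggregated counts leave the function, which is the point of the design: the analysis works on flows between places rather than on the movements of any individual.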

Predicting the effects with intelligent machines

Predicting the spread of illness and human risk requires quick public health and scientific study and the ability to rapidly share information with stakeholders so that action can be taken. That said, because of the dynamic nature of disease spread, particularly for new, previously unseen viruses, as well as the unknown impacts of potential future government and public health interventions, complete precision in epidemic modelling is usually impossible. There is always uncertainty. 
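One way to see why precision is out of reach is to run a simple model under several plausible assumptions and compare the results. The sketch below uses a basic discrete-time SIR (susceptible, infected, recovered) model, which is a textbook simplification and not the model used by any particular health agency. The population size, recovery rate and the three transmission rates are illustrative values, not estimates for Covid-19.

```python
def simulate_sir(beta, gamma, population, initial_infected, days):
    """Run a daily-step SIR model and return (peak_infected, total_recovered).

    beta  : transmission rate per day (assumed value)
    gamma : recovery rate per day (assumed value)
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    peak = i
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak, r


# Three scenarios that differ only in the assumed transmission rate.
results = {}
for beta in (0.2, 0.3, 0.4):
    results[beta] = simulate_sir(
        beta=beta, gamma=0.1, population=1_000_000,
        initial_infected=10, days=365,
    )

for beta, (peak, recovered) in results.items():
    print(f"beta={beta}: peak infected ~{peak:,.0f}, ever infected ~{recovered:,.0f}")
```

Even in this toy setting, a modest change in one assumed parameter shifts the projected peak considerably, which is why model output is better read as a range of scenarios than as a single forecast.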

Artificial intelligence (AI) and machine learning (ML) can help to automate data analysis, identify patterns and build models of risk factors to aid in scenario analysis of infection transmission. ML especially excels at seeing connections and correlations that humans would not find or observe. To increase the accuracy and precision of ML, diverse information sources are combined into analytical data sets, e.g. official incidence records, clinical emergency data, physicians’ records, social media, flight records, school absences and sales data for anti-fever medication.

The goal of any epidemiologic model built upon data analysis isn’t necessarily to get the predictions exactly right, but rather to help provide insights about the epidemic that can facilitate effective, rapid decision making for public health officials and policymakers. It is important to use great care when assessing predicted future spread based on historical information. Some of the most advanced computational methods, applied by some of the smartest scientists in the world, still get these predictions wrong – often by large margins.

Is it possible to stop the next outbreak using data?

Once a disease outbreak is contained or has ended, governments and global health organisations must make decisions about how to best prevent or limit similar outbreaks in the future. Data scientists can certainly learn from and draw upon data analysis and epidemiological modelling from previous disease outbreaks with the understanding that the virology of each disease is unique.

Advanced analytics can help to detect early signals that point to a possible new epidemic. With these sophisticated techniques, such signals can often be found weeks before officials raise the alarm, and this can help limit the spread of the virus. Detecting them requires special analytical techniques that can find rare but meaningful events, such as a spike in school absenteeism in a certain region or state. Each outbreak requires a combination of epidemiological, clinical and AI skillsets to adapt to the infectious agent or virus under study.
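As an illustration of what finding a rare but meaningful event can look like, the sketch below flags days on which school absences rise far above the recent baseline. It uses a simple trailing-window z-score, which is one basic approach among many and is not presented as the method used in real surveillance systems. The window length, the threshold and the absence counts are made up for the example.

```python
from statistics import mean, stdev


def flag_spikes(counts, window=7, threshold=3.0):
    """Return the indices of counts that exceed the mean of the
    preceding `window` values by more than `threshold` standard
    deviations."""
    flagged = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any increase at all counts as a spike.
            if counts[i] > mu:
                flagged.append(i)
            continue
        if (counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily absence counts for one region.
absences = [20, 22, 19, 21, 23, 20, 22, 21, 20, 48, 52, 23]
spike_days = flag_spikes(absences)
print(spike_days)
```

Note that once a spike enters the trailing window it inflates the baseline, so the following day can go unflagged even though absences remain high. Practical systems typically handle this with more robust baselines, which is part of why the specialist skills mentioned above are needed.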

There are opportunities to use data to help even before an outbreak happens. As the human population grows and interacts with animal populations in new ways, and with a wider range of species in many places around the world, there are more opportunities for viruses that start out in animals to make the “jump” into human populations and spread. We’ve seen this frequently in recent years, from the SARS and MERS coronaviruses, to new forms of the flu, and even in the Ebola crisis in west Africa several years ago.

Joining forces to find solutions

Combatting the spread of a virus requires a vast amount of time, resources and expertise. Without the right technologies, it is nigh on impossible to prevent the spread effectively. However, strides are being made towards assembling a taskforce so effective that global pandemics of this scale could one day become a thing of the past. Scientists are coming together to deploy the latest AI techniques in this current pandemic. These technologies manage greater volumes of data than ever before, find more elusive patterns and make more insightful predictions. Datasets, too, are coming together; though no dataset matches Covid-19’s spread exactly, data from historic epidemics, studies of animals and human populations, and information on how different societies function are all feeding into the analysis.

The insights they produce will influence how governments communicate with the public now, and in the future. They will shape our lifestyles for good as we work towards ensuring these types of pandemics never happen again.

Mark Lambrecht, PhD, Global Director for Health and Life Sciences, SAS

Dr. Mark Lambrecht, Director of the Global Health and Life Sciences Practice at SAS, joined SAS in 2005 and leads a senior team working for SAS’ healthcare and life sciences industry and organisations.