Capitalize On Artificial Intelligence To Combat COVID-19

  • Over the last few months, the world has been experiencing COVID-19 outbreaks that generally follow a similar pathway: an initial phase with few infections and a limited response, followed by a take-off of the epidemic curve and a national lockdown to flatten it. Amidst all this, governments across the world are grappling with the question of when and how to manage de-confinement.
  • Throughout the pandemic, great emphasis has been placed on the sharing (or lack of it) of critical information across countries — in particular from China — about the spread of the disease. However, relatively little has been said about how COVID-19 could have been better managed by leveraging the advanced data technologies that have transformed businesses over the past 20 years. Now it’s time for governments to leverage those technologies in managing this pandemic.

Utilising the Power of Personalised Prediction Models

  • Alternative Approach: The time has come for policymakers to adopt an alternative approach to battling COVID-19, based on personalized prediction using Artificial Intelligence (AI), a technology that has transformed many industries over the years. Using machine learning and AI, data-driven firms (from “Big Tech” to financial services, travel, insurance, retail, and media) make personalized recommendations for what to buy and personalize pricing, risk assessment, credit, and more, using the data they have amassed about their customers. For instance, companies like Netflix evaluate consumers’ past choices and characteristics to predict what they will watch next. The same approach could work for pandemics too.
  • Effective Risk Analysis: Using multiple sources of data, machine-learning models could be trained to measure an individual’s clinical risk of suffering severe outcomes if infected with COVID-19: what is the probability that they will need intensive care, for which resources are limited? How likely is it that they will die? The data could include individuals’ basic medical histories (for COVID-19, the severity of symptoms seems to increase with age and with the presence of co-morbidities such as diabetes or hypertension) as well as other data, such as household composition. For example, a young, healthy individual (who might otherwise be classified as “low risk”) could be classified as “high risk” if he or she lives with old or infirm people who would likely need intensive care should they get infected.
  • Customising Policies for Resource Allocation: These clinical risk predictions could then be used to customize policies and resource allocation at the individual/household level, appropriately accounting for standard medical liabilities and risks. It could, for instance, enable us to target social distancing and protection for those with high clinical risk scores, while allowing those with low scores to live more or less normally. The criteria for assigning individuals to high or low risk groups would, of course, need to be determined, also considering available resources, medical liability risks, and other risk trade-offs, but the data science approaches for this are standard and used in numerous applications.
  • A personalized approach has multiple benefits. It may help build herd immunity faster and with lower mortality. It would also allow better and fairer allocation of resources, for example scarce medical equipment (such as test kits, protective masks, and hospital beds).
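
The household example above can be made concrete with a minimal sketch. It assumes each person already has a clinical risk score between 0 and 1 from some trained model; the `Person` type, the function names, and the 0.5 threshold are hypothetical choices for illustration, not part of any real system. Each person's effective risk is taken to be the highest clinical risk found in their household, which is one simple way (among many) to capture the idea that a healthy young person living with a frail relative should be treated as high risk.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Person:
    person_id: str
    household_id: str
    clinical_risk: float  # hypothetical model output in [0, 1]


def household_adjusted_risk(people: List[Person]) -> Dict[str, float]:
    """Give each person the highest clinical risk found in their household."""
    household_max: Dict[str, float] = {}
    for p in people:
        current = household_max.get(p.household_id, 0.0)
        household_max[p.household_id] = max(current, p.clinical_risk)
    return {p.person_id: household_max[p.household_id] for p in people}


def classify(effective_risk: Dict[str, float], threshold: float = 0.5) -> Dict[str, str]:
    """Label each person 'high' or 'low' risk; the threshold is an assumption."""
    return {
        pid: ("high" if risk >= threshold else "low")
        for pid, risk in effective_risk.items()
    }


# Illustrative, made-up scores: a young adult living with an elderly relative,
# and a young adult living alone.
people = [
    Person("young_with_elder", "house_1", 0.05),
    Person("elder", "house_1", 0.80),
    Person("young_alone", "house_2", 0.05),
]
labels = classify(household_adjusted_risk(people))
```

Here the two young adults have the same individual score but end up in different groups, because only one of them shares a household with a high-risk person.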

Delineating De-confinement Strategies

  • Easy Classification: De-confinement strategies at later stages of the pandemic, a key next step, can benefit from AI technology. Deciding which people should start the de-confinement process is, by nature, a classification problem. Some governments are already approaching de-confinement by using age as a proxy for risk, a relatively crude classification that potentially misses other high-risk individuals (such as the aforementioned example of healthy young people living with the elderly). Performing classification based on data and AI prediction models could lead to de-confinement decisions that are safe at the community level and far less costly for the individual and the economy.
  • Safe De-confinement: A key feature of COVID-19 is that it has an exceptionally high transmission rate but relatively low rates of severe symptoms and mortality. Data indicate that possibly more than 90% of infected people are either asymptomatic or experience only mild symptoms. In theory, with a reliable prediction of who these 90% are, we could de-confine all of them. Even if they were to infect each other, they would not have severe symptoms and would not overwhelm the medical system or die. These 90% low-clinical-risk de-confined people would also help build up herd immunity rapidly, at which point the remaining 10% could also be de-confined.
  • Limited Fallout: If a prediction score were to prove wrong, the consequences would be limited to the “safest” individuals who were first released from confinement. They could be managed with available medical resources, which would not be overtaxed by treating the remaining 10% or more high-risk people who remained confined. In practice, of course, de-confinement should be introduced more gradually, starting from the lowest clinical risk groups first and building up herd immunity over time.
  • Scalability: Since no perfect clinical risk prediction model exists, we need to make the models as robust as we can. But that does not mean we should not consider using them. Unlike medical tests, which are scarce, expensive, and slow to deploy, this clinical data-driven digital personalization approach can be applied quickly and is easy to scale. It could enable, with the right models, safer de-confinement at a much faster rate than current test-track-isolate best practices for COVID-19, under which anyone infected and their contacts would remain in confinement, even if they are at low risk of suffering serious symptoms.
  • Harnessing Data: At present, the data required for assessing an individual’s clinical risk from contracting a given virus are not easily accessible. Governments can certainly ramp up national health data gathering by creating or rolling out more comprehensive electronic medical records, but the value of these may be limited as it would take time for patterns to emerge between the historical data in medical records and the impact of a virus on its victims.
  • Need for Shared Prediction Model: In the context of a pandemic that could rapidly affect millions globally, a better approach might be to create and share a prediction model that is “trained” using the data from an initial outbreak. A dataset with tens of thousands of seriously affected individuals (those requiring an ICU), balanced with many more less affected ones (those exhibiting mild symptoms), is large enough to enable some level of personalized prediction, the quality of which improves as more data is added.
  • Common Data Standard: Once a model is up and running, it can be shared with other countries in the early stages of the spread, because the basic underlying biological and physiological data in people’s medical records do not vary much (everyone grows old, and diabetes in Wuhan is the same as diabetes in New Delhi). If a virus strikes two countries whose populations resemble each other, the outcomes are likely to be similar. Given this, the two countries could use the exact same prediction model without having to share the actual medical records that went into training the model. Of course, data patterns across countries may vary due to, say, demographics (Japan has more old people than India) and cultural or lifestyle differences (Indian grandparents may be more involved in child care than American ones), but data analysts can rework the model to accommodate these variations if the data were collected according to a commonly developed standard or protocol.
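
The gradual de-confinement described above, starting with the lowest clinical risk groups, amounts to sorting people by predicted risk and releasing them in successive waves. The sketch below is one possible way to express that; the function name `release_waves`, the equal-sized waves, and the sample scores are all assumptions made for illustration. In practice the size and timing of each wave would depend on available medical resources and on how well the predictions hold up.

```python
from typing import Dict, List


def release_waves(risk: Dict[str, float], n_waves: int) -> List[List[str]]:
    """Split people into n_waves groups of near-equal size, lowest risk first."""
    if n_waves < 1:
        raise ValueError("n_waves must be at least 1")
    # Sort by risk score; ties are broken by identifier so the order is stable.
    ordered = sorted(risk, key=lambda pid: (risk[pid], pid))
    size, extra = divmod(len(ordered), n_waves)
    waves: List[List[str]] = []
    start = 0
    for i in range(n_waves):
        # Earlier waves absorb the remainder, one extra person each.
        end = start + size + (1 if i < extra else 0)
        waves.append(ordered[start:end])
        start = end
    return waves


# Made-up risk scores for five people, released in two waves.
scores = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.2, "e": 0.7}
waves = release_waves(scores, n_waves=2)
```

With these sample scores, the first wave contains the three lowest-risk people and the second wave the two highest-risk ones.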

Challenge of Privacy

Implementing these technological innovations, however, will require policy changes. Existing policies covering data privacy and cybersecurity, and their differing interpretations across countries, will largely prohibit this kind of personalized pandemic management approach. This is largely because current policies do not differentiate between the input data (used to train a model), the prediction models themselves, and the “output data” (predictions from the trained model). When a policy, implicitly or explicitly, prohibits data sharing or requires data to be stored on servers within a country, it covers anything that can be legally interpreted as data, including models and their parameters. It is therefore important that policymakers consider distinguishing between the sharing of models and the sharing of data.
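
To illustrate the difference between sharing a model and sharing data, here is a minimal sketch in which one country exports only the parameters of a simple logistic model and another country applies them to its own local records. Everything here is hypothetical: the feature names, the weights, and the intercept are made-up values chosen for illustration, not clinical estimates, and a real model would be far richer. The point is only that the exported payload contains a handful of numbers and no patient records.

```python
import json
import math
from typing import Dict


def export_model(weights: Dict[str, float], intercept: float) -> str:
    """Serialize model parameters only; no training records are included."""
    return json.dumps({"intercept": intercept, "weights": weights}, sort_keys=True)


def predict_risk(model_json: str, record: Dict[str, float]) -> float:
    """Apply a shared logistic model to one local record."""
    model = json.loads(model_json)
    # Features missing from the record are treated as 0.0.
    z = model["intercept"] + sum(
        weight * record.get(name, 0.0)
        for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Country A exports hypothetical parameters (illustrative values only).
shared = export_model({"age_decades": 0.5, "diabetes": 1.0}, intercept=-4.0)

# Country B scores its own records locally, without sending them anywhere.
older_diabetic = predict_risk(shared, {"age_decades": 8, "diabetes": 1})
young_healthy = predict_risk(shared, {"age_decades": 2, "diabetes": 0})
```

A shared feature vocabulary (here, `age_decades` and `diabetes`) plays the role of a common data standard: the receiving side can only use the model if its records are coded the same way.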


National governments urgently need to agree on a protocol for determining when data could be shared. For example, a declaration by the WHO or UN that a particular outbreak qualifies as a pandemic could serve as a trigger to suspend normal privacy laws and allow the sharing of anonymized data. In fact, during times like these, many people might be willing to provide their data on an exceptional and temporary basis, through appropriate and secure channels, for training models that can guide policy decisions with major life and economic consequences. If this materialises, there is a great chance that modern data science and AI could mitigate the fallout from this pandemic and prepare us to limit the impact of the next one.