Healthy data during the COVID-19 pandemic

The COVID-19 pandemic has dominated our lives for more than two years. The battle against the coronavirus has forced us to change the way we work, have fun, educate our children and interact with each other. The justification for these changes is often found in 'the numbers'. But where does this data come from? What exactly is it used for? Which safeguards are in place? And how are different stakeholders and citizens involved in this health data reuse? Choices at the individual, societal and stakeholder levels influence these COVID-19 numbers. The goal of the Healthy Data e-consultation is to give citizens a voice on these issues. Your opinion will inform recommendations for the future European Health Data Space.

Surge capacity is the ability of a health system to manage a sudden and unexpected influx of patients in a disaster or emergency situation. Creating surge capacity involves a comprehensive approach linking the four S's of surge capacity: space (or structure), staff, supplies and systems. (World Health Organization. Strengthening the health systems response to COVID-19. Creating surge capacity for acute and intensive care. Copenhagen: WHO Regional Office for Europe; 2020)

When a person receives a positive PCR test result, they may need to isolate. They may receive treatment, and they can obtain proof of past infection. People get tested to be diagnosed, but the resulting data continues to serve other purposes long after the virus has run its course. This second life of the data has important societal implications and allows us to develop policies based on evidence.

Establishing links between different sources of COVID-19 data is essential to inform policies, develop treatments and prevent the spread of the virus as much as possible. Since the beginning of the pandemic, more than 420 million cases have been confirmed worldwide. To understand each of these cases, one needs to know whether the person was vaccinated (when, how often, which vaccine, ...), hospitalised (duration, type of treatment, in the ICU or not, ...), returning from abroad, in contact with other people, infected with which variant of the virus, etc. The links between these different types of data need to be made at the individual level. This means that sensitive, personal health information is being shared. Learn more about how this data is kept safe here.

More about...


Individual choices


Individual choices are always the starting point of any health data story: data is created only when a person decides to interact with the healthcare system. However, PCR test results are anonymously and automatically shared, leading to a second life for the data beyond the scope of the individual from whom they originate. This raises the question: should the individual behind the data be directly involved in the reuse of this data? As it stands, everyone's data is used to develop better vaccines, including data from people who do not support vaccination. Likewise, data from individuals who are against lockdowns is used to decide whether or not to enact them. This is the case because everyone's data is grouped together and reused by design, with strict safeguards in place. This ensures robust, timely and complete datasets. However, there are different ways in which citizens could be involved more actively.


What do you believe the role of citizens should be in health data reuse? 

Have your say here!


Societal choices


If individual choices have only a limited impact on how COVID-19 data is governed, what does the framework for health data reuse look like? In short: personal data is protected by specific legislation, the General Data Protection Regulation (GDPR); links between datasets containing personal data are closely managed and under strict review (by Data Protection Authorities, housing a multidisciplinary commission); and anonymised data can be freely used and shared because it cannot be traced back to an individual.

For every type of data, every different purpose and all new links, several considerations need to be made:

  • What can the data be used for? Every health data reuse needs to serve a specific purpose that has to be clearly defined. 
  • Who can access the data and under which conditions? Open access will promote more collaboration, but might require strict safeguards that limit the usefulness of certain datasets.
  • Which variables are used? Using more variables (e.g. age, gender, location, disease, treatment, ...) increases the amount of sensitive information, which implies more risk and requires more safeguards. However, the more variables are included, the more useful a dataset becomes.
  • How will the data be stored? New standards are being developed to increase findability, accessibility, interoperability and reusability (the FAIR principles). Read more about how health data infrastructure can help in the battle against COVID-19 here.


Under which conditions could your health data be reused?

Have your say here!


Stakeholder choices


All actors in the health data reuse environment make decisions within the framework that is developed at the societal level. They decide which purposes to pursue, which specific safeguards to employ, which collaborations to establish, etc. For example, the WHO collects data about COVID-19 worldwide and makes it accessible to anyone, in aggregated form, under an open data licence here. Some public health institutes (e.g. Sciensano) describe in great detail which data they are using and why.


Who do you trust with your health data? Why?

Have your say here!