COVID-19 Data Forum

Next event: To be announced.

About the Forum

The COVID-19 Data Forum brings together experts working to collect and curate data needed to drive scientific research and formulate effective public health responses to the pandemic.

The COVID-19 pandemic is challenging science and society to an unprecedented degree. Human lives and the future of our society are at stake. Containing the virus and "flattening the curve" of the pandemic depend on mounting a strong, coordinated, scientifically informed public health response, which in turn depends on having complete and accurate data from multiple sources.

The COVID-19 Data Forum is an ongoing series of multidisciplinary webinars and online meetings for topic experts to discuss data-related aspects of the scientific response to the pandemic. The Forum is a joint project of the R Consortium and the Stanford Data Science Institute. We host recurring topical webinars that are free and open to the public.

The Forum places particular emphasis on being open to all relevant interested groups and including a wide range of expertise. With respect to computing, the Forum considers all useful tools, languages and environments. We hope that the COVID-19 Data Forum discussions can usefully proceed through three stages of questions:

  • Where are we now, with respect to resources and needs?
  • What immediate steps (e.g. in terms of sharing) would make improvements?
  • What potential projects for new tools, standards, or data models might be worth undertaking?

At all stages, there are many specific topics that need discussion. To sort them out, it helps to group them into three kinds of activities: obtaining the data, using the data, and communicating about the data.

Obtaining Data. COVID-19 data challenges begin with just acquiring data of the range and quality needed. A very wide range of data is needed, in three dimensions: geography, time, and domain. Depending on the purpose, data may be needed either at very specific local levels or at the widest global level. Both are challenging: finding reliable local sources and resolving hugely variable international ones, for example. Particularly on the global (or even national) scale, variable quality will often be a challenge. Timeliness of the data is clearly essential, particularly as public health regulations and other societal responses change. But scientific models and analysis may also need data over a long time span. The pandemic has touched our lives in many ways: directly in our health but also in nearly all aspects of our economy and society. As the world responds, data science will need to consider all these aspects, requiring data from the microscopic level of the virus to the population data for epidemiology, social science, and economics.

Using Data. The response to the COVID-19 pandemic from the scientific community continues to generate crucial data-based results. Epidemiologists, public health experts, data scientists, and other researchers have produced a large number of predictive models, interactive resource allocation applications, and disease tracking dashboards. Moving ahead, it will be important to have easy, consistent access to the best data for all these efforts. Co-operation and co-ordination among the teams involved can enhance the scope of these efforts and help ensure that model results and comparisons use consistent, well-defined data sources.

Communicating Data. A key goal of the Data Forum is to improve communication between decision-makers (in public health, government, and elsewhere) and the data science and general research community. Many tools have been developed for visualizing and interacting with data. It's important to understand how these can be used and enhanced for the decision-making community. We look forward to participation in our meetings by interested members of this community. Another important goal is to improve the information flow to the broader community, with emphasis on giving insight and avoiding misdirection.

All events hosted by the Forum adhere to Stanford Data Science's Code of Conduct policy.


Past events

We record our events and make them available for replay shortly after they conclude.

Beyond case counts: Making COVID-19 clinical data available and useful

August 13, 2020

9:00 AM San Francisco | 16:00 UTC

The second public COVID-19 Data Forum event, held on August 13, looked beyond the COVID-19 case count data so ubiquitous in public media coverage and focused on the challenges faced in making COVID-19 clinical data available and useful to physicians, scientists and public health officials. The event featured four expert speakers working on different aspects of clinical data: Dr. Jenna Reps of OHDSI, Dr. Andrea Ganna of the COVID-19 Host Genetics Initiative, Dr. Ken Massey of Soma Technologies, and Dr. Roni Rosenfeld, CMU professor of computer science. Dr. Sherri Rose, Stanford Associate Professor of Health Policy, moderated the event and guided the open forum discussion. Over one hundred and twenty-five people attended the event.

Introducing the COVID-19 Data Forum

Thursday, May 14, 2020

19:00 UTC | 12pm PDT | 3pm EDT | 8pm London | 9pm Paris | 3am Beijing

The opening event of the COVID-19 Data Forum was held on May 14, 2020, and attracted several hundred attendees for a lively discussion of the current state of COVID-19 data and the challenges researchers face.

Organizing Committee

- Chair, R Consortium Board of Directors
- Stanford Department of Statistics and Stanford Data Science
- Assistant Professor, Department of Biostatistics, Yale University
- Senior Research Scientist, Department of Biomedical Data Sciences, Stanford University
- Executive Director, Stanford Data Science
- John Harvard Distinguished Science Fellow, Harvard University


The COVID-19 Data Forum is jointly sponsored by the R Consortium and the Stanford Data Science Institute.