In an effort to help fight the spread of the novel coronavirus, which is projected to infect millions of people in the U.S. alone, Google today launched the COVID-19 Public Datasets program, which will host a repository of public data sets that relate to the crisis and make them free to access and analyze. The idea is to remove barriers and to provide researchers access to critical information quickly and easily, eliminating the need to search for and onboard large data files.

The corpora within the COVID-19 Public Datasets program include the Johns Hopkins Center for Systems Science and Engineering (JHU CSSE) data set, Global Health Data from the World Bank, and OpenStreetMap data, all of which are stored for free on Google Cloud. (Google says it’ll reach out to organizations whose data sets are pre-selected for inclusion in the program.) The data sets have a “COVID-19” label, a description, and several sample queries, and they’re searchable from the Google Cloud Console Marketplace and from the BigQuery UI with the tag “freebqcovid.”

Researchers can train machine learning models on the COVID-19 data sets with BigQuery ML, a Google service for creating and executing models in BigQuery (a fully managed data warehouse) using SQL queries. Queries against the program's data sets are free, and they'll remain free until September 15. But Google notes that if any of the data sets are joined with non-COVID-19 data sets, the bytes processed will be counted against the free tier (BigQuery Sandbox, which has monthly limits of 10GB for storage and 1TB for queries) and then charged accordingly, in order to prevent abuse.
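The billing rule described above can be sketched in a few lines of Python. This is an illustrative model, not Google's actual metering logic: the names `Usage` and `record_query` are hypothetical, 1TB is assumed to mean 2**40 bytes, and the sketch assumes that all of a joined query's processed bytes count toward the free tier.

```python
from dataclasses import dataclass

TB = 2 ** 40  # assumption: 1TB treated as 2**40 bytes
FREE_QUERY_BYTES_PER_MONTH = 1 * TB  # monthly query limit cited in the article


@dataclass(frozen=True)
class Usage:
    """Hypothetical running totals for one month."""
    free_tier_bytes_used: int = 0
    billable_bytes: int = 0


def record_query(usage: Usage, bytes_processed: int, joins_non_covid_data: bool) -> Usage:
    """Return updated totals after one query (a simplified model of the rule)."""
    if not joins_non_covid_data:
        # Queries that touch only the program's data sets are free.
        return usage
    # Joined queries draw down the free tier first, then become billable.
    remaining = max(0, FREE_QUERY_BYTES_PER_MONTH - usage.free_tier_bytes_used)
    covered = min(bytes_processed, remaining)
    return Usage(
        free_tier_bytes_used=usage.free_tier_bytes_used + covered,
        billable_bytes=usage.billable_bytes + (bytes_processed - covered),
    )


if __name__ == "__main__":
    usage = Usage()
    usage = record_query(usage, 5 * TB, joins_non_covid_data=False)
    usage = record_query(usage, TB // 2, joins_non_covid_data=True)
    usage = record_query(usage, TB, joins_non_covid_data=True)
    print(usage)
```

In this model, a query of any size against only the COVID-19 data sets leaves both totals unchanged, while joined queries use up the 1TB monthly allowance before any bytes become billable.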

“The contents of these datasets are provided to the public strictly for educational and research purposes only, [but] we on the Google Cloud team sincerely hope that the COVID-19 Public Dataset Program will enable better and faster research to combat the spread of this disease,” wrote BigQuery product manager and GIS lead Chad W. Jennings and developer advocate Shane Glass in a blog post.

The debut of the COVID-19 Public Datasets program follows Google’s many other coronavirus mitigation efforts, which are ongoing. The company donated $800 million in ads and loans to organizations fighting the virus, added a Google Assistant shortcut for coronavirus tips, and partnered with Microsoft and Palantir to build a dashboard for the U.K.’s National Health Service. Separately, Google launched a dedicated page and search portal to collate resources about COVID-19, and the tech giant’s parent company, Alphabet, ramped up a screening program within the Bay Area.

