The largest data repository in the world.

Google has released Dataset Search, a free tool for searching 25 million publicly available datasets.

The most common topics covered by the datasets are geosciences, biology, and agriculture. A majority of the world's governments publish their data and describe it with schema.org. The United States leads in the number of open government datasets available, with more than 2 million.

The search tool includes filters that limit results by license (free or paid), format (CSV, images, etc.), and update time. Each result also includes a description of the dataset’s contents as well as author citations.

Google’s approach to aggregating datasets differs from that of other dataset repositories, such as Amazon’s open data registry. Unlike repositories that curate and host the datasets themselves, Google doesn’t curate or provide access to the 25 million datasets directly. Instead, Google relies on dataset publishers to describe their datasets’ metadata using the open standards of schema.org. Google then indexes that metadata and makes it searchable across publishers.
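To make the publisher side of this concrete, here is a minimal sketch of what describing a dataset with schema.org might look like. It builds a JSON-LD description using the schema.org Dataset and DataDownload types and wraps it in the script tag a publisher would embed in the dataset's web page. The helper names (build_dataset_metadata, to_script_tag), the dataset itself, and the example.org URLs are all hypothetical placeholders, not taken from Google's documentation.

```python
import json


def build_dataset_metadata(name, description, url, license_url,
                           download_url, encoding_format):
    """Build a schema.org Dataset description as a JSON-LD dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        # Each distribution entry describes one downloadable form of the data.
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": encoding_format,
                "contentUrl": download_url,
            }
        ],
    }


def to_script_tag(metadata):
    """Wrap the JSON-LD in a script tag for embedding in an HTML page."""
    body = json.dumps(metadata, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical dataset; every value below is a placeholder.
example = build_dataset_metadata(
    name="Regional rainfall measurements",
    description="Monthly rainfall totals by region.",
    url="https://example.org/datasets/rainfall",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    download_url="https://example.org/datasets/rainfall.csv",
    encoding_format="CSV",
)

print(to_script_tag(example))
```

The license and encodingFormat fields are the kind of metadata that would let a search tool filter results by license and format, as described above.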


Justin Flitter

Founder of NewZealand.AI.

http://unrivaled.co.nz