Google today announced that Dataset Search, a service that lets you search for close to 25 million different publicly available datasets, is now out of beta. Dataset Search first launched in September 2018.
Researchers can use these datasets to check their hypotheses or to train and test their machine learning models. The datasets range from pretty small ones, like a count of how many cats there were in the Netherlands from 2010 to 2018, to large annotated audio and image sets. The tool currently indexes about 6 million tables.
With this release, Dataset Search is getting a mobile version and a few new features. The first of these is a new filter that lets you choose which type of dataset you want to see (tables, images, text, etc.), which makes it easier to find the data you’re looking for. In addition, the company has added more information about the datasets and the organizations that publish them.
A lot of the data in the search index comes from government agencies. In total, Google says, there are about 2 million U.S. government datasets in the index right now. But you’ll also regularly see Google’s own Kaggle show up, along with a number of other public and private organizations that make public data available.
As Google notes, anybody who owns an interesting dataset can make it available for indexing by describing the data in more detail with standard schema.org markup.
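To give a rough idea of what such markup can look like, here is a minimal Python sketch that builds a schema.org `Dataset` description as JSON-LD and wraps it in a script tag for a dataset's landing page. The property names (`name`, `description`, `url`, `license`, `creator`, `distribution`) come from the schema.org vocabulary, but the helper functions, the example.org URLs, the publisher name and the cat dataset details are all made up for illustration. Which properties Google actually requires is not covered here, so treat this as a starting point rather than a complete recipe.

```python
import json


def build_dataset_markup(name, description, url, license_url,
                         publisher, download_url, encoding_format):
    """Return a dict describing a dataset with schema.org vocabulary.

    Hypothetical helper for illustration; only the keys inside the
    returned dict are schema.org terms.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "creator": {"@type": "Organization", "name": publisher},
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": encoding_format,
                "contentUrl": download_url,
            }
        ],
    }


def to_script_tag(markup):
    """Serialize the markup as a JSON-LD script tag for an HTML page."""
    body = json.dumps(markup, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# All values below are placeholders, not a real dataset.
markup = build_dataset_markup(
    name="Cats in the Netherlands, 2010-2018",
    description="Hypothetical yearly counts of cats in the Netherlands.",
    url="https://example.org/datasets/nl-cats",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
    publisher="Example Statistics Office",
    download_url="https://example.org/datasets/nl-cats.csv",
    encoding_format="text/csv",
)

script_tag = to_script_tag(markup)
print(script_tag)
```

The printed tag would go into the HTML of the page that hosts the dataset, where a crawler can read it alongside the human-readable description.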