Dataset Search, Google’s search engine for huge datasets, has received a couple of new features as it leaves beta status and becomes a fully-fledged service.
Launched in September 2018, Dataset Search now comes with new filters to help users sift through roughly 25 million datasets.
Institutions such as governments, universities and laboratories publish a large volume of data to the web. This data is often hard to find with general-purpose search engines, but if a publisher adds open-source metadata tags to the webpage that describes a dataset, Dataset Search will index it.
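The article does not name the tags involved. As an illustration only, the sketch below assumes the schema.org "Dataset" vocabulary written as JSON-LD, which is the kind of markup Google's publisher guidance for Dataset Search describes. The function names, the dataset and the example.org URLs are all hypothetical placeholders, and the fields shown are a small subset chosen for brevity rather than a statement of what the service requires.

```python
import json


def build_dataset_jsonld(name, description, url, license_url, keywords):
    """Return a schema.org Dataset description as a JSON-LD string.

    Assumption: the page is described with the schema.org "Dataset" type.
    The fields here are a small illustrative subset of that vocabulary.
    """
    metadata = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "keywords": list(keywords),
    }
    return json.dumps(metadata, indent=2)


def wrap_in_script_tag(jsonld):
    """Wrap JSON-LD in the script element that would sit in the page's HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical dataset; every value below is a placeholder.
example = build_dataset_jsonld(
    name="Example rainfall measurements",
    description="Daily rainfall totals from a hypothetical weather station.",
    url="https://example.org/datasets/rainfall",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["rainfall", "weather", "climate"],
)

print(wrap_in_script_tag(example))
```

A publisher would paste the printed snippet into the HTML of the page that hosts the dataset. Properties such as the licence are the sort of information a filter on copyright status could draw on, though the article does not say how the filters are implemented.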
Google declined to provide exact usage figures for the tool, but did state that “hundreds of thousands of users” have sampled it since it launched. It also said that the feedback from the scientific community was “overall positive”.
Natasha Noy, a research scientist building the tool, told The Verge that most data repositories have been “very responsive”. She added that the launch of the tool means older scientific institutions are taking the idea of publishing metadata “more seriously”.
“For example, [the prestigious scientific journal] Nature is changing its policies to require data sharing with proper metadata,” Noy said.
New features include filters that allow users to search for specific file types, such as tables, images or text. Users can also search for data from specific geographies, and filter by copyright status.