Google has upgraded its search engine for datasets

Dataset Search, Google’s search engine for huge datasets, has received a couple of new features as it leaves beta and becomes a fully fledged service.

Launched in September 2018, Dataset Search now comes with new filters to help users sift through roughly 25 million datasets.

Institutions such as governments, universities and laboratories publish a large volume of data to the web. This data is often hard to find using available search engines, but if open-source metadata tags are added to the webpage that hosts a dataset, Dataset Search will index it.
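As a rough illustration of what such tagging can look like, the Python sketch below builds a dataset description and wraps it in a script tag that a publisher could place on a dataset's webpage. It assumes the metadata tags in question are schema.org Dataset markup written as JSON-LD, which the article does not spell out. The function names, the example dataset and the example.org URL are all hypothetical.

```python
import json


def dataset_jsonld(name, description, url, license_url=None, keywords=()):
    """Build a schema.org Dataset description as a JSON-LD string.

    Assumption: the publisher uses schema.org markup; the fields chosen
    here are a small illustrative subset, not a required set.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    # Optional fields are only emitted when the caller supplies them.
    if license_url:
        record["license"] = license_url
    if keywords:
        record["keywords"] = list(keywords)
    return json.dumps(record, indent=2)


def as_script_tag(jsonld):
    """Wrap the JSON-LD in the script tag used to embed it in an HTML page."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical dataset, for illustration only.
example = dataset_jsonld(
    name="Example rainfall measurements",
    description="Hypothetical daily rainfall totals, for illustration only.",
    url="https://example.org/datasets/rainfall",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["rainfall", "weather"],
)
print(as_script_tag(example))
```

The printed block would be pasted into the HTML of the page that hosts the dataset. Whether and when a crawler picks it up is outside the scope of this sketch.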

Google declined to provide exact usage figures for the tool, but did state that “hundreds of thousands of users” have sampled it since it launched. It also said that the feedback from the scientific community was “overall positive”.

Natasha Noy, a research scientist building the tool, told The Verge that most data repositories have been “very responsive”. She added that the launch of the tool means older scientific institutions are looking at the idea of “publishing metadata more seriously.”

“For example, [the prestigious scientific journal] Nature is changing its policies to require data sharing with proper metadata,” Noy said.

New features include filters that allow users to search for specific file types, such as tables, images or text. Users can also search for data from specific geographies, and filter by copyright status.
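To make the idea of these filters concrete, here is a minimal Python sketch that narrows a list of dataset records by file type, geography and usage rights. It does not reflect how Google implements the feature. The `DatasetRecord` fields, the `filter_datasets` function and the sample records are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    # Hypothetical record layout; not Google's actual data model.
    title: str
    file_type: str      # e.g. "table", "image" or "text"
    region: str         # geography the data covers
    usage_rights: str   # e.g. "commercial use allowed"


def filter_datasets(records, file_type=None, region=None, usage_rights=None):
    """Return the records matching every filter that was supplied.

    A filter left as None is ignored, so filters can be combined freely.
    """
    matches = []
    for record in records:
        if file_type is not None and record.file_type != file_type:
            continue
        if region is not None and record.region != region:
            continue
        if usage_rights is not None and record.usage_rights != usage_rights:
            continue
        matches.append(record)
    return matches


# Made-up sample records.
records = [
    DatasetRecord("Census tables", "table", "Kenya", "commercial use allowed"),
    DatasetRecord("Satellite tiles", "image", "Kenya", "non-commercial only"),
    DatasetRecord("Court transcripts", "text", "Canada", "commercial use allowed"),
]

matches = filter_datasets(records, file_type="table", region="Kenya")
print([record.title for record in matches])
```

Treating an unset filter as "match everything" mirrors how a user would tick only the filters they care about.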

Sead Fadilpašić is a freelance tech writer and journalist with more than 17 years' experience writing technology-focussed news, blogs, whitepapers, reviews, and ebooks. His work has featured in online media outlets from all over the world, including Al Jazeera Balkans (where he was a Multimedia Journalist), Crypto News, TechRadar Pro, and IT Pro Portal, where he has written news and features for over five years. Sead's experience also includes writing for inbound marketing, where he creates technology-based content for clients from London to Singapore. Sead is a HubSpot-certified content creator.