Useful sites for finding datasets for Data Analysis tasks

This post was originally published by Parul Pandey at Towards Data Science

Let’s now look at some of the useful sites for finding open and publicly available datasets, quickly and without much hassle.

Screenshot of the Google Dataset Search page (Image by Author)

Google Dataset Search is a search engine dedicated to finding datasets. It works over metadata supplied by data providers, which means it indexes the description of a dataset rather than its contents. So if a dataset is publicly available, there is a good chance that it will show up in Google Dataset Search. At the time of the launch, Dataset Search indexed almost 25 million datasets from across the globe. It relies on keyword search and, like regular Google Search, offers a neat autocomplete option as you type your query.

Some of the search results for the query “Education” (Image by Author)

Google Dataset Search demo

If you wish to make your own datasets discoverable in Google Dataset Search, describe the properties of each dataset on your own web page using an open standard (schema.org).

An example of the schema for making datasets discoverable in Google Dataset Search

In short, if you host a dataset on your site and describe it using schema.org, others can find it through Dataset Search.
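To make the idea concrete, here is a minimal sketch in Python that builds a schema.org `Dataset` description as JSON-LD and wraps it in a script tag you could paste into a page. The helper names (`build_dataset_jsonld`, `as_script_tag`) and the example dataset, URL, and keywords are hypothetical, chosen only for illustration. The properties used (`name`, `description`, `url`, `keywords`, `license`) are part of the schema.org vocabulary, but treat this as a starting point and check Google's own guidelines for what it expects.

```python
import json


def build_dataset_jsonld(name, description, url, keywords, license_url=None):
    """Return a schema.org Dataset description as a JSON-LD string."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
    }
    # The license is optional in this sketch; add it only when provided.
    if license_url:
        record["license"] = license_url
    return json.dumps(record, indent=2)


def as_script_tag(jsonld):
    """Wrap a JSON-LD string in the script tag used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical dataset, used purely as an example.
markup = as_script_tag(
    build_dataset_jsonld(
        name="School enrolment by district",
        description="Yearly enrolment counts for public schools, by district.",
        url="https://example.org/datasets/school-enrolment",
        keywords=["education", "enrolment", "schools"],
        license_url="https://creativecommons.org/licenses/by/4.0/",
    )
)
print(markup)
```

The printed markup goes into the HTML of the page that hosts the dataset. Since Dataset Search indexes descriptions rather than content, a clear `name` and `description` matter more than anything else here.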

🔗 Link to the site: https://datasetsearch.research.google.com/
