Google Reveals Culturomics: A 500 Billion Word Database

Search giant Google has collaborated with researchers from Harvard to create Culturomics, a massive database comprising 500 billion words taken from 5.2 million digitised books.

According to The New York Times, the database contains words and short phrases taken from millions of books published between 1500 and 2008 in Chinese, English, French, German and Russian.

The idea behind the database is to allow people worldwide to analyse word patterns and the impact that these words and books have on culture.

Google unveiled the database in the form of a web tool that lets anyone see how frequently a particular word was used in any given era. People will also be able to download the data and use it in their own tools or apps.
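
For readers who want to try this kind of analysis on downloaded data, here is a minimal Python sketch of how a word's frequency per year could be worked out. It assumes the data has already been loaded as simple (word, year, count) records; that layout, the function name relative_frequency and the sample figures are illustrative assumptions, not the actual format or contents of Google's files.

```python
from collections import defaultdict


def relative_frequency(rows, word):
    """Return {year: share of all counted words in that year that are `word`}.

    `rows` is an iterable of (word, year, count) tuples. This record layout
    is an assumption made for illustration, not the real download format.
    """
    totals = defaultdict(int)  # all word occurrences seen per year
    hits = defaultdict(int)    # occurrences of the target word per year
    for w, year, count in rows:
        totals[year] += count
        if w == word:
            hits[year] += count
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Made-up sample records, purely to show the calculation.
sample_rows = [
    ("whale", 1850, 2), ("ship", 1850, 8),
    ("whale", 1851, 5), ("ship", 1851, 5),
]
freq = relative_frequency(sample_rows, "whale")
```

Dividing by the yearly total matters because far more books survive from recent centuries than from earlier ones, so raw counts alone would make almost every word appear to grow more popular over time.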

Jean-Baptiste Michel of Harvard said: “This data is a new tool for the humanities, just one part of the puzzle that humanists can use to address questions about human society.”