Corpus of Global Web-Based English (GloWbE)
The Corpus of Global Web-Based English (GloWbE) contains 1.9 billion words from 1.8 million web pages in 20 English-speaking countries. It was created by Mark Davies of Brigham Young University and released in 2013.
GloWbE (pronounced like "globe") is related to other large corpora created by Davies, including the 450-million-word Corpus of Contemporary American English (COCA) and the 400-million-word Corpus of Historical American English (COHA). Together, these three corpora allow researchers to examine variation in English by dialect, by genre, and over time, in ways that are not possible with any other large corpus of English.