|Corpus|Size|Period|
|---|---|---|
|Corpus of Contemporary American English (COCA)|400 million words|1990 - 2009|
|Corpus of Historical American English (COHA)|400 million words|1810s - 2000s|
|BYU-BNC: British National Corpus|100 million words|1980s - 1993|
|TIME Corpus of American English|100 million words|1920s - 2000s|
|Corpus del Español|100 million words|1200s - 1900s|
|Corpus do Português|45 million words|1300s - 1900s|
He's (almost?) single-handedly put together the largest collection of freely usable corpora. Of the above, only the BNC was not compiled at Brigham Young.
Also, when I pointed out to him that it would be lovely if we could query the range of publications and documents in which a text appears, he agreed; it should be possible in a few months.