Corpus of Contemporary American English (COCA) | 400 million words | 1990 - 2009 |
Corpus of Historical American English (COHA) | 400 million words | 1810s - 2000s |
BYU-BNC: British National Corpus | 100 million words | 1980s - 1993 |
TIME Corpus of American English | 100 million words | 1920s - 2000s |
Corpus del Español | 100 million words | 1200s - 1900s |
Corpus do Português | 45 million words | 1300s - 1900s |
He's (almost?) single-handedly put together the largest collection of freely usable corpora. Of the above, only the BNC was not compiled at Brigham Young.
Also, when I pointed out to him that it would be lovely if we could query the range of publications and documents in which a text appears, he agreed, and it should be possible in a few months.