| Corpus of Contemporary American English (COCA) | 400 million words | 1990 - 2009 |
| Corpus of Historical American English (COHA) NEW | 400 million words | 1810s - 2000s |
| BYU-BNC: British National Corpus | 100 million words | 1980s - 1993 |
| TIME Corpus of American English | 100 million words | 1920s - 2000s |
| Corpus del Español | 100 million words | 1200s - 1900s |
| Corpus do Português | 45 million words | 1300s - 1900s |
He's (almost?) single-handedly put together the largest collection of freely usable corpora. Of the above, only the BNC was not compiled at Brigham Young.
Also, when I pointed out to him that it would be lovely if we could query the range of publications and documents in which a text appears, he agreed, and it should be possible in a few months.