Thursday, September 29, 2011

40 fascinating lectures for linguistics geeks

From Zoran Nesic through Google+. These lectures could potentially eat up a lot of my marking time.

Wednesday, September 28, 2011

Grammarology 2.0: Linking verbs

My first real Grammarology 2.0 column is up on TESL Toronto's website. As promised in the introduction, it's another look at what constitutes a "linking verb".

Previous English, Jack posts related to linking verbs:
The categorization of so 
The state of linking verbs

Monday, September 26, 2011

Grammarology 2.0

I've started a new column for TESL Toronto which will run roughly twice a month. In it, I’ll tackle grammar questions from two viewpoints: traditional school grammar, and a more modern analysis following The Cambridge Grammar of the English Language (CGEL). The introduction is now online.

Wednesday, September 21, 2011

Google ngrams and TED

You've seen my TEDx talk using the Google ngram viewer, and now here's another, this time by the authors of the culturomics paper, Jean-Baptiste Michel and Erez Lieberman Aiden. It keeps things pretty light, but the suppression of Chagall's name in the German corpus during the Nazi period was interesting.

Saturday, September 17, 2011

`I' vs `the'

In the Sept 3-9th edition of New Scientist, James Pennebaker discusses the individual variations in frequency with which we use pronouns and other small words, and he considers what this metric might say about our personalities and relationships. The paper version (p. 45) has a graph entitled "The real word count" with the caption "The 20 most frequently used words in the English language, across both spoken and written texts." The graph shows that I is the most common word, followed closely by the.

This prompted the following query by Mike Scott to the Corpora mailing list:
"I wrote to the author, James Pennebaker of the U of Texas, about this, expressing my surprise at the pronoun I having greater frequency than THE, as even in the spoken-only section of the BNC (10m words) we find I occurring only just over half as often as THE. His data contains a mix of spoken and written with a large amount of blog data. He reports that with all his studies in the USA and Mexico, "people always use more I more than THE. It's never close." Can anyone help here, clearing up the position? Someone with access to a really top quality corpus, more up to date and representative than the BNC?"

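For anyone who wants to try the comparison on their own texts, here is a minimal sketch of the arithmetic behind Scott's "just over half as often" remark: count each word and divide by the total number of tokens. The function name, the sample sentence, and the crude tokenizer are all my own assumptions for illustration; this is not how the BNC figures or Pennebaker's counts were produced.

```python
import re
from collections import Counter

def relative_frequencies(text, words):
    """Return each target word's share of all tokens in `text`.

    Hypothetical helper for illustration only.
    """
    # Crude tokenizer (an assumption): runs of letters and apostrophes,
    # so "I'm" is a single token and is not counted as "I".
    tokens = re.findall(r"[A-Za-z']+", text)
    counts = Counter(token.lower() for token in tokens)
    total = len(tokens)
    if total == 0:
        # Avoid dividing by zero on empty input.
        return {word: 0.0 for word in words}
    return {word: counts[word.lower()] / total for word in words}

# Invented sample sentence, not real corpus data.
sample = "I think the cat saw the dog, and I know the dog saw me."
freqs = relative_frequencies(sample, ["I", "the"])

# How often "I" occurs relative to "the" in this sample.
ratio = freqs["I"] / freqs["the"]
print(f"I: {freqs['I']:.3f}  the: {freqs['the']:.3f}  ratio: {ratio:.2f}")
```

Whether contractions like "I'm" count as instances of I is exactly the kind of tokenizing decision that could move the numbers, so any real comparison would need to state how it was handled.
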
Wednesday, September 14, 2011

"World's first" English language learning chatbot

This video is actually posted on the website of the company hawking this "service". Incredibly, they're charging people to put themselves through this kind of torture.

Thursday, September 08, 2011

Language Learner Literature Award Winners

The first Extensive Reading World Congress, which wrapped up last weekend in Kyoto, was, from all reports, a great success despite the typhoon. The ER Foundation has been posting videos of many of the talks on their YouTube channel.

The winners of the 2011 Language Learner Literature Awards were also announced, and the results are now up on their website. I've reproduced them below:

Monday, September 05, 2011

Self control and Google Ngram Viewer

In the New York Times Sunday book review, Steven Pinker reviews Willpower: Rediscovering the Greatest Human Strength by Roy F. Baumeister and John Tierney, a book that was already on my to-read list after the recent summary. In doing so, Pinker writes,
"Nonetheless, the very idea of self-control has acquired a musty Victorian odor. The Google Books Ngram Viewer shows that the phrase rose in popularity through the 19th century but began to free fall around 1920 and cratered in the 1960s, the era of doing your own thing, letting it all hang out and taking a walk on the wild side."
Being the anal fact-checking type I am, I went straight to the Google Books Ngram Viewer and searched for self-control. Nothing. Not a single hit, which is rather strange since the hyphenated version is not so uncommon. But after playing around a bit, I found that the Ngram viewer seems to have some problems with hyphens. So here's the graph of the frequency of self control sans hyphen.


From this graph, it seems Pinker is about a decade early in diagnosing free-fall, and getting on 20 years late in placing the crater. In fact, by the late sixties, self control had gained back a good deal of its losses, and Pinker doesn't mention that by 2000 we were back near historical highs. 

Maybe he's looking at a different graph. Perhaps he had more success with the hyphenated version, or he might be looking at one of the sub-corpora, say American English, or the English One Million. But none of the other graphs seem to match his description either. In fact, the British English graph tells a completely different story:

As I pointed out in my TEDx talk in June, even if we date the changes accurately, it's really not clear what fluctuations in the frequency of a particular phrase would mean. It would depend on many things including the change in popularity of synonyms (e.g., self restraint, willpower, etc.). It could indicate a shift in the frequency of the hyphenated and non-hyphenated spellings. And people can use self control both approvingly and dis-.  

Despite the trouble with interpreting changes in word frequency over time, though, I predict that there will be a rise in the frequency of this trope in the media.

Sunday, September 04, 2011

New Grammar App from Bas Aarts and Survey of English Usage

Here's the press release:
The Survey of English Usage at UCL is very pleased to announce the publication of a new App for Apple hand-held devices such as the iPhone, iPad and iPod Touch. The interactive Grammar of English (iGE) is a complete course in English grammar written for first year undergraduates, students at high schools and teachers of the English language. For more information see the iGE website.