Monday, July 23, 2007

Funding Funding Funding

, writing in The Toronto Star, points out the difference between the comparatively well-funded federal LINC (Language Instruction for Newcomers to Canada) program and provincially funded ESL programs in Ontario.

Eric Bakovic over at Language Log is also talking up the issue, but in the U.S.

ESL speakers too lazy to learn a third language

The Economist has a bit about the consequences of the dominance of English in Europe. One point I'd never considered before was that people are not learning other languages.
"The rush to learn English can sometimes hurt business by making it harder to find any staff who are willing to master less glamorous European languages.

English is all very well for globe-spanning deals, suggests Hugo Baetens Beardsmore, a Belgian academic and adviser on language policy to the European Commission. But across much of the continent, firms do the bulk of their business with their neighbours. Dutch firms need delivery drivers who can speak German to customers, and vice versa. Belgium itself is a country divided between people who speak Dutch (Flemish) and French. A local plumber needs both to find the cheapest suppliers, or to land jobs in nearby France and the Netherlands."

But what do you do to avoid this problem? Apparently, there is research "by the European Commission suggesting that this risk can be avoided if school pupils are taught English as a third tongue after something else." Given the spectacular failure of most high schools worldwide to teach a second language, I wonder at the practicality of this solution.

Sunday, July 22, 2007

My spammy blog

Yesterday, when I tried to post, I got the following message: "Blogger's spam-prevention robots have detected that your blog has characteristics of a spam blog."


Judy Sierra

Yesterday, my mother was reading Thelonius Monster's sky-high fly pie: a revolting rhyme to my kids while I washed the dishes. Something clicked as she read,
"THELONIUS urgently
e-mailed a spider.
He wanted advice from a savvy insider."

"Who wrote that?" I asked. Sure enough, it was Judy Sierra, winner of the 2005 E. B. White Read Aloud Award for Wild About Books. It's somewhat astounding how one simple line can be so characteristic that you immediately know who has written it.

Saturday, July 21, 2007

Ontario: more ESL regs; no teeth, no new funding

It appears that the province will require schools to improve orientation, testing, and reporting for ESL students and their parents. No new funding accompanies the announcement. Nor will there be any requirements that schools actually spend ESL funding on ESL. Previous discussion of the issue is here.

Friday, July 20, 2007

Learning the Language

I just discovered a language-learning-related blog attached to the Education Week website. It's called Learning the Language. The author, Mary Ann Zehr,
"is an assistant editor at Education Week. She has written about the schooling of English-language learners for more than seven years and understands through her own experience of studying Spanish that it takes a long time to learn another language well. Her blog will tackle difficult policy questions, explore learning innovations, and share stories about different cultural groups on her beat."

The blog has existed since February. It's fairly US-centric and is focussed mostly on policy issues though the posts do everything from introducing new materials to visiting individual classrooms.

Thursday, July 19, 2007

Even where there are ESL teachers...

Samuel Freedman reports in the New York Times on a frustrating situation in which the few ESL teachers are pulled away from teaching ESL by paperwork and other tasks. Many of these teachers

"were responsible for completing more than a dozen different forms, evaluations, assessments and reports that came variously from the levels of district, city, state and federal government, and grading standardized tests.

Teachers like Ms. Rabenau were also repeatedly conscripted within their schools to substitute for absent colleagues, to proctor exams in other classes and to chaperon field trips."

Wednesday, July 18, 2007

Idioms: interpreting the frequencies

I suppose this isn't really specific to idioms; it would apply to any vocabulary item.
As I wrote before, one response to my explanation about idioms was,
"None of the correspondents have suggested that they have any difficulty recognising or understanding 'hit the jackpot', yet the low level of occurrence of the expression in corpora suggests that it should be so unfamiliar as to cause difficulty even to native speakers."
I'm afraid this doesn't show up a problem with the corpora themselves, but it might go some way to explaining why language teachers seem to be so loath to use corpus data: they don't understand what it tells them.
It wouldn't be unusual for a native speaker of English to encounter language that occurs with the frequency of "hit the jackpot" a number of times per month. That's because native speakers of English tend to encounter millions of words each month. The recent Mehl paper in Science suggests that we speak on average something like 16,000 words per day. Presumably, we're doing much of that in conversation with others, often more than one person, so let's put our conversational word count at 40,000 per day spoken and heard.
Then there's TV. I don't have average numbers, but after looking at a few transcripts, it looks like 7,000 words per hour might be a reasonable estimate. According to Nielsen, the average American spends 4.5 hours per day watching TV, so we can add another 30,000 words or so to our count, which now totals 70,000.
I have no data on how much people write, but I suspect it's very little. In terms of reading, I can find no adult data, but 5th-grade children read about 5,300 words per day, bringing our total daily word exposure to roughly 75,300 or 2,290,000 words per month. There are likely other sources of input that I have omitted, but this should be sufficient to make the point.
At the previously established rate of 0.18 to 2.0 occurrences pmw, we could expect to see "hit the jackpot" about one to four times a month. If you're about my age, you've probably heard it about 900 times in your life. So, contrary to the above writer's conclusion, it's not at all surprising that we know it. But would you be surprised to hear that my six-year-old son doesn't? (I just asked him [update: May 25, 2009. He's almost eight and he still says he doesn't know. update 2: Oct 9, 2011: 10 and still unfamiliar.]).
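For anyone who wants to check the arithmetic, here is a minimal sketch of the estimate above. Every input is one of this post's own rough guesses rather than measured data, and the 365/12 average month length is my assumption.

```python
# Back-of-the-envelope sketch of the exposure estimate in this post.
# All figures are the post's own rough guesses, not measured data.

CONVERSATION_WORDS_PER_DAY = 40_000  # spoken plus heard
TV_WORDS_PER_DAY = 30_000            # ~7,000 words/hour x 4.5 hours, rounded down
READING_WORDS_PER_DAY = 5_300        # 5th-grade reading figure
DAYS_PER_MONTH = 365 / 12            # assumption: an average calendar month


def monthly_exposure() -> float:
    """Total words encountered in an average month."""
    daily = (CONVERSATION_WORDS_PER_DAY
             + TV_WORDS_PER_DAY
             + READING_WORDS_PER_DAY)
    return daily * DAYS_PER_MONTH


def expected_encounters(rate_pmw: float, words: float) -> float:
    """Expected occurrences of an item at a given per-million-words rate."""
    return rate_pmw * words / 1_000_000


if __name__ == "__main__":
    words = monthly_exposure()
    print(f"words per month: {words:,.0f}")
    for rate in (0.18, 2.0):
        encounters = expected_encounters(rate, words)
        print(f"{rate} pmw -> {encounters:.2f} encounters per month")
```

With these inputs the monthly total comes to about 2.29 million words, and the two rates give roughly 0.4 and 4.6 encounters per month, so the low end is really a bit less than once a month.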
In contrast to native speakers, our learners don't get anything like 2.3 million words a month of input. And what input they do get is degraded by the fact that they don't understand much of it. Thus, what seems very common to us is quite rare for learners. Somehow, though, it's hard to get many language teachers to accept this. They refuse to believe that idioms are not common, but as we saw recently, anything below about 30 occurrences pmw should be considered low frequency.
There are many factors that can skew our perception of a word's commonality. Psychologists have taken this issue much more seriously than have language teachers/applied linguists and have evolved a number of measures. These include:
  • number of letters/phonemes/syllables
  • written/spoken frequency
  • range/keyness/burstiness
  • subjective familiarity rating
  • concreteness rating
  • imageability rating
  • meaningfulness
  • average age of acquisition
  • word category (noun, verb, adj, etc.)
  • affixation
  • status (colloquial/dialect/alien etc)
  • semantic grouping
It would obviously be too onerous to consider all of these constantly in our teaching, but it might not be a bad thing to know about and understand each measure.
Earlier posts in this series: Idioms, Differences between the corpora, & Where's the cutoff

Tuesday, July 17, 2007

NY Times: unbalanced but honest

Cornelia Dean, writing in the New York Times today, included the following statement in an article about a creationist book:
"In fact, there is no credible scientific challenge to the theory of evolution as an explanation for the complexity and diversity of life on earth."

Every article, in every newspaper, that discusses creationism should include such a clear statement. Too often you get so-called balanced reporting where creationism and belief in evolution are both put forward as viable alternatives, as in this Canadian Press story in the Globe and Mail.

Monday, July 16, 2007

Idioms: Where's the cutoff?

Like most people, English teachers can find it tedious to address the same basic grammar points and vocabulary items week after week. It's repetitive and hardly sexy. What many teachers really want is to get into the subtle points of language, the nuances of sophisticated use. Teaching idioms and collocations can help us feel like we're giving our students some value for their money, something they might not be able to get in a standard dictionary. But this is a rather selfish way to go about teaching a language.

As we saw the other day, learning a language is a long slow process. The fact is that few learners move beyond the rudimentary levels. Most students arrive in my classes not knowing the most common 2,000 words of English.

Paul Nation has argued that the top 2,000 to 3,000 word families should be considered high-frequency vocabulary. For one thing, there is broad agreement from one list to the next as to what these words are. This gives us confidence that they are not merely an artifact of a particular corpus construction. This list of words will also give enough coverage that learners will be able to begin to function independently. Furthermore, the additional coverage gained by learning words beyond this level is minuscule. The next 1,000 words only increase your coverage by about 2% and the payback is smaller and smaller as you go up. That is not to say that students should stop at 2,000 words, but merely that it is at this level that they should really take over from the teachers. Finally, from a pragmatic viewpoint, 2,000 to 3,000 words is about all that one can realistically expect to deal with in a course of study, be it the six years of jr. & sr. high school language instruction that is common around the world or a one-year intensive English language program.

If you look at how common these words are, you find that the lower frequency words occur roughly 30 times per million words. So teaching anything less common (and idioms are almost always far less common) really requires some extraordinary justification.

I don't know any teacher who would address, say, the subjunctive before teaching the progressive aspect, yet when it comes to idioms and vocabulary, a different standard seems to be applied. Perhaps the problem is that teachers have very little sense of what 30 times per million words or 0.3 times per million words tells them. More on this to follow.

In this series: Idioms, Differences between the corpora, & Interpreting the frequencies

Saturday, July 14, 2007

Language Learner Literature Award deadline

The Extensive Reading Foundation's Language Learner Literature Awards (previously mentioned here) will be announced August 31. Readers still have a chance to submit their votes, but the deadline is July 20th.

According to this article, one library in New Zealand has had a great deal of success with displays of nominated books. Not a bad idea.

Friday, July 13, 2007

Japanese words for white

In a NewScientist interview, linguist Annie Mollard-Desfour claims that, to a Japanese person, the brightness of a colour is more important than its hue and that the Japanese language has a large number of words for white, "from the dullest to the most brilliant".

I'm afraid that despite spending ten years in Japan I had never noticed any of this, so I put it to my Japanese wife. She was as perplexed as me. We had a look in the Kenkyusha New English-Japanese Dictionary, 5th ed., and could find only a single translation for the colour white, well 2 actually: the adjective 白い and the noun 白. Oh, there were words like クリーム(cream) and compounds like 雪白 (snow white), and even metaphorical uses meaning pure, snowy, Caucasian and what have you, but only one word for the colour white.

This, of course, brings to mind the great Eskimo Vocabulary Hoax and the endless snowclones that people love to rehearse in exoticising a language or a people.

Yet, since Mollard-Desfour is a linguist and a lexicographer, I'm sure she's not simply making this stuff up. I, therefore, sent a letter to NewScientist requesting clarification. I look forward to seeing the examples or citations that my wife and I must have overlooked.

[See the follow up here]

Thursday, July 12, 2007

Idioms: differences between corpora

One of the arguments that came up yesterday was that idioms are simply not reflected in the frequency counts because the corpora don't reflect the type of language use in which these idioms would usually occur. Michael Stout suggests something similar in his comment.

It is certainly true that particular words, expressions, and forms vary quite considerably in their distribution and frequency on a scale that is often seen as ranging from spoken/informal language to written/academic language. (In fact this is probably better seen as multidimensional space, but that's another issue.) Indeed, if we look at fuck, we can see that it is strikingly common in spoken conversation, occurring 136 times pmw in the BNC vs. 1.63 times pmw in the academic subcorpus.

The following look at "hit the jackpot" should show the range in frequency of these types of idioms:
  • From yesterday, we have 0.32 occurrences per million words in the BNC. Looking at the subcorpora, we have a high of 1.88 pmw in News.
  • In the Time corpus, we have 0.8 pmw with a high of 2.0 pmw in 1940s.
  • The MICASE corpus that Michael mentioned in his comment yesterday has zero instances of "hit the jackpot" in 1,848,364 words. MICASE is unscripted 'merican speech at universities, mostly in lectures and academic discussions.
  • The Enron e-mail corpus has 18 occurrences of "hit the jackpot" in 96.3 million words (about 0.2 pmw; but a number of them are duplications)
  • The first release of the ANC is 11 million words. "Jackpot" occurs 3 times, but "hit the jackpot" does not occur.
  • The million-word Brown corpus, which is an early US written-text corpus has no instances of "jackpot".
  • The Corpus of Spoken, Professional American-English does not have any instances of "jackpot" in its sample of 42,739 words.
  • Nor does the 2-million word US talk TV corpus at Lextutor.
  • I don't have access to the Wellington Corpus of Spoken New Zealand English, but if anyone else does, please let me know the results. I'll bet that they are all in the same range. [update: July 23| Bernadette Vine, who manages the corpus, was kind enough to do a search for me. She reports that there are no instances of 'hit the jackpot' in the one-million-word corpus.]
    [Update2: Oct 9, 2011. Some new corpora have become available, so I've added them below.]
  • In the Corpus of Contemporary American English, we have a high of 0.61 in magazines and a low of 0.05 in academic writing.
  • And here's the frequency in the Google Books corpus throughout the 20th century, which maxes out at 0.06 pmw.

So, overall, nothing goes above 2.0 and most are much lower. I would be very surprised, then, to find a situation in which "hit the jackpot" occurs at, say, about 30 pmw or more, at least not one that is going to be relevant to many learners of English.
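In case the "per million words" figures seem opaque, here is a small sketch of the normalization, using only the two corpora above for which I gave both a raw count and a corpus size. The function names are mine, for illustration only.

```python
# Sketch: converting raw counts to occurrences per million words (pmw).
# Counts and corpus sizes are the ones quoted in the list above.

def per_million(count: int, corpus_size: int) -> float:
    """Occurrences per million words."""
    return count / corpus_size * 1_000_000


def count_needed(rate_pmw: float, corpus_size: int) -> float:
    """Raw count a corpus of this size would need to reach a given pmw rate."""
    return rate_pmw * corpus_size / 1_000_000


CORPORA = {
    "Enron e-mail": (18, 96_300_000),
    "MICASE": (0, 1_848_364),
}

if __name__ == "__main__":
    for name, (count, size) in CORPORA.items():
        print(f"{name}: {per_million(count, size):.2f} pmw")
    # How many hits would the Enron corpus need to reach 30 pmw?
    print(f"needed for 30 pmw: {count_needed(30, 96_300_000):,.0f}")
```

On these numbers, the Enron corpus would need nearly 2,900 occurrences of "hit the jackpot" to reach 30 pmw. It has 18.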
Tomorrow, more about what exactly "common" would mean for a learner (hint: see the previous paragraph.)
In this series: Idioms, Where's the cutoff, & Interpreting the frequencies

Wednesday, July 11, 2007


Language teachers are overly enamoured of idioms. My pointing out, over on the TESL-L list, that a series of "common gambling idioms" is not at all common began a raveled thread of comments questioning the value of corpus data and supporting the teaching of idioms.

Here's what I posted. The numbers are the occurrences per million words, first in the British National Corpus, and second in the Time corpus (in peak decade).
  • hit the jackpot: 0.32 (2.0 in 1940s)
  • on a roll: 0.30 (2.21 in 1990s)
  • ace in the hole: 0.04 (0.08 in 1940s)
  • Bingo!: 0.17 (0.64 in 1990s)
  • play(s/ed/ing) [somebody's] cards close to [somebody's] chest: 0.07 (0.06 in 1960s)
  • wild card: 0.54 (1.38 in 1990s)
  • shoot the works: 0 (0.80 in 1930s)
  • put(s/ting) * money down: 0.05 (0.11 in 1990s)
  • beginner's luck: 0.04 (0.32 in 1960s)

To give you some context, anathema, which is about the 23,800th most common word in the British National Corpus, occurs 1.42 times per million words. In other words, unless a learner of English has a huge vocabulary, there are lots and lots and lots and lots and lots of more useful things teachers can be teaching them than gambling idioms (or almost any other idiom, for that matter).

The following is a sample of the responses. Over the next few days I'll try to untangle some of them.

  • "So I think we can safely say that there are times that word frequency lists can be misleading."

  • "I have noted, at first with some dismay, the rabid attacks on any form of linguistic sophistication. Apparently, our foreign students have far better things to do than learn the subtleties of the language they are studying."

  • "Comments have been made on teaching idioms. Idioms are of utmost necessity in using and understanding English."

  • "I have never said the word anathema, partly cos I'm not sure how to say it. But the gambling idiomatic terms turn up frequently - maybe once a month for each, in colloquial speech in NZ, so they should/could be taught."

  • "None of the correspondents have suggested that they have any difficulty recognising or understanding “hit the jackpot”, yet the low level of occurrence of the expression in corpora suggests that it should be so unfamiliar as to cause difficulty even to native speakers (I could imagine that there are many native speakers of English who would have problems with “anathema”). If the expression is so uncommon, how do we all know it?

    "One possible explanation is that the corpora that we have available seriously misrepresent the language we encounter in our daily lives."

Followups: Differences between the corpora, Where's the cutoff, & Interpreting the frequencies

Tuesday, July 10, 2007

A tale of two numbers

  • $980: funding that schools in Ontario received per ESL student in 2004/2005
  • $245: average personal spending on English-language tutors in South Korea in 2006


According to a report by the Auditor General of Ontario, schools get $225 million (all figures in Canadian dollars) in funding for ESL students. That works out to about $980 per student. In contrast, Korea, with a population of about 72 million people, spent $17.7 billion on English-language tutors last year. That works out to $245 for every Korean man, woman, and child. For tutors.
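The division behind those two numbers can be sketched as follows. The inputs are the figures quoted above; the number of Ontario ESL students isn't given in this post, so the value computed here is only what the $225 million and $980 figures imply.

```python
# Sketch of the per-head arithmetic, using only the figures quoted in the post
# (all in Canadian dollars).

ONTARIO_ESL_FUNDING = 225_000_000
ONTARIO_PER_STUDENT = 980
KOREA_TUTOR_SPENDING = 17_700_000_000
KOREA_POPULATION = 72_000_000  # the post's population figure

# Spending on tutors per person in Korea.
korea_per_person = KOREA_TUTOR_SPENDING / KOREA_POPULATION

# Implied (not reported) number of ESL students in Ontario.
implied_students = ONTARIO_ESL_FUNDING / ONTARIO_PER_STUDENT

if __name__ == "__main__":
    print(f"Korea, per person: ${korea_per_person:,.2f}")
    print(f"Ontario, implied ESL students: {implied_students:,.0f}")
```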

Sunday, July 08, 2007

Things to re-meme-ber me by

The Ridger FCDE at The Greenbelt has tagged English, Jack in an internet meme. Here's a recent history of this particular one:
  1. The Greenbelt
  2. Thoughts in a Haystack
  3. Evolving Thoughts
  4. On Evolution
  5. Scientia Natura: Evolution and Rationality
  6. The Flying Trilobite
  7. Pharyngula
  8. Ironicus Maximus
These are the rules:
  1. We have to post these rules before we give you the facts.

  2. Players start with eight random facts/habits about themselves.

  3. People who are tagged need to write in their own blog about their eight things and include these rules in the post.

  4. At the end of your post, you need to choose eight people to get tagged and list their names.

  5. Don't forget to leave them a comment telling them they're tagged, and to read your blog.

And, these are the facts:

  1. My earliest memory (true or created, I'm not sure) is of hanging on a fence in my Grandpa's backyard in Pipestone, Manitoba. A barb was impaled in my left palm, but the memory is simply a placid image, nothing more.

  2. When I was living in Chiang Mai, thinking a bicycle would be a great way to get around, I bought a mountain bike. It was, indeed, a wonderful means of transportation and afforded me a good deal of exercise as well. One day, I decided to ride the bike to Wat Phrathat Doi Suthep at the top of Suthep mountain. I made my preparations and set out the next morning, early before the sun was too high. I rode out past Chiang Mai University and started up the mountain. Soon, however, I had run out of water and food and the temperature was well up into the 30s. The climb was much harder than I had anticipated and I was just about done in. I hadn't passed anywhere to get food for a long distance and had no idea if I could make it back. Just then, I came around a bend in the road and over to my left saw an open-air restaurant. Salvation! I coasted down to the edge and then, dripping with sweat, red in the face, rubber legged, dressed in cycling shorts and feeling very foolish, I walked over to the buffet. I looked around but could see no wait staff. Eventually a woman came and asked me what I wanted. I pointed it out and reached for my money. "Not to pay," she said. "Is wedding."

  3. In grade 3, I won the school speech contest with a speech about goldfish. The first line was "Blub, blub, blub, swish, swish, swish. Ladies and gentlemen, my speech is on fish, goldfish that is."

  4. When I was teaching jr. high school English in Japan, I was dealing with a very rowdy bunch of first year girls. One girl, in particular, who was very weak academically was fooling around not paying attention. After using various tactics, including explicit warnings, most of the class had settled down. Then I saw the girl writing a note on some cutesy Hello Kitty stationery. I blew up. I grabbed the note from her and berated her harshly. I had no more problems for the rest of the class, but at the end, when I looked at what she had written, I saw it was not a note, but notes related to the lesson. I stopped everyone from leaving and apologised to her. Since then I've done my best never to make assumptions about my students' intentions.

  5. One fall, I made about ₤20 per hour busking on the Queensway, just north of Hyde Park in London.

  6. My first bicycle was a light-blue CCM with 16-inch wheels and a movable crossbar.

  7. After racing to Ambon on the Summer Wind II out of Australia, we sailed back south towards Bali. The race had been marked by a lack of wind and many of the boats had turned on their engines. After five days at sea, the captain, a 50-something Aussie who wanted to sail around the world, had promised us a leisurely trip with lots of stops. Once, however, he discovered that it was hard to get steak and potatoes and that the locals didn't understand him E V E N W H E N H E S P O K E S L O W L Y A N D L O U D L Y, he rescinded that offer and headed straight for Bali. Things came to a head at Komodo where I was put off.

  8. My second toe is longer than my big toe.

Now, as for the 8 people I tag, I'm ignoring rule 5. I'll list the blogs and if these fine folks find this and respond, wonderful. And if they don't, that's just swell too.

  1. From A to Zimmer
  2. Mishka Jaeger
  3. Career Limiting Moves
  4. Stoutfellow
  5. Separated by a common language
  6. Brashaw of the future
  7. Designers who blog
  8. David Crystal's Blog

Saturday, July 07, 2007

Perception, meet reality

A recent survey in Utah seems to explain why so many Americans think that immigrants aren't trying to learn English, reports The Salt Lake Tribune.
"One of the biggest surprises from the survey, community leaders said, is the time employers think it takes to learn English. Almost half of employers said it should take six to 12 months to learn English, the survey said."

The US Department of State classifies various languages by difficulty; it all depends on your first language. But let's look at category III, which for English speakers would include Russian and Persian. According to the National Foreign Language Center, the DoS estimates that

"44 weeks of intensive language training in U.S. government language schools (five days per week, six hours per day) are required to achieve minimum working proficiency... Similar results are achieved after five years of typical college language courses, especially if students spend at least one semester abroad learning the language, in addition to their language courses in the US...

"What is 'minimal working proficiency?' Someone able to function on their own, able to talk about familiar topics and daily life."
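To put that estimate in hours, here is a quick sketch. The 44 weeks, five days, and six hours come from the quote above; the four-hours-a-week evening class is a purely hypothetical figure of mine, and the calculation assumes (generously) that part-time hours are as effective as intensive ones.

```python
# Sketch: total classroom hours implied by the quoted estimate.

WEEKS = 44
DAYS_PER_WEEK = 5
HOURS_PER_DAY = 6

INTENSIVE_HOURS = WEEKS * DAYS_PER_WEEK * HOURS_PER_DAY


def weeks_needed(hours_per_week: float) -> float:
    """Weeks to log the same number of hours at a part-time pace.

    Assumes part-time hours count the same as intensive ones, which is
    an illustrative simplification, not a claim from the source.
    """
    return INTENSIVE_HOURS / hours_per_week


if __name__ == "__main__":
    print(f"intensive hours: {INTENSIVE_HOURS}")
    # Hypothetical learner attending class 4 hours per week.
    print(f"weeks at 4 hours/week: {weeks_needed(4):.0f}")
```

The intensive course comes to 1,320 hours. At the hypothetical four hours a week, that would be 330 weeks, or more than six years.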

And how many of these immigrants have the wherewithal to attend high-quality full-time English-language courses?

By the way, the survey also found that more than 80% of immigrants and refugees say they have formally tried to learn English.

Friday, July 06, 2007

Word drive on

Over at the Simple English Wiktionary, we're trying to reach the goal of having 2,000 entries by August 4th. We won't interrupt regular programming with endless boring discussions about funding issues while bugging you to pledge, but if everyone who visits here would just contribute one word, it would be very much appreciated.

Words per day

Over at Language Log, Mark Liberman has spent a good deal of time addressing the baseless claims in Louann Brizendine's book and her subsequent media appearances, in particular the idea that women speak more than men do. Today, a paper came out in Science supporting Liberman's arguments that men and women use roughly equal numbers of words. Unfortunately, Brizendine's claims are not dead; they are the undead and will continue to wander stupidly among us leaving ignorance in their wake.

The NYT has a nice graphic showing the distribution. The article it goes with isn't so hot though.

Differences between the sexes aside, the numbers are interesting. In the Mehl et al. paper, women and men both spoke about 16,000 words per day. The Longman Grammar of Spoken and Written English, though, says that "on average speakers produce around 7,000 words per hour in the conversational texts of the LSWE Corpus, or a little under 120 words per minute. Based on this speech rate, a one-million word corpus corresponds to 140-150 hours of conversational interaction." (p. 27) Presumably the Longman folks were only considering the time when people were actively engaged in conversation. Given these numbers, we can extrapolate that your average university student spends just over 2 hours speaking per day and is silent for almost 22 hours. If you allow that most of that speaking is part of a balanced conversation, it would seem that these students spend about 4.5 hours per day involved in conversation.
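The extrapolation can be sketched like this. The 16,000 and 7,000 figures are the ones quoted above; the doubling for a "balanced conversation" is my own simplifying assumption that each speaker talks half the time.

```python
# Sketch of the speaking-time extrapolation, from the two quoted figures.

WORDS_SPOKEN_PER_DAY = 16_000   # Mehl et al.
WORDS_PER_HOUR = 7_000          # Longman Grammar, conversational texts

# Hours per day spent actually speaking.
hours_speaking = WORDS_SPOKEN_PER_DAY / WORDS_PER_HOUR

# Assumption: in a balanced two-way conversation, each person talks half the time.
hours_in_conversation = 2 * hours_speaking

# Cross-checks against the Longman quote.
words_per_minute = WORDS_PER_HOUR / 60
hours_per_million_words = 1_000_000 / WORDS_PER_HOUR

if __name__ == "__main__":
    print(f"speaking: {hours_speaking:.2f} h/day")
    print(f"silent: {24 - hours_speaking:.2f} h/day")
    print(f"in conversation: {hours_in_conversation:.2f} h/day")
    print(f"rate: {words_per_minute:.1f} words/min")
    print(f"one million words: {hours_per_million_words:.1f} h")
```

The cross-checks agree with the Longman quote: about 117 words per minute and about 143 hours per million words.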

In the written mode, Anderson, Wilson, & Fielding estimated that, outside of school, the median fifth-grade student reads about 600,000 words per year (about 1,650 words per day), while in school, according to Nagy & Anderson, they read about 1.3 million words per year (about 3,650 words per day) for a combined reading exposure of about 5,300 words per day.

It would be interesting to know what our total daily word count is, including everything we write, read, hear and speak. [update: July 19, 2007. I've done a quick guesstimate here which suggests an average of about 75,300 words per day (with huge individual and daily variations).]

Thursday, July 05, 2007

Super Mario Vocabulary

Kahori Sakane has an article in the Daily Yomiuri about schools giving students Nintendo DS handheld game consoles with specially designed vocabulary study software and providing them with time in class to use them. Even better, the schools seem to have done some research into the effectiveness (not well controlled, admittedly) and the students' reactions. It turns out they like it.

Actually this isn't all that surprising. We had similar success with a Mac program called Vocab (which is still free, but seems to have been forgotten by its developers) and its companion Vocab Scheduler back in the late 90s at the Tokyo high school where I taught. The benefit of this new program is that the consoles are a whole lot cheaper than computers, teachers don't need any computer savvy to run them, and they can be moved around from class to class.

Of course, you need the right words, good content (definitions, translations, example sentences, etc.) and a well-established review regime, but I think this is certainly a useful intervention. Even better would be to deploy a version that works with learners' cell phones.

More about vocabulary here, here, and here.

Link courtesy of David Paul at ETJ.

Tuesday, July 03, 2007

Error collections

From time to time it is useful to have a detailed look at the kind of errors that learners of English make. The International Corpus of Learner English from Université catholique de Louvain (Belgium) has an error-tagged corpus of written text produced by learners of English. More information here. Unfortunately, it is not freely available online, but the price is modest. You can also gain access to it by sending them your students' texts (after, I assume, receiving permission from your students.)

Errors can be amusing as well as edifying, such as this one. Anders Henriksson has been collecting students' mistakes for many years. Here's a collection that has been around since at least the early 1990s. He's also got a book of them.

Finally, last week the Language Loggers posted a link to a video of Taylor Mali changing all the usual typos into speakos.

Right to bargain

English-language teachers in Ontario Colleges (and elsewhere) are disproportionately non-full-time workers. In our EAP department, for instance, fewer than 2/3 of the classes are taught by full-time faculty.

In Ontario, most part-time and contract college faculty have been denied the right to join a union. These workers are supported by the CAAT division of OPSEU and have set up their own non-bargaining group, OPSECAAT, but have never had the right to bargain collectively. It looks as though change may be coming.

Back in June, the Supreme Court of Canada ruled that,

"The right to bargain collectively with an employer enhances the human dignity, liberty and autonomy of workers by giving them the opportunity to influence the establishment of workplace rules and thereby gain some control over a major aspect of their lives, namely their work."

However, the Ontario government will not reconvene until after the October election, so there will be no legislative changes until then. As far as I can tell, none of the parties, not even the labour-friendly NDP, has made any mention of this ruling in their platform. Finally, the Supremes have given governments a year to address the issue, so we shouldn't expect anything to change soon.