Thursday, December 20, 2007

Lack of data vs. Poverty of the stimulus

NewScientist recently published an article about the subconscious that included the following claim:
"Infants do not need tutoring to acquire their native language; they pick it up subconsciously. What's more they do this with remarkably little linguistic data - what the Harvard University linguist Noam Chomsky has called the "poverty of stimulus" - suggesting that this subconscious learning allows youngsters to use information very efficiently."

The nativist language acquisition argument from the "poverty of the stimulus" seems as though it may have been misconstrued. It is not a quantitative argument but a qualitative one. Miller and Chomsky claimed in 1963 that children's input included many grammatical errors and disfluencies. How were children to discover which input to attend to and which input to ignore?

More recent arguments suggest that, errors aside, given the lack of negative evidence, it is hard to see how children could learn what is ungrammatical. We have evidence for children overapplying rules such as regular past tense endings. Given that, it would be logical to assume they would overapply other patterns such as the following:

  • She sent a letter to him. / She sent him a letter.
  • She explained the letter to him. / *She explained him the letter.

Without evidence that this last construction is not possible, the 'poverty of the stimulus' argument goes, how are children to know that it is ungrammatical rather than something they merely have not yet heard?

The idea that children simply don't get enough input, however, is odd. Research suggests that average North American children get somewhere in the range of 26 million (plus or minus many million) words of input in their first four years of life. It's hard to see how this can be called "remarkably little linguistic data."
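To put that figure on a daily scale, here is a rough back-of-the-envelope sketch. The function name, the 365-day year, and the even spread of input across four years are my own simplifying assumptions, and 26 million is only the midpoint of a very wide range.

```python
# Back-of-the-envelope: convert a multi-year total of heard words into a daily average.
# Assumes 365-day years and evenly spread input, which is a simplification.

def words_per_day(total_words: int, years: int, days_per_year: int = 365) -> float:
    """Average number of words of input per day over the given number of years."""
    return total_words / (years * days_per_year)

daily = words_per_day(26_000_000, 4)
print(f"about {daily:,.0f} words per day")
```

Even if the true total were half that, the daily figure would still be in the thousands.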

Saturday, December 15, 2007

(det) hijab

What with the tragic filicide of Aqsa Parvez, the word hijab has been much in the news. What I had not noticed before is that it is almost always the hijab, as in "she was killed over her refusal to wear the hijab." The OED cites the following instances:
"1980 Associated Press Newswire (Nexis) 15 July, She said the wearing of the hijab, or veil, is a matter of choice. 1985 Times 5 Jan. 8/3 Every woman on the street wears either the traditional chador..or the more practical hijab, a dark scarf pulled over the forehead. 1994 J. I. SMITH in A. Sharma Today's Woman in World Relig. 306 Many Egyptians who do not adopt the higab are deeply respectful of those who do. 1995 New Yorker 30 Jan. 60/3 A young girl..who was dressed in a black djellabah and wearing the traditional head scarf, or hijab. 2005 Asiana Spring 278/1 (heading) Wearing a hijab is no barrier to success."

Notice that only the last one makes the unmarked choice of a instead of the. This suggests that it is a symbol rather than merely a piece of clothing. Other somewhat analogous items include: veil, burka, turban, kirpan, cross, and robes of office.

Unfortunately, none of these is entirely satisfactory. For a nun, taking up the veil entails more than simply wearing it. To a certain extent, the same could be said for hijab, but once a nun takes up the veil, it doesn't matter whether or not she is actually wearing it at a given time. As this story shows, the same is not true of hijab. Also, burka and hijab should, theoretically, pattern identically, but they don't. Although the OED has examples such as, "1929 Daily Express 15 Jan. 1/1 The Queen [of Afghanistan] is wearing the boorka--a heavy shapeless garment which effectually hides her beauty," current usage seems to favour "a burka". Here are the counts from the Google news archive:

        burka   hijab
the     818     4,850
a       1,380   2,340

With turban, we find 4,540 for the vs. 12,800 for a. And kirpan is almost in a dead heat.

Clearly this data is very muddy, but I think it is at least suggestive that there is something different going on with hijab.
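To make the comparison a little easier to see, here is a minimal sketch that turns the counts into the share of hits using the. The burka and hijab figures follow my reading of the table above (818 and 4,850 for the; 1,380 and 2,340 for a), so treat that split as an assumption; the turban figures are the ones quoted in the text. The names in the code are only illustrative.

```python
# Counts of "the X" vs. "a X" from the Google news archive figures above.
# The burka/hijab split is an assumed reading of the table; turban is quoted in the text.
counts = {
    "hijab":  {"the": 4850, "a": 2340},
    "burka":  {"the": 818,  "a": 1380},
    "turban": {"the": 4540, "a": 12800},
}

def share_of_the(c: dict) -> float:
    """Proportion of hits that use the definite article."""
    return c["the"] / (c["the"] + c["a"])

for noun, c in counts.items():
    print(f"{noun:7s} the: {share_of_the(c):.0%}")
```

On those numbers, hijab takes the roughly two-thirds of the time, against a bit over a third for burka and about a quarter for turban.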

Saturday, December 01, 2007

Past tense cut in US schools. Is Canada next?



With American schools cutting the past tense from the language curricula, it can't be long before Conservative provinces in Canada follow suit.

Sunday, November 25, 2007

Double modals

A number of people who attended my TESL Ontario presentations on Thursday and Saturday seemed skeptical that modal auxiliary verbs co-occur in southern dialects. Ben Zimmer comes to the rescue with a perfect example, and from a US presidential candidate no less. As far as I know, however, there are no dialects that allow multiple coordinator constructions like "and but" (which was the point I was making in my presentation).

Tuesday, November 20, 2007

More large-scale testing problems

I've commented before on problems with large-scale testing at ETS. Now Sam Dillon, writing in the New York Times, reports on a case in which incorrect page numbering on test booklets led to the disqualification of 5,600 reading test scores, an embarrassment that led to a half-million-dollar penalty for RTI International, the company that handled the tests.

Thursday, November 15, 2007

The grammar of physical exhaustion

I wonder if Shatner ever felt that he was hyphenating his words. Likely this is more of a cartoonist's thing.

Wednesday, October 24, 2007

And the worms ate into his brain

In his new book, Musicophilia, Oliver Sacks seems to have created a new word for those sticky songs that you can't get out of your head. In German, these things are called ohrwurms, their word for earwigs. This has come into English as earworm. Now Sacks has decided that he prefers brainworm. You can watch a video of him talking about it on the book's page on Amazon.com.

John Terauds, writing in the Toronto Star on the weekend, used Sacks's new word, but now says via e-mail that it was a brain fart.

Thursday, October 18, 2007

Smaller than small

The ETJ list is often a source of interesting questions. Recently, Peter Warner observed:

"Listening to a British presenter, I realized that his sense of little and small were different. His general order of descending relative size went roughly in this sequence:

huge, big, small, little, tiny

while my American background views little and small as basically synonymous. Speaking with him after his wonderful talk confirmed his different sense of those two words."

Peter then asks, "is this a British vs. North American English usage difference?"

If you check the British ESL dictionaries, they typically define little as meaning small and the other way round. Similarly, the Concise Oxford defines little as "small in amount, size, or degree..." There is no mention of one being less big than the other.

There are, however, other differences. Least interesting among these is that small is only an adjective (the small of your back excepted), while little exists in two flavours: adjective & determiner. More interesting is that they are both marked, but little seems to be more marked. That is, when I want to know the size of something, it's typical that I would ask you how big something is, not how small it is. In this sense, big is unmarked. Yet, it is even less common to ask how little something is, about 10 times less common in the frame "how ~ is it?".

Because of the problems parsers have distinguishing between adjectives and determiners, it's hard to get a good corpus view of all aspects of the difference, but here's a neat one: Search for (adj) + little in the spoken section of the BNC and you get the following (the first number is raw hits, the second number is hits per million words):

1 NICE LITTLE 161 15.58
2 TINY LITTLE 36 3.48
3 LOVELY LITTLE 35 3.39
4 POOR LITTLE 33 3.19
5 OTHER LITTLE 14 1.35
6 LITTLE LITTLE 13 1.26
7 SILLY LITTLE 13 1.26
8 CHEEKY LITTLE 11 1.06
9 PRETTY LITTLE 11 1.06

Try it again in the academic section and what do you get? Nothing. There are no such pairs.

Now do the same with small, this time with academic section first:

1 PROXIMAL SMALL 28 1.81
2 OTHER SMALL 20 1.30
3 DISTAL SMALL 6 0.39
4 NORMAL SMALL 6 0.39
5 UPPER SMALL 6 0.39
6 ENTIRE SMALL 4 0.26
7 INDIVIDUAL SMALL 4 0.26
8 NUMEROUS SMALL 4 0.26

And in conversation? Zip.

So it would seem that small is more academic, at least in this kind of pairing.
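As an aside on the two columns in the lists above: the per-million figure is just raw hits divided by the size of the section in millions of words, so each row also implies a section size. The sketch below works backwards from a few of the rows quoted above; the function and variable names are mine, and nothing is assumed about the BNC beyond those rows.

```python
# Hits per million = raw hits / (section size in millions of words).
# Rearranged, each row yields an estimate of the section size.
# Rows are copied from the lists above.

spoken_little = [("NICE LITTLE", 161, 15.58), ("TINY LITTLE", 36, 3.48)]
academic_small = [("PROXIMAL SMALL", 28, 1.81), ("OTHER SMALL", 20, 1.30)]

def implied_size_millions(raw_hits: int, per_million: float) -> float:
    """Section size, in millions of words, implied by one row of results."""
    return raw_hits / per_million

for label, rows in (("spoken", spoken_little), ("academic", academic_small)):
    for pair, raw, pm in rows:
        size = implied_size_millions(raw, pm)
        print(f"{label:8s} {pair:15s} -> about {size:.1f} million words")
```

The rows within each section agree with one another, which suggests that the two sections differ in size and that the per-million column is the fairer one to compare across them.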

Now, this is stretching things quite a bit, but if we consider the general prejudice that conversation is trivial while academic language is weighty, long, and substantial, there may be an argument, if only a metaphorical one, for little being smaller than small. I think I'll put it to Lynneguist over at Separated by a Common Language.

Friday, October 05, 2007

Problematising Problematic

About a month ago, Russell Smith, writing in the Globe & Mail, discussed a tendency for the meaning of technical terms, such as price point and deconstruct, to drift as the words become mainstream.

"It's inevitable that technical words and phrases will be imported into everyday language from specialized jargons. It's also inevitable that those terms will then change their meanings slightly. They usually lose some of their specificity, a bit of their subtlety, and become synonymous with some other everyday term."

I think he overstates the case here, but still, there is a case to be made. And notice how the paragraph above is simply descriptive, rather than evaluative. Perhaps his time at And Sometimes Y has relieved Smith of some of his more prescriptive tendencies. Even so, he makes it clear that he hasn't entirely shuffled off the curmudgeon's burden when he writes, "My favourite example of a corrupted technical term has to be 'deconstruct'." He explains Derrida's use of the word like this:

"The aim of the reading was to show how the text's meaning is elusive, how it contradicts itself. It's part of a larger view of language as something essentially problematic. In other words, it hardly means an elucidation, as we use it so casually to mean now, but almost the opposite"

Still, Smith remains admirably neutral throughout. And the changes that he describes actually seem to be supported by evidence.

Ironic, then, that he is taken to task in a letter that appeared a few days later:

Posted on 08/09/07

Problematic problem

Kimberley, Ont. -- It is striking that in discussing the demotic use of jargon (Technical Terms And Mainstream Meanings - Review, Sept. 6), Russell Smith employs "problematic" in just such a manner. Having originated in the field of logic, the word means doubtful or questionable and has only recently come to be used in describing something that poses a problem. That Mr. Smith is, surprisingly for him, unaware of this history may be a problem, but it is definitely not problematic.

Here, the letter writer, Ferguson, indulges in a number of fallacies (but they're so fine, you see):

  1. language must not change, so the older meaning is the only true meaning

  2. language always moves from the erudite to the debauched, so the technical meaning must be older

  3. whatever I think a word means is what it means

If we consult the OED, we find that it lists more than one meaning for problematic. Yes, polysemy is alive and well. We also find that the term of art from logic dates from 1610, but the more general sense is attested from 1609. From this, it would be hard to argue that one preceded the other in English. Finally, the OED disagrees with Ferguson's definition, giving the following instead:

2. Logic. Of a proposition: that asserts that a state of affairs is possible rather than actual or necessary.

Sunday, September 30, 2007

If they will only listen

Believe me, English really doesn't have a future tense. Some may think this is simply a labeling preference, but it goes beyond that. The belief that English has a future tense gets people all twisted up in ad hoc explanatory knots that vanish once you accept a future-tenseless English into your heart.

Take, for instance, this question, which recently showed up on the ETJ list.

I was asked if the sentence "if he's not going to change his mind
there is no use talking to him" was correct.
I said yes.
"But you can't use if with the future tense" came the reply
"quite right" said I.
"so why is this ok?"

In case you're asking where these two came up with this "rule", here are a couple of versions for you. The first one is from a TESL site called englishpage.com.

Like all future forms, the Simple Future cannot be used in clauses beginning with time expressions such as: when, while, before, after, by the time, as soon as, if, unless, etc. Instead of Simple Future, Simple Present is used.

And here's another from Wikipedia (now changed):

Future tense forms are not used in the condition clause (protasis) in English: *If it will rain this afternoon, …

If you subscribe to the future-tense school of English grammar, it is completely natural to look for regularities in how such a tense would be realised. This is what we're seeing above. The people who are promulgating these "rules" are looking at a specific instance in which it seems to be true (e.g., *if it will rain this afternoon) and generalising from there.

But this disallows perfectly fine sentences such as "if you'll excuse me, I've got a bus to catch" or even "If you'll be using a wheelbarrow frequently, then make risers low". It also catches instances such as the one that initiated the question above.

On the other hand, it ignores other problems such as "it could rain tonight" becoming "if it could rain tonight...", or "that may be the right one" becoming "if that may be the right one", or even "he might have been there" as "if he might have been there".

A belief in the future tense does this because it leads people to look for regularities in some hodge podge notion of the future tense where there are none to be found rather than within the system of modal auxiliaries where they actually exist.

In fact, the problem only resides with certain modals (mainly 'will', 'may', 'might', and 'could') when they are used to express probability (i.e., epistemic uses). Thus, "it could rain tomorrow" doesn't work as *"if it could rain tomorrow", but "you could help me tomorrow, couldn't you" easily becomes "if you could help me tomorrow" because 'could' here is denoting ability (or willingness, i.e., it is deontic) rather than probability. Note that this also holds true for present and past time as well as future time.

Saturday, September 29, 2007

Rethinking rethink

A student (college-level native speaker of English) writes, "It is time to rethink the path society is on." Something in my brain went sproing when I read this. Well, not really sproing, more like twing. Anyhow, when I went over it again, path just didn't feel quite right.

I ran it by my family, who were enjoying the weekend while I waded through this pile of papers. My brother found it unremarkable, but my mother immediately rephrased it as "rethink where society is going". It then occurred to me that rethink seems to have a different set of complements from think. After some googling and corpus searches, I'm confident that this is so.

You'd never say, "*think your plans/policy/approach", but "rethink your plans/policy/approach" is fine. It seems that, for whatever reason, rethink now patterns with reconsider rather than think. I wonder when that happened or if it has been thus since it came into the language around 1700. I wish I had access to the OED from home. [Now that I'm back at work with access to the OED, I find that rethink has been transitive from its earliest recorded use.]

Thursday, September 27, 2007

CBC blowin' in the Burmese wind

CBC started the week reporting on the marches in "Myanmar, formerly known as Burma". For the last two days, however, it's been "Burma, also known as Myanmar".

[Actually, a quick web search shows that it's a policy change.]

Sunday, September 23, 2007

Promises are not verbs

In today's Toronto Star, Andrew Chung writes,

"Promises are the currency of elections. With no platform to judge, on what other basis could voters make a decision when casting a ballot? Promises are also part of a category of verbs that experts call 'performative speech acts.' These utterances actually cause the speaker to perform a certain act."

I'm pleasantly surprised that the topic of performative speech acts (PSAs) should show up in The Star. They're rather curious self-fulfilling things. When you say, "I promise x", you have done so and need do no more. The act is carried out through the utterance. Promise isn't the only word that works this way. Other examples are declare, sentence, damn, pronounce, etc.

But notice that though this is a list of verbs, the verbs themselves do not, contrary to what Chung says, constitute speech acts. It is the utterance, usually in the form of a sentence, that is the PSA. Nor are promises verbs. But I do appreciate Chung's giving it the old college try.

Saturday, September 22, 2007

An ESL policy update from Annie Kidder

Today, Annie Kidder at People for Education posted a comment to a previous post. It's substantial enough that I'm reposting it here.

In early September, the province quietly posted its new English as a Second Language (ESL/ELD) policy on the Ministry of Education website. The province has been promising new funding and policy for ESL since the spring of 2006, and while the new policy addresses some of the concerns raised over the last three years by Ontario’s Auditor General and others, it is difficult to see how it will alleviate many of the ongoing issues in ESL programs.

  • Many school boards report using a substantial portion of their ESL funding to cover the costs of things like heat, light and building maintenance. The new policy does not protect ESL funds.

  • Students’ ESL support is reduced, or eliminated altogether, when the funding runs out, as opposed to when the student has sufficient English skills to function academically. The new policy says this should not happen, but does not commit to supplying funding for students who are not ready to be withdrawn from ESL programs.

  • There is no measurable English-proficiency standard that ESL students should attain before ESL services are discontinued. No standard was set.

  • The Ministry does not ensure that the ESL/ELD funding targets students most in need of assistance. The policy suggests that boards use the funds where it will be most effective – but there is no specific direction given.

  • Funding for ESL does not differentiate between students who arrive in Ontario as refugees and have little or no formal education in their first language, and students who have attended school in their home countries. The policy notes that there are different levels of need among ESL/ELD students, but does not provide for differentiated funding.

  • According to the Ministry of Education, students usually take from five to seven years to become fluent in English. But funding for ESL/ELD support runs out after four years. Funding has not been extended beyond the four years.

  • There are no minimum ESL/ELD training requirements for regular classroom teachers who have a significant number of ESL students. The policy suggests it would be beneficial for new teachers to acquire ESL skills, but does not require it.

Over 600,000 foreign immigrants moved to Ontario between 2001 and 2006, most from non-English-speaking countries and many with school-age children. As a result, the proportion of ESL students in schools increased by 24%. Over the same period, the percentage of elementary schools with ESL teachers declined by 23%.

In schools with higher ESL populations (more than 10 ESL students) the percentage reporting ESL students but no ESL teacher has doubled since 2000.

The new policy does not require school boards to spend ESL money on ESL programs, it does not set standards for an acceptable level of English proficiency and it does not provide funding that recognizes the difference between refugee students needing substantial literacy support and ESL students who have strong literacy skills and only require support to learn the language.

To read the policy go to the Ministry of Education website.

Wednesday, September 19, 2007

Pirate speak in other languages

Since the good folks at Language Log reminded me that today is Talk-Like-A-Pirate Day, I asked my students if pirates have any particular linguistic tics in their languages.
  • A German student came up with a sentence terminal ay! (read /Ei/). There's also this glossary of German pirate words.
  • The Spanish speakers said Spanish pirates all have Castilian accents.
  • Korean, Japanese, Iranian, Iraqi, Arabic, and Turkish students couldn't identify any particular pirate dialect.

Tuesday, September 18, 2007

Rabbit numbers

One of the readings included in our Comm 200 packet for students to respond to is "Dot-com this!" by Stephanie Nolen from The Globe and Mail, Aug 28, 2000. pg. R.1. It includes the following:

"Eric McLuhan, author of Electric Language, adding that English, as the language with the greatest flexibility and largest vocabulary, was the only language prepared for this shift.

But McLuhan, who is the son of the legendary communications theorist Marshall McLuhan, says the 15 years of the computing era have had drastic effects on the building blocks of writing.

Attention spans have declined sharply, and with them, sentence length. Twenty years ago, the average sentence length in a novel was 20 words; today it is 12 to 14 words. In mass-market books such as Harlequin Romance novels, the average sentence is only seven or eight words."

The stuff about English having the greatest flexibility and largest vocabulary is not even worth commenting on, but it seems pretty clear that McLuhan is just making up the sentence lengths too. Just in case, I did a quick tally to see how accurate his claims are. I took the top 10 selling books from 1985 and 2005 (from here) and pulled up the words-per-sentence stats from Amazon.com (e.g., here). Where they weren't available, I took a book from the same author published around the same time. Here are the results.








1985                                                          WPS
The Mammoth Hunters by Jean M. Auel                           13.8
Texas (All We Did Was Fly to the Moon) by James A. Michener   14.6
Lake Wobegon Days by Garrison Keillor                         16.8
If Tomorrow Comes by Sidney Sheldon                           10
Skeleton Crew by Stephen King                                 12.5
Secrets by Danielle Steel                                     12.7
Contact by Carl Sagan                                         14.3
Lucky by Jackie Collins                                       8.7
Family Album by Danielle Steel                                14
Jubal Sackett (Lando) by Louis L'Amour                        14.2
average                                                       13.16








2005                                                     WPS
The Broker by John Grisham                               11.5
The Da Vinci Code by Dan Brown                           11
Mary, Mary (actually Cat and Mouse) by James Patterson   9.6
At First Sight by Nicholas Sparks                        11.6
Predator by Patricia Cornwell                            11.1
True Believer by Nicholas Sparks                         12
Light from Heaven by Jan Karon                           9.9
The Historian by Elizabeth Kostova                       19
The Mermaid Chair by Sue Monk Kidd                       13.5
Eleven on Top (Lean Mean Thirteen) by Janet Evanovich    9.1
average                                                  11.83

In this small unscientific study, there is a tendency for sentences to be shorter in 2005, but the difference of 1.33 WPS is nothing like the drop of 6 to 8 WPS (from 20 down to 12 to 14) that McLuhan is claiming. And when the largest average for a single book from 1985 is 16.8 WPS, it seems highly unlikely that you're going to find an average of 20 WPS for all novels published in that year. Maybe what computers have made us better at is pulling numbers out of a hat as if they were rabbits. Then again, maybe better is the wrong word.
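For anyone who wants to check the tally, here is a minimal sketch that recomputes the two averages from the words-per-sentence figures listed above. The variable names are mine; the numbers are the ones in the two lists, in the same order.

```python
from statistics import mean

# Words-per-sentence figures for the ten titles in each year, as listed above.
wps_1985 = [13.8, 14.6, 16.8, 10, 12.5, 12.7, 14.3, 8.7, 14, 14.2]
wps_2005 = [11.5, 11, 9.6, 11.6, 11.1, 12, 9.9, 19, 13.5, 9.1]

avg_1985 = mean(wps_1985)
avg_2005 = mean(wps_2005)
difference = avg_1985 - avg_2005

print(f"1985 average: {avg_1985:.2f}")
print(f"2005 average: {avg_2005:.2f}")
print(f"difference:   {difference:.2f}")
print(f"highest single book in 1985: {max(wps_1985)}")
```

With only ten books per year, this is a sanity check on the arithmetic rather than a test of the claim.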

Monday, September 17, 2007

Hofstadter on Pinker

Over the weekend The LA Times published a review by Douglas Hofstadter of Steven Pinker's new book The Stuff of Thought. (The Times has put this behind a pay wall. You can still read it here.) Hofstadter writes, appropriately, with the caution and insight of a scientist, disagreeing graciously where he sees fit. "Pinker exploits his wonderfully keen faculty for linguistic observation to pry open the human head and discover its secrets. Sometimes this technique works terrifically, other times not so well."

Saturday, September 15, 2007

A new season of "And sometimes Y"

CBC's "And Sometimes Y" has started a new season with a new host. Unfortunately, I've missed both episodes so far and they seem to have stopped providing even the limited audio excerpts they used to.
[Update, Sept 20: Perhaps I blogged too soon. There is now a RealAudio stream available for episode 2. Still nothing for 1 though.]

Tuesday, September 11, 2007

Mixed signals

Horizons BetaPro funds are running a marketing campaign featuring a dictionary-entry style design (the ads don't seem to be on their website). The entry is like this (disregarding layout):
b ta prfit
(verb): to capture double the daily market performance of an equity sector by investing in a...

The company is "Canada's sole provider" of this specific type of fund. So, if it's Canadian, why use the British pronunciation of beta? The Canadian Oxford gives only /beitə/. The ad appears to use the American Heritage Dictionary pronunciation key, another oddity (though my mom says this is the system she learned growing up in Manitoba). At least it does for the stressed vowels. The /a/ and the /i/ don't seem to fit in here.

And then there's the stress marks. Most British dictionaries mark stress at the beginning of the syllable and AHD marks it at the end. In this ad, though, it seems to be marked on the stressed vowel.

All in all they've made quite a hash of it. Fortunately for them, almost nobody will notice. And all those diacritics do add a certain cachet--the corporate version of the heavy metal umlaut.

Saturday, September 08, 2007

Linguistic hipness at ROB

For a conservative magazine, Report on Business is trying remarkably hard to be hip. The September cover reads:
DOES JIM FLAHERTY HAVE A HATE-ON FOR BAY STREET?

The Urban Dictionary has an entry for hate-on that's almost three years old, but Google only returns 12,000 hits. That may seem like a lot, but in comparison jape, today's WOTD from Merriam-Webster Online, a word I'd never seen before, gets more than 500,000 hits. Many of the "hate-on for" hits eschew the hyphen, suggesting either that the writers have a more minimalist punctuation style or that they missed the joke.

Meanwhile, the ROB headline writer, obviously in a merry mood, continues on inside with the headline, "Flaherty will get you nowhere".

Friday, September 07, 2007

Voicelessly & labiodentally so

With the Toronto International Film Festival in full swing, the Toronto media is tripping over itself to avoid using the title Young People Fucking. This morning, on CBC Radio 1, Toronto, Jesse Hersh called it Young People Copulating, to which Andy Barrie replied, "only more fffricative."

Thursday, September 06, 2007

AWL washed up?

I have a good deal invested in the Academic Word List (AWL). I have developed a lot of materials based on it and even spent many hours writing definitions and collecting example sentences for the AWL words (available on the Simple English Wiktionary). I have done so because I found the arguments for such a list compelling, because I believed the AWL was well designed, and because the results struck me as having good face validity. Many students have come back to me after beginning their post-secondary studies and said how valuable the AWL has been to them. (I would also note that our program is small and we do not have the ability to stream students by major.)

But this past June, Ken Hyland and Polly Tse published an article in TESOL Quarterly called 'Is there an "academic vocabulary"?' Here's the abstract:

This article considers the notion of academic vocabulary: the assumption that students of English for academic purposes (EAP) should study a core of high frequency words because they are common in an English academic register. We examine the value of the term by using Coxhead's (2000) Academic Word List (AWL) to explore the distribution of its 570 word families in a corpus of 3.3 million words from a range of academic disciplines and genres. The findings suggest that although the AWL covers 10.6% of the corpus, individual lexical items on the list often occur and behave in different ways across disciplines in terms of range, frequency, collocation, and meaning. This result suggests that the AWL might not be as general as it was intended to be and, more importantly, questions the widely held assumption that students need a single core vocabulary for academic study. We argue that the different practices and discourses of disciplinary communities undermine the usefulness of such lists and recommend that teachers help students develop a more restricted, discipline-based lexical repertoire.

Yikes!

I wondered if the Hyland & Tse study's results had anything to do with the small corpus size. Their corpus is only 3.3 million words, somewhat smaller than Coxhead's 3.5 million-word corpus. Nowhere do they examine whether their results are statistically significant and I'm afraid that my stats are not up to the task either. But there's a much larger corpus that we can compare results against to see if the findings hold up: the British National Corpus using the VIEW interface.

At one point, Hyland & Tse write,

Table 6 shows the main meanings for selected words with different overall frequencies in the AWL together with their distributions. The first four are from our high frequency list, with occurrences above the overall mean, and show that even where uses are very frequent, preferred uses still vary widely, with social science students far more likely to meet consist as meaning "to stay the same" and science and engineering students very unlikely to come across volume meaning "a book or journal series" unless they are reading book reviews.

Where they talk about consist as meaning "to stay the same", they are referring to the word family and, I assume, to the words consistent and consistently. Here is the relevant data from Table 6.








Words     Meaning      Nat sci   Eng   Soc sci
Consist   Stay same    34%       25%   55%
          made up of   66%       75%   45%
(As published, 'stay same' for Eng is 15, but all other numbers sum to 100%, so I assume it's a typo).

So, what happens when we look at the same numbers in the BNC?








Words     Meaning      Nat sci   Eng   Soc sci   Polit & law
Consist   Stay same    36%       28%   48%       49%
          made up of   64%       72%   52%       51%

The range across the three original disciplines drops from 30 to 20 percentage points but is still notable. I'll poke around a bit more and see what I turn up.
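Here is a small sketch of how that range can be computed from the "stay same" rows of the two tables above. The dictionary and function names are mine, and the Hyland & Tse engineering figure uses the corrected 25% rather than the published 15.

```python
# Percentage of "stay same" uses of the consist family, by discipline.
# Figures are taken from the two tables above (Eng corrected to 25 for Hyland & Tse).
hyland_tse = {"Nat sci": 34, "Eng": 25, "Soc sci": 55}
bnc = {"Nat sci": 36, "Eng": 28, "Soc sci": 48, "Polit & law": 49}

def spread(shares: dict) -> int:
    """Gap in percentage points between the highest and lowest discipline."""
    return max(shares.values()) - min(shares.values())

# Restrict the BNC figures to the disciplines Hyland & Tse report.
bnc_same_three = {name: bnc[name] for name in hyland_tse}

print("Hyland & Tse:", spread(hyland_tse), "points")
print("BNC, same three disciplines:", spread(bnc_same_three), "points")
print("BNC, all four disciplines:", spread(bnc), "points")
```

Including politics and law nudges the BNC spread up by a point, so the comparison is not sensitive to which of the two ways it is made.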

Friday, August 31, 2007

4th Annual Language Learner Literature Award

It's not up on the web site yet, but here's the press release from the Extensive Reading Foundation.

--------
Today, the Extensive Reading Foundation, an unaffiliated, not-for-profit organization that supports and promotes extensive reading in language education, announced the winners of the 4th Annual Language Learner Literature Award for books published in 2006.

An international jury chose the winning book in four categories, taking into account the Internet votes and comments of students and teachers from a reported 21 countries around the world. I know that a lot of readers on this list were among the voters (including 107 votes from Japan alone), so thank you and your students for that.

The winning books are (drum roll...)

Young Learners
"The Boy Who Burped Too Much" by Scott Nickel. Illustrated by Steve Harpster. Graphic Sparks (Stone Arch). The jury noted the fast-moving plot and colorful illustrations in a story that will provide great fun and excitement to children, who won't be able to put it down until they finish it. Voters commented, "really fantastic and interesting." (Hong Kong) "It's funny and the pictures are wonderful." (Viet Nam)

Adolescents and Adults--Beginners
"Let Me Out" by Antoinette Moses. Cambridge English Readers, Starter Level. This is Moses' second Language Learner Literature Award after winning in 2004 for "Jojo's Story". The jury found "Let Me Out" a very well crafted science fiction story, noting its rare ability to create an emotional connection between reader and book characters. That it was a popular winner in a competitive field heartened the jury: "It's great to see a strong story-line carry the day." Voters commented, "scary and exciting." (Japan) "The story makes us think deeply about human life." (Russian Federation)

A&A Intermediate
"Rabbit-Proof Fence" by Doris Pilkington Garimara. Retold by Jennifer Bassett. Oxford Bookworms Library, Stage 3. The jury called the book sustained and powerful; a true story that reflects the experience of marginalised people everywhere. The well-paced retelling brings a second Language Learner Literature Award to Jennifer Bassett, who won for "Love among the Haystacks" in 2005. One teacher commented, "From the moment [my students] opened the book, they felt as if the story was about their own lives.... [It] helped them to become closer to their classmates who were not from the same backgrounds as they." (United States)

A&A Advanced
"The Age of Innocence" by Edith Wharton. Retold by Clare West. Oxford Bookworms Library, Stage 5. It successfully marshals its large cast of characters in a book that will keep its readers guessing until the end. The jury found it ideal for readers who enjoy stories that deal with emotions and relationships. Voters commented, "Because the descriptions were very good... you read and don't stop." (Peru) "It was so romantic." (Somalia)

In addition to the winners, the following books were selected as the shortlisted "finalists" in each category:

Young Learners
"The Goose Girl" Classic Tales (Oxford University Press).
"The Twelve Dancing Princesses" Classic Tales (Oxford University Press).

A&A Beginners
"Blog Love" Scholastic Readers Starter Level. (Mary Glasgow Magazines).
"The Story of Chocolate" Easyread Level One. (Black Cat).

A&A Intermediate
"Crossroads to Love" Teen Readers Level 3. (Aschehoug/Alinea).
"The No. 1 Ladies' Detective Agency" Penguin Readers Level 3.
"Strong Medicine" Cambridge English Readers Level 3.

A&A Advanced
"Barchester Towers" Oxford Bookworms Library Stage 6.

If you want to add some excellent books to your extensive reading library, the winners and finalists are available for online purchase at the Cambridge International Book Centre.

Sunday, August 26, 2007

The toilet of yours

Today, my 3-year-old said to my mom, "*I'm going to the toilet of yours." It struck me that this would be grammatical (though somewhat inappropriate) if she had just changed "the" to "this". I wonder why.

I'll have to look into it, but we're heading down to the public pool right now. In the meantime, if anyone has an answer, feel free to post it.

Saturday, August 25, 2007

Right-sized answer

You'll recall that a few days ago, Joshua Myerson asked about expressions such as "short sleeve(d)" & "large size(d)". I put it to Rodney Huddleston who pointed me to p. 1709 in the Cambridge Grammar of the English Language. There you will find a rather ingenious solution: the -ed suffix is not the same -ed suffix you find on verbs (cf. the -er suffix on nouns vs. the -er suffix on adjectives). Not only that, the suffix does not apply solely to the final word of the pair, but to the pair as a unit. In this way, it is similar to the possessive suffix on somebody else's, where else's all by itself makes no sense. Though this -ed usually applies to modifier-noun pairs, it can apply to individual nouns (e.g., bearded). The basic meaning is just with (though other specialised meanings also exist).

Given this analysis, we don't have to worry about adjectives modifying verbs any more. This leads to the interesting distinction between high-powered tools in which it wouldn't make sense to ask "by what" and powered tools in which it would.

Neat!

conserving consumption

Loblaw-family supermarkets in Ontario have made a deal with Ontario Hydro to cut electricity usage in exchange for lower electricity prices. On a few hot days the stores have been rather dark, but generally you don't notice the change, except that the stores regularly play an announcement that includes the following: "We are doing our part to help conserve our community's energy consumption."

It seems to me that that's just what you do not want to do.

Wednesday, August 22, 2007

Defenders of the pronoun

Recently, for whatever reason, I've been in situations where the question of the status of pronouns has come up. Inevitably, there are people who vehemently reject the idea that pronouns are just a special kind of noun. One correspondent writes,
"Why insist on pronouns as a 'special case of nouns' when current handbooks from Hacker to Troyka and Ready Reference unswervingly place pronouns in a different category from nouns?"
Well, because pronouns and other nouns generally:
  1. signify the same range of concepts
  2. are subject to distinctions of case
  3. are subject to distinctions of gender
  4. are subject to distinctions of number
  5. share almost all the same functions (e.g., subject, object, determiner)
  6. share the same set of modifiers

Actually, 6. is a bit of a stretch. Typically pronouns don't license determinatives or adjectives but they sometimes can in a pinch (e.g., the new you). Then again, proper nouns don't usually license determinatives or adjectives either and nobody wants to set them off on their own.

These defenders of the pronoun inevitably argue that if you look up "parts of speech" in any reference, you will be told that pronoun is one. It has always been so, they say, pointing to the etymology (and falling foul of the etymological fallacy). But they offer nothing beyond tradition to explain why pronouns should get a class all to themselves. Nor can they explain why, if pronouns should, auxiliary verbs, for example, shouldn't.

These people are typically unsurprised that the physics, biology, and chemistry they studied in high school is no longer up to date. But they get positively defensive when somebody suggests that grammatical description has moved on. Why is that?

Saturday, August 18, 2007

Grammar by gosh and by golly

I have been assigned to teach a freshman writing course using Writing by Choice, by Eric Henderson. I've read most of it now and do appreciate the main theme about choice. I was, however, rather surprised when I came to the grammar section. I understand the need to keep things simple, but it seems to me that, perhaps in his efforts to do so, he has developed a basic framework which has rather too many inconsistencies and outright errors.

We begin with his discussion of "substantives". Though tradition is to recognise pronouns and nouns as separate "parts of speech" there appears to be no good reason for doing so. It seems more parsimonious to simply note that pronouns are a special case of nouns. Henderson follows tradition.

Regardless of what you think of the above point, the definition he provides for each remains problematic because of his reliance on semantic properties to the exclusion of morphology and syntax. To wit,
  • "Noun (nomen: 'name'): name of a person, place, or thing."
  • "Pronoun (pro + nomen: 'in place of the noun'): a word that takes the place of a noun in a sentence."

A punch, for example, is in no natural way a thing. It is an action (and I'm sure you can guess what Henderson's definition of verb is), yet punch can be either noun or verb. He goes on to say that the noun that is replaced by the pronoun is its antecedent and that indefinite pronouns have no antecedent. In other words, they don't take the place of a noun. This means they don't meet the defining criteria for pronouns.

The same thing is true of the "demonstrative pronouns" (which modern grammar deals with much more effectively under the category determiner--or determinative if you take determiner to be a function.) What noun does yonder, for instance, replace? And on p. 371, he notes that all pronouns must match the person of the antecedent, and writes, "all nouns are third-person." If we take the definition to be true, this negates the possibility of first and second person pronouns.

The simple fact that nouns (pronouns included) change for number and in forming possessives can easily be incorporated into a definition. Similarly, verbs conjugate for past tense (must excepted). Other properties can also be brought to bear to increase the level of precision (but at the expense of simplicity).

Another problem lies in Henderson's blurry distinction between parts of speech and functions. Nouns, he writes, have five functions: subject, object, object of a preposition, subjective complement, and appositive. Yet, when a noun modifies another noun, he says it is functioning as an adjective. Why is this not in the list of functions? And when did adjective become a function? In this book it is described as a part of speech, a category, that has the function of modifier. So would it not be more consistent to say that nouns function as modifiers?

The same can be said of the case on p. 320 where phrases are described as acting as nouns. Rephrasing this as "phrases can function as subjects or objects" makes for a much more coherent grammar.

And, before leaving nouns, as far as I can tell, his system completely overlooks their function in cases like night in "I met her last night" or day in "six days old".

The next issue that caught my attention was the treatment of subject. Although the definition on p. 349 ("The subject of a sentence or clause is the noun or pronoun that performs the action of the verb, or that exists in the state or condition expressed by the subjective complement.") is better than the one on p. 307 ("The subject noun is the doer or performer of the action."), neither of them sufficiently handles stative transitive verbs such as 'have', which denote no action and have no "subjective complement". This is similar to the issues raised above regarding the definition of nouns.

But a more serious problem is that the definitions completely ignore passive sentences. In such sentences, the subject is never, as far as I can discern, the "doer or performer of the action", regardless of how forgivingly you interpret action. (In fact, insofar as I can see, the book ignores passive sentences altogether; at least, there is no index entry for passive or voice and I haven't come across any mention in the text.) And while we're on the subject of subjects, he writes, "prepositions cannot ever be the subject of a clause". This denies the existence of sentences such as "After nine is good for me."

Shifting now to verbs, we have the same kind of defining problems again, but those aside, I was shocked to see that he has conflated intransitive verbs and "linking verbs". While there are good reasons for including linking verbs as a special case of intransitives, I can see no basis at all for calling all intransitives linking verbs. Yet, that is what he's done. I've yet to see another grammar that sanctions this grouping. His choice to ignore other valencies (ditransitives or complex transitives) can be dismissed as an issue of scope, but this choice is much harder to defend.

Then he completely loses it with this bit of analysis. Considering the sentence "He acted splendidly as Hamlet in Shakespeare's play", he writes, "acted is used as a transitive verb--there is a direct object of this activity: 'as [in the role of] Hamlet.'" Direct object is not among the functions he lists for prepositions (and rightfully so, I believe). Indeed, Merriam-Webster's online dictionary gives the following example for INtransitive act: "trees acting as a windbreak".

Seemingly by way of explanation, during this discussion of intransitive/linking verbs, he makes the assertion that "in some verbs, the traditional use of to be has been dropped in speech and informal prose," giving the example of seem (e.g., "She seems [to be] well.") As far as I can tell, seem has been used freely without to be as far back as Chaucer's Middle English and likely all the way back to Old Norse. For example, from The Canterbury Tales:

The vapour, which that fro the erthe glood,
Made the sonne to seme rody and brood;
But natheless, it was so fair a sighte
That it made alle hir hertes for to lighte,
What for the sesoun and the morwenynge,
And for the foweles that she herde synge;

There are other problems, but I've gone on too long already. By the way, this book was first published last year. What justified it, I have no idea.

Thursday, August 16, 2007

big size(d) question

Over on the ETJ list, Joshua Myerson wrote to ask about the differences between pairs such as short sleeve(d) and large size(d).

It's an interesting question and I'm not sure exactly how to approach it. Here are some other examples:

  • long leg(ged)
  • large size(d)
  • oval shape(d)
  • broad shoulder(ed)
  • pencil neck(ed)
  • open neck(ed)
  • small frame(d)
  • small size(d)
  • flat bottom(ed)
  • low ceiling(ed)
  • different colour(ed)
  • middle age(d)
  • double strand(ed)
  • white stripe(d)
  • high power(ed)
  • light colour(ed)
  • good size(d)
  • dark hair(ed)

Note that these are often interchangeable (e.g., low-ceiling(ed) house) with no difference in meaning. But consider the difference between a dark hair gene and a dark haired gene.

While it's commonly thought that only adjectives modify nouns, nouns can also modify nouns (e.g., faculty office). Thus, we can look at the group in which the second constituent is a noun as noun phrases (NPs) that modify other NPs. This isn't problematic.

The -ed group, though, is rather harder to deal with. The -ed word may be an adjective or a verb. Either way, it's being modified by an adjective, which is something I wasn't aware could happen. In something like long legged, we can use pronunciation to help us decide that leg' ed (two syllables) is an adjective where legged (one syllable) is a verb. This approach is rather limited, though.

Another thing to notice is that while some would be fine without the adjective (e.g., _ power(ed) tools), others make little sense at all without the adjective (e.g., *a _ bottom(ed) boat), though the -ed forms tend to work better here (e.g., a _ sized shirt vs. *a _ size shirt).

Hmmm...

Tuesday, August 14, 2007

The Elements of Typographic Style

Almost a year ago I posted about The Solid Form of Language by Robert Bringhurst. The Elements of Typographic Style is by far the most famous of his books, and I had long been interested in reading it. My wife is now taking courses in graphic design, which gave me a perfect excuse to buy and read the book. After all, what dedicated husband wouldn't want to find out more about his wife's fields of interest?

Bringhurst is a renowned Canadian poet, and it shows in his prose. Though the title portends little but fussiness, pedantry, detail and drudgery, the book is actually a delight to read. Here are a few samples (not necessarily the best):
  • on typography "Like oratory, music, dance, calligraphy - like anything that lends its grace to language - typography is an art that can be deliberately misused. It is a craft by which the meanings of a text (or its absence of meaning) can be clarified, honored and shared, or knowingly disguised.
    "In a world rife with unsolicited messages, typography must often draw attention to itself before it will be read. Yet in order to be read, it must relinquish the attention it has drawn."

  • on case "The union of uppercase and lowercase roman letters - in which the upper case has seniority but the lower case has the power - has held firm for twelve centuries. This constitutional monarchy of the alphabet is one of the most durable of European cultural institutions."

  • on notes "Relegating notes to the foot of the page or the end of the book is a mirror of Victorian social and domestic practice, in which the kitchen was kept out of sight and the servants were kept below stairs."

  • on proportions "The proportions of a page are like an interval in music. In a given context, some are consonant, others dissonant. Some are familiar; some are also inescapable, because of their presence in the structures of the natural as well as the man-made world."

Monday, August 13, 2007

Pilobolus

Over on Small Things Considered, there's a wonderful description of the fruiting body of the fungus pilobolus.

(Via The Loom.)

Friday, August 10, 2007

White noise

A few weeks ago, I doubted a NewScientist interview in which linguist Annie Mollard-Desfour makes the claim that to a Japanese person the brightness of a colour is more important than its hue and that the Japanese language has a large number of words for white, "from the dullest to the most brilliant". In the August 4 issue, Mollard-Desfour responds.

"The importance accorded in Japanese culture to matte-gloss and brightness distinctions is mentioned in numerous linguistics papers dealing with the cultural aspects of language and of naming - as are these features of Inuit language. True, some linguists currently propose that we need to distinguish terms for an abstract "true white", that does not refer to any particular instance of "whiteness", from those that refer to materials such as snow. Japanese certainly has terms for "white in general" and others linked to particular instances of whiteness: www.edicojaponais.com and www.dictionnaire-japonais.com.

It remains the case that the lexicon of colours is difficult to understand and to translate, because the parameters used may be fundamentally different. Hence the controversies: what words translate the French blanc - and are the whites of snow or other bearers of whiteness true colour terms?"

The dictionaries to which she links return the following results when you search for blanc.

  1. *白 white (noun)
  2. *白い white (adj)
  3. *ホワイト white (Japonification of the English word white, used in brands etc.)
  4. *ブランク white (Japonification of the French word blanc, used in brands etc.)
  5. 修正液 white out correcting fluid
  6. 卵の白身 the white of an egg (as distinct from the yolk)
  7. 白目 the white of an eye (as distinct from the iris & pupil)
  8. 笹身 white meat (of a chicken)
  9. 白樺 white birch
  10. 白馬 white horse
  11. 白黒 black and white
  12. 白血球 white blood cell
  13. 白鷺 white heron
  14. 白熊 polar bear
  15. 白紙 white (i.e., blank) paper
  16. 白米 white rice
  17. *真っ白 pure white (opposite of pitch black)
  18. 白髪 gray hair
  19. *灰白 gray (ash white)

The items with an asterisk are the only ones that are actually colour terms, the others merely denoting white things. If you then search for 白 and include only the colour terms, you can add the following to the list:

  1. 青白 pale; green (said of a person who is feeling queasy or shocked; literally blue white)

So there you are. If we stretch it, we can find 7 words for white. And they don't exactly describe a continuum from the dullest to the most brilliant.

Now we can start looking for other words that signify white. Parchment, for example is 灰味黄 (ashy yellow; literally ash flavour yellow), and pearl comes out as 真珠色 (pearl colour), ivory is 象牙色 (elephant tusk colour), etc. If you're really keen, you can have a look at a list of Japanese colour names by kana order here (in Japanese).

I put the issue to Language Logger Bill Poser. He writes.

"That is very curious. My reaction is the same as yours...

I wonder if Mollard-Desfour has got hold of a warped idea about the classical color terms, the ones like imayauiro "red" found in beautiful charts in the endpapers of classical dictionaries, used, as far as I know, basically for describing the colors of kimono in Genji Monogatari? Some of those distinctions might be described in terms of brilliance, but the system still doesn't have multiple types of white."

By the way, I can't find anything on Google Scholar about "the importance accorded in Japanese culture to matte-gloss and brightness distinctions" but maybe I'm not looking in the right place or perhaps it's all in French.

[I just got my print-edition of NewScientist, and despite the tag "From issue 2615 of New Scientist magazine, 04 August 2007, page 21" on the web, my letter and Mollard-Desfour's response are not printed]

Thursday, August 09, 2007

Defining words (or not)

Another new journal has cropped up: ELR Journal. ELR stands for empirical language research.

In the inaugural issue, Yasunori Nishima at the University of Birmingham (Google has apparently never heard of him) has a paper entitled "A Corpus-Driven Approach to Genre Analysis: The Reinvestigation of Academic, Newspaper and Literary Texts". The paper really seems more like that of a student learning to use the tools of the trade rather than a real contribution to the field. It's mostly a rehash of existing work with different corpora, none of it done in a way that is particularly novel, and none of it really challenging any existing results or even testing out anything questionable.

Much of the paper is built around word counts and frequencies. Surprisingly though, Nishima doesn't find it useful to define for us what he means by word. When he discusses the most frequent words, does he count run (e.g., I went for a run) and run (e.g., You run well) as two instances of one word or as two distinct words? What about if we add in runner, running, ran, runs, etc.? Who knows? Nishima says he got his frequency information from Adam Kilgarriff (though there is no proper citation). Presumably, he means he got them here, but did he use the lemmatised list or the unlemmatised one? He doesn't say. At least we can guess he is either counting unique word forms or lemmas.

But then he seems to conflate two senses of word when he compares his frequency data to arguments made by Paul Nation. As I have discussed before, Nation feels that we should consider only the most frequent 2000 word families to be high-frequency. Note, however, that Nation is explicitly talking about word families, while as far as I can tell Nishima isn't.
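To make the difference concrete, here is a minimal sketch of how the same tiny text yields different counts depending on whether "word" means word form, lemma, or word family. The mappings below are hand-made and hypothetical, purely for illustration; they are not Kilgarriff's lists or Nation's families, and the sketch ignores part of speech, so noun run and verb run are lumped together.

```python
from collections import Counter

# Hypothetical toy mappings, invented for this illustration only.
# A real study would use a proper lemmatiser and a published family list.
LEMMAS = {
    "run": "run", "runs": "run", "ran": "run", "running": "run",
    "runner": "runner", "runners": "runner",
}
FAMILIES = {"run": "run", "runner": "run"}  # lemma -> family headword


def count_units(tokens):
    """Count the same tokens three ways: by form, by lemma, by word family."""
    forms = Counter(tokens)
    # Tokens missing from the toy mapping are left as they are.
    lemma_of = [LEMMAS.get(t, t) for t in tokens]
    lemmas = Counter(lemma_of)
    families = Counter(FAMILIES.get(lemma, lemma) for lemma in lemma_of)
    return forms, lemmas, families


tokens = "i went for a run and you run well but she ran and the runner runs".split()
forms, lemmas, families = count_units(tokens)

print("form   'run':", forms["run"])
print("lemma  'run':", lemmas["run"])
print("family 'run':", families["run"])
```

The frequency of run changes with each definition, so a "top 2000 words" list built on one definition can't be compared directly with one built on another.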

Though Nishima is merely the most recent linguist to ignore this terminological conundrum, his is a rather flagrant and troubling oversight, largely because the paper doesn't even show an awareness of the issue. How could you be doing research with words and not give a second thought to what a word is?

Not an auspicious start for the journal.

Wednesday, August 08, 2007

Quaint, so quiet

I was reading The Mouse and the Motorcycle, by Beverly Cleary to my kid when I came across the following sentence.
“Matt, who had seen guests come and go for many years, knew there were two kinds—those who thought the hotel was a dreadful old barn of a place and those who thought it charming and quaint, so quiet and restful.”

I'm assuming here that so is not used in its sense as an intensifier. This threw me for a bit of a loop because I didn't think that so could be used to connect anything below the level of a clause. In fact, that was one of the properties that I thought distinguished coordinators from conjunctive adverbs, the topic with which this blog started over a year ago.

It strikes me as a little odd, but the more I look at it, the less objectionable it seems. What do you think?

Tuesday, August 07, 2007

10,000th visitor

According to sitemeter.com, somewhere in the wee hours of August 6, English, Jack had its 10,000th visitor. Yeah!

Word drive successful

The word drive at the Simple English Wiktionary that I announced a month ago has been successful. We reached our 2000-word target a few days before the Aug. 4 deadline. Then an editor deleted a number of spurious entries which dropped us down below the 2000 mark again. By August 4th, however, we were back above it.

This seems to have created some momentum which I hope will continue. I would encourage anyone else with any interest to contribute. Also, please point it out to ESL learners you may know.

Sunday, August 05, 2007

Word spurt or gradual acceleration

Bob McMurray was kind enough to respond to my post the other day. The relevant part of his mail is reproduced below.

I've seen Paul Bloom's book, as well as a number of book chapters containing similar arguments. I don't disagree with him at all. I think his points are two-fold. First, there's nothing sudden or stage-like about the vocabulary explosion--rather, it represents smooth, continuous acceleration. This was elegantly demonstrated empirically by Ganger & Brent (2004, Developmental Psychology). Second, the major acceleration may be occurring late. But the [smaller] gains made by children in their second year are particularly noticeable given that they are starting from nothing.

That said, he doesn't offer an explanation for why we see acceleration at all. Moreover, these two points are all perfectly consistent with my model. The model doesn't make strong predictions about when the acceleration occurs. In fact, if you examine its rate of acquisition after the first 50 words (analogous to an 18 m.o.) it's a lot lower than it is after 2000 words (maybe a 3 year old? I'm not sure). What it does show is that acceleration is a guaranteed result in any parallel-learning situation. I think any system in which growth is the integral of a Gaussian distribution of difficulty will actually show faster learning much later than late infancy. I'm certainly not arguing that the acceleration we see at 18 m.o. (and that is apparent in your own numbers) is the top speed of the learning-system.

I think what's important here is that the model offers an explanation for acceleration at all. It simply shows that two commonly held assumptions (parallelism and variation in difficulty), when implemented, can have surprising results. They may be all you need to account for acceleration.
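The point about parallelism plus variation in difficulty can be illustrated with a minimal sketch. This is not McMurray's actual model or code. It assumes only what the mail describes: every word accumulates exposure at the same rate, and word difficulty follows a Gaussian distribution, so the expected number of words known at time t is the Gaussian's cumulative distribution. The vocabulary size, mean and standard deviation are hypothetical values chosen for illustration.

```python
from statistics import NormalDist


def expected_words_known(t, n_words=10000, mean_difficulty=4000.0, sd_difficulty=1400.0):
    """Expected number of words learned by time step t.

    Assumption: all words are learned in parallel, each gaining one unit of
    exposure per step, and a word counts as learned once t reaches its
    difficulty. Difficulties are Gaussian, so this is n_words * CDF(t).
    All parameter values are hypothetical.
    """
    dist = NormalDist(mu=mean_difficulty, sigma=sd_difficulty)
    return n_words * dist.cdf(t)


def average_rates(checkpoints):
    """Average words learned per time step between successive checkpoints."""
    rates = []
    for start, end in zip(checkpoints, checkpoints[1:]):
        gained = expected_words_known(end) - expected_words_known(start)
        rates.append(gained / (end - start))
    return rates


checkpoints = [0, 1000, 2000, 3000, 4000]
rates = average_rates(checkpoints)
for (start, end), rate in zip(zip(checkpoints, checkpoints[1:]), rates):
    print(f"steps {start}-{end}: {rate:.3f} words per step")
```

Nothing in the learner changes between checkpoints, yet the rate keeps rising until the mean difficulty is reached, simply because more and more words sit at each successive difficulty level.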

Actually, Bloom does suggest a variety of possible explanations including neurological changes, accumulation of adequate phonological knowledge, increases in memory, increased understanding of kinds and individuals, emergence of theory of mind, increased use of syntax, and exposure to an increasing number of words as children begin to read. Perhaps I'll ask him for his thoughts on McMurray's paper.

Murray doesn't nod

I recently finished The Meaning of Everything: The Story of the Oxford English Dictionary by Simon Winchester and quite enjoyed it. The story and characters are wonderfully quirky and heroic. Winchester does go slightly hyperbolic in his praise, especially of the English language and the English people of the time. And he has the odd habit of using an interesting or unusual word or turn of phrase and then recycling it a few chapters later, but this doesn't detract much from the enjoyment.

What did annoy me somewhat is the unneeded sic on p. 200. There is a quote from a letter that James Murray wrote to Dr. William Chester Minor, one of the most significant volunteer contributors to the OED and a man with serious psychological problems which propelled him to murder.
"The supreme position ... is certainly held by Dr. W. C. Minor of Broadmoor, who during the past two years has sent in no less [sic] than 12,000 quots."

Winchester includes the footnote "Even Homer nods". The suggestion is that Murray should have used fewer rather than less. Obviously, Winchester didn't check the words in the OED. According to Merriam-Webster's Dictionary of English Usage,

"the OED shows that less has been used of countables since the time of King Alfred the Great -- he used it that way in one of his own translations from Latin -- more than a thousand years ago (in about 888). So essentially less has been used of countables in English for just about as long as there has been a written English language."

Saturday, August 04, 2007

Vocab spurt explained?

Bob McMurray has an article in Science that has been picked up in the popular press, "Defusing the childhood vocabulary explosion." He's also put up his own explanation here.

For years, psychologists have argued that since the speed of vocabulary learning increases dramatically at a certain age (somewhere around 18 months), it must mean that there is a fundamental change in the learning process, a shift in strategy perhaps.

I haven't read the Science article, but on his web page and on the various media accounts, this change in learning is said to be accounted for by differences in the input rather than differences in the processing. McMurray shows that word difficulty/frequency can account for the change in learning speed. Basically, there are only a few very easy words and once you get past them, there are more and more words at that level of difficulty/frequency. This is sort of the upside of what I described here.

It's certainly an interesting result. But there's one assumption that remains unquestioned. Do children really have a learning spurt at this age? Paul Bloom thinks not. Citing various studies in his book How Children Learn the Meanings of Words, he produces the following table on p. 44 (you can search inside on Amazon.com):
12 months to 16 months: 0.3 words per day
16 months to 23 months: 0.8 words per day
23 months to 30 months: 1.6 words per day
30 months to 6 years: 3.6 words per day
6 years to 8 years: 6.6 words per day
8 years to 10 years: 12.1 words per day

In other words, children are gradually increasing their word-learning rate at least until the age of 10. It seems likely that the early changes are quite visible to us because we can keep track of which words they know and we readily notice new words. As the stock of words grows, it becomes much harder to do this.
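As a rough check on the arithmetic, the rates in the table can be turned into running totals. This is a back-of-envelope sketch, not a figure from Bloom's book: it assumes an average month of 365.25 / 12 days and ignores any words known before 12 months.

```python
# Assumption: average month length, used only to convert months to days.
DAYS_PER_MONTH = 365.25 / 12

# (start month, end month, words per day), copied from the table above.
BLOOM_RATES = [
    (12, 16, 0.3),
    (16, 23, 0.8),
    (23, 30, 1.6),
    (30, 72, 3.6),    # 30 months to 6 years
    (72, 96, 6.6),    # 6 years to 8 years
    (96, 120, 12.1),  # 8 years to 10 years
]


def words_gained(start_month, end_month, words_per_day):
    """Words added over one age band at a constant daily rate."""
    return (end_month - start_month) * DAYS_PER_MONTH * words_per_day


def cumulative_vocabulary(rates):
    """Running total of words at the end of each age band."""
    total = 0.0
    running = []
    for start, end, per_day in rates:
        total += words_gained(start, end, per_day)
        running.append((end, total))
    return running


running_totals = cumulative_vocabulary(BLOOM_RATES)
for end_month, total in running_totals:
    print(f"by {end_month} months: about {total:.0f} words")
```

On these assumptions, the bands after 30 months contribute far more words than the early ones, even though the early gains are the ones parents notice.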

That's not to say that McMurray's results are wrong. Not having read the paper, I can't really say. But it is another reason to believe that there is no particular change in the learning process that happens early on.

Monday, July 23, 2007

Funding Funding Funding

, writing in The Toronto Star, points out the difference between the comparatively well-funded federal LINC (Language Instruction for Newcomers to Canada) program and provincially funded ESL programs in Ontario.

Eric Bakovic over at Language Log is also talking up the issue, but in the U.S.

ESL speakers too lazy to learn a third language

The Economist has a bit about the consequences of the dominance of English in Europe. One point I'd never considered before was that people are not learning other languages.
"The rush to learn English can sometimes hurt business by making it harder to find any staff who are willing to master less glamorous European languages.

English is all very well for globe-spanning deals, suggests Hugo Baetens Beardsmore, a Belgian academic and adviser on language policy to the European Commission. But across much of the continent, firms do the bulk of their business with their neighbours. Dutch firms need delivery drivers who can speak German to customers, and vice versa. Belgium itself is a country divided between people who speak Dutch (Flemish) and French. A local plumber needs both to find the cheapest suppliers, or to land jobs in nearby France and the Netherlands."

But what do you do to avoid this problem? Apparently, there is research "by the European Commission suggesting that this risk can be avoided if school pupils are taught English as a third tongue after something else." Given the spectacular failure of most high schools worldwide to teach a second language, I wonder at the practicality of this solution.

Sunday, July 22, 2007

My spammy blog

Yesterday, when I tried to post, I got the following message: "Blogger's spam-prevention robots have detected that your blog has characteristics of a spam blog."

Hmm....

Judy Sierra

Yesterday, my mother was reading Thelonius Monster's sky-high fly pie: a revolting rhyme to my kids while I washed the dishes. Something clicked as she read,
"THELONIUS urgently
e-mailed a spider.
He wanted advice from a savvy insider."

"Who wrote that?" I asked. Sure enough, it was Judy Sierra, winner of the 2005 E. B. White Read Aloud Award for Wild About Books. It's somewhat astounding how one simple line can be so characteristic that you immediately know who has written it.

Saturday, July 21, 2007

Ontario: more ESL regs; no teeth, no new funding

It appears that the province will require schools to improve orientation, testing, and reporting for ESL students and their parents. No new funding accompanies the announcement. Nor will there be any requirements that schools actually spend ESL funding on ESL. Previous discussion of the issue is here.

Friday, July 20, 2007

Learning the Language

I just discovered a language-learning-related blog attached to the Education Week website. It's called Learning the Language. The author, Mary Ann Zehr,
"is an assistant editor at Education Week. She has written about the schooling of English-language learners for more than seven years and understands through her own experience of studying Spanish that it takes a long time to learn another language well. Her blog will tackle difficult policy questions, explore learning innovations, and share stories about different cultural groups on her beat."

The blog has existed since February. It's fairly US-centric and is focussed mostly on policy issues though the posts do everything from introducing new materials to visiting individual classrooms.

Thursday, July 19, 2007

Even where there are ESL teachers...

Samuel Freedman reports in the New York Times on a frustrating situation in which the few ESL teachers are pulled away from teaching ESL by paperwork and other tasks. Many of these teachers

"were responsible for completing more than a dozen different forms, evaluations, assessments and reports that came variously from the levels of district, city, state and federal government, and grading standardized tests.

Teachers like Ms. Rabenau were also repeatedly conscripted within their schools to substitute for absent colleagues, to proctor exams in other classes and to chaperon field trips."

Wednesday, July 18, 2007

Idioms: interpreting the frequencies

I suppose this isn't really specific to idioms; it would apply to any vocabulary item.
As I wrote before, one response to my explanation about idioms was,
"None of the correspondents have suggested that they have any difficulty recognising or understanding 'hit the jackpot', yet the low level of occurrence of the expression in corpora suggests that it should be so unfamiliar as to cause difficulty even to native speakers."
I'm afraid this doesn't show up a problem with the corpora themselves, but it might go some way to explaining why language teachers seem to be so loath to use corpus data: they don't understand what it tells them.
It wouldn't be unusual for a native speaker of English to encounter language that occurs with the frequency of "hit the jackpot" a number of times per month. That's because native speakers of English tend to encounter millions of words each month. The recent Mehl paper in Science suggests that we speak on average something like 16,000 words per day. Presumably, we're doing much of that in conversation with others, often more than one person, so let's put our conversational word count at 40,000 per day spoken and heard.
Then there's TV. I don't have average numbers, but after looking at a few transcripts, it looks like 7,000 words per hour might be a reasonable estimate. According to Nielsen, the average American spends 4.5 hours per day watching TV, so we can add another 30,000 words or so to our count, which now totals 70,000.
I have no data on how much people write, but I suspect it's very little. In terms of reading, I can find no adult data, but 5th-grade children read about 5,300 words per day, bringing our total daily word exposure to roughly 75,300 or 2,290,000 words per month. There are likely other sources of input that I have omitted, but this should be sufficient to make the point.
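The running tally above can be sketched in a few lines of Python. This is only a rough illustration using the estimates quoted in the post; the function names are made up for the example, and the average month length of about 30.4 days is an assumption that happens to reproduce the 2,290,000 figure.

```python
# Rough tally of daily and monthly word exposure, using the post's estimates.
CONVERSATION_WORDS_PER_DAY = 40_000  # spoken and heard, as estimated in the post
TV_WORDS_PER_HOUR = 7_000            # rough estimate from a few transcripts
TV_HOURS_PER_DAY = 4.5               # average viewing figure cited in the post
READING_WORDS_PER_DAY = 5_300        # 5th-grade figure, used as a stand-in for adults
DAYS_PER_MONTH = 365.25 / 12         # assumption: average month length (about 30.4)


def daily_exposure(round_tv=True):
    """Words encountered per day; round_tv mirrors the post's 'another 30,000 or so'."""
    tv_words = TV_WORDS_PER_HOUR * TV_HOURS_PER_DAY  # 31,500 before rounding
    if round_tv:
        tv_words = 30_000
    return CONVERSATION_WORDS_PER_DAY + tv_words + READING_WORDS_PER_DAY


def monthly_exposure(words_per_day):
    """Scale a daily word count up to an average month."""
    return words_per_day * DAYS_PER_MONTH


if __name__ == "__main__":
    per_day = daily_exposure()
    print(f"{per_day:,.0f} words per day")
    print(f"{monthly_exposure(per_day):,.0f} words per month")
```

Passing round_tv=False keeps the unrounded TV figure, which nudges the daily total up to 76,800; either way the monthly total lands a little over two million words.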
At the previously established rate of 0.18 to 2.0 occurrences per million words (pmw), we could expect to see "hit the jackpot" about one to four times a month. If you're about my age, you've probably heard it about 900 times in your life. So, contrary to the above writer's conclusion, it's not at all surprising that we know it. But would you be surprised to hear that my six-year-old son doesn't? (I just asked him [update: May 25, 2009. He's almost eight and he still says he doesn't know. update 2: Oct 9, 2011: 10 and still unfamiliar.]).
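For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation. The function name is invented for the example, and the learner input figure at the end is purely hypothetical, chosen only to show how quickly the numbers shrink.

```python
def expected_encounters(rate_per_million, words_of_input):
    """Expected number of times an item turns up in a given amount of input."""
    return rate_per_million * words_of_input / 1_000_000


MONTHLY_WORDS_NATIVE = 2_290_000  # monthly exposure estimated in the post

# Corpus frequency range quoted in the post for "hit the jackpot"
low = expected_encounters(0.18, MONTHLY_WORDS_NATIVE)
high = expected_encounters(2.0, MONTHLY_WORDS_NATIVE)

# Hypothetical learner input, for illustration only (not a figure from the post)
MONTHLY_WORDS_LEARNER = 100_000
learner_high = expected_encounters(2.0, MONTHLY_WORDS_LEARNER)

if __name__ == "__main__":
    print(f"native speaker: {low:.2f} to {high:.2f} encounters per month")
    print(f"learner (hypothetical input): at most {learner_high:.2f} per month")
```

Worked through strictly, the lower bound comes out a bit under once a month and the upper bound a bit over four, so "one to four" is a loose rounding. With the hypothetical learner input, even the higher corpus rate yields only a fraction of one encounter per month.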
In contrast to native speakers, our learners don't get anything like 2.3 million words of input a month. And what input they do get is degraded by the fact that they don't understand much of it. Thus, what seems very common to us is quite rare for learners. Somehow, though, it's hard to get many language teachers to accept this. They refuse to believe that idioms are not common, but as we saw recently, anything below about 30 occurrences pmw should be considered low frequency.
There are many factors that can skew our perception of a word's commonality. Psychologists have taken this issue much more seriously than have language teachers/applied linguists and have evolved a number of measures. These include:
  • number of letters/phonemes/syllables
  • written/spoken frequency
  • range/keyness/burstiness
  • subjective familiarity rating
  • concreteness rating
  • imageability rating
  • meaningfulness
  • average age of acquisition
  • word category (noun, verb, adj, etc.)
  • affixation
  • status (colloquial/dialect/alien etc)
  • semantic grouping
It would obviously be too onerous to consider all of these constantly in our teaching, but it might not be a bad thing to know about and understand each measure.
Earlier posts in this series: Idioms, Differences between the corpora, & Where's the cutoff