Saturday, August 19, 2006

Frequency and collocations

I'd love to say that my post about word frequency generated a great deal of discussion, but with only about 9 people viewing the blog each day so far--that's OK, it's still in its infancy--I'm happy when I do get the rare comment. At any rate, last summer, there was quite a bit of debate on the TESL-L mailing list about collocations and frequency.

According to Wikipedia, "collocation is defined as a sequence of words or terms which co-occur more often than would be expected by chance." For example, heavy smoker is a collocation (notice that we could say *strong smoker or *great smoker, but we generally don't).

Back in 1993, Michael Lewis published The Lexical Approach, a book that has been rather influential in TESL circles, in which he pushes a view that has teachers designing and using activities to bring students' attention to specific collocations. It has students recording, working with, and studying these collocations. In the TESL-L discussion, I argued that this view was both unworkable and unprofitable, mainly because of the low frequency of collocations.

While language learners can build knowledge of collocations through extensive reading and listening, this is not something that we can do effectively by design in the classroom.

A few years ago, I looked at vocabulary in a reading textbook series that our program uses, Interactions. I ignored the most common 1,000 word families (from the GSL) because most of our students know these when they enter the program, but looked at the second thousand most common words and the 570 word families of Averil Coxhead's Academic Word List (AWL) (using Tom Cobb's Compleat Lexical Tutor site). I found that, in Interactions 1 Reading and Interactions 2 Reading combined, 60% of these word families appeared fewer than four times, with singletons being the largest group. Only 24% of word families were repeated more than 7 times. Given this low repetition for individual word families in textbooks, it's clear that there are VERY few collocations that will turn up more than once. And that's over two successive 16-week courses. If they don't recur, students are very unlikely to pick them up.
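For anyone who wants to run a similar count on their own materials, here is a minimal sketch of the tally described above. It is not the Compleat Lexical Tutor's method; the function name, the tiny word-family mapping, and the sample tokens are all hypothetical, and it assumes that only families which actually appear in the text are counted.

```python
from collections import Counter


def repetition_profile(tokens, family_of):
    """Share of word families seen fewer than 4 times and more than 7 times.

    `family_of` maps each word form to its family headword (hypothetical data;
    a real run would load the GSL second thousand and the AWL family lists).
    Families that never occur in `tokens` are not counted at all.
    """
    counts = Counter(family_of[t] for t in tokens if t in family_of)
    total = len(counts)
    if total == 0:
        return {"families": 0, "under_4": 0.0, "over_7": 0.0}
    under_4 = sum(1 for c in counts.values() if c < 4)
    over_7 = sum(1 for c in counts.values() if c > 7)
    return {"families": total, "under_4": under_4 / total, "over_7": over_7 / total}


# Hypothetical mapping and text, for illustration only.
family_of = {
    "compromise": "compromise",
    "compromised": "compromise",
    "analysis": "analyse",
    "vary": "vary",
}
tokens = ["compromise"] * 5 + ["compromised"] * 3 + ["analysis"] * 2 + ["vary"]
profile = repetition_profile(tokens, family_of)
print(profile)
```

In this toy text, one family out of three is repeated more than 7 times and the other two fall under four occurrences, which is the same kind of breakdown as the 60% and 24% figures above.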

So what if you deal with them out of context? The problem is that there are simply too many. If you teach strong wind, as one poster to TESL-L suggested, then shouldn't you also teach wind's more common collocates: rain, blow, cold, speed, gone, and through (according to Collins COBUILD Corpus Concordance Sampler)? That's seven. So, if we're looking at the top 2,000 words in English, and we estimate that there are an average of 3 collocates each that are as strong as strong wind, that's 6,000 collocates (minus whatever mutual collocates there are). There's simply no way you could spend class time on more than a fraction.

Even if you did focus entirely on collocation, the payback would be minimal. If we can take the British National Corpus (BNC) as being representative of English as a whole, then the strong wind collocation occurs a mere 3.06 times per million words (strong within 4 words either side of wind(s)). In contrast, a "difficult" word like compromise (which is not even in the top 2,000 words of English) occurs singly 10 times more often. So, is it more worthwhile to enrich students' understanding of wind by looking at collocates, or to have students study a basic meaning for compromise? Wouldn't "big wind" or "heavy wind" get them by just fine?
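The "within 4 words either side" count and the per-million figure can be sketched as follows. This is only an illustration under stated assumptions, not the BNC interface's actual procedure: the function names, the toy sentence, and the simple tokenizer are hypothetical, and the corpus size of 100 million words is the commonly quoted approximate size of the BNC, so 306 hits is just the count that would give 3.06 per million at that size.

```python
import re


def tokenize(text):
    """Very rough tokenizer: lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def window_cooccurrences(tokens, node_forms, collocate, span=4):
    """Count node tokens that have `collocate` within `span` words either side."""
    hits = 0
    for i, tok in enumerate(tokens):
        if tok in node_forms:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            if collocate in window:
                hits += 1
    return hits


def per_million(count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return count * 1_000_000 / corpus_size


# Toy text, for illustration only.
sample = ("A strong wind blew all night. The wind was cold. "
          "Winds from the north are rarely so strong.")
hits = window_cooccurrences(tokenize(sample), {"wind", "winds"}, "strong")
print(hits)

# Assumed corpus size of about 100 million words.
rate = per_million(306, 100_000_000)
print(rate)
```

Only the first sentence counts as a hit here, because the final strong is more than four words away from Winds.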

Here are a few things for teachers to keep in mind when they consider teaching collocations:
  • many collocations are obvious and require no teaching (e.g., look out the window; read/write/publish a book)
  • many collocations are too specialised to bother teaching to most students (e.g., insolvency act)
  • most collocations are too infrequent to bother teaching (e.g., rancid butter; a glimmer of hope)
  • individual words are often far more frequent than even a strong collocation, so keep things in perspective
For the times, however, when you do want to know about specific collocations (and there are good reasons for teachers to pay attention to these, even if they don't "teach" them in class), the most user-friendly way to find collocations that I know of is Just The Word.

6 comments:

Fábio said...

I beg to differ, but I believe you have misunderstood the power of collocational awareness-raising and other more interesting features of the Lexical Approach. In no sense should collocations be the ONLY FOCUS of classroom practice; teaching them sure is more profitable than teaching loose word definitions, though. Think of simple pairs such as WORK/JOB, TRAVEL/TRIP, DAMAGE/INJURE. From which do you believe students would profit most: word definition or collocational power?

Besides, if we are to think of frequency, shouldn't we then focus on frequent collocations which might have more evident communicational power, such as the ones listed above? I certainly agree that spending class time (which is supposedly a means to learning faster) on 6,000 collocations is counter-productive, but in no way would I advocate against their teaching. It is the teacher's role (and the book publisher's too) to keep the focus on relevant collocations at different levels, thus providing adequate and learning-enabling materials to different levels of students.

My email address is TEACHERFABIO@GMAIL.COM if you feel like keeping on with the discussion. I'd love to.

Nice blog :P

Brett said...

Differ away, Fábio, but I don't think I've misunderstood this. In fact, the examples you give simply support my argument. Take work/job. First of all, it falls under the category of obvious collocations. If a student knows a basic meaning for each of the two words, they're likely to be able to put them together without a teacher bringing their attention to it.

Secondly, it's not even a strong collocation. A mutual information score for two collocates of about 3.0 or above shows a "semantic bonding" between the two words. The score for work/job is only 0.97 in the Corpus of Contemporary American English. In fact, there are 168 collocations for job which occur over 100 times and have MI scores higher than work/job. If you think this pair is worth teaching, do you also think the other 168 are worth teaching?
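For readers unfamiliar with mutual information, here is a minimal sketch of the basic calculation: the base-2 log of how often the pair actually occurs divided by how often it would occur by chance. This is the textbook form only; corpus interfaces may adjust it (for example, for the size of the search window), so it should not be read as the exact formula behind the 0.97 figure. The function name and all counts below are hypothetical.

```python
import math


def mutual_information(pair_count, node_count, collocate_count, corpus_size):
    """Basic MI: log2(observed pair count / pair count expected by chance)."""
    expected = node_count * collocate_count / corpus_size
    return math.log2(pair_count / expected)


# Hypothetical counts in a one-million-word corpus, for illustration only.
strong_pair = mutual_information(
    pair_count=4, node_count=1000, collocate_count=500, corpus_size=1_000_000
)
weak_pair = mutual_information(
    pair_count=1, node_count=1000, collocate_count=500, corpus_size=1_000_000
)
print(strong_pair, weak_pair)
```

With these made-up counts, chance alone predicts half an occurrence of the pair, so four real occurrences give a score of 3.0 (eight times chance) and a single occurrence gives 1.0 (twice chance).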

Gabrielle Lambrick said...

I think Fabio's point is important and you've slightly misunderstood it. Students often use "work" and "job" in the wrong situations and it is quite tricky to explain to them what the difference is. I think Fabio's saying (and I agree) that it's more productive to teach collocations of work and collocations of job than to try to explain the difference in meaning. Some words will collocate with both work and job (e.g. you can find work or find a job) but others don't (e.g. you can lose your job but you can't lose work). Travel and trip are even better - looking at the verbs which can take each one as an object, and the adjectives which go with each one, gives a clear picture of the differences. Much clearer than my garbled explanations in the classroom, I'm sure!

Brett said...

Hi, Gabrielle

I see what you mean about me misunderstanding Fábio's point. But even in this sense, I don't think teaching the collocations of job and work is a good use of time. Here are the main collocations of these two words (as nouns):

work: social, hard, done, force, ethic, volunteer, dirty, detective
job: done, doing, satisfaction, training, quit, full-time, creation, lose

I think students would have to be incredibly intuitive to draw meaningful conclusions from these.

Lindsay said...

I think you're absolutely right that focusing on collocations for the sake of focusing on collocations isn't worth the time it would take, but I think it's important to point out and introduce collocations in the same way you would point out and introduce any new vocabulary in context. I think where teachers could benefit most from this would be to stop treating vocabulary as one word, and start recognizing the collocations as part of vocabulary as well. We do this sometimes, but generally when you look at lists of new vocabulary, you usually see one individual word instead of two or three. If a student is reading and finds "waving frantically" and wants to know what it means, explain them together and the student will remember them together. And when he sees someone waving frantically he'll think waving frantically instead of waving hard or something else that we would never say.

Brett Reynolds said...

Well said, Lindsay.