Thursday, November 06, 2014

Useful examples for language learners

The odd choices of example sentences that sometimes show up in phrase books and in these "teach yourself to speak..." type books have been rightly mocked in the past. In fact, the subtext of this blog's title references just such a phrase book.

Monday, October 06, 2014

On meeting 'otiose' twice again

I asked Mark Liberman to have a look at what I wrote yesterday since I was struggling to get my head around the probabilities. He was kind enough to write the following guest post:

Maybe a better way of thinking about it is this:

Say the probability that word w_i will be selected at random from a collection of text is P(w_i). Then assuming independence, the probability that the next word will NOT be w_i is (1-P(w_i)), and the probability of failing to find w_i in N successive draws is

(1-P(w_i))^N

If P(w_i) is 1/10^7 (one in ten million), and N is 1000, then we get

(1-(1/10^7))^1000

which is 0.9999. So if we take notice of a rare-ish (P = 1/10^7) word, and draw 1,000 other words at random looking to see it again, then 9,999 times out of 10,000, we'll fail to find the moderately rare word we were waiting for. And if we draw 10,000 additional words instead of 1,000, the probability of failure is still

(1-(1/10^7))^10000 = 0.999

so we're still gonna fail 999 times out of a thousand.

But the thing is, Rare Words Are Common. That is, a large proportion of word tokens belong to relatively rare types. So suppose that there are 10,000 other words of approximately equal rareness, and every time we see one of them, we set a subconscious process to watch for recurrences of that word within the next thousand instances.

If we do this a thousand times, then the chances of failure (for a thousand instances of noting a rare word and looking for it to occur again) become 

((1-(1/10^7))^1000)^1000 = about 0.9
((1-(1/10^7))^10000)^1000 = about 0.368

So if you do enough reading for these conditions to be satisfied once a day, you should expect to have this experience several times a week.

Now, none of this reasoning really applies, because you aren't picking words at random from a well-mixed urn, you're reading them in order in coherent text. And words in coherent text are far from independent Bernoulli trials -- when a rare word appears, the probability that it will appear again before long in the same text is massively increased by topic effects (and to a lesser extent style and priming effects).  But this just means that the experience should be more common rather than less common -- unless you insist that the texts be separate and on different topics, and so forth, in which case it gets complicated.

But still, I think that the real puzzle is not why you had this apparently odd experience, but why we only occasionally notice the kinds of coincidences that are in fact rather common.

This is not an unimportant question, since it has a lot to do with the genesis of superstition (and probably science, for that matter...)

The above is a guest post by Mark Liberman.
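
The figures in the guest post can be checked with a few lines of Python. This is only a sketch under the post's own simplifying assumption of independent draws; the function names `p_no_recurrence` and `p_all_fail` are made up for illustration and do not come from the post.

```python
def p_no_recurrence(p: float, n: int) -> float:
    """Chance that a word with per-draw probability p is absent from n independent draws."""
    return (1.0 - p) ** n


def p_all_fail(p: float, n: int, trials: int) -> float:
    """Chance that every one of `trials` independent watch windows of n draws comes up empty."""
    return p_no_recurrence(p, n) ** trials


P = 1e-7        # one in ten million, as in the post
TRIALS = 1_000  # number of rare words noticed and watched for

for window in (1_000, 10_000):
    single = p_no_recurrence(P, window)
    combined = p_all_fail(P, window, TRIALS)
    print(f"window={window}: single failure={single:.4f}, "
          f"all {TRIALS} fail={combined:.3f}, at least one repeat={1 - combined:.3f}")
```

With a 1,000-word window the chance of at least one repeat comes out at roughly 0.095, and with a 10,000-word window at roughly 0.632, which is where "several times a week" comes from if the conditions are met once a day.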

Sunday, October 05, 2014

On meeting 'otiose' twice in a day

Well, not in the same day, but certainly within a 24-hour period. As I was lying in bed last night, reading Charles Mann's 1493, I came across the phrase "the otiose Percy" on p. 78.

As of this morning, I've read to p. 90, so that's about 4,500 words later. I also read a few NY Times articles, adding perhaps another 1,200 words. And then I set about to edit an article for Contact, the TESL Ontario magazine for which I'm the editor. Almost immediately, I came across a quote from David Crystal in which he wonders,
whether the presence of a global language will eliminate the demand for world translation services, or whether the economics of automatic translation will so undercut the cost of global language learning that the latter will become otiose.

Friday, September 19, 2014

Climbing the grammar tree

I've started a new blog called "Climbing the grammar tree". The idea is that I will respond to readings I'm doing for my doctoral studies, so check it out.

Tuesday, September 02, 2014

A title misparsed

This morning, I was reading this article at New Statesman, when I came across the following:
Yet surely, when night after night atrocities are served up to us as entertainment, it's worth some anxiety. We become clockwork oranges if we accept all this pop culture without asking what's in it.
The plural clockwork oranges suddenly threw into sharp relief the title of Burgess's book A clockwork orange. For some reason that I am unable to articulate now, if I ever was aware of it, I had always parsed that title like this:
That is to say, I took orange to be a postpositive modifier of clockwork (like proof positive, governor general, the city proper, etc.) instead of clockwork as an attributive modifier of orange, like this:

This was, I must admit, an odd and, even to me, puzzling title, but then it's an odd and puzzling book, so I just rolled with it. As I say, it was the plural oranges that made me see the light: adjectives don't do plurals.

I somehow overlooked the frequency of clockwork as a modifier, which should have tipped me off: in COCA, almost 40% of all instances of clockwork are attributive modifiers. Another thing that I was aware of, but which just seemed like more of the weirdness, is that clockwork is rarely--but sometimes--countable, so a clockwork is kinda weird, but not totally beyond the pale.

Perhaps one thing that pushed me to the first analysis was the stress pattern. Usually, in an NP with a noun as modifier, the modifier gets the main stress. It's a
  • FAculty office, not faculty OFfice
  • SOCcer ball, not soccer BALL, and  
  • poLICE officers, not police OFficers. 
My impression is that people tend to say a clockwork ORANGE, rather than a CLOCKwork orange. This is the same pattern you get with postpositive modifiers like proof POsitive.

Whatever the reason, what really impressed me is how decades of misapprehension can be overcome by a single choice example.

Tuesday, August 19, 2014

Antedating "determinative"

The OED gives:

b. Gram. determinative adjective, determinative pronoun, etc. (see quots.); determinative compound = tatpurusha n.

1921   E. Sapir Lang. vi. 135   The words of the typical suffixing languages (Turkish, Eskimo, Nootka) are ‘determinative’ formations, each added element determining the form of the whole anew.
1924   H. E. Palmer Gram. Spoken Eng. ii. 24   To group with the pronouns all determinative adjectives..shortening the term to determinatives.
1933   L. Bloomfield Language xiv. 235   One can..distinguish..determinative (attributive or subordinative) compounds (Sanskrit tatpurusha).
1961   R. B. Long Sentence & its Parts 486   The, a, and every are exceptional among the determinative pronouns in requiring stated heads.
Today, I was reading Kellner's Historical outlines of English syntax from 1892 and came across the following on pp. 113–114 (emphasis added):

In Old English the possessive pronoun, or, as the French say, "pronominal adjective," expresses only the conception of belonging and possession ; it is a real adjective, and does not convey, as at present, the idea of determination. If, therefore, Old English authors want to make such nouns determinative, they add the definite article : 
"hæleð min se leofa" (my dear warrior). —Elene, 511.
"ðu eart dohtor min seo dyreste" (thou art my dearest daughter). —Juliana, 193.
§179. In Middle English the possessive pronoun apparently has a determinative meaning (as in Modern English, Modern German, and Modern French); therefore its connection with the definite article is made superfluous, while the indefinite article is quite impossible. Hence arises a certain embarrassment with regard to one case which the language cannot do without.
Suppose we want to say "she is in a castle belonging to her," where it is of no importance whatever, either to the speaker or hearer, to know whether "she" has got more than one castle how could the English of the Middle period put it? The French of the same age said still "un sien castel," but that was no longer possible in English.

§180. We should expect the genitive of the personal pronoun ("of me," &c., as in Modern German)—and there may have been a time when this use prevailed—but, so far as I know, the language decided in favour of the more complicated construction "of mine, of thine," &c.

This was, in all probability, brought about by the analogy of the very numerous cases in which the indeterminative noun connected with mine, &c., had a really partitive sense (cf. the examples below), and, further, by the remembrance of the old construction with the possessive pronoun.
And later:

Later on, the possessive pronoun apparently implies a determinative meaning (as in Modern German and Modern French) ; therefore its connection with the definite article is made superfluous, while the indefinite article is quite impossible. Instead of the old construction we find henceforth what may be termed the genitive pseudo-partitive. See above, 178–180.