Tuesday, June 21, 2011

In spite of a lot of evidence

Some grammar books will tell you that a lot of is a determiner (e.g., The Grammar Book, p. 330). The problem is that these books haven't decided what they mean when they say something is a determiner. They haven't made a clear distinction between the category (noun, adjective, preposition, etc.), which I'll call determinative, and the function (subject, object, modifier, etc.), which I'll call specifier, and they tend to shift back and forth between the two unwittingly.

Beyond that, though, there are many reasons to doubt that a lot of is a constituent of any sort. There are good reasons to think it's just three words that happen together a lot... um, I mean frequently. Syntactically, it differs from other combinations like a collection of or a treeful of chiefly in agreement: in clauses like a collection of tables was surrounded by African twig chairs, the verb typically agrees with the leftmost noun, in this case singular collection, while in a lot of things have changed since you ran out on me, Ginger, the agreement is with the noun following of.

But in other respects, it's very much like any article + noun + of combination:
  • the noun can be singular or plural (i.e., a lot of / lots of...)
  • the noun can be modified (e.g., a whole lot of...)
  • the string is ungrammatical without the final noun (e.g., a lot of people were mad, and *a lot of went home.)
  • the of can be fronted (e.g., it's known as `acid flashback,' of which a lot has been documented.)
Nevertheless, the three words do occur as a group very commonly. In fact, in the Google Books corpus, about two-thirds of the instances of a lot are followed by of, and roughly the same proportion holds for the plural variant.
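For anyone who wants to check a proportion like this on their own text, here is a minimal sketch in Python. It is not how the Google Books figures were obtained; the function name share_followed_by and the toy sentence are my own assumptions for illustration, and a real count would need proper tokenization and a much larger corpus.

```python
def share_followed_by(tokens, phrase, follower):
    """Fraction of occurrences of `phrase` (a list of tokens) that are
    immediately followed by the token `follower`.

    Hypothetical helper for illustration; returns 0.0 if `phrase`
    never occurs in `tokens`.
    """
    n = len(phrase)
    total = 0      # every occurrence of the phrase
    followed = 0   # occurrences with `follower` as the very next token
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            total += 1
            if i + n < len(tokens) and tokens[i + n] == follower:
                followed += 1
    return followed / total if total else 0.0


# Toy sample (invented): three instances of "a lot", two followed by "of".
sample = "a lot of people left and a lot stayed but a lot of them came back".split()
print(share_followed_by(sample, ["a", "lot"], "of"))
```

The same function works for the plural variant by passing ["lots"] as the phrase.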

In a recent issue of Language Learning dedicated to complex adaptive systems, Joan Bybee and Clay Beckner argue that in spite of has become a complex preposition, not a preposition-noun-preposition string. Essentially, you've got similar arguments going there, except that this time it's fronting that doesn't work (you can't say *the problems of which in spite...) and the whole string makes up about 99% of all instances of in spite instead of just 2/3 as above.

Bybee and Beckner conclude that chunks become words over time, that these relative frequencies are important pieces of information showing how wordlike a chunk has become, and that semantic meaning is similarly important. They argue that the CGEL rejects these last two ideas explicitly, and I understand them to mean that this implies a rejection, or at least a conflicted acceptance, of the first. Moreover, they argue that "If we look at the full range of usage data, it is in fact unquestionable that in spite of has a mostly fixed status, and this fixedness must be acknowledged by a complete theory of constituency."

I think that last part is fair. Yet certainly the CGEL acknowledges some kind of gradient. For example, chapter 7, section 6.1 is "Meanings of prototypical prepositions". Similarly, on p. 1289, we find: "as so often, however, we find that while the central or prototypical cases of coordination and subordination are sharply distinct, there is no clear boundary between the peripheries of the constructions and therefore some uncertainty concerning the precise membership of the category of coordinators."

So it seems that both parties see the need to acknowledge some levels of intermediacy, even if they disagree on what the desiderata should be and where the line should be drawn. It seems to me that if you can take something apart and put it back together again, then it must have some internal structure, regardless of how rarely you actually do so. Nor do I see any particular advantage in reanalyzing in spite of as a complex preposition. It also has to be mentioned that Bybee and Beckner give us their take on a single case, but no guidance on other cases. And finally, it's much more likely that a Prep + N + Prep would have a similar distribution to uncontroversial prepositions such as in than it is that, say, a + N + of would have a distribution that is overall similar to other determinatives. They've chosen their test case carefully.

Elsewhere, Bybee argues that "the fact that semantically, on top of functions as the opposite of under (a simpler preposition), or that in spite of is paraphrasable by despite are indicators that these originally complex expressions have taken on a unitary status" (Language, Usage and Cognition, p. 143). But following this argument, we could lump all night, for two hours, and continuously into the same category (adverb?). The same goes for on time and late. But now we're right back at our category-function confusion.

