Tuesday, May 10, 2011

The prevalence of first-person pronouns in books

There has been a good deal of discussion recently about C. Nathan DeWall et al., "Tuning in to psychological change: Linguistic markers of psychological traits and emotions over time in popular U.S. song lyrics", Psychology of Aesthetics, Creativity, and the Arts, 3/21/2011. Mark Liberman (here) and Cosma Shalizi (here) have done a good job of explaining what's wrong with the paper and with the media's uptake.

I wondered what we would find if, instead of song lyrics, we looked at books, in particular the Google Books corpus. (By the way, Mark Davies appears to have received access to it, tagged it, and formatted it for his usual interface. Look for an announcement on the BYU corpus site tomorrow.)

Given Liberman's and Shalizi's clear explanations of the problems with interpreting results like these, I won't even hazard an attempt. I will point out, though, that I recall reading somewhere that the Google Books data after 2000 are less reliable (which is why the default cutoff date is 2000). Unfortunately, I can't find the source now. Nevertheless, the sudden upswing for I, my, and me is the most dramatic feature of this graph, along with their long, gradual decline from 1800 until the 1980s.

For comparison, here are the other independent genitive pronouns. (I've omitted his and its because the dependent and independent genitives share the same shape, and her because it shares its shape with the accusative. Also, note that your is both singular and plural.)
And here are the nominative pronouns (again, excluding those that share a shape with others).

This suggests that any increase in first-person pronouns may be partly an artifact of English books returning to more pronominal language overall, rather than of their becoming specifically more first-personal.
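One way to check this would be to look at first-person pronouns as a share of all pronoun tokens rather than as raw frequencies. Here is a minimal sketch in Python. The function name, the pronoun set, and above all the numbers are my own invention for illustration; they are not taken from the Google Books data.

```python
# First-person forms graphed above (an assumption about which forms to count).
FIRST_PERSON = {"I", "me", "my"}


def first_person_share(freqs):
    """Return the fraction of all pronoun tokens that are first-person.

    `freqs` maps each pronoun to its frequency (e.g. per million words).
    """
    total = sum(freqs.values())
    if total == 0:
        raise ValueError("no pronoun tokens to compare against")
    first = sum(count for word, count in freqs.items() if word in FIRST_PERSON)
    return first / total


# HYPOTHETICAL per-million frequencies, made up purely to show the idea.
toy = {
    1980: {"I": 2000, "me": 500, "my": 500,
           "you": 1000, "he": 3000, "she": 1000, "they": 2000},
    2008: {"I": 3000, "me": 750, "my": 750,
           "you": 1500, "he": 4500, "she": 1500, "they": 3000},
}

for year, freqs in sorted(toy.items()):
    raw = sum(freqs[w] for w in FIRST_PERSON)
    print(year, raw, round(first_person_share(freqs), 3))
```

In this toy case the raw first-person count rises by half between the two years, but the share stays the same, because every pronoun rose by the same factor. If the real data looked like that, the upswing would reflect more pronouns in general, not more first-person ones in particular.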
