Church architecture in Hertfordshire and elsewhere, art, books, and whatever crosses my path

Monday 10 February 2020

According to the OED, which book has introduced most words to the language?

Here’s an enjoyably nerdy question for you. Which text is credited by the Oxford English Dictionary with introducing the most new words to the language?

The OED attempts to record the first usage of every word, so we can count them and declare a particular book to be a winner. We’ve got to be strict about the rules, though; I’m talking about a single text by a single author, not someone’s complete works, so that means that we can’t say, for example, ‘Shakespeare’ or ‘Dickens’; it has to be narrowed down to, for example, King Lear or Bleak House. Big encyclopedias and newspapers also don’t count as they’re written by numerous authors.

Before I came across the correct answer, my first guess (and, I imagine, many people's) would have been Hamlet, as it’s well known that Shakespeare frequently used words that hadn’t been written down before (which doesn’t necessarily mean that he invented all of them, of course, though he probably did invent quite a lot of them) and this is his longest play. But I’d have been wrong.* Next I’d have assumed (correctly, as it turns out) that we have to go back in time to find the answer, and plumped for The Canterbury Tales. Also wrong.

I’ll keep you in suspense no longer (assuming you’ve read this far), as I’m pretty sure you’d never guess in a million years; I certainly wouldn’t have.

The answer is a long poem written in about 1300 by an unknown author somewhere in the north of England (perhaps Northumbria), called Cursor Mundi (which means The Runner of the World). I’d never heard of it before, even though I take an amateur interest in medieval literature. It is apparently a theological history of the world from the Creation to the Last Judgement.

It was very popular in its day (ten manuscript versions of it survive, more than for many medieval texts), but I don’t think it’s going to be featuring on any bestseller lists nowadays. However, it has a sort of pub quiz significance as it happens to contain more new words than any other text. This is due to four main factors. Firstly, it’s very long, about 30,000 lines (for comparison, Hamlet is about 4,000). Secondly, it was written at just the right time, when Middle English was really establishing itself and expanding its vocabulary, including many words that we today think of as very ordinary. Thirdly, it’s survived when many earlier (and some later) texts haven’t. Before the invention of printing in the 15th century, which texts lasted and which just got randomly lost was always a lottery. By the luck of the draw Cursor Mundi has weathered all that time and fate could throw at it. Fourthly, it’s written in a northern dialect, so it contains many words that would be unlikely to appear in works from elsewhere in the country.

There are 11,521 citations from the poem in the OED. 968 of these are first recorded usages, and 3,013 of them use a word in a sense in which it hadn’t been recorded as being used before. Here are a few now everyday words that the anonymous poet was the first, as far as we know, to write down: aha, anyway, anywhere, backward, baptize, bared, barge, belted, beta, Bible, blend, blister, bodily, break, briefly, brimstone, burning . . . and that only takes us to the b’s. And here are a few words that haven’t survived into modern English that maybe deserve a revival: anywhat (a thing of any kind), blin (end), bobet (punch), brathly (impetuously, violently), cratch (itchy skin disease), cunningship (knowledge), doomster (a judge), eachwhere (everywhere), ferdy (fearful), fire-slaught (lightning), gleg (quick in perception), hindwin (anus), ithand (diligent, busy), mike (friend), misdeedy (sinful), ontinkel (similar), roid (uncivilised), runkle (wrinkle), samenward (together), saughtliness (reconciliation), snobberly (snubbingly), solwiness (pollution), skail (a scattering), skander (to slander), toomsome (leisurely), trattle (chatter), treget (trickery), unbuxomness (disobedience), unglee (sadness), unlaughter-mild (serious, sober), warsle (wrestle), wrenk (turn aside, divert) and zizany (a weed growing in corn).

Here is the chart from the online OED (it's the fourth column we're looking at, headed 'First evidence for word'):

You’ll notice that Cursor Mundi comes in at number twelve, but that numbers one to eleven are all complete works or the work of numerous authors, and so by the rules I've set it is the winner.

Here's the chart sorted by total number of citations (column three, 'Total number of quotations'):

Here the poem is number eleven, but again number one once disallowed works are excluded.

Lastly, here's the chart for previously recorded words used in a previously unrecorded sense (the last column, 'First evidence for sense'):

By this metric Cursor Mundi reaches as high as number eight. Six of the works above it are clearly inadmissible, but one, Wycliffe's translation of the Bible, has some claim to being a single work by a single author. However, according to Professor Wikipedia, while it was long thought that Wycliffe was the sole translator, recent scholarship has concluded that in fact a team of scholars was responsible (including John Trevisa, who features at number five in the chart). So I think a fair adjudication would be that Wycliffe's Bible can't count as a single-author work (especially as it's a translation rather than an original work), and that therefore Cursor Mundi wins once more.

That’s three golds; not bad for an obscure seven-hundred-year-old poem.

Incidentally, I notice that Chaucer is second in the count of the first use of words, behind only the Philosophical Transactions (the journal of the Royal Society); his total is 1,959, only three behind the total that the magazine has accumulated in its three and a half centuries of publication. I bet Geoffrey would be kicking himself: if only he’d managed to come up with just four more new words he would have been number one!

* As far as I can see, there's no way of making the online OED sort new words by the text they come from, only by author (except when the texts are anonymous). It would be a tedious task (though I'm sure someone has done it) to sort through the 1,463 new words introduced to the language by Shakespeare to sift out those from Hamlet, or any other given play. (Even more tedious to do this job with the 7,402 new meanings for old words - number two in the list, behind only The Times newspaper - that he's credited with.) However, it is possible to find a list of the top ten works by Shakespeare sorted by the total number of citations from them:

Assuming that new words are evenly distributed among the plays in proportion to the total number of citations, it would appear that Hamlet is only fifth on the list anyway, and so isn't even the richest source of first usages in Shakespeare, let alone the whole language. Who would have thought that Henry IV, Part 2 would be the most cited Shakespeare play in the OED?
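For anyone who wants to play with that proportional assumption, here is a rough sketch in Python. It simply shares out Shakespeare's 1,463 first usages according to each work's share of the citations. The function name and the citation counts for 'Play A', 'Play B' and 'Play C' are made up purely for illustration; they are not the OED's figures, and the result is only as good as the even-distribution assumption itself.

```python
# Total first usages credited to Shakespeare in the OED (figure quoted above).
SHAKESPEARE_FIRST_USAGES = 1463


def estimate_first_usages(citations_by_work, total_first_usages):
    """Share out first usages in proportion to each work's citation count.

    This encodes the even-distribution assumption; it is an estimate,
    not a real count of first usages per work.
    """
    total_citations = sum(citations_by_work.values())
    if total_citations == 0:
        raise ValueError("need at least one citation to apportion")
    return {
        work: total_first_usages * count / total_citations
        for work, count in citations_by_work.items()
    }


# HYPOTHETICAL citation counts, invented for illustration only.
# They are NOT the OED's figures for any real play.
example_counts = {"Play A": 1500, "Play B": 1000, "Play C": 500}

estimates = estimate_first_usages(example_counts, SHAKESPEARE_FIRST_USAGES)

# Print the works from highest to lowest estimated share.
for work, estimate in sorted(estimates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{work}: about {estimate:.0f} first usages")
```

To get a meaningful answer the citation counts would need to cover all of Shakespeare's works, not just the top ten, since the 1,463 total is for everything he wrote.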