It should come as no surprise to anyone reading this column that I write fiction in addition to non-fiction.
Specifically, I write science fiction and fantasy for both adults and young adults.
That is why Ben Zimmer’s recent article in The New York Times’s Sunday Book Review caught my eye. It describes how lexicographers are using modern computer databases and data-crunching software to uncover some fascinating things about the use of language in fiction.
One such computer-based tool is the Corpus of Contemporary American English, which you can explore online at corpus.byu.edu/coca/.
Compiled by Mark Davies at Brigham Young University, the database contains 425 million words of text from popular magazines, newspapers, academic texts, transcripts of spoken English, short stories and plays in literary magazines, and the first chapters of hundreds of novels from major publishers, all published within the past twenty years.
You can search not only for individual words but for parts of speech.
Zimmer gives as an example a search for past-tense verbs. The database reveals that, as you’d expect, the most common examples are the simplest: “said,” “came,” “got,” etc.
There’s not much difference between fiction and non-fiction in that regard.
But, notes Zimmer, if you ask the database which past-tense verbs show up more frequently in fiction than in, say, academic prose, you get some interesting results.
The top five are “grimaced,” “scowled,” “grunted,” “wiggled” and “gritted.” Or, as Zimmer puts it, “Sour facial expressions, gruff noises and emphatic body movements (wiggling fingers and gritting teeth) would seem to rule the verbs peculiar to today’s published fiction.”
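For the programmers in the audience, here is a minimal sketch of the kind of comparison Zimmer describes: ranking verbs by how much more often they turn up in fiction than in academic prose. It is not how the Corpus of Contemporary American English works internally. The function name `fiction_skew`, the `smoothing` parameter and every count in the toy data are invented for illustration.

```python
from collections import Counter


def fiction_skew(fiction, academic, smoothing=0.5):
    """Rank words by (rate in fiction) / (rate in academic prose).

    `fiction` and `academic` are Counters of raw word counts.
    `smoothing` is a small constant added to every count so that a word
    missing from one genre does not cause a division by zero.
    """
    fiction_total = sum(fiction.values())
    academic_total = sum(academic.values())
    ratios = {}
    for word in set(fiction) | set(academic):
        fiction_rate = (fiction[word] + smoothing) / fiction_total
        academic_rate = (academic[word] + smoothing) / academic_total
        ratios[word] = fiction_rate / academic_rate
    # Most fiction-specific words first.
    return sorted(ratios, key=ratios.get, reverse=True)


# Made-up counts, NOT real corpus data.
fiction_counts = Counter({"said": 900, "came": 60, "grimaced": 30, "scowled": 10})
academic_counts = Counter({"said": 850, "came": 100, "found": 49, "grimaced": 1})

for word in fiction_skew(fiction_counts, academic_counts):
    print(word)
```

Note that "said" is by far the most common verb in both toy genres, yet it lands in the middle of the ranking. Dividing one rate by the other is what pushes the grimacing and scowling to the top.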
Another way to use these databases of published text is to search for combinations of words, called “collocations.”
Apparently these are of particular interest to the people who make dictionaries for those learning English as a second language: combinations of words can be key to understanding some of the language’s idiosyncrasies.
Lexicographer Orin Hargraves, while working with the Oxford English Corpus, containing about two billion words of 21st-century English, discovered a number of collocations that appear almost solely in fiction.
For instance, he found that although the combination of “brush” and “teeth” was common in all kinds of writing, in fiction, “brush” appeared with words like “hair,” “strand,” “lock” and “lip” up to 150 times more frequently than in non-fiction.
Or, as Hargraves puts it, “fictional characters cannot stop playing with their hair.”
Other collocations that Hargraves’s research turned up almost exclusively in fiction were “bolting upright” and “drawing one’s breath.”
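A collocation search can be sketched in a few lines as well. The version below simply counts which words fall within a few words of a chosen "node" word such as "brush". It is a rough illustration under simplifying assumptions, not the method used with the Oxford English Corpus: the helper names `tokenize` and `collocates`, the `window` size and both sample sentences are made up, and there is no lemmatizing, so "brushed" and "brushing" would not be matched.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, node, window=4):
    """Count the words appearing within `window` tokens of each `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the node word itself
                counts[tokens[j]] += 1
    return counts


# Invented sample sentences, NOT drawn from any real corpus.
fiction_text = ("She reached up to brush a strand of hair from her face, "
                "then went to brush her teeth.")
nonfiction_text = "Dentists advise patients to brush their teeth twice a day."

for label, text in (("fiction", fiction_text), ("non-fiction", nonfiction_text)):
    near_brush = collocates(tokenize(text), "brush")
    print(label, "hair:", near_brush["hair"], "teeth:", near_brush["teeth"])
```

Run over millions of words rather than two sentences, and with the counts divided by the size of each sample, tallies like these are what would let a lexicographer say that one pairing is many times more frequent in fiction than elsewhere.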
Zimmer suggests, and I think he’s right, that creative writers use these phrases because they are looking for ways to portray emotional responses through physical action.
They get used in book after book, though, because they are part of the contemporary vocabulary of written fiction.
In Zimmer’s words, “the conventions of modern storytelling dictate that fictional characters react to their worlds in certain stock ways and that the storytellers use stock expressions to describe those reactions.”
Individual writers have their own quirks, of course, which they may or may not be aware of. (My characters tend to “growl” a lot if I don’t watch them closely.)
Zimmer notes that Dan Brown (author of The Da Vinci Code) seems to be “partial to eyebrows,” with characters in his book Digital Fortress arching or raising them no fewer than 14 times.
Zimmer’s overall point: modern fiction may not be written in a self-consciously “literary style” as fiction once was, but that doesn’t mean that there are no conventions.
“When we see a character in contemporary fiction ‘bolt upright’ or ‘draw a breath’,” he says, we’re “picking up the subtle cues that telegraph a literary style,” that provide “a kind of comfortable linguistic furniture to settle into as we read a novel or short story.”
“Literature did not suddenly become unliterary because the prose was no longer so high-flying,” he concludes.
“Rather, the textual hints of literariness continue to wash over us unannounced, even as a new kind of brainpower, the computational kind, can help identify exactly what those hints are and how they function.”
It’s all very interesting to a writer like me. Now, if you’ll excuse me, I have a character whose hair needs attention.
Edward Willett is a Regina freelance writer. E-mail comments or questions to firstname.lastname@example.org. Visit Ed on the web at www.edwardwillett.com.