Mathematicians at the University of Vermont have been meddling in a field very far from boring numbers. Earlier this month, they officially declared the English language “optimistic” based on a careful analysis that combined statistics and subtle human evaluation. The researchers, led by assistant professor Chris Danforth, aggregated texts from Twitter, the New York Times, song lyrics, and Google Books’ database dating back to 1520. They picked the top 5,000 words from each source, which totaled 10,222 words. (Why is it not 20,000 words? There was some overlap in the top 5,000. It is no surprise that “the,” “a” and “of” are some of the most frequently used words across these different collections.)
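The overlap arithmetic can be illustrated with a small sketch. It is not the researchers' code: the toy corpora, the `top_words` helper, and the cutoff of 2 (standing in for 5,000) are all made up for illustration.

```python
from collections import Counter

def top_words(tokens, n):
    """Return the set of the n most frequent tokens in a list of words."""
    return {word for word, _ in Counter(tokens).most_common(n)}

# Hypothetical toy corpora; the study used far larger collections.
corpora = {
    "tweets": "the cat and the dog and the bird".split(),
    "lyrics": "the love of the night and the love of the day".split(),
}

# Take the top 2 words from each source, then merge them.
# Words that rank highly in several sources are only counted once.
combined = set().union(*(top_words(tokens, 2) for tokens in corpora.values()))
print(len(combined), sorted(combined))
```

Two sources with two top words each yield three distinct words here rather than four, because “the” appears in both lists. The same effect shrinks 20,000 candidates to 10,222.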
Once they had the most frequently used words, they brought in human evaluators to rate each word on a happiness scale, with 0 being the least happy and 9 being the happiest. With a score assigned to every word, the researchers measured how often each word appears in the texts and combined the two into a composite score. That composite score was 6, a statistically significant shift towards the “happy” end of the spectrum.
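One plausible way to combine ratings with frequencies is a frequency-weighted average, sketched below. The article does not spell out the exact formula, so this is an assumption, and the function name `weighted_happiness`, the scores, and the counts are all invented for illustration.

```python
def weighted_happiness(scores, counts):
    """Average the word scores, weighting each by how often the word occurs.

    scores: dict mapping word -> happiness rating from human evaluators
    counts: dict mapping word -> number of occurrences in the text
    Words without a rating are skipped.
    """
    total = 0.0
    weight = 0
    for word, count in counts.items():
        if word in scores:
            total += scores[word] * count
            weight += count
    if weight == 0:
        raise ValueError("no scored words in the text")
    return total / weight

# Made-up ratings and counts, not values from the study.
scores = {"love": 8.0, "the": 5.0, "war": 2.0}
counts = {"love": 3, "the": 10, "war": 1, "zzz": 4}

result = weighted_happiness(scores, counts)
print(round(result, 2))
```

Under this scheme a common word such as “the” pulls the result toward its own rating far more than a rare word does, which is why the outcome reflects how words are used rather than which words exist.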
Of course, this study isn’t perfect. For one, a word’s rating does not always match the way it is used. The word “bad,” which most evaluators would rate as unhappy, has developed an alternative meaning of “outstandingly excellent; first rate.” Other words received very divergent scores from the evaluators. Words like “pregnant” and “alcohol” may be very positive to one person but very negative to another.
Also, the study measures the use of words, not the existence of words, so perhaps it would be more accurate to say that English speakers in the selected works were positive, rather than the English language being inherently positive. If the study evaluated the unabridged dictionary using the same methods, we might be able to better measure the emotional tone of the language as a whole.
Danforth and his colleague Peter Sheridan Dodds have been developing a “hedonometer” to measure the mood of a population based on the words its members use. The idea of a scale to rate happiness has been around for quite some time; the economist Francis Edgeworth first spread the idea in the 1800s. But Dodds and Danforth have come much closer than their predecessors to a successful system, one that generates a happiness score for a particular time period by reading tweets, political speeches, and blogs, among other raw texts. Read their academic work here.
Do you think English is an innately positive language? Do you think you can measure happiness by word use?