Each year brings us a new list of words that, once hip or subcultural, signal their admission into the mainstream by entering the pages (print or online) of the Oxford English Dictionary or Merriam-Webster. Many of those come from the world of hip hop. The form is a veritable laboratory of linguistic innovation, spawning dozens of region-specific argots that mutate and evolve beyond the capacity of hip lexicographers to document. One data scientist, Matt Daniels, has made an interesting attempt, however, in a project he calls “The Largest Vocabulary in Hip Hop.” Proceeding from the premise that certain rappers might match or best Shakespeare for the title of “largest vocabulary ever,” Daniels used a methodology called “token analysis” to analyze the lyrical content of “the most famous artists in hip hop.” He relied on Rap Genius transcriptions, which are only current to 2012, to build a sample of 35,000 words per artist (the equivalent of 3–5 studio albums).
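For readers curious what such a count involves, here is a minimal Python sketch of the general idea: split the lyrics into tokens, keep the first 35,000, and count the distinct ones. The tokenizer and the names `tokenize` and `unique_word_count` are assumptions made for illustration; the article does not describe the exact rules Daniels used, so this should not be read as his implementation.

```python
import re

# Hypothetical tokenizer: runs of letters, allowing internal apostrophes
# (so "don't" stays one token). The project's real rules are not given here.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")

def tokenize(text):
    """Lowercase the text, normalize curly apostrophes, and return its tokens."""
    normalized = text.lower().replace("\u2019", "'")
    return TOKEN_RE.findall(normalized)

def unique_word_count(text, sample_size=35_000):
    """Count distinct tokens among the first `sample_size` tokens of `text`."""
    tokens = tokenize(text)[:sample_size]
    return len(set(tokens))

lyrics = "I dumbed down for my audience to double my dollars"
print(unique_word_count(lyrics))  # 10 tokens, "my" appears twice -> 9
```

Capping every artist at the same number of tokens is what makes the totals comparable: without the cap, an artist with a longer catalog would tend to show more unique words simply by having written more.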
Topping the list by far, with 7,392 unique words, is rapper Aesop Rock, who, Daniels admits, is somewhat obscure by comparison with Jay-Z or Snoop Dogg. Better-known artists like Wu-Tang Clan, The Roots, and Outkast also rank highly, but Daniels discovered that many of the rappers near the top of the scale are underground or obscure artists who don’t sell millions of records. Occupying the lower end are some top-selling artists and household names like Lil Wayne, Kanye West, and Snoop Dogg (DMX is dead last at #85). King of the hill Jay-Z, whose 2013 album Magna Carta…Holy Grail sold half a million copies in its first week, ranks somewhere in the middle. Daniels quotes the mega-selling rapper’s “Moment of Clarity,” from The Black Album, in which he plainly admits that he’ll write middlebrow lyrics for million-dollar sales figures: “I dumbed down for my audience to double my dollars” (one wonders how many listeners perceived the slight).
Daniels admits in an NPR interview that this is “not a serious academic study” but a project he undertook for the fun of it. A great many of the “unique words” counted in each rapper’s totals are slang coinages or variants like “pimps, pimp, pimping, and pimpin,” each of which counts separately. Even so, writes Daniels on the project’s site, “it’s still directionally interesting,” and it is sociologically interesting as well. And of course, literary writers have been contributing made-up words to the general lexicon for centuries. See Daniels’ site for an interactive visualization (screenshot above) of the rankings of all 85 rappers surveyed.
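To see how much the variant-counting caveat can matter, consider the four spellings quoted above. The sketch below counts them as-is, then again after a deliberately naive suffix stripper. The `crude_stem` helper is hypothetical and is not part of Daniels’ method; it only illustrates how collapsing variants would shrink a total.

```python
variants = ["pimps", "pimp", "pimping", "pimpin"]

# Counted as-is, each spelling is its own unique word.
raw_count = len(set(variants))

def crude_stem(word):
    """Hypothetical, deliberately naive suffix stripper (illustration only)."""
    for suffix in ("ing", "in", "s"):
        # Require a few letters to remain so short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

stemmed_count = len({crude_stem(w) for w in variants})
print(raw_count, stemmed_count)  # 4 1
```

A rule this crude would mangle plenty of ordinary words, so it is no substitute for a real stemmer or lemmatizer; it simply shows that the choice of what counts as “the same word” can move the numbers.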
If you’re wondering who has a bigger vocabulary, Shakespeare or rappers, here’s the quick answer in purely numerical terms. Working from samples of 35,000 words each, Daniels determined that Aesop Rock used 7,392 unique words (and Wu-Tang Clan, 5,895) against roughly 5,000 unique words for Shakespeare. And there you have it.
Josh Jones is a writer and musician based in Durham, NC. Follow him at @jdmagness.
I saw this yesterday (can’t remember where) and the comments were in favor of the rappers as somehow better writers than Shakespeare. If I had not scanned to the bottom of this post I wouldn’t have known that it is “not a serious academic study.”
Does it really have to be pointed out that sometimes bigger is not better? It is not the size of vocabulary, but how it is put together.
Many writers have a large vocabulary; James Michener claimed nearly 80,000 words. He would never have compared himself to the Bard.
What bothers me is that a lot of young people take this seriously. In the words of one commenter:
“To be, or not to be. What does that even mean?”
I wonder if Elvis was keener on rocket science than astrophysicist Neil deGrasse Tyson?
An important note with regard to comparing anyone on this chart with Shakespeare:
“I used the first 5,000 words for 7 of Shakespeare’s works: Hamlet, Romeo and Juliet, Othello, Macbeth, As You Like It, Winter’s Tale, and Troilus and Cressida. For Melville, I used the first 35,000 words of Moby Dick.”
A direct comparison between the rappers on this chart and Shakespeare or Melville is not reflective of full vocabulary size; it is just meant as a kind of visual reference point.
This is a nonsense “study” that only proves that one can prove anything with faulty methodology. A person’s vocabulary isn’t limited to a period in their life, so picking a few plays only shows what words were needed for those plays. To really show anyone’s vocabulary, you’d need to count all the words they used in everything. And counting derivations of the same word, like “pimp,” as multiple words is idiocy.
No doubt there are rappers with large vocabularies, but this doesn’t come close to demonstrating that, let alone show they’re beyond Shakespeare.
https://kottke.org/10/04/how-many-words-did-shakespeare-know