I saw this link on Astral Codex Ten, Scott Alexander’s new blog, which I didn’t learn about until recently:
Fiiine, I’ll link to the creativity test that’s gone viral on Twitter recently. You choose ten words, and it grades you as more creative the more different all ten are from each other on some measure of semantic distance.
I love Scott Alexander’s long, thoughtful posts about stuff, but the linked post above is not one of those; it’s a collection of links to whatever he’s seen recently that caught his eye, including the creativity test that (I guess) has become popular on Twitter. I’m not doing a lot with social media at the moment, preferring to put my emotional energy toward other things such as writing, so I wasn’t aware of that.
The instructions for this test are: Please enter 10 words that are as different from each other as possible, in all meanings and uses of the words. That means single-word nouns only: no proper nouns and no technical jargon.
Naturally, I found a creativity test that is basically a vocabulary test just irresistible.
Here are my results:
Your score is 94.44, higher than 99.31% of the people who have completed this task
| | civet | missile | pillow | peppercorn | trust | quartz | vivaciousness |
|---|---|---|---|---|---|---|---|
| civet | | | | | | | |
| missile | 103 | | | | | | |
| pillow | 93 | 89 | | | | | |
| peppercorn | 72 | 96 | 96 | | | | |
| trust | 106 | 90 | 88 | 106 | | | |
| quartz | 89 | 96 | 83 | 86 | 103 | | |
| vivaciousness | 82 | 107 | 108 | 83 | 105 | 101 | |
The average score is 78, and most people score between 74 and 82. The lowest score was 24 and the highest was 96 in our published sample. Although the scores can theoretically range from 0 to 200, in practice they range from around 6.2 to 109.6 after millions of responses.
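As a rough check on how the headline number relates to the table: assuming the score is simply the mean of the 21 pairwise distances shown (the results page describes it as an average of semantic distances), a short Python sketch can recompute it from the displayed values. The names `words`, `rows`, and `pairs` are only illustrative, and the table shows rounded distances, so a small gap from the reported 94.44 is to be expected.

```python
# Pairwise distances copied from the results table (lower triangle only).
words = ["civet", "missile", "pillow", "peppercorn",
         "trust", "quartz", "vivaciousness"]
rows = [
    [],
    [103],
    [93, 89],
    [72, 96, 96],
    [106, 90, 88, 106],
    [89, 96, 83, 86, 103],
    [82, 107, 108, 83, 105, 101],
]

# Flatten the triangle into (row_word, column_word, distance) triples.
pairs = [
    (words[i], words[j], d)
    for i, row in enumerate(rows)
    for j, d in enumerate(row)
]

# Assumption: the overall score is the plain average of all pairs.
score = sum(d for _, _, d in pairs) / len(pairs)
closest = min(pairs, key=lambda p: p[2])
farthest = max(pairs, key=lambda p: p[2])

print(f"pairs: {len(pairs)}, mean distance: {score:.2f}")
print("closest:", closest)
print("farthest:", farthest)
```

The mean of the displayed values comes out at about 94.38, close to the reported score, with civet and peppercorn as the closest pair and pillow and vivaciousness as the farthest.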
I’m not sure why only the first seven of ten nouns are included in the analysis. My other three nouns were “bicycle,” “solitude,” and “follicle.” You’re not supposed to use specialized vocabulary, so I didn’t put in “ovoviviparity,” though I wanted to. I thought “follicle” was not too jargony. If I’d known that only the first seven nouns counted, I’d have replaced either “missile” or “pillow” with “follicle.”
Now, is a large vocabulary plus an ability to decide whether nouns are “not like each other” a measure of creativity? Is this a sound argument? Beats me. Could be, I guess, though I’d offhand guess that this is a correlated measure rather than a direct measure. I could argue it either way in theoretical terms. This test does include the caveat that it “measures only one aspect of one type of creativity,” which certainly makes it difficult to refute. Sure, it’s probably to some degree a measure of one aspect of one type of something we could call “creativity.”
Scott shows that just plugging in a random selection of nouns right out of a dictionary, in order, from apple to appeasement, will give you a pretty good score.
Anyway, if you enjoy vocabulary quizzes, and who doesn’t, here you go.
If I’m reading this right, it thinks that “civet” and “peppercorn” are the two most similar words of the seven, and “pillow” and “vivaciousness” the most distant. I can’t say this makes it any clearer to me what scale(s) you could use to make this kind of judgement.
I gave up on trying to figure out how their algorithm decides which words are most similar and least similar. Personally, I consider missile and pillow most similar, as they’re the most, I don’t know, straightforward, concrete, obvious nouns? Or something? However, the algorithm does not agree, as when I replace “pillow” with “follicle” and try again, the score drops a bit.
Civets and peppercorns are both biological, although one is a whole organism and the other is the fruit of a plant.
Trust and vivaciousness are both abstract qualities, but the words themselves are structured differently. Maybe that plays a role?
Missile and trust are pretty opposite in meaning AND one is concrete and the other is abstract, so they seem more different to me than the algorithm seems to think they are.
So, who knows? I don’t think this test is likely to capture much reality. Way too much that is handwavy about the whole idea of “measuring” how “different” nouns are from other nouns.
I scored 89.9, 95th percentile. I have a hard time seeing the semantic relationship between REVERSAL and SHARP, but at 74, it really brought down the score. I do sort of see that VALENTINE and ARCHANGEL are related, but that pair still scored higher. MICROFAUNA, alas, was rejected.
Oh yes, my first word, ABECEDARIAN, was also rejected. I do wish it gave warnings before running the numbers.
Reversal and sharp. Hmm. That is a puzzle. They’re both abstract? Abstract-ish. Is “reversal” possibly also used somehow in music?
Let’s see how badly this gets mangled:
Your score is 96.01, higher than 99.65% of the people who have completed this task
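| | goldfinch | sediment | honor | quasar | democracy | script | ax |
|---|---|---|---|---|---|---|---|
| goldfinch | | | | | | | |
| sediment | 102 | | | | | | |
| honor | 108 | 106 | | | | | |
| quasar | 85 | 86 | 113 | | | | |
| democracy | 99 | 100 | 71 | 103 | | | |
| script | 109 | 99 | 91 | 106 | 89 | | |
| ax | 93 | 98 | 93 | 87 | 92 | 88 | |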
Maybe the existence of sharp reversals is a link between the two words? Though it’s only supposed to consider “sharp” as a noun, I thought.
It says on the results page that “The score is computed by taking the average semantic distance between each of the words. These distances are computed by measuring how often the words are used together in similar contexts.” If I had known this I would definitely have chosen differently!
The word “grave” was the one that pulled my score way down… primarily occurring with “disappointment.” I think this is a result of a problem in the algorithm. In “a grave disappointment” we definitely have grave as an adjective, not a noun.
Your score is 91.98, higher than 98.18% of the people who have completed this task
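| | chloroform | logistics | scribe | disappointment | peninsula | impersonator | grave |
|---|---|---|---|---|---|---|---|
| chloroform | | | | | | | |
| logistics | 102 | | | | | | |
| scribe | 97 | 101 | | | | | |
| disappointment | 105 | 88 | 93 | | | | |
| peninsula | 100 | 85 | 96 | 97 | | | |
| impersonator | 86 | 102 | 86 | 98 | 97 | | |
| grave | 88 | 95 | 81 | 66 | 83 | 85 | |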
I note that we’re all over the 90th percentile so far. Maybe we should propose research on whether or not fans of Rachel Neumeier are significantly more likely to score high on a verbal creativity test than a random sample of the population.
Oh! “Sharp reversal”, I get it. Except it asks about nouns, which ‘sharp’ is not in this sentence. Including homographs is a little unfair.
I’m pretty sure writers and serious readers will score much higher than average. Readers just have bigger vocabularies, or I would be astonished if not.
That measure of difference would screw up the score for every noun that is also an adjective. It would be more honest to instruct people to choose either nouns or adjectives.
I got a response back from the researchers. Their corpus is not tagged with parts of speech, so the algorithm is definitely not distinguishing between nouns and adjectives. I think some computational linguistics folks could perhaps recommend a more robust corpus.
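To make the part-of-speech problem concrete, here is a minimal Python sketch under loudly stated assumptions: a tiny hand-tagged toy corpus (entirely hypothetical), context vectors built from plain same-sentence co-occurrence counts, and cosine distance scaled by 100. None of this is the researchers' actual model or corpus; it only illustrates how an untagged corpus folds the adjective "grave" into the noun "grave" and pulls it toward "disappointment".

```python
from collections import Counter
from math import sqrt

# Hypothetical toy corpus, hand-tagged, with stopwords already removed.
TAGGED = [
    ["result", "grave/ADJ", "disappointment"],
    ["grave/ADJ", "disappointment", "team"],
    ["dug", "grave/NOUN", "cemetery"],
    ["loss", "bitter", "disappointment"],
    ["flowers", "grave/NOUN", "cemetery"],
]

def strip_tags(sentences):
    """Drop the /TAG suffix, mimicking a corpus without part-of-speech tags."""
    return [[tok.split("/")[0] for tok in sent] for sent in sentences]

def context_vectors(sentences):
    """Map each token to counts of the other tokens sharing a sentence with it."""
    vectors = {}
    for sent in sentences:
        for i, tok in enumerate(sent):
            others = sent[:i] + sent[i + 1:]
            vectors.setdefault(tok, Counter()).update(others)
    return vectors

def distance(u, v):
    """Cosine distance times 100 (0 to 100 here, since counts are non-negative)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return 100 * (1 - dot / (norm_u * norm_v))

untagged = context_vectors(strip_tags(TAGGED))
tagged = context_vectors(TAGGED)

# Untagged: noun and adjective uses of "grave" share a single vector.
merged = distance(untagged["grave"], untagged["disappointment"])
# Tagged: only the noun sense of "grave" is compared.
noun_only = distance(tagged["grave/NOUN"], tagged["disappointment"])

print(f"untagged grave vs disappointment: {merged:.1f}")
print(f"noun-only grave vs disappointment: {noun_only:.1f}")
```

In this toy setup the untagged "grave" lands closer to "disappointment" than the noun-only "grave" does, which is the same direction of effect as the low 66 in the table above. With real word vectors, whose components can be negative, the same scaling would give a range of 0 to 200, which at least matches the theoretical range the test reports.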
Wow, they REALLY need to mention that nouns that are also adjectives will screw up your score. That’s quite unfair.
I think I’ll see what score I get if I just put in some of the most utterly obscure nouns possible. I bet that will get a high score no matter whether some of the nouns might be related in meaning or otherwise.
Likewise nouns that are also verbs, like “trust” in your list or “honor” in Mary Catelli’s.
Obscure nouns: nearly unknown mammal edition
Chevrotain
Addax
Nilgai
Gaur
Colugo
Genet
Olingo
Tenrec
Solenodon
Aardwolf
The program may disqualify some of these as unknown. Let me just see … Ah! The program did not recognize ANY of these mammal names.
Humph. Some program.