Hay Levenshtein, keep your distance
by Mellissa Martinez
I recently came across a Facebook post in which a friend shared a picture of a delicious-looking pastry from her hometown in Poland. The caption read, “home sweat home.” I could not help but double over in laughter.
Don’t get me wrong, I was not laughing at the fact that a trilingual person made a mistake in English; I was amused by the hilarity of the mix-up, and the fact that just one little vowel can make such a difference.
This reality of English plagues native speakers as well. The omission or switching of one letter can create a variety of alternative realities.
Just the other day I told someone (over text) not to be so ‘shellfish,’ and I recently read a correction in the paper, which had reported that a man was on ‘drugs,’ when in fact, he was on ‘drums.’ Sometimes a switched or missing letter creates a drastically different, even opposite, meaning. Consider ‘appeal’ vs. ‘appall,’ ‘homely’ vs. ‘comely,’ and ‘slaughter’ vs. ‘laughter.’ Words such as these are abundant in English and, generally, they are not related in any meaningful way.
A few decades ago these mistakes were much less common. If I accidentally struck an ‘e’ instead of an ‘a’ on my typewriter in the 1980s, I would typically catch the error and spend five minutes waiting for the whiteout to dry so that I could roll the paper back into the machine and move on.
Now, largely because of a mathematical measure of how similar two words are, we are constantly embarrassing ourselves with mistaken words. Here’s why:
In 1965, Soviet mathematician Vladimir Levenshtein came up with a metric to measure the distance between two sequences. In linguistics, the Levenshtein distance (also called the edit distance) is the minimum number of single-character edits required to change one word into another. These edits include inserting, removing or changing a letter. For example, the Levenshtein distance between ‘sweat’ and ‘sweet’ is one, while the distance between ‘shellfish’ and ‘selfish’ is two.
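For readers who like to see the arithmetic, here is a minimal Python sketch of how that distance can be computed. It uses the standard dynamic-programming approach; the function name `levenshtein` is simply a label chosen for this illustration, not the code any particular phone or program runs.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-letter insertions, removals, or changes
    needed to turn word a into word b."""
    # prev[j] = distance between the letters of a seen so far (minus the
    # current one) and the first j letters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning i letters of a into an empty word takes i removals
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # remove a letter from a
                curr[j - 1] + 1,     # insert a letter from b
                prev[j - 1] + cost,  # change a letter (free if they match)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("sweat", "sweet"))        # 1
print(levenshtein("shellfish", "selfish"))  # 2
```

The two printed values match the examples above: one changed vowel for ‘sweat’ and ‘sweet,’ and two removed letters for ‘shellfish’ and ‘selfish.’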
The Levenshtein distance is now built into programs that benefit from matching words. These include speech recognition, plagiarism detection, automatic spelling correction, and word suggestion. When it comes to spelling, if a text contains a word that is not in the dictionary, the computer will replace it with a frequently used dictionary word that has the smallest Levenshtein distance from it. The program will also consider keyboard distance, possible omitted spaces and prefixes.
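The replacement step can be sketched in a few lines of Python. This is only a toy under stated assumptions: the tiny `WORD_COUNTS` dictionary and its frequency numbers are invented for illustration, the `suggest` function is a hypothetical name, and the sketch ignores keyboard distance, omitted spaces and prefixes, which real programs also weigh.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-letter insertions, removals, or changes."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Hypothetical mini-dictionary: word -> how often it is used (made-up counts).
WORD_COUNTS = {
    "sweet": 120, "sweat": 40,
    "selfish": 30, "shellfish": 10,
    "drive": 90, "drink": 80,
}


def suggest(word: str, word_counts: dict[str, int]) -> str:
    """Return the word itself if it is in the dictionary; otherwise the
    dictionary word with the smallest distance, breaking ties by frequency."""
    word = word.lower()
    if word in word_counts:
        return word
    return min(
        word_counts,
        key=lambda w: (levenshtein(word, w), -word_counts[w]),
    )


print(suggest("sweit", WORD_COUNTS))  # sweet
```

Here ‘sweit’ is one edit away from both ‘sweet’ and ‘sweat,’ so the more frequently used word wins. The same tie-break is one plausible reason a slip of the thumb can land on the wrong but perfectly real word.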
Although I find programs like autocorrect and voice translation very helpful, some of the resulting errors can be downright hysterical. One popular website shows a text in which a father writes his son “mom and I are going to Divorce next week.” When the kid responds with a terrified “WHAT?! Call me!” the dad explains, “oh, sorry…auto correct changed Disney to Divorce.” In another, a mother tells her daughter, “honey, I’m going to teach you how to drink this weekend. I think you’ll be good at it.” ‘Drink,’ in this case, was intended to be ‘drive.’
You may wonder how ‘divorce’ can replace ‘Disney,’ as the two words don’t seem to have a small Levenshtein distance. My theory is that when typing on or speaking into the phone, we tend to be sloppy or quick. If the finger accidentally presses a ‘v’ after ‘di,’ perhaps the computer jumps to ‘divorce’ because it is commonly used on that person’s phone.
So why do we find these word slip-ups so funny? According to the 19th-century philosopher Arthur Schopenhauer, humor arises when our expectations are violated. The basic idea behind his “Incongruity Theory” is that we live in a world of patterns and order. This is especially true when it comes to language, where we follow expected rules. When we see or hear something that doesn’t fit these rules (like me telling my son not to be a shellfish), we laugh.
I suspect that it’s only a matter of time before the algorithm improves and these silly errors become less prominent. Until then, let’s enjoy the levity of word play. Take a few minutes to scroll through autocorrect fails online and you will find yourself in tears.
Words are so important to our social interactions, and I find it heartening that the social faux pas of mistaking ‘pubic’ for ‘public’ or ‘anus’ for ‘angus’ still has the ability to bring the 10-year-old silliness out in all of us.