Researchers have delved into the source code of Google's latest KitKat operating system and found a list of all the words Google deems inappropriate enough to block with its filter. The result is a bizarre list of 1,400 words that the Google operating system will refuse to recognise if typed or swiped.
All of the old favourites are there, of course. Every one of the seven dirty words made famous by comedian George Carlin is on the list. So are "butt" and "geek," an exhaustive list of pornographic subgenres, and just about every racial, ethnic, homophobic, misogynist and generally unpleasant word imaginable.
However, some of Google's other choices are a little more obscure. "Gonadatrophia" is out, and so is "irrumination" (Google them). Other unusual entries are "thud" and "LSAT". There's also an unhealthy preoccupation with women's bodies, with "lactation", "uterus" and "preggers" all on the censored list.
For users of medicinal opiates, "morphine" is forbidden, as well as "demerol" and "malonylurea," a precursor to modern barbiturates. If you think Google is trying to pre-empt Breaking Bad-style antics among its users, you'd be wrong, though: "marijuana," "methamphetamine" and even "bong" are all allowed.
The list of words might even cause some problems for our Muslim brothers and sisters, with "Sunni" on the forbidden list, along with "Iftar," the evening meal that Muslims take at sunset during the month of Ramadan.
As for the political extremes, "klansmen" and "supremacist" are blocked, but "Nazi" is just fine, presumably to allow for scholars who engage in debates about historicity over SMS.
The list of banned words has certainly raised some eyebrows, particularly in sectors of society that want to see more openness around issues of sexual health. "STI" and "Tampax" are blocked, for instance.
"I try to Swype-type the word 'condom' and I get 'condition' or 'confusion,'" said Jillian York, a spokesperson for the Electronic Frontier Foundation. "There is no context in which that makes any sense. Grow up, Android."
A note within the KitKat source code illuminates the process behind word suggestions, and how certain words are deemed too rude to be shown. Each word in the Google dictionary is given an integer value, or "weight," from 0 to 255, where 255 is a 100 per cent probability of the word occurring.
However, "a weight of 0 is taken to mean profanity - words that should not be considered a typo, but that should never be suggested explicitly."
Fans of colourful language should be aware that the words are only blocked if users activate the filter in the Google Keyboard Settings section of their phone.
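To make the weighting scheme concrete, here is a minimal Python sketch of how a weight of 0 could keep a word out of the suggestions while still treating it as a real word. It is an illustration under stated assumptions, not Google's actual code: the names `Entry`, `suggest`, `is_known_word` and `block_offensive` are hypothetical, the sample weights are invented, simple prefix matching stands in for whatever matching the keyboard really does, and the behaviour with the filter switched off is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    word: str
    weight: int  # 0-255, as in the source note; 0 marks "never suggest"


def suggest(prefix, entries, block_offensive=True, limit=3):
    """Return up to `limit` words starting with `prefix`, highest weight first.

    Assumption: with the filter on, weight-0 words are dropped entirely;
    with it off, they simply rank last.
    """
    candidates = [
        e for e in entries
        if e.word.startswith(prefix)
        and not (block_offensive and e.weight == 0)
    ]
    candidates.sort(key=lambda e: e.weight, reverse=True)
    return [e.word for e in candidates[:limit]]


def is_known_word(word, entries):
    """Weight-0 words are still in the dictionary, so they are not typos."""
    return any(e.word == word for e in entries)


# Hypothetical weights, invented purely for illustration.
SAMPLE = [
    Entry("condition", 180),
    Entry("confusion", 150),
    Entry("condom", 0),
]

print(suggest("con", SAMPLE))                         # ['condition', 'confusion']
print(suggest("con", SAMPLE, block_offensive=False))  # ['condition', 'confusion', 'condom']
print(is_known_word("condom", SAMPLE))                # True
```

In this sketch, the blocked word never appears as a suggestion while the filter is on, yet it still counts as a known word, which matches the note's description of words that "should not be considered a typo" but are never offered.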
Google's predictive dictionary contains 165,000 words in total, but the company has so far kept quiet about how words are included, how often the list is updated, and by what process words are deemed inappropriate.