Why do spam emails contain random words?

Random words in spam emails might seem pointless, but these unwanted messages are worded that way quite deliberately.

Wednesday, 21 July, 2021

The impact of Monty Python continues to reverberate around the world, almost half a century after some of the programme’s most iconic sketches aired.

In computing circles, the show lent its name to a programming language (Python), and inspired the term ‘spam’ to describe unsolicited bulk email.

The endless repetition of the word in a two-minute TV sketch was adopted by early internet users as an illustration of the repetitive and unwelcome familiarity of recurring junk messages.

The term was originally applied to message boards, before carrying over into email, as the affordability of mass mailing encouraged companies to bombard people with communiques.

This in turn led to the creation of spam filters, which drove the spammers in a direction the original Pythons would probably have grudgingly admired…

I don’t want any spam!

Spam filters were set up to detect terminology commonly used in junk emails, triggering an evolving game of cat-and-mouse between senders and the mail servers of the recipients.

Brand names like Rolex and Viagra often appeared in unsolicited bulk messages, as did words like ‘free’ and ‘pleasure’, so servers began blocking messages with this content.

In response, spammers began deliberately misspelling words – R0lex and V!agra, for instance. But then the filters cottoned on to this as well.
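To see why swapping a letter for a lookalike character worked at first, and why it stopped working, here is a minimal Python sketch. It is only an illustration: the blocklist, the character map and the function names are all made up for this example, and real mail servers use far more sophisticated checks.

```python
# Hypothetical blocklist and lookalike map, invented for this illustration.
BLOCKED_WORDS = {"rolex", "viagra", "free", "pleasure"}
LOOKALIKES = str.maketrans({"0": "o", "1": "l", "!": "i", "3": "e", "@": "a", "$": "s"})


def tokens(text):
    """Lower-case the text, split on whitespace and trim punctuation at word edges."""
    return [word.strip(".,?!;:\"'") for word in text.lower().split()]


def naive_filter(text):
    """Flag a message only if a word matches the blocklist exactly."""
    return any(word in BLOCKED_WORDS for word in tokens(text))


def normalising_filter(text):
    """Undo common character swaps inside each word before checking the blocklist."""
    return any(word.translate(LOOKALIKES) in BLOCKED_WORDS for word in tokens(text))


message = "Cheap R0lex and V!agra"
print(naive_filter(message))        # the misspellings slip past the exact match
print(normalising_filter(message))  # the swaps are undone, so the message is flagged
```

The second function is the filter "cottoning on": once the obvious substitutions are reversed, R0lex is just Rolex again.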

Next came a focus on phrases rather than individual words in email subject lines and body text, and this too was shut down.

The spammers attempted to get round this by changing the order of words, but the filters continued to evolve in response.

Next came the phenomenon many people will be familiar with today, and one which the surrealist Pythons might have appreciated – random words in spam emails.

A message may arrive bearing the subject line “monkey spat teapot would Half acrimony”, with body text comprising an indeterminate number of words which make absolutely no sense.

So what’s being attempted when random words in spam emails are sent out?

Keeping spam at Bayes

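Bayesian poisoning is the technical term for trying to defeat a spam filter by incorporating random words in spam emails. Bayesian filters score each message according to how often its words have previously appeared in junk mail compared with legitimate mail, so padding a message with innocuous words can drag its score down.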

The resulting text is also known as word salad: dictionary words strung together in a completely random order.
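A toy example shows the effect. The Python sketch below builds a very small naive Bayes scorer from six invented training messages; the training data, the function names and the zero threshold are all assumptions made for illustration, not a description of how any particular mail provider works.

```python
import math
from collections import Counter

# Tiny made-up training set; a real filter learns from many thousands of messages.
SPAM = ["free rolex offer", "free pleasure offer now", "rolex sale free"]
HAM = [
    "monkey photos from the zoo",
    "teapot for the office kitchen",
    "meeting moved to friday",
]


def word_counts(messages):
    """Count how often each word appears across a list of messages."""
    return Counter(word for message in messages for word in message.split())


spam_counts = word_counts(SPAM)
ham_counts = word_counts(HAM)
vocab = set(spam_counts) | set(ham_counts)


def log_likelihood(word, counts):
    """Smoothed log-probability of a word within one class (add-one smoothing)."""
    return math.log((counts[word] + 1) / (sum(counts.values()) + len(vocab)))


def spam_score(message):
    """Log-odds that a message is spam: above zero leans spam, below leans legitimate.

    Equal prior odds are assumed, and words never seen in training are ignored.
    """
    score = 0.0
    for word in message.lower().split():
        if word in vocab:
            score += log_likelihood(word, spam_counts) - log_likelihood(word, ham_counts)
    return score


plain = "free rolex offer"
poisoned = plain + " monkey teapot kitchen zoo friday meeting office photos"

print(spam_score(plain))     # positive: the message looks like spam
print(spam_score(poisoned))  # lower: the innocent-looking words dilute the score
```

In this sketch the padding only helps because those words have turned up in legitimate mail before. Words the filter has never seen are simply skipped, which is one reason spammers lean on ordinary dictionary words rather than gibberish.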

While these messages might not make much sense, they could still carry harmful payloads like virus-laden attachments or embedded images containing Trojans.

Bamboozling Bayesian filters also makes it harder for them to separate the wheat from the chaff in future. Today’s junk mail could be softening the filters up for tomorrow’s assaults.

Mass mailings also show spammers how many messages are likely to get through in future, since it’s often possible to determine whether a message has been delivered and read.

Considering that a response rate of just 0.000008 per cent can be enough to generate a profit for spammers, it’s no wonder they’re so determined to see if junk mail can evade spam filters.
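To put that percentage into perspective, the short Python sketch below converts it into whole numbers. The rate is the figure quoted above; the campaign size of 100 million messages is purely a hypothetical round number, and exact fractions are used to avoid rounding errors.

```python
from fractions import Fraction

# 0.000008 per cent, the response rate quoted above, held as an exact fraction.
RESPONSE_RATE_PERCENT = Fraction(8, 1_000_000)
response_rate = RESPONSE_RATE_PERCENT / 100


def expected_responses(messages_sent):
    """Average number of replies expected from a mailing of the given size."""
    return messages_sent * response_rate


# How many messages have to go out, on average, to earn a single response.
messages_per_response = 1 / response_rate

print(messages_per_response)            # 12500000
print(expected_responses(100_000_000))  # 8, for a hypothetical 100 million emails
```

In other words, one reply per 12.5 million messages. That only pays because sending each extra email costs next to nothing, which is exactly why getting past the filter matters so much.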

Neil Cumins


Neil is our resident tech expert. He's written guides on loads of broadband head-scratchers and is determined to solve all your technology problems!