Twitter has revealed its latest attempt to tackle spam messages, a new system called BotMaker.
Traditional machine learning methods for identifying spam are difficult to apply to real-time content, but the company claims BotMaker has cut spam by 40 per cent since its introduction.
In a blog post published this week, engineer Raghav Jeyaraman outlined the techniques used by BotMaker. The key is to split spam detection into real-time, near-real-time and batch jobs.
A tool called Scarecrow attempts to prevent spam messages before they are posted to Twitter by identifying problem account names or URLs, for example. Another tool, Sniper, searches messages for anything that might have been missed by Scarecrow, possibly because there wasn't enough time to analyse certain features. Batch jobs complete the system by analysing large quantities of offline data to discover long-term behaviour patterns that will assist the online models in the future.
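The tiered design described above can be sketched roughly as follows. This is only an illustration of the general pattern, not Twitter's actual code: the function names borrow the tool names from the blog post, but every rule and field shown here is hypothetical.

```python
# Illustrative sketch of a tiered spam pipeline. The stage names come from
# Twitter's description of BotMaker; all rules, thresholds and tweet fields
# are invented for the example.

def scarecrow(tweet: dict) -> str:
    """Real-time stage: cheap checks run in the write path,
    before the message is accepted at all."""
    if "suspicious-url.example" in tweet["text"]:  # hypothetical blocklist check
        return "reject"
    return "accept"

def sniper(tweet: dict) -> str:
    """Near-real-time stage: re-examines accepted messages using
    features that were too slow to compute in the write path."""
    if tweet["author_age_days"] < 1 and tweet["text"].count("http") > 2:
        return "remove"
    return "keep"

# A third, batch stage would periodically mine offline data for long-term
# behaviour patterns and feed updated rules/models back into the two
# online stages; it is omitted here.

def process(tweet: dict) -> str:
    if scarecrow(tweet) == "reject":
        return "rejected"       # never written to the site
    if sniper(tweet) == "remove":
        return "removed"        # written, then taken down shortly after
    return "published"

print(process({"text": "hello world", "author_age_days": 100}))  # published
```

Rejecting in the first stage is what the post means by catching spam "in the write path": the message is stopped before it is ever stored, rather than cleaned up afterwards.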
Jeyaraman added that BotMaker's ability to detect spam in the write path has been hugely beneficial for the microblogging site.
Removing spam is a major factor in keeping a user base happy, and Twitter's BotMaker has been a long time in the making.
The social network revealed that it was working with researchers from the University of California back in 2012 to develop a system to detect spambots. While it has not been confirmed that Twitter's latest spam defence emerged directly from this research, it seems likely to be connected in some way.
Last year, one of the researchers, Chris Grier, told Gigaom that the algorithm developed by the university could theoretically be turned into an online system to spot spam accounts in real time.