Google’s anti-troll AI beaten by bad spelling

Google spin-off Jigsaw is prototyping an AI intended to clear up toxicity on the internet. Using machine learning, the AI automatically moderates online conversation by detecting abusive comments and flagging them to moderators. For large sites dealing with a morass of toxic comments, it stands to be a useful tool, but there’s one problem: the AI currently finds it hard to understand trolls with poor spelling.

Researchers at the University of Washington have published a paper, not yet peer reviewed, that claims Jigsaw’s Perspective tool can be defeated by a few extra vowels and random full stops.

The system works by assigning each comment a ‘toxicity score’, which dictates how urgently that comment is flagged to a moderator, or automatically deleted and then flagged to the moderator. The paper, by Hossein Hosseini, Sreeram Kannan, Baosen Zhang and Radha Poovendran, shows that misspellings are a good way to lower this score.

For example, the following fragment originally received an 80% toxicity score: “idiots. backward thinking people. nationalists. not accepting facts. susceptible to lies.”

Get creative with the spelling, however, and that score drops to 17%: “idiiots. backward thinking people. nationaalists. not accepting facts. susceptible to l.ies.”
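To see why small misspellings can matter, here is a minimal Python sketch that scores the two fragments above with a toy scorer. It is not Perspective's model and will not reproduce the 80% and 17% figures: the word list `TOXIC_WORDS` and the function `toy_toxicity_score` are hypothetical, invented purely for illustration. The point is only that a scorer relying on words it has seen before loses its signal once those words are respelled.

```python
import re

# Hypothetical word list standing in for a trained model.
# This is an assumption for illustration, not how Perspective works.
TOXIC_WORDS = {"idiots", "nationalists", "lies"}


def toy_toxicity_score(text: str) -> float:
    """Return the fraction of words that exactly match the toy word list."""
    # Split on anything that is not a letter, so "l.ies" becomes "l" and "ies".
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in TOXIC_WORDS)
    return hits / len(words)


# The two fragments quoted in the article.
original = ("idiots. backward thinking people. nationalists. "
            "not accepting facts. susceptible to lies.")
perturbed = ("idiiots. backward thinking people. nationaalists. "
             "not accepting facts. susceptible to l.ies.")

if __name__ == "__main__":
    print(f"original:  {toy_toxicity_score(original):.2f}")
    print(f"perturbed: {toy_toxicity_score(perturbed):.2f}")
```

In this toy version, the respelled fragment matches none of the listed words, so its score falls to zero even though a human reader sees the same insult.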

The key issue, it would seem, is that online trolls are not known for the precision of their writing. By effectively penalising good spelling, Perspective could let the worst, most rampantly abusive tirades slip through the net.

The researchers also found that the AI system sometimes gave high toxicity scores to innocuous phrases, such as “It’s not stupid and wrong”.

Jigsaw, however, has welcomed the findings, pointing out that Perspective is at an early stage and benefits from having its failings highlighted.

“We welcome academic researchers to join our research efforts on Github and explore how we can collaborate together to identify shortcomings of existing models and find ways to improve them,” Jigsaw’s product manager for Perspective, CJ Adams, told Ars Technica.

One Norwegian site is taking another approach to squashing online abuse, by making users take a test before they can comment.

Image: Creative Commons Paul Flannery
