AI taught to recognise hate speech to fight Twitter trolls
Keyboard-bashing trolls are the norm in the Twitterverse, and the social network hasn’t been doing a fab job of controlling them. So much so that one third-party creative agency, Possible, has thought it necessary to step in and turn hate speech into something good, launching a campaign called We Counter Hate.
Partnering with Spredfast, Possible has been able to train an AI to recognise hateful tweets. When the AI detects a hateful message, a human moderator will send the troll a counter-tweet, alerting them that their tweet has been countered by karmic goodness. For every retweet We Counter Hate gets, Possible, with donations from the public, will make a $1 contribution to the non-profit organisation Life After Hate.
Talking to VentureBeat, Possible explained how the team trained its bot to recognise hate speech by adapting Gregory Stanton’s Ten Stages of Genocide, condensing his document down to only the bits relevant to Twitter, from tweets that are intended to dehumanise to tweets that are intended to polarise communities.
Working with Spredfast to help moderate incoming messages, the AI categorises these tweets into the different streams of hate speech. Possible then feeds these through its system so that it can learn the linguistic nuances of speech. Once it has been trained, it flags messages for human moderators, who step in to identify what’s really hate speech and what the system might simply have misclassified.
(Above: An example of @we_counter_hate in action)
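To make the flag-then-review flow described above a little more concrete, here is a minimal Python sketch. It is not Possible's or Spredfast's system: the trained model is swapped for a naive phrase match, and every name in it (CATEGORY_CUES, categorise, build_review_queue, moderate) and every cue phrase is a hypothetical stand-in. The two category labels are borrowed from the examples Possible gave.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical cue phrases standing in for a trained classifier.
# The real system learns linguistic nuance; this only matches substrings.
CATEGORY_CUES = {
    "dehumanise": ("vermin", "not really human"),
    "polarise": ("us or them", "pick a side"),
}


@dataclass
class Flag:
    tweet: str
    category: str


def categorise(tweet: str) -> Optional[str]:
    """Return the first category whose cue appears in the tweet, else None."""
    text = tweet.lower()
    for category, cues in CATEGORY_CUES.items():
        if any(cue in text for cue in cues):
            return category
    return None


def build_review_queue(tweets: List[str]) -> List[Flag]:
    """Machine step: flag candidate tweets for human review."""
    queue = []
    for tweet in tweets:
        category = categorise(tweet)
        if category is not None:
            queue.append(Flag(tweet, category))
    return queue


def moderate(queue: List[Flag], human_confirms: Callable[[Flag], bool]) -> List[Flag]:
    """Human step: keep only the flags a moderator confirms as hate speech."""
    return [flag for flag in queue if human_confirms(flag)]


if __name__ == "__main__":
    tweets = [
        "Those people are vermin",
        "Lovely weather today",
        "My kitchen has vermin, help",  # a false positive for the matcher
    ]
    queue = build_review_queue(tweets)
    # Stand-in for a moderator's judgement call.
    confirmed = moderate(queue, lambda flag: "kitchen" not in flag.tweet)
    for flag in confirmed:
        print(flag.category, "->", flag.tweet)
```

The third tweet shows why the human step matters: the matcher flags it, and only the moderator's check keeps a pest-control complaint from earning a counter-tweet.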
Combatting trolls on their favourite platform? It’s a nice idea. Except…there are a few major flaws.
The problems with using bots to fight trolls
Trolls love to ruin inherently good causes. I can’t help but be reminded of a painful series of events involving actor Shia LaBeouf. In an online art performance called He Will Not Divide Us, LaBeouf attempted to launch a 24-hour stream protesting Donald Trump’s inauguration. It was supposed to run for Trump’s entire term in office, but its home at the Museum of the Moving Image lasted barely three weeks following a large-scale troll assault.
We Counter Hate seems primed to replicate LaBeouf’s recipe for failure. It’s not difficult to imagine scenarios in which trolls continuously find ways to break the bot.
Secondly, doesn’t the campaign seem like it’s, in a perverse way, almost incentivising hate speech? If you want to see more of the countering tweets, hate-filled tweets need to be posted on the social network in the first place. People might just end up posting hate speech in order to trigger the bot.
While campaigns like this are, in theory, a good thing, my scepticism tells me this will all end in tears for the users. The thing is, there’s no single clear solution to tackling online trolls. The problems arguably stem from the systems of social networking themselves, something the likes of Twitter and Facebook often tie themselves up in knots trying to unpick. A bot that ultimately works in favour of marketing one creative agency is unlikely to solve the problem on its own.