Web blocking: 12 Symantec staff review 2,000 sites a day
A team of a dozen people at Symantec is responsible for deciding whether thousands of sites should be off-limits to UK internet users behind parental-control filters.
Symantec creates blacklists of sites, categorising content such as porn, gambling or hate speech. It then passes that list to ISPs and mobile operators, which choose categories they want to block and apply their own filters accordingly.
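As a rough illustration of that split between the list-maker and the network, the sketch below shows how an operator might apply its own choice of categories to a supplied list. It is not Symantec's or any ISP's actual system: the hostnames, the `CATEGORY_LIST` mapping and the `is_blocked` helper are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical vendor-supplied list: hostname -> category.
# Real lists are far larger and are not published in this form.
CATEGORY_LIST = {
    "casino.example": "gambling",
    "adult.example": "porn",
    "news.example": "news",
}


def is_blocked(url, blocked_categories, category_list=CATEGORY_LIST):
    """Return True if the URL's host falls in a category the operator blocks."""
    host = urlsplit(url).hostname
    category = category_list.get(host)  # None if the site is uncategorised
    return category in blocked_categories


# Each operator picks its own categories from the same list (assumed choices).
mobile_operator = {"porn", "gambling", "hate"}
broadband_filter = {"porn"}

print(is_blocked("http://casino.example/poker", mobile_operator))
print(is_blocked("http://casino.example/poker", broadband_filter))
```

In this sketch the same site can be blocked on one network and reachable on another, because the category is set once by the list-maker while the blocking decision is made by each operator.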
O2, Vodafone and EE all use Symantec’s tools and block content on mobile by default. Sky also uses Symantec’s categorisation tool for its new network-level, parental-control filter, Broadband Shield.
Sky’s filter was rolled out in December, after the government pressured the UK’s major ISPs to introduce stricter controls for content unsuitable for children. TalkTalk and BT have already introduced network-level filters, with Virgin’s to follow shortly.
Now that the filters are going live, reports have shown how they miscategorise and unfairly block sites.
Detailing how its categorisation system works to PC Pro, Symantec revealed that a team of a dozen staff around the world manually categorises around 2,000 sites for filtering each day – or an average of about 21 an hour per person on an eight-hour shift.
That’s part of a much larger automated process that maps up to 20,000 sites every day.
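The arithmetic behind those figures can be checked with a short calculation. It assumes the workload is spread evenly across all 12 reviewers and that each works a full eight-hour shift every day, which Symantec did not confirm.

```python
# Figures quoted in the article
SITES_REVIEWED_PER_DAY = 2_000  # sites categorised manually each day
REVIEWERS = 12                  # "a team of a dozen"
SHIFT_HOURS = 8                 # eight-hour shift
SITES_MAPPED_PER_DAY = 20_000   # upper bound for the automated process

# Assumption: the work is shared evenly and everyone works every day.
per_reviewer_per_day = SITES_REVIEWED_PER_DAY / REVIEWERS
per_reviewer_per_hour = per_reviewer_per_day / SHIFT_HOURS
seconds_per_site = 3600 / per_reviewer_per_hour

# Share of the daily total that gets a human look, at the 20,000 upper bound.
manual_share = SITES_REVIEWED_PER_DAY / SITES_MAPPED_PER_DAY

print(f"{per_reviewer_per_day:.0f} sites per reviewer per day")
print(f"{per_reviewer_per_hour:.0f} sites per reviewer per hour")
print(f"{seconds_per_site:.0f} seconds per site")
print(f"{manual_share:.0%} of mapped sites reviewed by hand")
```

On those assumptions each reviewer has a little under three minutes per site, and roughly one in ten of the sites mapped on a peak day is checked by a person.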
The staff make judgements on ambiguous content that can’t be easily categorised via the automated system, such as sex-education sites, illegal content, and hate speech.
“The technology will ‘crawl’ the sites but also for many categories [or] sites the team of a dozen around the world checks the content themselves,” a spokesperson said.
Symantec told PC Pro that staff are responsible for deciding what goes into the “hate” category – such as racist sites – because the decision requires human judgement. “Sites in this category are not categorised by machine, only human eyes can categorise hate as hate,” the spokesperson said.
Those 12 staff don’t only cover the UK, but categorise sites across 20 different languages, Symantec said.
Symantec said the team must make “objective decisions” and that it provides guidelines to help, adding that the team was “extensively trained” in how to apply the definitions.
The process is similar to that used by McAfee, which also creates a list of sites to be used in filtering systems – and previously told PC Pro the job is “popular” among students.
As the ISP filters go live, there have been numerous reports demonstrating that the controls can block legitimate sites, a problem known as “overblocking”. That could mean teenagers, as well as being unable to access adult content, might not be able to visit sex-education sites.
Sky blocked the TorrentFreak news website last month and knocked out numerous sites around the web when it accidentally classified the jQuery site as malware.
Though both errors were rectified, the reports highlighted the murky process by which ISPs, providers and security companies decide what to block.
Symantec told PC Pro that accidental blocking might happen when its automated tool puts sites in the wrong category.
“[Most] of the time, the team finds the error and changes the category, but in some cases the site owner or customer will find the individual site and request a recategorisation,” the spokesperson said.
It isn’t clear what happens if a site has been miscategorised on the basis of a decision made by one of Symantec’s staffers, or what happens if Symantec refuses to recategorise a site.
Virgin, BT, TalkTalk and Sky are all currently part of a working group to tackle the issue of overblocking, working on a “master list” of sites – such as sex-education charities – to exclude from their filters.