Even algospeak won’t save us from upload filter overblocking

Over on the EFF blog, Cory Doctorow points to an interesting article in the Washington Post about “algospeak”:

“Algospeak” is becoming increasingly common across the Internet as people seek to bypass content moderation filters on social media platforms such as TikTok, YouTube, Instagram and Twitch.

Algospeak refers to code words or turns of phrase users have adopted in an effort to create a brand-safe lexicon that will avoid getting their posts removed or down-ranked by content moderation systems. For instance, in many online videos, it’s common to say “unalive” rather than “dead,” “SA” instead of “sexual assault,” or “spicy eggplant” instead of “vibrator.”

It is not just words and phrases that platform filters block. As readers of this blog know only too well, such filters in the EU and elsewhere will increasingly block what they regard as unauthorised uses of copyright material. This means that it is the detailed coding of the algorithms embedded in upload filters that determines what can be put online, not what is laid down in the law. In her Walled Culture interview, the EFF’s Katharine Trendacosta explained the chilling effect that YouTube’s Content ID algorithms have already had on the quality of user uploads to YouTube:

if it’s a video to teach you how to do something, or to teach you a film technique, they’re not going to use the best example, they’re going to use the one that Content ID doesn’t take down. So you’re actually as a user and a viewer getting a lesser product, because you are being given only what can make it pass the filter, not what those people actually have a right to use.

The same will be true in the EU. Regardless of what the EU politicians thought they were laying down with Article 17’s upload filters, what matters for the hundreds of millions of people who use sites like YouTube or Facebook is what those algorithms permit.

In one important respect, upload filtering is worse than content moderation. The latter can be circumvented by the use of algospeak that constantly evolves to stay one step ahead of the filtering. Although people might experiment in order to establish what kind of copyright material would be blocked by upload filters, that doesn’t answer the question of whether it should be blocked under the relevant law. Only the courts can decide complex copyright cases, which filters cannot grasp and never will. This will inevitably be a long and expensive process, which will discourage most people from even trying.

Featured image by Fructibus.

Follow me @glynmoody on Twitter, Diaspora, or Mastodon.