European Parliament sabotages the AI Act by failing to recognise that the right to read is the right to train

Walled Culture recently wrote about an unrealistic French legislative proposal that would require the listing of all the authors of material used for training generative AI systems. Unfortunately, the European Parliament has inserted a similarly impossible idea in its text for the upcoming Artificial Intelligence (AI) Act. The DisCo blog explains that MEPs added new copyright requirements to the Commission’s original proposal:

These requirements would oblige AI developers to disclose a summary of all copyrighted material used to train their AI systems. Burdensome and impractical are the right words to describe the proposed rules.

In some cases it would basically come down to providing a summary of half the internet.

Leaving aside the impossibly large volume of material that might need to be summarised, another issue is that it is by no means clear when something is under copyright, making compliance even more infeasible. In any case, as the DisCo post rightly points out, the EU Copyright Directive already provides a legal framework that addresses the issue of training AI systems:

The existing European copyright rules are very simple: developers can copy and analyse vast quantities of data from the internet, as long as the data is publicly available and rights holders do not object to this kind of use. So, rights holders already have the power to decide whether AI developers can use their content or not.

This is a classic case of the copyright industry always wanting more, no matter how much it gets. When the EU Copyright Directive was under discussion, many argued that an EU-wide copyright exception for text and data mining (TDM) and AI in the form of machine learning would be hugely beneficial for the economy and society. But as usual, the copyright world insisted on its right to double dip, and to be paid again if copyright materials were used for mining or machine learning, even if a licence had already been obtained to access the material.

As I wrote in a column five years ago, that’s ridiculous, because the right to read is the right to mine. Updated for our AI world, that can be rephrased as “the right to read is the right to train”. By failing to recognise that, the European Parliament has sabotaged its own AI Act. Its amendment to the text will make it far harder for AI companies to thrive in the EU, which will inevitably encourage them to set up shop elsewhere.

If the final text of the AI Act still has this requirement to provide a summary of all copyright material that is used for training, I predict that the EU will become a backwater for AI. That would be a huge loss for the region, because generative AI is widely expected to be one of the most dynamic and important new tech sectors. If that happens, backward-looking copyright dogma will once again have throttled a promising digital future, just as it has done so often in the recent past.

Featured image modified from Wikimedia.

Follow me @glynmoody on Mastodon.

Cookie Consent with Real Cookie Banner