Earlier this week, Dr Emily Dolson, Assistant Professor at Michigan State University, posted the tweet shown above. It shows a warning that one of Dr Dolson’s files violates Google Drive’s Copyright Infringement policy, and that some features related to the file “may have been restricted”. As she tweeted, the file contains a single line with the number “1”. Following up on this, another Twitter user, Chris Jefferson, tried generating files containing the numbers -1000 to 1000, and found that “Google also hates 500, 174, 833, 285 and 302”, as well as the numbers 186, 451, 336, 173 and 266. He then decided to delete the experimental files, in case the system regarded them as such a flagrant copyright violation that it shut down his account.
Of course, numbers can’t be copyrighted, at least in this form. Every digital file is, in fact, a single number written in binary, so copyrighting a file effectively means copyrighting that number. The difference is that the (very) long binary number representing a creative work contains within it some claimed originality – the key test for whether it can be copyrighted. The number “1” obviously has nothing original about it.
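To make the "every file is a number" point concrete, here is a minimal Python sketch that reads a file's bytes as one big-endian integer and converts it back. The helper names `file_to_int` and `int_to_file` are invented for this illustration, and the two-byte content `b"1\n"` is only an assumption about what a one-line file holding "1" might look like on disk; nothing here reflects how Google Drive actually inspects files.

```python
def file_to_int(data: bytes) -> int:
    """Interpret the raw bytes of a file as a single big-endian integer."""
    return int.from_bytes(data, byteorder="big")


def int_to_file(number: int, length: int) -> bytes:
    """Turn the integer back into bytes; the length is needed because
    leading zero bytes are not preserved by the integer alone."""
    return number.to_bytes(length, byteorder="big")


# Assumed content of a one-line text file holding the digit 1.
content = b"1\n"

number = file_to_int(content)
print(number)                             # 12554
print(bin(number))                        # 0b11000100001010
print(int_to_file(number, len(content)))  # b'1\n'
```

The same conversion applied to a film or a novel simply produces a far longer number; originality, if there is any, lies in the work that number encodes.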
Replying to Dr Dolson, people offered a number of theories as to why Google Drive flagged up the file as infringing, and it’s worth reading the thread if you are interested in what happened. A day after the initial post, the Google Drive team wrote on Twitter that it was “very much aware of this now”, and a few hours later it tweeted: “The Drive team has identified the issue, and this shouldn’t happen for new files from this point forward. We’re currently working on unblocking ones that have been incorrectly flagged.”
But the fix is not really the important issue here. What matters is that an automated system was able to treat the single digit “1” as not just copyrightable, but infringing. This underlines a key point about the fallibility of automated copyright checks. The present case was an obvious and trivial error – to humans, at least. In the future we can expect millions of erroneous accusations of copyright infringement made by automated systems that are completely unable to take into account complex aspects like fair use, parody and quotation. The copyright industry’s push around the world for laws requiring upload filters – successful, in the case of the EU’s Directive on Copyright in the Digital Single Market – is a disaster waiting to happen.
There’s one other aspect worth noting about the Google Drive blunder. As the image shows, Dr Dolson was informed “A review cannot be requested for this restriction”. Not only do we have a system that is incapable of making the correct call about copyright infringement for even the simplest case, there is also no way to appeal against its errors. In the present example, the Google Drive team were shamed into action because of the mockery on Twitter. Millions of people whose files will be blocked unjustly by flawed upload filters may not be so lucky, and could be forced to accept copyright injustice whether they like it or not.
Featured image by Dr Emily Dolson.