When it comes to figuring out what fair use is and isn’t in the age of artificial intelligence (AI), we currently have more questions than answers. It’s challenging our view of what fair use is, and really, it’s no surprise. Whenever there’s a major technological breakthrough or change, copyright and other intellectual property laws often come into question (remember the peer-to-peer file-sharing network Napster?).
But while we wait for the US Copyright Office’s AI Initiative to wrap up and clarify things, we can explore current theories and cases surrounding the issue. So, in this article, we’ll look at the impact of AI on fair use in today’s landscape.
Under Section 107 of the Copyright Act, fair use permits people to use copyrighted materials under certain circumstances (the statute's examples include teaching and news reporting). However, navigating fair use can be complex.
Depending on the situation, there are four general standards to determine if a work falls into this category:

1. The purpose and character of the use, including whether it is commercial or nonprofit educational in nature (and whether it is transformative)
2. The nature of the copyrighted work
3. The amount and substantiality of the portion used in relation to the work as a whole
4. The effect of the use upon the potential market for, or value of, the copyrighted work
So, where does AI fit in? Well, there are a few theories.
While we'll need to wait for various lawsuits to settle and new precedents to be set before we have a clear answer, that hasn't stopped theories from circulating about whether or not AI is fair use. Let's look at some arguments on both sides of the coin.
One of the arguments in favor of AI being fair use is how it works. See, the point of generative AI isn’t to replicate copyrighted materials but to gather information from them and use it to create something new. It does this through statistical modeling, where it figures out which words or images typically go next to each other and which do not.
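To make that idea concrete, here is a toy sketch of next-word prediction from co-occurrence counts. This is purely illustrative (the corpus and function names are invented for this example), and real generative models are vastly more sophisticated, but the underlying principle of modeling which words tend to follow which is similar:

```python
from collections import Counter, defaultdict

# Toy corpus; a real model would train on billions of words.
corpus = "fair use allows the use of copyrighted materials in fair ways".split()

# Count how often each word follows each other word (bigram counts).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def most_likely_next(word):
    """Return the word seen most often after `word`, or None if unseen."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(most_likely_next("copyrighted"))  # prints "materials"
```

The point of the sketch: the model stores statistical patterns about the text, not the text itself as a retrievable work, which is the crux of the transformative-use argument.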
As long as the result is transformative enough to satisfy the first standard of Section 107 of the Copyright Act, it may be appropriate to consider it fair use, and the same argument extends to training AI models.
The idea that training AI models is fair use is an important part of OpenAI's defense in the recent lawsuit brought against it by the New York Times, and there may be some precedent here. In Authors Guild, Inc. v. Google, Inc., the court found that Google's copying of copyrighted books to make them searchable online was transformative enough to qualify as fair use.
An article published by the University of California, Berkeley Library, suggests that in text data mining (TDM) practices, fair use applies to copying and mining copyrighted works — especially when using them to create something new that doesn't "re-express the underlying works to the public in a way that could supplant the market for the originals."
Therefore, the argument is that training AI should fall under fair use as well.
Further, the article emphasizes that “training of AI models by using copyright-protected inputs falls squarely within what courts have determined to be a transformative fair use, especially when that training is for nonprofit educational or research purposes.” (source: UC Berkeley Library)
As for the arguments against considering AI fair use, let's look at the other side of the New York Times v. OpenAI lawsuit. One of the plaintiff's most relevant claims is that, in training its AI, OpenAI copied copyrighted material from the New York Times' servers to its own servers without permission, violating Section 106 of US copyright law.
The Times argues that this practice should be considered copyright infringement rather than fair use because it implicates the fourth standard: the effect on the market for its content. Had OpenAI not scraped the material itself, it would have had to license it from the New York Times. The Times claims that missing out on this licensing revenue harms the marketability of its articles, so the use isn't fair.
This attempt to protect intellectual property echoes the concerns of many publishers regarding AI use. In fact, in the education sector, some publishers require libraries to sign content license agreements preventing them from using their content to train AI tools, and others charge additional fees to do so.
In the case of generative AI, there is also some precedent for a use not being fair. In Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, the Supreme Court found that licensing a Warhol work based on Lynn Goldsmith's photograph of Prince was not fair use. So, if generative AI produces something similar, does that mean it also doesn't count as fair use?
Yes, whether or not the courts find that AI training is fair use can have important implications for AI companies and their users. For example, professionals at the UC Berkeley Library state that if AI training is not fair use, it would:
And this is just for research. If the New York Times wins its lawsuit, OpenAI may need to remove all New York Times material from its large language models. Other publications could then follow suit, eventually leaving ChatGPT as little more than a shell of its former self.
In its short time in the spotlight, artificial intelligence is already impacting how we approach research, training, careers, and hobbies, mostly to our benefit. But it’s also challenging our understanding of what fair use is, making it difficult for both AI companies and content creators to navigate this aspect of intellectual property laws.
Until the courts clarify the relationship between AI and fair use, content creators and publishers may want to protect themselves from possible copyright infringement when using AI tools. For example, running generative AI output through an AI detector or plagiarism checker can help you identify AI-written passages and review whether they duplicate content from an online publication.
It will be interesting to see how AI shapes fair use law in the United States over the next few years. Hopefully, the benefits of the courts' decisions will outweigh the costs.