As noted in both opinions, courts consider and weigh four factors identified in the Copyright Act to determine whether use of a copyrighted work constitutes fair use.
Two groundbreaking decisions from the Northern District of California—Kadrey v. Meta Platforms, Inc. and Bartz v. Anthropic PBC—shed light on how courts are approaching the use of copyrighted materials in training large language models (LLMs). Both cases involved authors alleging copyright infringement based on the use of their books to train generative AI models, and both courts held that use of the copyrighted materials to train the AI models was transformative. The court in Anthropic held, however, that copying pirated books constitutes copyright infringement and the transformative nature of the use did not rescue such infringement. Conversely, the Meta court held that copying from pirate sites to train AI is fair use, but only because the plaintiffs failed to submit evidence of market harm, which the court believed to be the most relevant factor. As such, while use of copyrighted works to train AI may be fair use, copying works without permission carries the risk of infringement.
As noted in both opinions, courts consider and weigh four factors identified in the Copyright Act to determine whether use of a copyrighted work constitutes fair use. The factors include:
(1) [T]he purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work.
The first factor in particular looks at whether the use is sufficiently transformative, namely whether and to what extent the new work merely supersedes the objects of the original creation (supplanting the original), or instead adds something new, with a further purpose or different character. The factors work in tandem—and the more transformative, generally the less likely there is to be an impact on the market for the original, though not always.
In Anthropic, Judge William Alsup held that using copyrighted materials to train LLMs was fair use as a matter of law. He emphasized that the purpose and character of using works to train LLMs was transformative and even went as far as to say that “the technology at issue was among the most transformative many of us will see in our lifetimes.” The court also deemed as fair use the scanning and digitizing of books from print copies that Anthropic had purchased. However, Anthropic’s use of digital books copied from pirated websites was found to be infringing and not fair use. The order states that copying a book from a pirate website to create a central library that can be used for a various, unspecified purposes is not transformative. He further held that this infringement cannot be cured after the fact by buying a copy of a book that the defendant previously pirated. The order did not decide whether use of pirated books downloaded solely for training AI and then immediately deleted would be fair use. However, in dicta the court stated:
There is no decision holding or requiring that pirating a book that could have been bought at a bookstore was reasonably necessary to writing a book review, conducting research on facts in the book, or creating an LLM. Such piracy of otherwise available copies is inherently, irredeemably infringing even if the pirated copies are immediately used for the transformative use and immediately discarded.
The following is a high-level summary of the court’s finding on each fair use factor:
|
Training LLM |
Purchased Digitized Books |
Pirated Books |
Purpose and Character |
Quintessentially transformative—favors fair use |
Format change from print to digital is transformative—favors fair use. |
Pirating books to create central library is not transformative—favors authors. |
Nature of Work |
All books were published but contained expressive elements for which they were chosen—favors authors. |
||
Amount and Substantiality |
Billions of words needed to train model so use was reasonably necessary to the transformative use—favors fair use. |
Copying entire work necessary to convert to digital and delete hard copy—favors fair use. |
Almost any unauthorized copying to create a library is too much—favors authors. |
Market Effect |
No market effect for training LLM; potential market effect for emerging market for licensing data—but factor favors fair use. |
Format change does not usurp authors’ entitlement—favors fair use. |
Copies from pirated sources that could have been bought—favors authors. |
Holding |
Fair Use |
Fair Use |
Infringing |
In Meta, Judge Vince Chhabria also found that the purpose and character of use of the copied books was highly transformative and granted summary judgment in favor of Meta, specifically finding that its use of copyrighted books to train its Llama models constituted fair use. The court highlighted that the plaintiffs failed to present concrete evidence of market harm, particularly regarding whether Meta’s AI outputs diluted demand for their books. Though acknowledging the risks generative AI poses to creative markets, the court underscored that rulings must rest on evidence—not hypotheticals. Unlike Anthropic, the Meta court did not find that pirating of books was infringing. Instead, the court held that the record was insufficient to definitively rule that the fair use defense did not apply to Meta’s piracy of the authors’ books. The court made it clear:
[T]his ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful. It stands only for the proposition that these plaintiffs made the wrong arguments and failed to develop a record in support of the right one.
The following is a high-level summary of the court’s finding on each fair use factor:
Purpose and Character |
Highly transformative, commercial use is not as important where secondary use is highly transformative—favors fair use. |
Nature of Work |
Books are highly expressive—favors authors. |
Amount and Substantiality |
Copying whole book is reasonably necessary for transformative purpose of training LLM—favors fair use. |
Market Effect |
Meta presented evidence of no market harm and plaintiffs failed to present any evidence of harm sufficient to create a question of fact—favors fair use. |
Holding |
Downloading and Training Fair Use |
While the courts agreed that LLM training can be transformative and thus permitted fair use, they diverged on how to weigh the fair use factors, particularly with regard to market harm. Anthropic treated transformative use as nearly dispositive. The court downplayed concerns about market displacement as analogous to saying that training schoolchildren to write well would result in an explosion of competing works. In contrast, Meta stated that market harm was the most important factor in the fair use analysis, suggesting that dilutive effect may be highly relevant. The court was also critical of the Anthropic court’s analogy between training schoolchildren and training an LLM, which it says is “blowing off” the most important fair use factor. It noted that teaching children to write is not like using books to create a product that allows a single individual to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take. However, ultimately the court held that Meta presented evidence that there was no market harm and plaintiffs both failed to raise dilutive effect as a theory of harm and failed to present any empirical evidence to support the two alternative theories of harm presented to the court. Anthropic also treated the creation of a central library as a separate use case for the fair use analysis, stating that the fact that only some of the books were used for training meant that the creation of the library was not transformative. In Meta, although the court also acknowledged that downloading is a different use from any copying done in the course of training the LLM, it held that “downloading must still be considered in light of it ultimate, highly transformative purpose: training [the LLM].” The Meta court found that the downloading was transformative, even if some copies were not ultimately used for training.
Neither case referenced the preliminary report issued by the Copyright Office on May 9, 2025, which suggested that AI training will fall along a spectrum with some training more transformative than others depending on the “functionality of the model and how it is deployed” and raising the possibility of market harm, which will require fact-specific analysis.
The implications of these Northern District decisions are twofold. First, companies leveraging copyrighted material for LLM training should proceed with caution. Although the Meta court found fair use, it was clear that its decision was limited to the specific legal arguments and evidence (or lack thereof) in that particular case. Further, the Anthropic decision held that use of pirated books to create a central library for training AI is infringement and not fair use.
Second, the cases reaffirm that fair use remains a highly fact-specific and unpredictable doctrine in copyright law. Indeed, the Meta court made a point of stating that its order was limited to the facts and evidence of this particular case, and that in most other cases, plaintiffs are likely to often win against similar use cases. Accordingly, companies should carefully assess the risks before relying on fair use as a defense, particularly in novel or evolving contexts.
For More Information
If you have any questions about this Alert, please contact Michelle Hon Donovan, Mark Lerner, Jennifer Lantz, Meghan C. Killian, Alanna B. Newman, any of the attorneys in our Intellectual Property Practice Group, any of the attorneys in our Artificial Intelligence Group or the attorney in the firm with whom you are regularly in contact.
Disclaimer: This Alert has been prepared and published for informational purposes only and is not offered, nor should be construed, as legal advice. For more information, please see the firm's full disclaimer.