In the first substantive decision regarding whether use of copyrighted works to train an artificial intelligence (“AI”) tool constitutes fair use under copyright law, the U.S. District Court for the District of Delaware in Thomson Reuters Enterprise Centre GmbH v. Ross Intelligence Inc. recently granted summary judgment to Thomson Reuters Enterprise Centre GmbH (“Thomson Reuters”), finding that Ross Intelligence Inc.’s (“Ross’s”) use of certain Thomson Reuters Westlaw headnotes to train its AI model constituted copyright infringement and rejecting Ross’s fair use defense.
In this alert, we summarize the Delaware District Court’s findings on fair use and offer initial thoughts on what this fair use opinion could mean for the pending copyright infringement lawsuits involving the unauthorized training of generative AI models with copyrighted materials.
Background
Thomson Reuters owns Westlaw, a legal research platform that contains, among other things, case law, statutes, and regulations. Westlaw also contains headnotes, which consist of copyrighted editorial content created by Thomson Reuters that summarizes key points of law and case holdings.
Ross created a competing legal research platform that uses AI. Ross initially asked Westlaw to license Westlaw’s content but Thomson Reuters refused because Ross was a competitor. Ross then hired LegalEase Solutions (“LegalEase”) to obtain training data for its AI platform. Ross instructed LegalEase to create bulk memos with legal questions and answers using Westlaw’s headnotes and used these bulk memos to train its AI search tool. Thomson Reuters sued Ross for copyright infringement when it discovered this use of its Westlaw headnotes.
The Ruling
The court first determined that Ross committed direct copyright infringement, finding that Westlaw headnotes were copyrightable material and that Ross copied original elements of these headnotes when it trained its AI model. On the question of copying, the court compared Westlaw headnotes to the bulk memos Ross used to train its AI model, finding that Ross actually copied 2,243 of Thomson Reuters’s headnotes and that the bulk memos were substantially similar to these headnotes.
The court then rejected all of Ross’s defenses to copyright infringement, including the affirmative defense of fair use, considering each of the statutory fair use factors set forth in the Copyright Act: (1) the purpose and character of the use; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used; and (4) the effect of the use on the potential market for or value of the copyrighted work. 17 U.S.C. § 107.
With respect to factor one, the court found in favor of Thomson Reuters. The court determined that the purpose and character of Ross’s use of Westlaw headnotes to train its AI model was solely commercial and not transformative because Ross used the Thomson Reuters headnotes to create a legal research tool that directly competes with Westlaw. The court cited the Supreme Court’s decision in Andy Warhol Found. for the Visual Arts, Inc. v. Goldsmith, 598 U.S. 508, 529-31 (2023), finding that Ross’s use is not transformative because it does not have a “further purpose or different character” from Thomson Reuters’s. Id. at 529. The court noted, however, that Ross’s AI model is not generative AI (AI that writes new content itself); rather, when a user enters a legal question, Ross’s search engine spits out judicial opinions that have already been published.
Importantly, the court rejected Ross’s argument that its use was transformative because the headnotes themselves do not appear as part of the final product, but instead were copied in an intermediate step. The court distinguished other cases permitting intermediate copying under fair use factor one, including Google LLC v. Oracle Am., Inc., 593 U.S. 1, 24 (2021) and Sony Comput. Ent., Inc. v. Connectix Corp., 203 F.3d 596, 599, 606-607 (9th Cir. 2000). In those computer code cases the copying was necessary to innovate, whereas here Ross did not need to copy Westlaw headnotes to achieve its new purposes.
While the court then found that fair use factors two and three favored Ross, the court found in favor of Thomson Reuters for factor four, the effect of Ross’s copying on the potential market for or value of the copyrighted work, noting that this factor is the single most important element of fair use. The court held that Ross’s product was created to compete with Westlaw by developing a market substitute and that it did not matter whether Thomson Reuters actually licensed its headnotes to others or used the data to train its own legal tools, because the effect on the potential market for AI training was enough. Specifically, the court held that the market in question was not only “legal-research platforms” but the potential derivative market of “data to train legal AI models.” Significantly, the court was not swayed by Ross’s argument that the public benefit of its product outweighed its unauthorized use of Thomson Reuters materials because “there is nothing that Thomson Reuters created that Ross could not have created for itself or hired LegalEase to create for it without infringing Thomson Reuters’s copyrights.” Balancing all four fair use factors, the court rejected Ross’s fair use defense, granting summary judgment to Thomson Reuters on this issue.
What’s Next
While not a decision involving generative AI, the Delaware District Court’s decision on fair use in Thomson Reuters offers some insight as to how courts might apply the four fair use factors in pending generative AI copyright infringement cases, particularly with respect to fair use factors one and four. With respect to factor one, the purpose and character of the use, generative AI defendants will likely argue that, unlike Ross’s non-generative AI model, which directly competed with Westlaw, they use copyrighted material in training to create an entirely new product whose outputs do not compete with the materials on which it was trained (although the extent of this competition will likely vary depending on the particular material used to train each model and the specific similarity and uses of the resulting output). While AI defendants may argue that the copying of materials without authorization was absolutely necessary and therefore transformative, plaintiffs in these cases will cite the Thomson Reuters court’s reasoning that the “intermediate copying” cases of Google and Sony Comput. are limited to the computer-programming context, where there is a need to copy expression in order to reach underlying ideas.
The Thomson Reuters court’s decision on the fourth fair use factor, the effect of the use upon the potential market for or value of the copyrighted work, will also likely be relied upon by plaintiffs in generative AI copyright infringement cases to support their argument that by using copyrighted material without permission or compensation, the defendants deprived copyright owners of the ability to license this content for AI training. These markets are no longer “potential” markets, as many publishers and content creators have signed deals with generative AI companies to license their data, including Reuters, which signed an AI deal to license its fact-based news content to Meta last October. Nevertheless, plaintiffs and defendants in pending copyright infringement lawsuits involving the use of copyrighted material for generative AI training will continue to battle over the fourth fair use factor, because whether a new generative AI product acts as a market substitute for the works that were copied to train it depends on the specific facts of each case.
Source – JD Supra
https://www.jdsupra.com/legalnews/ai-training-using-copyrighted-works-5212221/