On October 1, 2024, Christopher Farnsworth, a bestselling fiction author known for his novels Blood Oath, Flashmob, The Eternal World, and The President's Vampire, filed a class action lawsuit against Meta Platforms Inc. in the U.S. District Court for the Northern District of California. The action is primarily based on Section 501 of the Copyright Act of 1976 (17 U.S.C. § 501), which provides a cause of action for infringement of the exclusive rights of reproduction and distribution granted under Section 106 of the same statute.
The complaint alleges that Meta unlawfully copied and reproduced nearly 200,000 copyright-protected books through a dataset known as "Books3," part of a larger database called "The Pile." Books3 consists of pirated books obtained from "Bibliotik," a repository widely recognized for hosting unlicensed content. In a research paper dated February 27, 2023, Meta acknowledged using Books3 to train its Llama 1 model; the complaint alleges that Meta did so with full knowledge that these works were pirated.
The plaintiff underscores the irony of Meta's position: in March 2023, Meta itself invoked the DMCA takedown process to protect its rights over Llama 1, a model it had allegedly developed in violation of third parties' copyright rights.
The lawsuit highlights the economic harm suffered by authors, deprived not only of book sales but also of potential licensing revenues, in the context of a rapidly growing market for AI training data.
The class action, filed under Rules 23(a), (b)(2), (b)(3), and (c)(4) of the Federal Rules of Civil Procedure, seeks statutory or actual damages at the plaintiffs' election, attorney's fees, and a permanent injunction prohibiting Meta from continuing these infringing practices. The proposed class comprises all holders of copyrights registered with the Copyright Office, with a registration number, who claim that their works were used by Meta to train its language models.
This lawsuit mirrors a similar case filed in mid-2023 by authors Mona Awad and Paul Tremblay against OpenAI and its ChatGPT model, based on comparable legal arguments regarding the unauthorized use of protected works for AI training purposes.
These lawsuits, along with numerous other ongoing actions in the United States, illustrate a growing trend of authors challenging the use of their works by AI companies, underscoring a significant conflict between traditional intellectual property rights and emerging artificial intelligence practices. In the coming months, these cases are expected to provide crucial insights into the U.S. doctrine of "fair use" as applied to artificial intelligence, determining whether AI systems are indeed culpable of the "original sin" of massive copyright infringement in the training of their language models.