US federal judge certifies class action against Anthropic over AI training piracy – JURIST

A US federal judge in California certified a class action lawsuit against the AI company Anthropic on Thursday. The lawsuit encompasses millions of copyrighted books that were allegedly pirated to train Anthropic’s Claude AI platform, setting the stage for what may become one of the largest copyright infringement cases in US history. 

Senior District Judge William Alsup of the US District Court for the Northern District of California granted class certification for authors whose works were downloaded from two major pirate libraries, LibGen and PiLiMi. The certification follows last month’s mixed ruling on fair use, in which Judge Alsup granted partial summary judgment for Anthropic. The court found that using copyrighted works to train large language models constituted “quintessentially transformative” fair use, comparing it to how humans read and learn from books, and that Anthropic’s conversion of legitimately purchased print books to digital format was likewise lawful. However, the court rejected Anthropic’s fair use defense for its initial acquisition of pirated books, finding that downloading millions of copyrighted works from internet pirate sites to build a “central library” was not protected.

The class action certified on Thursday covers “all beneficial or legal copyright owners of the exclusive right to reproduce copies of any book” downloaded from LibGen and PiLiMi. Court documents reveal Anthropic downloaded approximately five million books from LibGen and two million books from PiLiMi. In the ruling, Judge Alsup compared this case to the landmark Napster litigation, describing Anthropic’s conduct as “Napster-style downloading of millions of works.”

Under 17 U.S.C. § 504(c)(2), willful infringement can carry statutory damages of up to $150,000 per work. With millions of allegedly pirated works at issue in this lawsuit, Anthropic faces potentially catastrophic liability. The court rejected Anthropic’s argument that such enormous potential damages should preclude class certification, citing precedent from the US Court of Appeals for the Ninth Circuit that “the very purpose” of the federal class action rules would be undermined if defendants could escape class treatment simply because of the size of their potential liability. Even the largest copyright claims on record pale in comparison to the potential damages in this case. Oracle sought $8.8 billion in damages from Google over its copying of Java code, and prevailed on appeal, but the Supreme Court overturned that ruling in 2021, holding that Google’s use was fair. In 2014, Viacom’s lawsuit against YouTube, in which it initially sought over $1 billion in damages, settled out of court for an undisclosed amount.
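
For a rough sense of scale, the figures cited in the ruling can be combined in a back-of-envelope calculation. The sketch below assumes the roughly seven million downloads described in the court documents and the statutory range set out in 17 U.S.C. § 504(c); the certified class will ultimately cover fewer works, so these are illustrative ceilings, not predicted awards.

```python
# Back-of-envelope illustration only. Statutory damages under 17 U.S.C. § 504(c)
# run from $750 to $30,000 per infringed work, rising to $150,000 per work for
# willful infringement. The ~7 million figure is the sum of the LibGen and PiLiMi
# downloads described in the court documents; the actual class will be narrower.

works = 5_000_000 + 2_000_000          # LibGen + PiLiMi downloads cited by the court
per_work_min, per_work_max = 750, 150_000

print(f"Statutory minimum exposure: ${works * per_work_min:,}")    # $5,250,000,000
print(f"Willful-infringement ceiling: ${works * per_work_max:,}")  # $1,050,000,000,000
```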

Accounting for inflation and technological change, no copyright case has approached this scale of alleged systematic infringement of published works by a single corporate defendant. Anthropic generates several billion dollars annually from Claude, which was trained on these allegedly pirated works. This case may establish crucial precedents for the entire AI industry, as many companies have used similar training methodologies. A significant judgment could fundamentally reshape how AI companies acquire training data.