Nvidia Fights Back Against Copyright Claims Over AI Training Data

Chips Reporter

Nvidia has filed a motion to dismiss a lawsuit alleging it used pirated books to train AI models, arguing that the plaintiffs failed to plausibly allege infringement of their specific works.

Nvidia is pushing back against claims that it trained AI models on pirated books, telling a federal court in California that its alleged contact with the shadow library Anna's Archive doesn't amount to proof of copyright infringement. In a motion to dismiss filed January 29, the company argues that the authors behind Nazemian v. Nvidia have failed to plausibly show that their specific works were downloaded or used in training, despite expanding their complaint to include new theories and datasets.

The Nazemian case, filed in early 2024, is being heard in the Northern District of California before Judge Jon Tigar. It concerns allegations that Nvidia's AI tools and reference models were trained on copyrighted books sourced from so-called shadow libraries, including Anna's Archive and Books3. The plaintiffs' amended complaint references internal discussions during which Nvidia employees allegedly sought confirmation regarding access to Anna's Archive, which the plaintiffs argue amounts to evidence of unlawful use.

In its motion to dismiss, Nvidia argues that the amended complaint fails to allege even the most basic elements required for a copyright infringement claim. According to the filing, the plaintiffs "do not allege facts showing that Nvidia copied any of their copyrighted works, when any such copying occurred, or which Nvidia models supposedly contain those works." The company says that without those details, the claims are entirely speculative.

Addressing the Anna's Archive allegations directly, Nvidia states that while the complaint describes internal discussions and inquiries about potential access to the site, it doesn't allege that Nvidia actually obtained or downloaded any of the plaintiffs' works from it. The motion goes on to argue that discussing or evaluating possible data sources isn't equivalent to copying copyrighted materials, and that copyright law requires plaintiffs to plead facts showing reproduction of protected works.

"It's equally plausible Nvidia did not [obtain the Plaintiffs' works]." Pulling no punches, Nvidia also criticizes the plaintiffs' reliance on allegations made "on information and belief," arguing that this approach improperly attempts to use discovery as a substitute for pleading. Nvidia goes on in its motion to remind the court that copyright plaintiffs must allege infringement before discovery begins, not rely on discovery to determine whether infringement occurred in the first place, which the plaintiffs appear to be attempting to do in this case.

Beyond Anna's Archive, Nvidia seeks to narrow the scope of the case by challenging the additional datasets and models, such as Megatron 345M, that were added in the amended complaint. It argues that the plaintiffs improperly lump together multiple models and tools without explaining how any particular model was trained on their works. In several instances, Nvidia points to its own public documentation to argue that the plaintiffs' assumptions about training data are contradicted by publicly available sources.

The amended complaint also introduces a secondary liability theory tied to Nvidia's NeMo Megatron framework and its support for downloading large public datasets such as The Pile. Nvidia responds that the complaint doesn't allege a predicate act of direct infringement by any third party, which is required to sustain claims of contributory or vicarious copyright infringement. Providing optional tooling, the company argues, doesn't establish liability absent specific allegations that users infringed copyrights using that tooling.

The motion to dismiss is due to be heard in the U.S. District Court, Northern District of California, on April 2, 2026.
