
Apple faces multiple legal challenges over allegations that it trained its artificial intelligence models, specifically those under the Apple Intelligence umbrella, on copyrighted books without authorization. Plaintiffs in two major lawsuits claim Apple used large datasets of books, some allegedly pirated, to build or improve its AI models. The lawsuits argue that Apple did not obtain licenses, did not provide compensation or credit, and in some instances drew material from “shadow libraries” or other unauthorized sources.
Key Plaintiffs and Claims
- Grady Hendrix and Jennifer Roberson: Authors who say their books were included in Books3, a dataset they allege was used to train Apple’s OpenELM models. They claim that Apple relied on datasets derived from “shadow libraries,” including Books3, which was itself associated with sources of pirated books.
- Susana Martinez-Conde and Stephen Macknik: Neuroscientists who claim that two titles they authored, “Champions of Illusion: The Science Behind Mind-Boggling Images and Mystifying Brain Puzzles” and “Sleights of Mind: What the Neuroscience of Magic Reveals About Our Everyday Deceptions,” were used without their permission. They accuse Apple of relying on pirated and infringing copies from “shadow libraries” to train Apple Intelligence.
Legal Concerns
The core legal issue is whether Apple’s use qualifies as fair use. If it does not, training on pirated or unlicensed texts could constitute copyright infringement, exposing Apple to financial and legal consequences. Courts will examine the purpose and character of the use, the nature of the works, the amount used, and whether the AI training harms the market for the original works.
Potential Consequences
If the plaintiffs succeed, Apple could face statutory damages, injunctions to stop unlicensed use, and possible requirements to destroy or disable infringing models. The cases may also set industry-wide precedents on how AI companies acquire and use copyrighted material.
Broader Implications
These lawsuits highlight tensions between innovation and intellectual property. They raise questions about transparency in AI training, fair compensation for creators, and the ethical responsibilities of tech companies in using copyrighted works.
As Apple navigates these allegations, the outcome could shape legal standards for AI training data and influence how tech companies approach intellectual property in the rapidly evolving AI landscape.