The Day Publishing Fought Back: Five Publishers and Scott Turow Sue Meta and Mark Zuckerberg
When Mark Zuckerberg’s company decided to “move fast and break things,” nobody imagined the things would include the entire publishing industry.
I’ve been watching AI copyright litigation unfold for three years now, and most of it has felt asymmetrical. Individual authors (brilliant, determined, but outgunned) filing complaints against trillion-dollar corporations. It was David without the sling.
That changed on Tuesday, May 5, 2026.
Five of the world’s largest publishing houses (Elsevier, Cengage Learning, Hachette Book Group, Macmillan Publishers, and McGraw Hill) filed a proposed class action against Meta Platforms, Inc. and its CEO Mark Zuckerberg in Manhattan federal court. Joined by bestselling novelist Scott Turow, who serves as class representative for authors, the plaintiffs allege what may be the most comprehensively documented act of copyright infringement in the history of artificial intelligence development. The case is captioned Elsevier Inc. et al. v. Meta Platforms, Inc. and Mark Zuckerberg.
The allegation, stripped to its core: Meta pirated millions of copyrighted books and journal articles to train its Llama large language models. Without permission. Without payment. Without even a polite email asking first. And it did so, the complaint argues, with the personal approval of its founder and CEO.
This isn’t just another AI lawsuit. It’s the first time publishers, the institutions that have been the economic backbone of writing for centuries, have stepped into the courtroom as plaintiffs. And they brought receipts.
Who’s Suing and Why This Time Is Different
To understand why this lawsuit matters, you need to know who walked into the room.
The five publishers span the entire landscape of human knowledge. Elsevier dominates scientific and medical journals. Cengage and McGraw Hill are educational publishing titans. Hachette and Macmillan are trade publishing royalty, the names on the spines on your bookshelf.
They’re joined by Scott Turow. If you’ve read Presumed Innocent (or watched the Apple TV+ adaptation), you know his work. But Turow isn’t just a novelist. He’s a past president of the Authors Guild and a Harvard-trained lawyer who has spent decades advocating for writers’ rights. He’s not a decorative plaintiff. He’s a strategic one, serving as class representative on behalf of all authors who’ve had their work ingested without consent.
That combination, institutional publishers plus a named author representative, is deliberate. It closes a door that Meta walked through in earlier litigation (more on that shortly).
The publishers are asking for monetary damages and an injunction requiring Meta to destroy all infringing copies in its possession. They’ve also asked the court to certify the case as a class action representing all similarly situated copyright holders.
The Allegations: Piracy, Torrenting, and a CEO’s Green Light
Here’s where this gets less abstract and more… well, pirate-y.
The complaint alleges that Meta didn’t purchase the books it used to train Llama. It didn’t license them. It torrented them, roughly 82 terabytes’ worth, from shadow libraries like LibGen, Z-Library, and Anna’s Archive. These are not gray-area websites. These are notorious pirate repositories that host millions of copyrighted works without authorization.
Think of it this way: if you wanted to build a library, you could buy the books, borrow them from a lending system, or break into a bookstore at 3 a.m. with a duffel bag. The complaint says Meta chose the duffel bag.
But the most damaging detail isn’t the torrenting. It’s what happened when the licensing question reached the top of the organization.
The Smoking Gun: “Lean Into Fair Use”
Between January and April 2023, Meta’s AI team discussed increasing the company’s “dataset licensing” budget from $17 million to $200 million. That money would have paid publishers properly: negotiated access, clean licensing, the kind of arrangement that outlets like Reuters and CNN have since signed with Meta.
Then the issue was, in the complaint’s phrasing, “escalated” to Mark Zuckerberg.
Word came back down: stop. Kill the licensing effort.
The complaint cites a Meta employee who explained the reasoning: if they licensed “once [sic] single book, we won’t be able to lean into the fair use strategy.”
That line, “lean into the fair use strategy”, may become one of the most quoted sentences in AI copyright law. It transforms what could have been a dispute over legal interpretation into a question of intent. Meta didn’t accidentally wander into a gray area. It made a strategic decision, at the highest level, to bypass licensing in favor of arguing that using pirated works was legally permissible.
The publishers are also alleging that Meta stripped copyright-management information from the works, essentially removing digital fingerprints that would identify who owned what, specifically to conceal where the training data came from.
The Legal Backstory: Kadrey, Bartz, and a Judge Who Telegraphed His Next Move
If you’re new to this saga, a quick timeline will help, because Tuesday’s filing is really chapter three of a trilogy.
Kadrey v. Meta: The Authors Who Lost but Opened the Door
In 2023, a group of authors including Sarah Silverman, Richard Kadrey, Ta-Nehisi Coates, Junot Díaz, and Michael Chabon filed a class action against Meta. Their claim was essentially the same: Meta used their books without permission to train Llama.
In June 2025, Judge Vince Chhabria in the Northern District of California granted summary judgment for Meta. His reasoning: Meta’s use of the books to train Llama was “transformative” enough to qualify as fair use, and the individual authors had not demonstrated sufficient market harm to overcome that defense. Meta won.
But here’s the thing, and this is where the publishers’ new case breathes: Chhabria practically wrote a roadmap for future plaintiffs in his ruling. He described Meta’s win as potentially “in significant tension with reality” and noted explicitly that the ruling applied only to those specific authors. If future plaintiffs could present stronger evidence of market harm, he suggested, the outcome might be very different.
The publishers took notes.
The Anthropic Settlement: The $1.5 Billion Shadow
Then in September 2025, Anthropic, backed by Amazon and Google, agreed to pay $1.5 billion to settle Bartz v. Anthropic, a class-action lawsuit brought by authors who made substantially similar allegations. It was the largest copyright settlement in U.S. history, covering approximately 500,000 books at roughly $3,000 per work.
That settlement didn’t establish legal precedent; settlements are agreements, not rulings. But it established a financial precedent: when AI companies train on pirated material without licensing and get caught in court, the price tag has a “B” after it.
Meta, however, has already won once. The question now is whether it can win against a materially different set of plaintiffs.
What Publishers Bring That Individual Authors Simply Couldn’t
This is the analytical heart of the case, and it’s worth pausing here, because it’s where most news coverage stops short.
The publishers’ complaint isn’t just louder than the Kadrey case. It’s structurally different in three ways that matter to the fair-use analysis.
Market Harm You Can Actually Measure
When a memoir by Sarah Silverman was allegedly ingested by Llama, proving market harm was difficult. Did Llama outputs actually displace sales of that memoir? That’s a fuzzy, speculative question.
But when Llama answers a biology student’s question that would have required consulting a Cengage textbook, the substitution is direct and measurable. Academic publishers track revenue by course, by institution, by semester. If AI-generated answers cannibalize textbook sales, the publishers can produce spreadsheets showing exactly how and by how much.
That’s not speculation. That’s evidence. And it’s precisely the kind Judge Chhabria said was missing from Kadrey.
The Licensing Market Meta Chose to Ignore
Since 2023, Meta has signed licensing deals with Reuters, CNN, Fox News, People Inc., and USA Today for content. Those agreements demonstrate that a legitimate licensing market exists, that AI companies can pay for content when they choose to.
The existence of those deals creates a legal problem for Meta’s fair-use defense. In fair-use analysis, courts look at whether the defendant’s use harms an existing or potential licensing market. Meta’s own behavior, licensing some content while pirating other content, is strong evidence that it bypassed the publishers deliberately. The complaint calls out that bad-faith selectivity directly.
The Catalogue Is Orders of Magnitude Larger
Kadrey involved roughly 666 specific books from a small group of individual authors. The new complaint covers the entire publishing output of five companies that together account for a substantial share of the world’s academic, educational, and trade publishing.
That scale matters. Fair use is, by design, a case-by-case analysis. You can plausibly argue that training on 600 books is transformative. Training on millions of works, including the core educational content of entire disciplines, is a different argument entirely.
The Wider War: Why This Case Matters Beyond Meta
I’ve been writing about the collision between AI and copyright for a few years now, and there’s a pattern emerging that I think is worth naming out loud.
In the early months of the generative AI boom, the narrative was: “This is new technology, the old rules don’t fit, we need to figure this out.” That framing implicitly gave AI companies permission to operate in a legal vacuum.
We’re no longer in that vacuum.
The Kadrey ruling said: fair use can apply, but plaintiffs with stronger market-harm evidence may succeed. The Bartz settlement said: if you train on pirated material, the financial consequences can reach $1.5 billion. This new case asks: what happens when institutional publishers, with institutional resources, institutional documentation, and institutional market data, face the same defendant that beat individual authors?
The answer will shape the economics of AI for a decade. If the publishers win, AI training data becomes a licensed commodity, and that licensing market, which already exists in fragments, becomes a structural fixture. Every AI company currently relying on broadly scraped corpora would need to renegotiate its data pipeline or face similar suits.
If Meta wins, training on pirated material at scale becomes legally durable in the United States, and the regulatory response would shift to legislatures. That’s a much slower, more politically complicated path.
What Happens Next: A Timeline for Watching
Don’t expect a resolution next week. Or next month. Or probably next year.
The procedural calendar for this case (class certification, motions to dismiss, summary-judgment briefing, potential trial scheduling) will likely take 18 to 24 months in the ordinary course. Meanwhile, several other AI-training copyright cases are moving through U.S. courts, and some may reach the appellate level before this one is resolved.
The publishers are, in effect, placing a long bet. But it’s the most credible long bet the publishing industry has yet placed against an AI-training defendant.
For creators watching from the sidelines, here’s what matters: Tuesday’s filing is not just a lawsuit. It’s the publishing industry saying, collectively, that the copyright framework built over centuries still applies, even when the infringer is faster, larger, and wealthier than anyone imagined possible.
What This Means for You, and What Happens Now
Here’s the part I keep coming back to.
When I talk to writers, editors, teachers, people who make their living from words, the most common emotional response to AI isn’t fear of the technology. It’s the feeling of being stolen from. The quiet, aching disbelief that someone could take everything you’ve written, feed it into a machine, and then sell access to that machine without ever acknowledging that you existed.
This lawsuit is, in some ways, that feeling translated into legal language.
Macmillan CEO Jon Yaged put it bluntly: “It is unconscionable that one of the world’s most valuable companies chose to steal millions of works from creators for its own self-enrichment.” McGraw Hill CEO Philip Moyer added that “there is a vibrant market for AI companies to license intellectual property, and it is well established that AI models can be built and innovation can flourish without violating these rights.”
Translation: the industry is willing to work with AI companies. It’s not anti-tech. It’s anti-theft.
The road ahead is long and uncertain. But for the first time in the AI-copyright story, the side of human creators has gained far more formidable allies.
Whether that’s enough? We’ll find out in Manhattan.