HarperCollins, one of the most significant and influential publishing companies, has come under scrutiny following its proposal to license select nonfiction works for AI training. According to 404 Media, the deal involves a one-time payment of $2,500 per book for a three-year period, contingent on author opt-in. This offer arrives as authors and publishers alike grapple with the expanding role of artificial intelligence in the literary landscape.
HarperCollins is the second-largest consumer book publisher globally, with operations in 15 countries and a catalog of over 200,000 titles in both print and digital formats.
The company publishes approximately 10,000 new titles annually in 16 languages, allowing it to reach readers in over 200 countries. As one of the “Big Five” English-language publishers, HarperCollins plays a dominant role in shaping the global book market, controlling a substantial portion of the trade book market alongside its competitors.
Authors Push Back on Licensing Terms
Author Daniel Kibblesmith, known for Santa’s Husband, was quick to spotlight HarperCollins’ offer. Sharing the details on social media, he voiced frustration over the flat payment, which he viewed as undervaluing intellectual property.
Abominable.
— Daniel Kibblesmith (@kibblesmith.com) November 15, 2024 at 8:36 PM
The non-negotiable nature of the deal further fueled dissatisfaction, especially where the payment has to be split among contributors such as illustrators. Kibblesmith’s public response highlighted wider concerns within the author community about potential long-term impacts on their creative rights and compensation. Drew Broussard, another writer, says he is “worked up about this”, adding “this is the moment and we have to deal with it now or we’re—not to put too fine a point on it—fucked.”
“If you are a HarperCollins writer, I suggest getting on the horn to your agent immediately and getting them to preemptively tell HC (and any/every publisher!) that your work will never be available to train LLMs or other AI processes,” he adds.
As Broussard points out, authors are probably not “going to be able to rely on the courts to protect [them], seeing as the Authors Guild suit against OpenAI has already been partially defanged by a California judge.”
The Authors Guild sued OpenAI for using its members’ copyrighted works without permission and is currently pushing for access to documents from current and former OpenAI employees, a request OpenAI is resisting.
HarperCollins defended its initiative, emphasizing that it remains voluntary and framing it as a way to balance author rights with new technological opportunities. The company noted that protecting shared revenue and maintaining royalty streams were central to the proposal.
However, critics argue that even voluntary agreements could set a precedent that undermines author autonomy and control over their work.
Penguin Random House Has a Firm Stance on AI Content Use
In contrast, Penguin Random House (PRH), another major global book publisher, has taken a definitive approach by prohibiting the use of its books for AI training. This policy change, implemented in October 2024, applies to all new and reprinted titles and marks a significant stance within the publishing industry.
The clause in PRH’s copyright notices states: “No part of this book may be used or reproduced in any manner for the purpose of training artificial intelligence technologies or systems,” aligning with the EU’s DSM Directive 2019/790, which allows publishers to opt out of text and data mining. PRH UK CEO Tom Weldon reaffirmed the company’s commitment to defending authors’ rights, though he mentioned that PRH might explore AI internally for its own purposes under controlled conditions.
Industry Tensions and Legal Battles
The conversation around AI training and copyright extends beyond book publishing, touching other areas of media. While HarperCollins and PRH represent divergent paths in how publishers handle AI, media outlets face similar challenges.
For example, The New York Times recently sent a cease-and-desist letter to Perplexity AI, a startup backed by Jeff Bezos, accusing it of summarizing articles without authorization. Other media houses like Forbes and WIRED have raised similar issues.
Perplexity plays innocent, pointing to its revenue-sharing program for publishers and clarifying that its platform indexes content for citation rather than training AI models.
The legal landscape surrounding AI training remains anything but simple. In the U.S., authors and organizations continue to take legal action against companies like OpenAI for using copyrighted materials without clear licensing agreements.
Book Publishers and Media Companies: Diverging Strategies
Book publishers and media outlets may share concerns over AI, but their approaches differ. Book publishers like PRH focus on ensuring written works are not used for AI training without permission, motivated by protecting authors’ long-term rights and earnings. PRH’s robust response contrasts sharply with HarperCollins’ more exploratory stance, which some argue risks eroding these protections.
Meanwhile, media companies are concerned with safeguarding their articles from unlicensed use that impacts their revenue models and brand presence. To navigate this, some have sought partnerships that balance adaptation with compensation.
TIME, for instance, entered a multi-year agreement with OpenAI, offering controlled access to its content. The Atlantic, News Corp, and the Financial Times have made similar moves.
Condé Nast also secured a deal with OpenAI, allowing access to content from The New Yorker, Vogue, and other key publications. Condé Nast CEO Roger Lynch highlighted that these arrangements help address shifting search engine functionalities while maintaining investment in high-quality journalism.
The varied responses underscore the uncertain future that publishers and media outlets face as AI becomes integral to content consumption and production. While PRH’s firm ban signals a protective stance for authors, HarperCollins’ offer reflects an attempt to explore new revenue models as AI technologies spread.