News Corp has filed a major lawsuit against Perplexity AI, claiming the startup is using its copyrighted material without permission. The media giant, which owns outlets like The Wall Street Journal and New York Post, says Perplexity’s “answer engine” is lifting and summarizing articles to deliver responses without directing users to original sources.
According to the lawsuit, this has cost News Corp significant ad revenue, because fewer people are visiting its websites. The legal action comes as tensions between AI companies and publishers heat up over content usage practices, with news organizations accusing tech firms of “freeriding” on their hard work. In the case of Perplexity, News Corp is seeking up to $150,000 for every instance of infringement.
Perplexity’s Answer Engine Model
Perplexity AI provides a search engine experience where users get quick, concise answers without needing to visit individual web pages. The platform collects content from multiple sources, including news outlets, and summarizes it for the user. However, publishers like News Corp argue that this model robs them of web traffic and potential earnings.
Robert Thomson, the CEO of News Corp, blasted Perplexity’s approach, saying the company “wants to skip paying for content.” Thomson pointed out that while Perplexity may claim to respect copyright laws, its actions demonstrate otherwise. He further suggested that this case could lead to more lawsuits from other media companies feeling similarly slighted by AI-driven platforms.
Perplexity Faces Multiple Accusations
Perplexity AI is no stranger to controversy. Just last week, The New York Times demanded that Perplexity stop using its articles, accusing the startup of illegally scraping and summarizing content. The Times sent a cease-and-desist letter to Perplexity, with the possibility of further legal action if the issue isn’t resolved by the end of the month.
Perplexity AI’s meteoric rise, which took its valuation past $1 billion in early 2024, has attracted scrutiny from publishers including Forbes and Condé Nast. Both Forbes and Wired, a Condé Nast title, have accused the company of replicating their articles, and other outlets have raised broader plagiarism claims. Forbes spotted a duplicated piece from its website on Perplexity, identified only by a small “F” icon.
Perplexity has attempted to mitigate these concerns by adjusting its content usage practices. The company has introduced a revenue-sharing program for publishers whose content contributes to search results that generate advertising revenue. It also offers participating publishers a free subscription to its premium services, in an effort to foster cooperation rather than conflict with media outlets.
Even so, Perplexity maintains that it doesn’t scrape data for training AI models. The company says it indexes web pages to pull factual content for user queries, claiming that “no one owns facts” and that it acts as an aggregator rather than a content thief.
OpenAI’s Deal With News Corp
While Perplexity faces lawsuits, its competitor OpenAI has chosen a different path. Earlier this year, OpenAI signed a massive content deal with News Corp, reportedly worth over $250 million, to license the publisher’s archives. This agreement allows OpenAI to use articles from outlets like The Wall Street Journal and MarketWatch to improve its language models legally.
News Corp was quick to praise OpenAI for its principled approach to content usage, contrasting it with Perplexity’s unauthorized use of material. With increasing pressure from news organizations, more AI firms may be forced to follow suit or face potential legal battles.
The Scraping Problem in AI
One of the key issues at play here is how AI platforms collect and use content. Traditionally, websites use a “robots.txt” file to tell search engines and other bots which pages they may and may not crawl. While companies like OpenAI and Google have largely respected these directives, some of Perplexity’s partners may have ignored them. According to Perplexity CEO Aravind Srinivas, the startup “respects” robots.txt commands but cannot control how every third-party source it relies on behaves.
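For readers curious how a robots.txt check works in practice, here is a minimal sketch using Python’s standard `urllib.robotparser` module. The file contents, the bot names (`ExampleAnswerBot`, `OtherBot`), and the `news.example.com` URLs are all hypothetical and chosen only for illustration; they do not describe any real publisher’s rules or any company’s crawler. The helper functions `build_parser` and `may_fetch` are likewise made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve (illustrative only).
ROBOTS_TXT = """\
User-agent: ExampleAnswerBot
Disallow: /

User-agent: *
Disallow: /subscribers/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, so no network request is made."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# The named bot is blocked from the whole site.
print(may_fetch(parser, "ExampleAnswerBot", "https://news.example.com/story.html"))

# Any other bot falls under the wildcard rules: public pages are allowed,
# but the /subscribers/ section is not.
print(may_fetch(parser, "OtherBot", "https://news.example.com/story.html"))
print(may_fetch(parser, "OtherBot", "https://news.example.com/subscribers/story.html"))
```

The file only states a site’s wishes. Nothing in the protocol stops a crawler from skipping the check and fetching the page anyway, which is why compliance by third parties is at the heart of the dispute.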
As AI models continue to improve, the reliance on scraped data from the web has become a focal point of tension between AI developers and content creators. With media companies arguing that their work is being used without fair compensation, the industry is now facing an array of legal challenges over copyright infringement.
Rise of Licensing Agreements
To avoid these conflicts, many AI companies have started seeking out licensing deals with content providers. OpenAI’s deal with News Corp followed a similar agreement between OpenAI and Reddit, as well as another with Stack Overflow, illustrating the growing trend toward formalized partnerships between AI developers and content creators.
The financial terms of these agreements are significant, with reports suggesting that OpenAI’s deal with News Corp alone is worth over $250 million. For publishers, this represents a way to monetize their content in a world increasingly dominated by AI-driven platforms. However, for smaller AI startups like Perplexity, the financial burden of such agreements may be too great, potentially limiting their ability to compete in an increasingly crowded space.
Future Legal Battles for AI Tech Companies
Although it has made inroads with licensing, OpenAI has not been immune to accusations. In May, eight leading newspapers owned by investment firm Alden Global Capital began legal action against OpenAI and Microsoft. At the center of the lawsuit is the allegation that OpenAI’s ChatGPT and Microsoft’s AI assistant, Copilot, were trained on copyrighted news articles without obtaining permission or offering compensation to the publishers.
The lawsuit follows a comparable action brought by The New York Times against the same companies. Last year, Sarah Silverman, Christopher Golden, and Richard Kadrey accused both OpenAI and Meta of copyright infringement.
Similarly, authors Paul Tremblay and Mona Awad filed a lawsuit against OpenAI in June 2023. The lawsuit not only demands compensation for the alleged copyright violations but also urges the court to prevent OpenAI from continuing what the authors deem “unlawful and unfair business practices.”
In July of the same year, a group of leading news publishers also considered suing AI companies over copyright infringement. The publishers alleged that AI firms were infringing on their intellectual property rights and undermining their business model by scraping, summarizing, or rewriting their articles and distributing them on various platforms, such as websites, apps, or social media.
Last Updated on November 7, 2024 2:24 pm CET