Quora’s artificial intelligence chatbot, Poe, has come under fire for a feature that lets users download HTML versions of articles from paywalled websites, stirring legal and ethical debates on copyright violations.
Functionality and Legal Concerns
Poe, which has received a $75 million investment from Andreessen Horowitz, allows users to generate HTML captures of articles by entering a URL into the platform’s Assistant bot. WIRED found that it was possible to download articles from a variety of paywalled publishers, including The New York Times, Bloomberg Businessweek, and Forbes.
Once a user submits a URL, the bot retrieves the web content, often bypassing protocols designed to restrict such access. Server logs from testing sites confirmed that a “Quora Bot” accessed these articles immediately after being prompted by a user.
The Assistant bot relies on Anthropic’s Claude model, which has no direct internet access and instead processes text fetched by Quora’s servers. Users receive HTML files that resemble PDFs of the original articles, giving them full access to the paywalled content.
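For readers unfamiliar with how such access restrictions work, the sketch below shows how a well-behaved crawler could check a site's rules before fetching a page. It assumes the restriction in question is a robots.txt file (the Robots Exclusion Protocol), which the article does not state explicitly. The agent name `ExampleBot`, the sample rules, and the `news.example` URLs are hypothetical and are not taken from Quora's or any publisher's actual configuration.

```python
from urllib.robotparser import RobotFileParser


def is_fetch_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string so the example needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: one named crawler is barred from the premium section,
# while every other crawler may fetch anything.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /premium/

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for path in ("/premium/story.html", "/free/story.html"):
        url = "https://news.example" + path
        print(path, is_fetch_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

Compliance with robots.txt is voluntary: the file only states the publisher's wishes, and nothing in the protocol prevents a crawler from skipping the check and requesting the page anyway.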
AI Sparks Legal Copyright Battles
The issue mirrors ongoing legal battles involving major companies. The New York Times is currently suing OpenAI and Microsoft over alleged copyright infringement, and Forbes has accused Perplexity AI of what it describes as “willful infringement.” These disputes highlight the escalating tension between AI platforms and traditional media institutions.
OpenAI has sought to sidestep such disputes by striking licensing deals that give it access to paywalled content for AI model training. It has already secured such deals with TIME magazine, The Atlantic, Vox Media, News Corp, and the Financial Times, reflecting a strategic shift towards more formalized content usage frameworks.
However, the question of what counts as fair use of content for AI training remains unsettled. Mustafa Suleyman, head of Microsoft’s AI division, recently sparked fierce debate by asserting that content on the open web is available for anyone to copy and use freely, including by scraping bots that feed AI training systems.
During an interview, Suleyman stated there has been a long-standing assumption, dating back to the ’90s, that online content is like “freeware” and can be reused without restriction.
Last Updated on November 7, 2024 3:45 pm CET