
Kudurru: Novel Tool Blocks AI Web Scraping of Images for DALL-E, Midjourney and Co.

Kudurru offers more than just blocking; it allows users to "poison" data for scrapers, replacing the original image with an alternate one, potentially disrupting AI model training.


Spawning has launched Kudurru, a tool designed to combat unauthorized web scraping of artists' works by AI image generators. This move comes amid growing concerns from artists and illustrators about their creations being used without consent or compensation.

Kudurru operates as a network of websites that detects web scraping in real time. When a scraper is detected on one domain, the tool identifies the scraper's IP address and blocks it across all other domains running the Kudurru software.
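The shared-blocking idea can be illustrated with a minimal Python sketch. It is not Spawning's implementation: the class names, the request-count threshold, and the in-memory blocklist are all assumptions made for illustration, and the article does not describe how Kudurru actually detects a scraper.

```python
class SharedBlocklist:
    """Hypothetical network-wide list of flagged IPs, shared by all member sites."""

    def __init__(self):
        self.blocked_ips = set()

    def report(self, ip):
        self.blocked_ips.add(ip)

    def is_blocked(self, ip):
        return ip in self.blocked_ips


class ProtectedSite:
    """Hypothetical member site. Detection here is a naive request counter,
    an assumption standing in for whatever heuristic the real tool uses."""

    def __init__(self, domain, blocklist, request_threshold=100):
        self.domain = domain
        self.blocklist = blocklist
        self.request_threshold = request_threshold
        self.request_counts = {}

    def handle_request(self, ip):
        # An IP flagged anywhere in the network is refused everywhere.
        if self.blocklist.is_blocked(ip):
            return 403
        self.request_counts[ip] = self.request_counts.get(ip, 0) + 1
        if self.request_counts[ip] > self.request_threshold:
            # Flag the IP for every site that shares this blocklist.
            self.blocklist.report(ip)
            return 403
        return 200
```

In this sketch, an IP that exceeds the threshold on one site is refused on its first request to any other site using the same blocklist, which is the network effect the article describes.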

Dual-Pronged Strategy: Block and Disrupt

Beyond mere detection and blocking, Kudurru introduces an added layer of defense. Users are presented with the option to “poison” the data being accessed by scrapers. In this scenario, instead of procuring the original image, scrapers are fed an alternate image, potentially skewing the data for the AI model in training. For example, if all images retrieved from a Kudurru-protected site bear the message “NO AI,” the AI model could erroneously associate that particular style or theme with the “NO AI” directive.
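The choice between blocking and poisoning can be sketched in a few lines of Python. The function name, its parameters, and the status codes are hypothetical, chosen only to illustrate the behavior described above. They do not reflect Kudurru's actual interface.

```python
def respond_to_image_request(client_ip, flagged_ips, original, decoy=None):
    """Return (status, body) for an image request.

    flagged_ips: set of IPs already identified as scrapers (assumed input).
    decoy: optional substitute image bytes; if given, flagged clients are
           "poisoned" with it instead of being blocked.
    """
    if client_ip not in flagged_ips:
        return 200, original   # ordinary visitor gets the real image
    if decoy is not None:
        return 200, decoy      # scraper silently receives the alternate image
    return 403, b""            # no decoy configured: plain block
```

For example, a site owner who supplies a "NO AI" placeholder as `decoy` would serve that image to every flagged scraper while regular visitors continue to see the original artwork.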

Spawning's official documentation on Kudurru underscores its capability to both protect individual data and disrupt broader processes. By joining the Kudurru network, users not only safeguard their own digital assets but also strengthen the collective defense against unauthorized scrapers. The tool's effectiveness grows with the size of its network: the more sites that participate, the faster and more efficiently scrapers are detected.

Concerns Over Protecting Copyrights within AI Training

AI models need data to train on, and how companies acquire this data has become an area of controversy. Many content creators have hit back against AI companies with lawsuits over the use of their works to train AI models.

A collective of writers, including prominent figures like Michael Chabon and David Henry Hwang, has filed a lawsuit against OpenAI. They argue that OpenAI unlawfully used their copyrighted works to train its AI model, ChatGPT. Chabon and the group have also brought a similar lawsuit against Meta for the same reasons.

Earlier in the year, Sarah Silverman, Christopher Golden, and Richard Kadrey accused both OpenAI and Meta of copyright infringement. They claim technology companies obtained their books from illegal sources, such as websites that offer free downloads of pirated books.

In July, a group of leading news publishers also considered suing AI companies over copyright infringement. The publishers allege that the AI firms are infringing on their rights and undermining their business model by scraping, summarizing, or rewriting their articles and distributing them on various platforms, such as websites and apps.

Source: Spawning
Luke Jones
Luke has been writing about all things tech for more than five years. He is following Microsoft closely to bring you the latest news about Windows, Office, Azure, Skype, HoloLens and all the rest of their products.
