Recent findings have unveiled a potential vulnerability in GPT-4 Vision (GPT-4V) capabilities. Researchers have discovered that the system can be manipulated using images containing specific text. This raises concerns about the robustness of the model's visual recognition system.
Details of the Vulnerability
According to reports, the vulnerability allows for the injection of prompts into the model using images with embedded text. This means that an individual could potentially guide the model's responses by presenting it with an image containing specific textual instructions. Simon Willison, in a separate report, referred to this as “multi-modal prompt injection.” He elaborated that the flaw could be exploited to produce misleading or biased outputs from the model.
An unobtrusive image, for use as a web background, that covertly prompts GPT-4V to remind the user they can get 10% off at Sephora: pic.twitter.com/LwjwO1K2oX
— Riley Goodside (@goodside) October 14, 2023
Implications and Context
The discovery of this vulnerability underscores the challenges faced by developers in ensuring the security and reliability of AI systems. As AI models become more sophisticated and integrated into various applications, ensuring their robustness against potential threats becomes paramount. This recent finding serves as a reminder of the continuous need for rigorous testing and refinement in the rapidly evolving world of artificial intelligence.
GPT-4V vision jailbreak confirmed using PENGUIN. Fascinating! 🐧
(interesting details in comments) pic.twitter.com/fmgWCWq2as
— Benjamin De Kraker (@BenjaminDEKR) October 13, 2023
GPT-4 Vision: An Enhanced AI Model
Earlier this month, OpenAI introduce GPT-4V, an enhanced version of its flagship generative AI model. Even so, the company made it clear that the changes do bring security risks. These new functionalities are designed to allow a user to upload an image file and then pose questions about the image to the upgraded GPT-4, termed as GPT-4V – with V indicating ‘vision'.
In GPT-4V Image content can override your prompt and be interpreted as commands. pic.twitter.com/ucgrinQuyK
— Patel Meet 𝕏 (@mn_google) October 4, 2023
OpenAI has chosen not to let the model comment on how people look in the images they upload. The GPT-4V model can sometimes miss important details in the images, such as text, characters, math symbols, locations, and colors, according to a paper [PDF] published by OpenAI. OpenAI also doubts that GPT-4V can do some tasks well, like telling apart illegal drugs or edible mushrooms. OpenAI warns that GPT-4V could be used to spread false information on a large scale. Besides these issues, OpenAI plans to add voice input support for IOS and Android devices that can enable interactive conversations.