OpenAI is rolling out an upgrade to its GPT-4 model that lets the model answer questions about an image submitted by a user. The upgrade comes with a caveat, however: the firm notes some safety risks linked to the new functionality. A user can upload an image file and then pose questions about it to the upgraded GPT-4, called GPT-4V, with the V standing for ‘vision’.
OpenAI has been implementing safeguards to reduce the risk that the neural network exposes private data or produces unsuitable output while processing user-submitted images. For instance, it has worked to limit the model's ability to recognize faces or pinpoint locations from uploaded images.
They have also decided against allowing the model to comment on appearances in the uploaded images. In a released paper [PDF], OpenAI divulged that the GPT-4V model could also fail to extract information from images, erring in identifying text, characters, mathematical symbols, spatial locations, and color mappings.
OpenAI also expressed reservations about GPT-4V's reliability for certain tasks, such as identifying illegal drugs or safe-to-eat mushrooms. Alongside the release, OpenAI cautioned about GPT-4V's potential to fuel large-scale disinformation. In addition, OpenAI plans to deploy voice input support on iOS and Android that can be used for back-and-forth conversation.
An Enhanced GPT-4, But No Development of GPT-5
The statement came after Elon Musk, who co-founded OpenAI with Sam Altman, led the Future of Life Institute initiative, a project that seeks to place more controls on AI development over concerns about the emergence of artificial general intelligence (AGI). An open letter from the project urged all AI developers to pause development of AI systems potentially more powerful than GPT-4 for at least six months.
Microsoft has already fully embraced OpenAI and the GPT-4 model. GPT-4 has allowed Microsoft to bring AI into the mainstream across its ecosystem, including Bing Chat, Bing Image Creator, Microsoft 365 Copilot, Azure OpenAI Service, and GitHub Copilot X.