A new study shared on arXiv has pinpointed significant safety issues in multimodal AI models such as OpenAI’s GPT-4V and GPT-4o, as well as Gemini 1.5. The analysis reveals that these models often produce unsafe outputs when dealing with combined image and text inputs.
The concern is that these models could generate harmful or inappropriate content. Because multimodal models combine several data types, their outputs are harder to predict and control than those of single-modality models, which poses risks in sensitive areas such as healthcare, finance, and autonomous systems.
SIUO Benchmark for Multimodal AI Safety Testing
The paper introduces a new benchmark called Safe Inputs but Unsafe Output (SIUO) to assess AI models across nine distinct safety domains: morality, dangerous behavior, self-harm, privacy violations, information misinterpretation, religious beliefs, discrimination and stereotyping, controversial topics such as politics, and illegal activities and crime. The study found that large vision-language models (LVLMs) did not effectively address safety issues in multimodal contexts, leading to potentially unsafe responses.
Of the 15 LVLMs examined, only GPT-4V (53.29%), GPT-4o (50.9%), and Gemini 1.5 (52.1%) scored above 50% on the new SIUO benchmark. Even these top performers cleared that mark by only a few percentage points, which highlights the need for substantial improvements in multimodal safety. The study emphasizes that models need comprehensive contextual understanding to interpret user intent accurately, even when it is not explicitly stated. That means integrating insights from all input modalities with real-world knowledge, cultural sensitivities, ethical considerations, and awareness of safety hazards.
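To make the arithmetic behind these figures concrete, here is a minimal Python sketch that checks the three scores reported above against the 50% mark. The variable names are illustrative and not taken from the SIUO codebase, and the scores of the other 12 models are not given in this article, so they are only counted, not listed.

```python
# Scores as reported in this article (percent on the SIUO benchmark).
# Only the three models that cleared 50% are named in the text.
reported_scores = {
    "GPT-4V": 53.29,
    "GPT-4o": 50.9,
    "Gemini 1.5": 52.1,
}

THRESHOLD = 50.0    # the 50% mark discussed in the article
TOTAL_MODELS = 15   # number of LVLMs the study examined

# Models above the threshold, ordered from highest to lowest score.
above = sorted(
    (name for name, score in reported_scores.items() if score > THRESHOLD),
    key=lambda name: reported_scores[name],
    reverse=True,
)

# How far the best model sits above the threshold, in percentage points.
margin = round(reported_scores[above[0]] - THRESHOLD, 2)

# Everything else in the study scored at or below the threshold.
below_count = TOTAL_MODELS - len(above)

print(f"Above {THRESHOLD}%: {', '.join(above)}")
print(f"Best margin over threshold: {margin} points")
print(f"Models at or below threshold: {below_count} of {TOTAL_MODELS}")
```

Even the highest reported score sits only about three points above the mark, and 12 of the 15 models fall at or below it.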
Safety Benchmark for Multimodal AI Service Providers
The SIUO benchmark provides a tool for organizations like OpenAI, Google, and Anthropic to test and refine the safety of their multimodal models. Addressing these safety concerns can help prevent potential regulatory issues and enhance public trust. The benchmark is accessible on GitHub, aiding ongoing efforts to improve AI safety.
SIUO is designed to evaluate three essential dimensions in multimodal models: integration, knowledge, and reasoning. According to the researchers, the goal is to assess how effectively these models can integrate information from various modalities, align with human values through substantial knowledge, and apply ethical reasoning to predict outcomes and ensure user safety. The evaluation is intended to show whether models can meet the stringent safety standards required in real-world applications.
The researchers behind SIUO advocate for a reevaluation of existing safety protocols to address these new challenges. There is a growing agreement that more robust safety mechanisms are required to ensure safe and responsible deployment of multimodal AI models. This includes better monitoring tools, improved training data, and enhanced regulatory frameworks.
Creating these new safety mechanisms will require cooperation among AI researchers, industry stakeholders, and policymakers. The aim is to design a comprehensive safety infrastructure that mitigates the risks associated with multimodal AI while allowing society to benefit from the technology.
Last Updated on November 7, 2024 3:48 pm CET