Microsoft has unveiled a suite of new capabilities within Azure AI Studio, aimed at bolstering the security and accuracy of generative AI applications developed on its cloud platform. These enhancements, which are at various stages of availability, are designed to address common forms of AI misuse, including hallucinations, input poisoning, and prompt injection attacks.
Preventing Prompt Injection Attacks
A major update is the introduction of Prompt Shields, now available in public preview. This feature is engineered to deter both direct and indirect prompt injection attacks, which can lead to malicious outputs from AI systems. Direct attacks, also known as jailbreak attacks, involve feeding the AI system a prompt that causes it to act outside its intended design. Indirect attacks manipulate the AI’s input data, tricking the system into accepting untrusted content as valid commands. Prompt Shields aim to detect and block these attacks in real time, integrating seamlessly with Azure OpenAI Service content filters and Azure AI Content Safety for a comprehensive defense.
Combating Hallucinations with Groundedness Detection
Another critical addition is the groundedness detection feature, which identifies and mitigates text-based hallucinations in model outputs. Hallucinations in AI outputs, where the AI generates ungrounded or irrelevant information, pose a significant challenge to the reliability and adoption of generative AI tools. Microsoft’s groundedness detection offers developers multiple options to address these ungrounded claims, enhancing the credibility and usefulness of AI-generated content. The availability of this feature, whether already generally available or still in pre-release, has not been specified by Microsoft.
Automated Safety Evaluations and Risks Monitoring
To further support developers in creating secure AI applications, Microsoft introduces automated safety evaluations. This feature, now in public preview, employs AI to test generative AI applications for potential content and security risks, augmenting manual red teaming efforts. Additionally, the risks & safety monitoring capability provides developers with insights into the usage of their AI applications, including metrics on blocked requests and the identification of users potentially engaging in misuse. This feature aims to enable developers to take proactive measures based on their product’s terms of use.
Lastly, Microsoft plans to release safety system message templates in Azure AI Studio soon. Developed to mitigate harmful content generation and misuse, these templates will assist developers in crafting precise system messages that guide AI systems towards desired behaviors, thereby improving overall system performance.
Last Updated on November 7, 2024 9:19 pm CET