Microsoft Launches Agentic Retrieval Preview for Azure AI Search

Azure AI Search gets Agentic Retrieval preview: LLMs deconstruct queries for relevant, context-aware AI, targeting 40% better RAG performance.

Microsoft has launched a public preview of “Agentic Retrieval” in Azure AI Search. This move aims to significantly advance conversational AI capabilities. The new system employs Large Language Models (LLMs) to analyze chat histories.

It then breaks down complex user questions into multiple, focused subqueries. These subqueries run in parallel across text and vector data, according to Microsoft’s announcement. The company claims this approach can improve answer relevance by up to 40% compared to traditional Retrieval-Augmented Generation (RAG) methods.

This development is important for developers of sophisticated AI agents. Agentic Retrieval aims to provide higher-quality, context-aware grounding data. Such data is essential for more intelligent AI applications. The feature is accessible via a new “Knowledge Agents” object in the preview REST API and upcoming Azure SDKs, as detailed by Microsoft Learn. It integrates with Azure OpenAI and requires Azure AI Search’s semantic ranker.

While offering enhanced query understanding, Agentic Retrieval does introduce some processing latency. This launch aligns with Microsoft’s broader strategy to consolidate AI offerings within Azure, particularly as older Bing Search APIs are retired. Developers should note the current preview status.

This means, as highlighted in the official supplemental terms, that Agentic Retrieval lacks a service-level agreement and is not yet recommended for production workloads; some features may be constrained or unsupported. A new billing model will also apply, tied to Azure OpenAI and Azure AI Search usage.

How Agentic Retrieval Redefines Search

The “agentic” aspect of this technology involves an LLM, such as GPT-4o, scrutinizing conversation threads to discern true user intent. Instead of a single query, the model formulates multiple subqueries based on user input, chat history, and request parameters. Microsoft explains this enables features like query rewriting, spelling correction, and deconstructing multifaceted questions. For example, it can handle a query like “find a beachside hotel with airport transport near vegetarian restaurants.”
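To make the planning step concrete, here is a minimal sketch of how a planner prompt and its output might look. The prompt wording and the JSON shape are illustrative assumptions, not Microsoft's actual internal format; in the real service, the LLM call and decomposition happen server-side.

```python
import json

# Hypothetical planning prompt an LLM such as GPT-4o could receive.
# The exact prompt and response schema are assumptions for illustration.
PLANNER_PROMPT = (
    "Split the user's request into independent search subqueries. "
    'Respond with JSON: {"subqueries": ["..."]}'
)

def parse_subqueries(llm_response: str) -> list[str]:
    """Extract the subquery list from the planner's JSON response."""
    return json.loads(llm_response)["subqueries"]

# Example planner output for the compound hotel query from the article:
sample = json.dumps({"subqueries": [
    "beachside hotels",
    "hotels with airport shuttle",
    "vegetarian restaurants near beach hotels",
]})
print(parse_subqueries(sample))
```

Each focused subquery can then be matched against the index independently, which is what enables the parallel execution described below.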

The “retrieval” component then executes these subqueries simultaneously. Results are merged, semantically ranked, and returned in a three-part response. This response includes grounding data for the conversation, reference data for source inspection, and an activity plan detailing execution steps. Matthew Gotteiner noted during a Microsoft Build session that overall speed relates to the number of subqueries.

More complex queries needing numerous subqueries might naturally take longer. Counterintuitively, he added, a “mini” planner generating fewer, broader subqueries might return results faster than a “full-size” planner creating many highly focused subqueries.
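The fan-out-and-merge pattern described above can be sketched in a few lines. The toy index, scores, and max-score merge rule below are stand-ins for the semantic ranking Azure AI Search performs server-side; only the overall shape (parallel subqueries, merged and re-ranked results) reflects the documented behavior.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_search(subquery: str) -> list[tuple[str, float]]:
    """Placeholder index lookup returning (doc_id, score) pairs."""
    corpus = {
        "beachside hotels": [("hotel-1", 2.9), ("hotel-2", 2.1)],
        "airport shuttle": [("hotel-1", 2.4), ("hotel-3", 1.8)],
    }
    return corpus.get(subquery, [])

def merge_and_rank(subqueries: list[str]) -> list[str]:
    """Run subqueries in parallel, then keep each document's best score."""
    with ThreadPoolExecutor() as pool:
        result_sets = list(pool.map(fake_search, subqueries))
    best: dict[str, float] = {}
    for hits in result_sets:
        for doc_id, score in hits:
            best[doc_id] = max(best.get(doc_id, 0.0), score)
    # Highest-scoring documents first, mimicking a ranked result list.
    return sorted(best, key=best.get, reverse=True)

print(merge_and_rank(["beachside hotels", "airport shuttle"]))
```

The sketch also hints at Gotteiner's latency point: wall-clock time is bounded by the slowest subquery plus the merge step, so more (or slower) subqueries stretch the response time even when they run in parallel.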


Strategic Shifts and Developer Considerations

Agentic Retrieval’s introduction coincides with Microsoft retiring its public Bing Search and Custom Search APIs, effective August 11. Developers are being guided towards Azure AI Agent Service, which includes a “Grounding with Bing Search” feature. This transition, however, has presented challenges.

Some developers have raised concerns about data handling, as information might move outside standard Azure compliance boundaries, and have reported integration complexities with tools like C# Semantic Kernel. 

Despite these transition hurdles, the move towards more advanced AI tools like Agentic RAG (ARAG) is seen as progress. Akshay Kokane, a Microsoft Software Engineer, explained in a Medium post that while traditional RAG is a good start, “as enterprise use cases become more complex, the limitations of static, linear workflows become apparent.”

He added that ARAG “addresses this gap by introducing dynamic reasoning, intelligent tool selection, and iterative refinement.” Underscoring industry interest, AT&T stated its enthusiasm, noting they are “looking forward to using Azure AI Search’s agentic retrieval with our agents to match the speed, complexity and diversity of information we’ll need to hit our targets,” as per the Microsoft Community Hub announcement.

Technical Implementation and Availability

Developers will use a “Knowledge Agent” resource in Azure AI Search, which connects to an LLM in Azure OpenAI to build and execute query plans. Currently, configuration is only via the preview REST APIs, as detailed by Microsoft Learn, or SDKs, with no Azure portal support. The feature is available in regions supporting the semantic ranker, on all Azure AI Search tiers except the free one, according to Microsoft.
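As a rough sketch of what that configuration looks like, the snippet below assembles a Knowledge Agent definition for the preview REST API. The endpoint, agent name, index name, and exact field names (`targetIndexes`, `azureOpenAIParameters`, and the preview `api-version`) are assumptions based on the preview documentation and may change; consult Microsoft Learn before relying on them.

```python
import json

SEARCH_ENDPOINT = "https://my-service.search.windows.net"  # hypothetical service
API_VERSION = "2025-05-01-preview"  # preview version; subject to change

def build_agent_definition(index_name: str, aoai_endpoint: str, deployment: str) -> dict:
    """Assemble a JSON body linking a search index to an Azure OpenAI model."""
    return {
        "name": "hotel-agent",
        "targetIndexes": [{"indexName": index_name}],
        "models": [{
            "kind": "azureOpenAI",
            "azureOpenAIParameters": {
                "resourceUri": aoai_endpoint,
                "deploymentId": deployment,
                "modelName": "gpt-4o",
            },
        }],
    }

body = build_agent_definition(
    "hotels-index", "https://my-aoai.openai.azure.com", "gpt-4o"
)
# A real call would PUT this body to
#   f"{SEARCH_ENDPOINT}/agents/hotel-agent?api-version={API_VERSION}"
# with an api-key or Microsoft Entra ID bearer token header (omitted here).
print(json.dumps(body, indent=2))
```

Once created, the agent accepts conversation messages through a retrieve action and returns the three-part response (grounding data, references, activity plan) described earlier.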

Billing involves pay-as-you-go, token-based charges for query planning via Azure OpenAI. Similar charges apply for semantic ranking via Azure AI Search. However, Microsoft states these ranking costs are initially waived for agentic retrieval during the preview.
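A back-of-the-envelope calculation shows how the token-based planning charge scales. The per-token rate below is a hypothetical placeholder, since actual Azure OpenAI pricing varies by model and region; and per Microsoft, semantic ranking charges are waived during the preview, so only planning tokens are counted here.

```python
# Hypothetical rate -- NOT a real Azure OpenAI price; check current pricing.
HYPOTHETICAL_RATE_PER_1K_INPUT_TOKENS = 0.0025  # USD per 1,000 input tokens

def estimate_planning_cost(
    input_tokens: int,
    rate_per_1k: float = HYPOTHETICAL_RATE_PER_1K_INPUT_TOKENS,
) -> float:
    """Estimate the query-planning charge for one agentic retrieval call."""
    return input_tokens / 1000 * rate_per_1k

# e.g. a chat history plus query totaling 4,000 tokens sent to the planner:
print(round(estimate_planning_cost(4000), 4))  # 0.01
```

Because the planner consumes the chat history on every call, long-running conversations accumulate planning tokens quickly, which is worth factoring into cost projections.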

Microsoft offers extensive documentation and samples for Python, .NET, and REST to aid developers. Agentic retrieval is also part of recent Azure AI Foundry updates.

Markus Kasanmascheff
Markus has been covering the tech industry for more than 15 years. He holds a Master's degree in International Economics and is the founder and managing editor of Winbuzzer.com.
