Microsoft Expands Phi-3 AI Model Family with New Multimodal Capabilities

Phi-3 Vision is a new AI model that handles visual reasoning tasks within Microsoft's family of multimodal small language models.


Microsoft has introduced three new models to its Phi-3 family of small language models (SLMs): Phi-3-small and Phi-3-medium are now available, and Phi-3-vision is a new model. These models are designed to be efficient and powerful, catering to resource-constrained environments such as on-device, edge, and offline inference scenarios. Microsoft initially announced the Phi-3 family last month.

Capabilities and Optimization

The Phi-3 models are engineered to deliver high performance while being cost-effective. They are optimized for environments where fast response times are essential, making them suitable for mobile devices and other platforms with limited computational resources. This optimization ensures that the models can operate efficiently without consuming excessive memory or processing power.

Phi-3-Vision: A Multimodal Model

Among the new releases, Phi-3-Vision stands out as a multimodal model capable of processing both text and images. This model, which has 4.2 billion parameters, excels in general visual reasoning tasks. Unlike image-generating AI models, Phi-3-Vision focuses on understanding and analyzing visual data, making it useful for tasks such as interpreting charts and graphs.

Microsoft has integrated the Phi-3-mini model into Azure AI's Models-as-a-Service (MaaS) platform. This integration allows users to leverage the capabilities of Phi-3-mini for various applications through Azure's infrastructure. Additionally, Microsoft is enhancing its API offerings to support more versatile multimodal experiences.

New Features in Azure AI Speech

In conjunction with the Phi-3 model announcements, Microsoft is also previewing new features for Azure AI Speech. These features include speech analytics and universal translation, aimed at helping developers create high-quality, voice-enabled applications. These enhancements are expected to provide more robust tools for speech processing and analysis.

The Phi-3 family was initially introduced in April with the release of Phi-3-mini, a model with 3.8 billion parameters. The new additions, Phi-3-small and Phi-3-medium, have 7 billion and 14 billion parameters, respectively. These models are designed to be less compute-intensive, making them suitable for a wide range of devices, including laptops.
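
To give a rough sense of why these parameter counts matter on constrained devices, the sketch below estimates the memory needed just to hold each model's weights. The parameter counts come from the figures quoted in this article; the storage precisions (fp16, int8, int4) and the helper names are illustrative assumptions, not something Microsoft has specified here, and the estimate ignores activations, caches, and runtime overhead.

```python
# Parameter counts as quoted in the article, in billions.
PHI3_PARAMS_BILLIONS = {
    "Phi-3-mini": 3.8,
    "Phi-3-vision": 4.2,
    "Phi-3-small": 7.0,
    "Phi-3-medium": 14.0,
}

# ASSUMPTION: common storage precisions, in bytes per parameter.
# The article does not say which precisions the models ship in.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    (params_billions * 1e9 parameters) * bytes_per_param / 1e9 bytes per GB
    simplifies to params_billions * bytes_per_param.
    """
    return params_billions * bytes_per_param


def footprint_table() -> dict:
    """Hypothetical helper: weight memory per model and precision."""
    return {
        name: {
            precision: weight_memory_gb(count, size)
            for precision, size in BYTES_PER_PARAM.items()
        }
        for name, count in PHI3_PARAMS_BILLIONS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = ", ".join(f"{p}: {gb:.1f} GB" for p, gb in row.items())
        print(f"{name:<13} {cells}")
```

Under these assumptions, weight memory scales linearly with both parameter count and precision, so halving the bytes per parameter halves the footprint. Actual memory use on a given device will be higher than the weights-only figure.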

Luke Jones
Luke has been writing about all things tech for more than five years. He is following Microsoft closely to bring you the latest news about Windows, Office, Azure, Skype, HoloLens and all the rest of their products.
