Microsoft has officially made GPT-4 Turbo with Vision available to all Azure OpenAI Service customers, marking a step forward in the integration of advanced AI models into business processes. This development, as detailed in a recent blog post by the company, allows customers in the Sweden Central and East US 2 Azure OpenAI regions to deploy the “gpt-4-turbo-2024-04-09” model. This model is designed to enhance business operations by leveraging AI's power to understand and interpret images and text in a unified manner. The release comes after the integration in preview during last December.
Applications and Features
The deployment of GPT-4 Turbo with Vision has already seen a wide range of applications across various sectors. Retailers are using the model to improve online shopping experiences, while media and entertainment companies are utilizing it to manage digital assets more effectively. Additionally, the model aids various organizations in extracting insights from charts and diagrams, showcasing its versatility in processing visual information. Despite the absence of certain features from the public preview, such as Optical Character Recognition (OCR), object grounding, video prompts, and specific image data processing capabilities, Microsoft is committed to integrating these features in future updates. The forthcoming inclusion of “JSON mode and function calling for inference requests involving image (vision) inputs” promises to further enhance the model's utility.
GPT-4V introduces several key features designed to streamline the development process. Notably, it supports JSON mode and function calling, facilitating easier integration with existing codebases. The model maintains the impressive 128,000 tokens in the context window of its predecessor, GPT-4 Turbo, allowing for extensive data processing in a single request. Developers can now input images either through direct links or by passing base64 encoded images, expanding the model's utility in various applications.
Pricing and Future Developments
Microsoft has set the pricing for GPT-4 Turbo with Vision at $0.01 per 1,000 tokens for input and $0.03 per 1,000 tokens for output, with additional costs for enhanced features. This pricing strategy aims to make the technology accessible to a broad range of users, from startups to large enterprises, facilitating innovation and efficiency improvements across industries.