OpenAI has expanded its developer offerings by rolling out the full version of its o1 model through its API. This advanced reasoning model, which excels at complex, multi-step tasks, introduces new features that promise to reshape how developers build AI-powered applications.
The update is part of the “12 Days of OpenAI” series of announcements, where the company releases new features and updates for its major products.
Alongside the o1 model, OpenAI has also announced enhancements to its Realtime API for voice interactions and a new preference fine-tuning method, providing developers with unprecedented flexibility.
The o1-2024-12-17 model replaces the o1-preview version launched earlier this year. According to OpenAI, the updated model offers “more comprehensive and accurate responses, particularly for questions pertaining to programming and business, and is less likely to incorrectly refuse requests.”
These improvements, coupled with a 60% reduction in reasoning token usage, make the o1 model faster, more efficient, and more versatile.
Advancing Reasoning via API with the o1 Model
OpenAI’s o1 model is designed to tackle tasks requiring logical consistency and analytical depth, outperforming previous iterations on benchmarks such as SWE-Bench Verified and AIME.
OpenAI reports that accuracy on the LiveCodeBench coding benchmark rose from 52.3% to 76.6%, while performance on the AIME 2024 math competition jumped from 42.0% to 79.2%.
| Category | Eval | o1-2024-12-17 | o1-preview |
|---|---|---|---|
| General | GPQA diamond | 75.7 | 73.3 |
| General | MMLU (pass@1) | 91.8 | 90.8 |
| Coding | SWE-bench Verified | 48.9 | 41.3 |
| Coding | LiveCodeBench | 76.6 | 52.3 |
| Math | MATH (pass@1) | 96.4 | 85.5 |
| Math | AIME 2024 (pass@1) | 79.2 | 42.0 |
| Math | MGSM (pass@1) | 89.3 | 90.8 |
| Vision | MMMU (pass@1) | 77.3 | — |
| Vision | MathVista (pass@1) | 71.0 | — |
| Factuality | SimpleQA | 42.6 | 42.4 |
| Agents | TAU-bench (retail) | 73.5 | — |
| Agents | TAU-bench (airline) | 54.2 | — |
One standout feature is structured output support, allowing developers to generate responses in predefined formats such as JSON.
This ensures seamless integration with external systems like APIs and databases, making the model ideal for applications in customer support, logistics, and data analysis.
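As a minimal sketch of what structured output looks like in practice, the request below enforces a JSON Schema via the `response_format` field, following OpenAI's published Structured Outputs convention; the support-ticket schema, prompt, and field names are invented here for illustration only:

```python
import json

# Hypothetical support-ticket schema; field names are illustrative only.
ticket_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        "summary": {"type": "string"},
    },
    "required": ["category", "priority", "summary"],
    "additionalProperties": False,
}

# Request body in the shape used by the Chat Completions API with
# structured outputs; "strict": True asks the model to match the schema exactly.
request_body = {
    "model": "o1-2024-12-17",
    "messages": [
        {"role": "user", "content": "Classify this support email: 'My invoice is wrong.'"}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "support_ticket", "strict": True, "schema": ticket_schema},
    },
}

payload = json.dumps(request_body)
```

Because the model's reply is guaranteed to parse against the schema, downstream systems can consume it without defensive string handling.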
The model also introduces visual reasoning capabilities, enabling the analysis of images for tasks such as debugging or scientific research. For instance, developers can now input visual data, such as scanned documents or blueprints, and receive context-aware responses.
Additionally, a new “reasoning effort” parameter lets developers control how long the model spends on each task, balancing precision and efficiency.
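A rough sketch of how the parameter is used, assuming the documented `reasoning_effort` values of "low", "medium", and "high" (the prompt and helper function are illustrative):

```python
def build_request(prompt: str, effort: str = "medium") -> dict:
    """Build a chat request body with an explicit reasoning-effort setting."""
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    return {
        "model": "o1-2024-12-17",
        "reasoning_effort": effort,  # trade latency and token cost for deeper reasoning
        "messages": [{"role": "user", "content": prompt}],
    }

# A hard math question may justify the extra reasoning tokens of "high";
# a routine classification task could run at "low" instead.
body = build_request("Prove that the square root of 2 is irrational.", effort="high")
```

Lower effort settings return faster and spend fewer reasoning tokens, so the knob is effectively a per-request cost/quality dial.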
OpenAI explained in its blog, “We are rolling out access incrementally while working to expand access to additional usage tiers and ramping up rate limits.”
Enhancing Voice Interactions with Realtime API
OpenAI also made significant updates to its Realtime API, which powers real-time voice interactions. The addition of WebRTC (Web Real-Time Communication), a protocol for low-latency, peer-to-peer communication, lets developers build seamless voice applications such as virtual tutors, assistants, and translation tools without requiring additional plugins or software.
OpenAI highlighted the advantages of WebRTC, stating, “In scenarios where you would like to connect to a Realtime model from an insecure client over the network (like a web browser), we recommend using the WebRTC connection method. WebRTC is better equipped to handle variable connection states, and provides a number of convenient APIs for capturing user audio inputs and playing remote audio streams from the model.”
The implementation of WebRTC uses so-called ephemeral tokens, temporary API keys specifically designed for securely authenticating client-side applications when connecting to the OpenAI Realtime API over WebRTC. Their purpose is to ensure a safe, short-lived authentication mechanism that avoids exposing sensitive standard API keys directly in client environments like web browsers.
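The server-side half of that flow can be sketched as follows. The endpoint path, model name, and body fields follow OpenAI's Realtime API documentation but should be treated as illustrative; the request is built with the standard library and deliberately not sent, since actually minting a token requires a live API key:

```python
import json
import urllib.request

API_BASE = "https://api.openai.com/v1"

def build_session_request(standard_api_key: str) -> urllib.request.Request:
    """Build (but do not send) the server-side request that mints an
    ephemeral client token for a Realtime session."""
    body = json.dumps({
        "model": "gpt-4o-realtime-preview",  # illustrative model name
        "voice": "verse",                    # illustrative voice choice
    }).encode()
    return urllib.request.Request(
        f"{API_BASE}/realtime/sessions",
        data=body,
        headers={
            # The standard key never leaves the server.
            "Authorization": f"Bearer {standard_api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending this request returns a session object whose short-lived
# client secret is what the browser uses for the WebRTC handshake.
req = build_session_request("sk-...")
```

The browser then authenticates its WebRTC connection with the ephemeral token alone, so a compromised client never reveals the long-lived key.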
The upgrades to the Realtime API simplify the development process, reducing the code required for voice applications while improving audio quality and response accuracy. Developers can now build applications that begin formulating responses while users are still speaking, enhancing responsiveness.
Pricing adjustments make voice applications more accessible: GPT-4o audio tokens are now 60% cheaper, and cached input tokens cost 87.5% less. OpenAI has also brought GPT-4o mini to the Realtime API as a lower-cost option, priced at $10 per million input tokens.
Refining AI Behavior with Preference Fine-Tuning
Preference fine-tuning is a new customization method that allows developers to refine model behavior based on paired comparisons of responses. Unlike traditional fine-tuning, which relies on exact input-output pairs, preference fine-tuning teaches the model to distinguish between preferred and less desirable responses.
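Concretely, each training example pairs one input with a preferred and a non-preferred response. The record below follows the paired-comparison (DPO-style) format described in OpenAI's fine-tuning documentation; the prompt and response texts are invented for illustration:

```python
import json

# One paired-comparison training record; the wording is illustrative.
record = {
    "input": {
        "messages": [
            {"role": "user", "content": "Summarize our Q3 results for investors."}
        ]
    },
    "preferred_output": [
        {"role": "assistant",
         "content": "Revenue grew 12% quarter over quarter, driven by ..."}
    ],
    "non_preferred_output": [
        {"role": "assistant", "content": "Q3 was fine. Numbers went up."}
    ],
}

# Fine-tuning datasets are JSONL files: one such record per line.
jsonl_line = json.dumps(record)
```

Because the signal is relative (this answer over that one) rather than an exact target output, the method suits tasks where many phrasings are acceptable but some are clearly better.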
OpenAI describes this method as particularly effective for subjective tasks, such as tailoring tone and style in creative writing or ensuring compliance with specific formatting requirements. According to OpenAI, early adopters such as a financial analytics firm reported that preference fine-tuning improved response accuracy by 5% for complex, out-of-distribution queries.
“We started testing Preference Fine-Tuning with trusted partners who have seen promising results so far. For example, Rogo AI is building an AI assistant for financial analysts that breaks down complex queries into sub-queries.
Using their expert-built benchmark, Rogo-Golden, they found that while Supervised Fine-Tuning faced challenges with out-of-distribution query expansion—such as missing metrics like ARR for queries like “how fast is company X growing”—Preference Fine-Tuning resolved these issues, improving performance from 75% accuracy in the base model to over 80%.”
Expanding SDK Options for Developers
To support a broader range of programming environments, OpenAI has also introduced official SDKs for Go and Java, alongside its existing libraries for Python, Node.js, and .NET. These SDKs simplify integration, enabling developers to deploy AI models in scalable backend systems or enterprise applications.
The Go SDK is designed for lightweight and efficient server-side applications, while the Java SDK caters to enterprise-grade solutions, offering strong typing and robust support for large-scale projects. OpenAI’s documentation provides detailed guidance for leveraging these new tools.
Previous Announcements During the “12 Days of OpenAI”
On December 16, OpenAI made its ChatGPT live web search feature available to all users, allowing anyone to retrieve up-to-date information directly from the web.
December 14 brought Projects to ChatGPT, a new organizational feature that lets users group chats, files, and custom instructions into dedicated folders, creating an organized workspace for managing tasks and workflows.
As a huge improvement to its advanced voice mode for ChatGPT, OpenAI on December 12 added vision capabilities, enabling users to share live video and screens for real-time analysis and assistance.
On December 11, OpenAI fully released Canvas, a collaborative editing workspace that offers advanced tools for both text and code refinement. Initially launched in beta in October 2024, Canvas replaces ChatGPT’s standard interface with a split-screen design, allowing users to work on text or code while engaging in conversational exchanges with the AI.
The addition of Python execution is a standout feature of Canvas, enabling developers to write, test, and debug scripts directly within the platform. OpenAI demonstrated its utility during a live event by using Python to generate and refine data visualizations. OpenAI described the feature as “reducing friction between idea generation and implementation”.
On December 9, OpenAI officially launched Sora, its advanced AI tool for generating videos from text prompts, signaling a new era for creative AI. Integrated into paid ChatGPT accounts, Sora allows users to animate still images, extend existing videos, and merge scenes into cohesive narratives.
Released on December 7 was Reinforcement Fine-Tuning, a new framework designed to enable the customization of AI models for industry-specific applications. It is OpenAI’s latest approach to improving AI models by training them with developer-supplied datasets and grading systems. Unlike traditional supervised learning, which focuses on replicating desired outputs, reinforcement fine-tuning rewards the model for reasoning its way to correct answers.
On December 5, OpenAI unveiled ChatGPT Pro, a new premium subscription tier priced at $200 per month, aimed at professionals and enterprises seeking advanced AI capabilities for high-demand workflows.