OpenAI has rolled out the o1 model, opening a new chapter in AI development aimed at tackling more complex reasoning tasks. The release, which includes an o1-mini variant, aims to significantly improve problem-solving, particularly on coding and multi-step challenges. It is the first release built on the company's “Strawberry” work, which enhances reasoning in OpenAI models.
Revolution in Training Methodologies
The o1 model marks a notable shift from OpenAI's previous development strategies. Rather than relying solely on pattern recognition learned from vast datasets, the model uses a new optimization algorithm and a training dataset built specifically for it.
Trained with reinforcement learning, it works through queries using a method known as “chain of thought,” which strengthens its reasoning. Jerry Tworek, who leads research at OpenAI, notes that while accuracy improves, the problem of hallucinations, where the AI generates incorrect information, is not entirely resolved.
Access and Pricing Information
Both ChatGPT Plus and Team subscribers now have access to o1-preview and o1-mini, with plans for broader availability to Enterprise, Edu, and eventually free ChatGPT users, though the timeline remains uncertain. Developers face higher costs for access, with the o1 version priced at $15 per million input tokens and $60 per million output tokens, a rise from GPT-4o's fees of $5 and $15.
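At those rates, the per-request cost difference is straightforward to estimate. Below is a minimal sketch in Python using the per-million-token prices quoted above; the token counts in the example are illustrative, not measured.

```python
# Hedged sketch: estimate request cost from the published per-million-token rates.
# Token counts below are illustrative, not measured.

O1_INPUT_RATE = 15.00 / 1_000_000     # USD per input token
O1_OUTPUT_RATE = 60.00 / 1_000_000    # USD per output token
GPT4O_INPUT_RATE = 5.00 / 1_000_000
GPT4O_OUTPUT_RATE = 15.00 / 1_000_000

def request_cost(input_tokens: int, output_tokens: int, in_rate: float, out_rate: float) -> float:
    """Return the dollar cost of one request given token counts and rates."""
    return input_tokens * in_rate + output_tokens * out_rate

# Example: 2,000 input tokens and 8,000 output tokens. o1 bills its hidden
# reasoning as output, so output counts tend to exceed the visible answer.
print(f"o1:     ${request_cost(2_000, 8_000, O1_INPUT_RATE, O1_OUTPUT_RATE):.2f}")
print(f"GPT-4o: ${request_cost(2_000, 8_000, GPT4O_INPUT_RATE, GPT4O_OUTPUT_RATE):.2f}")
```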
The o1 model stands out in areas such as mathematics, where it can explain its problem-solving approach. According to Bob McGrew, OpenAI's chief research officer, the model solved 83 percent of problems on a qualifying exam for the International Mathematics Olympiad, a marked improvement over GPT-4o's 13 percent. However, o1 is less adept at tasks that require broad factual knowledge or involve browsing the web and handling files or images.
Technical Insights and Training Process
Internally referred to as “strawberry,” the o1-preview and o1-mini models trade cost and speed for stronger reasoning. Both rely on “chain of thought prompting,” an approach detailed in a 2022 paper on zero-shot reasoning in large language models, which helps with tasks that demand deeper thinking and revisiting earlier steps.
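In its zero-shot form, the technique amounts to appending an instruction that elicits intermediate steps before the final answer. The sketch below shows that prompting pattern; the trigger phrase and question are illustrative, and o1 itself performs this kind of step-by-step reasoning internally through training rather than via a prompt suffix.

```python
# Hedged sketch of zero-shot chain-of-thought prompting (2022-paper style).
# The trigger phrase and question are illustrative examples only.

def chain_of_thought_prompt(question: str) -> str:
    """Append a reasoning trigger so the model emits intermediate steps."""
    return f"Q: {question}\nA: Let's think step by step."

print(chain_of_thought_prompt(
    "A train travels 120 km in 1.5 hours. What is its average speed in km/h?"
))
```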
API access is restricted to tier 5 accounts with a minimum of $1,000 in API credit usage. The models do not support system prompts, streaming, batch processing, tool use, or image input. Response times vary considerably with task complexity, from seconds to minutes.
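In practice, a request to the preview models looks like an ordinary chat completion with the unsupported parameters left out. Here is a minimal sketch using the OpenAI Python SDK, assuming an API key in the environment and a tier-5 account; the prompt text is illustrative.

```python
# Hedged sketch: calling o1-preview with the OpenAI Python SDK.
# Assumes OPENAI_API_KEY is set and the account meets the tier-5 requirement.
from openai import OpenAI

client = OpenAI()

# Only user (and assistant) messages: system prompts, streaming, tools,
# and image inputs are not supported by these preview models.
response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {"role": "user", "content": "Plan a 3-step refactor of a 500-line function."}
    ],
)

print(response.choices[0].message.content)
```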
Reasoning tokens matter for cost: they are billed as output tokens even though they remain hidden from API responses. Output limits are raised to 32,768 tokens for o1-preview and 65,536 tokens for o1-mini. OpenAI keeps the detailed reasoning steps concealed to comply with policy, ensure user safety, and protect competitive interests, so intermediary steps that might contain sensitive data are never exposed.
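Because hidden reasoning is billed as output, it helps to budget for it explicitly and inspect the usage breakdown on each response. The sketch below assumes the max_completion_tokens parameter and the completion_tokens_details.reasoning_tokens usage field behave as OpenAI's o1 documentation describes; treat those names as assumptions if your SDK version differs.

```python
# Hedged sketch: budgeting for hidden reasoning tokens on an o1 request.
# Parameter and field names (max_completion_tokens, completion_tokens_details)
# are taken from OpenAI's o1 documentation; verify against your SDK version.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    # Reserve room for both reasoning and the visible answer, staying well
    # under the model's 65,536-token output ceiling.
    max_completion_tokens=20_000,
)

usage = response.usage
reasoning = usage.completion_tokens_details.reasoning_tokens
print(f"output tokens billed:        {usage.completion_tokens}")
print(f"  of which hidden reasoning: {reasoning}")
print(f"  of which visible answer:   {usage.completion_tokens - reasoning}")
```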
Ongoing AI Arms Race: Gemini Live Becomes Free
OpenAI rival Google has this week ramped up its own AI services by bringing the Gemini Live feature to Android users for free. Originally introduced with the Pixel 9, Gemini Live enables dynamic voice interactions where users can interject during voice responses.
Gemini Live can be started by tapping the circular waveform icon with a sparkle, located on the Gemini overlay or within the main application. This launches an immersive full-screen interface with “Hold” and “End” buttons. The tool keeps running in the background, so users can multitask or lock the screen without interrupting the conversation. Conversations are preserved as text transcripts in the app's history section for convenient access and review.