In a surprise incident this weekend, OpenAI exposed its upcoming o1 model, which promises advanced reasoning capabilities. Jimmy Apples, a notorious X user known for circulating insider leaks about OpenAI, drew widespread attention when he posted a link granting temporary access to this unreleased model.
“Seems legit. I am getting a o1 type model with attach file capability when I append [...],” Apples tweeted on November 2. He reported that he had successfully had the model analyze an image, and warned that the access might not last long once his post was public. This prompted a surge of interest from other enthusiasts eager to experience the model’s enhanced reasoning abilities firsthand.
Seems legit. I am getting a o1 type model with attach file capability when I append https://t.co/ZMeUCtAWmr
Successfully had it analyse an image.
Might not last long after this post. https://t.co/vXwhfHgxVr
— Jimmy Apples 🍎/acc (@apples_jimmy) November 2, 2024
The leaked o1 model showed improved performance, particularly in complex reasoning tests. One example came from SimpleBench, a benchmark used to evaluate language models. While the currently available o1-preview model struggled with certain questions, the leaked o1 model correctly answered a complex physics-related query, as shared by another X user, @legit_rumors. The question involved predicting the position of a purple ball relative to a blue one, a challenge that o1-preview failed but o1 handled without trouble.
SimpleBench uses multiple-choice questions to test how well AI models handle everyday reasoning; human test-takers achieve a baseline score of 83.7%. By comparison, o1-preview managed only 41.7%, and GPT-4o lagged far behind at 17.8%. The o1 model’s more accurate reasoning marked a significant step forward, even though its temporary availability was cut short when OpenAI closed the loophole. By late November 2, 2024, attempts to access o1 through the leaked URL were redirected, and previously open chats with o1 began displaying error messages.
@legit_rumors says he was also able to confirm that the model was o1 and not o1-preview, and shared a screenshot as evidence.
He also tested the leaked model’s image processing and analysis, which worked as well. The o1 model accepted an image upload and then used reasoning to analyze it, “thinking” about it for 7 seconds. Other users joining the discussion on X reproduced this result.
After a while, OpenAI appears to have closed the leak. Accessing o1 via the URL parameter no longer works and instead redirects to another model. Chats with o1 that were already open stopped working and began producing error messages.
Background: OpenAI’s o1 Model and Its “Strawberry” Framework
The abrupt leak comes just months after OpenAI launched its o1-preview model, the first AI model under its new “Strawberry” AI framework. The official release was a pivotal moment for OpenAI, introducing a more refined approach to reasoning and problem-solving.
The o1 model represents a strategic shift from traditional training methodologies, moving away from simple pattern recognition toward more deliberate, step-by-step reasoning. It was trained with a specialized optimization algorithm and reinforcement learning to work through problems using a “chain of thought,” which helps the model systematically break down complex queries. Jerry Tworek, who leads the research team, admitted that while the model shows marked improvements in reasoning, the persistent issue of “hallucinations” (instances of AI fabricating false information) remains unresolved.
At the time of the official release, OpenAI touted significant advances in areas like mathematics. Bob McGrew, OpenAI’s Chief Research Officer, reported that o1 solved 83% of the problems in a qualifying exam for the International Mathematical Olympiad, a considerable improvement over GPT-4o’s 13% success rate. However, the model struggles when broader factual knowledge is needed, and it still faces limitations in image processing compared with multimodal models like GPT-4o.
API Access and Tiered Pricing
Access to the o1-preview model, initially made available to ChatGPT Plus and Team subscribers, comes at a high price. Developers pay $15 per million input tokens and $60 per million output tokens, substantially more than the rates for GPT-4o, and API access remains exclusive to tier 5 accounts, which require at least $1,000 in API credits.
Moreover, certain features, such as image input and reasoning token visibility, remain restricted. OpenAI’s decision to hide reasoning tokens, even though they are billed, is part of a broader policy to protect user data, ensure safety, and maintain a competitive edge.
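To make the pricing concrete, here is a rough back-of-the-envelope sketch in Python using the per-token rates quoted above. The function name and the sample token counts are hypothetical, and the sketch assumes that hidden reasoning tokens are billed at the output-token rate; treat it as an illustration rather than an official billing formula.

```python
# Rates quoted in the article for o1-preview (USD per one million tokens).
INPUT_USD_PER_MILLION = 15.0
OUTPUT_USD_PER_MILLION = 60.0


def estimate_cost_usd(input_tokens: int,
                      visible_output_tokens: int,
                      reasoning_tokens: int = 0) -> float:
    """Estimate the cost of a single request in USD.

    Assumption: hidden reasoning tokens are billed like output tokens,
    even though they are not shown to the caller.
    """
    billed_output = visible_output_tokens + reasoning_tokens
    total = (input_tokens * INPUT_USD_PER_MILLION
             + billed_output * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens, 500 visible answer tokens,
    # and 3,000 hidden reasoning tokens.
    with_reasoning = estimate_cost_usd(2_000, 500, 3_000)
    without_reasoning = estimate_cost_usd(2_000, 500)
    print(f"with reasoning tokens:    ${with_reasoning:.4f}")
    print(f"without reasoning tokens: ${without_reasoning:.4f}")
```

Under these assumptions, the hidden reasoning tokens can account for most of the bill for a request, which is why their invisibility matters to developers budgeting API usage.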
Last Updated on November 7, 2024 2:13 pm CET