OpenAI Launches Its “Strawberry” o1 AI Model

Last week, OpenAI launched its much-rumored “strawberry” AI model o1. Designed to excel in complex reasoning, problem-solving, and STEM fields like coding and mathematics, the o1 AI model represents a huge advancement from previous models, such as GPT-4 and GPT-4o. Unlike previous models that focus on speed, OpenAI’s o1 takes more time to think through its responses, which enhances its ability to solve harder problems.

This new series of “reasoning” models is currently available in two versions: o1-preview and o1-mini.

Let’s look at the features and capabilities of OpenAI’s new model and how it compares with other AI models.

Advanced Reasoning through “Chain of Thought”

The o1 model is built around “chain-of-thought” reasoning. With this method, the model processes a query step by step, breaking a complex problem down into smaller parts, much as a person works through a challenge.

By mimicking human-like thought processes, o1 excels at tasks requiring intricate reasoning, such as multistep problem-solving in mathematics, scientific research, and coding.

In OpenAI’s testing, o1 outperformed GPT-4o by a significant margin, particularly in academic fields. For example, it correctly solved 83% of the problems on a qualifying exam for the International Mathematical Olympiad (IMO), compared to GPT-4o’s 13%.

Improved Hallucination Mitigation

One of the main challenges with AI models like GPT-4 has been “hallucinations”, i.e., generating false or unsupported information. OpenAI has addressed this in the o1 models through their enhanced reasoning process. By taking more time to think through problems step by step, o1 reduces the likelihood of providing inaccurate or misleading information.

However, the issue is not entirely resolved. OpenAI has acknowledged that while hallucinations have decreased, they still occur, and the team continues to work on mitigating this problem in its future updates.

Performance in STEM and Beyond

The o1 model shines in science, technology, engineering, and mathematics (STEM) applications. For example, o1 reached the 89th percentile on Codeforces, a competitive programming platform, showcasing its coding abilities.

Moreover, OpenAI trained the o1 models on a diverse range of public, proprietary, and custom datasets, giving these models a deep understanding of both general knowledge and specialized fields.

Safety and Alignment: A Top Priority

With the o1 AI model, OpenAI has made significant strides in improving the safety and ethical alignment of its models. The model’s reasoning capabilities extend to following safety guidelines more rigorously. By reasoning through context and applying ethical considerations, o1 is significantly less vulnerable to “jailbreaking” attempts, where users try to bypass safety measures. In one of OpenAI’s hardest jailbreaking tests, o1-preview scored 84 out of 100, far surpassing GPT-4o’s score of 22.

Additionally, OpenAI has partnered with AI safety institutes in the U.S. and U.K. to conduct careful testing and evaluations of o1, with the aim of holding it to high safety and alignment standards.

Cost and Accessibility

While o1-preview offers advanced capabilities, it comes with a higher price tag. The o1-preview API costs $15 per 1 million input tokens and $60 per 1 million output tokens, which is significantly more expensive than GPT-4o. This makes it best suited to enterprise-level users or those who require advanced reasoning for specific high-stakes projects.
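To make these rates concrete, here is a minimal Python sketch that estimates the cost of a single o1-preview API request from the prices quoted above. The function name and the token counts are hypothetical and chosen only for illustration; real usage may differ, for instance because the model's hidden reasoning steps are billed as output tokens.

```python
# Rates quoted in this article for o1-preview, in USD per 1 million tokens.
O1_PREVIEW_INPUT_USD_PER_MILLION = 15.0
O1_PREVIEW_OUTPUT_USD_PER_MILLION = 60.0


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_rate: float = O1_PREVIEW_INPUT_USD_PER_MILLION,
                      output_rate: float = O1_PREVIEW_OUTPUT_USD_PER_MILLION) -> float:
    """Return the estimated cost in USD of one request at per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical request: 2,000 input tokens and 1,000 output tokens.
example_cost = estimate_cost_usd(2_000, 1_000)
print(f"Estimated cost: ${example_cost:.2f}")
```

At these rates the output side dominates: in the hypothetical request above, 1,000 output tokens cost twice as much as 2,000 input tokens.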

On the other hand, the o1-mini model offers a more budget-friendly alternative. At 80% lower cost than o1-preview, it remains highly effective for coding and reasoning tasks, making it accessible to educational institutions, startups, and smaller businesses.
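As a rough illustration of what “80% lower cost” implies, the Python sketch below applies that discount to the o1-preview rates quoted in this article ($15 and $60 per 1 million tokens). The resulting o1-mini figures are derived from the article's numbers rather than taken from a price list, and the helper name is hypothetical.

```python
# o1-preview rates quoted in this article, in USD per 1 million tokens.
O1_PREVIEW_RATES_USD = {"input": 15.0, "output": 60.0}

# Discount stated in the article for o1-mini relative to o1-preview.
O1_MINI_DISCOUNT_PERCENT = 80


def apply_discount(rate_usd: float, discount_percent: int) -> float:
    """Return the rate left after removing discount_percent of it."""
    return rate_usd * (100 - discount_percent) / 100


# Implied o1-mini rates under the assumption that the 80% figure
# applies equally to input and output tokens.
implied_o1_mini_rates = {
    kind: apply_discount(rate, O1_MINI_DISCOUNT_PERCENT)
    for kind, rate in O1_PREVIEW_RATES_USD.items()
}

for kind, rate in implied_o1_mini_rates.items():
    print(f"Implied o1-mini {kind} rate: ${rate:.2f} per 1M tokens")
```

Under that assumption, a workload that would cost $100 on o1-preview would cost about $20 on o1-mini.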

What’s Next for the o1 AI Model?

The release of the o1 AI model is just the beginning. OpenAI plans to continuously update and improve the o1 series, with browsing, file and image uploads, and other advanced features on the horizon. Future updates will also aim to further refine the model’s reasoning abilities, reduce hallucinations, and strengthen its safety and ethical alignment.

OpenAI said in its announcement: “In our tests, the next model update performs similarly to PhD students on challenging benchmark tasks in physics, chemistry, and biology.”

OpenAI’s o1-preview and o1-mini are accessible to ChatGPT Plus and Team users. The company plans to make o1-mini available to all ChatGPT Free users soon.

As OpenAI continues to refine and expand the o1 series, it is setting the stage for the next generation of AI models, moving ever closer to the goal of human-like intelligence. While o1 may be slower and more expensive than previous models, its ability to reason through problems step by step makes it a powerful tool for anyone needing deep, accurate reasoning.