OpenAI’s new supermodel: Strawberry becomes o1

It’s here: OpenAI’s secret Project Strawberry is now officially available as a preview under the name o1. The model scores particularly well on coding and mathematics questions. Selected groups of users can already test o1.

In July 2024, internal sources and documents revealed that OpenAI was working on a new project code-named Strawberry, intended to improve the reasoning ability of its AI models. At the beginning of September, reports citing sources from company circles said the launch was planned for this fall. The time has now come: OpenAI has officially announced the new OpenAI o1 model series.

The o1 and o1-mini models are initially available as a preview and take a little more time to “think” before users receive an answer. In return, they can solve complex tasks and problems, especially in science, mathematics and coding. Their improved reasoning ability also allows them to try different problem-solving strategies and learn from mistakes, as OpenAI explains in its blog post:

We trained these models to spend more time thinking through problems before they respond, much like a person would. Through training, they learn to refine their thinking process, try different strategies, and recognize their mistakes.

The preview versions of o1 and o1-mini are available today for ChatGPT Plus and Team users. In ChatGPT, you can easily switch between GPT-4o, o1 and o1-mini, although o1 is limited to 30 messages per week and o1-mini to 50 messages per week. OpenAI intends to raise these limits in the future; ChatGPT will also be able to choose between the different models on its own, depending on the use case. It is also encouraging that OpenAI plans to release o1-mini for users of the free version.

Can the model keep up with math and coding professionals?

Initial tests have already shown o1’s impressive capabilities. The model performs as well as doctoral students on various tasks in physics, chemistry and biology. In a qualifying exam for the International Mathematical Olympiad (IMO), o1 achieved a score of 83 percent, compared with 13 percent for GPT-4o. The model is also impressive in coding, reaching the 89th percentile in Codeforces competitions. o1 also has no difficulty with complex puzzles.

o1-mini: faster and cheaper

Not only in scientific questions and puzzles, but also in coding – for example, in the development of video games – o1 achieves better results than the previous models. The reason: The model thinks before it answers.

OpenAI explains that the o1-mini model can be a good and often more sensible alternative to o1, particularly for coding. This is because o1-mini works faster and is also 80 percent cheaper than o1-preview. For programmers who want a model with reasoning skills but do not need it to have broad general knowledge, o1-mini may be the better choice.

How many “R”s are in the word “Strawberry”?

Although the Strawberry project has since been renamed o1, the model can answer a question that is often problematic for Large Language Models (LLMs) and that may also explain its code name: How many “R”s are there in the word “Strawberry”? While GPT-4o incorrectly answers this question with “two”, o1 correctly recognizes that the word contains three Rs.
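
For comparison, the same question is trivial for conventional code, which works on individual characters; the difficulty for LLMs is commonly attributed to their processing text in larger chunks (tokens) rather than letter by letter. A minimal Python sketch, in which the helper name `count_letter` is purely illustrative and not taken from OpenAI:

```python
def count_letter(word: str, letter: str) -> int:
    """Count how often a letter occurs in a word, ignoring case."""
    # Lowercase both sides so that "R" and "r" are treated the same.
    return word.lower().count(letter.lower())


if __name__ == "__main__":
    # Hypothetical usage: the question from the article.
    print(count_letter("Strawberry", "r"))  # prints 3
```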

In the blog post, OpenAI also discusses specific use cases where the use of o1 can be useful. These include the annotation of cell sequencing data by health researchers, the generation of complex mathematical formulas for quantum optics by physicists, and the creation and execution of multi-step workflows by developers.

Despite its impressive capabilities, the model does have some limitations. For example, o1 and o1-mini currently cannot search the Internet for information or accept uploaded files and images. Until these features are available for the new models, many users will probably be better off using GPT-4o for everyday tasks. But when it comes to reasoning, OpenAI is setting new standards with o1 and is providing a potential AI game changer for numerous use cases.



Source: onlinemarketing.de