GPT-3.5 Turbo, currently available to anyone free of charge on the ChatGPT website, is being replaced by GPT-4o mini.
OpenAI has announced that GPT-4o mini, a new, smaller version of its latest GPT-4o model, will replace GPT-3.5 Turbo in ChatGPT. The updated model is available today for free users and those with ChatGPT Plus or Team subscriptions, and will arrive next week for ChatGPT Enterprise users. Like its big brother launched in May, GPT-4o mini is multimodal, meaning it can interpret images, text and audio, and it will also be able to generate images. Image input is already enabled in the API.
GPT-4o mini supports a 128K-token input context and has training data up to October 2023. (Tokens are bits of data, roughly syllables, that language models use to process information.) It is also very cheap as an API product, costing 60 percent less than GPT-3.5 Turbo: 15 cents per million input tokens and 60 cents per million output tokens. According to OpenAI, GPT-4o mini will be the company’s first AI model to use a new technique called “instruction hierarchy,” which causes the model to prioritize some instructions over others. This can make it more difficult for people to perform prompt injection attacks or jailbreaks, or other actions that subvert built-in fine-tuning or system prompt instructions.
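At those rates, per-request costs are easy to estimate. A minimal sketch in Python, using the per-million-token prices quoted above (the token counts in the example are made up for illustration):

```python
# GPT-4o mini API prices quoted by OpenAI (USD per million tokens).
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60

def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request at GPT-4o mini rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# A full 128K-token input context plus a 1,000-token reply:
print(round(api_cost(128_000, 1_000), 6))  # → 0.0198
```

Even a request that fills the entire context window costs around two cents, which is what makes the model attractive for high-volume API use.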
According to OpenAI, GPT-4o mini performs well on a number of benchmarks such as MMLU (general knowledge) and HumanEval (coding), but the problem is that these benchmarks don’t really mean much in terms of the models’ actual practical use. That’s because the perceived quality of a model’s output has more to do with style and structure than with raw factuality or mathematical ability. This kind of subjectivity is one of the most frustrating things about artificial intelligence right now.
All that said, according to OpenAI the new model outperformed last year’s GPT-4 Turbo on the LMSYS Chatbot Arena leaderboard, which aggregates user ratings from randomized head-to-head comparisons between models. But even this metric isn’t as useful as the AI community had hoped, because people have noticed that while the mini’s big brother, GPT-4o, regularly beats GPT-4 Turbo in the Chatbot Arena, it tends to underperform at producing genuinely useful outputs (for example, giving overly long answers or performing tasks it wasn’t even asked to do).
OpenAI is not the first company to release a smaller version of an existing language model; this is common practice in the AI industry among vendors such as Meta, Google, and Anthropic. These smaller language models are designed to perform simpler tasks at lower cost, such as creating lists, summarizing, or suggesting words, rather than performing in-depth analysis. Smaller models are typically aimed at API users who pay per input and output token to use the models in their own applications, but in this case GPT-4o mini will also save OpenAI money as part of the free tier of ChatGPT.
Smaller large language models (LLMs) tend to have fewer parameters, the numerical values in a neural network that store what it has learned. Fewer parameters mean a smaller neural network, which generally limits the depth of an AI model’s ability to interpret context. Models with more parameters can typically “think deeper” because they store a greater number of relationships between concepts in those numerical values.
To complicate matters, however, there is not always a direct correlation between parameter count and capability. The quality of the training data, the efficiency of the model’s architecture, and the training process itself all affect a model’s performance, as seen recently in surprisingly capable small models like Microsoft’s Phi-3. Fewer parameters mean less computation to run the model, which means either less powerful (and cheaper) GPUs are required, or less computation is needed on existing hardware, resulting in a lower power bill and therefore a lower cost to the user.
Source: sg.hu