Mistral AI says ML2 supports a 128K context window and dozens of languages, including French, German, Spanish, Arabic, Chinese, Japanese, and Korean. It also supports more than 80 programming languages, including Python, Java, C, C++, JavaScript, and Bash.
The release follows Meta’s announcement of the Llama 3.1 family, which includes the company’s most advanced model to date, a 405-billion-parameter model. Llama 3.1, Meta said, supports a 128K context length and eight languages. OpenAI also recently released its most affordable small AI model, GPT-4o Mini.
Mistral AI claims that its benchmarking results show that ML2 performs on par with leading models such as GPT-4o, Claude 3 Opus, and Llama 3.1 405B in areas such as coding and reasoning. On the popular MMLU benchmark, ML2 scored 84%, Llama 3.1 405B scored 88.6%, GPT-4o scored 88.8%, and GPT-4o Mini scored 82%. Mistral AI models are available on Vertex AI, Azure AI Studio, Amazon Bedrock, and IBM watsonx.ai.
Key features of ML2
Many experts say the race in AI is shifting toward conversational and multimodal models that excel at complex mathematics, advanced reasoning, and efficient code generation.
According to Neil Shah, co-founder of Counterpoint Research, leading AI companies such as Mistral AI are focusing on minimizing errors, enhancing inference capabilities, and optimizing model performance for scale.
“ML2 excels in its ability to pack more performance into its size, requiring only 246GB of memory at up to 16-bit precision for training,” Shah said. “Mistral Large 2 benefits enterprises by offering higher precision in a smaller footprint than competing offerings. It can produce more accurate, concise, and contextual responses faster than larger models that require more memory and compute.”
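As a rough sanity check on the 246GB figure, a model’s weight footprint at 16-bit precision is simply its parameter count times two bytes. Assuming ML2’s publicly reported parameter count of roughly 123 billion (a figure not stated in this article), the arithmetic lines up:

```python
# Rough sanity check on the 246GB memory figure quoted above.
# Assumption: Mistral Large 2 has ~123 billion parameters (Mistral AI's
# reported figure, not stated in this article).
params = 123e9          # parameter count
bytes_per_param = 2     # 16-bit (FP16/BF16) precision = 2 bytes per weight
weight_gb = params * bytes_per_param / 1e9
print(f"{weight_gb:.0f} GB")  # prints "246 GB"
```

Note that this covers only the model weights; activations, the KV cache, and (for training) optimizer state add substantially more on top.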
Additionally, enterprises that rely heavily on Java, TypeScript or C++ will benefit from the superior code generation performance and accuracy that Mistral AI’s benchmarks claim, Shah added.
Faisal Kausa, senior analyst at Techarc, explained that using an open-source model like Mistral AI’s could allow for the creation of specialized LLMs tailored to specific industries or regions.
He continued, “Over time, these kinds of specialized LLMs will emerge. Generative AI is useful, but in many cases it requires domain-specific understanding that can only be built into an LLM during its creation. So it’s important not only to provide an LLM for general use, but also to have an open-source platform that can be adapted and extended to create these specialized models.”
Charlie Dye, Forrester principal analyst, said ML2’s advanced capabilities in code generation, math, and reasoning, its performance and cost-effectiveness (it is designed to run efficiently on a single H100 node), its multilingual support, and its broad availability on major cloud platforms will significantly enhance the competitiveness of enterprises’ AI initiatives.
License and other issues
One potential concern for users is that Mistral AI releases ML2 under the Mistral Research License, which permits use and modification only for research and non-commercial purposes. Any commercial use requires a separate commercial license.
Shah noted, “Mistral AI would have incurred significant data and training costs for ML2, so it has naturally restricted unlicensed use to non-commercial purposes and required a strict commercial license, which could drive up prices. In certain regions, such as emerging markets, this could be a stumbling block.”
Prabhu Ram, vice president of the industry research group at Cybermedia Research, added that while Mistral AI shows promise and potential, some concerns remain. Data transparency, model interpretability, and bias risk are still important areas for improvement, he noted.
editor@itworld.co.kr
Source: www.itworld.co.kr