Updated Claude 3.5 Sonnet Model Challenges GPT-4o and Gemini 1.5 Pro

Anthropic has announced and released the updated Claude 3.5 Sonnet model, along with the new Claude 3.5 Haiku model. The updated Claude 3.5 Sonnet offers across-the-board improvements, with particularly significant gains in coding. Claude 3.5 Haiku is Anthropic’s answer to OpenAI’s GPT-4o Mini and Google’s Gemini 1.5 Flash. According to the company, it will be priced at the same level as its predecessor while delivering significantly better performance.

Claude 3.5 Sonnet Improvements

  • Its SWE-bench Verified score rose from 33.4% to 49.0%, which Anthropic reports as the highest score of any publicly available model.
  • Its TAU-bench score rose from 62.6% to 69.2% in the retail domain and from 36.0% to 46.0% in the airline domain.
  • Its GPQA and MMLU Pro scores improved to 65% and 78% respectively, both higher than those of Gemini 1.5 Pro.
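
To put the coding and agent figures above in perspective, the short Python sketch below computes the gain in percentage points and the relative gain for each before/after pair quoted in the list. The helper name `gains` and the dictionary layout are illustrative assumptions, not anything published by Anthropic; only the scores themselves come from the article.

```python
# Scores in percent as quoted in the list above: (before, after).
scores = {
    "SWE-bench Verified": (33.4, 49.0),
    "TAU-bench retail": (62.6, 69.2),
    "TAU-bench airline": (36.0, 46.0),
}


def gains(before, after):
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    points = after - before
    relative = points / before * 100
    return round(points, 1), round(relative, 1)


for name, (before, after) in scores.items():
    points, relative = gains(before, after)
    print(f"{name}: +{points} points ({relative}% relative)")
```

The distinction matters when reading such announcements: a 15.6-point rise on SWE-bench Verified corresponds to a relative improvement of roughly 47% over the earlier score.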

What Does Claude 3.5 Haiku Offer?

The new Claude 3.5 Haiku model outperforms Claude 3 Opus, the largest model in Anthropic’s previous generation, on many AI benchmarks. It scores 40.6% on SWE-bench Verified, beating the original Claude 3.5 Sonnet and OpenAI GPT-4 Turbo. Claude 3.5 Haiku will initially be available as a text-only model, with image input to be added later.

Anthropic also emphasized that the US AI Safety Institute (US AISI) and the UK AI Safety Institute (UK AISI) jointly conducted pre-deployment testing of the new Claude 3.5 Sonnet model, as part of the agreement signed earlier this year. Under Anthropic’s Responsible Scaling Policy, the updated Claude 3.5 Sonnet model falls under the ASL-2 Standard.

The updated Claude 3.5 Sonnet is now available to all developers at the same price through the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI. The new Claude 3.5 Haiku model will be available later this month.

Source: www.technopat.net