Skip to main content

GPT-4o and Gemini 1.5 Pro just got beat in the AI race

a screenshot of claude 3.5 sonnet, with an 8-bit crab
Anthropic

There’s a new leader, technically, in the race for AI assistant dominance, and it’s Anthropic’s new Claude 3.5 Sonnet. The newly released model outperforms both Gemini 1.5 Pro and ChatGPT-4o across a spectrum of benchmark tests, the company announced on Thursday.

Recommended Videos

This new iteration of Sonnet is the first in Anthropic’s upcoming line of 3.5 models, and it significantly outperforms the more expansive Opus 3.0 model, and does so at a fraction of the larger model’s energy cost. Compute efficiency is becoming an increasingly important aspect of AI system design, especially as the cost of both powering and cooling AI data centers soars while the infrastructure pushes into the gigawatt range.

Claude 3.5 Sonnet for vision

“Claude 3.5 Sonnet operates at twice the speed of Claude 3 Opus,” the Anthropic team wrote in a blog post. “This performance boost, combined with cost-effective pricing, makes Claude 3.5 Sonnet ideal for complex tasks such as context-sensitive customer support and orchestrating multistep workflows.”

The new model has reportedly set benchmark results across three standardized tests: graduate-level reasoning with GPQA, undergraduate-level knowledge with MMLU, and coding proficiency with HumanEval. It beat out Google’s Gemini 1.5 Pro, Meta’s Llama-400b, and OpenAI’s ChatGPT-4o, though not by any huge margin and typically only by a couple percentage points.

A table showing Claude 3.5 Sonnet's performance compared to other leading AI systems.
Anthropic

Sonnet 3.5 is being billed as Anthropic’s “strongest vision model yet. ” It’s capable of performing a number of vision-based tasks — like interpreting charts and graphs or transcribing text from imperfect image sources like screenshots or scanned receipts — more accurately than Opus 3.0. In fact, Sonnet 3.5 beat out Opus 3.0 by anywhere from 6 to 17 points across industry standard vision benchmarks. The new model is also reportedly much more competent at handling humor and can converse in a much more lifelike manner.

Sonnet will also be the first Anthropic AI to offer the Artifacts feature to users. Rather than generate images or code snippets directly into the flow of the conversation, Artifacts will create that content in a dedicated space to the side of the chat. This allows users to create “a dynamic workspace where they can see, edit, and build upon Claude’s creations in real time, seamlessly integrating AI-generated content into their projects and workflows,” the Anthropic team claims. It also announced that Claude will soon support team collaboration wherein a company can store its data, documents and projects in a single, central silo, with Claude acting as an on-demand assistant.

You can try out Claude 3.5 Sonnet today for free on the Claude.ai website and the Claude iOS app (a Claude Pro or Team subscription will garner you significantly higher rate limits). Third-party integration is also available through the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Claude Haiku 3.5 and Opus 3.5 are scheduled for release later in the year.

Andrew Tarantola
Former Digital Trends Contributor
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…
The Microsoft AI CEO just dropped a huge hint about GPT-5
A photo of Mustafa Suleyman.

The timeline on GPT-5 continues to be a moving target, but a recent interview with Microsoft AI CEO Mustafa Suleyman sheds some light on what GPT-5 and even what its successor will be like.

Mustafa Suleyman on Defining Intelligence

Read more
GPT-4o: What the latest ChatGPT update can do and when you can get it
OpenAI developer using GPT-4o.

GPT-4o is the latest and greatest large language model (LLM) AI released by OpenAI, and it brings with it heaps of new features for free and paid users alike. It's a multimodal AI that enhances ChatGPT with faster responses, greater comprehension, and a number of new abilities that will continue to roll out in the weeks to come.

With increasing competition from Meta's Llama 3 and Google Gemini, OpenAI's latest release is looking to stay ahead of the game. Here's why it's so exciting.
Availability and price
If you've been using the free version of ChatGPT for a while and jealously eyed the features that ChatGPT Plus users have been enjoying, there's great news! You too can now play around with image detection, file uploads, find custom GPTs in the GPT Store, utilize Memory to retain your conversation as you chat so that you don't need to repeat yourself, and analyze data and perform complicated calculations.

Read more
Google Gemini vs. GPT-4: Which is the best AI?
A person typing on a laptop that is showing the ChatGPT generative AI website.

Google Gemini and OpenAI's ChatGPT that uses the GPT-4 model are two of the most advanced artificial intelligence (AI) solutions available today. They can comprehend and interact with text, images, video, audio, and code, as well as output various alterations of each. they also provide expertise that would cost a lot to replicate with an expert human.

But if you're weighing which tool to put your time and energies into learning how to use, you want to pick the best one. Which is the more capable AI tool? Gemini or GPT-4?
Availability and pricing
Gemini is available in Pro and Nano form, though Ultra has yet to be released. Google

Read more