Anthropic: Claude 3.5 Sonnet

This model is deprecated

Claude 3.5 Sonnet was deprecated on Oct 22, 2025 and can no longer be run here. Its evaluation results and details remain available for reference. Try Claude Sonnet 4.6 instead.

Claude 3.5 Sonnet Overview

Claude 3.5 Sonnet, introduced by Anthropic in June 2024 and upgraded in October 2024, is the mid-tier model of the Claude 3.5 family, designed to balance speed, cost, and advanced reasoning. It is multimodal, handling text and images, and the upgraded version introduced a “computer use” beta feature, allowing the model to interact with a desktop environment for tasks like cursor control and typing. With a 200,000-token context window, it supports long-context reasoning, coding, and multi-step workflows.

Sonnet is positioned as a cost-effective alternative to higher-end models, priced at $3 per million input tokens and $15 per million output tokens. It performs well on benchmarks relative to the Claude 3 family, including the earlier Claude 3 Opus, and its upgrades enhanced tool use, coding, and agentic reasoning. In August 2025, Anthropic announced that Claude 3.5 Sonnet would be retired on October 22, 2025, with Claude Sonnet 4 recommended as its successor. Common use cases include enterprise-scale assistants, research workflows, and productivity automation.
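
As a rough illustration of the pricing quoted above, the following Python sketch estimates the cost of a single request. It assumes only the two list prices stated in the text ($3 and $15 per million tokens); the function name `estimate_cost_usd` and the example token counts are hypothetical, and real bills may differ because of factors this sketch ignores, such as caching or batch discounts.

```python
# Prices taken from the text above; everything else here is illustrative.
INPUT_USD_PER_MTOK = 3.0    # USD per million input tokens
OUTPUT_USD_PER_MTOK = 15.0  # USD per million output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one request in US dollars."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: a prompt that fills the 200,000-token
    # context window, answered with a 4,000-token reply.
    cost = estimate_cost_usd(200_000, 4_000)
    print(f"Estimated cost: ${cost:.2f}")  # prints "Estimated cost: $0.66"
```

Because output tokens cost five times as much as input tokens, long generated replies affect the total more than their token count alone suggests.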

Claude 3.5 Sonnet Details & Performance

Details

Vision Tasks

Vision Language, Object Detection, Classification, OCR, Visual Question Answering, Captioning

Features

Foundation Vision, LLMs with Vision Capabilities, Multimodal Vision

Usage

Past 30 days: Not available

Alternatives to Claude 3.5 Sonnet

Other models worth comparing for similar use cases.

Anthropic
Claude Sonnet 4.6
Claude Sonnet 4.6 is Anthropic's mid-tier large language model, released February 17, 2026, designed to balance performance, cost, and versatility for professional and developer use. It supports text and vision-based tasks with advanced reasoning, agentic capabilities, and Adaptive Thinking, a mode in which the model dynamically scales its internal reasoning depth. A beta context window of up to 1,000,000 tokens (200K standard) enables processing of entire codebases or document collections in a single request. Parameters are undisclosed. Optimized for coding, computer use, long-context reasoning, agent planning, and knowledge work, Sonnet 4.6 delivers a full generational upgrade over Sonnet 4.5 and approaches Opus 4.5-level performance across many benchmarks at a fraction of the cost. It is the default model on Claude.ai and Claude Cowork, and is available via API and major cloud platforms, making it well suited for production workloads requiring strong reasoning without flagship pricing.
Anthropic
Claude Sonnet 4.5
Claude Sonnet 4.5, released by Anthropic in September 2025, was at release the company’s most advanced Sonnet-series model, built for high-performance reasoning, coding, and long-horizon agentic workflows. It is a multimodal system that accepts both text and images, with a 200,000-token context window designed for handling large documents and extended interactions. Anthropic highlights its improvements in reliability, reduced sycophancy, and alignment, making it suitable for sustained enterprise use. The model delivers strong results in coding and autonomous workflows, achieving 61.4% on the OSWorld benchmark and leading performance on SWE-bench Verified. It introduces infrastructure features such as a memory tool (beta), checkpointing for Claude Code, parallel tool use, and tighter integration with VS Code. Compared to Opus, which targets broader reasoning, Sonnet 4.5 is optimized for structured, long-duration tasks. Positioned against leading offerings from OpenAI and Google, it is aimed at enterprise automation, software engineering, and research-intensive applications.
Anthropic
Claude Sonnet 4
Claude Sonnet 4, released by Anthropic in May 2025, is the mid-tier model in the Claude 4 family, designed to balance capability, cost, and speed. It is multimodal, accepting both text and images, and extends beyond prior versions with improved “computer use” support, allowing API-driven interaction with desktop-like interfaces. By default, it supports 200,000 tokens of context, but as of August 2025, it also offers a 1 million-token context window in public beta, making it one of the most context-capable models available for processing entire codebases or large document sets in a single request. Sonnet 4 is significantly cheaper than the flagship Opus while still demonstrating strong reasoning, coding, and instruction-following ability with reduced hallucinations. Its extended context capabilities and lower latency make it well-suited for enterprise-scale knowledge management, software development, research assistants, and productivity automation where both cost efficiency and high reliability are essential.
Google
Gemini 2.5 Flash
Gemini 2.5 Flash, released on June 17, 2025, is Google DeepMind’s production-ready, efficiency-focused model in the Gemini 2.5 family. It is multimodal, accepting text, images, video, and audio as inputs, with text as the primary output format. The model supports 1 million input tokens and up to 65K output tokens, enabling it to process very large contexts such as books, long video transcripts, or extensive datasets. Its training knowledge extends to January 2025. Designed as a price-performance leader, Gemini 2.5 Flash balances speed and reasoning power, making it suitable for everyday enterprise and developer use cases without the higher latency and cost of Pro models. It supports advanced workflows like function calling, code execution, search grounding, URL context ingestion, and structured outputs. Although the model is efficient and scalable, its output length is still limited compared to its input capacity, and multimodal outputs (e.g. image or audio generation) remain restricted to specialized or preview variants.
Google
Gemini 3.5 Flash
Gemini 3.5 Flash is a multimodal language model developed by Google DeepMind and released at Google I/O 2026. It is built on the Gemini 3 Flash reasoning foundation and introduces configurable thinking levels (minimal, low, medium, and high) that allow developers to tune the depth of internal reasoning before a response is generated. The model accepts text, image, video, audio, and PDF inputs and produces text output, with a 1 million token context window and up to 65,000 output tokens per request. It is natively multimodal, processing visual inputs alongside text to support tasks such as image captioning, classification, optical character recognition, object detection, and visual grounding, where the model references specific regions within an image or video frame. Its vision capabilities extend to interpreting UI screenshots, diagrams, charts, and real-world scenes, as well as understanding video and live frame sequences for activity and scene recognition. The model supports combined tool use, including Google Search, URL context, code execution, and custom functions, within a single request, and it uses reasoning context from previous turns when thought signatures are present in the conversation history, enabling persistent multi-turn reasoning chains. Gemini 3.5 Flash carries a knowledge cutoff of January 2026 and is available via the Gemini API, Google AI Studio, Google Antigravity, and the Gemini Enterprise Agent Platform.
Qwen
Qwen3 VL 30B A3B Instruct
Qwen3 VL 30B A3B Instruct is an open-weight multimodal large language model developed by Alibaba as part of the Qwen family, built for instruction-following tasks that unify text generation with visual and video understanding. Released around October 2025 under the Apache-2.0 license, it targets efficient, high-fidelity vision-language reasoning across very long contexts. The model accepts text and image inputs and produces text outputs, with strong performance in OCR, spatial reasoning, long-video understanding, and agentic or GUI-centric visual tasks. It uses a Mixture-of-Experts (A3B) design with ~31.1B total parameters and ~3B active per token, paired with Qwen3-VL’s unified multimodal stack (including Interleaved-MRoPE and DeepStack fusion) to process text, images, and video in a single architecture. OCR support expands to 32 languages, enhancing document workflows. With a native ~262K token context window (extendable further), it stands out today for its balance of scale, efficiency, long-context support, and open accessibility in multimodal systems.

Claude 3.5 Sonnet License

Proprietary

License terms and commercial-use guidance for Claude 3.5 Sonnet.

License information is provided as a guide and is not legal advice.