OpenAI

OpenAI: GPT-5 Nano

GPT-5 Nano Overview

GPT-5 Nano, released by OpenAI on August 7, 2025, is the smallest and most cost-efficient model in the GPT-5 family. Like its larger counterparts, it is multimodal—accepting text and images, supporting tool use, structured outputs, and reasoning—but it is optimized for speed, low latency, and affordability. It features input and output token limits of roughly 272K and 128K tokens respectively, enabling large-context processing even at its compact scale. Its knowledge cutoff is around May 2024, slightly earlier than the full GPT-5 model.

GPT-5 Nano is well-suited for high-volume or cost-sensitive deployments such as mobile apps, embedded AI systems, or rapid-response APIs. While it offers less depth on complex reasoning and coding tasks compared to GPT-5 Mini or Pro, it retains core multimodal and agentic capabilities, making it an attractive option where efficiency and scale matter more than maximum performance.

GPT-5 Nano Interactive Demo

GPT-5 Nano Details & Performance

Details

Resources

Vision Tasks

Vision LanguageObject DetectionClassificationOCRVisual Question AnsweringCaptioning

Features

Foundation VisionLLMs with Vision CapabilitiesMultimodal Vision

Usage

Past 30 Days

Performance

Avg. Latency

Arena Rankings

GPT-5 Nano Vision Evals

Visual Understanding

72 models · 67 tasks
HighestLowest
This model#46 of 7258.21% pass rate · better than 29%
Score58.21%pass rate across 67 tasks
Speed6.58savg response per task
Cost$0.0003 / task$0.050 in · $0.400 out / 1M
Tokens2.7K / task1.8K in · 591 out
Score key:≥75%40–74%<40%
CategoryPassedScore
Defect Detection13 / 15
86.7%
Document Understanding6 / 9
66.7%
Object Understanding9 / 14
64.3%
Spatial Understanding11 / 19
57.9%
Object Counting0 / 10
0%
HighestLowest
This model#33 of 5069% pass rate · better than 34%
Score69%pass rate across 229 tasks
Speed6.15savg response per task
Cost$0.0002 / task$0.050 in · $0.400 out / 1M
Tokens891 / task122 in · 539 out
Score key:≥75%40–74%<40%
CategoryPassedScore
License Plate Recognition25 / 30
83.3%
VQA & Extraction44 / 60
73.3%
Text Recognition21 / 30
70%
Focused Scene OCR64 / 99
64.6%
Handwritten Math4 / 10
40%

Scores based on a single evaluation run · Methodology

View all Vision Evals →

GPT-5 Nano Pricing

GPT-5 Nano costs $0.050 per 1M input tokens and $0.400 per 1M output tokens.

Input$0.050 / 1M tokens
Output$0.400 / 1M tokens
Cached input$0.010 / 1M tokens

Pricing updated Jun 28, 2026

Price vs. performance

Estimated cost per task vs. Visual Understanding score, for this model and others ranked near it. Upper-left is the sweet spot (high quality, low cost).

10 of 10 models plotted

ModelScoreMedian tokensEst. cost / taskCompare
AnthropicClaude Opus 4.767.2%2.6K$0.015Compare
GoogleGemma 4 31B67.2%467$0.0001Compare
AnthropicClaude Opus 4.6 64.2%2.3K$0.014Compare
AnthropicClaude Sonnet 4.559.7%2.3K$0.0092Compare
AnthropicClaude Opus 4.159.7%2.1K$0.040Compare
OpenAIGPT-5 Nano(this model)58.2%2.7K$0.0003
QwenQwen3.5 397B A17B58.2%1.5K$0.0006Compare
GoogleGemini 2.5 Flash55.2%476$0.0005Compare
GoogleGemini 2.5 Flash-Lite53.7%301$0.0000Compare
MoonshotAIKimi K2.535.8%2.7K$0.0021Compare

Alternatives to GPT-5 Nano

Other models worth comparing for similar use cases.

OpenAI
GPT-5.4 Mini
GPT-5.4 mini is a fast, cost-efficient model developed by OpenAI and released on March 17, 2026, optimized for high-throughput workloads and subagent orchestration. It supports text and image inputs within a 400,000-token context window, making it ideal for processing extensive visual datasets and large codebases in a single request. Designed for low-latency production environments, the model integrates with key API features including function calling, web search, and tool-based computer use, allowing it to assist in automated workflows that require navigating digital interfaces.Compared to the previous GPT-5 mini, this version runs more than twice as fast while approaching the performance levels of the flagship GPT-5.4 on reasoning and coding benchmarks. While the larger GPT-5.4 introduces native, state-of-the-art computer-use capabilities, GPT-5.4 mini provides a scalable alternative for interpreting screenshots and reasoning over dense UI layouts. For vision tasks on Playground, it excels at extracting structured information from visual documents and assisting in agentic tasks that involve real-time interpretation of software interfaces alongside text.
Google
Gemini 2.5 Flash-Lite
Gemini 2.5 Flash-Lite, released for general availability on July 22, 2025, is the most cost-efficient model in the Gemini 2.5 family, designed for high-volume and latency-sensitive tasks. It is multimodal, supporting text, images, video, audio, and PDFs as inputs, with text as its primary output. The model handles up to 1 million input tokens and generates outputs up to 64K tokens, making it suitable for large-scale document or media processing at low cost. It is built on a Sparse Mixture-of-Experts architecture with native multimodal support, though exact parameter counts are undisclosed.Flash-Lite offers the lowest usage cost among Gemini 2.5 models. It introduces developer controls for “thinking mode,” allowing fine-tuning of reasoning depth vs. efficiency. It also integrates native tools such as code execution, search grounding, and URL context. While strong on translation, classification, coding, and general multimodal reasoning, it lacks support for image or audio generation in its stable release and is less capable than Gemini 2.5 Flash or Pro on complex reasoning-heavy workflows.
Anthropic
Claude Haiku 4.5
Claude Haiku 4.5 is Anthropic’s lightweight model in the Claude 4.5 series, released in October 2025 under a proprietary license. Designed for speed and cost efficiency, it delivers near-frontier performance while maintaining Anthropic’s AI Safety Level 2 standard. Haiku 4.5 supports both text and multimodal (text and image) inputs, integrates tool use and extended reasoning, and features a 200,000 token context window, making it adept at handling long or complex workflows. Though the parameter count remains undisclosed, it achieves about 73.3% on SWE-bench Verified, reflecting strong coding and reasoning ability. Haiku 4.5 is ideal for developers and researchers seeking rapid, cost-effective model calls for analysis, coding, or multimodal understanding.
Qwen
Qwen3.5 9b
Qwen3.5-9B is a 9-billion-parameter multimodal foundation model developed by Alibaba Cloud's Qwen team, released on March 2, 2026 as part of the Qwen3.5 model family. Designed for efficient multimodal reasoning and long-context language tasks, it notably outperforms the older Qwen3-30B, a model more than three times its size, on key benchmarks including GPQA Diamond, IFEval, and LongBench.The model supports vision-language inputs through an early-fusion multimodal architecture built on a dense hybrid foundation of Gated Delta Networks and Gated Attention. It can also operate in a text-only mode by skipping the vision encoder during inference. It provides a 262,144-token context window (extensible to ~1M tokens via YaRN) and is released under the Apache License 2.0. Within the current AI landscape, Qwen3.5-9B offers a strong balance of capability and efficiency, making it well-suited for multimodal assistants, document analysis, long-context reasoning, and developer-deployed agentic systems.
Moondream 2
Moondream 2 is a small open-source vision-language model from Moondream, the company founded by Vikhyat Korrapati. It was first released in early 2024 and updated through mid-2025. At approximately 1.9 billion parameters, it is designed to run efficiently on consumer hardware such as laptops and edge devices while supporting a practical range of multimodal tasks. Moondream 2 combines a vision encoder based on SigLIP with a compact language backbone, trained for image understanding tasks rather than as a general chat model.The model accepts an image paired with a natural language prompt and produces text responses, supporting visual question answering, image captioning, and image-conditioned dialogue. Later Moondream 2 releases added object localization through a point API that returns coordinates for queried objects, along with improvements to OCR, counting, and document understanding. Moondream 2 is distributed under the Apache 2.0 license and is available through Hugging Face and the maintainer's distribution. Because the model is updated frequently, production deployments should pin to a specific revision rather than tracking the latest release. A successor model, Moondream 3 (Preview), was released in September 2025 with a 9B mixture-of-experts architecture and 2B active parameters, offering substantially stronger visual reasoning than Moondream 2 while retaining the efficiency-focused design. A referring expression segmentation extension to Moondream 3 was released in March 2026.

Other OpenAI GPT Nano models

Other versions in the same family as GPT-5 Nano.

GPT-5 Nano License

Proprietary

License terms and commercial-use guidance for GPT-5 Nano.

License information is provided as a guide and is not legal advice.