Kimi K2.5 is a frontier-scale multimodal AI model developed by Moonshot AI and released on January 27, 2026. Part of the Kimi K2 family, it uses a sparse Mixture-of-Experts (MoE) architecture with 1 trillion total parameters, of which 32 billion are active per inference, and a 256K-token context window. Multimodality is native: a 400M-parameter MoonViT encoder lets the model process text, images, and video frames together. It offers two modes, "Instant" for speed and "Thinking" for depth. The Thinking mode targets expert-level reasoning and scores 50.2% on the Humanity’s Last Exam (HLE) benchmark when equipped with tools.
The model is released under a Modified MIT License: the weights are open, but the license requires attribution from high-revenue commercial entities. It introduces an "Agent Swarm" paradigm that can coordinate up to 100 specialized sub-agents working in parallel, which reduces latency in complex research tasks. For vision tasks, Kimi K2.5 demonstrates strong autonomous visual debugging: it inspects its own generated UI output against a visual specification and iteratively refines the frontend code. This makes it a strong candidate for developers testing automated UI reconstruction, high-fidelity OCR document processing, and multi-step agentic research grounded in complex visual data.
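The parallel fan-out behind an "Agent Swarm" can be pictured with a small concurrency sketch. This is not Moonshot AI's implementation or API: `run_subagent`, `run_swarm`, and the placeholder work inside them are hypothetical names chosen for illustration, and only the cap of 100 concurrent sub-agents comes from the description above.

```python
import asyncio

MAX_SUBAGENTS = 100  # cap taken from the description above


async def run_subagent(task: str) -> str:
    """Hypothetical sub-agent; a real one would call a model endpoint here."""
    await asyncio.sleep(0)  # placeholder for I/O-bound model work
    return f"result for {task}"


async def run_swarm(tasks: list[str], limit: int = MAX_SUBAGENTS) -> list[str]:
    """Run all tasks concurrently, never exceeding `limit` sub-agents at once."""
    gate = asyncio.Semaphore(limit)

    async def guarded(task: str) -> str:
        async with gate:  # blocks while `limit` sub-agents are already running
            return await run_subagent(task)

    # gather returns results in the same order as the input tasks
    return await asyncio.gather(*(guarded(t) for t in tasks))


if __name__ == "__main__":
    subtasks = [f"subtask-{i}" for i in range(250)]
    results = asyncio.run(run_swarm(subtasks))
    print(len(results), results[0])
```

The semaphore is the whole design: every sub-task is scheduled immediately, but at most `limit` of them hold the gate at any moment, so wall-clock time is bounded by the slowest batch rather than by the sum of all sub-tasks.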

| Category | Passed | Score |
|---|---|---|
| Document Understanding | 5 / 9 | 55.6% |
| Defect Detection | 7 / 15 | 46.7% |
| Object Understanding | 6 / 14 | 42.9% |
| Spatial Understanding | 5 / 19 | 26.3% |
| Object Counting | 1 / 10 | 10.0% |

| Category | Passed | Score |
|---|---|---|
| Handwritten Math | 5 / 10 | 50.0% |
| VQA & Extraction | 20 / 60 | 33.3% |
| Text Recognition | 8 / 30 | 26.7% |
| Focused Scene OCR | 10 / 99 | 10.1% |
| License Plate Recognition | 2 / 30 | 6.7% |

Scores are based on a single evaluation run.
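
Each score in the tables above is simply passed / total, and the per-category results can be pooled into a single rate. The sketch below recomputes both. The names `first_table` and `second_table` are hypothetical, since the source tables are untitled, and pooling by summing passes and totals is an assumption about how an aggregate would be formed.

```python
# (passed, total) pairs copied from the two tables above.
# The dictionary names are hypothetical; the source tables have no titles.
first_table = {
    "Document Understanding": (5, 9),
    "Defect Detection": (7, 15),
    "Object Understanding": (6, 14),
    "Spatial Understanding": (5, 19),
    "Object Counting": (1, 10),
}
second_table = {
    "Handwritten Math": (5, 10),
    "VQA & Extraction": (20, 60),
    "Text Recognition": (8, 30),
    "Focused Scene OCR": (10, 99),
    "License Plate Recognition": (2, 30),
}


def pass_rate(passed: int, total: int) -> float:
    """Pass rate as a percentage."""
    return 100.0 * passed / total


def pooled_rate(table: dict[str, tuple[int, int]]) -> float:
    """Pool categories by summing passes and totals (assumed aggregation)."""
    passed = sum(p for p, _ in table.values())
    total = sum(t for _, t in table.values())
    return pass_rate(passed, total)


if __name__ == "__main__":
    for name, table in (("first", first_table), ("second", second_table)):
        for category, (passed, total) in table.items():
            print(f"{category}: {pass_rate(passed, total):.1f}%")
        print(f"{name} table pooled: {pooled_rate(table):.1f}%")
```

Under this assumption the first table pools to 24 / 67, or about 35.8%, which matches the Visual Understanding score listed for Kimi K2.5 in the comparison table. The second table pools to 45 / 229, or about 19.7%. Because categories differ in size, the pooled rate weights large categories such as Focused Scene OCR more heavily than an unweighted average of the category scores would.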
Kimi K2.5 costs $0.375 per 1M input tokens and $2.02 per 1M output tokens.
Pricing updated Jul 4, 2026
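
To turn the per-token prices into a per-request figure, multiply each token count by its rate and divide by one million. The sketch below does that. The function name `estimate_cost` and the example token counts are hypothetical, and it assumes a flat rate with no caching or volume discounts.

```python
INPUT_USD_PER_MTOK = 0.375   # listed price per 1M input tokens
OUTPUT_USD_PER_MTOK = 2.02   # listed price per 1M output tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD, assuming flat per-token pricing."""
    total_micro_usd = (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 1,000 input tokens and 500 output tokens
    print(f"${estimate_cost(1_000, 500):.6f}")
```

Output tokens cost roughly 5.4 times as much as input tokens at these rates, so the input/output split matters more than the total token count when estimating cost per task.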
Estimated cost per task vs. Visual Understanding score, for this model and others ranked near it. Upper-left is the sweet spot (high quality, low cost).

| Model | Score | Median tokens | Est. cost / task |
|---|---|---|---|
| Claude Haiku 4.5 | 58.2% | 2.3K | $0.0030 |
| GPT-5 Nano | 58.2% | 2.7K | $0.0003 |
| Qwen3.5 397B A17B | 58.2% | 1.5K | $0.0006 |
| Gemini 2.5 Flash | 55.2% | 476 | $0.0005 |
| Gemini 2.5 Flash-Lite | 53.7% | 301 | <$0.0001 |
| Kimi K2.5 (this model) | 35.8% | 2.7K | $0.0021 |