Qwen3.5-397B-A17B is a 397B-parameter (17B active) open-weight multimodal model developed by Alibaba’s Qwen team, released on 2026-02-16 under Apache-2.0. It supports text and image inputs with text outputs, combining a sparse Mixture-of-Experts architecture with Gated Delta Networks for efficient scaling. The model provides native vision-language reasoning and a large ~262K token context window, extendable to ~1M tokens.
As the first open-weight release in the Qwen3.5 family, it is positioned as a high-capacity, long-context option among large vision-language models, using sparse activation to balance scale and efficiency. It is designed for advanced reasoning, coding, agent workflows, and multimodal understanding tasks.
Vision Evals

| Category | Passed | Score |
|---|---|---|
| Document Understanding | 7 / 9 | 77.8% |
| Defect Detection | 10 / 15 | 66.7% |
| Object Understanding | 9 / 14 | 64.3% |
| Spatial Understanding | 11 / 19 | 57.9% |
| Object Counting | 2 / 10 | 20.0% |
Scores are based on a single evaluation run.
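The Score column appears to be the pass count divided by the number of test cases in each category. The short Python sketch below reproduces that arithmetic from the table rows. The names `RESULTS` and `pass_rate` are illustrative, and the pooled rate across all categories is a derived figure that the source does not report.

```python
# (passed, total) per category, copied from the table above.
RESULTS = {
    "Document Understanding": (7, 9),
    "Defect Detection": (10, 15),
    "Object Understanding": (9, 14),
    "Spatial Understanding": (11, 19),
    "Object Counting": (2, 10),
}


def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage (assumed definition of Score)."""
    return 100.0 * passed / total


# Pooled counts across all categories (not a figure reported by the source).
TOTAL_PASSED = sum(passed for passed, _ in RESULTS.values())
TOTAL_CASES = sum(total for _, total in RESULTS.values())

if __name__ == "__main__":
    for name, (passed, total) in RESULTS.items():
        print(f"{name}: {passed} / {total} = {pass_rate(passed, total):.1f}%")
    print(f"Pooled: {TOTAL_PASSED} / {TOTAL_CASES} = "
          f"{pass_rate(TOTAL_PASSED, TOTAL_CASES):.1f}%")
```

Pooling the rows gives 39 of 67 cases passed, about 58.2%. Because each category has fewer than 20 cases and the scores come from a single run, one additional pass or failure moves a category score by 5 to 11 percentage points.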
Qwen3.5 397B A17B costs $0.385 per 1M input tokens and $2.45 per 1M output tokens.
Pricing updated Jun 22, 2026
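To turn the listed rates into a per-request estimate, a minimal Python sketch follows. It assumes billing is linear in token count with no minimums, caching discounts, or separate image-token rates, none of which the page states. The helper name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Listed rates in USD per 1M tokens (as of the pricing date above).
INPUT_USD_PER_M = Decimal("0.385")
OUTPUT_USD_PER_M = Decimal("2.45")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request, assuming purely linear pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = INPUT_USD_PER_M * input_tokens + OUTPUT_USD_PER_M * output_tokens
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # A long-context request: 200K input tokens and 4K output tokens.
    print(estimate_cost_usd(200_000, 4_000))
```

Under these assumptions, a request with 200K input tokens and 4K output tokens costs about $0.0868, of which $0.077 is input. With long prompts the input side dominates even though the output rate is more than six times higher.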
License terms and commercial-use guidance for Qwen3.5 397B A17B.
License information is provided as a guide and is not legal advice.