Gemma 3 4B, released on March 12, 2025, is the mid-sized member of Google DeepMind’s open-weight Gemma 3 family. With about 4 billion parameters, it is multimodal: it accepts text and image inputs and generates text outputs. Like the larger Gemma 3 models, it has a 128,000-token input context window and can generate up to about 8,192 output tokens, which lets it handle long documents and mixed text–image reasoning tasks.
The 4B variant is designed to balance efficiency and capability: it offers multilingual support across 140+ languages, strong summarization and reasoning performance, and compatibility with moderate hardware. Inference needs about 6.4 GB of VRAM in BF16, or less when quantized to 8-bit (~4.4 GB) or 4-bit (~3.4 GB), which puts it within reach of developers who lack large-scale infrastructure. It lags behind the 12B and 27B versions on the most complex reasoning and multimodal benchmarks, but its lower compute footprint suits research, prototyping, and practical deployment where efficiency matters.
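As a rough illustration of how the VRAM figures above might guide a deployment choice, the sketch below picks the highest-precision mode that fits a given memory budget. The function name `pick_precision`, the mode labels, and the 1 GB `headroom_gb` default are assumptions made for this example; real memory use also depends on context length, batch size, and the runtime.

```python
# Approximate VRAM needed to load Gemma 3 4B, in GB, as quoted in the text.
VRAM_GB = {"bf16": 6.4, "int8": 4.4, "int4": 3.4}


def pick_precision(available_gb, headroom_gb=1.0):
    """Return the highest-precision mode that fits, or None if none fits.

    headroom_gb is a hypothetical safety margin for activations and cache;
    it is not a figure from the source.
    """
    # Modes are checked from highest to lowest precision.
    for mode in ("bf16", "int8", "int4"):
        if VRAM_GB[mode] + headroom_gb <= available_gb:
            return mode
    return None


if __name__ == "__main__":
    for budget in (8.0, 6.0, 4.0):
        print(f"{budget} GB -> {pick_precision(budget)}")
```

With the assumed 1 GB margin, an 8 GB card would select BF16, a 6 GB card would fall back to 8-bit, and a 4 GB card would fit none of the three modes.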
| Category | Passed | Score |
|---|---|---|
| Defect Detection | 9 / 15 | 60% |
| Document Understanding | 5 / 9 | 55.6% |
| Object Understanding | 6 / 14 | 42.9% |
| Spatial Understanding | 5 / 19 | 26.3% |
| Object Counting | 0 / 10 | 0% |

| Category | Passed | Score |
|---|---|---|
| License Plate Recognition | 26 / 30 | 86.7% |
| Text Recognition | 22 / 30 | 73.3% |
| Focused Scene OCR | 63 / 99 | 63.6% |
| VQA & Extraction | 35 / 60 | 58.3% |
| Handwritten Math | 1 / 10 | 10% |
Scores are based on a single evaluation run.
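To make the arithmetic behind the two tables explicit, the sketch below pools the passed and total counts per table and recomputes the pass rate. The variable names and the pooling approach are assumptions for illustration; the source does not state how its headline scores are aggregated.

```python
# (passed, total) per category, copied from the two tables above.
first_table = {
    "Defect Detection": (9, 15),
    "Document Understanding": (5, 9),
    "Object Understanding": (6, 14),
    "Spatial Understanding": (5, 19),
    "Object Counting": (0, 10),
}
second_table = {
    "License Plate Recognition": (26, 30),
    "Text Recognition": (22, 30),
    "Focused Scene OCR": (63, 99),
    "VQA & Extraction": (35, 60),
    "Handwritten Math": (1, 10),
}


def pooled_pass_rate(table):
    """Percentage of all tasks passed, pooling every category together."""
    passed = sum(p for p, _ in table.values())
    total = sum(t for _, t in table.values())
    return 100 * passed / total


if __name__ == "__main__":
    print(f"{pooled_pass_rate(first_table):.1f}%")   # 25 / 67  -> 37.3%
    print(f"{pooled_pass_rate(second_table):.1f}%")  # 147 / 229 -> 64.2%
```

The pooled rate for the first table, 37.3%, coincides with the Visual Understanding score listed for Gemma 3 4B in the comparison table, which suggests (but does not confirm) that the headline score is a task-weighted pass rate rather than an average of the per-category percentages.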
Gemma 3 4B costs $0.050 per 1M input tokens and $0.100 per 1M output tokens.
Pricing updated Jun 27, 2026
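The listed rates translate into a per-request cost with simple arithmetic. The following is a minimal sketch using the two prices quoted above; the function name `estimate_cost` and the example token counts are hypothetical, and actual billing may differ by provider.

```python
# Prices quoted in the text, in US dollars per 1M tokens.
INPUT_USD_PER_M = 0.050
OUTPUT_USD_PER_M = 0.100


def estimate_cost(input_tokens, output_tokens):
    """Estimated cost in USD for one request at the listed rates."""
    return (
        input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens and 500 output tokens.
    print(f"${estimate_cost(2_000, 500):.6f}")
```

At these rates, a request that consumed a full million input tokens and a full million output tokens would cost about $0.15.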
Estimated cost per task vs. Visual Understanding score, for this model and others ranked near it. Upper-left is the sweet spot (high quality, low cost).
6 of 7 models plotted · 1 not yet evaluated
| Model | Score | Median tokens | Est. cost / task |
|---|---|---|---|
| Claude Opus 4.1 | 59.7% | 2.1K | $0.040 |
| GPT-5 Nano | 58.2% | 2.7K | $0.0003 |
| Qwen3.5 397B A17B | 58.2% | 1.5K | $0.0006 |
| Gemini 2.5 Flash | 55.2% | 476 | $0.0005 |
| Gemini 2.5 Flash-Lite | 53.7% | 301 | $0.0000 |
| Gemma 3 4B (this model) | 37.3% | — | — |
| Kimi K2.5 | 35.8% | 2.7K | $0.0021 |
License terms and commercial-use guidance for Gemma 3 4B.
License information is provided as a guide and is not legal advice.