ai/qwen3

Verified Publisher

By Docker

Updated 3 months ago

Qwen3 is the latest Qwen LLM, built for top-tier coding, math, reasoning, and language tasks.

Model
103

100K+

ai/qwen3 repository overview

Qwen3

logo

Qwen3 is the latest generation in the Qwen LLM family, designed for top-tier performance in coding, math, reasoning, and language tasks. It includes both dense and Mixture-of-Experts (MoE) models, offering flexible deployment from lightweight apps to large-scale research.

Qwen3 introduces dual reasoning modes—"thinking" for complex tasks and "non-thinking" for fast responses—giving users dynamic control over performance. It outperforms prior models in reasoning, instruction following, and code generation, while excelling in creative writing and dialogue.

With strong agentic and tool-use capabilities and support for over 100 languages, Qwen3 is optimized for multilingual, multi-domain applications.


📌 Characteristics

AttributeValue
ProviderAlibaba Cloud
Architectureqwen3
Cutoff dateApril 2025 (est.)
Languages119 languages from multiple families (Indo European, Sino-Tibetan, Afro-Asiatic, Austronesian, Dravidian, Turkic, Tai-Kadai, Uralic, Astroasiatic) including others like Japanese, Basque, Haitian,...
Tool calling
Input modalitiesText
Output modalitiesText
LicenseApache 2.0

Available model variants

Model variantParametersQuantizationContext windowVRAM¹Size
ai/qwen3:8B-Q4_K_M

ai/qwen3:latest
8BMOSTLY_Q4_K_M41K tokens5.80 GiB4.68 GB
ai/qwen3:0.6B-Q4_00.6BMOSTLY_Q4_041K tokens1.51 GiB441.67 MB
ai/qwen3:0.6B-Q4_K_M0.6BMOSTLY_Q4_K_M41K tokens1.53 GiB456.11 MB
ai/qwen3:0.6B-F160.6BMOSTLY_F1641K tokens2.27 GiB1.40 GB
ai/qwen3:4B-F164BMOSTLY_F16262K tokens8.92 GiB7.49 GB
ai/qwen3:4B-UD-Q4_K_XL4BMOSTLY_Q4_K_M262K tokens3.80 GiB2.37 GB
ai/qwen3:8B-F168BMOSTLY_F1641K tokens15.54 GiB15.26 GB
ai/qwen3:14B-Q6_K14BMOSTLY_Q6_K41K tokens12.27 GiB11.28 GB
ai/qwen3:30B-A3B-F1630B-A3BMOSTLY_F1641K tokens57.55 GiB56.89 GB
ai/qwen3:30B-A3B-Q4_K_M30B-A3BMOSTLY_Q4_K_M41K tokens18.35 GiB17.28 GB
ai/qwen3:4B-UD-Q8_K_XL4BMOSTLY_Q8_0262K tokens6.13 GiB4.70 GB
ai/qwen3:8B-Q4_08BMOSTLY_Q4_041K tokens5.56 GiB4.44 GB

¹: VRAM estimated based on model characteristics.

latest8B-Q4_K_M

🧠 Intended uses

Qwen3-8B is designed for a wide range of advanced natural language processing tasks:

  • Supports both Dense and Mixture-of-Experts (MoE) model architectures, available in sizes including 0.6B, 1.7B, 4B, 8B, 14B, 32B, and large MoE variants like 30B-A3B and 235B-A22B.
  • Enables seamless switching between thinking and non-thinking modes:
    • Thinking mode: optimized for complex logical reasoning, math, and code generation.
    • Non-thinking mode: tuned for efficient, general-purpose dialogue and chat.
  • Offers significant improvements in reasoning performance, outperforming previous QwQ (in thinking mode) and Qwen2.5-Instruct (in non-thinking mode) models on mathematics, code generation, and commonsense reasoning benchmarks.
  • Delivers superior human alignment and excels at: Creative writing, Role-playing, Multi-turn dialogue, Instruction following with immersive conversations.
  • Provides strong agent capabilities, including: Integration with external tools and best-in-class performance in complex agent-based workflows across both thinking and unthinking modes.
  • Offers support for 100+ languages and dialects, with robust multilingual instruction following and translation abilities.

Considerations

  • Thinking Mode Switching
    Qwen3 supports a soft switch mechanism via /think and /no_think prompts (when enable_thinking=True). This allows dynamic control over the model's reasoning depth during multi-turn conversations.
  • Tool Calling with Qwen-Agent
    For agentic tasks, use Qwen-Agent, which simplifies integration of external tools through built-in templates and parsers, minimizing the need for manual tool-call handling.

Note: Qwen3 models use a new naming convention: post-trained models no longer include the -Instruct suffix (e.g., Qwen3-32B replaces Qwen2.5-32B-Instruct), and base models now end with -Base.


🐳 Using this model with Docker Model Runner

First, pull the model:

docker model pull ai/qwen3

Then run the model:

docker model run ai/qwen3

For more information, check out the Docker Model Runner docs.


Benchmarks

CategoryBenchmarkQwen3
General TasksMMLU87.81
MMLU-Redux87.40
MMLU-Pro68.18
SuperGPQA44.06
BBH88.87
Mathematics & Science TasksGPQA47.47
GSM8K94.39
MATH71.84
Multilingual TasksMGSM83.53
MMMLU86.70
INCLUDE73.46
Code TasksEvalPlus77.60
MultiPL-E65.94
MBPP81.40
CRUX-O79.00

Tag summary

Content type

Model

Digest

sha256:67368f373

Size

7.5 GB

Last updated

3 months ago

docker model pull ai/qwen3:4B-F16

This week's pulls

Pulls:

18,000

Last week