Gemma 3 1B Thinking

Google • Thinking

gemma-3-1b-thinking-preview • 1.2B Parameters • 128k Context

Overview

The Gemma 3 1B Thinking model brings chain-of-thought reasoning to the edge-device class. Optimized for efficiency, it improves on the base 1B model by a relative 6% to 15% on the math, reasoning, and coding benchmarks reported here.

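Performance summary: a +15% relative gain on the AIME 2025 math benchmark, and a variable +6% to +10% relative gain on general reasoning and coding tasks. These gains are measured relative to the base model's score; they are not percentage-point increases.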
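To make the relative-gain arithmetic concrete, here is a minimal sketch using the AIME 2025 figures reported in this document (base 3.0%, +15%). The function names are illustrative, not part of any Gemma tooling.

```python
def apply_relative_gain(base_score: float, gain_pct: float) -> float:
    """Return the score after a relative (multiplicative) gain.

    base_score and the result are in percent (3.0 means 3.0%);
    gain_pct is the relative improvement in percent (15 means +15%).
    """
    return base_score * (1 + gain_pct / 100)


def absolute_delta(base_score: float, gain_pct: float) -> float:
    """Return the change in percentage points implied by a relative gain."""
    return apply_relative_gain(base_score, gain_pct) - base_score


if __name__ == "__main__":
    # AIME 2025: base 3.0%, +15% relative gain
    print(round(apply_relative_gain(3.0, 15), 2))  # 3.45
    print(round(absolute_delta(3.0, 15), 2))       # 0.45
```

A +15% relative gain on a 3.0% base therefore moves the score by only 0.45 percentage points, which is why the absolute numbers stay small even where the boost looks large.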

Parameters: 1.2B
Context Window: 128k
Device Target: Mobile/Edge

Key Highlights

AIME 2025 (Math): 3.45% (+15% relative to base)
GPQA Diamond: 25.9% (+8% relative to base)
IFBench: 21.6% (+8% relative to base)

Benchmark Performance

[Chart: per-benchmark scores for Terminal-Bench Hard, 𝜏²-Bench Telecom, AA-LCR (Long Context), Humanity's Last Exam, MMLU-Pro, GPQA Diamond, LiveCodeBench, SciCode, IFBench, AIME 2025 (Math), CritPt (Physics), and MMMU Pro (Visual).]

Detailed Benchmark Results

Comparison of base vs. thinking scores. The boost is the relative gain over the base score and varies from 6% to 15% by benchmark.

| Benchmark | Category | Base Score (1B) | Thinking Score | Boost (relative) |
|---|---|---|---|---|
| Terminal-Bench Hard | Agentic Coding | 5.0% | 5.4% | +8% |
| 𝜏²-Bench Telecom | Agentic Tool Use | 5.0% | 5.35% | +7% |
| AA-LCR | Long Context Reasoning | 10.0% | 10.9% | +9% |
| Humanity's Last Exam | Reasoning & Knowledge | 5.2% | 5.6% | +8% |
| MMLU-Pro | Reasoning & Knowledge | 14.0% | 15.3% | +9.2% |
| GPQA Diamond | Scientific Reasoning | 24.0% | 25.9% | +8% |
| LiveCodeBench | Coding | 2.0% | 2.16% | +8% |
| SciCode | Scientific Coding | 1.0% | 1.06% | +6% |
| IFBench | Instruction Following | 20.0% | 21.6% | +8% |
| AIME 2025 | Competition Math | 3.0% | 3.45% | +15% |
| CritPt | Physics Reasoning | 0.5% | 0.54% | +8% |
| MMMU Pro | Visual Reasoning | 0.0% | 0.0% | N/A |
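
The boost column can be recomputed from the two score columns. The following is a minimal sketch, with the score pairs copied from the results above; the `relative_gain` helper and the `RESULTS` list are illustrative names, not part of any published evaluation harness.

```python
from typing import Optional

# (benchmark, base score %, thinking score %) copied from the results above
RESULTS = [
    ("Terminal-Bench Hard", 5.0, 5.4),
    ("tau2-Bench Telecom", 5.0, 5.35),
    ("AA-LCR", 10.0, 10.9),
    ("Humanity's Last Exam", 5.2, 5.6),
    ("MMLU-Pro", 14.0, 15.3),
    ("GPQA Diamond", 24.0, 25.9),
    ("LiveCodeBench", 2.0, 2.16),
    ("SciCode", 1.0, 1.06),
    ("IFBench", 20.0, 21.6),
    ("AIME 2025", 3.0, 3.45),
    ("CritPt", 0.5, 0.54),
    ("MMMU Pro", 0.0, 0.0),
]


def relative_gain(base: float, thinking: float) -> Optional[float]:
    """Relative gain in percent, or None when the base score is zero."""
    if base == 0:
        return None  # a relative gain over a zero base is undefined (N/A)
    return (thinking - base) / base * 100


if __name__ == "__main__":
    for name, base, thinking in RESULTS:
        gain = relative_gain(base, thinking)
        label = "N/A" if gain is None else f"{gain:+.1f}%"
        print(f"{name:<22} {base:>5.2f}% -> {thinking:>5.2f}%  {label}")
```

Because the thinking scores are rounded, a recomputed gain can differ from the listed boost by a fraction of a percent. The zero-base case explains the N/A entry for MMMU Pro.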