Demo · Module 10 · Interactive
Drag the temperature. Watch the model change its mind.
The cat sat on the
model's next-token distribution ↓
Temperature · T = 1.00 (balanced)
Slider stops: 0 (greedy), 0.3, 0.7 (creative), 1.0 (neutral), 1.5, 2.0, 3.0 (chaos)
Probability distribution · 8 candidates
What temperature actually does. Given a vector of logits z from the model's final layer, softmax with temperature T turns them into probabilities: p_i = exp(z_i / T) / sum_j exp(z_j / T). When T is small, dividing by T stretches the gaps between logits, so the largest logit dominates exponentially: the model becomes nearly greedy and almost always picks "mat". At exactly T = 0 the formula would divide by zero, so implementations treat it as argmax (pure greedy decoding). When T = 1.0 you get the model's raw distribution. When T is large, the scaled logits are squashed toward each other and the distribution flattens; as T grows, every candidate approaches equal probability, which is what gives you chaotic, low-quality output. The temperature knob lives inside the softmax, not after it: the logits are divided by T before exponentiation. Typical settings: 0.0 for code completion (greedy), 0.7-1.0 for chat, 1.0+ for creative writing, and rarely above ~1.5 in production.
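To make the formula concrete, here is a minimal Python sketch of softmax with temperature. It is an illustration, not the demo's actual code: the function name `softmax_with_temperature`, the eight logit values, and every candidate word other than "mat" are assumptions invented for this example. T = 0 is handled as a special case (argmax), since dividing by zero is undefined.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return probabilities for `logits` after dividing each by `temperature`."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # T = 0 would divide by zero, so treat it as greedy decoding:
        # all probability mass goes to the largest logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    # Temperature is applied inside the softmax: scale first, then exponentiate.
    scaled = [z / temperature for z in logits]
    # Subtracting the max before exp() avoids overflow and leaves the result unchanged.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for 8 next-token candidates after "The cat sat on the".
# Only "mat" comes from the text; the other words and all values are made up.
candidates = ["mat", "floor", "couch", "bed", "roof", "table", "moon", "keyboard"]
logits = [4.0, 2.5, 2.0, 1.5, 1.0, 0.5, -1.0, -2.0]

for t in (0.0, 0.3, 1.0, 3.0):
    probs = softmax_with_temperature(logits, t)
    row = "  ".join(f"{w}={p:.3f}" for w, p in zip(candidates, probs))
    print(f"T={t}: {row}")
```

Running it prints one row per temperature. The probability of "mat" shrinks as T rises, while the gap between the most and least likely candidates narrows.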