The Brier score is the mean squared error between probability forecasts and outcomes: for one forecast, (forecast − outcome)², where the outcome is 1 if the event happened and 0 if it didn't. Lower is better. Answering 50% on everything scores exactly 0.25, so a sustained average below 0.25 means your probabilities carry real information. Paste your forecast history below — everything runs in your browser, nothing is uploaded or stored.
One line per forecast: probability, outcome — probability as a percent (70) or decimal (0.7); outcome as 1 (happened) or 0 (didn't). Example:
70, 1
55, 0
0.9, 1
25, 0
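For readers who want to check the arithmetic by hand, here is a minimal Python sketch of the calculation on the example input. It is not the calculator's own code. The function names `parse_forecasts` and `mean_brier` are made up for illustration, and the rule that any probability above 1 is read as a percent is an assumption about how the two input styles could be told apart.

```python
def parse_forecasts(text):
    """Parse 'probability, outcome' lines into (probability, outcome) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        prob_str, outcome_str = line.split(",")
        prob = float(prob_str)
        # Assumption: values above 1 are percents (70 -> 0.70);
        # values from 0 to 1 are already decimals.
        if prob > 1:
            prob /= 100
        outcome = int(outcome_str)
        if not 0 <= prob <= 1 or outcome not in (0, 1):
            raise ValueError(f"bad forecast line: {line!r}")
        pairs.append((prob, outcome))
    return pairs


def mean_brier(pairs):
    """Mean of (forecast - outcome)^2 over all forecasts."""
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)


history = """70, 1
55, 0
0.9, 1
25, 0"""

pairs = parse_forecasts(history)
print(round(mean_brier(pairs), 5))  # 0.11625
```

The four squared errors are 0.09, 0.3025, 0.01 and 0.0625, which average to 0.11625.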
| Mean Brier score | Reading |
|---|---|
| 0.25 | Coin-flip baseline — answering 50% on everything scores exactly this. |
| 0.20 – 0.25 | Some information, weak calibration. Typical for beginners with < 30 forecasts. |
| 0.15 – 0.20 | Meaningfully informative forecasts, comparable to engaged tournament forecasters. |
| < 0.15 | Strong. Superforecaster teams in the ACE tournament research averaged near this range on comparable question sets. |
Context matters: scores are only comparable across the same question set, and roughly 30–50 resolved forecasts are needed before an average separates skill from luck.
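One way to see why small samples are noisy is to compute the standard error of an average Brier score. The sketch below is an idealized illustration, not a rule from the calculator: it assumes a hypothetical forecaster who always says 70% on independent events that really do happen 70% of the time, and `brier_mean_and_se` is a made-up helper name.

```python
import math


def brier_mean_and_se(p, n):
    """Expected Brier score and its standard error over n forecasts,
    for a forecaster who says p on events that truly occur with probability p."""
    hit = (p - 1) ** 2   # score when the event happens
    miss = p ** 2        # score when it does not
    mean = p * hit + (1 - p) * miss
    var = p * hit ** 2 + (1 - p) * miss ** 2 - mean ** 2
    return mean, math.sqrt(var / n)


for n in (10, 30, 50, 200):
    mean, se = brier_mean_and_se(0.7, n)
    print(f"n={n:>3}  expected Brier={mean:.3f}  standard error={se:.3f}")
```

Under these assumptions the expected score is 0.21, only 0.04 better than the coin-flip baseline. At 10 forecasts the standard error (about 0.058) is larger than that gap; at 30 and 50 forecasts it falls to about 0.033 and 0.026, which is where the gap starts to show through the noise.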
The Brier score and the logarithmic score are both proper scoring rules: your expected score is best when you report your honest probability. The Brier score is bounded (0 to 1 per forecast) and comparatively forgiving of extreme misses; the logarithmic score is unbounded and punishes a confident miss without limit (a 100% call that misses scores negative infinity). Brier is the standard for forecasting tournaments precisely because it is bounded and easy to interpret; the log score is common in machine learning, where its negative is the cross-entropy loss.
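The contrast between the two rules is easy to see numerically. The following Python sketch scores increasingly confident misses under both rules and then checks the "proper" property for the Brier score with a simple grid search. The helper names are illustrative, and the log score is written in the higher-is-better convention (natural log of the probability assigned to what actually happened).

```python
import math


def brier(p, outcome):
    return (p - outcome) ** 2


def log_score(p, outcome):
    """Natural log of the probability given to what happened (higher is better)."""
    p_actual = p if outcome == 1 else 1 - p
    return -math.inf if p_actual == 0 else math.log(p_actual)


# Confident calls that miss (outcome = 0): Brier stays at or below 1,
# while the log score falls without limit.
for p in (0.9, 0.99, 1.0):
    print(p, brier(p, 0), log_score(p, 0))


def expected_brier(report, true_prob):
    """Average Brier score for reporting `report` when the event's real chance is `true_prob`."""
    return true_prob * brier(report, 1) + (1 - true_prob) * brier(report, 0)


# Proper scoring rule: if the real chance is 70%, no report beats saying 70%.
best = min((i / 100 for i in range(101)), key=lambda r: expected_brier(r, 0.7))
print(best)  # 0.7
```

A 99% call that misses costs about 0.98 under Brier and about -4.6 under the log score; a 100% call that misses costs exactly 1 under Brier and negative infinity under the log score.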
A single average hides which probabilities are off. Calibration groups your forecasts into buckets (all the ~70% calls together, all the ~90% calls together) and asks whether each bucket's events actually happened at that rate. Most people are overconfident in the 80–95% range — their "90%" events happen roughly 75% of the time. The calculator above prints your calibration table automatically when you have enough forecasts per bucket.
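A calibration table of this kind can be sketched in a few lines of Python. This is an illustration of the bucketing idea, not the calculator's implementation: the 10-point bucket width, the function name `calibration_table`, and the sample history are all assumptions, and the sketch does not enforce a minimum number of forecasts per bucket.

```python
from collections import defaultdict


def calibration_table(pairs, width=0.1):
    """Group (probability, outcome) pairs into buckets of the given width.

    Returns rows of (bucket_center, count, mean_forecast, observed_rate).
    """
    buckets = defaultdict(list)
    for p, outcome in pairs:
        # Bucket by nearest multiple of `width`: 0.68 and 0.72 both land in 0.7.
        key = round(round(p / width) * width, 10)
        buckets[key].append((p, outcome))
    rows = []
    for key in sorted(buckets):
        members = buckets[key]
        n = len(members)
        mean_forecast = sum(p for p, _ in members) / n
        observed = sum(o for _, o in members) / n
        rows.append((key, n, mean_forecast, observed))
    return rows


# Made-up history for illustration only.
pairs = [(0.9, 1), (0.9, 1), (0.9, 0), (0.88, 1),
         (0.7, 1), (0.72, 0), (0.68, 1), (0.3, 0)]

for center, n, mean_forecast, observed in calibration_table(pairs):
    print(f"~{center:.0%}  n={n}  mean forecast={mean_forecast:.2f}  happened={observed:.2f}")
```

In this made-up history the ~90% bucket holds four forecasts, of which three happened (75%), which is the overconfidence pattern described above. With so few forecasts per bucket the rates are very noisy, so real use would need far more data.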
You need resolved questions to have outcomes at all. Free options: forecasting tournaments (Good Judgment Open, Metaculus), a plain spreadsheet against news events with deadlines, or BearScout — TraderBear's free forecasting game that locks your probability on real prediction-market questions and Brier-scores it against the market's ask price at call time. The market price is the benchmark that matters: being more accurate than it, not just being right, is evidence of genuine forecasting skill. More on the method: How to practice forecasting.
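Comparing yourself to the market comes down to scoring both sets of probabilities on the same questions. The Python sketch below shows the idea; it is not BearScout's actual scoring code, the numbers are invented, and it assumes the market price at call time can be read directly as a probability.

```python
def mean_brier(probs, outcomes):
    """Mean of (forecast - outcome)^2 over matched forecasts and outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)


# Made-up numbers for illustration only.
your_probs = [0.80, 0.30, 0.65, 0.10]
market_probs = [0.70, 0.40, 0.60, 0.20]  # market price at the time of each call
outcomes = [1, 0, 1, 0]

yours = mean_brier(your_probs, outcomes)
market = mean_brier(market_probs, outcomes)
edge = market - yours  # positive means you beat the market on these questions

print(f"you={yours:.4f}  market={market:.4f}  edge={edge:+.4f}")
```

Here both forecasters were "right" on every question in the sense of leaning the correct way, but the difference in Brier scores is what shows who was more accurate. Four questions is far too few to conclude anything; the same sample-size caveat applies to the edge as to the raw score.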
BearScout locks your forecast, waits for settlement, and scores you against the market — free, pseudonymous, no real money.
Play BearScout →