For education only. TraderBear is not a registered investment adviser. Nothing here is investment advice. This tool computes math on numbers you provide; it stores nothing.

Brier score & calibration calculator

The Brier score is the mean squared error between probability forecasts and outcomes. Each forecast scores (forecast − outcome)², where the forecast is a probability between 0 and 1 and the outcome is 1 if the event happened and 0 if it didn't; the Brier score is the average of those values, so it ranges from 0 (perfect) to 1 (confidently wrong every time). Lower is better. Answering 50% on everything scores exactly 0.25, so a sustained average below 0.25 means your probabilities carry real information. Paste your forecast history below: everything runs in your browser, and nothing is uploaded or stored.
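As a rough illustration of the definition above, the calculation can be sketched in a few lines of Python. The function name `brier_score` and the sample data are assumptions made for this example; this is not the calculator's own source code.

```python
def brier_score(forecasts):
    """Mean Brier score for a list of (probability, outcome) pairs.

    probability is a float in [0, 1]; outcome is 1 (happened) or 0 (didn't).
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)


# Hypothetical history: four forecasts with their resolved outcomes.
history = [(0.70, 1), (0.55, 0), (0.90, 1), (0.25, 0)]
print(round(brier_score(history), 3))  # 0.116

# Answering 50% on everything scores 0.25 whatever the outcomes are.
print(brier_score([(0.5, 1), (0.5, 0)]))  # 0.25
```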

One line per forecast: probability, outcome — probability as a percent (70) or decimal (0.7); outcome as 1 (happened) or 0 (didn't). Example:

70, 1
55, 0
0.9, 1
25, 0
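
One way to read this input format is sketched below. The function name `parse_forecasts` is hypothetical, and the rule for telling percents from decimals (anything above 1 is a percent) is an assumption of this sketch; the calculator may resolve an ambiguous value such as 1 differently.

```python
def parse_forecasts(text):
    """Parse 'probability, outcome' lines into (probability, outcome) pairs."""
    pairs = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 'probability, outcome'")
        prob = float(parts[0])
        # Assumption: a value above 1 is a percent; 1 or below is a decimal.
        if prob > 1:
            prob /= 100
        if not 0 <= prob <= 1:
            raise ValueError(f"line {line_no}: probability out of range")
        if parts[1] not in ("0", "1"):
            raise ValueError(f"line {line_no}: outcome must be 0 or 1")
        pairs.append((prob, int(parts[1])))
    return pairs


sample = "70, 1\n55, 0\n0.9, 1\n25, 0"
print(parse_forecasts(sample))
```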

How to read your score

Mean Brier score | Reading
0.25 | Coin-flip baseline: answering 50% on everything scores exactly this.
0.20 – 0.25 | Some information, weak calibration. Typical for beginners with fewer than 30 forecasts.
0.15 – 0.20 | Meaningfully informative forecasts, comparable to engaged tournament forecasters.
< 0.15 | Strong. Superforecaster teams in the ACE tournament research averaged near this range on comparable question sets.

Context matters: scores are only comparable across the same question set, and roughly 30–50 resolved forecasts are needed before an average separates skill from luck.
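
To get a feel for why small samples are noisy, one rough check is to look at the standard error of the per-forecast scores next to the mean. The sketch below assumes the forecasts are independent, which real question sets often are not, and the function name `brier_summary` is hypothetical. Treat the output as an indication, not a significance test.

```python
import math
import statistics


def brier_summary(forecasts):
    """Return (mean Brier score, standard error of that mean).

    forecasts is a list of (probability, outcome) pairs; at least two are
    needed to estimate the spread.
    """
    scores = [(p - outcome) ** 2 for p, outcome in forecasts]
    mean = statistics.fmean(scores)
    std_err = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, std_err


# Hypothetical history of 8 forecasts: too few for a tight estimate.
history = [(0.8, 1), (0.7, 0), (0.6, 1), (0.9, 1),
           (0.3, 0), (0.4, 1), (0.2, 0), (0.75, 1)]
mean, std_err = brier_summary(history)
print(f"mean {mean:.3f}, standard error {std_err:.3f}")
```

With only a handful of forecasts the standard error is usually large relative to the gap between the mean and 0.25, which is the intuition behind waiting for a few dozen resolved questions.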

Brier score vs log score

Both are proper scoring rules: your expected score is best when you report your honest probability. The Brier score is bounded (0 to 1 per forecast), so even the worst possible miss costs at most 1. The logarithmic score, the log of the probability you assigned to what actually happened, is unbounded and punishes a confident miss without limit (a 100% call that misses scores negative infinity). Brier is the standard for forecasting tournaments precisely because it is bounded and easy to interpret; in machine learning the negative of the log score is the familiar cross-entropy loss.
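
The contrast can be seen numerically. The sketch below uses hypothetical helper names (`brier`, `log_score`, `expected_brier`) and the convention that the log score is the natural log of the probability assigned to the realized outcome, so higher is better. It also checks the "proper" property for the Brier score by searching a grid of possible reports.

```python
import math


def brier(p, outcome):
    """Brier score for one forecast (lower is better, at most 1)."""
    return (p - outcome) ** 2


def log_score(p, outcome):
    """Log of the probability given to what happened (higher is better)."""
    assigned = p if outcome == 1 else 1 - p
    return math.log(assigned) if assigned > 0 else float("-inf")


# A miss (outcome 0) at increasing confidence: Brier tops out at 1,
# while the log score keeps falling and reaches -inf at 100%.
for p in (0.6, 0.9, 0.99, 1.0):
    print(p, round(brier(p, 0), 4), round(log_score(p, 0), 3))


def expected_brier(report, true_p):
    """Expected Brier score if the event truly happens with probability true_p."""
    return true_p * (report - 1) ** 2 + (1 - true_p) * report ** 2


# Proper scoring rule: the best report is the honest probability.
grid = [i / 100 for i in range(101)]
best = min(grid, key=lambda report: expected_brier(report, 0.7))
print(best)  # 0.7
```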

Calibration: the second half of the picture

A single average hides which probabilities are off. Calibration groups your forecasts into buckets (all the ~70% calls together, all the ~90% calls together) and asks whether each bucket's events actually happened at that rate. Most people are overconfident in the 80–95% range — their "90%" events happen roughly 75% of the time. The calculator above prints your calibration table automatically when you have enough forecasts per bucket.
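
A calibration table of this kind can be sketched as follows. The bucket edges (fixed 10-point bands such as 70-80%), the rounding of each forecast to the nearest whole percent before bucketing, and the name `calibration_table` are all assumptions of this example; the calculator's own bucketing may differ.

```python
from collections import defaultdict


def calibration_table(forecasts, bucket_pct=10):
    """Group (probability, outcome) pairs into percent bands.

    Returns one row per non-empty band with the number of forecasts,
    their mean stated probability, and how often the event happened.
    """
    buckets = defaultdict(list)
    last = 100 // bucket_pct - 1
    for p, outcome in forecasts:
        # Round to a whole percent first to avoid float edge cases,
        # and fold a 100% forecast into the top band.
        idx = min(round(p * 100) // bucket_pct, last)
        buckets[idx].append((p, outcome))

    rows = []
    for idx in sorted(buckets):
        items = buckets[idx]
        n = len(items)
        rows.append({
            "bucket": f"{idx * bucket_pct}-{(idx + 1) * bucket_pct}%",
            "count": n,
            "mean_forecast": sum(p for p, _ in items) / n,
            "observed_rate": sum(o for _, o in items) / n,
        })
    return rows


# Hypothetical history: the ~90% calls came true only 3 times out of 4.
history = [(0.9, 1), (0.9, 1), (0.9, 0), (0.92, 1), (0.7, 1), (0.7, 0)]
for row in calibration_table(history):
    print(row)
```

A well-calibrated forecaster would see `observed_rate` close to `mean_forecast` in every row, given enough forecasts per band.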

Where to get scored forecasts for practice

You need resolved questions to have outcomes at all. Free options: forecasting tournaments (Good Judgment Open, Metaculus), a plain spreadsheet against news events with deadlines, or BearScout — TraderBear's free forecasting game that locks your probability on real prediction-market questions and Brier-scores it against the market's ask price at call time. The market price is the benchmark that matters: being more accurate than it, not just being right, is evidence of genuine forecasting skill. More on the method: How to practice forecasting.
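
Benchmarking against a market price can be sketched by scoring the market's price as if it were a forecast and comparing the two averages. This is an illustration only, not BearScout's actual scoring code: the name `edge_over_market`, the record layout, and the reading of a price between 0 and 1 as a probability are assumptions.

```python
def edge_over_market(records):
    """Compare your Brier score with the market's on the same questions.

    records is a list of (your_prob, market_prob, outcome) tuples, with
    probabilities in [0, 1] and outcome 1 or 0.
    Returns (your_brier, market_brier, edge), where edge is
    market_brier - your_brier; a positive edge means you beat the market.
    """
    if not records:
        raise ValueError("need at least one record")
    n = len(records)
    yours = sum((p - o) ** 2 for p, _, o in records) / n
    market = sum((m - o) ** 2 for _, m, o in records) / n
    return yours, market, market - yours


# Hypothetical resolved questions: (your forecast, market price, outcome).
records = [(0.8, 0.6, 1), (0.3, 0.5, 0), (0.6, 0.7, 1)]
yours, market, edge = edge_over_market(records)
print(f"you {yours:.3f}, market {market:.3f}, edge {edge:+.3f}")
```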

Get scored on live questions.

BearScout locks your forecast, waits for settlement, and scores you against the market — free, pseudonymous, no real money.
