Model Performance

All Seasons · 13,628 graded predictions

In-sample caveat: 12,931 of the 13,628 graded predictions were generated after their games had completed, so only the 697 prospective predictions are reliable for evaluation. Metrics will become trustworthy as the current season progresses and the daily ETL generates pre-game predictions.
Overall Accuracy: 67%
Last 50 Games: 80%
Brier Score (↓ better): 0.2226
Log Loss (↓ better): 0.7214
AUC (↑ better): 0.6805
Predictions Graded: 13,628

Calibration

When the model says X%, does the favored team win X% of the time? The vertical bar shows the predicted probability; the filled bar shows actual.

Bucket     Games   Actual win rate   Δ vs predicted
50–55%     1,436   50%               -3pp
55–60%     1,605   54%               -3pp
60–65%     1,680   58%               -5pp
65–70%     1,817   64%               -3pp
70–75%     1,438   70%               -3pp
75–80%     1,242   75%               -2pp
80–100%    4,410   78%               -14pp

By Confidence Tier

Accuracy when the model is highly confident in its pick.

Confidence   Games    Accuracy
60%          10,587   71%
65%           8,907   73%
70%           7,090   76%
75%           5,652   77%

Score Projection Accuracy

How close were the Monte Carlo score projections to actual results? The model's own projected run differential acts as the spread.

Avg margin error (runs): 4.8
Margin bias (under-projects): +1.0
Own-line cover rate: 56%
Within 2 runs of projected margin: 28%
Within 5 runs of projected margin: 61%

Based on 13,628 games with score projections. “Own-line cover rate” = how often the actual margin exceeded the model's projected margin in the predicted direction.

Top Teams by Elo

Rolling Elo rating (1500 = average). Updated after every game result.

Rank   Team                          Games   Elo
1      Texas Longhorns               267     1797
2      North Carolina Tar Heels      267     1774
3      Southern Miss Golden Eagles   174     1765
4      Auburn Tigers                 253     1757
5      UCLA Bruins                   251     1751
6      LSU Tigers                    283     1741
7      Oregon State Beavers          268     1738
8      Florida State Seminoles       253     1738
9      Florida Gators                282     1736
10     Mississippi State Bulldogs    248     1735

D1 Diamond Top 25 — 2026 Season

Composite ranking of all D1 teams with ≥5 games played.

#    Team   Record   Elo    Run Diff   Score
1    TEX    16-0     1797   +7.6       94
2    GT     15-2     1704   +9.5       89
3    UGA    15-3     1709   +8.4       87
4    TA&M   15-1     1726   +7.2       87
5    MSST   15-2     1735   +6.8       85
6    AUB    14-2     1757   +5.1       84
7    CLEM   15-2     1734   +5.2       84
8    OU     14-2     1734   +6.4       84
9    UNC    15-3     1774   +5.8       83
10   MISS   15-3     1719   +4.9       83
11   FLA    15-3     1736   +4.1       82
12   FSU    13-3     1738   +4.6       82
13   USM    15-2     1765   +3.5       82
14   USC    15-0     1710   +5.0       82
15   UCLA   14-2     1751   +5.8       81
16   NCST   14-3     1673   +7.9       80
17   LSU    13-5     1741   +3.9       79
18   WAKE   15-2     1705   +4.7       79
19   UK     15-2     1692   +4.6       78
20   ALA    15-3     1722   +4.1       78
21   ORE    13-3     1703   +5.7       78
22   UVA    14-3     1709   +5.1       77
23   ORST   11-4     1738   +1.8       76
24   TENN   13-4     1717   +3.4       76
25   UCSB   13-2     1682   +4.1       76

Minimum 5 games played. wOBA/FIP from FanGraphs.

How to Read This

Overall Accuracy — % of games where the model's favored team won. A coin flip is 50%; sharp sports models typically land 58–65% in college baseball.

Brier Score — measures probability calibration quality. Lower is better. A model that always predicts 50% scores 0.25; perfect predictions score 0.
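
As a minimal sketch of the computation (toy probabilities and outcomes, not dashboard data):

```python
# Brier score: mean squared error between the predicted probability and the
# 0/1 outcome. Always predicting 50% scores 0.25; perfect predictions score 0.
def brier_score(probs, outcomes):
    """probs: predicted win probabilities; outcomes: 1 if that team won, else 0."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Illustrative five-game slate:
probs = [0.70, 0.60, 0.55, 0.80, 0.65]
outcomes = [1, 1, 0, 1, 0]
print(round(brier_score(probs, outcomes), 3))  # 0.203
```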

Log Loss — penalises confident wrong predictions more harshly than Brier score. Lower is better. Random baseline is ~0.693 (ln 2); well-calibrated sports models typically land 0.55–0.62.
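
The same toy setup shows why confident misses hurt more here than under Brier:

```python
import math

# Log loss: negative mean log-probability assigned to the observed outcome.
# A single confident miss (p = 0.9, team loses) costs ln(10) ≈ 2.30 on its own.
def log_loss(probs, outcomes):
    return -sum(math.log(p if o == 1 else 1 - p)
                for p, o in zip(probs, outcomes)) / len(probs)

# A coin-flip model scores exactly ln 2 regardless of the results:
print(round(log_loss([0.5, 0.5, 0.5], [1, 0, 1]), 4))  # 0.6931
```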

AUC — probability that the model ranks a random home win above a random home loss. 0.5 = no skill; 1.0 = perfect ranking. Measures discrimination independently of calibration.
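
The pairwise definition can be computed directly (a sketch on toy data; real pipelines use a sorted-rank formulation instead):

```python
# AUC by direct pairwise comparison: for every (win, loss) pair, count 1 if
# the win was assigned the higher probability, 0.5 for a tie. This O(n*m)
# form is slow for 13k games but makes the definition explicit.
def auc(probs, outcomes):
    wins = [p for p, o in zip(probs, outcomes) if o == 1]
    losses = [p for p, o in zip(probs, outcomes) if o == 0]
    hits = sum(1.0 if w > l else 0.5 if w == l else 0.0
               for w in wins for l in losses)
    return hits / (len(wins) * len(losses))

print(auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # 1.0 — perfect ranking
print(auc([0.6, 0.6], [1, 0]))                  # 0.5 — no discrimination
```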

Calibration chart — a perfectly calibrated model's bar would exactly touch the predicted line in every bucket. A bar extending past the line (actual win rate above predicted) means the model is underconfident in that bucket; a bar falling short means it is overconfident.
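
The bucketing behind the chart can be sketched as follows (toy data; the bucket edges mirror the chart's ranges):

```python
# Group predictions into probability buckets and compare each bucket's
# average forecast with its empirical win rate.
def calibration_buckets(probs, outcomes, edges):
    rows = []
    for lo, hi in zip(edges, edges[1:]):
        bucket = [(p, o) for p, o in zip(probs, outcomes) if lo <= p < hi]
        if bucket:
            avg_pred = sum(p for p, _ in bucket) / len(bucket)
            win_rate = sum(o for _, o in bucket) / len(bucket)
            rows.append((f"{lo:.0%}–{hi:.0%}", len(bucket), avg_pred, win_rate))
    return rows

probs = [0.52, 0.58, 0.57, 0.61]
outcomes = [1, 0, 1, 1]
for row in calibration_buckets(probs, outcomes, [0.50, 0.55, 0.60, 0.65]):
    print(row)
```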

Score projection — tracks how close the Monte Carlo run totals were to reality. “Avg margin error” is the mean absolute difference between projected and actual run differential. “Own-line cover rate” treats the model's own projected margin as the spread — above 50% means the model tends to under-project winning margins.
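
Both score-projection metrics reduce to simple margin arithmetic. A sketch, assuming margins are run differentials from the home team's perspective (a sign convention the page does not state) and using illustrative numbers:

```python
# Mean absolute difference between projected and actual run differential.
def avg_margin_error(projected, actual):
    return sum(abs(a - p) for p, a in zip(projected, actual)) / len(projected)

# "Own-line cover": the actual margin beat the projected margin in the
# predicted direction (win by more for a home pick, lose by more for away).
def own_line_cover_rate(projected, actual):
    covers = sum(1 for p, a in zip(projected, actual)
                 if (a > p if p > 0 else a < p))
    return covers / len(projected)

proj = [+2.5, -1.5, +4.0, +1.0]
act  = [+5.0, -3.0, +2.0, -1.0]
print(avg_margin_error(proj, act))    # 2.0
print(own_line_cover_rate(proj, act)) # 0.5
```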

Elo ratings — self-correcting power ratings that update after every game. 1500 is average; top programs typically reach 1600–1700 by mid-season.
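
A standard Elo update looks like the following; the K-factor of 20 and the 400-point logistic scale are the conventional defaults, since the page does not state its exact parameters:

```python
# Expected score from the logistic Elo curve (400-point scale).
def elo_expected(r_a, r_b):
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

# Zero-sum update after a result; K controls how fast ratings move.
def elo_update(r_a, r_b, score_a, k=20):
    """score_a: 1 if team A won, 0 if it lost. Returns both new ratings."""
    delta = k * (score_a - elo_expected(r_a, r_b))
    return r_a + delta, r_b - delta

# Two average (1500) teams: the winner gains exactly K/2 points.
print(elo_update(1500, 1500, 1))  # (1510.0, 1490.0)
```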