Look at the poor geek, crying in his Red Bull.
Lately, I’ve had a bit of a crisis of confidence. Week 15 of the 2016-17 WCHA slate was pretty wild: a league that hadn’t often gone to overtime saw seven of its eight games go past regulation. I trust the underlying principles of what I’m doing, but even I had some concerns.
One of the things I did was compare what I call BELOW 2016, the version I’ve been using, against BELOW 1500, where every team starts at an estimated BELOW of 1500 and is adjusted from game results from there. The third table in the link above shows that the difference isn’t huge, but it exists.
Then I got to wondering about whether the three-goal multiplier was worth the hassle and held any predictive value. I decided that it probably did.
It’s all really just an estimate!
But both of these got me to thinking more about what BELOW is: it’s an estimate distilled into a rating. Is there anything special about that one specific number? What’s the error band on that estimate?
Then I realized what I needed to do: I needed to randomize the range of BELOW.
Think about it like this. BGSU’s current BELOW rating is 1559, or a team that wins about 58% of the time against any average WCHA team. When you look at the Falcons’ point percentage, that seems pretty close to the truth: they’ve gotten 53% of the possible points for the games that they’ve played.
But what if that points percentage were the truth? We can infer from the table at the bottom of the BELOW explainer that BG could be more like a 1520 or so. That’s a pretty big swing! Even though we know that BG has played a schedule that averages 1516, not 1500. So the truth is probably somewhere between 1520 and 1559.
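Both of those numbers fall out of the standard Elo expectation formula, which BELOW appears to be built on. Here’s a minimal sketch; the 1500 baseline and 400-point scale are the usual Elo conventions, not something spelled out in this post:

```python
import math

def expected_win_pct(rating, opponent=1500):
    """Standard Elo expected score: how often `rating` beats `opponent`."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))

def implied_rating(points_pct, opponent=1500):
    """Invert the formula: the rating implied by a points percentage."""
    return opponent + 400 * math.log10(points_pct / (1 - points_pct))

print(round(expected_win_pct(1559), 2))  # 0.58: a 1559 team wins ~58%
print(round(implied_rating(0.53)))       # 1521: a 53% team rates ~1520
```

Running a points percentage through the inverse direction is how you get that “more like a 1520” reading from a 53% points pace.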
Now, I could divine this for everyone, or I could take advantage of the Monte Carlo simulation approach and make things random.
Count on Monte Carlo
If you aren’t familiar with the process, here is a very short explanation:
- ABOVE starts with two BELOW estimates, one for each team.
- ABOVE calculates an expected value from the Elo family of functions.
- Based on that expected value, one team is declared the lesser team and the other the greater, with the break-even point at the EV of the lower team.
- Historical data of the number of shootouts, 3v3 overtimes, 5v5 overtimes, one-goal wins, and two-goal wins create bands centered around that expected value, which I consider the basis. Overtime games seem to be generally a toss-up, so the real action is in regulation.
- Because of the skew of the basis away from a 50-50 matchup, the stronger team is far more likely to get to a 2+ goal victory. This is consistent with the data. If it weren’t, the system would correct and revert teams to the mean.
- ABOVE does a dice roll using Python’s random.random() function, which generates a pseudo-random floating-point number in the half-open range [0.0, 1.0). That number is checked against those bands, and whichever slot it falls into determines the winner.
- Do this lots of times. For the series-by-series predictions, I run them 1,000,000 times.
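The steps above can be sketched in a few lines. The band widths here are invented for illustration (the real ones come from the historical shootout/OT/margin data), but the dice-roll mechanic is the same:

```python
import random

def simulate_game(ev_a, n=1_000_000):
    """Roll random.random() n times against cumulative outcome bands."""
    # Assumed numbers: 20% of games reach OT, and OT is a coin flip.
    ot_share = 0.20
    a_reg = ev_a * (1 - ot_share)        # band: team A wins in regulation
    b_reg = (1 - ev_a) * (1 - ot_share)  # band: team B wins in regulation
    wins_a = 0
    for _ in range(n):
        r = random.random()
        if r < a_reg:
            wins_a += 1          # fell in A's regulation band
        elif r < a_reg + b_reg:
            pass                 # fell in B's regulation band
        elif random.random() < 0.5:
            wins_a += 1          # OT/shootout treated as a toss-up
    return wins_a / n

print(simulate_game(0.584, 10_000))  # hovers near 0.57 with these bands
```

Note how the toss-up OT slice drags a 58% team slightly back toward 50%, which matches the idea that the real action is in regulation.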
So here’s why I’m going to randomize: if the BELOW estimate is off for some reason, randomizing the input will help even out the error, just as randomizing the simulated results gets you a good estimate of the outcomes.
How Randomizing BELOW Inputs Works
Let’s look at Bowling Green again. Their BELOW is 1559. If we say that the error band of BELOW is, oh, 40 points, then ABOVE’s random-integer function could produce a revised BELOW estimate anywhere between 1519 and 1599, a range spanning a 53% team to a 64% team.
So? Bowling Green is playing Ferris State, who has a BELOW estimate of 1486. That means that their randomized BELOW range could be 1446 to 1526. Yes, you will very likely come up with a handful of scenarios where the random generator has Ferris State as a favorite!
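A quick way to check that claim is to jitter both ratings and count how often the draw flips the favorite. The ±40 band and the use of randint are my assumptions from the description above, and upset_share is a name I made up:

```python
import random

def upset_share(below_a, below_b, band=40, trials=100_000):
    """How often the nominal underdog draws the higher randomized BELOW."""
    upsets = 0
    for _ in range(trials):
        a = below_a + random.randint(-band, band)  # e.g. BG: 1519-1599
        b = below_b + random.randint(-band, band)  # e.g. FSU: 1446-1526
        if b > a:
            upsets += 1
    return upsets / trials

# Bowling Green (1559) vs. Ferris State (1486): the two ranges barely
# overlap, so only a small fraction of draws (well under 1%) put
# Ferris State on top: a handful of scenarios, as promised.
print(upset_share(1559, 1486))
```

The bigger the nominal gap relative to the error band, the rarer those flipped draws get, which is exactly the behavior you’d want from an error band.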
I will not be releasing revised estimates of the series-by-series predictions prior to the drop of the puck on Friday — what’s out there is what’s happening. But this could be a game-changer.