2015-16 Week 11 BELOW Ratings

I introduced BELOW yesterday.  If you haven’t read that post, you’re gonna be confused.

We know that the WCHA is jam packed in the middle this year: in 70 games, there’ve been 14 ties (20%), seven of which were 1-1 results (10%) and one 0-0 result.  Furthermore, six wins have come in OT.

As a result of all of this, the BELOW ratings through the first half of the WCHA season — 70 games over nine weeks of play — are packed pretty tightly.

2015-16 BELOW Rankings, Week 11

BELOW uses an Elo-style ranking system to iteratively rate WCHA teams in league play. It is both reflective and predictive.
Team    Start   Week 11   Rank   Change
MSU     1688    1701      1st    +20
MTU     1654    1621      2nd    -33
BGSU    1575    1609      3rd    +25
NMU     1507    1506      4th    -1
BSU     1534    1502      5th    -32
UAA     1416    1475      6th    +59
UAF     1523    1461      7th    -68
LSSU    1403    1440      8th    +37
FSU     1489    1439      9th    -50
UAH     1400    1357      10th   -43

That’s ugly: one great team in Minnesota State, two very good teams in Michigan Tech and Bowling Green, six stacked-together teams in Northern Michigan, Bemidji State, Alaska-Anchorage, Alaska, Lake Superior, and Ferris State, and … the depressing season that is Alabama-Huntsville.

[We pause for weeping from Michael Napier and me.]

The biggest movers are Anchorage (+59, or an 8.4% increase in winning percentage) and Lake State (+37, 5.3%), who are bounding out of the cellar and currently sitting tied for 6th in the WCHA men’s standings.  Also strongly improving their lot this year are BG (+34, 4.9%) and Mankato (+20, 2.9%), who are looking to push hard and fight for the Broadmoor and McNaughton trophies.

Regressing strongly are Fairbanks (-60, -8.5%; miss U Parayko), Huntsville (-43, -6.2%; I have no idea what’s wrong, other than everything seems to be wrong), and Ferris (-42, -6.0%; miss U Motte).  Bemidji (-36, -5.2%) and Tech (-33, -4.7%) have fallen off as well, with the Beavers threatening to miss the playoffs and Tech looking to hold on to home ice in the MacInnes.
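If you’re wondering where those percentages come from, here’s a minimal sketch.  It assumes each percentage is the shift in win expectation against an opponent who used to be an even match, read off the standard Elo curve with a 400-point scale.  That assumption reproduces the figures quoted above, but the function name and the scale parameter are mine, not part of BELOW itself.

```python
def win_pct_shift(rating_change: float, scale: float = 400.0) -> float:
    """Shift in win expectation versus a formerly even opponent.

    Assumes the standard Elo logistic curve with a 400-point scale.
    """
    expected = 1.0 / (1.0 + 10.0 ** (-rating_change / scale))
    return expected - 0.5  # an even matchup starts at 50%


# Rating changes quoted in the post
for team, change in [("UAA", 59), ("LSSU", 37), ("UAH", -43)]:
    print(f"{team}: {change:+d} points -> {win_pct_shift(change):+.1%}")
```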

Northern hasn’t made any real change this season other than scaring Southern Baptist and Pentecostal preachers with their 6-6-6 mark.

I haven’t put together predictions just yet, but I think that it’s worth looking at the schedule that each of the teams has left.  If a matchup is in bold, that team is the favorite.

UAH: v Fairbanks, @ Ferris, @ Anchorage, v Mankato, @ Northern, @ Bemidji, v BG.

UAA: v Bemidji, @ Lake, @ BG, v Huntsville, @ Northern, v Lake, v Mankato, @ Alaska.

UAF: v Bemidji, @ Huntsville, @ Northern, v Ferris, v BG, @ Tech, v Anchorage.

BSU: @ Fairbanks, @ Anchorage, v Ferris, @ Lake, v Tech, v Huntsville, @ Mankato

BGSU: @ Lake, v Anchorage, v Tech, @ Fairbanks, @ Mankato, v Ferris, @ Huntsville

FSU: v Huntsville, @ Bemidji, @ Fairbanks, v Northern, @ BG, @ Lake (essentially a wash)

LSSU: v BG, v Anchorage, @ Mankato, @ Tech, v Bemidji, @ Anchorage, v Northern, @ Ferris (essentially a wash)

MSU: v Northern, @ Tech, v Lake, v BG, @ Huntsville, @ Anchorage, v Bemidji

MTU: v Mankato, @ BG, v Lake, @ Bemidji, v Fairbanks, H/H with Northern

NMU: @ Mankato, v Fairbanks, v Anchorage, @ Ferris, v Huntsville, @ Lake, H/H with Tech

I’ll have some thoughts on predictions as we round towards January.  Week 14 is light — just the first half of Bemidji’s trip to the 49th, starting with a trip to Fairbanks — but we’ll keep an eye on things going forward.

Introducing BELOW

It’s time to Bring Elo to the WCHA.

Arpad Elo’s rating system for chess was adopted in the early 1960s.  The principle is extremely simple:

Team A and Team B have certain known ratings from their past performance.  Knowing those ratings, we can infer that Team A is better than Team B (or vice versa).  We can use that information to create an expected value of the match.

Here’s an example.  Based on last year’s results, I had Michigan Tech’s starting 2015-16 BELOW as 1655, while Ferris State’s was 1481.  If you go from the tradition that Elo ratings are average at 1500, that makes sense: Tech was 21-5-2 in the league and 29-10-2 overall, while Ferris was 13-14-1 and 18-20-2, respectively.

So let’s keep those numbers in mind when we get to Week 3 of the 2015-16 season, when the Huskies went to Big Rapids to play the Bulldogs.  Tech housed Ferris the first night, 5-1, but lost the next night, 2-3.

For Friday night’s result, Tech’s rating got a little bit better — 16 points better, even with a resounding win — but fell 34 points with the one-goal loss on Saturday.  Why does that matter?  It’s all about expected value.

Difference   Win Expectation of Weak Team   Win Expectation of Strong Team
0            50.0%                          50.0%
10           48.6%                          51.4%
20           47.1%                          52.9%
30           45.7%                          54.3%
40           44.3%                          55.7%
50           42.9%                          57.1%
75           39.4%                          60.6%
100          36.0%                          64.0%
150          29.7%                          70.3%
200          24.0%                          76.0%
250          19.2%                          80.8%
300          15.1%                          84.9%
350          11.8%                          88.2%
400          9.1%                           90.9%
450          7.0%                           93.0%

Based on that, you can see that Tech, being 174 points better than Ferris in BELOW, would be expected to win between 70.3% and 76.0% of the time.  (It was 73.1%, to be exact.)  That the teams split was unexpected, and BELOW reacted to that.
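The table lines up with the standard Elo win-expectation curve on a 400-point scale, so here’s a short sketch that regenerates a few rows and the Tech/Ferris number.  The curve is my assumption about how the table was built (it matches every row I checked), and `win_expectation` is just an illustrative name.

```python
def win_expectation(rating: float, opponent: float, scale: float = 400.0) -> float:
    """Expected win share for `rating` against `opponent` (standard Elo curve)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / scale))


# A few rows of the table: rating difference, weak team, strong team
for diff in (0, 50, 100, 200, 400):
    strong = win_expectation(diff, 0)
    print(f"{diff:>4}  {1 - strong:6.1%}  {strong:6.1%}")

# Tech (1655) against Ferris (1481), a 174-point gap
print(f"Tech over Ferris: {win_expectation(1655, 1481):.1%}")
```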

Creating the Initial 2015-16 BELOW Ratings

I ginned up that table because I needed to see how to assign relative point values based on 2014-15 data (notably KRACH along with general winning-percentage).  It took me about an hour, but I got there.

[Image: screenshot of the initial 2015-16 BELOW ratings]

It’s what you’d expect: the great teams are really good, the average teams hover around 1500, and the bad teams are about as far away from the middle as the good teams are.

The Discount

Now, Elo ratings were designed around chess players, who presumably learn and grow or get older/less able/disinterested and get worse.  The math also works in terms of the player as a random variable: sometimes you’ll be great when you’re just not that good, and sometimes you’ll have a stinker when you’re a top player.  If you need an example of that: Tech 1, UAH 0 (3 OT).

But Elo has a continuity when applied to an individual that it lacks when applied to teams, whose complexions regularly change.  The fairest thing to do is to discount all teams at the change of a season.  If a team is truly good (or truly bad), they will exhibit that early on and open that gulf again.

As such, I’ve created a discount for teams based on the 2014-15 end-of-season, KRACH-based ratings.  Without any historical data to base it on, I figured that it’s reasonable that teams would regress to the mean by 1/3: that is, they’d move closer to 1500, but not by all that much.  As an example, UAH’s 1350 rating for 2014-15 became a 1400 rating for 2015-16.
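The discount is a one-liner.  Here’s a sketch of it as described: pull each end-of-season rating one third of the way back to 1500.  The function and parameter names are mine.

```python
def discount(rating: float, mean: float = 1500.0, fraction: float = 1 / 3) -> float:
    """Regress a rating toward the league mean by the given fraction."""
    return rating + (mean - rating) * fraction


# UAH's 2014-15 rating of 1350 carried into 2015-16
print(round(discount(1350)))
```

A team above 1500 moves down by the same rule, and a team sitting exactly at 1500 doesn’t move at all.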

The other value of the discount is that an improving (or declining) team will progress/regress to the mean very quickly.  Why is this valuable?  Elo is going to be able to reward consistent success from past failures strongly — first as upsets and then simply as confirming and amplifying the signal.  If the 1/3 discount is unrealistic — a truly bad team is truly bad, or a great team retains its greatness — BELOW self-corrects over time.

Expected Value

Let’s consider Tech and Ferris again.  Going into that first game, Tech’s expected value to win was 73.1%, a very strong score from a team that had been at the top of Division I the previous season.  With Ferris being just under average at the league and national levels, you’d expect that a win for Tech and a loss for Ferris wouldn’t move the BELOW needle very much: all the game did was confirm the expected result.

Indeed, that’s what happened: Tech’s BELOW went from 1655 to 1671, while Ferris dropped from 1481 to 1465.  Note that the point change is of equal magnitude and opposite direction.

But the next night, Ferris wins against an expected winning percentage of 23.4%, with its BELOW jumping to 1504, or slightly above average.  Why the difference?  It’s the variance in the expected value.

In the Friday game, Tech’s EW% was .731; after the game, it was 1.00.  The difference, .269, is credited to the Huskies and run through the Elo algorithm, which makes it harder and harder to move away from 1500 and easier to move back when the unexpected happens.  That’s how you get the Saturday result of more than twice the change in the rating.
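To see why the upset moves the needle more, here’s a bare-bones Elo update.  This is a sketch under assumptions, not BELOW’s actual code: it uses the standard 400-point curve and a constant of 20, and it leaves out the goal-margin bonus, so the point totals come out smaller than the real BELOW changes.  The shape is the same, though.

```python
K = 20  # assumed volatility constant; sports Elo systems tend to sit near 20


def win_expectation(rating: float, opponent: float, scale: float = 400.0) -> float:
    """Expected win share for `rating` against `opponent` (standard Elo curve)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / scale))


def rating_shift(rating: float, opponent: float, actual: float, k: float = K) -> float:
    """Points gained by the first team; the opponent loses the same amount."""
    return k * (actual - win_expectation(rating, opponent))


friday = rating_shift(1655, 1481, actual=1.0)    # Tech wins as the favorite
saturday = rating_shift(1671, 1465, actual=0.0)  # Tech loses as a bigger favorite
print(f"Friday: {friday:+.1f}, Saturday: {saturday:+.1f}")
```

Even without any margin bonus, the Saturday loss costs Tech well over twice what the Friday win earned, because the result was that much further from what the ratings expected.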

Again, we need to step back and think about this.  The scores were 5-1 Tech, 3-2 Ferris.  Tech was top flight; Ferris was below average.  An Elo algorithm like BELOW says, “Well, we still think that Tech is very good, but it’s possible that Ferris is a little bit better this year than last year.”  An 18 point increase argues that Ferris is 2.6% better than last year.

But that’s just one game, and now WCHA teams have played 70, one-half of the league’s matches.  We’ve moved from having two truly great teams and three truly awful teams to just one great one (Minnesota State) and one awful one (Alabama-Huntsville).  The league has regressed, and BELOW confirms that.  (More on the Week 11 ratings in my next post.)

Predictions

The real value of something like Elo is that it is inherently predictive. Since it also treats the play of individual parties as a random variable, you can apply statistical methods and Monte Carlo simulations (i.e., rolling the dice lots of times and seeing what’s up).  I’ll be setting that up over the next few weeks so you can start playing with the scenarios, but for now, you can use an online Elo calculator if you’d like to get a sense of it.
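Until the real thing is ready, here’s a toy version of the dice-rolling idea.  It’s a sketch with loud assumptions: ratings stay fixed during the series, every game ends in a win or a loss (no ties or OT), and the win chance comes from the standard 400-point Elo curve.  The function names and the seed are mine.

```python
import random
from collections import Counter


def win_expectation(rating: float, opponent: float, scale: float = 400.0) -> float:
    """Expected win share for `rating` against `opponent` (standard Elo curve)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / scale))


def simulate_wins(rating, opponent, n_games, n_trials=10_000, seed=2015):
    """Share of trials in which the first team wins 0, 1, ..., n_games games."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    p = win_expectation(rating, opponent)
    counts = Counter()
    for _ in range(n_trials):
        wins = sum(rng.random() < p for _ in range(n_games))
        counts[wins] += 1
    return {wins: counts[wins] / n_trials for wins in range(n_games + 1)}


# Tech (1621) in a two-game set with Northern (1506), using Week 11 ratings
dist = simulate_wins(1621, 1506, n_games=2)
for wins, share in dist.items():
    print(f"Tech wins {wins}: {share:.1%}")
```

A fuller simulation would update the ratings after each game and allow ties, which matters a lot in a league where a fifth of the games have ended level.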

There are lots of things to consider:

  • What’s the inherent instability of the system?  Upsets happen, injuries occur, etc.  There’s a constant in Elo systems that covers that.  Sports implementations of Elo have generally held it around 20, and that’s where we are.
  • Additionally, it feels like there should be a goal margin bonus.  Absolutely pounding a team should be more impressive than a one-goal win.  I use a 5x multiplier at present.  A UAH 2-1 loss to Mankato last night cost the Chargers 4 points; a crazy 7-2 upset win would’ve given them 59 and almost pushed them into the 1400s.  A 10x multiplier makes that 86.  What’s “right”?  I don’t know.  I can do some modeling against the games in hand and see what’s closest to the data.  I’d probably feel better if I took it back to 2013-14 to have a larger data set.
  • Should there be a home team bonus in our very tough travel league?  Right now, six of the 10 teams do better on the road overall.  I’d have to look at historical data for the last two years to be sure.  Also, the goals per game of visiting and home teams are virtually the same as of Games 1-70: 2.39 G/GM for visitors, 2.34 G/GM for home squads.
  • Most importantly, what’s the value of an overtime win?  A tie is, as you might expect, 0.50 at the end of a game.  Is an overtime win 1.0?  I don’t think so.  You can have a one-goal lead for much of the third and hold on for the victory, and that always seems like it’s a better effort than having to go to the extra 5:00 frame.  Currently, I’m using 0.60 for OT wins and 0.40 for OT losses.
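Here’s how the result values from that last point could plug into the update.  This is a sketch only: it uses the constant of 20 and the 0.60/0.40 OT split described above, assumes the standard 400-point curve, and skips the goal-margin multiplier and any home bonus.  The result codes are labels I made up.

```python
K = 20  # constant described above

# Value credited to a team for each kind of result
RESULT_VALUE = {"W": 1.0, "OTW": 0.6, "T": 0.5, "OTL": 0.4, "L": 0.0}


def rating_change(rating: float, opponent: float, result: str, k: float = K) -> float:
    """Points gained (or lost) by a team for one game, with no margin bonus."""
    expected = 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))
    return k * (RESULT_VALUE[result] - expected)


# Two evenly matched teams: regulation win versus overtime win
print(f"Regulation win: {rating_change(1500, 1500, 'W'):+.1f}")
print(f"Overtime win:   {rating_change(1500, 1500, 'OTW'):+.1f}")
```

Between even teams, an OT win is worth a fifth of a regulation win under these values, and the OT loser gives up exactly what the winner gains.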