Tuesday, March 14, 2017

How Well Did The Computers Predict The Field?

This is my annual post where I break down the computer numbers of the bubble teams. Three ratings are included this year:

RPI (naturally)
BPI Strength of Record (a measure of resume strength)
Pomeroy (a measure of team strength)

Average Rating Error is the root mean square of the difference between each team's actual seed and the seed the team would have received if the field were ranked strictly by the rating system (looking at 1-11 seeds only).
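For anyone who wants to reproduce a number like this, here is a rough sketch of the calculation in Python. It is not the exact script behind the figures in this post. It assumes that the teams seeded 1-11 are sorted by their rating rank, handed out four per seed line, and capped at the 11 line; the function names, the cap, and the toy teams are all made up for illustration.

```python
import math


def implied_seed(position, teams_per_line=4, max_seed=11):
    """Seed line for the team at 1-based `position` in the rating order.

    Assumption: four teams per seed line, capped at `max_seed`.
    """
    return min(max_seed, math.ceil(position / teams_per_line))


def average_rating_error(teams, max_seed=11):
    """Root mean square of (actual seed - rating-implied seed).

    `teams` is a list of (name, actual_seed, rating_rank) tuples.
    Only teams actually seeded 1..max_seed are counted.
    """
    field = [t for t in teams if t[1] <= max_seed]
    if not field:
        raise ValueError("no teams seeded within the requested range")
    # Order the field from best to worst rating rank.
    ordered = sorted(field, key=lambda t: t[2])
    squared_errors = [
        (actual - implied_seed(pos, max_seed=max_seed)) ** 2
        for pos, (_, actual, _) in enumerate(ordered, start=1)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical eight-team field: (name, actual seed, rating rank).
toy = [
    ("Team A", 1, 1), ("Team B", 1, 2), ("Team C", 1, 3), ("Team D", 1, 5),
    ("Team E", 2, 4), ("Team F", 2, 6), ("Team G", 2, 7), ("Team H", 2, 8),
]
print(average_rating_error(toy))  # 0.5
```

In the toy field, Team D and Team E are each one seed line off and everyone else is exact, so the error is sqrt(2/8) = 0.5. Under the same four-per-line assumption, a team ranked 30th lands on the 8 line.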

Below that are listed the ten lowest-rated at-large teams and the ten highest-rated non-NCAA Tournament teams in each computer system.

Note that all of these numbers are as of Monday morning (i.e. they include all of the results up through Selection Sunday but do not include any post-Selection Sunday tournaments).

Average Rating Error:
1.59 - BPI SOR
2.21 - RPI
2.66 - Pomeroy


RPI

Ten highest rated teams to miss the Tournament (NIT seed given):

33. Illinois State (1)
45. UT-Arlington (6)
49. Monmouth (4)
52. Georgia (2)
53. California (1)
54. Houston (2)
58. Akron (7)
60. Belmont (7)
62. Charleston (5)
64. Illinois (2)

Ten lowest rated teams to earn an at-large (seed given):
61. Marquette (10)
57. Kansas State (11)
56. Providence (11)
51. Northwestern (8)
50. Michigan State (9)
48. Virginia Tech (9)
44. Seton Hall (9)
43. South Carolina (7)
42. Miami-Florida (8)
41. USC (11)

BPI Strength of Record

Ten highest rated teams to miss the Tournament (NIT seed given):

45. Illinois State (1)
51. Clemson (2)
53. TCU (4)
54. Georgia (2)
55. Monmouth (4)
57. Indiana (3)
58. Syracuse (1)
59. Pittsburgh (-)
60. Illinois (2)
61. UT-Arlington (6)

Ten lowest rated teams to earn an at-large (seed given):
52. Vanderbilt (9)
48. Michigan State (9)
46. Marquette (10)
44. Providence (11)
43. USC (11)
42. VCU (10)
41. Dayton (7)
40. South Carolina (7)
37. Kansas State (11)
36. Wake Forest (11)


Pomeroy

Ten highest rated teams to miss the Tournament (NIT seed given):

34. Clemson (2)
41. TCU (4)
42. Indiana (3)
46. Texas Tech (-)
47. Utah (3)
49. Houston (2)
50. Syracuse (1)
51. Illinois State (1)
54. Alabama (3)
57. Georgia (2)

Ten lowest rated teams to earn an at-large (seed given):
61. USC (11)
56. Providence (11)
53. Seton Hall (9)
52. VCU (10)
45. Maryland (6)
44. Virginia Tech (9)
43. Michigan State (9)
40. Xavier (11)
39. Northwestern (8)
38. Arkansas (8)


Remember, we're judging resumes here
In order to deny the power that RPI has over the Selection Committee, it's often pointed out that some highly rated RPI teams get left out. This year, for example, RPI #33 Illinois State got left out. Last year, four RPI Top 40 teams got left out (#30 St. Bonaventure, #34 Akron, #38 Saint Mary's, and #39 Princeton). The year before there were also four RPI Top 40 teams left out (#29 Colorado State, #34 Temple, #45 Tulsa, and #46 Old Dominion).

Remember that we're judging resume strength here, and even the Selection Committee is aware that the RPI is a horrible measure of that. The RPI ranking itself has never been the primary seeding mechanism, and so it's not a shock that BPI Strength of Record correlates better with NCAA Tournament seed than RPI.

That said, it was surprising to me just how close the BPI Strength of Record was to seed. Just one team outside the Top 50 earned an at-large bid, while just one team inside the Top 50 was denied. Looking at the Average Rating Error, it's clear that BPI Strength of Record was far, far more accurate than RPI.

The RPI Is Screwing Mid-Majors As Badly As Ever
You might have noticed something above. Notice how all of those RPI Top 40 teams that got left out were mid-majors? In fact, in the nine years that I've been doing these "How Well Did The Computers Predict The Field" posts, 16 teams from the RPI Top 40 have been denied, and all of them have been from outside the Power 6 conferences. Why? Because the bracket is dominated by RPI peripherals.

RPI Top 50 and RPI Top 100 wins have dominated and will continue to dominate the process. Not only is this unfair to mid-majors who cannot get RPI Top 50 opponents on their home court, but by using RPI Top 50 wins as a counting stat rather than as a rate stat, mid-majors are penalized twice (why exactly is a 6-8 record vs the RPI Top 50 seen as profoundly more impressive than a 3-4 record vs the RPI Top 50?).
Winning on the road at decent mid-majors is very hard, and you don't get rewarded for it. Mid-majors like Illinois State are forced to play those games, and so Illinois State suffered "bad" losses, all away from home and against decent mid-majors (San Francisco, Tulsa, and Murray State). Major conference teams avoid those games like the plague.
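To make the counting-stat versus rate-stat point concrete, here is a tiny Python sketch using the 6-8 and 3-4 records mentioned above. The helper name `top50_record` is made up for illustration and is not anything the Selection Committee actually uses.

```python
from fractions import Fraction


def top50_record(wins, losses):
    """Summarize a record vs the RPI Top 50 as a count and as a rate."""
    games = wins + losses
    if games == 0:
        raise ValueError("no games played against the RPI Top 50")
    return {"wins": wins, "games": games, "win_rate": Fraction(wins, games)}


major = top50_record(6, 8)  # the 6-8 record from the text
mid = top50_record(3, 4)    # the 3-4 record from the text

# As a counting stat, 6 wins looks twice as good as 3 wins.
print(major["wins"], mid["wins"])
# As a rate stat, the two records are identical: both reduce to 3/7.
print(major["win_rate"], mid["win_rate"])
```

Counting raw wins rewards the team that simply got more chances against the Top 50, which is exactly the opportunity that mid-majors lack.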

Syracuse was able to pile up RPI Top 50 home wins over the likes of Wake Forest and Miami-Florida. Illinois State would've been in the bracket easily if they could have ever gotten back-end RPI Top 50 teams like Wake Forest and Miami-Florida to show up on their home court. The system isn't fair, and we know that it isn't fair, but it's going to continue until the Selection Committee is willing to admit the problem.

Worst Bracket Mistakes
Once we understand that we're measuring resumes here, it's clear that the biggest snub was not Syracuse (the team that ESPN was trying hard to push), but Illinois State. Syracuse had its six RPI Top 50 wins and nothing else; the rest of its resume was not even particularly close to at-large worthy. It's also clear that mid-majors like Monmouth and UT-Arlington deserved actual consideration. I threw UT-Arlington into my bracket because I thought that the Selection Committee would be under pressure to add a goofy mid-major rather than spending the last spot on a thoroughly mediocre Kansas State resume, and UT-Arlington's resume scratched a lot of typical Selection Committee itches, but it was not to be. They had no time for mid-majors this year.

You want to know how badly the Selection Committee hated mid-majors this year? Even by Strength of Record, which ignores how good a team actually is, Wichita State showed up 30th in the BPI SOR ratings. This means that they deserved an 8 seed even if we completely and utterly disregarded how good they are. Yet they got a 10 seed. Once again: The idea that the advanced analytical models are actually being used in the Selection Committee room is utter garbage.

An underrated bracket mistake is Vanderbilt, who had the weakest resume in the at-large field according to BPI yet earned a 9 seed. Why? Because of a non-conference strength of schedule and an overall strength of schedule ranked #1 in the nation by RPI. The problem is that Pomeroy rated their strength of schedule just 33rd in non-conference play and 17th overall. Vanderbilt did a great job manipulating their RPI SOS with lots of games against the likes of Bucknell, Belmont, and Chattanooga. I've written before about how coaches can easily manipulate this very key metric. The RPI doesn't just screw up the bracket because of RPI Top 50/100 wins, but also because of RPI SOS. It's a virus in the system that infects everything.

1 comment:

Unknown said...

Thanks for fighting the good fight, Jeff. A school like Syracuse would probably be perfectly happy to play SUNY bodybag games all season long if the ACC didn't impose a conference schedule upon them. It's not like they went out and challenged themselves any more than they had to. And I continue to be flummoxed by the relative disregard of bad losses compared to "good wins", although it's realistic to think that this might have contributed to Syracuse's exclusion this time around.

Given that so much excitement is generated each year by the "David v. Goliath" angle, it strikes me as counterintuitive that the committee would continue to reward major-conference mediocrity in lieu of actual underdogs like Illinois State or Monmouth. Obviously, said committee is largely controlled by those same conferences, which explains pretty much everything, but it's still frustrating to watch this play out year after year.