Monday, March 14, 2011

How Well Did The Computers Predict The Field?

I'll get back to previewing the 2011 Tournament bracket in my next post, but I always like to have a "how well did the computers predict the field" post. If nothing else, I like having this data for the future. This post will be kept on the left side of the page among my "BP Classics" posts.

Let me post the numbers first, and then I'll discuss them at the bottom of this post:


RPI:

Ten highest rated teams to miss the Tournament (NIT seed given):

35 - Harvard (6)
42 - Cleveland State (2)
43 - Missouri State (3)
46 - Saint Mary's (2)
50 - Colorado State (3)
54 - Marshall (-)
58 - Boston College (1)
59 - UTEP (5)
60 - Wichita State (4)
61 - Oklahoma State (3)

Ten lowest rated teams to earn an at-large (seed given):
67 - USC (11)
64 - Marquette (11)
57 - Clemson (12)
55 - Florida State (10)
52 - Michigan (8)
49 - VCU (11)
48 - Illinois (9)
47 - Georgia (10)
45 - Michigan State (10)
44 - UCLA (7)


Sagarin ELO_CHESS:

Ten highest rated teams to miss the Tournament (NIT seed given):

38 - Harvard (6)
44 - Virginia Tech (1)
51 - Saint Mary's (2)
53 - Colorado (1)
54 - Cleveland State (2)
55 - Boston College (1)
58 - Northwestern (4)
59 - New Mexico (4)
60 - Oklahoma State (3)
62 - Colorado State (3)

Ten lowest rated teams to earn an at-large (seed given):
78 - USC (11)
61 - VCU (11)
57 - Tennessee (9)
50 - UCLA (7)
49 - Georgia (10)
48 - Clemson (12)
46 - UAB (12)
45 - Florida State (10)
39 - Penn State (10)
37 - Illinois (9)


Sagarin PREDICTOR:

Ten highest rated teams to miss the Tournament (NIT seed given):

28 - Maryland (-)
33 - Virginia Tech (1)
37 - Saint Mary's (2)
42 - Washington State (2)
44 - New Mexico (4)
50 - Nebraska (5)
53 - Colorado (1)
54 - Alabama (1)
55 - Duquesne (-)
56 - Minnesota (-)

Ten lowest rated teams to earn an at-large (seed given):
86 - VCU (11)
63 - UAB (12)
60 - Georgia (10)
52 - Tennessee (9)
51 - UCLA (7)
48 - Penn State (10)
47 - USC (11)
45 - Xavier (6)
43 - Michigan (8)
41 - Texas A&M (7)


Pomeroy:

Ten highest rated teams to miss the Tournament (NIT seed given):

30 - Virginia Tech (1)
36 - Maryland (-)
43 - New Mexico (4)
47 - Saint Mary's (2)
48 - Nebraska (5)
49 - Colorado (1)
50 - Washington State (2)
51 - Alabama (1)
58 - Seton Hall (-)
59 - Miami (Fl) (2)

Ten lowest rated teams to earn an at-large (seed given):
84 - VCU (11)
57 - Georgia (10)
56 - UAB (12)
55 - Tennessee (9)
53 - UCLA (7)
52 - Old Dominion (9)
45 - Texas A&M (7)
44 - USC (11)
42 - Florida State (10)
41 - Michigan State (10)


How did the computers do?
The computer results are the same every year. The four computer ratings I've listed do not measure the same thing. The RPI and Sagarin ELO_CHESS are measures of resume quality. The Sagarin PREDICTOR and Pomeroy are measures of team quality. The Selection Committee is supposed to seed teams by resume quality rather than team quality, which is why the RPI and ELO_CHESS are always the best at projecting the field. The RPI, of course, is a very crude measure. The ELO_CHESS is far superior to the RPI, and does a better job of projecting the bracket. The PREDICTOR and Pomeroy ratings always mirror each other more or less, and you see a huge overlap between those two lists.
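If you want to put a number on that overlap, here is a rough Python sketch. The team names are copied from the last two sets of lists above, which I'm assuming are the PREDICTOR and Pomeroy lists; the variable and function names are just made up for illustration.

```python
# Team names copied from the third and fourth sets of lists in this post
# (assumed here to be the Sagarin PREDICTOR and Pomeroy lists).
third_missed = {"Maryland", "Virginia Tech", "Saint Mary's", "Washington State",
                "New Mexico", "Nebraska", "Colorado", "Alabama",
                "Duquesne", "Minnesota"}
fourth_missed = {"Virginia Tech", "Maryland", "New Mexico", "Saint Mary's",
                 "Nebraska", "Colorado", "Washington State", "Alabama",
                 "Seton Hall", "Miami (Fl)"}

third_at_large = {"VCU", "UAB", "Georgia", "Tennessee", "UCLA", "Penn State",
                  "USC", "Xavier", "Michigan", "Texas A&M"}
fourth_at_large = {"VCU", "Georgia", "UAB", "Tennessee", "UCLA", "Old Dominion",
                   "Texas A&M", "USC", "Florida State", "Michigan State"}


def shared_teams(list_a, list_b):
    """Return the teams that appear on both lists, alphabetized."""
    return sorted(list_a & list_b)


missed_shared = shared_teams(third_missed, fourth_missed)
at_large_shared = shared_teams(third_at_large, fourth_at_large)

print(len(missed_shared), "of 10 left-out teams are on both lists")
print(len(at_large_shared), "of 10 low-rated at-large teams are on both lists")
```

By this count, eight of the ten left-out teams and seven of the ten low-rated at-large teams show up on both lists.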

Which Selection Committee decisions were most inexplicable?
I've seen more talking heads than I can count on ESPN and other similar TV and radio channels talking about how the Selection Committee is supposed to pick the "best 37" at-large teams and that's why certain teams got bad treatment. But of course, that's not true. If we wanted the "best 37" teams then we'd have to put in Maryland, New Mexico and Washington State - three teams that weren't even remotely in consideration on Selection Sunday. I've seen many people say that UAB is a preposterous at-large team because they're not good, and the "eye test" says they should be out. And indeed, both Sagarin and Pomeroy agree that they're one of the worst teams in the field. But the Sagarin ELO_CHESS says that they actually had one of the 37 best resumes and deserved to be in. In fact, the changes that the ELO_CHESS would have made would have been to take out VCU, USC and Tennessee (!) and to put in Virginia Tech, Saint Mary's and Harvard.
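For anybody who wants to check that swap list, here is a minimal Python sketch. It uses the numbers from the second set of lists above (the one with USC at 78, which is the ELO_CHESS figure I quote in this post). The pairing rule is my own simplification, not anything the Selection Committee or Sagarin actually runs: match the best-rated team left out against the worst-rated at-large, and keep swapping as long as the left-out team has the better rating.

```python
# Ratings copied from the second set of lists in this post (lower is better).
left_out = {"Harvard": 38, "Virginia Tech": 44, "Saint Mary's": 51,
            "Colorado": 53, "Cleveland State": 54, "Boston College": 55,
            "Northwestern": 58, "New Mexico": 59, "Oklahoma State": 60,
            "Colorado State": 62}
at_large = {"USC": 78, "VCU": 61, "Tennessee": 57, "UCLA": 50, "Georgia": 49,
            "Clemson": 48, "UAB": 46, "Florida State": 45, "Penn State": 39,
            "Illinois": 37}


def suggested_swaps(left_out, at_large):
    """Pair best left-out teams with worst at-larges while the swap helps."""
    best_snubs = sorted(left_out.items(), key=lambda item: item[1])
    worst_bids = sorted(at_large.items(), key=lambda item: item[1], reverse=True)
    swaps = []
    for (snub, snub_rank), (bid, bid_rank) in zip(best_snubs, worst_bids):
        if snub_rank >= bid_rank:
            break  # the at-large team is rated at least as well; stop here
        swaps.append((bid, snub))  # (team removed, team added)
    return swaps


swaps = suggested_swaps(left_out, at_large)
for removed, added in swaps:
    print(f"take out {removed}, put in {added}")
```

The loop stops at the fourth pairing, because UCLA (50) is rated ahead of Colorado (53), which leaves the same three swaps listed above.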

Virginia Tech was clearly the team that got screwed worst. They're the only team that's at the very top of both the "best resumes left out" and "best teams left out" lists. If we're looking at resumes alone, though, the best resume left out was actually Harvard. Of course, I told you on Selection Sunday multiple times that I'd have put them in the draw. And I also said in those posts that I'd be "shocked" if Virginia Tech got left out.... I was indeed shocked.

To me, the most preposterous team that got in was USC, and here's why. They have by far the worst resume to ever earn an at-large bid. The argument for them is that they added Jio Fontan mid-season and have been playing well since then, and are one of the 37 best possible at-large teams. And in fact the computers agree - USC was not one of the worst teams let into the field. But this is preposterous because this has never been how the Selection Committee has worked in the past. They've always cared about "body of work" and resumes. The argument was never "I know that team played like crap in the non-conference and has a horrible resume, but they are playing really well now so they should be in". By that logic throw in Alabama, since Alabama is clearly playing better ball now than several teams that got an at-large bid. Just make the non-conference games unofficial scrimmages if we're not going to count them.

In the years I've been doing this we've seen an average of one team with a Sagarin ELO_CHESS of 50 or worse get an at-large, and I don't recall ever seeing one worse than 59th. I figured that with the expanded field we'd get 3 or 4 with an ELO_CHESS of 50 or worse, and we did get four. But USC as a 78??? That's insane.
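The count of four is easy to verify with a quick Python sketch. The ratings are copied from the "ten lowest rated teams to earn an at-large" list in the second set above, and the variable names are only illustrative. Since that list runs down to ratings better than 50, it captures every at-large team at 50 or worse.

```python
# Ten lowest rated at-large teams from the second set of lists in this post.
at_large_ratings = {"USC": 78, "VCU": 61, "Tennessee": 57, "UCLA": 50,
                    "Georgia": 49, "Clemson": 48, "UAB": 46,
                    "Florida State": 45, "Penn State": 39, "Illinois": 37}

# "50 or worse" means a rating number of 50 or higher.
fifty_or_worse = sorted(team for team, rating in at_large_ratings.items()
                        if rating >= 50)
worst_team = max(at_large_ratings, key=at_large_ratings.get)

print(len(fifty_or_worse), "at-large teams at 50 or worse:", fifty_or_worse)
print("worst:", worst_team, at_large_ratings[worst_team])
```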

As for teams in the field that got over-seeded, the two that stick out are Tennessee and UCLA. I buy Tennessee, though, because they had a lot of great wins and a lot of horrible losses. The Selection Committee always prefers teams that beat a lot of good teams and lose to a lot of bad teams to teams that do neither. And I agree with that. It makes sense to have a system that encourages hard schedules. Otherwise we end up like college football, where teams are rewarded for putting together the easiest schedule possible. But UCLA I don't understand. They played a soft schedule. What exactly did they do to earn a 7 seed?

While this post is cold comfort to fans of Virginia Tech and Harvard, we can use this information to help us put together our brackets. It's always good to know objectively which teams are over-seeded and under-seeded.
