Thursday, January 08, 2009

What's Wrong With The BCS

The BCS is one of those lightning rod discussions that always inflames sports fans. And as such, I want to make it clear that any discussion here is about the BCS rankings and not the actual system itself. I don't want to debate the college football playoff for a number of reasons. One is that any discussion that mentions this issue will immediately be dominated by that issue, which is why there is basically no discussion about the actual rankings system itself, an important issue whether you want a playoff or not (since a playoff system would presumably still use some kind of objective ranking system to determine at-large teams, similar to the NCAA Division I hockey system). Also, the politics and money considerations of college football are really not at all relevant to college basketball, while ranking systems in general are. And since this is a college basketball website, it is worthwhile to try to stay somewhat on topic. So please note that for the rest of this post, unless I specifically state otherwise, any mention of "BCS" refers to the BCS rankings, and not to the system itself. Also note that I plan on another piece in the near future to discuss how I would design the BCS. So while I'm not going to go into detail on any possible solutions, that will come soon. Without any further ado, let's get into the issues:

The Question: What is the BCS ranking?
This might be a shocker to some people, but the answer to this question is that nobody knows for sure. Nobody has actually defined what the BCS should be ranking. And this is why there are so many arguments that end up with two people arguing past each other. Should we be ranking the teams by how good their wins are, or by which teams are best? Should we rank teams by which teams were best over the whole year, or which finished the strongest? Does it even matter which teams are best? Nobody at the BCS has ever answered these questions, which is the fundamental flaw of the system. But since I asked the question, I'll answer it the best that I can. My belief is that the BCS is a system set up to put the two highest ranked teams in the human polls in a title game. Every time there is a difference between the computers and the humans, the BCS big shots change the computer polls. Over time the human polls have become a bigger component of the BCS, and the computer polls have been consistently changed to diverge less and less from what the humans say. This suggests that the computer polls are there for show, and to give fake legitimacy to the rankings system. As I noted in this piece, I've actually noticed human voters sometimes intentionally under- or over-rank teams to try to balance out the computer polls, to make sure that the computers don't change the result from what the humans want. All of these facts combined make it clear to me that the computers are just window dressing, and that the BCS rankings are really just an ornate human poll. The emperor has no clothes.

The Common Retort: Shouldn't the humans be a big part of the BCS? They're the experts!
No, although the answer is complicated, so hear me out. First of all, the humans by themselves have the same flaw that the BCS as a whole has, which is that they don't define what they're ranking. As I've explained in detail (such as here and here), the human Top 25 polls don't rank the 25 best teams. Instead, the Top 25 polls are more like the ATP, NASCAR or World Golf Rankings. You get points for wins, and lose points for losses, and teams are moved up and down according to some vague formula. The only difference is that the ATP, NASCAR and World Golf rankings are well-defined, while each human voter in college basketball and football has their own system. This is why North Carolina is not ranked #1 in the country right now, even though they are unquestionably the top team in the nation. If you polled the 103 voters in the AP and USA Today polls and asked them which is the best team in the country, I'm pretty confident that a solid plurality (if not an outright majority) would pick North Carolina. Certainly a lot more than the three that voted them #1 in this week's polls. And those that won't say it are those who have let the polls get to their heads. But we know the rules: If you lose you have to be dropped. If you win you have to move up. Not only that, but we know how much you have to move. A team will never drop from the Top Ten to unranked in one week, no matter how much they deserve it. And a team will never remain #1 after a loss, no matter how much better they are than every other team. Not only that, but voters also have big biases in terms of which teams they watch and/or root for. Even taking a large number of media members from around the country doesn't completely solve things, because there are inherent biases in all voters. Nobody can completely ignore the names on the uniforms, and they can all be manipulated by the bias of analysts on television. They will all be affected by preseason rankings, as well as the previous week's rankings. And there is no way for humans to take into account more than a tiny percentage of all games in the country, because it's just not possible for the human mind to properly weight the 800 games or so played by Division I-A football teams each season.

Counter: But humans can take into account issues like injuries, or which teams improved as the season went along! Even if you want computers to dominate the BCS, you must allow the humans to make adjustments that the computers don't know about!
This is a good argument, but one that misses the point. What I talked about in the previous paragraph was human bias. What this objection raises is computer bias: does your computer ranking over- or under-weight issues like injuries or improvement throughout the year? The answer is that computer rankings can be changed and improved. If they are mishandling issues like injuries then we can try to come up with a way for the computers to take those issues into account. We can adjust how much we weight early games in the season versus late games. We can adjust all of these things and then have the computers re-compute. Human voting will never improve because we can never properly account for bias. We will never have a system where we can objectively say: "Voter A always over-ranks SEC teams, so we need to adjust all SEC teams down two spots in his rankings." It's just not possible.
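To make the "adjust and re-compute" point concrete, here is a minimal sketch in Python. It is not any actual BCS computer formula: the teams, the results, the weighted-win-fraction rating, and the `early_weight` and `early_cutoff_week` parameters are all invented for illustration. The only point is that a computer ranking's treatment of early games is an explicit number that can be changed and re-run.

```python
from collections import defaultdict

# Hypothetical results as (week, winner, loser). Invented for illustration only.
GAMES = [
    (1, "Team B", "Team A"),
    (2, "Team A", "Team C"),
    (3, "Team B", "Team C"),
    (10, "Team A", "Team B"),
    (11, "Team A", "Team C"),
    (12, "Team C", "Team B"),
]


def rank_teams(games, early_weight=1.0, early_cutoff_week=4):
    """Rank teams by weighted win fraction (a toy rating, not a real BCS formula).

    Games played before early_cutoff_week count for early_weight of a game;
    all later games count as one full game.
    """
    wins = defaultdict(float)
    total = defaultdict(float)
    for week, winner, loser in games:
        weight = early_weight if week < early_cutoff_week else 1.0
        wins[winner] += weight
        total[winner] += weight
        total[loser] += weight
    score = {team: wins[team] / total[team] for team in total}
    # Best score first; ties are broken alphabetically so the order is stable.
    return sorted(score, key=lambda team: (-score[team], team))


# Same games, two different explicit choices for how much early games count.
equal_weighting = rank_teams(GAMES, early_weight=1.0)
late_heavy = rank_teams(GAMES, early_weight=0.25)

print("All games equal:      ", equal_weighting)
print("Early games count 25%:", late_heavy)
```

With these made-up results, Team B's two wins both came early and Team C's only win came late, so discounting early games swaps them. Whether that is the "right" weighting is exactly the kind of question that can be argued in the open and then changed in one line, which is not something you can do with a human voter.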

The (Too) Obvious Solution: So should we just average the six BCS computer polls, throw out the top and bottom, and let that be the be-all and end-all?
While this is probably the obvious solution, I think it's the wrong solution. This is because the computers suffer from the same problem that the BCS as a whole suffers from: Nobody has defined what we should be ranking. Each computer ranking system accounts for completely different things. Some take into account games against Division II teams, some don't. Some give big bonuses for home field advantage, while others don't. And the BCS has preposterously barred the computers from taking anything about the score of games into account (which creates a massive bias towards undefeated teams, which is another issue for another day). Simply averaging these computer rankings makes no sense to anybody with any experience with statistics or numerical computer simulations. The concept of averaging polls comes from people (like those who run the BCS) who don't understand anything about computers, and rather view things like a human voter. With humans it makes sense to average to try to mask biases. For example, a poll taken this week of 10 media personalities from Big 12 states would probably put Oklahoma #1, while a poll taken this week of 10 media personalities from SEC states would probably put Florida #1. But combining all 20 voters would mostly mask those biases and put the more deserving team #1. Since you can't control for the individual bias of a single human voter (such as the example I gave of taking points away from SEC teams in a ranking put together by a guy with pro-SEC bias), you can only put a whole bunch of humans together and hope to average out their individual biases as much as possible. This makes no sense for computers, where the biases are clearly defined. If one person thinks that games in September should be weighted at only 80% of their total value, while another thinks that all games should be equal, do you average them and say that games in September get 90% of their value? That's not what either person believes. And since each computer poll takes into account different things (maybe the guy who underweights September games also gives a huge bonus for home field advantage), you jumble up all of the issues and make it impossible to have a proper poll.
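The September example can be written out as a small Python sketch. Everything here is hypothetical: the `SystemParams` class, the two systems, and the home field bonus values are made up to mirror the analogy above, and real computer polls are averaged at the level of final rankings rather than parameters. The sketch only shows that blending two explicit sets of assumptions produces a third set that nobody chose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemParams:
    """The explicit assumptions of one hypothetical computer ranking."""
    september_weight: float  # fraction of full value given to September games
    home_field_bonus: float  # points credited for home field (made-up scale)


def average_params(a, b):
    """Blend two systems by averaging each assumption separately."""
    return SystemParams(
        september_weight=(a.september_weight + b.september_weight) / 2,
        home_field_bonus=(a.home_field_bonus + b.home_field_bonus) / 2,
    )


# Hypothetical System A: discounts September games, big home field bonus.
system_a = SystemParams(september_weight=0.8, home_field_bonus=6.0)
# Hypothetical System B: all games equal, no home field bonus.
system_b = SystemParams(september_weight=1.0, home_field_bonus=0.0)

blended = average_params(system_a, system_b)

print("System A:", system_a)
print("System B:", system_b)
print("Blended: ", blended)
print("Blended matches either designer's beliefs:", blended in (system_a, system_b))
```

The blended system weights September games at roughly 90% and gives half of a home field bonus, a position neither designer holds and nobody has argued for. With humans the biases are hidden, so averaging is the best you can do; here they are written down, so they can be debated and settled directly.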

Conclusion: What should be the general framework for a BCS ranking system?
I don't want to go into detail on how I'd fix the BCS because this post is already long enough. That topic will get its own post. But a quick summary of some improvements would be:

Improvement #1: Define what the poll should be. Do you want to know the best teams, or the teams that "earned" the top spot by beating the best teams? Do you want the best teams all year, or the teams that finished strongest? And so on.
Improvement #2: Throw out the human voters. They can never be cured of their bias.
Improvement #3: Rather than six computer polls, have all of the experts get together to create one single computer poll. Deal with the bias issues head on, rather than trying to mask them with a fake "average."
Improvement #4: Make the whole system transparent. Anybody should be able to look up the exact procedure of the computer poll, so that they can find out exactly which issues are accounted for and how they are weighted and dealt with. If you're going to have a single computer poll determining everything then it must not be an opaque poll with no explanation of where it comes from.
