Thursday, March 04, 2010

Is It Harder To Make The Tournament From Large Leagues?

There has been a lively debate in some of the comments about the relative merits of the Big East and ACC. I've been arguing that the ACC and Big 12 are superior this year to the Big East, and the Big East folks disagree. Which is fine. But one of the key arguments that we've been having has gotten down to a math problem, which I'd like to do here. If you don't have a math background or don't care for math, you can just skip this post. But I think it's an interesting enough issue that I'd like to stretch it out to a full front page blog post. In fact, it's quite long. I'm bringing out my inner Bill Simmons here (only if Bill Simmons were good at math).

The argument is about how we measure the relative strength of conferences. It seems quite obvious that there are two ways:

1) Which conference is harder to win?
2) In which conference is it harder to put up a winning record?

I think most everyone would agree with me that way #2 is superior. The Horizon League has been more difficult to win most of the last 5 years than the A-10, but just because there's one very good team at the top. Nobody would disagree that it's much harder to go 12-4 in the A-10 than to go 12-4 in the Horizon, so the A-10 is better. So the argument then becomes: is it harder to go 12-4 in the Big East or in the ACC? My argument is that it's harder in the ACC because a majority of the teams are Tournament quality teams. I said that the Big East could not possibly have been the "best conference ever" last year or this year because only half made the Tournament, and barely more than half were even in the Tournament discussion. In the Big East you get to play half your games against teams that just aren't Tournament quality. And that's not the case in the ACC or Big 12.

It was then argued back that it's harder to get half of a conference's teams in the Tournament if the conference is larger, citing the example of a 120 team conference, where it would be mathematically impossible for 60+1 teams to get in. I pointed out that this was silly because you wouldn't play every team in a 120 team conference. So the response back was that even in a 16 team conference it's significantly less probable for a majority of teams to make the Tournament than a 12 team conference. I will now try to quote as much of the argument as I can from multiple post comments (and I'm fairly sure I'm quoting as faithfully as possible):

"The two team conference has a 33% chance to get half it's conference in the field. It has a 66% chance of getting no team in (and yes, I realize auto bids don't work with the two team scenario, but the math is sound.) Probability will go down the larger the conference gets...

Here, I'll even give you the math equations so you know it is in fact the case:

in a four team conference it'd look like this:

((65C2 X 282C2) + (65C3 X 282C1) + (65C4)) / (347! / (343! X 4!))

In an eight team conference it looks like this:

((65C4 X 282C4) + (65C5 X 282C3) + (65C6 X 282C2) + (65C7 X 282C1) + (65C8)) / (347! / (339! X 8!))

If you don't believe me, please believe the math. It IS harder for a 16 team conference to get in 50% of its teams than it is for a 12 team conference to do so. If you get yourself a factorial calculator you can solve the above equations (or just do it longhand) and you'll see that it is in fact true."

I will go through this in detail. Unfortunately there are way too many numbers here and the logic got lost. The better way to do math (and to explain it to other readers who don't have much of a math background) is always to start out with the simplest problem and work outwards from there:

Let's go with the simple case where each team in the nation (of all 347) has an equal chance of making the NCAA Tournament. We'll later expand this to the more general case where each team has an x% chance of getting in (to deal with the fact that the Big East & ACC are better than average conferences, and their teams have higher probabilities). Let's also ignore automatic bids for now, to avoid complications.

Step 1: Imagine a 1 team conference: What is the probability that the 1 team gets in? We all know the answer to that: it's 65/347 = 18.7%.

Step 2: Imagine a 2 team conference: What is the probability that both teams get in? Logically it's less than an 18.7% chance, much less. The answer is: (65/347)*(64/346) = 3.5%. What is the probability that exactly 1 of the 2 gets in? Logically it should be a bit more than an 18.7% chance. In fact it's (65/347)*(282/346)+(282/347)*(65/346) = 30.5%. We can then quickly note that the odds of neither getting in would be 100%-3.5%-30.5% = 66%.
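These three numbers are easy to double-check in a few lines of Python. This is only a minimal sketch: the variable names are made up for illustration, and the 347 and 65 are the figures used in this post.

```python
from fractions import Fraction

TEAMS = 347          # total teams, as used in this post
BIDS = 65            # Tournament spots, as used in this post
OUT = TEAMS - BIDS   # 282 teams that miss the Tournament

# Pick the two conference teams one at a time, without replacement.
both = Fraction(BIDS, TEAMS) * Fraction(BIDS - 1, TEAMS - 1)
one = (Fraction(BIDS, TEAMS) * Fraction(OUT, TEAMS - 1)
       + Fraction(OUT, TEAMS) * Fraction(BIDS, TEAMS - 1))
neither = Fraction(OUT, TEAMS) * Fraction(OUT - 1, TEAMS - 1)

for label, p in [("both", both), ("exactly one", one), ("neither", neither)]:
    print(f"{label}: {float(p):.1%}")
```

Using exact fractions means the three cases add up to exactly 1, which is a handy check that no case was missed.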

Step 3: Recreate these scenarios more formally: Now that we have simple scenarios where we know the answers, let's write them formally and make sure that we get the same answer! For a 1 team conference the formality is straightforward, where the likelihood of getting in becomes:

(# of Ways to Choose 1 from 1)*(# of Ways to Choose 64 from the remaining 346)/(# of Ways to Choose 65 teams out of 347) =

(1C1)*(346C64)/(347C65) = 0.187 = 18.7% <---The same answer we got before!

[For those unsure what "347C65" means, it is "347 Choose 65" and is the # of ways to choose 65 from a bucket of 347. And mCn = m!/(n!*(m-n)!) ].

So let's repeat for the 2 team conference: The odds of getting both in would be:

(# of Ways to Choose 2 from 2)*(# of Ways to Choose 63 from the remaining 345)/(# of Ways to Choose 65 teams out of 347) =

(2C2)*(345C63)/(347C65) = 3.5% <----The same answer we got before!

I'll leave it to you, the reader, to work out the other two cases for yourself (exactly one of the two teams getting in, and neither team getting in).
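For anyone who would rather let a computer do the choosing, the formal notation maps directly onto Python's math.comb. This is a rough sketch, and the function name prob_exactly is invented for illustration:

```python
from math import comb

TEAMS, BIDS = 347, 65   # figures used in this post

def prob_exactly(conf_size, k):
    """Chance that exactly k of conf_size teams are among the BIDS chosen."""
    ways_in = comb(conf_size, k)                    # which k conference teams get in
    ways_rest = comb(TEAMS - conf_size, BIDS - k)   # who fills the other spots
    return ways_in * ways_rest / comb(TEAMS, BIDS)

print(f"1 team conference, team gets in: {prob_exactly(1, 1):.1%}")
print(f"2 team conference, both in:      {prob_exactly(2, 2):.1%}")
print(f"2 team conference, exactly one:  {prob_exactly(2, 1):.1%}")
print(f"2 team conference, neither:      {prob_exactly(2, 0):.1%}")
```

The same function handles any conference size, which is what Step 4 relies on.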

Step 4: Expand to more realistic numbers: Now that we're confident in our method of using formal notation instead of doing the probabilities out by hand, we can do the case of the 4 team conference. What are the odds of getting exactly 2 of those 4 teams in the Tournament?

(4C2)*(343C63)/(347C65) = 13.9%

Okay, that seems realistic. So now let's move to ACC & Big East sizes. Let's take the ACC: what are the odds of at least half of a 12 team conference getting in?

Well that would be the probability of 6, 7, 8, 9, 10, 11 or 12 getting in. I'll get us started:

[(12C6)*(335C59) + (12C7)*(335C58) + ... + (12C12)*(335C53)]/(347C65) = 1.3%

That does seem a bit low until you recall that we are again giving each team an even chance of making the Tournament. In reality it's much higher because each ACC team has a higher than normal probability of getting in. But we'll get to that in a moment. Let's do the Big East case. What is the probability of at least half of a 16 team conference getting in?

Well that would be the probability of 8, 9, 10, 11... or 16 getting in. That can be written as (I apologize for the lack of Greek letters...):

(Summation as n goes from 8 to 16 of:)[(16Cn)*(331C(65-n))]/(347C65) = 0.4%
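Nobody wants to add up those sums by hand, so here is a small Python sketch of the same calculation under the same equal-chance assumption. The function name prob_at_least is invented for illustration.

```python
from math import comb

TEAMS, BIDS = 347, 65   # figures used in this post

def prob_at_least(conf_size, m):
    """Chance that at least m of conf_size teams are among the BIDS chosen,
    with every team in the country equally likely to be picked."""
    ways = sum(comb(conf_size, n) * comb(TEAMS - conf_size, BIDS - n)
               for n in range(m, conf_size + 1))
    return ways / comb(TEAMS, BIDS)

acc = prob_at_least(12, 6)        # at least half of a 12 team conference
big_east = prob_at_least(16, 8)   # at least half of a 16 team conference

print(f"12 team conference: {acc:.1%}")
print(f"16 team conference: {big_east:.1%}")
print(f"ratio: {acc / big_east:.1f}")
```

The two results should land close to the 1.3% and 0.4% quoted above, with the ratio between them around 3.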

Well my goodness, the probability of at least half of the teams making the Tournament drops by a factor of 3! Argument over, right? I lost? Not quite.

Step 5: The problem is that the probabilities have been made dependent on each other, and this is not the case: What that math assumes is that when one team gets into the Tournament, the probability of another team getting in drops. And I'll give you an example that shows otherwise: If a team moved from the ACC to the Big East right now, with an identical resume, would its probability of making the NCAA Tournament change? The answer is NO! Because it has not changed the total number of teams fighting for the 65 spots. Let's say instead that ten teams with excellent at-large credentials suddenly fell out of the sky into the Big East, so that it was now a 26 team conference. Suddenly the competition for an at-large bid out of the Big East would become much tougher, right? Ten more teams to compete with! Well, making the Tournament will also get harder in the ACC! They ALSO have ten more teams to compete with!

So in fact, the probability of making the Tournament is NOT affected by the size of the conference. And the real way to do this problem becomes simpler, with a binomial probability. Let's say that the probability of making the Tournament for a team is x%. The probability of not making it is y=100%-x. If you have a conference of n teams, then the probability of at least m getting into the Tournament becomes:

(Summation as z goes from m to n of)[(n!)/(z!*(n-z)!)]*(x^z)*(y^(n-z))
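For readers who prefer typeset notation, the same sum can be written in LaTeX as below. The symbols n, m, x, y and z are the ones defined above; the worked line for four fair coin flips is added purely as an illustration.

```latex
P(\text{at least } m \text{ of } n \text{ get in})
  = \sum_{z=m}^{n} \binom{n}{z}\, x^{z}\, y^{\,n-z},
  \qquad y = 1 - x

% Illustration: n = 4, m = 2, x = y = 1/2
\sum_{z=2}^{4} \binom{4}{z} \left(\tfrac{1}{2}\right)^{4}
  = \frac{6 + 4 + 1}{16} = \frac{11}{16} \approx 0.69
```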

Let's also fix a sample size problem, which has to do with the fact that teams are more likely to finish in ANY position when you have a smaller conference. The way to think of it is that if you flip a coin four times you have a 69% chance of flipping heads at least half of the time. If you flip it 100 times that drops to 54%. If you flip it 10,000 times it drops to 50.4%. As you go to infinity it becomes 50%. We can actually solve this problem for the NCAA situation because we have an automatic bid: every conference is guaranteed one team in the field, so allowing a non-zero probability of zero teams getting in is inaccurate. So for a 12 team conference, what we should be solving for is the probability that 6 or more teams get at-larges out of 11. For a 16 team conference, we should be solving for the probability of 8 or more at-larges out of 15. And this will be a function of the probability of getting a bid. Here are some example values:

Probability of "more than half of a conference making the Tournament":

Probability of a bid = 65/347:
4 Team Conference: 6.6%
8 Team Conference: 2.0%
12 Team Conference: 0.63%
16 Team Conference: 0.21%

Probability of a bid = 0.5:
4 Team Conference: 48.5%
8 Team Conference: 49.9%
12 Team Conference: 50.0%
16 Team Conference: 50.0%

Probability of a bid = 0.75:
4 Team Conference: 75.6%
8 Team Conference: 91.2%
12 Team Conference: 96.3%
16 Team Conference: 98.3%
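Here is a rough Python sketch of the setup described above, for anyone who wants to experiment with other bid probabilities or conference sizes. It treats one team per conference as the automatic bid and asks for at least half the conference's size in at-large bids from the rest. The function name is invented for illustration, and the exact binomial sum is only one way to get numbers like these, so its output need not agree with the table digit for digit.

```python
from fractions import Fraction
from math import comb

def binomial_at_least(n, m, x):
    """Exact P(at least m successes in n independent tries), success chance x."""
    x = Fraction(x)
    y = 1 - x
    return sum(comb(n, z) * x**z * y**(n - z) for z in range(m, n + 1))

# Coin flips: chance of heads at least half the time.
for flips in (4, 100):
    p = binomial_at_least(flips, flips // 2, Fraction(1, 2))
    print(f"{flips} flips: {float(p):.1%}")

# Conferences: one team takes the automatic bid, the rest need at-larges.
for bid_chance in (Fraction(65, 347), Fraction(1, 2), Fraction(3, 4)):
    for size in (4, 8, 12, 16):
        p = binomial_at_least(size - 1, size // 2, bid_chance)
        print(f"bid chance {float(bid_chance):.3f}, {size} teams: {float(p):.2%}")
```

With an odd number of at-large candidates and a 50% bid chance, the result is exactly one half for every conference size, which is the symmetry the conclusion leans on.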

Conclusion: What we can conclude is that a single team's probability of making the Tournament is not affected by the size of its conference or any other. If the Big East grows then it hurts the probability of both Big East and ACC teams equally. If you assume a low probability of any given team making the Tournament then of course the odds of at least half of those teams getting in will be small. But as each team's odds grow, the odds of getting half the conference in actually get larger with the larger conferences. And in the obvious example of a 50% chance of making the Tournament, the size of the conference does not matter. So what we're seeing are rounding errors due to small sample sizes. And since we know that in general in major conferences the odds of making the Tournament are about 50% for a random team, we can use that scenario as the closest to reality. Which says that the size of the conference is basically irrelevant.

Addendum: If we go back to the real question, which is how difficult it is to build a resume, we can demonstrate easily that the resume is independent of conference size by using the standard Sagarin solver. Two teams with identical out-of-conference records that both go .500 in conference will have an identical chance of making the Tournament. So the question is, how hard is it to go .500 in conference? And the answer is, it's tougher in the ACC than the Big East. According to Sagarin's ratings (I don't want to go into how that is calculated, because this post is already long enough), a team needs an 83.85 rating to expect a .500 record in the ACC, but only an 83.44 rating in the Big East. So it's slightly harder to put up a good record in the ACC than the Big East. And that's what matters.


Unknown said...

Why do most ACC teams have poor out of conference resumes? I would expect some signature wins out of conference from the #1 conference. They fail the eyeball test.

Here's some more math based on the national bracket at the bracket project. Projected at large invites by major conference:

BE: 8
ACC: 6
Big 10: 4
Pac 10: 0
Big 12: 6
SEC: 4

That's a total of 28 projected at large bids (and I think it's being generous as the Big East will not get 9 total teams nor will the ACC get 7 total teams). There are 73 total teams in these conferences with a total of 6 automatic bids. That means there will be 67 eligible teams for an at large berth.

28/67 < .5. What this tells me is the other guy was right with his math because the odds of a bid from a major conference are less than .5.
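The arithmetic in this comment can be reproduced with a short Python sketch. The bid counts and the 73 team total are copied from the comment itself; the variable names are made up for illustration.

```python
# Projected at-large bids, as listed in the comment above.
at_large = {"Big East": 8, "ACC": 6, "Big 10": 4,
            "Pac 10": 0, "Big 12": 6, "SEC": 4}
total_teams = 73            # combined membership, per the comment
auto_bids = len(at_large)   # one automatic bid per conference

bids = sum(at_large.values())
eligible = total_teams - auto_bids
share = bids / eligible
print(f"{bids} at-large bids / {eligible} eligible teams = {share:.3f}")
```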

Jeff said...

Actually, the ACC is probably getting 7. And the Big East is getting 7 or 8. So the ACC is still getting a higher percentage this season.

And the point is that on average, more than half of ACC teams make the Tournament, while less than half of Big East teams make the Tournament. On average the teams in those conferences have about a 50% chance, which is why that number was a good approximate one to use, but the point is that the ACC on average has been better for the last decade, and is better this year.

The Big East is always, always overhyped. Last year they were at best the third best conference in the country, yet ESPN recited as gospel that the Big East was the best conference ever. Even the over-hype for SEC football isn't nearly as bad as the over-hype for Big East basketball.

Unknown said...

"And the point is that on average, more than half of ACC teams make the Tournament, while less than half of Big East teams make the Tournament. "

Since the ACC went to 12 teams in 2005, they've placed the following number of teams in the tourney.

2006: 4 (3 at large)
2007: 7 (6 at large)
2008: 4 (3 at large)
2009: 7 (6 at large)

So, that's 4 years, 44 teams eligible for an at large (4 automatic), and only 18 at large bids. Since 18/44=.409 < .5, your statement is not true.

During the same time frame, the Big East has produced the following bids:

2006: 8 (7 at large)
2007: 6 (5 at large)
2008: 8 (7 at large)
2009: 7 (6 at large)

That's 25 at large bids from a possible 60 (64 total but again there are 4 automatic berths). Now, 25/60=.417.

So, if your thesis is that a team is more likely to get an at large bid from the ACC, the data simply do not support that conclusion since the Big East went to 16 teams and the ACC went to 12 teams.
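The two rates quoted in this comment can be checked with a few lines of Python. The yearly at-large counts are copied from the comment; the function name at_large_rate is invented for illustration.

```python
# At-large bids per year, as listed in the comment above.
acc = {2006: 3, 2007: 6, 2008: 3, 2009: 6}
big_east = {2006: 7, 2007: 5, 2008: 7, 2009: 6}

def at_large_rate(at_large_by_year, conf_size):
    """Share of at-large-eligible teams that received an at-large bid."""
    # Each year one team takes the automatic bid, leaving conf_size - 1 eligible.
    eligible = (conf_size - 1) * len(at_large_by_year)
    return sum(at_large_by_year.values()) / eligible

acc_rate = at_large_rate(acc, 12)
big_east_rate = at_large_rate(big_east, 16)
print(f"ACC: {acc_rate:.3f}   Big East: {big_east_rate:.3f}")
```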

Unknown said...

And the ACC will not get 7 bids in all likelihood. Va Tech, 9-6 in conference with no out of conference resume, is playing Ga Tech, 7-8 in conference, on Saturday.

This is basically an elimination game as the committee will not take Ga Tech with an under .500 conference record and I don't think will take Va Tech with a 9-7 conference record and an out of conference SOS in the 300's.

Anonymous said...

First off, I'm a Terps fan; second off, the Big XII is the best conference in college basketball this season, so this debate isn't the most relevant.

Most importantly, I feel that this situation would look a lot cleaner if we actually noted that ESPN, being in Bristol, Connecticut, has become a cheerleader for the 17 - 13, (7 - 10) UConn Huskies who got nuked by Providence and have no business being in the NCAA tournament. I think this permeates the blogosphere and predictors who wouldn't normally have them in don't want to look stupid and have them in.

With this more realistic scenario, the Big East has 8 teams in, and 7 if Notre Dame gets upended in the first round of the Big East tournament. The ACC will probably get 7, as the committee tends to overrate them (remember the year Maryland was 19 - 11 and a 4 seed somehow and lost to Syracuse by 2?), but six is certainly a possibility.

Part of the author's point is that it's easier to get a winning record in the Big East than the ACC. This is true, at least the past two seasons, as there has been no one remotely as bad as DePaul in the ACC, and the combination of Rutgers, Providence (who wasn't that bad last season but has been middling since losing to Pitt 88 - 61 a few years back), Seton Hall and St. John's is about on par with the bottom three of the ACC the past two seasons.

Five really bad teams (not including South Florida even) out of 16 is a higher percentage than 3 out of 12, so yes, it is indeed easier.

But again, the Big XII is better than either of these conferences this season, so whatever.

Anonymous said...

I see where you're coming from, but you make one assumption when discussing increasing the sizes of conferences. In a vacuum, the probability of a conference adding a tournament quality team is actually less than the conference adding a nontournament quality team, because the sizes of each group aren't equal. Realistically this also is usually the case. This happened with the Big East when they expanded. For every Louisville that will usually make the tournament, you're going to get a DePaul that almost never will. You'll get a Marquette that makes it quite often, but you'll get a South Florida that never makes it. In a random expansion of ten teams where the basement is set around the 150 RPI mark (which is around the floor for most big 6 conferences in any given year), you are more likely to add nontourney quality teams than you are to add tournament quality teams.

Jeff said...

Absolutely it's very hard to add as many elite teams as bad teams. Which is why conferences can get too large. The Big East added too many teams, and it's diluted the quality of the conference.

When you talk with Big Ten or Pac 10 fans, they'll generally tell you they're opposed to their conferences expanding because they're afraid of turning into the Big East.

Anonymous said...

You know who really diluted their conference: the ACC. Adding Va Tech, BC, and Miami. Obviously, it was done for football, but it turned the best basketball conference into an also ran.