Tuesday, August 26, 2008

The Mathematics of Cinderella

A thoughtful commenter on my post regarding “The Contenders” mentions that it is possible to come from outside the top 20 in the preseason poll and win the Mythical National Championship.

He cites the case of Missouri last year as coming from outside the preseason top 25 to a number one ranking late in the season.

Conceptually I admit that my statement that teams outside the preseason top 20 have “zero” chance for a BCS title is not totally true. However, is the chance so low that my original premise holds?

What would it take for the ultimate long shot to make the BCS title game? For Cinderella to fit that glass slipper?

To figure this out, we are going to create an equation to calculate the odds. Please note that, while we will attempt to be reasonably accurate, a certain amount of speculation is necessary. And while we might not agree on all the points, I think the process will show the extreme long shot prospect of an outside-the-top-20 team.

History of the BCS Title Game
In the history of the BCS title game (since 1998), only a single season (2003) had no undefeated teams going into the bowl games. By season, the number of undefeated teams, with team names (once again, pre-bowl):

1998 – 2 (Tennessee, Tulane)
1999 – 3 (FSU, Marshall, Virginia Tech)
2000 – 1 (Oklahoma)
2001 – 1 (Miami)
2002 – 2 (Ohio State, Miami)
2003 – None
2004 – 5 (USC, Oklahoma, Auburn, Utah, Boise St)
2005 – 2 (Texas, USC)
2006 – 2 (Ohio State, Boise St)
2007 – 1 (Hawaii)

Even last season, perhaps the strangest we have ever seen, with a 2-loss BCS title winner, did not produce a team ranked worse than 10th in the preseason with a final shot at the BCS championship. And in the end, 2-loss LSU’s preseason ranking of #2 certainly helped them stay near the top and in contention.

Even in 2003, though the “national champion” remains contested, the BCS title game took place between preseason #1 (Oklahoma) and #15 (LSU). Thus even in a year when no one remained undefeated, the preseason 15th ranked team took the title.

For the most part, the preseason rankings of BCS title game teams have been remarkably consistent. The preseason rankings of the BCS title participants since 1998 –

1998 – 2 (FSU) v 9 (UT)
1999 – 1 (FSU) v 11 (VT)
2000 – 2 (FSU) v 19 (OK)
2001 – 2 (UM) v 5 (NEB)
2002 – 1 (UM) v 12 (OSU)
2003 – 1 (OK) v 15 (LSU)
2004 – 1 (USC) v 2 (OK)
2005 – 1 (USC) v 2 (TEX)
2006 – 1 (OSU) v 8 (UF)
2007 – 2 (LSU) v 10 (OSU)

Every year since the inception of the BCS, the preseason number 1 or 2 ranked team has played for the title, and the lowest-ranked contender (and eventual winner) was 19th-ranked Oklahoma in 2000. Taking the better-ranked and the worse-ranked participant in each game, the average preseason ranks are 1.4 v 9.3, and the overall average preseason ranking of a BCS participant is 5.4. Of the 12 preseason 1st or 2nd ranked teams that went to the BCS title game, 8 were undefeated, 3 had one loss ('98 FSU, '00 FSU, '03 OK) and one had two losses ('07 LSU).
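For anyone who wants to check those averages, here is a minimal sketch in Python. The variable names are my own; the data is simply the ten pairs of preseason ranks listed above.

```python
# Preseason ranks of the two BCS title game participants, 1998-2007,
# copied from the list above as (better-ranked, worse-ranked).
title_game_ranks = [
    (2, 9), (1, 11), (2, 19), (2, 5), (1, 12),
    (1, 15), (1, 2), (1, 2), (1, 8), (2, 10),
]

best = [pair[0] for pair in title_game_ranks]
worst = [pair[1] for pair in title_game_ranks]

avg_best = sum(best) / len(best)
avg_worst = sum(worst) / len(worst)
avg_overall = (sum(best) + sum(worst)) / (2 * len(title_game_ranks))

print(f"average better-ranked team: {avg_best:.1f}")
print(f"average worse-ranked team:  {avg_worst:.1f}")
print(f"overall average:            {avg_overall:.2f}")
```

The overall figure works out to 5.35, which is the 5.4 quoted above after rounding.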

Since we can all but expect that one of the preseason top two teams will be playing in Miami for the BCS title early next year, what are the odds someone ranked worse than 20th in the preseason can occupy that other slot?

The Cinderella Scenarios
Let’s review the scenarios where this “Cinderella” (a team ranked worse than 20th) gets the BCS championship bid –

Undefeated Scenarios
The most obvious scenario would have Cinderella facing a one-loss or undefeated top ranked preseason team. We know that when at least 2 top ranked teams go undefeated, this hypothetical Cinderella would almost certainly be shut out regardless of record (see Auburn '04). In 4 of our 10 BCS years, at least 2 undefeated, preseason top 20 teams made it to bowl season, so we will say 40% of the time it doesn’t even matter if Cinderella is undefeated: she won’t get to play. (If she is a “mid-major”, non-BCS team she isn’t going to play regardless of record, so this only applies to major conference BCS teams.) Thus we will set the first part of the equation: X = 0.6 (where X is the chance that there are not 2 undefeated, preseason top 20 teams).
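The 4-in-10 count behind X can be reproduced from the list of undefeated teams above. This is only a sketch under two assumptions of mine: the non-BCS set follows this post's own classification of those schools, and every undefeated major conference team is treated as a preseason top 20 team.

```python
# Undefeated teams entering the bowls, by season, from the list above.
undefeated = {
    1998: ["Tennessee", "Tulane"],
    1999: ["FSU", "Marshall", "Virginia Tech"],
    2000: ["Oklahoma"],
    2001: ["Miami"],
    2002: ["Ohio State", "Miami"],
    2003: [],
    2004: ["USC", "Oklahoma", "Auburn", "Utah", "Boise St"],
    2005: ["Texas", "USC"],
    2006: ["Ohio State", "Boise St"],
    2007: ["Hawaii"],
}

# Assumption: these schools are the non-BCS ("mid-major") teams on the list.
non_bcs = {"Tulane", "Marshall", "Utah", "Boise St", "Hawaii"}

# Seasons in which two or more major conference teams were undefeated.
crowded_years = [
    year for year, teams in undefeated.items()
    if len([t for t in teams if t not in non_bcs]) >= 2
]

# X is the share of seasons in which Cinderella is NOT shut out this way.
x = 1 - len(crowded_years) / len(undefeated)

print(crowded_years)
print(f"X = {x:.1f}")
```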

Chance that Cinderella goes undefeated
What is the chance we will see an undefeated major conference team that was ranked worse than 20th in the preseason? It is very hard to put odds on this, because it simply hasn’t happened in the BCS era (Hawaii, Boise State (twice), Utah, Marshall and Tulane have all run the table since 1998, but as non-BCS teams none were seriously considered for a BCS title slot). The very process of conference play and championships seems to limit the number of undefeated teams. Based on strict mathematics, and the assumption that this Cinderella has a 50/50 chance in every game, the chance of a team going undefeated is about 2 one-hundredths of 1 percent (0.50 to the 12th power). If we look strictly at the total number of teams that have gone undefeated since '98 (remembering again that no major conference team ranked worse than 20th has done so), the odds were 1.6% (19 teams of 1190 total teams).
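Both figures are easy to check. A minimal sketch in Python, with variable names of my own choosing; the 12 games and the 50/50 odds are the assumptions stated above, not measured values.

```python
games = 12      # regular-season games, as assumed in the text
p_win = 0.5     # assumed coin-flip chance of winning each game

# Chance of winning every game if each one is an independent coin flip.
p_unbeaten_coin_flip = p_win ** games

# Historical rate: 19 undefeated teams out of 1190 team-seasons.
p_unbeaten_historical = 19 / 1190

print(f"coin-flip model: {p_unbeaten_coin_flip:.4%}")
print(f"historical rate: {p_unbeaten_historical:.2%}")
```

The coin-flip model treats games as independent and evenly matched, which no real schedule is, so it is best read as a rough floor rather than an estimate.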

So for the sake of argument we are going to give Cinderella a much better chance than the strict odds dictate – say 1%. The second part of equation: U = 1% (Where U is the odds of Cinderella going unbeaten).

One Loss Scenario
What about Missouri last year – what about a 1-loss Cinderella team getting the shot?

Obviously there couldn’t be more than one undefeated team remaining (so the X factor is still exclusionary). Also, Cinderella would have to “jump” every higher ranked 1-loss team. So what is the typical number of undefeated or 1-loss teams Cinderella would have to deal with?

Going into the bowl games, the average number of BCS conference undefeated and/or one-loss teams per year since 1998 is 4.5, with a high of 6 in 2000 and 2004, and a low of 2 last season. The hard part here is estimating where Cinderella might end up ranked. Would she be ahead of the other 1-loss BCS teams? Missouri after week 13 last year was ranked ahead of two teams that had one loss and had themselves been higher ranked to start the season: West Virginia and Ohio State, who were 2 and 3 respectively. Kansas also had a single loss at that point, but had been beaten by that same Missouri team the week before. All other BCS teams had 2 or more losses.

So in order to snag the desired BCS slot, Cinderella has to have one loss and get ranked in the top two. The odds of a BCS team having a single loss aren’t all that much better than going undefeated, as only 32 BCS conference teams have had one loss in the past 10 years. If we assume 66 major conference teams (with ND), the odds have been 4.9% (32 of 660). However, the vast majority of those teams were ranked in the top 20 preseason, as only 10 teams ranked worse than 20th finished the regular season with 1 loss in the past 10 years, or 1.5% of the teams (10 of 660). For our purposes let’s split the difference, and say Cinderella has a 3.2% chance of having one loss. So, the next part of our equation: L = 3.2% (where L is the odds of being a one-loss team).
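The "split the difference" step is just the midpoint of the two rates. A quick sketch, with names that are mine rather than anything official:

```python
# 66 major conference teams (with ND) over 10 seasons.
team_seasons = 66 * 10

p_one_loss_all = 32 / team_seasons       # any major conference team
p_one_loss_outside = 10 / team_seasons   # teams ranked worse than 20th preseason

# Splitting the difference: the midpoint of the two rates.
l_midpoint = (p_one_loss_all + p_one_loss_outside) / 2

print(f"all teams:      {p_one_loss_all:.1%}")
print(f"outside top 20: {p_one_loss_outside:.1%}")
print(f"midpoint (L):   {l_midpoint:.1%}")
```

Using the midpoint is a judgment call that leans in Cinderella's favor, since the 1.5% rate is the one that actually describes teams like her.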

Finally, we need to make our best guess at how likely Cinderella is to be ranked ahead of other 1-loss teams, while accounting for the presence of no more than one zero-loss team. We know there are an average of 4.5 zero and one-loss teams per season (which we will round to 5), and Cinderella would need to be ranked ahead of at least 3 of them. Let’s say this can happen 40% of the time (a 20% chance for each of the first and second of 5 slots). This is probably generous to Cinderella considering the others probably started higher ranked to begin with. So we have our last part of the equation: R = 0.4 (where R is the chance Cinderella is favorably ranked).

Our equation

(U + (L x R)) x X

Where U is the odds of Cinderella being undefeated, L the odds of Cinderella having 1 loss, R is the odds a 1-loss Cinderella will be favorably ranked, and X is the odds there are not two undefeated teams. (We multiply X against the whole equation because if there are two undefeated preseason ranked teams Cinderella will get "Auburned".)

With our numbers

(1% + (3.2% x 0.4)) x 0.6 = 1.37%
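Written as a small Python function, the equation looks like this. The function name `cinderella_odds` is my own label for illustration; the inputs are the estimates chosen above.

```python
def cinderella_odds(u, l, r, x):
    """Chance a team from outside the preseason top 20 reaches the title game.

    u: chance Cinderella goes undefeated
    l: chance Cinderella finishes with exactly one loss
    r: chance a one-loss Cinderella is ranked in the top two
    x: chance there are NOT two undefeated preseason top 20 teams
    """
    return (u + l * r) * x


odds = cinderella_odds(u=0.01, l=0.032, r=0.4, x=0.6)
print(f"{odds:.2%}")
```

Having the inputs as parameters makes it easy to see how much the answer moves when you disagree with one of the guesses.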

So there you have it, and I stand corrected: it is not impossible. At 1.37%, we can expect to see a single team ranked worse than 20th in the preseason appear in a BCS title game roughly once in the next 73 years (1 / 0.0137 is about 73).
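The 73-year figure is simply the reciprocal of 1.37%. The sketch below also shows the chance of seeing at least one Cinderella within a given number of seasons. It assumes the 1.37% applies to each season independently, which is a simplification of mine and not something the numbers above establish.

```python
p = 0.0137  # per-season chance, taken from the equation's result

# Average wait, in seasons, for an event with per-season chance p.
expected_wait_years = 1 / p


def chance_within(years, p_per_season=p):
    """Chance of at least one Cinderella appearance within `years` seasons,
    assuming independent seasons with the same odds each year."""
    return 1 - (1 - p_per_season) ** years


print(f"expected wait: {expected_wait_years:.0f} years")
print(f"within 10 seasons: {chance_within(10):.1%}")
print(f"within 73 seasons: {chance_within(73):.1%}")
```

Note that an average wait of 73 years does not mean it is certain to happen within 73 years; under these assumptions it is still well short of a sure thing.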

Postscript & Takeaways
Is the chance really exactly 1.37%? Almost certainly not. But the exercise does show the following emphatically:

- In order for a Cinderella to make the BCS title game, they need to be undefeated or, at worst, have no more than one loss.

- No BCS conference team outside the top 20 has been undefeated in the past ten years.

- The chance of finishing a season with just one loss is slim: 4.9% for major conference teams over the past 10 years, and only 1.5% for teams not ranked in the top 20 in the preseason.

- Even if a one loss Cinderella exists, they have to be favorably ranked to have a shot – an iffy proposition at best.

- Regardless whether our Cinderella is undefeated or a 1 loss team, if there are 2 undefeated preseason top 20 teams (this has occurred 40% of the time in the past 10 years), she isn’t going to the BCS title game.

- Non-BCS teams, despite having 6 undefeated seasons in the past 10 years, don’t make BCS title games.

I stand by my original proposition - it simply isn't going to happen.


Year2 said...

Are those preseason ranks from the AP or the Coaches' Poll?

Mergz said...

Coaches' Poll, though as you know the two are virtually the same every preseason.

TomReagan said...

Totally agree that it's very unlikely.

But does that mean that the voters basically get it right in the preseason polls?

Henry Louis Gomez said...

Not in my opinion.

I think it's a self fulfilling prophecy.

You have 119 teams of which two get chosen to play for what is supposed to be a championship. What are the chances that the two chosen are actually the two best teams?

What we have now is a game of musical chairs in which we start with about 15 players and 14 chairs. When the music stops, a player gets bounced out. But in this game the eliminated players can get back in line.

It's a totally dysfunctional system.

Henry Louis Gomez said...

Also add into the equation that many poll voters don't just vote on the quality of the team. They vote based on expectations due to schedule. Why is Ohio State ranked so high? It's not because they are really the number 3 team in the country, it's because they are the best team, by far, in their conference. What would OSU's rank be if they switched schedules with Florida?

The expectation is that OSU is going to be relatively unscathed at the end of the season. I for one am rooting for them to make it to the BCS CG again and get trounced.

It would be funny as hell if they lost 3, 4 or 5 straight BCS titles.

Mergz said...

tomreagan - An interesting question which I considered as I did this work.

I think the answer is - partially. College football lends itself in some ways to easy predictions because of the gross disparity between talent on teams. 99% of the time, USC is going to beat Stanford, last year excepted.

When it comes to unbeaten teams, the pollsters can be argued to have done a decent job, as no team ranked worse than 20th has been unbeaten in 10 years (BCS conferences). Over that period there have been 13 undefeated major conference teams, and most were ranked preseason 1st or 2nd. The lowest-ranked were Oklahoma '00, ranked 19th, and Auburn '04, ranked 18th.

With the exception of Auburn, being undefeated usually takes care of itself. Where it gets sketchy is when there are several teams with similar records, like last year, where a 2-loss LSU gets the nod. There were plenty of 2-loss teams, and I think LSU's preseason 2nd ranking is what made all the difference.