Thursday, February 22, 2007

Recruiting, Regression and Success – Another (Better?) Way

After finishing my post below and giving it some more thought, I wondered if there was not a better way than averaging 4 years of class rankings to determine strength of recruiting class.

Turns out there is. In rating the teams each year, the recruiting service gives each one a total point number, which it uses to rank the teams. Instead of averaging the 4 classes in the 2003-2006 period, I decided to add the total scores over the 4 years to see which teams had a cumulatively better ranking. In this way, you get a better idea of the total class strengths in cases where the disparity between classes might be very large (where, say, the number 1 class is significantly better than the number 2 class).
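To illustrate why summing points can separate teams that averaging ranks cannot, here is a minimal sketch. The team names and all rank and point values are invented for illustration and are not taken from the actual rankings.

```python
# Hypothetical (rank, points) pairs for four recruiting classes per team.
# All values are made up purely to illustrate the idea.
classes = {
    "Team A": [(1, 3000), (3, 2100), (2, 2400), (2, 2300)],
    "Team B": [(2, 2200), (2, 2250), (2, 2200), (2, 2150)],
}

def average_rank(team_classes):
    """Mean of the yearly class ranks (the method from the earlier post)."""
    return sum(rank for rank, _ in team_classes) / len(team_classes)

def total_points(team_classes):
    """Sum of the yearly point totals (the method used in this post)."""
    return sum(points for _, points in team_classes)

summary = {
    team: (average_rank(c), total_points(c)) for team, c in classes.items()
}

for team, (avg_rank, points) in summary.items():
    print(f"{team}: average rank {avg_rank:.2f}, total points {points}")
```

In this made-up case both teams have the same average rank, but Team A's one dominant class gives it a clearly higher point total, which is the kind of disparity the cumulative score is meant to capture.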

The new ratings changed our previous order, and in turn, actually made everything look a little clearer.

First and foremost, Florida jumps from 4th to 2nd in this arguably far more accurate ranking of 4 year recruiting strength. In other words, over the period of 2003-2006, Florida was 2nd only to USC in terms of total recruiting points. Perhaps, seeing this, the Gators’ MNC of last year was not such a surprise.

The chart also shows us the relative disparity between even teams ranked in the top 20. USC, with 10,258 total points, has 34% more than Tennessee, which is 10th. USC in turn has over 32 TIMES the points of New Mexico State, which is last on our list with a four-year total of only 315.

Finally, you have to go all the way to 18th on this list to find your first losing record.

Next, we look at the decile breakdown of the major conference teams.

Average wins by Major Conference Decile

1st (Best) – 10.43 wins
2nd – 9.50 wins
3rd – 7.29 wins
4th – 7.83 wins
5th – 6.71 wins
6th – 6.00 wins
7th – 6.67 wins
8th – 7.14 wins
9th – 7.00 wins
10th (Worst) – 5.43 wins

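Once again, not a perfect correlation (the 8th and 9th deciles out-win the 5th and 6th), but the overall trend is strong: the top two deciles average about 10 wins, while the bottom decile averages fewer than 5.5.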
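One rough way to put a number on that trend is to compute a Pearson correlation between the decile number and the average wins listed above. This is only a sketch: it uses the ten decile averages, not the team-by-team data, so it is not the same figure as the team-level correlation and will tend to look stronger than it.

```python
from math import sqrt

# Average wins by decile, copied from the list above (decile 1 = best recruiting).
avg_wins = [10.43, 9.50, 7.29, 7.83, 6.71, 6.00, 6.67, 7.14, 7.00, 5.43]
deciles = list(range(1, len(avg_wins) + 1))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(deciles, avg_wins)
print(f"Pearson r between decile number and average wins: {r:.2f}")
```

The coefficient comes out negative because a lower decile number means a better recruiting score; the further it is from zero, the more closely wins track recruiting strength.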

Finally, sparing the reader another regression chart, the correlation between the total recruiting score and wins was even stronger than in our last case.

The real advantage of using the total score is that it gives us a nice way to predict what total wins should have been based on recruiting rankings, and what they might be in the future. By finding an average of wins to total score, and then applying it to each team, we get the following (for every major conference team)–

Once again, any sample size this small is imperfect, but you can see that the predictions follow the actual trend pretty closely. Certainly there are, and will always be, outliers. Also, the predictions have to serve as a mere “target” guideline, due to the impossibility of teams like Florida, Georgia and Tennessee all winning more than 11 games (can’t happen in the same conference).

Furthermore, one might look at the above chart as an “under or over” performance indicator. FSU, for instance, seems to have far better talent than its record produced. Bad coaching or bad luck? - A topic for another day.
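The prediction and the “under or over” comparison described above can be sketched roughly as follows. The teams and numbers are invented for illustration, and the sketch assumes one possible reading of “an average of wins to total score”: total wins across all teams divided by total points across all teams.

```python
# Hypothetical (four-year total points, actual wins) per team.
# These values are made up; they are not the real rankings or records.
teams = {
    "Team A": (9000, 11),
    "Team B": (6000, 9),
    "Team C": (3000, 4),
}

# Assumed reading of the method: a single pooled wins-per-point ratio.
total_points = sum(points for points, _ in teams.values())
total_wins = sum(wins for _, wins in teams.values())
wins_per_point = total_wins / total_points

# For each team: predicted wins, and actual minus predicted.
# A negative difference means the team under-performed its recruiting.
report = {}
for name, (points, wins) in teams.items():
    predicted = wins_per_point * points
    report[name] = (predicted, wins - predicted)

for name, (predicted, diff) in report.items():
    label = "over" if diff > 0 else "under" if diff < 0 else "on target"
    print(f"{name}: predicted {predicted:.1f} wins, {label} by {abs(diff):.1f}")
```

With a pooled ratio the predicted wins always add up to the actual total, so the predictions only redistribute wins among teams according to recruiting points. Other readings, such as averaging each team's own wins-to-points ratio, would give somewhat different targets.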

As hinted, in my next installment I will “predict” next year’s win totals for teams based on the total scores of 2004-2007. And while there will again be outliers in 2007, I can say fairly confidently that teams like Wake Forest will “mean revert” and fall far closer to their prediction in 2007 than they did in 2006. I will also look at the mid-majors, to see if we can predict who next year’s Boise State might be.


Henry Louis Gomez said...

The one possible flaw may be that I believe they changed the point system they use between last year and this year. Thus the weight for this recent class is not on the same basis as the previous classes. The "error" of course applies to all teams but it may lend more importance to this year's recruiting class that will arguably be the least important in terms of 2007 wins.

Anonymous said...

This is great stuff. I don't think you're going to get much to figure out Boise St. Their offense is too much of an anomaly. I bet their defense falls into line though.

Come to think of it, I bet the defensive rankings of all teams would fall even more closely in line with the recruiting rankings.

miguel said...

I think the biggest flaw in this might be that the earlier recruiting classes figure much more prominently than the later classes.

Seniors and juniors will (generally speaking) be much more responsible for a team's success than sophomores and freshmen.

And double that for true freshmen.

If you're rocking the spreadsheet, would love to see you tweak the numbers. Maybe 100% of the value of the senior class. 80% junior, 50% sophomore and 25% freshmen.

That might give the 2007 Gators a more accurate prediction for our rather upperclassmenless team.

Henry Louis Gomez said...

Miguel, Mergz did point that out (perhaps it was in another post) but any attempt to weight the classes would be arbitrary. Why 25% freshmen? What about schools that have a lot of juniors opting for the NFL? Their freshmen and sophomores are going to have to contribute more?

So it's not easy to work around this admitted problem.