Thursday, February 22, 2007

Recruiting, Regression and Success – Why Recruiting Ratings Matter

Anyone following Saurian Sagacity is aware that I have an interest in seeing whether success in recruiting, in this case as defined by the rankings at Scout, has any correlation to success on the field.

Using the data available at Scout, I have averaged the class rankings of 115 NCAA Division I-A teams over 4-year periods to determine who had the best average recruiting class (ARC) over the period examined. The results of the top 20 for the most recent recruiting class can be found here.
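As a rough sketch of the averaging described above, the ARC for a team is simply the mean of its four yearly class rankings. The team names and rankings below are made up for illustration and are not taken from the actual data.

```python
from statistics import mean

# Hypothetical class rankings for 2003-2006 (1 = best class); not real data.
class_ranks = {
    "Team A": [3, 1, 5, 2],
    "Team B": [40, 35, 52, 47],
    "Team C": [12, 20, 9, 15],
}

def average_recruiting_class(ranks_by_team):
    """Return {team: mean class ranking}, i.e. the equally weighted ARC."""
    return {team: mean(ranks) for team, ranks in ranks_by_team.items()}

arc = average_recruiting_class(class_ranks)

# Sort so the best (lowest) ARC comes first.
ranked = sorted(arc.items(), key=lambda item: item[1])
for team, value in ranked:
    print(f"{team}: {value:.2f}")
```

A lower ARC means better classes on average, so the sorted list runs from best to worst.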

It has been suggested that the rankings might be better if they were weighted, as an incoming freshman class is not as likely to contribute as a senior class. However, in trying to come up with some weightings, I found 3 problems –

1. There is inconsistent participation of seniors across the schools. Wake Forest last year had a senior-laden roster, while many of the more successful schools see their players leave early. I found no way to be consistent.

2. As a result, any weightings I assign would, necessarily, be arbitrary.

3. Even when I applied some weightings, the results changed very little.

Thus, an equal weighting seemed the best course.
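To illustrate the comparison, here is a minimal sketch of a weighted ARC next to the equally weighted one. The rankings and the weights are arbitrary assumptions chosen for the example (which is exactly problem 2 above), not the weightings I actually tried.

```python
def weighted_arc(ranks, weights):
    """Weighted mean of yearly class rankings (oldest class first)."""
    if len(ranks) != len(weights):
        raise ValueError("need one weight per class")
    return sum(r * w for r, w in zip(ranks, weights)) / sum(weights)

# Hypothetical rankings for one team, 2003 through 2006 (1 = best).
ranks = [10, 14, 8, 13]

equal = weighted_arc(ranks, [1, 1, 1, 1])
# Arbitrary example weights that favor the older classes (seniors, juniors).
senior_heavy = weighted_arc(ranks, [4, 3, 2, 1])

print(f"equal weighting: {equal:.2f}")
print(f"senior-heavy:    {senior_heavy:.2f}")
```

Unless a team's classes swing wildly from year to year, the two averages land close together, which is consistent with the small changes noted in point 3.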

In relating recruiting to success in the past, I compared recruiting class rankings to final Coaches’ Poll rankings. However, upon further thought, I decided this was somewhat inaccurate. In a conference like the SEC, which tends to have highly ranked classes, some teams are necessarily knocked from the top 20 rankings due to the nature of conference play – you can’t all win.

So I decided to compare the rankings for the 2003-2006 period to wins for the season. I was very surprised at what I found.

The following is the top 10% of average recruiting classes for 2003-2006, and their wins in 2006 –

The average number of wins for this group is 10, a number long considered the benchmark of a successful season. Not a single team had a losing record (one would expect half to have losing records were it random).

Certainly, some low ranked teams, such as Wake at 57th, had successful years. However, if you break down the ARC ratings of the 115 teams by deciles (10 percent groups), you find the following average wins per decile –

Average wins by Decile

1st (Best) – 10 wins
2nd – 7.9 wins
3rd – 6.6 wins
4th – 6.9 wins
5th – 6.9 wins
6th – 6 wins
7th – 6.5 wins
8th – 5.5 wins
9th – 5.2 wins
10th (Worst) – 5.1 wins
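The decile breakdown can be sketched as follows: sort the teams by ARC, split them into ten groups of nearly equal size, and average the wins in each group. The data here is invented purely to show the mechanics; with 115 teams the same function yields groups of 11 or 12 teams.

```python
from statistics import mean

def average_wins_by_decile(teams, groups=10):
    """teams: list of (arc, wins) pairs.

    Returns the mean wins per group, best ARC group first.
    Group sizes differ by at most one team.
    """
    if len(teams) < groups:
        raise ValueError("need at least one team per group")
    ordered = sorted(teams, key=lambda t: t[0])  # lowest (best) ARC first
    n = len(ordered)
    result = []
    for g in range(groups):
        start = g * n // groups
        stop = (g + 1) * n // groups
        result.append(mean([wins for _, wins in ordered[start:stop]]))
    return result

# Made-up data: 20 teams with ARC 1..20 and wins falling from 12 to 3.
teams = [(i + 1, 12 - i // 2) for i in range(20)]

decile_wins = average_wins_by_decile(teams)
for number, wins in enumerate(decile_wins, start=1):
    print(f"decile {number}: {wins:.1f} wins")
```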

While not what statisticians would call a “perfect correlation”, the results are nevertheless very strong. The teams with better recruiting classes per Scout tended to have better results.

In looking at the overall ARC numbers, I noticed a phenomenon about halfway down the list where teams suddenly started winning 10 or more games (Boise State is a good example of this). Then I realized what it was – it was successful mid-major teams that had comparatively poor ARC ratings (to the big conferences), but decent records.

This makes obvious sense. Since the mid-majors mostly play each other, the mid-major teams that recruit more successfully will have higher win totals than other mid-majors, and more wins than non-competitive major conference teams.

Stripping out the mid-major teams into a separate group is very telling. First, the deciles of the major conferences when mid majors are removed –

Average wins by Decile – Major Conferences

1st (Best) – 10.4 wins
2nd – 9.7 wins
3rd – 7.4 wins
4th – 7 wins
5th – 6.4 wins
6th – 5 wins
7th – 8.4 wins
8th – 6.4 wins
9th – 6.3 wins
10th (Worst) – 4.8 wins

A very strong correlation is found, with an aberration in the 7th decile. The 7th decile is pulled up by three teams with big win totals (Boston College, Louisville and Oregon), which is a result of the rather small number of teams in each decile.

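To put these numbers to the test, I ran a regression analysis. Basically, a regression fits a straight line through the data to see if there is a relationship between two sets of numbers. How tightly the points hug that line is measured by the correlation coefficient, which runs from -1 to 1; a perfect relationship gives exactly 1 or -1, whatever the slope of the line happens to be. In running our numbers, we get the following –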

With average recruiting class on the bottom (lower is better) and number of wins on the left, the slope (red line) shows us there appears to be a fairly strong correlation between winning and ranking of recruiting class by Scout.
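For anyone who wants to try this at home, here is a minimal sketch of a straight-line fit and the correlation coefficient in plain Python. The (ARC, wins) pairs are hypothetical and only show the mechanics; this is not the spreadsheet run behind the chart.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least squares fit of ys on xs.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical (ARC, wins) pairs; lower ARC means better recruiting classes.
arcs = [5, 15, 25, 35, 45, 55]
wins = [11, 9, 8, 7, 5, 6]

slope, intercept, r = fit_line(arcs, wins)
print(f"slope: {slope:.3f} wins per ARC place")
print(f"intercept: {intercept:.2f} wins")
print(f"correlation r: {r:.2f}")
```

Because a lower ARC number means better classes, the fitted line slopes downward and r comes out negative. That still says better recruiting goes with more wins; the strength of the relationship is in how close r is to -1, not in the sign.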

For the mid-majors, the sample size is smaller, and there is correlation, though perhaps not as strong. In looking at the ARC of the mid-majors separated from the majors however, we can see strong success among the higher ARCs there –

As you can see, the best ARCs among the mid-majors mostly belonged to successful teams.

Conclusion – There appears to be a reasonably strong and positive correlation between average recruiting classes (ARCs) and the number of wins a team posts in a season. If one is inclined to believe that talent bears a strong relationship to winning (I do), then we can say that Scout is doing a pretty commendable job in ranking talent, and that the teams with the best talent per Scout are more likely to win than lower ranked teams.

Essentially, recruiting ratings DO matter – and they matter very much. In at least the case of Scout and the period of 2003-2006, the ratings of recruiting classes correlate strongly with which teams were successful in 2006.

Next – Using the 2004-2007 ARCs to predict team wins for the 2007 season.


Henry Louis Gomez said...

Wow, so that's what you've been working on while I've been sitting on my ass thinking of stuff to write.

BTW, wouldn't winning percentage be a better indicator of success than wins since the number of games each team plays varies?

Gator Duck said...

Another improvement over using raw team recruiting rank would be to use the actual recruiting rankings on a player by player basis, not including any players no longer with the team for whatever reason. Of course, this is a lot more work...

Mergz said...

Henry -

Yeah, that's it. And trying to finish the Georgia part of the Rivalry series (it is a lot of stuff when it comes to UGA!).

The winning percentages would be a better overall indicator, though this is also pretty solid - and easier to use. Plus, for the next step, I wanted to "predict" how many wins a team might have next year, so I used win totals.

Duck -

That would be wayyyyy more work. But since Scout goes to the trouble of ranking the classes, I felt we might as well use that. Perhaps using their class scores would be even better.

Henry Louis Gomez said...

Then perhaps you could use only the regular season, since all teams (that I'm aware of) are now playing 12 regular season games. A team's regular season schedule obviously varies somewhat from year to year but it's probably more stable than we think. A lot of the same teams are played (though the venue alternates) and cupcakes are usually replaced by cupcakes. When you throw in the championship games and the bowls there are a lot more variables involved.