A Study in Prediction Performance: Updates to the AV Ranking
At Clashmore Mike we strive to monitor the college football landscape as a whole, objectively and without bias. While we have a vested interest in Notre Dame, we attempt to view the performance of the Irish via appropriate statistical metrics that benchmark on-field production (irrespective of how said production is measured).
One such example is the AV Ranking. Rather than rely on Jeff Sagarin, Anderson & Hester, Richard Billingsley, et al., Clashmore Mike has developed its own college football computer ranking formula.
As shown here and here, the AV Ranking was used to predict the winners of the 2008 conference championships and BCS bowl games.
Upon further review of the 2008 season, it was determined that the accuracy of the AV Ranking was lacking. Specifically, it struggled to correctly forecast the winner in contests where the two teams were separated by a narrow AV Ranking point margin.
While this is not surprising—it is always difficult to predict the winner of a closely matched contest with any regularity—there was certainly room for improvement.
What, Exactly, Was The Problem?
Never content to accept mediocrity, fellow Notre Dame and college football enthusiast Vince Siciliano and I spent the off-season identifying and quantifying the AV Ranking shortcomings that led to the inaccuracies described above. Two major culprits were identified, both stemming from improperly benchmarking teams against their competition.
First, opponents’ opponents were not considered in the strength of schedule (SOS) algorithm. This afforded the same credit for beating an 8-5 team from the WAC as for beating an 8-5 team from the SEC. No disrespect to the WAC, but LSU was a better team than Louisiana Tech in 2008.
Additionally, the AV Ranking made no attempt to statistically benchmark teams to their competition. Were Tulsa, Houston, Nevada, etc. really prolific offensive teams or did they artificially benefit from poor defensive competition?
This was discussed ad nauseam leading up to the national title game, when the potency of Oklahoma’s record-setting offense was questioned due to the host of poor defensive teams in the Big 12.
What Are The Answers?
Two problems require two solutions. The AV Ranking (complete description) previously consisted of four metrics: the aforementioned SOS, adjusted win percentage (AWP), margin of victory (MOV) and quality wins/losses (QWL). These four metrics are normalized and combined via a weighted average to achieve a final AV Ranking point value.
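For illustration, the combination step can be sketched as follows. This is a minimal sketch, assuming min-max normalization and made-up weight values; the actual normalization scheme and weights of the AV Ranking are not specified here.

```python
# Minimal sketch of "normalize each metric, then take a weighted average."
# The min-max normalization and the weight values are illustrative assumptions,
# not the actual AV Ranking parameters.

def normalize(metric):
    """Min-max scale one raw metric (team -> value) to the 0-1 range."""
    lo, hi = min(metric.values()), max(metric.values())
    return {team: (v - lo) / (hi - lo) for team, v in metric.items()}

def av_points(raw_metrics, weights):
    """Weighted average of normalized metrics, returning team -> AV point value."""
    normed = {name: normalize(metric) for name, metric in raw_metrics.items()}
    teams = next(iter(raw_metrics.values())).keys()
    return {t: sum(weights[n] * normed[n][t] for n in weights) for t in teams}

# Hypothetical raw values for three teams and illustrative weights.
raw = {
    "SOS": {"A": 0.62, "B": 0.55, "C": 0.48},
    "AWP": {"A": 0.92, "B": 0.83, "C": 0.75},
    "MOV": {"A": 18.0, "B": 12.5, "C": 9.0},
    "QWL": {"A": 3.0,  "B": 1.0,  "C": -1.0},
}
weights = {"SOS": 0.25, "AWP": 0.30, "MOV": 0.25, "QWL": 0.20}
print(av_points(raw, weights))
```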
The SOS algorithm was updated to include two quantities, one that measures the strength of a team’s opponents and one that measures the strength of a team’s opponents’ opponents. The two were normalized and combined using a simple weighted average assigning considerably more value to the former.
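A rough sketch of that two-level scheme is shown below. The 0.75/0.25 split and the use of a generic per-team rating are assumptions for illustration; the post only states that the opponents' component carries considerably more weight than the opponents'-opponents' component.

```python
# Sketch of a two-level strength of schedule: opponents plus opponents' opponents.
# The 0.75 / 0.25 split and the generic "rating" input are illustrative assumptions.

def strength_of_schedule(team, schedules, rating, w_opp=0.75, w_opp_opp=0.25):
    """Blend the average rating of a team's opponents with the average rating
    of those opponents' opponents (excluding the team itself)."""
    opponents = schedules[team]
    opp_avg = sum(rating[o] for o in opponents) / len(opponents)

    opp_opps = [oo for o in opponents for oo in schedules[o] if oo != team]
    opp_opp_avg = sum(rating[oo] for oo in opp_opps) / len(opp_opps)

    return w_opp * opp_avg + w_opp_opp * opp_opp_avg
```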
Concurrently, a new AV Ranking metric was created.
This fifth metric benchmarks a team’s production to its competition by defining ratios between the statistical averages of the team and its opponents. A similar version of this analysis was conducted for Notre Dame’s offense (mid-year and end-year) and defense (mid-year and end-year), and it is a very useful tool for appropriately gauging a team’s production.
For example, suppose Team X averaged 25 points per game (PPG) against competition that allowed, on average, 15 points a game. While 25 PPG seems rather pedestrian, it understates Team X’s ability to score. The difference ratio (here called a performance ratio) between Team X’s average PPG and the average points allowed by opposing defenses ((25 – 15)/15 = 0.67) adjusts for this disparity. In other words, as the difference ratio indicates, Team X averaged 67 percent more points than its competition typically allowed.
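That worked example translates directly into a small helper; this is a sketch of the single-category ratio only, not the full metric:

```python
def performance_ratio(team_avg, opponent_avg):
    """Difference ratio between a team's per-game average in a category and the
    average its opponents give up (or gain, for defensive categories)."""
    return (team_avg - opponent_avg) / opponent_avg

# Team X: 25 PPG against defenses that allow 15 PPG on average.
print(round(performance_ratio(25, 15), 2))  # 0.67
```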
While a litany of statistics could be used to measure production, only 15 were selected:
- Turnover margin
- Third down efficiency (offensive and defensive)
- Red zone efficiency (offensive and defensive)
- Points per game (offensive and defensive)
- Rushing yards per attempt (offensive and defensive)
- Rushing yards per game (offensive and defensive)
- Passing yards per attempt (offensive and defensive)
- Passing yards per game (offensive and defensive)
Simply put (a full description can be seen here), these statistical categories were used to generate 15 performance ratios that were normalized and combined using a weighted average. Slightly more value was assigned to turnovers, third down efficiency, and red zone efficiency than to the other ten statistical categories.
This metric was aptly termed the Team Performance Ratio (TPR) as it adjusts the statistical production of a team to its competition.
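As a rough sketch of how the 15 ratios might be rolled up, the snippet below leans the weights toward turnovers, third down efficiency, and red zone efficiency. The weight values and the category names are illustrative assumptions, not the actual TPR parameters.

```python
# Sketch of the Team Performance Ratio: 15 normalized performance ratios combined
# via a weighted average that favors turnovers, third downs, and the red zone.
# Weight values and category names are illustrative assumptions.

CATEGORY_WEIGHTS = {
    "turnover_margin": 0.10,
    "third_down_off": 0.09, "third_down_def": 0.09,
    "red_zone_off": 0.09,   "red_zone_def": 0.09,
    # the remaining ten categories share the rest of the weight equally
    "ppg_off": 0.054, "ppg_def": 0.054,
    "rush_ypa_off": 0.054, "rush_ypa_def": 0.054,
    "rush_ypg_off": 0.054, "rush_ypg_def": 0.054,
    "pass_ypa_off": 0.054, "pass_ypa_def": 0.054,
    "pass_ypg_off": 0.054, "pass_ypg_def": 0.054,
}

def team_performance_ratio(normalized_ratios, weights=CATEGORY_WEIGHTS):
    """Weighted average of a team's 15 normalized performance ratios."""
    return sum(weights[cat] * normalized_ratios[cat] for cat in weights)
```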
What About The New Results?
The updated AV Ranking was generated for the 2008 season using only regular season data (i.e., no bowl game statistics or win/loss outcomes). For comparison purposes, the values prior to these updates can be viewed here. The tables below show the top 25 AV Ranked teams in addition to the top ten teams in each of the five AV Ranking metrics (SOS, AWP, MOV, QWL and TPR).
Prior to the updates detailed above, the AV Ranking correctly predicted the winner of 79.8 percent of the regular season games but only 51.7 percent of the contests where the two teams were separated by a small AV Ranking point margin (using the season-end AV point values). The updated AV Ranking correctly predicted 80.8 percent of the regular season games and 62.7 percent of those with narrow margins. While the former isn’t a large improvement, the latter certainly is.
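For reference, an accuracy check along those lines might look like the sketch below: predict the team with the higher season-end AV point value, then tally accuracy overall and on the narrow-margin subset. The margin threshold is a hypothetical parameter, since the cutoff used for "narrow" is not given above.

```python
# Sketch of the prediction accuracy check. The narrow_margin threshold is a
# hypothetical parameter; the actual cutoff used in the article is not specified.

def prediction_accuracy(games, av_points, narrow_margin=1.0):
    """games: list of (winner, loser) pairs; av_points: team -> AV point value."""
    correct = total = narrow_correct = narrow_total = 0
    for winner, loser in games:
        predicted = winner if av_points[winner] > av_points[loser] else loser
        margin = abs(av_points[winner] - av_points[loser])
        total += 1
        correct += predicted == winner
        if margin <= narrow_margin:
            narrow_total += 1
            narrow_correct += predicted == winner
    overall = correct / total
    narrow = narrow_correct / narrow_total if narrow_total else None
    return overall, narrow
```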
AV Ranking
[table id=5 /]
Adjusted Win Percentage (AWP)
[table id=8 /]
Strength of Schedule (SOS)
[table id=6 /]
Team Performance Ratio (TPR)
[table id=7 /]
Margin of Victory (MOV)
[table id=9 /]
Quality Wins/Losses (QWL)
[table id=10 /]