College Football Ranking Comparison by K Massey


The first college football ranking comparison was published in 1995. It contained 5-10 ranking systems. Now that number has grown to nearly 100! David Wilson has categorized all college football rankings on the web and maintains a list of links. I would like to take this opportunity to publicly thank him for his efforts; they are much appreciated.

The primary purpose of the ranking comparison is to present the large variety of rankings in an easily accessible and understood format. By displaying all the rankings in one web page, teams and rating systems may be easily compared with each other.

The comparison is open to any set of well-developed, computer-generated ratings, as well as widely published national polls such as the AP or USA Today Coaches'. If you have a set of rankings that you think should be included, please let me know.
The ranking comparison gladly accepts any ranking that results from:

  • A human poll conducted by a major publication or group (e.g. the AP)
  • An advanced computer rating
  • A mathematically based sequential rating (e.g. Elo’s update formula)
  • Publicly well-known systems (e.g. the RPI)

The current policy is to exclude the following (although some may be grandfathered in):

  • Independent fan or small group human polls.
  • Meta-rankings, which are just compilations of other systems.
  • Ad hoc systems based on simple formulas that could be worked out on a calculator.

If you are aware of any rankings that meet the criteria but are not currently included in the comparison, please email me.

It is a challenge to present so much information clearly on one web page. As the number of rankings has grown, a convenient display of them has become more difficult. Because it seems unnecessary to list the actual continuous-scale ratings, I list only the ordinal rankings (i.e. 1st, 2nd, 3rd, …). This provides sufficient information to accomplish the goal of comparing teams and rating systems.

The plain text version sorts the teams by consensus ranking vertically, and sorts the ranking systems by correlation horizontally. Team names are listed at regular intervals so that they will always be visible. The high (red) and low (blue) rankings for each team are highlighted.

At the top of the comparison, the rankings are arranged in groups of five with links to their corresponding web sites. Each ranking system is also assigned a three-character abbreviation, which is displayed every ten teams. For example, the Massey Ratings are abbreviated MAS. Consensus and correlation measures are displayed to the right and bottom of the page.

  • Consensus Average
    The “average” or “consensus” ranking for each team is determined using a least squares fit based on paired comparisons between teams for each of the listed ranking systems. If a team is ranked by all systems, the consensus is equal to the arithmetic average ranking. When a team is not ranked by a particular system, its consensus will be lowered accordingly.
  • Median
    The median has the advantage of being less influenced by outlier rankings.
  • Standard Deviation
    Measures the general agreement about the ranking of a particular team.
  • Correlation to Consensus
    Measures how well a particular ranking matches the consensus.
  • Ranking Violation Percentage
    The percentage of all games played such that, in retrospect, the lower-ranked team defeated the higher-ranked team. For example, if team A beat team B, but ranking system X has B ranked higher than A, then that game is a violation. This is basically an indication of how well a system explains past results (retrodiction). Predictive systems may have higher violation percentages.
    Note: Some more complex rating systems use other factors, such as homefield advantage, to make official predictions. Therefore a game may be mislabeled a violation when only considering the rankings.
  • Weighted Mistakes
    There are many possible measures of how well a ranking reflects actual results. In fact, that is how many models are defined. Potemkin suggested the following to weight violations based on ranking difference as well as importance.
    weighted mistakes = [ sum over games with Rw > Rl of (Rw − Rl)·(2n − Rw − Rl) ] / 1000
    where n = number of teams, Rw = rank of the winner, and Rl = rank of the loser. The sum runs over violations, i.e. games in which the winner carried the numerically larger (worse) rank. I divide by 1000 so that 3 significant digits can be easily displayed in the table.
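The least-squares consensus described above can be sketched as follows. This is my own guess at the construction, not Massey's actual code: treat each team's consensus score as an unknown, and for every pair of teams that a given system ranks together, ask the score difference to match that system's rank difference. With complete rankings the fit reproduces the arithmetic average ranking; when a system omits a team, that team simply contributes fewer equations.

```python
import numpy as np

# Sketch (my own construction) of a least-squares consensus from paired
# comparisons: one unknown score per team, one equation per pair of teams
# ranked together by a given system.

def consensus_scores(rankings, teams):
    """rankings: list of dicts mapping team -> ordinal rank (teams may be
    missing from a system). Returns one consensus score per team; lower
    is better."""
    idx = {t: i for i, t in enumerate(teams)}
    rows, rhs = [], []
    for system in rankings:
        ranked = [t for t in teams if t in system]
        for i, a in enumerate(ranked):
            for b in ranked[i + 1:]:
                row = np.zeros(len(teams))
                row[idx[a]], row[idx[b]] = 1.0, -1.0
                rows.append(row)
                rhs.append(float(system[a] - system[b]))
    # Pairwise differences determine scores only up to a constant; lstsq
    # returns the minimum-norm solution, which we shift so the scores
    # average to the middle rank (n+1)/2 -- a convenient normalization.
    c, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return c - c.mean() + (len(teams) + 1) / 2

teams = ["A", "B", "C"]
complete = [{"A": 1, "B": 2, "C": 3}, {"A": 2, "B": 1, "C": 3}]
print(consensus_scores(complete, teams))  # approximately [1.5, 1.5, 3.0]
```

With complete rankings the result equals each team's arithmetic average rank, as the text states; dropping a team from one of the dicts shows how its consensus shifts when a system omits it.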
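To make the violation and weighted-mistakes definitions concrete, here is a small sketch. Function and variable names are my own; I assume rank 1 is best, so a violation is a game whose winner carried the numerically larger (worse) rank, matching the definition of a violation given above.

```python
# Sketch of the ranking-violation percentage and Potemkin's weighted-mistakes
# measure as I understand them; ranks are ordinals with 1 = best.

def violation_pct(games, rank):
    """games: list of (winner, loser) pairs; rank: dict team -> rank.
    A violation is a game won by the numerically higher-ranked (worse) team."""
    violations = sum(1 for w, l in games if rank[w] > rank[l])
    return 100.0 * violations / len(games)

def weighted_mistakes(games, rank):
    """Weight each violation by the rank difference (Rw - Rl) and by how
    highly ranked the two teams are (2n - Rw - Rl), so upsets near the
    top of the rankings count more; divide by 1000 for display."""
    n = len(rank)
    total = sum((rank[w] - rank[l]) * (2 * n - rank[w] - rank[l])
                for w, l in games
                if rank[w] > rank[l])
    return total / 1000.0

# Toy example with 4 teams and one upset: #4 beats #1.
ranks = {"A": 1, "B": 2, "C": 3, "D": 4}
games = [("A", "B"), ("C", "D"), ("D", "A")]  # D over A is the violation
print(violation_pct(games, ranks))      # 1 of 3 games -> 33.33...%
print(weighted_mistakes(games, ranks))  # (4-1)*(8-4-1)/1000 = 0.009
```

As the note above warns, a game counted here as a violation might still have been correctly predicted by a system that also models factors such as home-field advantage.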

I would like to thank everyone who has provided comments and suggestions regarding the ranking comparison. Many of the emails I have received were very insightful. Special appreciation goes to those individuals who have allowed me to include their computer rankings in the comparison. I hope the comparison has been helpful to them, as well as a valuable resource for the college football fan.