McIllece Sports

Audibles: Improving the “Strength of Record” Metric

The Audibles are a series of short-form research articles about thorny issues in college football analytics.

With the season about to kick off and the news that the College Football Playoff Committee will be using “refined” metrics this season, including a version of Strength of Record (SOR), it seems a good time to review SOR and propose an improved version I call “Multinomial SOR”.

Popularized by the ESPN-FPI analytics group, the SOR metric computed at their website (2025 Resume College Football Power Index – ESPN) “reflects chance that an average Top 25 team would have team’s record or better, given the schedule.” The boldfaced type is my own, as it is critical to the calculation.

Presumably, an average Top 25 team was chosen because that would correspond to about the 12th or 13th ranked team, which makes some sense in an era of a 12-team playoff bracket. However, there are currently 136 teams in the Football Bowl Subdivision (FBS) of Division I college football. Computing a metric that reports how the #12 or #13 team would fare against each of the 136 FBS schedules is somewhat arbitrary and ignores the full distribution of college football teams.

A more appropriate measure would be to compare ALL teams to ALL opponents and determine the probability that a randomly selected FBS team would win each game. This gives a full multinomial distribution of FBS power and schedules, rather than a static comparison to the power level of the #12 or #13 team. The full multinomial method proposed here can clearly differentiate the quality and value of beating teams ranked in the lower half of the distribution. The ESPN method fails to distinguish much value between such opponents, even if one is ranked #90 and another is ranked #120, because a team ranked #12 in power rating would have a win probability fairly close to 1.00 against both teams.
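To make the point concrete, here is a small Python sketch comparing how much a #12-caliber team and a mid-pack team separate the #90 and #120 opponents. The ratings and the slope B2 are invented for illustration (they are not taken from any real rating system), and the single mid-pack team is only a stand-in for the "randomly selected FBS team" average that Multinomial SOR uses. The win probability uses the logistic form given in Step 4 of the procedure.

```python
import math

B2 = -0.15  # hypothetical slope; negative so the higher-rated team is favored


def win_prob(x_i, x_j, b2=B2):
    """Neutral-site probability that team i beats team j (logistic form)."""
    return 1.0 / (1.0 + math.exp(b2 * (x_i - x_j)))


# Hypothetical power ratings, in points relative to an average FBS team.
ratings = {"#12": 18.0, "mid-pack": 0.0, "#90": -8.0, "#120": -18.0}

gaps = {}
for ref in ("#12", "mid-pack"):
    p_vs_90 = win_prob(ratings[ref], ratings["#90"])
    p_vs_120 = win_prob(ratings[ref], ratings["#120"])
    # How differently does this reference team see the two opponents?
    gaps[ref] = p_vs_120 - p_vs_90
    print(f"{ref:>8}: vs #90 = {p_vs_90:.3f}, vs #120 = {p_vs_120:.3f}, "
          f"gap = {gaps[ref]:.3f}")
```

Under these made-up numbers, the #12 reference team is a heavy favorite in both games, so the two opponents look nearly identical to it, while the mid-pack reference sees a much larger gap between them.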

Another enhancement in Multinomial SOR is the inclusion of power rating error in the win probability calculation for each game. Simulating every FBS team against every other FBS team 500 times* allows random error to be added to the power ratings in each game, improving the accuracy of the win probability estimates. In ESPN-FPI’s case (and likely the CFP Committee’s metrics as well), it seems extremely unlikely that the variance of the power rating measure is accounted for.

So what do the specifics of Multinomial SOR look like? Here is the step-by-step procedure, assuming power ratings, variances, and a win probability model are previously constructed from past data:

  1. Create a Cartesian product of all FBS teams by all other FBS teams AND any FCS teams that are on an FBS schedule. Call this matchup (i,j).
  2. Merge in power ratings (X) and variances (Y) for team i and team j.
  3. Create a LOOP of size Z = 500 (for 500 simulations). Within each iteration, generate two standard Normal random numbers (R1, R2) to add random error to each power rating:
    X(i,z) = X(i) + R1*sqrt[Y(i)]
    X(j,z) = X(j) + R2*sqrt[Y(j)]
  4. After computing the X(i,z) and X(j,z) power ratings for (i,j) in simulation z, generate a Uniform(0,1) random number (R3) to use in the win probability model. In my case, that is a logistic regression model that takes the form p(i,j) = 1/[1 + exp(B2*(X(i,z) – X(j,z)))]. (Written this way, B2 is negative whenever the higher-rated team is favored.) If R3 <= p(i,j), then flag the result as w(i,j,z) = 1 (win), otherwise w(i,j,z) = 0 (loss). Notice that this assumes a neutral site, but home/away values will be derived mathematically in Step 6 below.
  5. Once all the w(i,j,z) values are calculated, compute the overall probability of beating team j at a neutral site. This is the proportion of all games against j that were wins (for all FBS teams). Call this result p(j|N): The probability that a randomly selected FBS team would beat opponent j at a neutral (N) site.
  6. Algebraically rearrange the win probability model from Step 4 to incorporate the homefield advantage and the simulation result p(j|N):
    B2*(X(.,z) – X(j,z)) = log[(1 – p(j|N))/p(j|N)]
    Add the homefield advantage regression term B1*home (from the preconstructed win probability model) to the left-hand side, for both home = 1 and home = -1:
    home game vs opponent j: B1*1 + B2*(X(.,z) – X(j,z)) = B1 + log[(1 – p(j|N))/p(j|N)]
    away game vs opponent j: B1*(-1) + B2*(X(.,z) – X(j,z)) = -B1 + log[(1 – p(j|N))/p(j|N)]
    Then enter the HFA-adjusted terms into the exponential function of the logistic regression model:
    Home(H): p(j|H) = 1/[1 + exp(B1 + log[(1 – p(j|N))/p(j|N)])]
    Away(A): p(j|A) = 1/[1 + exp(-B1 + log[(1 – p(j|N))/p(j|N)])]
    Neutral(N): p(j|N)
  7. Now we have the probabilities that a randomly selected FBS team would win a game against opponent j at home, away, and neutral sites. Given these probabilities, the SOR value of a win is simply 1 minus the associated probability, and the value of a loss is 0 minus the associated probability. For example, for a home win against j, the SOR value is 1 – p(j|H). The values of wins are between zero and one, and the values of losses are between zero and negative one.
  8. For each FBS team, compute and sum the SOR values of all their games played. Then rank “Strength of Record” by descending SOR totals for each team.
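The eight steps can be sketched in a few dozen lines of Python. This is a minimal illustration on a made-up league, not the production model: the team names, ratings, variances, schedules, and the coefficients B1 and B2 are all hypothetical, and the small clamp on p(j|N) is an added safeguard that is not part of the procedure above. Both coefficients are negative here because, with the model written as in Step 4, negative values are what favor the higher-rated team and the home team.

```python
import math
import random


def neutral_win_probs(ratings, variances, fbs, b2, n_sims=500, seed=1):
    """Steps 1-5: for each opponent j, the share of simulated neutral-site
    games that the FBS field wins against j."""
    rng = random.Random(seed)
    p_neutral = {}
    for j in ratings:                      # every opponent, FBS or FCS
        wins = games = 0
        for i in fbs:                      # every FBS team other than j
            if i == j:
                continue
            for _ in range(n_sims):
                # Step 3: perturb both ratings with Normal error
                x_i = ratings[i] + rng.gauss(0.0, 1.0) * math.sqrt(variances[i])
                x_j = ratings[j] + rng.gauss(0.0, 1.0) * math.sqrt(variances[j])
                # Step 4: logistic win probability, then a Uniform draw
                p_ij = 1.0 / (1.0 + math.exp(b2 * (x_i - x_j)))
                if rng.random() <= p_ij:
                    wins += 1
                games += 1
        p_neutral[j] = wins / games        # Step 5
    return p_neutral


def site_prob(p_n, b1, home, eps=1e-9):
    """Step 6: adjust p(j|N) for site; home = 1 (home), -1 (away), 0 (neutral)."""
    p_n = min(max(p_n, eps), 1.0 - eps)    # added safeguard against log(0)
    return 1.0 / (1.0 + math.exp(b1 * home + math.log((1.0 - p_n) / p_n)))


def sor_total(schedule, p_neutral, b1):
    """Steps 7-8: sum (result - win probability) over a team's games.
    schedule is a list of (opponent, home, won) tuples."""
    return sum((1.0 if won else 0.0) - site_prob(p_neutral[opp], b1, home)
               for opp, home, won in schedule)


# Hypothetical inputs: six "FBS" teams plus one "FCS" opponent (G).
ratings = {"A": 20.0, "B": 10.0, "C": 5.0, "D": 0.0,
           "E": -10.0, "F": -20.0, "G": -30.0}
variances = {team: 9.0 for team in ratings}
fbs = ["A", "B", "C", "D", "E", "F"]
B1, B2 = -0.3, -0.15                       # hypothetical coefficients

p_neutral = neutral_win_probs(ratings, variances, fbs, B2)

schedules = {
    "A": [("B", 1, True), ("F", -1, True)],
    "D": [("A", -1, False), ("E", 1, True), ("G", 1, True)],
}
totals = {team: sor_total(games, p_neutral, B1)
          for team, games in schedules.items()}

# Step 8: rank by descending SOR total.
for team, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team}: {total:+.2f}")
```

Note that the Step 6 expression simplifies algebraically to p(j|N) / [p(j|N) + (1 – p(j|N))*exp(B1*home)], which can be a handy cross-check on the site adjustment.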

Applied retroactively to the 2024 season (including all the postseason games), Multinomial SOR produces the following Top 25. The SOR total is the expected difference in wins against the team’s schedule relative to a randomly selected team in FBS.

2024 Multinomial SOR Top 25

#1 Ohio State (14-2) +8.27
#2 Notre Dame (14-2) +8.05
#3 Texas (13-3) +7.58
#4 Oregon (13-1) +7.54
#5 Penn State (13-3) +6.98
#6 Georgia (11-3) +6.34
#7 Arizona State (11-3) +5.35
#8 BYU (11-2) +5.24
#9 Iowa State (11-3) +5.02
#10 SMU (11-3) +4.69
#11 LSU (9-4) +4.59
#12 Missouri (10-3) +4.54
#13 South Carolina (9-4) +4.41
#14 Alabama (9-4) +4.40
#15 Indiana (11-2) +4.37
#16 Illinois (10-3) +4.27
#17 Tennessee (10-3) +4.23
#18 Ole Miss (10-3) +4.11
#19 Boise State (12-2) +3.97
#20 Colorado (9-4) +3.82
#21 Clemson (10-4) +3.71
#22 Miami (10-3) +3.65
#23 Michigan (8-5) +3.64
#24 Kansas State (9-4) +3.64
#25 Syracuse (10-3) +3.59

 

*Z = 500 simulations might not seem like very many. However, since each FBS team plays each opponent 500 times, there are (136-1)*500 = 67,500 game results to determine the difficulty of a particular opponent, and hence the value of winning or losing the game, which is what SOR is trying to determine. So Z = 500 is quite sufficient and produces consistent, stable results from one run to the next.
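A quick back-of-the-envelope check on that claim, sketched in Python. It treats the simulated games against one opponent as independent win/loss draws, which is a simplifying assumption made only for this illustration, and uses the usual standard error of a proportion, sqrt(p*(1 – p)/n), evaluated at its worst case of p = 0.5.

```python
import math

n_teams, n_sims = 136, 500
n_games = (n_teams - 1) * n_sims   # simulated games against one opponent


def proportion_se(p, n):
    """Standard error of a sample proportion under independent draws."""
    return math.sqrt(p * (1.0 - p) / n)


worst_case_se = proportion_se(0.5, n_games)
print(f"games per opponent: {n_games}")
print(f"worst-case standard error of p(j|N): {worst_case_se:.4f}")
```

Under that assumption the simulation noise in each p(j|N) is on the order of 0.002 at most, which is small next to the differences in game value that SOR is meant to capture.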

 
