Division I-A college football is the only major college sport that does not determine its champion on the field. Historically, two opinion polls - one of sportswriters (the Associated Press poll), the other of college coaches (the USA Today/CNN Coaches poll, formerly the United Press International coaches poll) - unofficially determine the mythical national champion (MNC). Silly as it may seem, the best team is determined by individuals with personal biases who may possess limited knowledge about teams outside their time zone or conference.
Determining the college football mythical national champion over the last five years has been especially problematic, resulting in more fervent calls for a national playoff system. Miami (Florida) was a controversial choice for the MNC in 1989, and the following two years the AP and Coaches poll named different champions. In 1990 Georgia Tech, with one tie, and Colorado, with a tie, a loss and a fifth-down victory, split the MNC. In 1991, Miami and Washington both finished undefeated and split the top spot in the polls.
A Bowl Coalition was hastily formed for the 1992 season, whereby, if feasible, the top-ranked teams, as determined from the combination of the two voting polls, would be matched in a bowl game. This worked out fortuitously well in 1992 (Miami and Alabama met in the Sugar Bowl, with Alabama winning), but poorly in 1993 and 1994. The 1993 season exhibited the folly of allowing sportswriters and coaches to determine the Bowl Coalition matchup, as increased politicking and documented irrational and biased voting occurred. Beyond the poor voting, a serious ethical question could be raised: sportswriters were, in essence, reporting on events that they themselves had a hand in creating! Even after the bowl games, though Florida State was named the MNC in both polls, other teams protested the results.
Last year, as in 1991, two teams, Penn State and Nebraska, finished the season undefeated. Since Penn State plays in the Big Ten Conference, which is not part of the Bowl Coalition, the two teams did not meet in a bowl. However, unlike 1991, Nebraska was named national champion in both polls (though, again, not unanimously), giving Penn State the dubious distinction of finishing a season undefeated and not being named mythical national champion on three different occasions (1968, 1969 and 1994). Justifiably so, Penn State followers cried "Foul!" to all who would listen.
There have been a number of attempts at developing bias-free rating and ranking systems for college football, some dating back to the 1920s (e.g., the Dickinson system, operational from 1926 to 1940, was developed by a professor of economics at the University of Illinois). The two best-known rating systems today are the New York Times Computer Ratings and Jeff Sagarin's ratings published in USA Today.
I would like to offer a neural network-based approach to ranking football teams that I believe offers a systematic and objective way to assist in naming a national champion. In a recent Interfaces article, the neural network model was used to analyze the 1993 college football season, showing, among other things, that the two computer rating systems mentioned above have serious flaws in their logic. This article reports on the results of applying the same neural network approach to the 1994 season.
Neural Network Model
Neural networks are, in essence, biologically inspired approaches to information processing. In the context of college football, neurons represent the teams and their output values the relative strength of each team, with weighted connections between neurons representing games played and their outcomes. The neural network calculates the ranking for each team by considering the connection weights (game outcomes) in conjunction with the corresponding output of the connected neuron (strength of opponent). A steady-state solution is determined, and a ranking is derived from the neural output values.
The basic "philosophy" of the approach is to recognize the interaction between game results and strength of opponent explicitly, treating each game with equal weight in determining a team's ranking, and noting that there is more nobility in losing to a strong opponent than in beating up unmercifully on a weak team. More details of the approach and the theory behind it are provided in the Interfaces article mentioned above.
Each Division I-A team has a corresponding neuron in the neural network. An additional neuron with a fixed output value of zero is also included to represent a generic team for those cases where a lower-division team defeated a Division I-A team (this happened 11 times in 1994). This was done to appropriately penalize the Division I-A team for such an egregious loss. Games where a Division I-A team defeated a non-Division I-A team were excluded from consideration.
Any point differential greater than 15 points was truncated at 15. Thus, teams that are allegedly merciless in running up the score against inferior teams (Florida comes to mind here) are not overly rewarded. Model parameters were chosen such that great emphasis in the calculations was placed on strength of opponent, based upon independent feedback and comments I have received from knowledgeable colleagues, who, of all things, included some sportswriters!
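The mechanics described in this section can be illustrated with a short Python sketch. This article does not state the actual update rule or parameter values, so everything below other than the 15-point cap and the fixed zero-output generic neuron is an assumption made for illustration: the averaging update, the names `rate_teams`, `cap_margin` and `GENERIC`, and the weights `opponent_weight` and `outcome_weight` are all hypothetical and should not be read as the model used to produce the rankings reported here.

```python
from collections import defaultdict

MARGIN_CAP = 15                     # from the article: differentials above 15 are truncated
GENERIC = "GENERIC_LOWER_DIVISION"  # hypothetical label for the fixed zero-output neuron


def cap_margin(points_for, points_against, cap=MARGIN_CAP):
    """Point differential, truncated to the range [-cap, +cap]."""
    return max(-cap, min(cap, points_for - points_against))


def rate_teams(games, opponent_weight=0.9, outcome_weight=0.1,
               use_margin=True, iterations=200):
    """Iterate an ASSUMED averaging rule until it settles to a steady state.

    games: list of (team_a, score_a, team_b, score_b) tuples.
    Each game counts equally; a team's strength is the average, over its
    games, of a blend of opponent strength and game outcome.
    """
    results = defaultdict(list)
    for a, score_a, b, score_b in games:
        if use_margin:
            outcome = cap_margin(score_a, score_b) / MARGIN_CAP  # in [-1, 1]
        else:
            outcome = (score_a > score_b) - (score_a < score_b)  # win/tie/loss only
        results[a].append((b, outcome))
        results[b].append((a, -outcome))
    results.pop(GENERIC, None)  # the generic neuron is never updated

    strength = {team: 0.0 for team in results}
    strength[GENERIC] = 0.0
    for _ in range(iterations):
        updated = {GENERIC: 0.0}  # output stays fixed at zero
        for team, played in results.items():
            total = sum(opponent_weight * strength[opp] + outcome_weight * out
                        for opp, out in played)
            updated[team] = total / len(played)
        strength = updated

    ranking = sorted(results, key=lambda team: strength[team], reverse=True)
    return ranking, strength


# Toy data only (not real 1994 results): three placeholder teams.
toy_games = [("A", 30, "B", 0), ("B", 10, "C", 7), ("A", 21, "C", 14)]
toy_ranking, toy_strength = rate_teams(toy_games)
```

Because `opponent_weight` is below 1 in this sketch, each pass shrinks the difference between successive iterates, so the loop settles to a fixed point; setting `use_margin=False` gives a win/loss-only variant that ignores point differential altogether.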
1994 Season Results
Results from three different models are shown in Table 1. The standard implementation ranks Penn State ahead of Nebraska, primarily due to Penn State having a small edge in overall average schedule strength. I executed the model under a variety of different parameter settings and, in each case, Penn State remained No. 1. In fact, as the season progressed, Penn State became the No. 1-ranked team at about the sixth week (when the number of games played warrants meaningful rankings) and never relinquished its top ranking.
Admittedly, truncating the point differential at 15 points is arbitrary. The intent is to not allow teams to "fatten" their rankings by running up the score on talent-challenged opponents. For comparison purposes, I have included a second neural network ranking which ignores point differential. As before, Penn State is No. 1 across all parameter settings. Significant changes: Alabama vaults to No. 3 in the ratings (the Crimson Tide won very ugly in 1994), previously unranked North Carolina State jumps to No. 14 (the Wolfpack suffered some big losses to tough teams), while Bowling Green, Illinois and Boston College all fall out of the Top 25.
Finally, I include a third version of the neural network ranking, one that eliminates all games involving teams from the Mid-American Conference and the Big West Conference. I have not done this as an attempt to be malicious to the fans of these fine, outstanding institutions, nor as a final act of desperation to try to align the output of the model with my loyalties, but to investigate the impact that including these teams has on the final ratings. Most college fans would not concur that Bowling Green should be ranked No. 16, as in the original neural model. The perception exists (right or wrong) that the MAC and the Big West are closer in ability to Division I-AA than I-A. This is only reinforced by:
(b) Interconference "Incest" - Non-conference games between the two conferences occur frequently.
(c) The intense flak received by some top teams for scheduling games with members of these conferences, regardless of circumstance (see Nebraska vs. Pacific, 1994).
(d) Poor record of the conferences with other I-A schools; only six wins in 1994 against teams outside the two conferences.
(e) High relative loss rate to Division I-AA schools (9 of the 11 losses in 1994 came from these two conferences).
When these two conferences are not included, the neural network model finds Nebraska the top-ranked team, with the rest of the rankings fairly consistent. Why did this flip-flop occur?
The aforementioned Nebraska-Pacific game (a 70-21 NU victory) was one of the games that helped "drag" Nebraska's schedule rating down. Recall that a team's overall strength is determined by treating each game result equally. Eliminate this game from the analysis and Nebraska's schedule is more on par with Penn State's, and the interactions of game results with opponent strength narrowly favor Nebraska.
Relatedly, had Nebraska defeated a Top-25 team (say, Illinois) instead of Pacific, irrespective of point differential, Nebraska would have been ranked No. 1 using the original model.
Results and the Model
I'll be the first to admit that the neural model possesses implementation assumptions whose validity can be questioned. For instance, Nebraska supporters could argue (as they have vociferously in justifying their MNC) that while their schedule may have been marginally less rigorous than Penn State's, they played two excellent teams (Colorado and Miami) that Penn State's schedule could not match. This is an interesting argument. I am presently investigating how implementing non-linear consideration of opponent strength impacts the rankings. Other arguable components of the model would be the treatment of games with Division I-AA teams, the margin of victory issue, and probably other things that I have omitted.
Do the results imply that Penn State would have defeated Nebraska had they played on a neutral field in 1994? Absolutely not. The model is not intended as a prediction tool, but to assess relative team strength based only on the facts, the game results. The model does not determine who has the most talented team; it objectively determines the teams which have performed the best throughout the entire season.
What lessons can be gleaned from the 1994 neural network rankings as the 1995 season continues? The potential exists for a repeat performance of what plagued the end of the 1993 and 1994 seasons. The Bowl Alliance, as it calls itself today, has set up a possible national championship game in the Fiesta Bowl on Jan. 2, 1996. The two top-ranked teams from the Bowl Coalition will be matched up in Tempe, Ariz. The problematic item is that, similar to last year, Penn State and USC could both be undefeated and highly ranked and would meet in the Rose Bowl. Also, there are a number of Bowl Coalition teams that may be headed for undefeated seasons - Florida State, Texas A&M, either Nebraska or Colorado from the Big Eight, Florida, Auburn or Tennessee from the SEC.
Consider a worst-case scenario - six undefeated teams at the end of the regular season. How do you pick No. 1 and No. 2? The neural network approach will favor those who have played the harder schedules, and will not be influenced (as voters might) by those teams who will be placing large, integer values on the scoreboard against weak opponents (advantage Florida State and Colorado; disadvantage Texas A&M). I suspect that the 1995 season will end no differently than the past two seasons.
So why the neural network over other approaches? Didn't both the Sagarin ratings and New York Times ratings also place Penn State No. 1? The two approaches have provided some very problematic and questionable results over time. Space does not permit me to discuss their flaws here, but they are thoroughly exposed in the context of the 1993 season in the Interfaces article mentioned earlier. Also, the Sagarin ratings come out before the season begins! There must be some subjectivity involved in these rankings.
Ultimately, one would argue (as I do, most of the time) that no rating system, however bias-free and objective, should be determining the national champion. It should be done on the field. I propose that a rating approach such as the neural network be used to create a season-ending seeded "tournament" of x teams (4, 8, etc.). Reach a consensus on the parameters of the model before the season, make the algorithm known to all (unlike the secrecy of the NCAA basketball tournament), and use it to create the tournament. Some teams will still be disappointed, but at least this would occur objectively.
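As a rough illustration of the proposal above, the Python sketch below turns a final ranking into first-round pairings for an x-team tournament. The article does not say how teams would be paired, so the conventional "best versus worst remaining seed" scheme used here is an assumption, as are the function name `seed_bracket` and the placeholder team labels.

```python
def seed_bracket(ranking, size=8):
    """Pair the top `size` teams for a first round: seed 1 vs. seed `size`,
    seed 2 vs. seed `size - 1`, and so on (an ASSUMED, conventional scheme).

    ranking: list of team names ordered from best to worst.
    """
    if size < 2 or size & (size - 1):
        raise ValueError("size must be a power of two (2, 4, 8, ...)")
    if len(ranking) < size:
        raise ValueError("not enough ranked teams to fill the bracket")
    seeds = ranking[:size]
    return [(seeds[i], seeds[size - 1 - i]) for i in range(size // 2)]


# Placeholder labels only; these are not actual rankings.
final_ranking = ["Seed 1", "Seed 2", "Seed 3", "Seed 4", "Seed 5"]
first_round = seed_bracket(final_ranking, size=4)
```

With the ranking procedure and its parameters agreed on before the season, a pairing rule of this kind would be fully reproducible by anyone holding the same game results.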
One final thought: Does college football really want to put an end to the imperfect way we name our "mythical national champion"? Do we want to create a Superduper Bowl that always disappoints us like the NFL version? Some might claim that the level of disagreement as to the No. 1 team is one of the great traditions of college football, and that it keeps us going through the long eight-month span between seasons. Unfortunately, I know of no model that can assist us with this more "fundamental" dilemma.