# March's Mad(ness) Scientists

*The laws of probability, the Final Four and the science of "bracketology."*

### By Sheldon H. Jacobson

The fresh smell of spring is in the air, and that means March Madness is right around the corner. This annual sporting event now brings together 68 Division I men's basketball teams, each hoping to win the national championship. Realistically, just a handful of these teams have a true chance of winning the coveted crown. Nonetheless, storylines are made with Cinderella runs and 12th seed upsets of 5th seeded teams. They call it madness for a reason.

The 2010 Final Four had an interesting mix of experienced teams and neophytes. Perennial powerhouse Duke defeated Baylor in the last of the Elite Eight games to earn its first trip to the Final Four since 2004. Big Ten power Michigan State made its sixth visit to the Final Four since 1999, edging Tennessee by a single point. Butler, the first of the four teams to earn its way to Indianapolis, had never punched such a ticket. West Virginia, by defeating Kentucky (a team with seven national championships), made its first appearance in the Final Four since 1959. The ultimate winner was Duke, which edged Butler by two points, 61-59.

What made the 2010 Final Four special was not so much who was there as who was not. One can of course recognize past national champions such as North Carolina, UCLA, Indiana and Arizona, none of which even made the 2010 field on Selection Sunday. However, each year brings its own group of top teams, the so-called No. 1, 2 and 3 seeds. Since 1985, 16 No. 1 seeds have won the national crown. Only once (in 2006) has the Final Four not included a No. 1 seed. If Baylor had made a few more shots against Duke, 2010 would have been the second occurrence of this unusual event.

A simple way to measure the rarity of a Final Four group is to sum their seeds; the higher the number, the more extreme the group. The 2010 total was 13, which is tied for the fourth highest since 1985. If both Baylor and Tennessee had won their games, the total would have been 16, making it the third highest since 1985. In either case, the 2010 Final Four was only the second time since 1985 that two teams seeded No. 5 or lower made this group (the first was in 2000).
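
The seed-sum measure described above is simple enough to check by hand, but a small sketch makes the two scenarios explicit. The team-to-seed assignments are taken from the 2010 bracket rather than from the text of this article, and the function name `seed_sum` is only illustrative.

```python
# Seeds of the 2010 Final Four teams. The article gives the seed mix
# (a No. 1, a No. 2 and two No. 5 seeds); the individual assignments,
# including Baylor's and Tennessee's seeds, come from the 2010 bracket.
final_four_2010 = {"Duke": 1, "West Virginia": 2, "Butler": 5, "Michigan State": 5}

# Hypothetical field: Baylor (No. 3) beats Duke and Tennessee (No. 6)
# beats Michigan State in the Elite Eight.
alternate_2010 = {"Baylor": 3, "West Virginia": 2, "Butler": 5, "Tennessee": 6}


def seed_sum(field):
    """Sum of the four seeds; a higher total means a more unusual group."""
    return sum(field.values())


print(seed_sum(final_four_2010))  # 13
print(seed_sum(alternate_2010))   # 16
```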

For any given year, one can also roughly estimate the likelihood that teams seeded No. 1 and No. 2, along with two teams seeded No. 5 or lower, would make the Final Four, using all the tournament data since 1985 as a predictor of each round's outcome. Using a model based on the truncated geometric distribution (Jacobson et al. 2011), the 2010 field should occur on average only once every 322 tournaments. If Baylor had won its game, such a field would occur on average once every 896 tournaments! A rare group indeed.
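
To give a feel for how a truncated geometric model turns seeds into "once every N tournaments" figures, here is a minimal sketch. It is not the published model: it assumes that the seed of each regional winner follows the same truncated geometric distribution over seeds 1 to 16, that the four regions are independent, and that the parameter is `P = 0.4`, a value chosen purely for illustration. All function names are hypothetical.

```python
from collections import Counter
from math import factorial, prod

N_SEEDS = 16  # seeds per region
P = 0.4       # ASSUMED geometric parameter, for illustration only


def seed_pmf(k, p=P, n=N_SEEDS):
    """Truncated geometric probability that a regional winner is seeded k."""
    return p * (1 - p) ** (k - 1) / (1 - (1 - p) ** n)


def field_probability(seeds, p=P):
    """Probability that the four regional winners carry exactly these seeds,
    in any order, assuming independent and identically distributed regions."""
    # Number of distinct ways to assign the seeds to the four regions.
    arrangements = factorial(len(seeds))
    for count in Counter(seeds).values():
        arrangements //= factorial(count)
    return arrangements * prod(seed_pmf(k, p) for k in seeds)


def tournaments_between(seeds, p=P):
    """Average number of tournaments between occurrences of this seed mix."""
    return 1 / field_probability(seeds, p)


print(round(tournaments_between([1, 2, 5, 5])))  # actual 2010 seed mix
print(round(tournaments_between([2, 3, 5, 5])))  # seed mix had Baylor beaten Duke
```

Under these assumptions the sketch gives roughly 323 and 896 tournaments, in line with the figures quoted above. The published model estimates its parameters from tournament data and may differ in detail.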

Was the 2010 Final Four an indication that parity now exists in college basketball, or that the Selection Committee got things wrong when they seeded the teams? Although these are appetizing conclusions to draw, the 2010 Final Four teams were simply indicative of the laws of probability working their relentless magic.

To put this into perspective, only once since 1985 have there been four No. 1 seeds in the Final Four (in 2008). Using the Jacobson et al. (2011) model, a field of four No. 1 seeds should occur on average once every 39 tournaments. In other words, the fact that it has appeared exactly once in 26 tournaments is reasonable to expect, given the size of the sample. Moreover, it is over five times rarer to have four No. 1 seeds in the Final Four than to have zero No. 1 seeds in the Final Four! Given that zero No. 1 seeds in the Final Four have occurred only once in 26 tournaments, are we long overdue for such an occurrence?
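
The claim that one appearance in 26 tournaments is "reasonable to expect" can be checked with a short binomial calculation. This sketch assumes that tournaments are independent and that the chance of four No. 1 seeds is 1/39 in every year; the function name is illustrative.

```python
from math import comb

RATE = 1 / 39     # per-tournament chance of four No. 1 seeds (figure quoted above)
TOURNAMENTS = 26  # tournaments from 1985 through 2010


def prob_exactly(k, n=TOURNAMENTS, p=RATE):
    """Binomial probability of exactly k occurrences in n independent tournaments."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_never = prob_exactly(0)
p_once = prob_exactly(1)
print(f"never: {p_never:.2f}, exactly once: {p_once:.2f}, at least once: {1 - p_never:.2f}")
```

Under these assumptions, seeing the event exactly once has a probability of about 0.35, and never seeing it at all has a probability of about 0.51, so a single occurrence is entirely unremarkable.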

In its most basic form, the game of basketball can be described as a sequence of dependent (Bernoulli) trials with well-defined outcomes. The sum of the resulting outcomes produces a final score. A superbly talented team will consistently defeat a much weaker opponent, even if the talented team plays very poorly and their weaker adversary plays well. This is why a No. 16 seed has never (so far) beaten a No. 1 seed in the first round of the tournament.

However, as the tournament progresses and the remaining teams' average performances converge, the probability of a weaker opponent winning (or pulling "the upset") also grows. The talented team is still likely to win, and if the teams played 10 games, it may win seven, eight or even nine of them, but the chance always exists that in a single game, the weaker opponent may come out victorious. This was seen in 2010 when Northern Iowa (a No. 9 seed) defeated Kansas (a No. 1 seed and the top overall seed), and Cornell (a No. 12 seed) routed Wisconsin (a No. 4 seed). Another interesting phenomenon occurred when West Virginia (a No. 2 seed) dismantled Kentucky (a No. 1 seed) without even making a two-point basket until the second half. There was also the near miss, when No. 15 Robert Morris took No. 2 Villanova to the brink before falling by three points in overtime, 73-70. With only four No. 15 upsets of No. 2 having occurred in the first round since 1985, the last of which was in 2001, the tournament is poised for another such outcome. Indeed, single-elimination tournaments are why the most talented teams do not always win, and why predicting brackets is so challenging. By contrast, a best-of-seven series, as used at each level of the NBA playoffs, is more likely to be won by the better team.
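
The contrast between a single game and a best-of-seven series can be made concrete with a small calculation. The sketch below assumes that games are independent and that the stronger team wins each one with the same probability `p`; the values 0.7, 0.8 and 0.9 correspond to winning seven, eight or nine games out of 10. The function name is illustrative.

```python
from math import comb


def series_win_prob(p, games=7):
    """Chance that a team with single-game win probability p wins a
    best-of-`games` series, assuming independent games.

    The series is treated as if all games were played; this gives the same
    answer as stopping once one team has clinched a majority of wins.
    """
    needed = games // 2 + 1
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(needed, games + 1)
    )


for p in (0.7, 0.8, 0.9):
    print(f"single game: {p:.2f}   best of seven: {series_win_prob(p):.3f}")
```

A team that wins 70 percent of single games wins a best-of-seven series about 87 percent of the time under these assumptions, which is why a series format leaves far less room for upsets than single elimination.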

Everyone loves upsets, which occur with great regularity and predictability every year in the first two rounds of the tournament. On average, more than four teams seeded No. 11 to 15 win a first round game; five such upsets occurred in 2010, the same number seen in both 2008 and 2009. On average, more than three teams seeded No. 7 to 14 reach the Sweet Sixteen; four such teams were so fortunate in 2010. In fact, it is rare *not* to see a team seeded No. 11 or lower in the Sweet Sixteen; this has only happened four times since 1985.

The 2010 Final Four was special since it contained one team that had never been there and one team that had not been there since the 1950s; neither had won a national championship. The laws of probability dictate that such a scenario can and will occur, given a sufficient number of years. What is impossible to predict is when it will occur and what teams it will involve.

Has the landscape of college basketball shifted? This is highly doubtful. In each of the three years after George Mason made its magical run as an 11th seed, all the Final Four teams were seeded No. 1, 2 or 3. Indeed, this has been the case for five of the eight tournaments from 2003 to 2010. The 2010 Final Four teams suggest that the laws of probability do indeed average out and that reversion to the mean can occur at any time, in any year, without warning. The 2010 Final Four was a pleasure to observe; probability suggests that it may be a long time (possibly as long as 322 years) before a group of such seeded teams play once again for the national championship.

What can be expected in this year's tournament? Three teams will be added to the field, bringing the total of theoretical champions to 68. However, the laws of probability suggest more of what has been seen in previous years: several (but not too many) first round upsets, a low-seeded team in the Sweet Sixteen, and a Final Four with mostly teams seeded No. 1, 2 or 3, which is therefore likely to produce a national champion with a top-three seeding. Think about that when you fill out your bracket.

**Sheldon H. Jacobson** (shj@illinois.edu) is professor of computer science at the University of Illinois at Urbana-Champaign, where he does research and teaches in the area of probabilistic modeling and risk analysis. He is a co-author of the paper "Seeding in the NCAA Men's Basketball Tournament: When is a Higher Seed Better?" published in the *Journal of Gambling Business and Economics,* and "Seed Distributions for the NCAA Men's Basketball Tournament," to be published in *Omega.* The website http://bracketodds.cs.illinois.edu provides a model to estimate the probability of different seed combinations occurring in various rounds of the tournament.

### Acknowledgments

The author thanks Joel Sokol (Georgia Tech), a co-creator of LRMC, Douglas King (University of Illinois), Adrian Lee (CITERI) and Alexander Nikolaev (University of Buffalo) for their comments on earlier drafts of this article.

### References

- Jacobson, S.H., Nikolaev, A.G., King, D.M., Lee, A.J., 2011, "Seed Distributions for the NCAA Men's Basketball Tournament," to be published in *Omega.*

*OR/MS Today* © 2013 by the Institute for Operations Research and the Management Sciences. All rights reserved.