The Mathematics of the NBA Playoffs

By Bill James

April 21, 2012

I was watching one of those Pardon-Me-For-Asking type shows, where two hosts debate something, and there was an exchange regarding the significance or reliability of NBA won-lost records. One co-host said that the Chicago Bulls were obviously the class of the NBA this season and should be the favorites in the playoffs, while the other one said that regular-season NBA records didn’t mean anything because the best teams just coast through the season without paying much attention to wins and losses. In his mind Miami (42-17) was obviously the best team, while Chicago (46-14) and Oklahoma City (44-16) were just young teams that were keeping their foot on the gas through the regular season. This co-host—the Miami Advocate—pointed out that in recent years the team with the best record in the NBA has generally not won the championship, and he tied some specific information to that, which I would certainly screw up if I tried to quote it.

Anyway, the question I am pursuing here is this: Is the regular-season won-lost record a true indicator of how a team is likely to perform in the NBA playoffs? There are other questions as well (What is the chance that the Bulls will win the Playoffs this year?), but essentially what I’m trying to get to is that one question.

OK, how often should the team with the best record win? Let’s say that Chicago (46-14) was playing Cleveland (22-39); how often should Chicago win? These records are a couple of days out of date, but don’t worry about that.

You can figure that by cross-multiplying the wins and losses:

$$P(\text{Chicago wins}) = \frac{\text{Chicago Wins} \times \text{Cleveland Losses}}{(\text{Chicago Wins} \times \text{Cleveland Losses}) + (\text{Cleveland Wins} \times \text{Chicago Losses})}$$

That’s one of those formulas I developed in the 1970s, only (I am told) it is mathematically true, as opposed to a heuristic. Anyway, at 46-14 vs. 22-39, that would be:

$$\frac{46 \times 39}{(46 \times 39) + (22 \times 14)}$$

Which is 1794 / (1794 + 308), or 85.3%. When the Bulls play Cleveland, Chicago should win 85.3% of the time.
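The cross-multiplication can be written as a few lines of Python. This is only a sketch: the function name and argument order are made up for illustration and do not come from the article.

```python
def single_game_win_prob(wins_a, losses_a, wins_b, losses_b):
    """Chance that team A beats team B in one game, by cross-multiplying records."""
    a_side = wins_a * losses_b   # A's wins times B's losses
    b_side = wins_b * losses_a   # B's wins times A's losses
    return a_side / (a_side + b_side)

# Chicago (46-14) against Cleveland (22-39), the records used in the text
chicago_vs_cleveland = single_game_win_prob(46, 14, 22, 39)
print(f"{chicago_vs_cleveland:.3f}")  # 0.853
```

Two teams with identical records come out at exactly .500, which is a quick sanity check on the formula.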

That’s a single game. But if Chicago would win 85.3% of games between the two teams, what is the probability that they would win 4 out of 7?

If the two teams were to play 7 games, the probability that Chicago would win all 7 games is .330, or 33% (.853 to the 7th power). The probability that Chicago would go 6-1 is 39.6%. The probability that they would go 5-2 is 20.4%, and the probability that they would go 4-3 is 5.8%. Altogether, the probability that Chicago would win at least 4 of the 7 games, if the regular-season record is a true indicator of the quality of the teams, is 98.9%.

Some of you may be distracted by the fact that not all playoff series go 7 games; sometimes they end 4-1 or 4-2. But from a mathematical standpoint that’s not relevant. Every sequence in which a team gets its four wins in fewer than seven games is a sequence in which it would have at least four wins if the series went the full seven, so all we’re really worried about is the probability that the team would win four or more games out of seven. In the case of Chicago vs. Cleveland that is 98.9% for Chicago, if the math works.
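The seven-game arithmetic is a binomial sum, and a short Python sketch reproduces it. The helper name `prob_at_least` is an illustration only, and the sketch makes the same assumption as the text: every game is won independently with the same probability.

```python
from math import comb

def prob_at_least(k, n, p):
    """Chance of at least k wins in n independent games, each won with probability p."""
    return sum(comb(n, wins) * p**wins * (1 - p)**(n - wins)
               for wins in range(k, n + 1))

p = 1794 / 2102  # Chicago's single-game chance against Cleveland, about .853

# Exact records over seven games: 7-0, 6-1, 5-2, 4-3
for wins in (7, 6, 5, 4):
    exact = comb(7, wins) * p**wins * (1 - p)**(7 - wins)
    print(f"{wins}-{7 - wins}: {exact:.3f}")

series = prob_at_least(4, 7, p)
print(f"at least 4 of 7: {series:.3f}")  # 0.989
```

Treating every series as if all seven games were played gives the same answer as stopping at the fourth win, for the reason given in the paragraph above.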

OK, let’s see if the math works. If what the Miami Advocate is saying is true, then there should be more upsets in the NBA post-season than would be predicted by the method that I just outlined. Are there, in fact, more upsets than would be predicted by this math?

I went back to 2005 to look at the data. Since 2005 (2005 through 2011) there have been 105 NBA playoff series, all of them best-of-seven series. (There are 16 teams that make the NBA playoffs each year, not 714 as is commonly believed. Each series eliminates one team, so it takes 15 series a year to eliminate all of the teams except the champion. Seven times 15 is 105.)

Of those 105 Playoff series, 56 have been first-round series. In those 56 series, given the math, we would have expected that there would be 13.93 first-round "upsets" if the regular-season won-lost records are meaningful and predictive, and more than that if the regular-season won-lost records are not meaningful.

In fact, there have been only 12 first-round upsets in those seven seasons.

Backing off to make sure everybody is keeping up. . .what does it mean to say that there "should" be a certain number of upsets? If the won-lost records are a valid indicator, shouldn’t we expect zero upsets?

No; that’s naïve math. If Chicago were to play Cleveland, Chicago should win 98.9% of the time—not 100% of the time. Let’s go back to the 2005 season.

In 2005 Miami (59-23) played New Jersey (42-40) in the first round. By the logic I outlined before, Miami had an 88.6% chance of winning the series; New Jersey, an 11.4% chance. There’s an expectation there of .114 upsets. Miami won, so there was no upset.

Detroit (54-28) played Philadelphia (43-39) in the first round. Detroit had a 77.7% chance of winning the series; Philadelphia, a 22.3% chance. There’s an expectation of .223 upsets. Detroit won, so. . .no upset.

Boston (45-37) played Indiana (44-38). Boston had a 52.7% chance of winning the series; Indiana, a 47.3% chance. There’s an expectation of .473 upsets. Indiana did win, so that’s an upset.

Chicago (47-35) played Washington (45-37). Chicago had a 55.4% chance to win; Washington, a 44.6% chance. That’s an expectation of .446 upsets. Washington did win, so that’s another upset.

Phoenix (62-20) played Memphis (45-37). Phoenix had an 89.6% chance of winning the series; Memphis, a 10.4% chance. Phoenix did win; no upset.

San Antonio (59-23) played Denver (49-33). San Antonio had a 77.2% chance of winning the series. They did win; no upset.

Seattle (52-30) played Sacramento (50-32). Seattle had a 55.7% chance to win the series, and they did. No upset.

Dallas (58-24) played Houston (51-31). Dallas had a 70.0% chance to win the series; Houston, a 30.0% chance. Dallas won; no upset.

For the year, we had an expectation of 2.331 upsets in these eight series:

	Underdog	Chance
	New Jersey	.114
	Philadelphia	.223
	Indiana	.473
	Washington	.446
	Memphis	.104
	Denver	.228
	Sacramento	.443
	Houston	.300

	Expected Upsets	2.331

There were actually two upsets in there; Indiana and Washington beat teams with better records. Over the course of the seven seasons there were 13.928 expected first-round upsets, 12 actual first-round upsets. So there is no evidence, in the first-round games, that the regular-season won-lost records are anything other than fully meaningful indicators of the strength of the teams.
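The 2005 first-round table can be rebuilt from the won-lost records quoted above. The sketch below assumes the same two steps as the text (cross-multiply for a single game, then take the chance of at least four wins in seven); the function and variable names are hypothetical.

```python
from math import comb

def game_prob(fav, dog):
    """Single-game chance for the favorite; each team is a (wins, losses) pair."""
    fav_side = fav[0] * dog[1]
    dog_side = dog[0] * fav[1]
    return fav_side / (fav_side + dog_side)

def series_prob(p, need=4, games=7):
    """Chance of winning at least `need` of `games` at single-game chance p."""
    return sum(comb(games, w) * p**w * (1 - p)**(games - w)
               for w in range(need, games + 1))

# 2005 first round as listed in the text: (favorite record, underdog name, underdog record)
first_round_2005 = [
    ((59, 23), "New Jersey", (42, 40)),
    ((54, 28), "Philadelphia", (43, 39)),
    ((45, 37), "Indiana", (44, 38)),
    ((47, 35), "Washington", (45, 37)),
    ((62, 20), "Memphis", (45, 37)),
    ((59, 23), "Denver", (49, 33)),
    ((52, 30), "Sacramento", (50, 32)),
    ((58, 24), "Houston", (51, 31)),
]

# The underdog's series chance is the expected number of upsets for that series
upset_chances = {dog_name: 1 - series_prob(game_prob(fav, dog))
                 for fav, dog_name, dog in first_round_2005}
expected_upsets = sum(upset_chances.values())

for name, chance in upset_chances.items():
    print(f"{name:<13}{chance:.3f}")
print(f"Expected upsets: {expected_upsets:.3f}")
```

Summing the underdogs' chances is what turns eight separate probabilities into a single expected count that can be compared with the number of upsets that actually happened.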

In 2007 Dallas (67-15) was a 1 seed, and faced Golden State (42-40) in the first round. Dallas had an estimated 97.2% chance to win the series. Golden State won as Dirk Nowitzki went 2-for-13 in the final game, by far the biggest upset in the post-season in the last seven years.

How about the second round?

Same thing, only more so. We have an expectation of 7.315 upsets in the second-round contests, whereas there actually were only 4. (I am simply ignoring those series in which the two teams had the same won-lost record, since these would not be informative in regard to our question.) In fact, oddly enough, there were fewer second-round upsets than expected in each of the seven seasons. The number of expected second-round upsets ranged from a low of 0.757 in 2008 to a high of 1.280 in 2010, but there were always fewer upsets than expected.

Third round?

In the third round we have only 14 playoff series—two a year. In 2005 both of those were upsets; Detroit (54-28) beat Miami (59-23), and San Antonio (58-24) beat Phoenix (62-20). And, for the seven years as a whole, the number of third-round upsets does exceed expectation, 7 to 4.316.

The fourth round is the championship series, one a year. These tend in general to be pretty even series, which either team has some chance to win. I’ll chart it, with the team that won each series marked by asterisks:

Favorite	W	L	Chance	Underdog	W	L	Chance
*San Antonio*	58	24	.621	Detroit	54	28	.379
Dallas	60	22	.732	*Miami*	52	30	.268
*San Antonio*	58	24	.724	Cleveland	50	32	.276
*Boston*	66	16	.790	Lakers	57	25	.210
*Lakers*	65	17	.707	Orlando	59	23	.293
*Lakers*	57	25	.697	Boston	50	32	.303
Miami	58	24	.532	*Dallas*	57	25	.468

			4.804				2.196

In the seven years we would have expected the team with the stronger regular-season won-lost record to have lost the series 2.2 times. They have actually lost the series twice.

Short answer?

There is no reason whatsoever to believe the thesis advanced by the Miami Advocate. If anything, the advantage of the team with the better record is LARGER than it should be, not smaller. In the seven seasons, we would have expected 28 playoff series upsets (27.755). In fact, there have been only 25.

The belief of the Miami Advocate is, I believe, based on a naïve understanding of what it means to be the favorite: that the team with the best record should win, or the won-lost record is misleading. In fact, in general, the team with the best record is more likely to lose somewhere in the playoffs than to win, even if we assume that the won-lost records are true indicators.

If we take Chicago last year, they had a 96.1% chance of winning in the first round, which they did, and a 90.7% chance of winning in the second round, which they did. But their chance of winning in the third round, since Miami’s record was only four games less impressive than Chicago’s, was only 63.3%. They lost that series, but had they won it and faced Dallas in the championship series, their chance of winning then would have been only 66.3%. Their chance of winning ALL FOUR series, then, was only 37%: 96%, times 91%, times 63%, times 66%, which works out to .366. Their "failure" to win all four rounds does not indicate that their regular-season record was misleading, merely that even the best team has less than a 50% chance of winning the championship.
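The four-round product is easy to check in Python. The sketch below uses the four series chances quoted above; the 80%-per-round team in the last line is a made-up example, not a figure from the article.

```python
from math import prod

# Chicago's round-by-round series chances in 2011, as given in the text
round_chances = [0.961, 0.907, 0.633, 0.663]

title_chance = prod(round_chances)
print(f"{title_chance:.3f}")  # 0.366

# A hypothetical team that is an 80% favorite in every round still falls short of 50%
print(f"{0.80 ** 4:.2f}")  # 0.41
```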

Another way to summarize the playoff record is to look at groups of teams. Since 2005 there have been three series in which the two teams had the same record, and in those three, of course, there is no favorite. But there have been 7 series in that time in which the team with the better record had a 51 to 53% chance (actually, 50.5% to 53.499%) of winning. As it happens, in those series the better teams are just 2-5:


Range			Wins	Losses
51%	to	53%	2	5
54%	to	56%	3	1
57%	to	59%	5	1
60%	to	62%	6	4
63%	to	65%	4	5
66%	to	68%	3	1
69%	to	71%	11	1
72%	to	74%	7	3
75%	to	77%	5	1
78%	to	80%	4	1
81%	to	83%	4	0
84%	to	86%	4	1
87%	to	89%	3	1
90%	to	92%	8	0
93%	to	95%	3	0
96%	to	100%	4	1

Range			Wins	Losses
51%	to	59%	10	7
54%	to	62%	14	6
57%	to	65%	15	10
60%	to	68%	13	10
63%	to	71%	18	7
66%	to	74%	21	5
69%	to	77%	23	5
72%	to	80%	16	5
75%	to	83%	13	2
78%	to	86%	12	2
81%	to	89%	11	2
84%	to	92%	15	2
87%	to	95%	14	1
90%	to	100%	15	1

You can see that, in general, the outcomes of series track well with the mathematical expectations based on the assumption that the won-lost record is valid, and we could expect them to track better if we had a larger sample size. Teams that should have won the series 72 to 80% of the time have in fact won their series 76% of the time; teams that should have won 84 to 92% of the time have in fact won 88%.
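The wider ranges pool each run of three adjacent narrow ranges, and that grouping can be recomputed from the narrow counts. The sketch below is an illustration only: the `pooled` helper and the tuple layout are assumptions, while the win and loss counts are copied from the narrow-range rows of the table.

```python
# (low %, high %, favorite wins, favorite losses) for each three-point range in the table
bins = [
    (51, 53, 2, 5), (54, 56, 3, 1), (57, 59, 5, 1), (60, 62, 6, 4),
    (63, 65, 4, 5), (66, 68, 3, 1), (69, 71, 11, 1), (72, 74, 7, 3),
    (75, 77, 5, 1), (78, 80, 4, 1), (81, 83, 4, 0), (84, 86, 4, 1),
    (87, 89, 3, 1), (90, 92, 8, 0), (93, 95, 3, 0), (96, 100, 4, 1),
]

def pooled(bins, width=3):
    """Pool each run of `width` adjacent ranges into one wider range."""
    out = []
    for i in range(len(bins) - width + 1):
        window = bins[i:i + width]
        wins = sum(b[2] for b in window)
        losses = sum(b[3] for b in window)
        out.append((window[0][0], window[-1][1], wins, losses))
    return out

# Print each pooled range with the favorites' actual winning percentage
for low, high, wins, losses in pooled(bins):
    print(f"{low}% to {high}%: {wins}-{losses} ({wins / (wins + losses):.0%})")
```

Pooling trades resolution for sample size: each wide range holds enough series that the actual winning percentage can be compared with the predicted range.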

OK, there are three other issues we should deal briefly with here, just because, if we don’t deal with them, people will assume I am a moron and didn’t think of these things. One is the home-field advantage. The mathematical formula that underlies all of this work assumes that a team’s chance of winning is the same, home or road. Of course that can’t be right, and, since the team with the better won-lost record always has the home-court advantage in the NBA playoffs, this probably explains the fact that there have been fewer upsets than we would have expected.

There has to be some way to adjust the head-to-head multiplication formula so that it gives a "home" result and a "road" result, but I’ve never been able to figure out the best way to do that. I work on that problem for a few hours several times a year, and have for 30 years; I’ve never gotten it.

Including consideration of the home-field advantage would change the math, but probably not very much. Saying that a team "has the home field advantage" makes it sound like they have the home field advantage all seven games. If they had the home field advantage all seven games, that would change the math enormously, but in reality, "having the home field advantage" just means that you have the home field advantage four times in seven games—a much less substantial matter.

Further, it is likely that the home-field adjustment to the basic formula, if I did know how to do this, would tend to move the predicted outcomes in the direction of .500. Adding any random unknown component to a statistical competition tends to move the expected result in the direction of 50/50. So then, adding the home-field adjustment game-by-game would probably tend to move the expected winning chance of the better team down by a little bit because of the randomness of it, but then it would tend to move it back up a little bit because the better team has the home-field advantage four times in seven games. On balance, it probably wouldn’t make very much difference, but I don’t really know.

A second factor that makes the math less instructive than it might be is that the playoffs are so long that we have to presume that the quality of the teams changes at least a little bit during the course of the playoffs. Suppose that one team goes into the playoff 53-29 and then wins two rounds of playoff games 4-3 and 4-3, while the other team goes into the playoff 50-32 and wins the first two rounds 4-0 and 4-1. Are the regular-season won-lost records still the best way to study this issue? Or would it be better to say that the first team is now 61-35 (.635) while the second team is now 58-33 (.637)? It is no longer clear which team should be expected to win the series.
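The updated records in that example are simple addition, sketched here in Python. The function name is hypothetical; the records and series scores are the ones in the paragraph above.

```python
def updated_record(wins, losses, series_results):
    """Add playoff series results, given as (wins, losses) pairs, to a regular-season record."""
    wins += sum(w for w, _ in series_results)
    losses += sum(l for _, l in series_results)
    return wins, losses, wins / (wins + losses)

team_one = updated_record(53, 29, [(4, 3), (4, 3)])
team_two = updated_record(50, 32, [(4, 0), (4, 1)])
print(team_one[:2], f"{team_one[2]:.3f}")  # (61, 35) 0.635
print(team_two[:2], f"{team_two[2]:.3f}")  # (58, 33) 0.637
```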

Sure, but. . .we’re trying to look ahead. We’re trying to look at this thing from the perspective of regular-season records, asking whether the regular-season records are a valid predictor of post-season success. The best answer to that appears to be "yes". The fact that you could probably make better predictions if you knew in advance about things that haven’t happened yet is not really helpful.

Third, of course, is the issue of scheduling parity. I don’t know much about it, but it seems unlikely that all the teams in the NBA have played schedules of exactly equal strength during the season. The cross-multiplication of records that we started with wouldn’t be useful in comparing college teams, for example, because college teams have played vastly unequal schedules during the regular season. It’s an issue.

It’s an issue, but. . .the system still seems to work out about the way it should, suggesting that the unprocessed factors may not be huge problems. The Miami Advocate, in my opinion, was probably just wrong. Chicago is the team most likely to win the NBA championship this season.

COMMENTS (14 Comments, most recent shown first)

solohbet
Dear Bill,
My apology for this tardy thanks to you and Trailbzr for the math help. Much appreciated.
6:40 AM May 11th

studes
My experience (just got three math nerds through high school) is that probability has made some inroads in high school math curricula, but not nearly enough. I agree with Nate 100%. My son is getting a PhD in Math, and it's something the graduate students actively talk about.
1:00 PM Apr 25th

tkoegel
I'm not sure the information here regarding high school probability/statistics courses is up-to-date. Certainly when I was in high school (late 1970s), that was true. I was interested in statistics and had to take it at the local junior college. But, here in the SF Bay Area, of course, several of the college prep schools offer "AP Statistics" as one of their advanced placement offerings. True, the top track of math students will be in AP differential calculus. But the AP course in statistics is offered every year. Maybe we can credit the folks who own the SAT and AP tests for promoting common sense.

And yes, what little statistics I retained from the JC is far more valuable than my calculus and diff EQ from high school and college.
9:47 AM Apr 25th

jdw
Bill,

The NBA's current scheduling format:

* 4 games against the other 4 division opponents, [4x4=16 games]

* 4 games against 6 (out-of-division) conference opponents, [4x6=24 games]

* 3 games against the remaining 4 conference teams, [3x4=12 games]

* 2 games against teams in the opposing conference. [2x15=30 games]

The 4/4/3 in-conference split isn't a massive impact when comparing teams within a conference: just 4 games.

The bigger impact is the 52/30 split between games in-conference and in the other conference. We have seen large splits in the quality of teams in the two conferences, the early 00's standing out as extremely strong top tier teams in the West while the top of the East was weaker.

That would only impact the method for the Finals, and perhaps not in a significant way.

Overall, a very interesting article. I wonder if anyone would take it back further than 2005 and do some trending: have things changed over time or not?
3:15 PM Apr 23rd

petesmaluck
Great article Bill. Using the same process, I wonder how different the MLB and NHL's results would be compared to the NBA. I will give it a try.
6:06 PM Apr 22nd

Trailbzr
About Prob & Stat education. My high school listed a Prob & Stat class in its syllabus, but it had been cancelled three years in a row. In fact, I was told that I had been the only student who had even tried to sign up for it in those three years.

I've attended (as an observer) a couple of statistics education seminars, and the fundamental barrier seems to be that HS education has a vocational track for those not going to college and a preparatory track for those that are. The vocational track will teach a real-life skill that can be usefully applied today. The college prep track doesn't. My HS offered a consumer and personal finance mathematics course that taught some useful skills to those completing their education at the HS level; but the college prep students were left woefully unprepared in that area.

Since there aren't HS-level vocations that require Prob and Stat, they won't teach it as an applied course. At the college level, statistics is commonly not even offered as an undergraduate major, so high schools don't teach a prep course in it.
4:29 PM Apr 22nd

bjames
Responding to Solohbet. . .let us start with the fact that .853 to the 7th power is .330. Let's represent that as B B B B B B B (seven straight Bs), with "B" standing for "Bulls". The chance of B B B B B B B is .330.

What then is the chance of B B B B B B C, with "C" standing for a Cavaliers' victory, meaning a 6-1 series? Well, the "C" is less likely than the "B" by the ratio of .147 to .853, so the chance of that particular sequence would be .330 times .147, divided by .853.

However, there are 7 DIFFERENT sequences which can result in a 6-1 series outcome. . .B B B B B B C, or B B B B B C B, or B B B B C B B, or B B B C B B B, or B B C B B B B, or B C B B B B B, or C B B B B B B. Each of those is as likely as the others. So the chance of SOME 6-1 sequence is .330, times .147, divided by .853, TIMES 7. This I think is .396, or whatever I said that it was.

The chance of a 5-2 outcome depends in the same way on how many sequences there are. As it happens there are 21 B/C sequences which will result in 5 Bs and 2 Cs. The chance of a 5-2 outcome, then, is .330, times 21, times .147, divided by .853, times .147 again, divided by .853 again. We do the .147/.853 twice because we are substituting two ".147s" for ".853s". This results in. . .I forget what it is; somewhere around .20.

I hope that helps. It is very useful math to know; it is helpful for figuring the odds of a hitting streak, or a thousand other problems. I've gotten several questions about Bartolo Colon throwing 38 straight strikes the other night, which I haven't had a chance to study. . .it is math like this.

I was once on a panel with Nate Silver, and Nate made the excellent point that American high schools insist on teaching trigonometry and calculus to their brightest math students, but trigonometry and calculus have extremely little utility in the lives of ordinary people, whereas schools for some reason rarely make much effort to teach probability, which is extremely useful in understanding our everyday problems.
2:46 PM Apr 22nd

mvandermast
If a particular outcome has a constant probability p of occurring in any given trial, the chance of its occurring at least 4 times out of 7 is:

35 p^4 - 84 p^5 + 70 p^6 - 20 p^7

My mnemonic for the coefficients is: "Tigers, Tigers, Orioles, Indians." 1935 and 1984 are the first (so far, only) two years that the Tigers have won the World Series while being managed by someone now in the Hall of Fame, as a manager or player (Cochrane and Sparky respectively). Similarly, 1970 is the first and only year that the Orioles were managed to a World Championship by a current HoFer (Weaver), and 1920 is the first year that the Indians won the Series with a HoFer as manager (Speaker).

As long as Steve O'Neill, Hank Bauer, and Mayo Smith remain among the uninducted, this mnemonic will stay valid . . .
11:28 AM Apr 22nd
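The closed form in mvandermast's comment can be checked against the term-by-term binomial sum. This is a sketch, with function names invented for illustration.

```python
from math import comb

def series_by_sum(p):
    """Chance of at least 4 wins in 7 trials, summed term by term."""
    return sum(comb(7, k) * p**k * (1 - p)**(7 - k) for k in range(4, 8))

def series_by_polynomial(p):
    """The same chance, using the closed form 35p^4 - 84p^5 + 70p^6 - 20p^7."""
    return 35 * p**4 - 84 * p**5 + 70 * p**6 - 20 * p**7

# Compare the two calculations at a few single-game probabilities
for p in (0.5, 0.6, 0.853):
    print(f"p={p}: {series_by_sum(p):.4f} vs {series_by_polynomial(p):.4f}")
```

The two functions should agree to within floating-point rounding for any p between 0 and 1, since the polynomial is the same sum multiplied out.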

Trailbzr
solohbet, the binomial probability of getting Y "Yes" results in N trials, each with probability P is:
Comb(N, Y) x P^Y x (1-P)^(N-Y)
Comb(N,Y) = N! / (Y! x (N-Y)!)

So the chance of an .853 event happening 5 times out of 7 is:
(7x6/2) x .853^5 x .147^2 = .2049.
7:21 AM Apr 22nd

solohbet
Understood the math for seven out of seven, which you showed; but not the six, five, or four, which you didn't show.
Any chance you could expand the formula section?
6:41 AM Apr 22nd

Trailbzr
Home teams won 60% in the NBA the last two complete seasons; so multiply the home team's win count by 1.5 to use in the multiplicative formula for an individual game. To have just one formula for a seven-game series, multiply the home court advantage team's wins by 1.07.
4:45 PM Apr 21st
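Trailbzr's game-by-game suggestion (multiply the home team's win count by 1.5 before cross-multiplying) can be sketched in Python. Everything here is an assumption for illustration: the function names, the default 1.5 factor taken from the comment, and the split of four home games and three road games for the team with home-court advantage. It is one reading of the comment, not the article's method, and it does not test the 1.07 shortcut.

```python
def game_prob(wins_a, losses_a, wins_b, losses_b, a_is_home, home_factor=1.5):
    """Single-game chance for team A, scaling the home team's win count."""
    if a_is_home:
        wins_a = wins_a * home_factor
    else:
        wins_b = wins_b * home_factor
    return wins_a * losses_b / (wins_a * losses_b + wins_b * losses_a)

def series_prob(game_probs, need=4):
    """Chance of winning at least `need` games when each game has its own probability."""
    # dist[w] = chance of exactly w wins after the games processed so far
    dist = [1.0]
    for p in game_probs:
        nxt = [0.0] * (len(dist) + 1)
        for w, chance in enumerate(dist):
            nxt[w] += chance * (1 - p)   # lose this game
            nxt[w + 1] += chance * p     # win this game
        dist = nxt
    return sum(dist[need:])

def series_with_home_court(fav, dog, home_factor=1.5):
    """Series chance for the team that hosts four of the seven games."""
    home = game_prob(*fav, *dog, a_is_home=True, home_factor=home_factor)
    road = game_prob(*fav, *dog, a_is_home=False, home_factor=home_factor)
    return series_prob([home] * 4 + [road] * 3)

# Two evenly matched 41-41 teams: the only edge is the extra home game
even_teams = series_with_home_court((41, 41), (41, 41))
print(f"{even_teams:.3f}")
```

With two .500 teams the 1.5 factor gives the home team a 60% single-game chance, matching the league-wide figure in the comment. The order of home and road games does not matter for this calculation, only how many of each there are.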

Robinsong
Great article. I think that one could approximate the home court advantage for a 7 game series (by the way were the first round series 7 games as early as 2005? If not this would affect the math Bill uses). Take the calculated probability that the series will go exactly 7 games (3/7 of the calculated chance of 4-3). Multiply the favorite's probability of winning that 7 game series by 4/3 (since the home team wins 4 out of 7 times in the NBA) - this is the revised favorite's probability of winning if the series goes exactly 7 games. Add it to the chances that the favorite will win if the series goes less than 7 games.

My guess is that the home court advantage is smaller in the playoffs, particularly in 7th games, but this could be checked.
1:42 PM Apr 21st

bobfiore
Another factor that would make the home court advantage less significant in the final round is that the team at a disadvantage gets home court advantage in three of the first five games. Some even suggest that it's more advantageous to have home court/field in the middle three games of a series, which I suppose is the subject for another study.
1:24 PM Apr 21st

ventboys
A couple of things:

- I wouldn't be surprised if the NBA home court advantage was right around the Fibonacci number. It's amazing how often that shows up, isn't it?

- Miami is still, to me, the most likely team to win. Their largest weaknesses, lack of depth and size inside, won't be the factor in the playoffs that they are in the regular season. LeBron has been playing an insane amount of minutes, as he always does, despite all the back to back to backs. He'll get more rest with the playoff schedules. The brackets favor Miami as well. They won't face Boston unless it's in the conference finals, and they match up well with everyone else including Chicago.

- Chicago is formidable, especially defensively, but Rose hasn't been himself all year. They don't have a consistent post up threat, and their depth won't be as helpful in the playoffs, when benches shrink.

- Boston, to me, is the real wild card. They shoot better than anyone, and as long as Garnett is healthy they defend as well as anyone other than Chicago and Miami. It would help them enormously if Miami actually overhauled Chicago and won the top seed. They probably can't beat a determined Chicago team unless Rose keeps going 1-13. They can beat Miami.
1:07 PM Apr 21st
