Single Game Winning Percentages

By Bill James

February 24, 2011

Suppose that a team had a "Winning Percentage" for a single game. Suppose, for that matter, that you had a "Winning Percentage" for a Date, or a Winning Percentage for a Party, or a Winning Percentage for each day’s work. . .but I digress. In real life a team’s winning percentage for a game is either 1.000 or .000, nothing in between, but let me ask you this question: Do all teams that win play equally well? Do all teams that lose play equally badly? If you root for the Texas Rangers, let us say, did the Rangers play equally well last year in every game that they won, or did they play better some days than others? Might they, perhaps, have actually played better in some of their losses than in some of their wins?

I am opening up a discussion here that could, in theory, go on for forty or fifty years. The central question that I am trying to ask is, "Is there a way to measure how well a team has played in a game, on a universal scale?" This question implies thousands of other questions to which I do not know the answers, but to which there might be some way to figure out the answers. Therefore, if people take an interest in the discussion, it’s a discussion that could go on for a very long time.

Let us begin here. Let us suppose that we state a team’s winning percentage for a game as a Pythagorean ratio—that is, as the ratio of the square of their runs scored versus the square of their runs allowed. If a team scores 4 runs and allows 3, their winning percentage for the game is .640. If they score 2 and allow 5, their winning percentage for the game is .138.
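The arithmetic in that paragraph can be sketched in a few lines of Python. The function name is invented for illustration, and a 0-0 score is simply rejected, since the text does not say how such a case would be treated.

```python
def pythagorean_pct(runs, runs_allowed):
    """Single-game Pythagorean ratio: R^2 / (R^2 + RA^2)."""
    total = runs ** 2 + runs_allowed ** 2
    if total == 0:
        # Assumption: with no runs on either side the ratio is undefined.
        raise ValueError("ratio is undefined when neither team scores")
    return runs ** 2 / total


print(round(pythagorean_pct(4, 3), 3))  # 0.64
print(round(pythagorean_pct(2, 5), 3))  # 0.138
```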

It quickly becomes apparent that this doesn’t work. Using that approach, a 1-0 game scores the same as a 20-0 game; the winning team is at 1.000, and the losing team at zero, in either case. There are myriad other problems.

We COULD address this problem by starting with the margin of victory, and stating that on a "9 scale". The two teams score about 9 runs in a game, on average. . .have throughout most of baseball history. We could say that all one-run games are 5-4 (.610 vs. .390), all two-run games are 5.5-3.5 (.712-.288), all three-run games are 6-3 (.800-.200), etc.

That isn’t a terrible approach, but. . .maybe it isn’t the best approach, either. Is a 2-0 game the same as a 12-10 game? Probably not. A 12-10 game, either team can win; quite possibly if the game had ended after seven, the other team would have won. By the Pythagorean approach, 2-0 would be 1.000/.000, whereas 12-10 would be .590/.410.

We could adjust this by adding some number of runs to each team before we do the Pythagorean, treating 2-0 as 5-3 (.735/.265), for example, and 12-10 as 15-13 (.571/.429). That’s not a terrible approach, either, and what I eventually decided to do is derived from that approach.
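A rough sketch of that adjustment, assuming the padding is three runs per team (that figure is inferred from the 2-0 and 12-10 examples; the text only says "some number of runs"). The name `padded_pythagorean_pct` is hypothetical.

```python
def padded_pythagorean_pct(runs, runs_allowed, pad=3):
    """Pythagorean ratio after adding `pad` runs to each team's total.

    pad=3 is an assumption taken from the examples (2-0 treated as 5-3).
    """
    r = runs + pad
    ra = runs_allowed + pad
    return r ** 2 / (r ** 2 + ra ** 2)


print(round(padded_pythagorean_pct(2, 0), 3))    # 0.735
print(round(padded_pythagorean_pct(12, 10), 3))  # 0.571
```

With any positive padding a shutout no longer scores 1.000, and the same two-run margin counts for less as the total scoring goes up.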

What about this. If a team wins a game 2 to 0, you might say that they have played extremely well on the defensive side—played at a 1.000 level on the defensive side—but have not had a particularly good day on the offensive side. Two runs in a game, you usually lose, so that is something less than a .500 performance for that game. Let me jump ahead a little bit, and then I’ll circle back and explain how we got there.

A team for each game has an "Offensive Winning Percentage" and a "Defensive Winning Percentage", based on how many runs they score and how many runs they allow. A team’s Offensive Winning Percentage for a game is:

R² / (R² + 3.6²)

A team’s Defensive Winning Percentage, using RA for Runs Allowed, is:

1.000 minus [RA² / (RA² + 3.6²)]

3.6 squared is 12.96; we only have to figure that once. Taking the 2-0 game again. . .the "Offensive Winning Percentage" for the game is .236:

R² / (R² + 3.6²)

4 / (4 + 12.96) = 4/16.96 = .236

The Defensive Winning Percentage is 1.000:

1.000 minus [RA² / (RA² + 3.6²)]

1.000 minus [0 / ( 0 + 12.96)] = 1 – 0 = 1.000

So the winning percentage of the winning team for a 2-0 game is .236 from the offensive side, but 1.000 from the defensive side. If the game was 12-10, the winning percentage of the winning team would be .917 from the offensive side, but .115 from the defensive side.
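The two formulas translate directly into code. This is a minimal sketch in Python; the function and constant names are made up for illustration.

```python
RUN_CONSTANT = 3.6  # the constant used in the formulas above


def offensive_pct(runs, c=RUN_CONSTANT):
    """Offensive Winning Percentage: R^2 / (R^2 + c^2)."""
    return runs ** 2 / (runs ** 2 + c ** 2)


def defensive_pct(runs_allowed, c=RUN_CONSTANT):
    """Defensive Winning Percentage: 1 - RA^2 / (RA^2 + c^2)."""
    return 1.0 - offensive_pct(runs_allowed, c)


# The 2-0 game from the text
print(round(offensive_pct(2), 3), round(defensive_pct(0), 3))    # 0.236 1.0
# The 12-10 game from the text
print(round(offensive_pct(12), 3), round(defensive_pct(10), 3))  # 0.917 0.115
```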

OK, why 3.6?

3.6 is the number that works. There were 634 major league games in 2010 in which a team scored two runs. The Winning Percentage of those 634 teams was .240 (152-482). Our formula says .236; it was actually .240. Pretty close. If a team scores 4 runs in a game, our formula says that their winning percentage should be .552. It was actually, in the major leagues in 2010, .557 (354-282). If a team scores 6 runs in a game, our formula says that their winning percentage should be .735. It was actually .734 (348-126).

I knew that this number would be somewhere around 3.5, so I checked what the error was if we used 2.4, 2.5, 2.6, 2.7 . . . 4.0, 4.1, trying to locate the minimum error. That was actually kind of interesting; I was expecting to get a U-shaped curve, with the error dropping gradually, flattening out and then gradually increasing, but actually you get what is almost a V-shaped curve, with the error dropping sharply until 3.6, and then increasing sharply above 3.6. Anyway, it’s not a perfect function; it’s one of those heuristics I am so fond of. This chart compares the actual winning percentages to the estimated ones:

RUNS	Wins at that Run Level	Losses at that Run Level	Winning Percentage	Estimated Winning Percentage	Error
20	1	0	1.000	.969	0.03
19	1	0	1.000	.965	0.03
18	4	0	1.000	.962	0.15
17	6	0	1.000	.957	0.26
16	7	0	1.000	.952	0.34
15	13	0	1.000	.946	0.71
14	21	0	1.000	.938	1.30
13	26	0	1.000	.929	1.85
12	50	2	.962	.917	2.29
11	75	6	.926	.903	1.84
10	111	6	.949	.885	7.42
9	155	21	.881	.862	3.28
8	206	19	.916	.832	18.89
7	262	57	.821	.791	9.72
6	348	126	.734	.735	0.53
5	320	196	.620	.659	19.83
4	354	282	.557	.552	2.62
3	256	440	.368	.410	29.25
2	152	482	.240	.236	2.47
1	62	464	.114	.072	22.05
0	0	329	.000	.000	0.00

Total	2430	2430			124.87
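The search for the constant can be sketched roughly as below, using the win-loss counts from the chart. The error measure is an assumption: the Error column appears to be the number of games at each run level multiplied by the gap between the actual and estimated percentages, which is the same as the absolute difference between actual and expected wins. Totals computed this way may differ slightly from the printed column.

```python
# (runs scored, wins, losses) for 2010, copied from the chart above.
RESULTS_2010 = [
    (20, 1, 0), (19, 1, 0), (18, 4, 0), (17, 6, 0), (16, 7, 0),
    (15, 13, 0), (14, 21, 0), (13, 26, 0), (12, 50, 2), (11, 75, 6),
    (10, 111, 6), (9, 155, 21), (8, 206, 19), (7, 262, 57),
    (6, 348, 126), (5, 320, 196), (4, 354, 282), (3, 256, 440),
    (2, 152, 482), (1, 62, 464), (0, 0, 329),
]


def total_error(c, results=RESULTS_2010):
    """Sum over run levels of |actual wins - expected wins| for constant c.

    Assumed error measure, inferred from the chart's Error column.
    """
    error = 0.0
    for runs, wins, losses in results:
        games = wins + losses
        expected_pct = runs ** 2 / (runs ** 2 + c ** 2)
        error += abs(wins - games * expected_pct)
    return error


# Candidate constants 2.4, 2.5, ..., 4.1, the range mentioned in the text.
candidates = [round(2.4 + 0.1 * i, 1) for i in range(18)]
best = min(candidates, key=total_error)

for c in candidates:
    print(f"{c:.1f}  {total_error(c):8.2f}")
print("lowest error at", best)
```

Because the measure sums absolute values, it has kinks wherever the estimate crosses the actual figure at some run level, which is one way to get a curve that looks more like a V than a U.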

OK, you get that or you don’t; I’m moving on. If you are truly following what I am doing, you will see immediately that I am using an un-adjusted figure which, in a finished product, would need to be park- and league-adjusted. The 3.6 represents the median runs level for major league baseball in 2010 (more or less); if runs go up, a higher figure is needed, if runs go down, a lower figure would work best. I don’t know how to park-adjust this or league-adjust it, but that can be worked out later; I don’t want to get bogged down in that right now.

We have an "Offensive Winning Percentage" for each team in each game, and we have a "Defensive Winning Percentage". The next question is, how do we combine these into one?

The Cubs on opening day of the 2010 season lost to the Braves, 16 to 5; bad opener. They scored 5 runs, which is pretty good; 5 runs, as per the chart above, gives you an Offensive Winning Percentage of .659. They allowed 16, however, and allowing 16 runs gives you a Defensive Winning Percentage of .048. How do you combine the two into one?

You could add the two together, and divide by two; offense, .659, defense, .048, add them together, .707, divide by two, .353. . .their Game Winning Percentage is .353.

This, as it turns out, is the wrong answer.

How do we know it’s the wrong answer?

We know that it is the wrong answer because, when you do that for each game and total up the results, it doesn’t match the team win total. If you do that for every game on the Yankees, you get 88 wins. They didn’t win 88; they won 95. If you do that for every game on the Pirates, you get 68 wins. They didn’t win 68; they won 57. If you just average the Offensive and Defensive Winning Percentages for each game and sum up the totals, it gives you a sum which is half-way between the team’s actual winning percentage, and .500.

To put the Offensive and Defensive Winning Percentages together, what you have to do is add the two together, and subtract .500. The Cubs in that game were .659 and .048; add them together, .707, subtract .5, .207. The Cubs’ winning percentage for that game was .207.

Of course, this method creates some games (about 12%) in which one team’s winning percentage would be greater than one, the other team’s less than zero, and we have to eliminate that by fiat, decreeing that a winning percentage greater than one is always treated as 1.000, and one less than zero as .000. A team will have a Winning Percentage of 1.000 for the game if the score is 4 to 0, or greater, or 5 to 1, or greater, or 7 to 2, or greater, or 12 to 3, or greater.
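Putting the pieces together, a minimal Python sketch of the combined figure might look like this. The name `game_pct` is invented for illustration; the clamp keeps the result between .000 and 1.000.

```python
def game_pct(runs, runs_allowed, c=3.6):
    """Single-game winning percentage: offense + defense - .500, clamped."""
    offense = runs ** 2 / (runs ** 2 + c ** 2)
    defense = 1.0 - runs_allowed ** 2 / (runs_allowed ** 2 + c ** 2)
    combined = offense + defense - 0.5
    # Values above 1.000 are treated as 1.000; values below zero as .000.
    return min(1.0, max(0.0, combined))


print(round(game_pct(5, 16), 3))  # Cubs on opening day: 0.207
print(game_pct(4, 0))             # 1.0
print(game_pct(0, 4))             # 0.0
```

Before clamping, the two teams' figures for the same game always add up to 1.000, because each team's offensive term is the complement of the other team's defensive term. Simple averaging, (offense + defense) / 2, stays exactly half as far from .500 as this figure does, which is why the averaged totals land half-way between a team's real record and .500.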

Not that it matters at all—since obviously this method is but a temporary outline to explain the concept—but this chart gives the Single-Game winning percentages, based on the most common scores of a game:

Columns: runs scored by the team. Rows: runs scored by the opponent.
Opponent	0	1	2	3	4	5	6	7	8	9	10
0		.572	.736	.910	1.000	1.000	1.000	1.000	1.000	1.000	1.000
1	.428		.664	.838	.981	1.000	1.000	1.000	1.000	1.000	1.000
2	.264	.336		.674	.817	.923	.999	1.000	1.000	1.000	1.000
3	.090	.162	.326		.643	.749	.825	.881	.922	.952	.975
4	.000	.019	.183	.357		.606	.683	.738	.779	.810	.833
5	.000	.000	.077	.251	.394		.577	.632	.673	.703	.727
6	.000	.000	.001	.175	.317	.423		.556	.596	.627	.650
7	.000	.000	.000	.119	.262	.368	.444		.541	.571	.594
8	.000	.000	.000	.078	.221	.327	.404	.459		.530	.554
9	.000	.000	.000	.048	.190	.297	.373	.429	.470		.523
10	.000	.000	.000	.025	.167	.273	.350	.406	.446	.477	
11	.000	.000	.000	.007	.149	.255	.332	.388	.428	.459	.482
12	.000	.000	.000	.000	.135	.241	.318	.373	.414	.445	.468
13	.000	.000	.000	.000	.124	.230	.307	.362	.403	.433	.456
14	.000	.000	.000	.000	.115	.221	.297	.353	.394	.424	.447
15	.000	.000	.000	.000	.107	.213	.290	.345	.386	.417	.440
16	.000	.000	.000	.000	.101	.207	.283	.339	.380	.410	.433
17	.000	.000	.000	.000	.095	.202	.278	.334	.375	.405	.428
18	.000	.000	.000	.000	.091	.197	.274	.329	.370	.401	.424
19	.000	.000	.000	.000	.087	.193	.270	.325	.366	.397	.420
20	.000	.000	.000	.000	.084	.190	.267	.322	.363	.393	.417

In a 20-19 game, the winning team would have a single-game winning percentage of .503, the losing team of .497. Probably in a perfect system, a better-thought-through system, no team would ever be at 1.000 or at zero, since a team always does something right in the game, and something wrong. It might be that a better approach would have been to simply total up every single thing that each team does right in the game, and each thing they do wrong, and divide by the total; that might have worked better.

With this method, in any case, each team’s Wins on the season tend to be about the same as the sum of their single-game winning percentages. The Phillies actually exceeded the sum of their single-game winning percentages by 8.8 games, but no other team exceeded by more than 4.0, while on the down side the Mariners missed their expectation by 5.8. It’s about as accurate a predictor as the Pythagorean method, but I’d have to run 50 years’ worth of data to say anything with confidence about that.

Our instinct at this point is to begin comparing teams’ actual to expected wins, so I’m going to do that. Let me say before I do that, however, that what’s actually interesting here is the single-game winning percentages. . .NOT the macro data that can be derived from it. The macro data is just going to tell us stuff that we already know; you can do it, but you just "learn" what you already know. So I’ll do that now, but then we’ll return to the unresolved issues in the single-game estimates, which are the heart of the method.

Since we have "offensive winning percentages" and "defensive winning percentages" for each game, we can use this method to determine, at the end of the year, how many games the team has won by its offense, and how many they have won by pitching and defense. This, however, seems only to tell us what we already know. A list of the teams winning the most games by their offense is just a list of the teams scoring the most runs, in almost an identical order, with a little bit of movement by teams that score runs in efficient patterns (4,5,4,5,4,5) as opposed to inefficient patterns (0,9,0,9,0,9). The same with pitching; a list of the teams winning the most games with pitching and defense merely duplicates a list of the best ERAs or the fewest runs allowed.
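The effect of scoring patterns can be illustrated with the two sequences mentioned above. This sketch assumes that "games won by the offense" means the sum of the single-game Offensive Winning Percentages, which is one plausible reading rather than a stated definition; the function names are hypothetical.

```python
def offensive_pct(runs, c=3.6):
    """Offensive Winning Percentage for one game: R^2 / (R^2 + c^2)."""
    return runs ** 2 / (runs ** 2 + c ** 2)


def offensive_wins(run_totals):
    """Assumed measure: sum of single-game offensive percentages."""
    return sum(offensive_pct(r) for r in run_totals)


efficient = [4, 5, 4, 5, 4, 5]    # 27 runs in six games
inefficient = [0, 9, 0, 9, 0, 9]  # also 27 runs in six games

print(round(offensive_wins(efficient), 2))    # 3.63
print(round(offensive_wins(inefficient), 2))  # 2.59
```

Both sequences average 4.5 runs a game, but the evenly spread one is credited with about one more win over six games, because the ninth run in a game adds far less than the fourth.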

It might be interesting to study whether discrepancies on these lists are predictive or random, but the interesting part of this is the single-game percentage itself. To this point in the process we have assumed that the "runs scored" and "runs allowed" are the finest level of granularity that we can work with. Of course this is not true. A team that scores five runs may have scored five runs with four hits and three walks, or they may have scored five runs on twelve hits and seven walks. Has the offense played as well on one occasion as on the other?

Of course they have not. We could thus refine the "Single Game Winning Percentage" by referencing smaller events, and, at some point, yet smaller events. You can replace "runs" with hits, walks and total bases; you can replace hits, walks and total bases with balls, strikes, and hard-hit balls. You can replace or augment outs with good defensive plays. You can factor in timing, screening out or diminishing events that occur after the game has been virtually decided.

In this method as I have outlined it here, the winning team always has a Single-Game Winning Percentage greater than .500; the losing team always less than .500. Is it appropriate to keep this requirement in the system, or would it be better to allow the losing team to have the higher percentage? That’s one of the questions that I haven’t figured out.

There are methods in use now which calculate each player’s value by analyzing the sequence of events; this was the value of the walk at the moment at which the walk was drawn. Those methods, however, rely upon the tautology that a win is defined as a win, and thus simply isolate the local value of a hit, as opposed to the global value of the same kind of hit. They measure hitting plus timeliness, pitching plus timeliness, fielding plus timeliness.

This method is quite different; it un-hooks us from the assumption that all wins are the same, and thus isolates win contributions not by their timing, but by their quality. It asks the question that your mother might have asked. You won, she might say, but how well did you really play?

Much of sabermetrics consists of learning to distinguish between performance and luck. That was why Voros’ realization had such an impact; Voros showed us a crack where luck could be peeled away from performance. I’m not suggesting that this is something like that; I’m sure it isn’t. I’m just suggesting that it is a kind of interesting way to question what we think we know.

COMMENTS (15 Comments, most recent shown first)

Tom071362
A team loses 15-5, and gets a WP of .213. A team loses 20-10, but gets a WP of .417. Both teams lose by 10 runs (and give up a LOT of runs), but one has almost twice the WP that the other does. Neither team deserved anywhere close to a .500 WP, it seems to me (i.e., intuitively, I would think the loser in the second case should have a WP well below .400, much closer to the .213 the 15-5 loser had). I like the concept, though (but I'm not sure how you'd use it).
5:13 PM Mar 4th

Trailbzr
Ritchie, you're 31 days early posting that.
11:48 AM Mar 3rd

Richie
Oh, and they might not show up on your screen in visible font if you haven't signed up for the special $3.25 monthly premium membership.
7:42 PM Mar 1st

Richie
A little something for everybody on this site, mauimike. You'll find the site 'bikini babe' pictures over under "Fun Stuff". Course, you may have to do a little digging...
7:39 PM Mar 1st

mauimike
Damn, when you folks get together and talk this stuff, are you sober? Probably, very. So it goes. Can I smile?
5:41 AM Feb 27th

ScottSegrin
It seems that if you're going to measure how well a team played in a particular game, you need to take more into consideration than just the score of the game. Consider the following two games:

Team A: 4R, 4H, 0E, 1LOB, 2SB, 0CS, vs. Halladay
Team B: 3R, 5H, 0E, 4LOB, 0SB, 0CS

Team C: 4R, 9H, 4E, 11LOB, 0SB, 3CS vs. Suppan
Team D: 3R, 5H, 0E, 9LOB, 0SB, 0CS

I don't think you can say that Teams A and C played equally well. Team A got great pitching and made the most of its offensive opportunities against an All-Star pitcher, while Team C fumbled around, ran themselves out of innings, and almost lost what should have been a sure win.
9:16 PM Feb 26th

Trailbzr
Here's another suggestion on how to research this.

Step One: Using the Retrosheet game files, tabulate over many seasons "What was the average record of a 4-3 winner in their other 161 games?" Since both bad teams and good teams win games 4-3, the average record is probably only about .501; whereas 10-2 winners might be .575 on average in their other games.

Step Two: Using the actual W-L records of all seasons used in the above, randomly pull a VERY large number of 20-game sets, and ask "What is the average record in the other 142 games of teams that went 13-7?"

By matching up the pairs of other-game records, you can make statements like "The predictive value of winning 6-2 is the same as the value of going 14-6, so is comparable to a .700 percentage."

5:04 PM Feb 25th

ventboys
This principle, applied to football, might have even more predictive value than in baseball. The bounces just don't even out in football.
1:10 PM Feb 25th

MarisFan61
Sort of following up on John Durkee's post....

Two things occur to me that aren't taken into account in this, one of which I'm afraid will perhaps seem mythical and the other irrelevant, but which many people might feel relate to "playing well" or "not playing well."

1. The thing that may seem mythical to many: "Pitching to the scoreboard," and more generally, playing to the scoreboard. Over the course of a season, these things probably mostly 'even out' to the point that they may not be significant, but in any given game, they can be significant, and can make the margin of victory or defeat misleading regarding "how the teams played."

2. The method assumes that the *result* of each play is the only indicator of how the players played. In general with sabermetrics, that's fine (need I say), but I think when you talk about something like "how a team played," I think this is a significant quibble. Man on 3rd, 2 out, Willie McCovey hits a line drive toward right field, Richardson catches it, game over, McCovey's team loses the World Series. Did McCovey not "play well" on that play? Luis Gonzalez barely gets his bat on the ball, bloops it to left field, his team wins the game and the World Series. Did Gonzalez play better on that play than McCovey had? If you're looking at small samples like single games, I think this factor is relevant, although I guess it's unlikely to be one that sabermetrics would recognize and might even consider absurd.

As with some other of my criticisms, like about the "Brooks Robinson Tournament," I suppose my point is a semantic one; just don't put it at all in terms of "how the team played," and there's no issue.
12:18 PM Feb 25th

rpriske
Is it true that a team always does SOMETHING wrong in a game?

Making outs, I guess.
9:16 AM Feb 25th

jdurkee
I think the problem with this method is that it confuses playing "well" with winning. They are not the same. Isn't it possible that both teams can play "well," but one of them has to lose? Surely, we know that a team can play like crap and still win, if the other team plays crappier.

I don't believe this is a method describing excellence (playing "well"), merely describing winning. And we already have a method for that -- counting the wins at the end of the season.

To produce a method describing excellence, one would first have to define what is excellence. Perhaps designate certain plays as representing excellence (a strikeout, a turned DP, an out made that was unexpected by probability, etc.). Then count the number of same.

Finally, I am not sure that game score reveals excellence. We have all seen too many 1 - 0 games where neither side deserved to win.

Course I could bee worng...
11:26 PM Feb 24th

bjames
As long as you're talking about constructing a player evaluation tool based on this, you've totally missed the point.
10:54 PM Feb 24th

Trailbzr
When you originally described this two(?) years ago, I suggested you could study it through the Johnson Run Produced formula. If you ignore the .16 in the front, it awards 2 points for a walk, 3 for a single, 2 for an extra base, and you could assign additional points for basepath events and errors, to get a total measure of what each team did on offense and defense.

Then map each run total into its Johnson point distribution; and each Johnson point total into its run distribution. So you'd say something like "a team that scores three runs did it with 10-15 Johnson points 2% of the time; and teams with 10-15 Johnson points scored 0 runs 5% of the time, 1 run 10% of the time... 6 runs 0.5%", and hence construct a measure of "what distribution of scores results from the set of all teams that played as well as a team that scores three runs."

Then construct a W-L% from that.
10:20 PM Feb 24th

MattGoodrich
I use all kinds of sabermetric formulas and concepts with my softball team's stats, resulting in much head shaking and eye rolling from my teammates. I can only imagine the response if I threw this at them.
9:38 PM Feb 24th

boutilij
Interesting. If the single game offensive/defensive winning percentages could be broken down far enough to isolate each player's contribution, you'd also have another alternative to win shares and loss shares - or am I missing something?
8:31 PM Feb 24th
