Suppose that a team had a "Winning Percentage" for a single game. Suppose, for that matter, that you had a "Winning Percentage" for a Date, or a Winning Percentage for a Party, or a Winning Percentage for each day’s work. . .but I digress. In real life a team’s winning percentage for a game is either 1.000 or .000, nothing in between, but let me ask you this question: Do all teams that win play equally well? Do all teams that lose play equally badly? If you root for the Texas Rangers, let us say, did the Rangers play equally well last year in every game that they won, or did they play better some days than others? Might they, perhaps, have actually played better in some of their losses than in some of their wins?
I am opening up a discussion here that could, in theory, go on for forty or fifty years. The central question that I am trying to ask is, "Is there a way to measure how well a team has played in a game, on a universal scale?" This question implies thousands of other questions to which I do not know the answers, but to which there might be some way to figure out the answers. Therefore, if people take an interest in the discussion, it’s a discussion that could go on for a very long time.
Let us begin here. Let us suppose that we state a team’s winning percentage for a game as a Pythagorean ratio—that is, as the ratio of the square of their runs scored versus the square of their runs allowed. If a team scores 4 runs and allows 3, their winning percentage for the game is .640. If they score 2 and allow 5, their winning percentage for the game is .138.
It quickly becomes apparent that this doesn’t work. Using that approach, a 1-0 game scores the same as a 20-0 game; the winning team is at 1.000, and the losing team at zero, in either case. There are myriad other problems.
We COULD address this problem by starting with the margin of victory, and stating that on a "9 scale". The two teams score about 9 runs in a game, on average, and have throughout most of baseball history. We could say that all one-run games are 5-4 (.610 vs. .390), all two-run games are 5.5-3.5 (.712 vs. .288), all three-run games are 6-3 (.800 vs. .200), etc.
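A minimal sketch of the "9 scale" idea in Python. The function names are illustrative, not part of the original method, and the fixed total of 9 runs is simply the round figure used in the text.

```python
def pythagorean(rs, ra):
    """Pythagorean ratio: rs^2 / (rs^2 + ra^2)."""
    return rs ** 2 / (rs ** 2 + ra ** 2)


def nine_scale(margin, total=9.0):
    """Spread a margin of victory over a fixed run total, then take the ratio.

    Returns (winner's percentage, loser's percentage).
    """
    winner_runs = (total + margin) / 2
    loser_runs = (total - margin) / 2
    win_pct = pythagorean(winner_runs, loser_runs)
    return win_pct, 1.0 - win_pct


for margin in (1, 2, 3):
    winner, loser = nine_scale(margin)
    print(f"{margin}-run game: {winner:.3f} vs. {loser:.3f}")
```

Note that under this scheme only the margin matters, so a 2-0 game and a 12-10 game come out identical.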
That isn’t a terrible approach, but. . .maybe it isn’t the best approach, either. Is a 2-0 game the same as a 12-10 game? Probably not. A 12-10 game, either team can win; quite possibly if the game had ended after seven, the other team would have won. By the Pythagorean approach, 2-0 would be 1.000/.000, whereas 12-10 would be .590/.410.
We could adjust this by adding some number of runs to each team before we do the Pythagorean, treating 2-0 as 5-3 (.735/.265), for example, and 12-10 as 15-13 (.571/.429). That’s not a terrible approach, either, and what I eventually decided to do is derived from that approach.
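The padding idea can be sketched in a few lines of Python. The function name and the `pad` parameter are assumptions made for illustration; the value of 3 added runs is taken from the two examples in the text (2-0 treated as 5-3, 12-10 as 15-13).

```python
def padded_pythagorean(rs, ra, pad=0.0):
    """Pythagorean ratio after adding `pad` runs to each team's score."""
    rs, ra = rs + pad, ra + pad
    return rs ** 2 / (rs ** 2 + ra ** 2)


for rs, ra in ((2, 0), (12, 10)):
    raw = padded_pythagorean(rs, ra)
    padded = padded_pythagorean(rs, ra, pad=3)
    print(f"{rs}-{ra}: raw {raw:.3f}, padded {padded:.3f}")
# 2-0: raw 1.000, padded 0.735
# 12-10: raw 0.590, padded 0.571
```

With `pad=0` the function is the plain Pythagorean ratio, which is undefined for a 0-0 score; any positive pad removes that problem.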
What about this? If a team wins a game 2 to 0, you might say that they have played extremely well on the defensive side (played at a 1.000 level on the defensive side) but have not had a particularly good day on the offensive side. Score two runs in a game and you usually lose, so that is something less than a .500 performance for that game. Let me jump ahead a little bit, and then I’ll circle back and explain how we got there.
A team for each game has an "Offensive Winning Percentage" and a "Defensive Winning Percentage", based on how many runs they score and how many runs they allow. A team’s Offensive Winning Percentage for a game is:
R^2 / (R^2 + 3.6^2)
A team’s Defensive Winning Percentage, using RA for Runs Allowed, is:
1.000 minus [RA^2 / (RA^2 + 3.6^2)]
3.6 squared is 12.96; we only have to figure that once. Taking the 2-0 game again. . .the "Offensive Winning Percentage" for the game is .236:
R^2 / (R^2 + 3.6^2)
4 / (4 + 12.96) = 4 / 16.96 = .236
The Defensive Winning Percentage is 1.000:
1.000 minus [RA^2 / (RA^2 + 3.6^2)]
1.000 minus [0 / (0 + 12.96)] = 1 - 0 = 1.000
So the winning percentage of the winning team for a 2-0 game is .236 from the offensive side, but 1.000 from the defensive side. If the game was 12-10, the winning percentage of the winning team would be .917 from the offensive side, but .115 from the defensive side.
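The two formulas translate directly into code. This is a sketch in Python; the function names and the constant name `K` are illustrative choices, while the value 3.6 comes from the text.

```python
K = 3.6  # constant from the text, tuned to 2010 major league scoring


def offensive_wpct(runs, k=K):
    """Offensive Winning Percentage: R^2 / (R^2 + k^2)."""
    return runs ** 2 / (runs ** 2 + k ** 2)


def defensive_wpct(runs_allowed, k=K):
    """Defensive Winning Percentage: 1 minus the same ratio on runs allowed."""
    return 1.0 - offensive_wpct(runs_allowed, k)


for runs, runs_allowed in ((2, 0), (12, 10)):
    off = offensive_wpct(runs)
    dfn = defensive_wpct(runs_allowed)
    print(f"{runs}-{runs_allowed}: offense {off:.3f}, defense {dfn:.3f}")
# 2-0: offense 0.236, defense 1.000
# 12-10: offense 0.917, defense 0.115
```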
OK, why 3.6?
3.6 is the number that works. There were 634 major league games in 2010 in which a team scored two runs. The Winning Percentage of those 634 teams was .240 (152-482). Our formula says .236; it was actually .240. Pretty close. If a team scores 4 runs in a game, our formula says that their winning percentage should be .552. It was actually, in the major leagues in 2010, .557 (354-282). If a team scores 6 runs in a game, our formula says that their winning percentage should be .735. It was actually .734 (348-126).
I knew that this number would be somewhere around 3.5, so I checked what the error was if we used 2.4, 2.5, 2.6, 2.7 . . . 4.0, 4.1, trying to locate the minimum error. That was actually kind of interesting; I was expecting to get a U-shaped curve, with the error dropping gradually, flattening out and then gradually increasing, but actually you get what is almost a V-shaped curve, with the error dropping sharply until 3.6, and then increasing sharply above 3.6. Anyway, it’s not a perfect function; it’s one of those heuristics I am so fond of. This chart compares the actual winning percentages to the estimated ones:
| Runs | Wins at that Run Level | Losses at that Run Level | Winning Percentage | Estimated Winning Percentage | Error |
| --- | --- | --- | --- | --- | --- |
| 20 | 1 | 0 | 1.000 | .969 | 0.03 |
| 19 | 1 | 0 | 1.000 | .965 | 0.03 |
| 18 | 4 | 0 | 1.000 | .962 | 0.15 |
| 17 | 6 | 0 | 1.000 | .957 | 0.26 |
| 16 | 7 | 0 | 1.000 | .952 | 0.34 |
| 15 | 13 | 0 | 1.000 | .946 | 0.71 |
| 14 | 21 | 0 | 1.000 | .938 | 1.30 |
| 13 | 26 | 0 | 1.000 | .929 | 1.85 |
| 12 | 50 | 2 | .962 | .917 | 2.29 |
| 11 | 75 | 6 | .926 | .903 | 1.84 |
| 10 | 111 | 6 | .949 | .885 | 7.42 |
| 9 | 155 | 21 | .881 | .862 | 3.28 |
| 8 | 206 | 19 | .916 | .832 | 18.89 |
| 7 | 262 | 57 | .821 | .791 | 9.72 |
| 6 | 348 | 126 | .734 | .735 | 0.53 |
| 5 | 320 | 196 | .620 | .659 | 19.83 |
| 4 | 354 | 282 | .557 | .552 | 2.62 |
| 3 | 256 | 440 | .368 | .410 | 29.25 |
| 2 | 152 | 482 | .240 | .236 | 2.47 |
| 1 | 62 | 464 | .114 | .072 | 22.05 |
| 0 | 0 | 329 | .000 | .000 | 0.00 |
| Total | 2430 | 2430 | | | 124.87 |
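The search for the constant can be sketched as a simple grid search in Python. The win and loss counts are copied from the chart above. The error measure here (games at each run level times the gap between actual and estimated winning percentage) is my reading of the Error column and is an assumption; recomputed figures may differ slightly from the chart, which appears to work from rounded percentages.

```python
# (runs, wins, losses) at each run level, 2010, copied from the chart above
RESULTS_2010 = [
    (20, 1, 0), (19, 1, 0), (18, 4, 0), (17, 6, 0), (16, 7, 0),
    (15, 13, 0), (14, 21, 0), (13, 26, 0), (12, 50, 2), (11, 75, 6),
    (10, 111, 6), (9, 155, 21), (8, 206, 19), (7, 262, 57), (6, 348, 126),
    (5, 320, 196), (4, 354, 282), (3, 256, 440), (2, 152, 482),
    (1, 62, 464), (0, 0, 329),
]


def total_error(k, results=RESULTS_2010):
    """Sum over run levels of |actual wins - games * estimated percentage|.

    Assumed error measure; equivalent to games * |actual pct - estimated pct|.
    """
    error = 0.0
    for runs, wins, losses in results:
        games = wins + losses
        estimated = runs ** 2 / (runs ** 2 + k ** 2)
        error += abs(wins - games * estimated)
    return error


# Candidate constants 2.4, 2.5, ... 4.1, as described in the text
candidates = [round(2.4 + 0.1 * i, 1) for i in range(18)]
best_k = min(candidates, key=total_error)

for k in candidates:
    print(f"{k:.1f}  {total_error(k):7.2f}")
print("lowest error at", best_k)
```

Printing the whole column of errors, rather than only the minimum, makes it possible to see the shape of the curve around the best value.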
OK, you get that or you don’t; I’m moving on. If you are truly following what I am doing, you will see immediately that I am using an un-adjusted figure which, in a finished product, would need to be park- and league-adjusted. The 3.6 represents the median runs level for major league baseball in 2010 (more or less); if runs go up, a higher figure is needed, if runs go down, a lower figure would work best. I don’t know how to park-adjust this or league-adjust it, but that can be worked out later; I don’t want to get bogged down in that right now.
We have an "Offensive Winning Percentage" for each team in each game, and we have a "Defensive Winning Percentage". The next question is, how do we combine these into one?
The Cubs on opening day of the 2010 season lost to the Braves, 16 to 5; bad opener. They scored 5 runs, which is pretty good; 5 runs, as per the chart above, gives you an Offensive Winning Percentage of .659. They allowed 16, however, and allowing 16 runs gives you a Defensive Winning Percentage of .048. How do you combine the two into one?
You could add the two together, and divide by two; offense, .659, defense, .048, add them together, .707, divide by two, .353. . .their Game Winning Percentage is .353.
This, as it turns out, is the wrong answer.
How do we know it’s the wrong answer?
We know that it is the wrong answer because, when you do that for each game and total up the results, it doesn’t match the team win total. If you do that for every game the Yankees played, you get 88 wins. They didn’t win 88; they won 95. If you do that for every game the Pirates played, you get 68 wins. They didn’t win 68; they won 57. If you just average the Offensive and Defensive Winning Percentages for each game and sum up the totals, it gives you a sum which is half-way between the team’s actual winning percentage and .500.
To put the Offensive and Defensive Winning Percentages together, what you have to do is add the two together, and subtract .500. The Cubs in that game were .659 and .048; add them together, .707, subtract .5, .207. The Cubs’ winning percentage for that game was .207.
Of course, this method creates some games (about 12%) in which one team’s winning percentage would be greater than one and the other team’s less than zero, and we have to eliminate that by fiat, decreeing that a winning percentage greater than one is always treated as 1.000 and one less than zero as .000. A team will have a Winning Percentage of 1.000 for the game if the score is 4 to 0, or greater, or 5 to 1, or greater, or 7 to 2, or greater, or 12 to 3, or greater.
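Both ways of combining the two components can be sketched in Python. The function names are illustrative, not from the original; the constant 3.6 and the clamping to the 0-to-1 range follow the text.

```python
K = 3.6  # constant from the text


def offensive_wpct(runs, k=K):
    return runs ** 2 / (runs ** 2 + k ** 2)


def defensive_wpct(runs_allowed, k=K):
    return 1.0 - offensive_wpct(runs_allowed, k)


def averaged_wpct(runs, runs_allowed):
    """The rejected approach: the mean of the two components."""
    return (offensive_wpct(runs) + defensive_wpct(runs_allowed)) / 2


def game_wpct(runs, runs_allowed):
    """Add the two components, subtract .500, then clamp to [0, 1]."""
    raw = offensive_wpct(runs) + defensive_wpct(runs_allowed) - 0.5
    return min(1.0, max(0.0, raw))


print(f"{averaged_wpct(5, 16):.3f}")  # Cubs, lost 16-5: 0.353
print(f"{game_wpct(5, 16):.3f}")      # same game: 0.207
print(f"{game_wpct(4, 0):.3f}")       # raw value exceeds 1, clamped: 1.000
```

Before clamping, the averaged figure is always exactly half as far from .500 as the add-and-subtract figure, because (O + D)/2 - .5 = ((O + D - .5) - .5)/2. That is why season totals built from the average land half-way between the real winning percentage and .500.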
Not that it matters at all—since obviously this method is but a temporary outline to explain the concept—but this chart gives the Single-Game winning percentages, based on the most common scores of a game:
| Opponent Scores \ Team Scores | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | | .572 | .736 | .910 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 1 | .428 | | .664 | .838 | .981 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 2 | .264 | .336 | | .674 | .817 | .923 | .999 | 1.000 | 1.000 | 1.000 | 1.000 |
| 3 | .090 | .162 | .326 | | .643 | .749 | .825 | .881 | .922 | .952 | .975 |
| 4 | .000 | .019 | .183 | .357 | | .606 | .683 | .738 | .779 | .810 | .833 |
| 5 | .000 | .000 | .077 | .251 | .394 | | .577 | .632 | .673 | .703 | .727 |
| 6 | .000 | .000 | .001 | .175 | .317 | .423 | | .556 | .596 | .627 | .650 |
| 7 | .000 | .000 | .000 | .119 | .262 | .368 | .444 | | .541 | .571 | .594 |
| 8 | .000 | .000 | .000 | .078 | .221 | .327 | .404 | .459 | | .530 | .554 |
| 9 | .000 | .000 | .000 | .048 | .190 | .297 | .373 | .429 | .470 | | .523 |
| 10 | .000 | .000 | .000 | .025 | .167 | .273 | .350 | .406 | .446 | .477 | |
| 11 | .000 | .000 | .000 | .007 | .149 | .255 | .332 | .388 | .428 | .459 | .482 |
| 12 | .000 | .000 | .000 | .000 | .135 | .241 | .318 | .373 | .414 | .445 | .468 |
| 13 | .000 | .000 | .000 | .000 | .124 | .230 | .307 | .362 | .403 | .433 | .456 |
| 14 | .000 | .000 | .000 | .000 | .115 | .221 | .297 | .353 | .394 | .424 | .447 |
| 15 | .000 | .000 | .000 | .000 | .107 | .213 | .290 | .345 | .386 | .417 | .440 |
| 16 | .000 | .000 | .000 | .000 | .101 | .207 | .283 | .339 | .380 | .410 | .433 |
| 17 | .000 | .000 | .000 | .000 | .095 | .202 | .278 | .334 | .375 | .405 | .428 |
| 18 | .000 | .000 | .000 | .000 | .091 | .197 | .274 | .329 | .370 | .401 | .424 |
| 19 | .000 | .000 | .000 | .000 | .087 | .193 | .270 | .325 | .366 | .397 | .420 |
| 20 | .000 | .000 | .000 | .000 | .084 | .190 | .267 | .322 | .363 | .393 | .417 |
In a 20-19 game, the winning team would have a single-game winning percentage of .503, the losing team of .497. Probably in a perfect system, a better-thought-through system, no team would ever be at 1.000 or at zero, since a team always does something right in the game, and something wrong. It might be that a better approach would have been to simply total up every single thing that each team does right in the game, and each thing they do wrong, and divide by the total; that might have worked better.
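The chart of single-game winning percentages can be regenerated from the formulas given earlier. This Python sketch uses illustrative function names; tied scores are left empty, since a finished game has no ties.

```python
K = 3.6  # constant from the text


def game_wpct(runs, runs_allowed, k=K):
    """Offensive plus defensive percentage, minus .500, clamped to [0, 1]."""
    offense = runs ** 2 / (runs ** 2 + k ** 2)
    defense = 1.0 - runs_allowed ** 2 / (runs_allowed ** 2 + k ** 2)
    return min(1.0, max(0.0, offense + defense - 0.5))


def grid(max_team=10, max_opp=20):
    """Rows are opponent scores, columns are team scores; ties are None."""
    return [
        [None if team == opp else game_wpct(team, opp)
         for team in range(max_team + 1)]
        for opp in range(max_opp + 1)
    ]


for opp, row in enumerate(grid()):
    cells = ["  -  " if value is None else f"{value:.3f}" for value in row]
    print(f"{opp:>2} | " + " ".join(cells))

# The 20-19 game mentioned in the text
print(f"{game_wpct(20, 19):.3f} / {game_wpct(19, 20):.3f}")  # 0.503 / 0.497
```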
With this method, in any case, each team’s Wins on the season tend to be about the same as the sum of their single-game winning percentages. The Phillies actually exceeded the sum of their single-game winning percentages by 8.8 games, but no other team exceeded by more than 4.0, while on the down side the Mariners missed their expectation by 5.8. It’s about as accurate a predictor as the Pythagorean method, but I’d have to run 50 years’ worth of data to say anything with confidence about that.
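Comparing actual wins to the sum of single-game winning percentages is a short loop. In this Python sketch the function names are illustrative and the five scores are hypothetical, made up only to show the mechanics; they are not real game results.

```python
K = 3.6  # constant from the text


def game_wpct(runs, runs_allowed, k=K):
    """Offensive plus defensive percentage, minus .500, clamped to [0, 1]."""
    offense = runs ** 2 / (runs ** 2 + k ** 2)
    defense = 1.0 - runs_allowed ** 2 / (runs_allowed ** 2 + k ** 2)
    return min(1.0, max(0.0, offense + defense - 0.5))


def expected_wins(games):
    """Sum of single-game winning percentages over (runs, runs_allowed) pairs."""
    return sum(game_wpct(runs, allowed) for runs, allowed in games)


# Hypothetical five-game stretch: (runs scored, runs allowed). Not real data.
sample = [(5, 16), (4, 0), (2, 3), (7, 6), (1, 2)]

actual_wins = sum(1 for runs, allowed in sample if runs > allowed)
print(f"actual wins: {actual_wins}, expected: {expected_wins(sample):.2f}")
```

Run over a full season's scores for one team, the difference between the two printed numbers is the kind of gap quoted above for the Phillies and the Mariners.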
Our instinct at this point is to begin comparing teams’ actual to expected wins, so I’m going to do that. Let me say before I do that, however, that what’s actually interesting here is the single-game winning percentages. . .NOT the macro data that can be derived from it. The macro data is just going to tell us stuff that we already know; you can do it, but you just "learn" what you already know. So I’ll do that now, but then we’ll return to the unresolved issues in the single-game estimates, which are the heart of the method.
Since we have "offensive winning percentages" and "defensive winning percentages" for each game, we can use this method to determine, at the end of the year, how many games the team has won by its offense, and how many they have won by pitching and defense. This, however, seems only to tell us what we already know. A list of the teams winning the most games by their offense is just a list of the teams scoring the most runs, in almost an identical order, with a little bit of movement by teams that score runs in efficient patterns (4,5,4,5,4,5) as opposed to inefficient patterns (0,9,0,9,0,9). The same with pitching; a list of the teams winning the most games with pitching and defense merely duplicates a list of the best ERAs or the fewest runs allowed.
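The point about efficient and inefficient scoring patterns is easy to check numerically. This Python sketch (function names illustrative) applies the Offensive Winning Percentage formula to the two six-game patterns from the text, each totaling 27 runs.

```python
K = 3.6  # constant from the text


def offensive_wpct(runs, k=K):
    """Offensive Winning Percentage: R^2 / (R^2 + k^2)."""
    return runs ** 2 / (runs ** 2 + k ** 2)


def offensive_wins(run_totals):
    """Games 'won by the offense': sum of per-game offensive percentages."""
    return sum(offensive_wpct(runs) for runs in run_totals)


steady = [4, 5, 4, 5, 4, 5]
streaky = [0, 9, 0, 9, 0, 9]

print(sum(steady), f"{offensive_wins(steady):.2f}")    # 27 3.63
print(sum(streaky), f"{offensive_wins(streaky):.2f}")  # 27 2.59
```

The same 27 runs are worth about one more offensive win in six games when they are spread evenly.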
It might be interesting to study whether discrepancies on these lists are predictive or random, but the interesting part of this is the single-game percentage itself. To this point in the process we have assumed that the "runs scored" and "runs allowed" are the finest level of granularity that we can work with. Of course this is not true. A team that scores five runs may have scored five runs with four hits and three walks, or they may have scored five runs on twelve hits and seven walks. Has the offense played as well on one occasion as on the other?
Of course they have not. We could thus refine the "Single Game Winning Percentage" by referencing smaller events, and, at some point, yet smaller events. You can replace "runs" with hits, walks and total bases; you can replace hits, walks and total bases with balls, strikes, and hard-hit balls. You can replace or augment outs with good defensive plays. You can factor in timing, screening out or diminishing events that occur after the game has been virtually decided.
In this method as I have outlined it here, the winning team always has a Single-Game Winning Percentage greater than .500; the losing team always less than .500. Is it appropriate to keep this requirement in the system, or would it be better to allow the losing team to have the higher percentage? That’s one of the questions that I haven’t figured out.
There are methods in use now which calculate each player’s value by analyzing the sequence of events; this was the value of the walk at the moment at which the walk was drawn. Those methods, however, rely upon the tautology that a win is defined as a win, and thus simply isolate the local value of a hit, as opposed to the global value of the same kind of hit. They measure hitting plus timeliness, pitching plus timeliness, fielding plus timeliness.
This method is quite different; it un-hooks us from the assumption that all wins are the same, and thus isolates win contributions not by their timing, but by their quality. It asks the question that your mother might have asked. You won, she might say, but how well did you really play?
Much of sabermetrics consists of learning to distinguish between performance and luck. That was why Voros’ realization had such an impact; Voros showed us a crack where luck could be peeled away from performance. I’m not suggesting that this is something like that; I’m sure it isn’t. I’m just suggesting that it is a kind of interesting way to question what we think we know.