Opposition Adjusted Winning Percentage

By Bill James

June 11, 2019

This is just kind of a stupid little thing; I was just messing around with the data, not expecting to find anything really interesting, and probably I found even less than I expected. But my thought was, what if you "weighted" each win and loss for a starting pitcher by the quality of the opponent? In other words, if a pitcher gets a win against a team which wins 100 games, we count that as 100 wins, whereas if he wins a game against a team which wins only 60 games, we count that as only 60 wins—and vice versa, if he loses a game to a bad opponent, a team which lost 100 games, then we count that as 100 losses, whereas if he loses a game to a good team, a team which lost only 60 games, then we count that as only 60 losses. How would that change the winning percentages?
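
The weighting described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the tuple layout, the function name, and the two-game example are hypothetical, and the raw season win and loss totals are used as weights exactly as the paragraph describes, with no correction for seasons of different lengths.

```python
from typing import Iterable, Tuple

# Hypothetical layout: (pitcher_won, opponent_season_wins, opponent_season_losses)
Decision = Tuple[bool, int, int]


def adjusted_winning_percentage(decisions: Iterable[Decision]) -> float:
    """Weight each win by the opponent's season wins and each loss
    by the opponent's season losses, then take wins / (wins + losses)."""
    weighted_wins = 0
    weighted_losses = 0
    for pitcher_won, opp_wins, opp_losses in decisions:
        if pitcher_won:
            weighted_wins += opp_wins      # beating a 100-win team counts as 100
        else:
            weighted_losses += opp_losses  # losing to a 60-loss team counts as 60
    total = weighted_wins + weighted_losses
    if total == 0:
        raise ValueError("no decisions to weight")
    return weighted_wins / total


# Made-up example: a win over a 100-62 team and a loss to a 102-60 team.
example = [(True, 100, 62), (False, 102, 60)]
print(round(adjusted_winning_percentage(example), 3))  # 100 / (100 + 60)
```

A 1-1 pitcher is .500 in the ordinary count; in this made-up case both decisions came against strong teams, so the weighted figure comes out at .625.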

The one interesting thing that turns up from this study is that the pitcher who gains the most in a single season and the pitcher who gains the most in a career turn out to be the same pitcher: Alex Kellner. Alex Kellner won 20 games as a rookie for the 1949 Philadelphia A’s, a fluke season, and then led the league in losses the next two seasons, going 8-20 and 11-14; not often you could lead the league in losses with 14, but he did. He hung around for a long time after that, and pitched well for the 1958 Cincinnati Reds, going 7-3 with a 2.30 ERA for them in seven starts and eleven relief appearances.

Anyway, Kellner in 1950 was 8-20, but 8-19 as a starting pitcher. However, in my data he is only 8-14, since I am missing some of his games from my log. In the games I have he was 8-14 as a starter (.364), but if you weight the wins and losses for the quality of competition he improves to .452, an 88-point improvement. (The missing data does not cause the improvement; those games are missing both from the "natural" and the "quality adjusted" winning percentage. As far as we know, they don’t exist.)

Anyway, when we figure the CAREER records, Alex Kellner again is the pitcher who improves the most. Kellner had a career record of 92-108, a .460 winning percentage, although in my data it is .459 (84-99). But when you adjust for the quality of competition that he faced, his winning percentage improves to .495, a 36-point improvement.

Trying to figure out logically whether this adjustment would help pitchers on bad teams and hurt pitchers on good teams, I couldn’t figure it out, but when you check the data, it obviously does; all of the pitchers who are helped the most in their career winning percentages pitched mostly for bad teams, while those who are hurt the most pitched mostly for good teams.

What you may notice is that these numbers are not that large. The questions I was asking were things like "Are there any pitchers who might be in the Hall of Fame if this had been factored in?" and "Are there any Cy Young Awards that might have gone to a different pitcher if this had been factored in?" But a 36-point pickup in winning percentage is not huge, and it is larger than anyone else had; besides, the largest pickups are for guys who had 100-200 career decisions. Hall of Fame candidates need to have 350-500 decisions. The gains and losses in that range are not that large. The largest gain for anyone who might be looked upon as a Hall of Fame candidate is 14 points, for Billy Pierce.

Billy Pierce, 1950s White Sox pitcher . . . he was a little guy, and a two-pitch pitcher. He just threw a fastball and a slider. Well, early in his career he threw a big curve, but the big curve led to a lot of walks, and he "added" a slider. Mostly the slider replaced the curve, and (according to contemporary accounts) he mostly just lived off of those two pitches, although he would later claim that he threw all three.

Anyway, Pierce was a tremendous pitcher for a good long time. He had a .553 career winning percentage as a starter, .554 in my data, but if you adjust for the quality of competition, it was .568, a 14-point jump. Baseball Reference confirms that he did do the bulk of his pitching against quality teams. The two teams that he started against most often in his career, the two teams against which he had the most starts, decisions, and innings, were the Yankees and the Cleveland Indians, the two best teams in the American League in his era. In his career he started 187 games against sub-.500 teams, going 102-56 against them, but started 245 times against .500+ teams, going 109-113.

You can compare that to, let’s say, Jim Bunning; Pierce’s career winning percentage was 17 points higher than Bunning’s against sub-.500 teams (.646 to .629), and 20 points higher against .500-or-better teams (.491 to .471), but, because Pierce pitched most often against good teams whereas Bunning had 258 starts against sub-.500 teams, 261 against .500-or-better teams, their career winning percentages are almost the same. It’s relevant; I don’t know if it puts Pierce over the Hall of Fame line or not, but it counts a little bit. Perhaps a more interesting one is that Pierce and Don Drysdale have almost the same career won-lost records, 211-169 for Pierce, 209-166 for Drysdale—and almost exactly the same splits against good teams and sub-.500 teams. Pierce was 102-56 against sub-.500 teams; Drysdale was 100-55. Pierce was 109-113 against .500 or better teams; Drysdale was 109-111.

Anyway, the one Cy Young Award which seems like it might have gone the other way, had the voters been aware of this, is also Billy Pierce. In 1957 Billy Pierce and Warren Spahn both went 20-11 as starting pitchers. Both pitched a few times in relief, 3 times for Pierce, 4 for Spahn; Pierce took a loss as a reliever, making him 20-12, while Spahn picked up a win, making him 21-11. That small difference, in the world in which the Cy Young Award rested very heavily on won-lost records, may well have made Spahn the Cy Young Award winner. Possibly not; Spahn won 15 out of 16 votes, Pierce got none. Anyway, back to the thesis; Pierce was 20-11 as a starter, but adjusting for the quality of the competition, he goes up to .703, while Spahn, also 20-11, goes down to .622, giving Pierce an advantage of 81 points, or 2 ½ games. If Pierce had gone 22-10, let’s say, and Spahn had been 19-12—a 2 ½ game difference—Pierce quite certainly would have won the award. So I think you can argue that this fact, had it been known at the time, might have given Pierce the award.
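
The conversion from 81 points to roughly 2 ½ games is simple arithmetic, and it can be checked with a short sketch. The function names are hypothetical; the only inputs are figures from the paragraph above (31 decisions each, adjusted percentages of .703 and .622, and the 22-10 versus 19-12 comparison).

```python
def gap_in_wins(pct_a: float, pct_b: float, decisions: int) -> float:
    """Winning-percentage gap multiplied by the number of decisions."""
    return (pct_a - pct_b) * decisions


def games_apart(w_a: int, l_a: int, w_b: int, l_b: int) -> float:
    """Standings-style gap: half the sum of the win gap and the loss gap."""
    return ((w_a - w_b) + (l_b - l_a)) / 2


# Pierce .703 vs. Spahn .622, each over 31 decisions as a starter
print(round(gap_in_wins(0.703, 0.622, 31), 2))  # about 2.51 wins

# The hypothetical records used for comparison: 22-10 vs. 19-12
print(games_apart(22, 10, 19, 12))  # 2.5
```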

As we look at things in the modern world, Pierce might have won it anyway. Pierce had far more strikeouts (171 to 111) with fewer walks (71 to 78) and fewer home runs allowed (18 to 23), so Pierce wins all three of the true outcomes. Spahn had a better ERA, but Spahn gave up 13 un-earned runs; Pierce gave up 5. None of that is really relevant to the current discussion.

Other than that, I don’t really see any Cy Young Awards that would be changed by this knowledge. Some of you will wonder about Pete Vuckovich’s notorious award, 1982; Vuckovich was 18-6, a .750 percentage, but if you adjust for the quality of the wins and the losses, it goes up to .770, while Dave Stieb, who should have won the award if it was going to go to a starting pitcher, goes down from .548 to .545. Doesn’t change my opinion; Stieb was still a better pitcher than Vuckovich, but what I am saying is, the quality of the opposition was not the meaningful variable. If you look at historic seasons (Carlton in ’72, Blue in ’71, Koufax every year, etc.), it doesn’t really change anything; they all come out about the same. Dwight Gooden in 1985 (24-4, .857) adjusts up to .862; Whitey Ford in 1961 (25-4, .862) adjusts down to .859. I mean, we all know that Whitey Ford in ’61 was not really THAT good, but this method does not adjust for things like Run Support and Park Effects and Defensive Support; those are other issues. A 1.000 winning percentage in this method is always 1.000 after you make adjustments, no matter how badly you pitch, and .000 is always .000, no matter how well you pitch.
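
The point about 1.000 and .000 falls straight out of the arithmetic: with no losses there is nothing to weight in the loss column, and with no wins the numerator is zero. A minimal sketch, with a hypothetical function name and made-up opponent totals:

```python
def adjusted_pct(opp_wins_in_wins, opp_losses_in_losses):
    """opp_wins_in_wins: season win totals of the teams the pitcher beat.
    opp_losses_in_losses: season loss totals of the teams that beat him.
    Assumes at least one decision, otherwise the division fails."""
    weighted_wins = sum(opp_wins_in_wins)
    weighted_losses = sum(opp_losses_in_losses)
    return weighted_wins / (weighted_wins + weighted_losses)


# Undefeated against weak teams: still 1.000
print(adjusted_pct([60, 65, 58], []))  # 1.0

# Winless against strong teams: still .000
print(adjusted_pct([], [60, 62, 59]))  # 0.0
```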

It takes a lot to move the "opposition adjusted" winning percentage very far from the "true" winning percentage. Frank Lary, notorious Yankee Killer, went 7-1 against the Yankees in 1958, 5-1 against them in 1959—but his winning percentages adjust upward by only 12 points in 1958, 31 points in 1959. This chart summarizes the career data for Lary:

YEAR	FIRST	LAST	W	L	WPct	ADJUSTED
1955	Frank	Lary	13	14	.481	.477
1956	Frank	Lary	21	13	.618	.602
1957	Frank	Lary	11	15	.423	.425
1958	Frank	Lary	16	13	.552	.564
1959	Frank	Lary	17	10	.630	.661
1960	Frank	Lary	15	15	.500	.495
1961	Frank	Lary	23	9	.719	.719
1962	Frank	Lary	2	5	.286	.287
1963	Frank	Lary	4	9	.308	.314
1964	Frank	Lary	3	5	.375	.343
1965	Frank	Lary	2	3	.400	.410
			127	111	.534	.536

Lary in his career was 127-111 as a starter, 1-5 as a reliever; we’re not missing any data for him, except that my game logs don’t include relief appearances. This is the career data for Sandy Koufax:

YEAR	FIRST	LAST	W	L	WPct	ADJUSTED
1955	Sandy	Koufax	2	1	.667	.631
1956	Sandy	Koufax	2	4	.333	.367
1957	Sandy	Koufax	4	4	.500	.425
1958	Sandy	Koufax	8	10	.444	.456
1959	Sandy	Koufax	8	6	.571	.525
1960	Sandy	Koufax	7	13	.350	.314
1961	Sandy	Koufax	17	13	.567	.576
1962	Sandy	Koufax	14	7	.667	.657
1963	Sandy	Koufax	25	5	.833	.804
1964	Sandy	Koufax	19	5	.792	.814
1965	Sandy	Koufax	26	8	.765	.757
1966	Sandy	Koufax	27	9	.750	.736
			159	85	.652	.645

This is Bob Gibson:

YEAR	FIRST	LAST	W	L	WPct	ADJUSTED
1959	Bob	Gibson	2	5	.286	.284
1960	Bob	Gibson	2	6	.250	.324
1961	Bob	Gibson	13	11	.542	.540
1962	Bob	Gibson	14	13	.519	.500
1963	Bob	Gibson	17	9	.654	.637
1964	Bob	Gibson	18	11	.621	.646
1965	Bob	Gibson	20	12	.625	.598
1966	Bob	Gibson	21	12	.636	.646
1967	Bob	Gibson	13	7	.650	.634
1968	Bob	Gibson	22	9	.710	.701
1969	Bob	Gibson	20	13	.606	.619
1970	Bob	Gibson	23	7	.767	.793
1971	Bob	Gibson	16	13	.552	.545
1972	Bob	Gibson	19	11	.633	.608
1973	Bob	Gibson	12	10	.545	.546
1974	Bob	Gibson	11	13	.458	.470
1975	Bob	Gibson	2	8	.200	.205
			245	170	.590	.588

And this is Roger Clemens:

YEAR	FIRST	LAST	W	L	WPct	ADJUSTED
1984	Roger	Clemens	9	4	.692	.678
1985	Roger	Clemens	7	5	.583	.645
1986	Roger	Clemens	24	4	.857	.824
1987	Roger	Clemens	20	9	.690	.700
1988	Roger	Clemens	18	12	.600	.617
1989	Roger	Clemens	17	11	.607	.605
1990	Roger	Clemens	21	6	.778	.767
1991	Roger	Clemens	18	10	.643	.669
1992	Roger	Clemens	18	11	.621	.615
1993	Roger	Clemens	11	14	.440	.452
1994	Roger	Clemens	9	7	.562	.497
1995	Roger	Clemens	10	5	.667	.658
1996	Roger	Clemens	10	13	.435	.430
1997	Roger	Clemens	21	7	.750	.740
1998	Roger	Clemens	20	6	.769	.782
1999	Roger	Clemens	14	10	.583	.622
2000	Roger	Clemens	13	8	.619	.634
2001	Roger	Clemens	20	3	.870	.864
2002	Roger	Clemens	13	6	.684	.619
2003	Roger	Clemens	17	9	.654	.658
2004	Roger	Clemens	18	4	.818	.822
2005	Roger	Clemens	13	8	.619	.606
2006	Roger	Clemens	7	6	.538	.569
2007	Roger	Clemens	6	6	.500	.452
			354	184	.658	.659

You can see that this adjustment really doesn’t make any difference most of the time. Whatever is wrong with the won-lost record, this adjustment is not going to fix it.

COMMENTS (16 Comments, most recent shown first)

rjazzguy
Off topic, and I apologize, but I trust Win Shares far more than I trust WAR, but WAR seems to be popular due to its portability. Do you ever toy with the idea of updating Win Shares on a daily basis, like WAR does? Those WAR guys are so smug! I hate ‘em; I do! Grrrrrrr...
2:43 PM Jun 14th

wdr1946
A "good team" is a team which has a good won-lost record at the end of the season. What about those teams which start the season with a bang and then tail off? Or are on a hot streak in mid-season but then tail off? What about teams faced which have good hitters but mediocre pitchers, or vice-versa?
2:59 AM Jun 13th

ajmilner
I don't know how far back your database goes, but what about Lefty Grove? Connie Mack supposedly held him from pitching too much against the Yankees, and when Lefty was with Philadelphia the only two really good AL teams were the A's and the Yanks...
7:53 PM Jun 12th

MarisFan61
Clarification:
When I say that "on average" pitchers on good teams are pitching against teams that are below .500 (and 'conversely'), I don't mean that this applies to ALL pitchers, but that there's such a tendency; that it's largely true.

Whitey Ford could well be one of the exceptions. As per the common notion, for much of his career his starts were somewhat concentrated against the better teams.
(We've looked at this on Reader Posts.)

10:32 AM Jun 12th

MarisFan61
I was going to say the same as what 110phil just said: this adjustment does seem pretty easily to be a thing that would make it be expected for pitchers on bad teams to be helped and pitchers on good teams to be hurt. On average, pitchers on bad teams are pitching against teams that are over .500, and conversely.*
But since you said you couldn't logically figure that out, I wonder if there's some fallacy here, if it's not that simple theoretically.

* (it's not really a converse but that's what is always said for such a thing) :-)

------------

About Spahn and Pierce in the 1957 Cy Young: I wonder if, even if Pierce had had such a better W-L record than Spahn (22-10 vs. 19-12), wouldn't it still have been quite possible (not likely, I agree) that Spahn would have gotten it because of being on a pennant winner?

This depends on whether being on a pennant winner helps in the Cy Young vote, which I've thought it does -- not like for MVP but somewhat.
Does it??
10:19 AM Jun 12th

110phil
The reason pitchers on good teams are hurt by this measure ... wouldn't that have to be because pitchers don't pitch against their own team? So good pitchers would be facing slightly-below-average teams.

In, say, a 12-team league, a pitcher on a 92-70 team would be pitching against teams that went an average 80-82, collectively.
12:37 AM Jun 12th
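
The 80-82 figure in the comment above can be reproduced with a short sketch. It rests on assumptions that are not in the article: every team plays the same number of games, all games are within the league, and the schedule is balanced. The function name is hypothetical.

```python
def average_opponent_record(team_wins: int, team_losses: int, n_teams: int):
    """Average record of the other teams in a closed league."""
    games = team_wins + team_losses
    # Every league game produces one win and one loss, so league wins = league losses.
    league_wins = n_teams * games / 2
    opp_wins = (league_wins - team_wins) / (n_teams - 1)
    opp_losses = (league_wins - team_losses) / (n_teams - 1)
    return opp_wins, opp_losses


print(average_opponent_record(92, 70, 12))  # (80.0, 82.0)
```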

evanecurb
I always wondered about this. I'm glad you took the time to do the research.
12:22 AM Jun 12th

tangotiger
Bill, it was in this article:

https://www.billjamesonline.com/on_babe_ruth_lost_in_time_/

Since one SD at the team level is 0.060, then we can figure that we have 16 "full time" players, which would mean the SD at the player level is root16 times .060, or .240. In other words, the "player win%" has one SD = .240.

3:44 PM Jun 11th

trn6229
Very interesting as usual, thank you, Bill.

In your book on managers, you talk about Don Larsen with a poor record for the Browns and Orioles, 7-12 and 3-21, and then with the Yankees, 9-2 and 11-5. Casey Stengel did not pitch his starters in a rotation. Whitey Ford always started against the White Sox and Indians, who were good during that era, and Larsen would start against the Orioles and Athletics and Senators.

Does that make sense? When Ralph Houk took over as Yankees manager, Ford started in a rotation and won 25, 17, 24 and 17 games from 1961-1964.

I think a good pitcher should be able to defeat a good team or at least be .500 and pitch very well against the dregs of the league.

I am aware that both Sandy Koufax and Juan Marichal beat the early Mets teams like a drum. Sandy was 17-2 against the Mets and Juan was 17-0 from 1962-1966 and 9-8 from 1967-1973. Data from Retrosheet.
2:52 PM Jun 11th

tangotiger
Bill:

The 1 SD = 0.060 is something you've talked about as well, I believe. Maybe even in an article about 2-4 months ago?

If you take the standard deviation of all team winning percentages over the last few decades, you'll get something like one SD = 0.072. Since random variation is one SD = .039 (which we get as 0.5/root162), we can back that out of our observed .072 to get a "true" spread of .060. That would be .060^2 + .039^2 = .072^2. In other words, true^2 + random^2 = observed^2
2:12 PM Jun 11th
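
The relationship in the comment above, true^2 + random^2 = observed^2, can be checked numerically. This sketch uses only the figures quoted there (an observed SD of .072 and a 162-game season); the function names are hypothetical.

```python
import math


def random_sd(games: int) -> float:
    """SD of winning percentage from chance alone for a .500 team: 0.5 / sqrt(games)."""
    return 0.5 / math.sqrt(games)


def true_sd(observed_sd: float, games: int) -> float:
    """Back the random component out of the observed spread."""
    return math.sqrt(observed_sd ** 2 - random_sd(games) ** 2)


print(round(random_sd(162), 3))       # 0.039
print(round(true_sd(0.072, 162), 4))  # roughly .060
```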

NigelTufnel
I've always thought that if Jim Palmer had won the last game of the season in 1982 to put the O's into the playoffs over the Brewers, he would have won the Cy Young (he would have been 16-4, with a sub-3.00 ERA). Stieb still might have been a better choice, but I'm an Orioles fan, so Palmer winning would have made me happy.
10:40 AM Jun 11th

bjames
If the adjustment doesn't do a lot most of the time, does it reinforce the theory that you need to beat the teams you should beat and just break even against the good teams?

I doubt it. It's just an inflexible measurement. It doesn't matter who you beat or who you lose to; it's how much you pitch against the better team. Assuming that a good team is 100-60 and a bad team is 60-100, if you beat the good team and lose to the bad team, that's 100 and 100, but if you beat the bad team and lose to the good team, that's 60 and 60, which is the same percentage. It's just an inflexible measurement.

I'm not following Tom's math AT ALL. I don't see where he is getting the standard deviation estimate. But then, I just got out of bed, so. . .foggy.

10:19 AM Jun 11th

smbakeresq
If the adjustment doesn't do a lot most of the time, does it reinforce the theory that you need to beat the teams you should beat and just break even against the good teams?

9:51 AM Jun 11th

tangotiger
Oops, wrote too quickly. Kellner showed a 36 point improvement for his career. He has the equivalent of 205 9-inning games, which if you take the root, gives you 14. And so, .060/14 = 0.004 = 1 SD. He's at 9 SD, so far higher than expected by random.

In 1950, with 225 IP, that's 25 9-inning games, or a divisor of 5. One SD = .060 / 5 = .012. At 88 points, that's 7 SD. So, he clearly kept facing the same tough teams in a non-random way.
9:41 AM Jun 11th

tangotiger
Very interesting. Let's see what our expectation might have been.

One SD in true talent at the team level is about 0.060 wins per game(*). So, if you had 3600 IP, the equivalent of 400 9-inning games, that would get reduced by a factor of 20 (square root of 400). So, 0.060 wins becomes 1 SD = 0.003 wins per game. Which basically means our expectation is that every pitcher is +/- 0.010 wins per game (given 3600 IP).

Bill has Pierce with a 0.014 wins per game jump. A bit higher than I'd have expected from the leader.

(*) Note that 0.060 wins per game is what it's been in the past few decades. It might have been higher in the time of Pierce. So that 0.014 might make a bit more sense.

9:26 AM Jun 11th
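
The square-root scaling used in the three comments above can be written out as a sketch. It follows the commenter's stated approach (one SD of .060 at the team level, divided by the square root of the number of nine-inning games); the constant and function names are hypothetical, and the inputs are the figures quoted in those comments.

```python
import math

TEAM_SD = 0.060  # one SD of team quality, as quoted in the comments


def expected_sd(nine_inning_games: float) -> float:
    """Expected SD of average opponent quality over a workload of this size."""
    return TEAM_SD / math.sqrt(nine_inning_games)


def sds_from_expectation(gain: float, nine_inning_games: float) -> float:
    """How many SDs an observed gain in winning percentage represents."""
    return gain / expected_sd(nine_inning_games)


print(round(expected_sd(3600 / 9), 4))               # 3600 IP -> 400 games -> .003
print(round(sds_from_expectation(0.088, 225 / 9), 1))  # Kellner 1950, 225 IP
print(round(sds_from_expectation(0.036, 205), 1))      # Kellner career, 205 games
```

Without the intermediate rounding used in the comments, the career figure comes out a little under 9 SD rather than exactly 9, which does not change the conclusion.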

bjames
Forgot to mention in the article, meant to mention: Pierce's 58-point improvement in 1957, if the quality of the opposition is considered, is the largest in baseball in that season, while Spahn's 23-point drop-off is the third-largest of any pitcher.
3:34 AM Jun 11th
