Opposition Adjusted Winning Percentage

June 11, 2019
 


 

            This is just kind of a stupid little thing; I was just messing around with the data, not expecting to find anything really interesting, and probably I found even less than I expected.   But my thought was, what if you "weighted" each win and loss for a starting pitcher by the quality of the opponent?   In other words, if a pitcher gets a win against a team which wins 100 games, we count that as 100 wins, whereas if he wins a game against a team which wins only 60 games, we count that as only 60 wins—and vice versa, if he loses a game to a bad opponent, a team which lost 100 games, then we count that as 100 losses, whereas if he loses a game to a good team, a team which lost only 60 games, then we count that as only 60 losses.   How would that change the winning percentages?
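A minimal sketch of the weighting described above, for readers who want to try it on their own game logs. The tuple layout, the function names, and the sample log are all made up for illustration, and the sketch assumes the weight is the opponent's full-season win or loss total; the article does not say whether the game in question is excluded from that total.

```python
from typing import List, Tuple

# One pitcher decision: (result, opponent_season_wins, opponent_season_losses).
# This tuple layout is an assumption made for the sketch, not the author's format.
Decision = Tuple[str, int, int]


def plain_wpct(decisions: List[Decision]) -> float:
    """Ordinary winning percentage: wins / (wins + losses)."""
    wins = sum(1 for result, _, _ in decisions if result == "W")
    losses = sum(1 for result, _, _ in decisions if result == "L")
    return wins / (wins + losses)


def adjusted_wpct(decisions: List[Decision]) -> float:
    """Each win counts as the opponent's win total, each loss as the opponent's loss total."""
    weighted_wins = sum(opp_w for result, opp_w, _ in decisions if result == "W")
    weighted_losses = sum(opp_l for result, _, opp_l in decisions if result == "L")
    return weighted_wins / (weighted_wins + weighted_losses)


# Made-up four-decision log: 2-1 against a 100-54 team, 0-1 against a 60-94 team.
sample_log = [("W", 100, 54), ("W", 100, 54), ("L", 100, 54), ("L", 60, 94)]

print(f"plain    {plain_wpct(sample_log):.3f}")
print(f"adjusted {adjusted_wpct(sample_log):.3f}")
```

In this invented log the pitcher is 2-2 (.500), but his wins are worth 200 and his losses 148, so the adjusted figure is about .575. Both functions divide by zero for a pitcher with no decisions, which a real implementation would need to guard against.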

            The one interesting thing that turns up from this study is that the pitcher who gains the most in a single season and the pitcher who gains the most in a career turn out to be the same pitcher:  Alex Kellner.   Alex Kellner won 20 games as a rookie for the 1949 Philadelphia A’s, a fluke season, and then led the league in losses the next two seasons, going 8-20 and 11-14; not often you could lead the league in losses with 14, but he did.   He hung around for a long time after that, and pitched well for the 1958 Cincinnati Reds, going 7-3 with a 2.30 ERA for them in seven starts and eleven relief appearances. 

            Anyway, Kellner in 1950 was 8-20, but 8-19 as a starting pitcher.  However, in my data he is only 8-14, since I am missing some of his games from my log.   In the games I have he was 8-14 as a starter (.364), but if you weight the wins and losses for the quality of competition he improves to .452, an 88-point improvement.    (The missing data does not cause the improvement; those games are missing both from the "natural" and the "quality adjusted" winning percentage.   As far as we know, they don’t exist.)

            Anyway, when we figure the CAREER records, Alex Kellner again is the pitcher who improves the most.   Kellner had a career record of 92-108, a .460 winning percentage, although in my data it is .459 (84-99).   But when you adjust for the quality of competition that he faced, his winning percentage improves to .495, a 36-point improvement.  

            Trying to figure out logically whether this adjustment would help pitchers on bad teams and hurt pitchers on good teams, I couldn’t figure it out, but when you check the data, it obviously does; all of the pitchers who are helped the most in their career winning percentages pitched mostly for bad teams, while those who are hurt the most pitched mostly for good teams.  

            What you may notice is that these numbers are not that large.   The questions I was asking were things like "Are there any pitchers who might be in the Hall of Fame if this had been factored in?" and "Are there any Cy Young Awards that might have gone to a different pitcher if this had been factored in?"   But a 36-point pickup in winning percentage (a) is not huge, (b) is larger than anyone else had, and (c) the largest pickups are for guys who had 100-200 career decisions.   Hall of Fame candidates need to have 350-500 decisions.   The gains and losses in that range are not that large.    The largest gain for anyone who might be looked upon as a Hall of Fame candidate is 14 points, for Billy Pierce.  

            Billy Pierce, 1950s White Sox pitcher . . . he was a little guy, and a two-pitch pitcher.   He just threw a fastball and a slider.   Well, early in his career he threw a big curve, but the big curve led to a lot of walks, and he "added" a slider.   Mostly the slider replaced the curve, and (according to contemporary accounts) he mostly just lived off of those two pitches, although he would later claim that he threw all three.  

            Anyway, Pierce was a tremendous pitcher for a good long time.   He had a .553 career winning percentage as a starter, .554 in my data, but if you adjust for the quality of competition, it was .568, a 14-point jump.    Baseball Reference confirms that he did do the bulk of his pitching against quality teams.   The two teams that he started against most often in his career, the two teams against which he had the most starts, decisions, and innings, were the Yankees and the Cleveland Indians, the two best teams in the American League in his era.   In his career he started 187 games against sub-.500 teams, going 102-56 against them, but started 245 times against .500+ teams, going 109-113. 

            You can compare that to, let’s say, Jim Bunning; Pierce’s career winning percentage was 17 points higher than Bunning’s against sub-.500 teams (.646 to .629), and 20 points higher against .500-or-better teams (.491 to .471), but, because Pierce pitched most often against good teams whereas Bunning had 258 starts against sub-.500 teams, 261 against .500-or-better teams, their career winning percentages are almost the same.   It’s relevant; I don’t know if it puts Pierce over the Hall of Fame line or not, but it counts a little bit.   Perhaps a more interesting one is that Pierce and Don Drysdale have almost the same career won-lost records, 211-169 for Pierce, 209-166 for Drysdale—and almost exactly the same splits against good teams and sub-.500 teams.   Pierce was 102-56 against sub-.500 teams; Drysdale was 100-55.   Pierce was 109-113 against .500 or better teams; Drysdale was 109-111.  

            Anyway, the one Cy Young Award which seems like it might have gone the other way, had the voters been aware of this, is also Billy Pierce.   In 1957 Billy Pierce and Warren Spahn both went 20-11 as starting pitchers.   Both pitched a few times in relief, 3 times for Pierce, 4 for Spahn; Pierce took a loss as a reliever, making him 20-12, while Spahn picked up a win, making him 21-11.   That small difference, in the world in which the Cy Young Award rested very heavily on won-lost records, may well have made Spahn the Cy Young Award winner.  Possibly not; Spahn won 15 out of 16 votes, Pierce got none.   Anyway, back to the thesis; Pierce was 20-11 as a starter, but adjusting for the quality of the competition, he goes up to .703, while Spahn, also 20-11, goes down to .622, giving Pierce an advantage of 81 points, or 2 ½ games.   If Pierce had gone 22-10, let’s say, and Spahn had been 19-12—a 2 ½ game difference—Pierce quite certainly would have won the award.  So I think you can argue that this fact, had it been known at the time, might have given Pierce the award.  
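The conversion from an 81-point gap to 2 ½ games is simple arithmetic, and can be checked with a short sketch. The function names are invented for illustration; the only inputs are figures quoted in the paragraph above (31 decisions each, adjusted percentages of .703 and .622, and the hypothetical 22-10 and 19-12 records).

```python
def games_from_points(wpct_a: float, wpct_b: float, decisions: int) -> float:
    """Convert a winning-percentage gap into games over a given number of decisions."""
    return (wpct_a - wpct_b) * decisions


def games_ahead(w_a: int, l_a: int, w_b: int, l_b: int) -> float:
    """Standard games-ahead formula between two won-lost records."""
    return ((w_a - l_a) - (w_b - l_b)) / 2


# Pierce .703 vs. Spahn .622 (adjusted), each over 31 decisions as a starter.
adjusted_gap = games_from_points(0.703, 0.622, 31)

# The hypothetical records from the text: Pierce 22-10, Spahn 19-12.
hypothetical_gap = games_ahead(22, 10, 19, 12)

print(round(adjusted_gap, 2), hypothetical_gap)
```

The first figure comes out at about 2.51 and the second at exactly 2.5, which is why the 22-10 versus 19-12 comparison is a fair stand-in for the adjusted gap.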

            As we look at things in the modern world, Pierce might have won it anyway.   Pierce had far more strikeouts (171 to 111) with fewer walks (71 to 78) and fewer home runs allowed (18 to 23), so Pierce wins all three of the true outcomes.   Spahn had a better ERA, but Spahn gave up 13 un-earned runs; Pierce gave up 5.   None of that is really relevant to the current discussion. 

            Other than that, I don’t really see any Cy Young Awards that would be changed by this knowledge.   Some of you will wonder about Pete Vuckovich’s notorious award, 1982; Vuckovich was 18-6, a .750 percentage, but if you adjust for the quality of the wins and the losses, it goes up to .770, while Dave Stieb, who should have won the award if it was going to go to a starting pitcher, goes down from .548 to .545.   Doesn’t change my opinion; Stieb was still a better pitcher than Vuckovich, but what I am saying is, the quality of the opposition was not the meaningful variable.   If you look at historic seasons (Carlton in ’72, Blue in ’71, Koufax every year, etc.), it doesn’t really change anything; they all come out about the same.   Dwight Gooden in 1985 (24-4, .857) adjusts up to .862; Whitey Ford in 1961 (25-4) adjusts down to .859.   I mean, we all know that Whitey Ford in ’61 was not really THAT good, but this method does not adjust for things like Run Support and Park Effects and Defensive Support; those are other issues.  A 1.000 winning percentage in this method is always 1.000 after you make adjustments, no matter how badly you pitch, and .000 is always .000, no matter how well you pitch. 

            It takes a lot to move the "opposition adjusted" winning percentage very far from the "true" winning percentage.  Frank Lary, notorious Yankee Killer, went 7-1 against the Yankees in 1958, 5-1 against them in 1959—but his winning percentages adjust upward by only 12 points in 1958, 31 points in 1959.   This chart summarizes the career data for Lary:

YEAR    FIRST  LAST       W    L  WPct  ADJUSTED
1955    Frank  Lary      13   14  .481  .477
1956    Frank  Lary      21   13  .618  .602
1957    Frank  Lary      11   15  .423  .425
1958    Frank  Lary      16   13  .552  .564
1959    Frank  Lary      17   10  .630  .661
1960    Frank  Lary      15   15  .500  .495
1961    Frank  Lary      23    9  .719  .719
1962    Frank  Lary       2    5  .286  .287
1963    Frank  Lary       4    9  .308  .314
1964    Frank  Lary       3    5  .375  .343
1965    Frank  Lary       2    3  .400  .410
Career                  127  111  .534  .536

 

            Lary in his career was 127-111 as a starter, 1-5 as a reliever; we’re not missing any data for him, except that my game logs don’t include relief appearances.  This is the career data for Sandy Koufax:

YEAR    FIRST  LAST       W    L  WPct  ADJUSTED
1955    Sandy  Koufax     2    1  .667  .631
1956    Sandy  Koufax     2    4  .333  .367
1957    Sandy  Koufax     4    4  .500  .425
1958    Sandy  Koufax     8   10  .444  .456
1959    Sandy  Koufax     8    6  .571  .525
1960    Sandy  Koufax     7   13  .350  .314
1961    Sandy  Koufax    17   13  .567  .576
1962    Sandy  Koufax    14    7  .667  .657
1963    Sandy  Koufax    25    5  .833  .804
1964    Sandy  Koufax    19    5  .792  .814
1965    Sandy  Koufax    26    8  .765  .757
1966    Sandy  Koufax    27    9  .750  .736
Career                  159   85  .652  .645

 

            This is Bob Gibson:

YEAR    FIRST  LAST       W    L  WPct  ADJUSTED
1959    Bob    Gibson     2    5  .286  .284
1960    Bob    Gibson     2    6  .250  .324
1961    Bob    Gibson    13   11  .542  .540
1962    Bob    Gibson    14   13  .519  .500
1963    Bob    Gibson    17    9  .654  .637
1964    Bob    Gibson    18   11  .621  .646
1965    Bob    Gibson    20   12  .625  .598
1966    Bob    Gibson    21   12  .636  .646
1967    Bob    Gibson    13    7  .650  .634
1968    Bob    Gibson    22    9  .710  .701
1969    Bob    Gibson    20   13  .606  .619
1970    Bob    Gibson    23    7  .767  .793
1971    Bob    Gibson    16   13  .552  .545
1972    Bob    Gibson    19   11  .633  .608
1973    Bob    Gibson    12   10  .545  .546
1974    Bob    Gibson    11   13  .458  .470
1975    Bob    Gibson     2    8  .200  .205
Career                  245  170  .590  .588

 

 

            And this is Roger Clemens:

YEAR    FIRST  LAST       W    L  WPct  ADJUSTED
1984    Roger  Clemens    9    4  .692  .678
1985    Roger  Clemens    7    5  .583  .645
1986    Roger  Clemens   24    4  .857  .824
1987    Roger  Clemens   20    9  .690  .700
1988    Roger  Clemens   18   12  .600  .617
1989    Roger  Clemens   17   11  .607  .605
1990    Roger  Clemens   21    6  .778  .767
1991    Roger  Clemens   18   10  .643  .669
1992    Roger  Clemens   18   11  .621  .615
1993    Roger  Clemens   11   14  .440  .452
1994    Roger  Clemens    9    7  .562  .497
1995    Roger  Clemens   10    5  .667  .658
1996    Roger  Clemens   10   13  .435  .430
1997    Roger  Clemens   21    7  .750  .740
1998    Roger  Clemens   20    6  .769  .782
1999    Roger  Clemens   14   10  .583  .622
2000    Roger  Clemens   13    8  .619  .634
2001    Roger  Clemens   20    3  .870  .864
2002    Roger  Clemens   13    6  .684  .619
2003    Roger  Clemens   17    9  .654  .658
2004    Roger  Clemens   18    4  .818  .822
2005    Roger  Clemens   13    8  .619  .606
2006    Roger  Clemens    7    6  .538  .569
2007    Roger  Clemens    6    6  .500  .452
Career                  354  184  .658  .659

 

            You can see that this adjustment really doesn’t make any difference most of the time.   Whatever is wrong with the won-lost record, this adjustment is not going to fix it. 

 

           

 
 

COMMENTS (16 Comments, most recent shown first)

rjazzguy
Off topic, and I apologize, but I trust Win Shares far more than I trust WAR, but WAR seems to be popular due to its portability. Do you ever toy with the idea of updating Win Shares on a daily basis, like WAR does? Those WAR guys are so smug! I hate ‘em; I do! Grrrrrrr...
2:43 PM Jun 14th
 
wdr1946
A "good team" is a team which has a good won-lost record at the end of the season. What about those teams which start the season with a bang and then tail off? Or are on a hot streak in mid-season but then tail off? What about teams faced which have good hitters but mediocre pitchers, or vice-versa?
2:59 AM Jun 13th
 
ajmilner
I don't know how far back your database goes, but what about Lefty Grove? Connie Mack supposedly held him from pitching too much against the Yankees, and when Lefty was with Philadelphia the only real good two AL teams were the A's and the Yanks...
7:53 PM Jun 12th
 
MarisFan61
Clarification:
When I say that "on average" pitchers on good teams are pitching against teams that are below .500 (and 'conversely'), I don't mean that this applies to ALL pitchers, but that there's such a tendency; that it's largely true.

Whitey Ford could well be one of the exceptions. As per the common notion, for much of his career his starts were somewhat concentrated against the better teams.
(We've looked at this on Reader Posts.)


10:32 AM Jun 12th
 
MarisFan61
I was going to say the same as what 110phil just said: this adjustment does seem pretty easily to be a thing that would make it be expected for pitchers on bad teams to be helped and pitchers on good teams to be hurt. On average, pitchers on bad teams are pitching against teams that are over .500, and conversely.*
But since you said you couldn't logically figure that out, I wonder if there's some fallacy here, if it's not that simple theoretically.

* (it's not really a converse but that's what is always said for such a thing) :-)

------------

About Spahn and Pierce in the 1957 Cy Young: I wonder if, even if Pierce had had such a better W-L record than Spahn (22-10 vs. 19-12), wouldn't it still have been quite possible (not likely, I agree) that Spahn would have gotten it because of being on a pennant winner?

This depends on whether being on a pennant winner helps in the Cy Young vote, which I've thought it does -- not like for MVP but somewhat.
Does it??
10:19 AM Jun 12th
 
110phil
The reason pitchers on good teams are hurt by this measure ... wouldn't that have to be because pitchers don't pitch against their own team? So good pitchers would be facing slightly-below-average teams.

In, say, a 12-team league, a pitcher on a 92-70 team would be pitching against teams that went an average 80-82, collectively.
12:37 AM Jun 12th
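110phil's 12-team example can be checked with a few lines. This is only a sketch: the function name is invented, and it assumes a 162-game season with a balanced schedule, so that every opponent counts equally in the average.

```python
def average_opponent_wins(team_wins: int, teams: int = 12, games: int = 162) -> float:
    """Average season win total of the other teams in the league.

    A league as a whole finishes at .500, so the other teams must
    absorb whatever surplus or deficit this team has.
    """
    league_wins = teams * games / 2
    return (league_wins - team_wins) / (teams - 1)


opp_wins = average_opponent_wins(92)
print(opp_wins, 162 - opp_wins)
```

For a 92-70 team this gives opponents who average 80-82, matching the figure in the comment. The smaller the league, the bigger the effect, which fits the article's finding that the adjustment tends to help pitchers on bad teams and hurt pitchers on good ones.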
 
evanecurb
I always wondered about this. I'm glad you took the time to do the research.
12:22 AM Jun 12th
 
tangotiger
Bill, it was in this article:

https://www.billjamesonline.com/on_babe_ruth_lost_in_time_/

Since one SD at the team level is 0.060, then we can figure that we have 16 "full time" players, which would mean the SD at the player level is root16 times .060, or .240. In other words, the "player win%" has one SD = .240.

3:44 PM Jun 11th
 
trn6229
Very interesting as usual, thank you, Bill.

In your book on managers, you talk about Don Larsen with a poor record for the Browns and Orioles, 7-12 and 3-21 and then with the Yankees 9-2 and 11-5. Casey Stengel did not pitch in a rotation. Whitey Ford always started against the White Sox and Indians who were good during that era and Larsen would start against the Orioles and Athletics and Senators.

Does that make sense? When Ralph Houk took over as Yankees manager, Ford started in a rotation and won 25, 17, 24 and 17 games from 1961-1964.

I think a good pitcher should be able to defeat a good team or at least be .500 and pitch very well against the dregs of the league.

I am aware that both Sandy Koufax and Juan Marichal beat the early Mets teams like a drum. Sandy was 17-2 against the Mets and Juan was 17-0 from 1962-1966 and 9-8 from 1967-1973. Data from Retrosheet.
2:52 PM Jun 11th
 
tangotiger
Bill:

The 1 SD = 0.060 is something you've talked about as well, I believe. Maybe even in an article about 2-4 months ago?

If you take the standard deviation of all team winning percentages over the last few decades, you'll get something like one SD = 0.072. Since random variation is one SD = .039 (which we get as 0.5/root162), we can back that out of our observed .072 to get a "true" spread of .060. That would be .060^2 + .039^2 = .072^2. In other words, true^2 + random^2 = observed^2
2:12 PM Jun 11th
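The variance arithmetic in tangotiger's comment can be reproduced as follows. The function names are invented for this sketch; the inputs (an observed spread of .072 and a 162-game season) are the ones he quotes, and the random component is the binomial standard deviation for a .500 team.

```python
import math


def random_sd(games: int = 162, p: float = 0.5) -> float:
    """Standard deviation of winning percentage from luck alone (binomial)."""
    return math.sqrt(p * (1 - p) / games)


def true_sd(observed_sd: float, games: int = 162) -> float:
    """Back the luck out of the observed spread: true^2 = observed^2 - random^2."""
    return math.sqrt(observed_sd ** 2 - random_sd(games) ** 2)


print(round(random_sd(), 3), round(true_sd(0.072), 3))
```

With these inputs the random component is about .039 and the "true" spread about .060, the two figures used in the comment.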
 
NigelTufnel
I've always thought that if Jim Palmer had won the last game of the season in 1982 to put the O's into the playoffs over the Brewers, he would have won the Cy Young (he would have been 16-4, with a sub-3.00 ERA). Stieb still might have been a better choice, but I'm an Orioles fan, so Palmer winning would have made me happy.
10:40 AM Jun 11th
 
bjames
If the adjustment doesn't do a lot most of the time, does it reinforce the theory that you need to beat the teams you should beat and just break even against the good teams?

I doubt it. It's just an inflexible measurement. It doesn't matter who you beat or who you lose to; it's how much you pitch against the better team. Assuming that a good team is 100-60 and a bad team is 60-100, if you beat the good team and lose to the bad team, that's 100 and 100, but if you beat the bad team and lose to the good team, that's 60 and 60, which is the same percentage. It's just an inflexible measurement.

I'm not following Tom's math AT ALL. I don't see where he is getting the standard deviation estimate. But then, I just got out of bed, so. . .foggy.

10:19 AM Jun 11th
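The 100-60 and 60-100 example in the comment above can be run directly. This is a sketch with invented names, using the same weighting rule as the article (a win counts as the opponent's wins, a loss as the opponent's losses).

```python
def adjusted_wpct(decisions):
    """decisions: list of (result, opponent_wins, opponent_losses) tuples."""
    weighted_wins = sum(w for result, w, _ in decisions if result == "W")
    weighted_losses = sum(l for result, _, l in decisions if result == "L")
    return weighted_wins / (weighted_wins + weighted_losses)


GOOD = (100, 60)   # the good team's season record
BAD = (60, 100)    # the bad team's season record

# Same mix of opponents, opposite outcomes: the result does not move.
beat_good_lose_bad = adjusted_wpct([("W", *GOOD), ("L", *BAD)])
beat_bad_lose_good = adjusted_wpct([("W", *BAD), ("L", *GOOD)])

# Same 1-1 record, different mix of opponents: the result moves a lot.
split_vs_good = adjusted_wpct([("W", *GOOD), ("L", *GOOD)])
split_vs_bad = adjusted_wpct([("W", *BAD), ("L", *BAD)])

print(beat_good_lose_bad, beat_bad_lose_good, split_vs_good, split_vs_bad)
```

The first two cases both come out at .500, while a 1-1 split entirely against the good team is worth .625 and a 1-1 split entirely against the bad team only .375. That is the sense in which the measurement tracks how much you pitch against the better team rather than whom you beat.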
 
smbakeresq
If the adjustment doesn't do a lot most of the time, does it reinforce the theory that you need to beat the teams you should beat and just break even against the good teams?

9:51 AM Jun 11th
 
tangotiger
Oops, wrote too quickly. Kellner showed a 36 point improvement for his career. He has the equivalent of 205 9-inning games, which if you take the root, gives you 14. And so, .060/14 = 0.004 = 1 SD. He's at 9 SD, so far higher than expected by random.

In 1950, with 225 IP, that's 25 9-inning games, or a divisor of 5. One SD = .060 / 5 = .012. At 88 points, that's 7 SD. So, he clearly kept facing the same tough teams in a non-random way.
9:41 AM Jun 11th
 
tangotiger
Very interesting. Let's see what our expectation might have been.

One SD in true talent at the team level is about 0.060 wins per game(*). So, if you had 3600 IP, the equivalent of 400 9-inning games, that would get reduced by a factor of 20 (square root of 400). So, 0.060 wins becomes 1 SD = 0.003 wins per game. Which basically means our expectation is that every pitcher is +/- 0.010 wins per game (given 3600 IP).

Bill has Pierce with a 0.014 wins per game jump. A bit higher than I'd have expected from the leader.

(*) Note that 0.060 wins per game is what it's been in the past few decades. It might have been higher in the time of Pierce. So that 0.014 might make a bit more sense.


9:26 AM Jun 11th
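A rough sketch of the expectation tangotiger describes here and applies to Kellner in his follow-up comment. The function names are invented, and the approach rests on his assumptions: a team-talent spread of .060, nine innings per game, and opponents drawn at random, so the expected spread shrinks with the square root of the number of games.

```python
import math

TEAM_TALENT_SD = 0.060  # tangotiger's estimate for recent decades


def expected_sd(innings: float, team_sd: float = TEAM_TALENT_SD) -> float:
    """Expected spread of opponent quality after `innings` against random opponents."""
    games = innings / 9
    return team_sd / math.sqrt(games)


def sds_from_expected(points: float, innings: float) -> float:
    """How many standard deviations a shift of `points` (thousandths) represents."""
    return (points / 1000) / expected_sd(innings)


print(round(expected_sd(3600), 4))                 # 3600 IP, i.e. 400 games
print(round(sds_from_expected(88, 225), 1))        # Kellner, 1950
print(round(sds_from_expected(36, 205 * 9), 1))    # Kellner, career (205 games)
```

Unrounded, Kellner's 1950 shift is a little over 7 standard deviations and his career shift about 8.6; the comment gets 9 for the career figure because it rounds the square root of 205 down to 14 and the resulting SD to .004.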
 
bjames
Forgot to mention in the article, meant to mention: Pierce's 58-point improvement in 1957, if the quality of the opposition is considered, is the largest in baseball in that season, while Spahn's 23-point drop-off is the third-largest of any pitcher.
3:34 AM Jun 11th
 
 