
How Reliable are Won-Lost Records, Part 2

September 13, 2022

This article is a companion piece to the article "A Reliable Batting Average", which was posted here on August 22.  While that article postulated a "true batting average" and dealt with the question "How many at bats does it take for a batter to reach his true batting average?", this article postulates a "true winning percentage" for a pitcher and deals with the question "How many starts does it take for a pitcher to reach his true winning percentage?" 

            Despite having the same goals, the method used in that article and the method used in this article have almost nothing in common.  That process just would not work with pitcher won-lost records.  Here’s how the study was done.

            First, I established a winning percentage for each pitcher in each game, based simply on how well the pitcher pitched, without regard to the run support or the outcome of the game. I’ll explain in a minute how the one-game winning percentage was established. 

            Then I figured the average winning percentage for each pitcher in each season and in his career.  Starts only; relief appearances don’t contribute.  The result is the pitcher’s True Winning Percentage. 

            Then I compared the pitcher’s real-life winning percentage, what it shows in the Encyclopedias and other sources, to his True Winning Percentage. 

            Then I formed groups of all of the pitchers within the study to ask, "What is the normal gap between the pitcher’s won-lost record and his True Winning Percentage?"

            The heavy lifting in that structure is the first problem, assigning a winning percentage to each start.   After we get that done, it’s pretty straightforward. 

            I’ll use Don Aase to illustrate the process.  Don Aase, because of the alphabet, is the top line of the spreadsheet in the 329,989-line data file that I use to study things like this, and I always write the formulas on the top line, so I know Don Aase’s career better than I know my wife’s face.  I usually don’t use him to illustrate stuff, but he works really well here, so I’ll Aase-it. 

            Don Aase, like Tanner Houck forty years later, started his career pitching like a Hall of Famer.  In his first major league appearance, July 26, 1977, Aase pitched a complete game, striking out 11 batters and beating Milwaukee 4-3.  One of Milwaukee’s runs was un-earned.  Aase’s Game Score was 68.

            When the starting pitcher has a Game Score of 68, his team usually wins.  In my data there are 5,103 games in which a starting pitcher has had a Game Score of 68, and the winning percentage of those teams is .762.  We could say, then, that for that game Aase has a winning percentage contribution of .762. 
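
            A minimal sketch of how a table like that could be built: group every start by its Game Score and divide team wins by starts. The function name and the toy sample below are hypothetical illustrations, not the article's data file or its actual code.

```python
from collections import defaultdict

def win_pct_by_game_score(games):
    """games: iterable of (game_score, team_won) pairs, one per start.

    Returns {game_score: share of those starts the pitcher's team won}.
    """
    wins = defaultdict(int)
    starts = defaultdict(int)
    for score, team_won in games:
        starts[score] += 1
        if team_won:
            wins[score] += 1
    return {score: wins[score] / starts[score] for score in starts}

# Made-up toy sample, only to show the shape of the input.
toy_games = [(68, True), (68, True), (68, False), (68, True),
             (40, False), (40, True)]
table = win_pct_by_game_score(toy_games)
```

            With the real data, the entry for a Game Score of 68 would be the .762 figure quoted above.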

            The biggest problem with that is that, while Game Scores are built around 50, they are not anchored in place at an average of 50.  Pitching in Coors Field in 1999 is a lot different than pitching in Dodger Stadium in 1968.  We have to adjust the Game Score norms for (a) the season, (b) the league, and (c) the park. 

            The Aase Game 1 was in Fenway Park in 1977.   The adjusted average Game Score for Fenway Park in 1977 was 44.84, which is exceptionally low.   The American League in 1977 was at the time the highest-scoring league since 1956, and Fenway Park was the best hitters’ park in baseball, with a Park Run Factor of 137.   When hitting goes up, pitching goes down, so the average Game Score drops there by 5 points.  Of course you have to figure the average Game Score in the park and then try to adjust it for the hitting and pitching of the home team, so that’s a pain, but 44.84 is the best I can do. 

            Aase’s Game One, then, is not 68 over 50, but 68 over 44.84.  Calling it 45, he’s +23.  
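
            The adjustment is just a subtraction. A small sketch, assuming (as the "calling it 45" step suggests) that the park/season norm is rounded to a whole number before subtracting; the function name is hypothetical.

```python
def points_vs_norm(game_score, park_season_norm):
    """Game Score relative to the park/season norm, in whole points.

    Assumption: the norm is rounded first, as in "68 over 44.84,
    calling it 45".
    """
    return game_score - round(park_season_norm)

first_start = points_vs_norm(68, 44.84)  # 68 - 45
```

            For Aase's debut this gives +23, matching the figure above.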

            When a starting pitcher is 23 points better than the park/season norm, his team will win the game 85.2% of the time.  This data is actually "smoothed out"; even though there are thousands of pitchers at +23, thousands at +22, etc., the data is still a little bit jumpy, so that sometimes you’ll get an illogical jump between two consecutive numbers, indicating that, with only 5,000 entries, it is STILL not a completely sufficient data sample.  In this case, it’s .852 before being smoothed out and .852 after being smoothed out, but there’s probably some little change on the extra digits, I don’t know.
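
            The article does not say how the table was smoothed. One simple possibility, shown here purely as an assumption, is a centered moving average over neighboring margins. The numbers below are invented to show an "illogical jump" being ironed out; they are not values from the study.

```python
def smooth(raw, window=1):
    """Centered moving average over adjacent margins.

    raw: {margin_vs_norm: observed team winning percentage}
    window: how many margins on each side to average in.
    Margins at the edge of the table simply use fewer neighbors.
    """
    smoothed = {}
    for margin in raw:
        neighbors = [raw[m]
                     for m in range(margin - window, margin + window + 1)
                     if m in raw]
        smoothed[margin] = sum(neighbors) / len(neighbors)
    return smoothed

# Invented values: +1 is (illogically) higher than +2 in the raw data.
raw = {0: 0.50, 1: 0.53, 2: 0.52, 3: 0.56}
smoothed = smooth(raw)
```

            In this toy table the raw value at +1 is above the raw value at +2, but after smoothing the order is restored. A fuller version would probably weight each margin by its number of games.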

            Anyway, when a pitcher is +23 vs. the park/season norm, his team will win the game 85.2% of the time, so we will enter Aase’s winning percentage for his first major league game at .852.   He did win the game, so his winning percentage was 1.000 (1-0), but his "true winning percentage" was only .852—meaning that if you pitch that well every game, sometimes the team will lose anyway.  

            Aase’s second start was in California, at the Big A, which was a pitcher’s park, with a park/season norm of 52.23.  Aase pitched a 3-hit shutout, a Game Score of 87.   In his second game, then, he was +35 vs. the norm.

            When a pitcher is +35 vs the norm, his team will win the game 94.4% of the time, so Aase’s winning percentage contribution for that game is .944.  Two games, .852 and .944, a total of 1.796, an average of .898.   Aase’s true winning percentage, after two games, was .898.  His winning percentage was still 1.000 (2-0), but his true winning percentage was .898.  That’s a gap (discrepancy, error margin) of .102. 

            Aase’s third start was at the Oakland Coliseum, another pitcher’s park, so the park/season norm was 52.29.   Aase again pitched well; 7 innings, 5 hits, 1 run allowed.  He pitched well, but not AS well; his Game Score was only 62.  62 in that park in that season is +10.  When a starting pitcher is +10 vs. the norm, his team will win the game 63.9% of the time, so Aase’s winning percentage contribution for that game is .639.   That makes a total, for the three games, of 2.435, which is an average of .812, so now Aase is 3-0 (1.000) but with a true winning percentage of .812.  After three starts, there is a discrepancy of 188 points between his winning percentage, based on Wins and Losses, and his true Winning Percentage, based entirely on how well he has pitched.  

            Note that the true winning percentage is in no sense a projection of how well Aase will pitch in the future.  It is a summary (an estimate, a statement) of how well he has pitched so far.  It changes with every start.

            In Aase’s fourth career start, back in Fenway on August 11, 1977, he was hit hard for the first time, and took his first career loss.  Don Baylor and Bobby Bonds hit home runs off of him, and he gave up six runs in five innings, posting a Game Score of 29.  29 against the park norm (44.84) is -16, so Aase is -16 for the game.  When a pitcher’s Game Score is 16 points worse than the park/league norm, his team will win the game only 26.5% of the time, so Aase’s Winning Percentage Contribution for that game is .265.  Through four games, his total Winning Percentage Contribution is 2.700, or a .675 average.  So after four games, Aase has a winning percentage of .750 (3-1) but a true winning percentage of .675.
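
            The four starts above can be replayed in a few lines. This is a sketch under stated assumptions: the lookup table holds only the four win rates quoted in the text (the full smoothed table is not given), the norm is rounded before subtracting, and the names are hypothetical.

```python
# Win rates quoted in the text for the four margins Aase posted.
WIN_PCT_AT_MARGIN = {23: 0.852, 35: 0.944, 10: 0.639, -16: 0.265}

# (game_score, park_season_norm, decision) for Aase's first four starts.
aase_starts = [
    (68, 44.84, "W"),  # Fenway Park
    (87, 52.23, "W"),  # Anaheim
    (62, 52.29, "W"),  # Oakland
    (29, 44.84, "L"),  # Fenway Park
]

def running_record(starts):
    """Return (start_no, actual_win_pct, true_win_pct) after each start."""
    rows = []
    total = 0.0
    wins = losses = 0
    for n, (score, norm, decision) in enumerate(starts, start=1):
        margin = score - round(norm)
        total += WIN_PCT_AT_MARGIN[margin]
        wins += decision == "W"
        losses += decision == "L"
        actual = wins / (wins + losses)
        rows.append((n, round(actual, 3), round(total / n, 3)))
    return rows

rows = running_record(aase_starts)
```

            The rows come out as 1.000/.852, 1.000/.898, 1.000/.812 and .750/.675, the same figures given in the walk-through. Note that the true winning percentage divides by starts, while the actual winning percentage divides by decisions; the two counts happen to match here because Aase got a decision in each of these four starts.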

            And on and on we go; probably you don’t want to hear about every start of Don Aase’s career.   By the end of his first season Aase (13 starts) was 6-2, a .750 Winning Percentage, but with a true Winning Percentage of .606.   He pitched well, albeit not quite as well as his 6-2 record would suggest.   But if he had pitched as well for his entire career as he did in 1977 (.606) he would probably be in the Hall of Fame.    As we know, he isn’t. 

            After his rookie season the Red Sox, not really believing that Aase was a Hall of Famer, traded him to California for Jerry Remy, straight-up swap.  In his first season in California, 1978, Aase made 29 starts and finished 11-8.   He actually didn’t pitch that well (4.03 ERA, strikeout/walk ratio of 93-80), so his true winning percentage was just .477.  His winning percentage (.579) was 102 points better than his true winning percentage.  Through two seasons he was 17-10, a .630 winning percentage, but with a true winning percentage of .518. 

            In his second season in LA Aase made 28 starts, finished 8-9 (.471) with an almost-matching true winning percentage of .468.   In his third season in LA (the fourth season of his career) he was pulled from the starting rotation at the end of July with a won-lost record of 5-13, 4.90 ERA.   He actually didn’t pitch that badly.  His winning percentage was .278 (5-13) albeit with a true winning percentage of .419.   

            Aase was taken out of the starting rotation after he gave up four runs to Detroit without getting anyone out in the first inning.  He was in the majors for 10 years after that, but never made another major league start.   In his four years as a starting pitcher Aase made 91 starts.  He was 30-32 as a starting pitcher, a .484 winning percentage, with a true winning percentage of .480.   So in his case, his won-lost record is a very, very accurate reflection of how well he actually pitched, with just a 4-point gap between his credited and his true winning percentages. 
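
            The gap itself is simple arithmetic: actual winning percentage minus true winning percentage, expressed in points (thousandths). A short sketch, with a hypothetical function name, using the records quoted above:

```python
def gap_in_points(wins, losses, true_win_pct):
    """Signed gap between record-book and true winning percentage.

    Positive means the won-lost record flatters the pitcher;
    negative means it undersells him.
    """
    actual = wins / (wins + losses)
    return round((actual - true_win_pct) * 1000)

career_gap = gap_in_points(30, 32, 0.480)   # as a starter, 1977-1980
season_1978 = gap_in_points(11, 8, 0.477)
season_1980 = gap_in_points(5, 13, 0.419)
```

            The career figure is the 4-point gap described above, the 1978 season is +102, and the 5-13 season comes out at -141, a case where the record makes him look worse than he pitched.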

            If everybody was off by just 4 points, I would have to amend my guess that won-lost records were probably 80% accurate to say that they were 99% accurate.  Unfortunately, such is not the case.  I’m not sure how to generalize about the data or sure if there is any value in generalizing about the data, but the 80% off-the-cuff estimate that I made on Twitter is a little bit generous.   I might go for 70%.

            There are many pitchers whose career won-lost records are accurate reflections of how well they pitched, ignoring the outside contributions that are poured into the pitcher’s record by circumstances.  Pitchers whose career True Winning Percentages are nearly identical to their record-book winning percentages include, in ascending order of wins as a starter (within my data): Juan Pizarro, Harry Brecheen, Ralph Terry, Alex Fernandez, Tim Lincecum, Harvey Haddix, Buddy Black, Earl Wilson, Mike Scott, Ryan Dempster, Pat Hentgen, Frank Lary, Virgil Trucks, Larry Dierker, Josh Beckett, Bret Saberhagen, Barry Zito, Felix Hernandez, Fernando Valenzuela, Larry Jackson, Billy Pierce, Tim Wakefield, David Cone, Don Drysdale, Pedro Martinez, John Smoltz, Luis Tiant, Randy Johnson, Tom Seaver, Don Sutton, Steve Carlton and Roger Clemens. 

            And many more; I abbreviated the list.  But there are a lot of pitchers who are just as good as their records tell you that they are. 

            And there are many who aren’t.  In tomorrow’s article, I’ll address the more general issues of how long it takes for winning percentages and True Winning Percentages to start to drift together. 

            Some reader of yesterday’s installment suggested I use the method I used in an article from August, 2017.  I think that is essentially what I have done.  This method is not EXACTLY the same as that method, but it is essentially the same method, re-tooled to answer a different question. 

 

 

 
 

COMMENTS (8 Comments, most recent shown first)

jgf704
Also, I would think that winning percentage would correlate to a more simplified game score that includes only IP and runs allowed.
2:23 PM Sep 17th
 
jgf704
Don't know if your database is set up this way, but... Instead of comparing the pitcher's performance to average (for the league and park), what if you compared him to the opposing pitcher's performance in that game? For example, in that first game, the opposing starting pitcher was Lary Sorensen, with a game score of 50. And so Aase would be +18 (rather than +23), and Sorensen would be -18 (instead of +5). So Aase would be credited with a slightly smaller +Wpct than before, while Sorensen's would be a lot smaller.

Not proposing to do this instead of what you did, but in addition. Basically, it would quantify the effect of the opposing pitcher.
10:25 PM Sep 15th
 
abiggoof
Evan, probably mostly true plus or minus defense and especially good or bad teams, but a few especially good or bad seasons could muck up the picture. The other day I was looking at some pitcher or other with a tremendous career, but an ERA that skyrocketed the last few years. A year with a 6 ERA can really drive up the career total — even after 12-15 years — but may be negligible on W-L. The “real” value is far better, not that we should discount the rotten egg year.
6:38 PM Sep 14th
 
evanecurb
Of the 100 major league pitchers who had 200 decisions and ERA+ between 98 and 102, just 14 of those had a W-L record of between .490 and .510. (98/200 = .490, 102/200 = .510).
10:54 AM Sep 14th
 
evanecurb
Typo: Norman was 104-103, not 101-103.
10:50 AM Sep 14th
 
evanecurb
I sometimes do a shorthand critique of W-L records by eyeballing the pitcher's W-L record vs. his ERA+. Mike Torrez, for example, had a 98 ERA+, which I would expect to be a tick below a .500 pitcher. His W-L record was 185-160, so his actual record exceeded expectations. Ned Garver was the opposite; a 112 ERA+ with a 129-157 record.

This method is definitely a shortcut and does not take the defense behind the pitcher into account. If it did, the discrepancies between Torrez's and Garver's expected and actual records would likely be even greater.

Rick Wise had an ERA+ of 101, W-L record of 188-181. Seems about right. Fred Norman was 101-103, ERA+ of 98. Also about right.

10:49 AM Sep 14th
 
Robinsong
It will be interesting if the reliability of win-loss record has decreased as starter innings have dropped. I would think that this is also a problem with Game Scores. Game Scores would be a less reliable predictor of the outcome of the game as starter innings drop.
9:40 AM Sep 14th
 
Manushfan
I remember the first Aase start in 77, first batter he faced was Von Joshua. He was a nice addition to the Sox rotation that summer. Later on he was a closer for the O's I think....

10:20 PM Sep 13th
 
 
©2024 Be Jolly, Inc. All Rights Reserved.|Powered by Sports Info Solutions|Terms & Conditions|Privacy Policy