More Data
I Must Have More Data
OK, my son was home over the Christmas season, and while he was at home he updated some files that he created for me eight or ten years ago. The files analyze game data from Retrosheet (retrosheet.org) and create game logs for starting pitchers. The last time these files were updated—2014—I had game logs for the years 1952 to 2014, not every game but most every game. The updated system gives me logs for the years 1921 to 2018, and also more data about each game—how many doubles and triples the pitcher allowed in each game, for example, and how many stolen bases he allowed and caught stealing. I had about 240,000 pitcher starts in the log before, maybe 250,000, I don’t remember, but anyway now I have 329,988. I spent about a month converting his files into the files I need, and since then I have been studying things, mostly based on Game Scores.
I have 82 pages of articles, essays, comments, explanations, etc. that I have written based on this data, with the intention of sharing that with you. I don’t want to dump 82 pages on you today, because I don’t know that anybody would read it. I’ll start publishing it at a rate of seven pages a day, five days a week; should last until the end of the month, more or less.
I know that when I do a long series of articles people kind of stop reading them after a while, but I do hope that you’ll come back and read the last article of the series, "A Conclusion in Regard to Baseball Reference WAR", which should run on February 28, I think. Thanks.
Coverage
I have 79% coverage for 1921, 79% for 1922, 81% for 1923, 82% for 1924, 81% for 1925, 83% for 1926, 85% for 1927, etc. All of this is courtesy of Retrosheet.org, which is the third-greatest American Institution, after the Smithsonian and the Kansas University basketball team.
Anyway, the percentage drops as low as 69% in 1944, and goes to basically 100% in 1958. After 1958 there’s an occasional game missing. We can’t accurately evaluate all pitchers from 1922 or 1935 or something, because we are missing some games for some pitchers.
Average Game Scores by Pitcher
We have a Game Score for each pitcher in each game. The simplest way to proceed toward a performance summary for each pitcher in each season is simply to look at the average Game Scores. I’ve probably done this for you before, with less data, but the highest Average Game Score for any pitcher in the data is 76.1, for Bob Gibson in 1968:
Year | First | Last | St# | Year Avg
1968 | Bob | Gibson | 34 | 76.1
1968 | Luis | Tiant | 32 | 72.4
1965 | Sandy | Koufax | 41 | 71.8
1971 | Tom | Seaver | 35 | 70.9
1997 | Pedro | Martinez | 31 | 70.8
These are the highest averages for pitchers with 30 or more starts in a season, within my data. Let’s do the top 10:
Year | First | Last | St# | Year Avg
1968 | Bob | Gibson | 34 | 76.1
1968 | Luis | Tiant | 32 | 72.4
1965 | Sandy | Koufax | 41 | 71.8
1971 | Tom | Seaver | 35 | 70.9
1997 | Pedro | Martinez | 31 | 70.8
1985 | Dwight | Gooden | 35 | 70.4
1971 | Vida | Blue | 39 | 69.9
1963 | Sandy | Koufax | 40 | 69.8
1946 | Hal | Newhouser | 30 | 69.5
1972 | Steve | Carlton | 41 | 69.3
Oh, hell, let’s do the top 20:
Year | First | Last | St# | Year Avg
1968 | Bob | Gibson | 34 | 76.1
1968 | Luis | Tiant | 32 | 72.4
1965 | Sandy | Koufax | 41 | 71.8
1971 | Tom | Seaver | 35 | 70.9
1997 | Pedro | Martinez | 31 | 70.8
1985 | Dwight | Gooden | 35 | 70.4
1971 | Vida | Blue | 39 | 69.9
1963 | Sandy | Koufax | 40 | 69.8
1946 | Hal | Newhouser | 30 | 69.5
1972 | Steve | Carlton | 41 | 69.3
1924 | Dazzy | Vance | 32 | 69.3
1966 | Sandy | Koufax | 41 | 69.1
1978 | Ron | Guidry | 35 | 68.9
1946 | Bob | Feller | 32 | 68.8
1968 | Denny | McLain | 41 | 68.8
1969 | Bob | Gibson | 35 | 68.3
1972 | Gaylord | Perry | 40 | 68.2
1999 | Randy | Johnson | 35 | 68.1
1968 | Dave | McNally | 35 | 68.1
1966 | Juan | Marichal | 36 | 68.0
You’re probably wondering why I do things like that, aren’t you? It’s because I want you to actually read the list. If I just give you a list of 20 pitchers, you’ll just look at the top five. But if I give you the top five, let you look at them, then give you five more, then you’ll look at the next five. I’m relying on my understanding of how you process information to try to get you to take in a little more information.
Of course, this is a very simple approach to the problem, and there are 50 different things "wrong" with it. There are 50 different reasons why this is not a reliable list of the best pitcher/seasons in my data. We’re going to attack that list of problems one by one, making the list more reliable and more reliable by making adjustments for biases in the data.
But let me point out before I do that: This is not a bad list. Gibson in 1968, Koufax in 1963, 1965, and 1966, Ron Guidry in 1978, Vida Blue in 1971, Doc Gooden in 1985, Pedro Martinez in 1997, Steve Carlton in 1972. . . these are the greatest pitching seasons in baseball history, or among them. We’re going to make the list better, but we’re not starting from zero. The process, even at this naïve level, will find the Cy Young Award winner most of the time or half of the time or something. It’s pretty good.
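For anyone who wants to reproduce this kind of list from their own game logs, here is a minimal sketch of the naive ranking: group starts by pitcher and season, average the Game Scores, and keep the seasons with enough starts. The row layout, the pitcher names and the scores are made up for illustration; they are not my actual files.

```python
from collections import defaultdict

# Hypothetical game-log rows: (year, pitcher, game_score). Values are made up.
starts = [
    (1968, "Pitcher A", 80), (1968, "Pitcher A", 70), (1968, "Pitcher A", 75),
    (1968, "Pitcher B", 60), (1968, "Pitcher B", 50),
]

def season_averages(rows, min_starts=30):
    """Return (year, pitcher, starts, average) tuples, best average first."""
    totals = defaultdict(lambda: [0, 0])  # (year, pitcher) -> [starts, score sum]
    for year, pitcher, score in rows:
        entry = totals[(year, pitcher)]
        entry[0] += 1
        entry[1] += score
    table = [
        (year, pitcher, n, round(total / n, 1))
        for (year, pitcher), (n, total) in totals.items()
        if n >= min_starts  # drop seasons with too few starts in the data
    ]
    return sorted(table, key=lambda row: row[3], reverse=True)

# The toy data has only a handful of starts, so lower the threshold here.
for row in season_averages(starts, min_starts=2):
    print(row)
```

The 30-start threshold is the one used for the lists above; with incomplete coverage in the early years, a pitcher can fall under it simply because some of his games are missing.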
Margins Above 50
Of course, to average a Game Score of 56 in 40 starts is different from a Game Score average of 56 in 30 starts. It has a different impact on the won-lost record of the team.
We can adjust for this by measuring instead the pitcher’s margin above or below average, assuming the average to be a Game Score of 50. If a pitcher has a Game Score of 86, we record that as +36, meaning that, in that game, he is 36 points above average. His value for the season is his total above average. That moves Sandy Koufax, 1965, ahead of Gibson and Tiant, into the number one spot:
Year | First | Last | Margin
1965 | Sandy | Koufax | 892
1968 | Bob | Gibson | 886
1963 | Sandy | Koufax | 793
1972 | Steve | Carlton | 792
1966 | Sandy | Koufax | 783
1971 | Vida | Blue | 775
1968 | Denny | McLain | 771
1971 | Tom | Seaver | 732
1972 | Gaylord | Perry | 728
1968 | Luis | Tiant | 717
Same pitchers, just a little different order. By the way, the WORST pitcher with 30 or more starts was Jose Lima in 2005. Lima was 5-16 with a 6.99 ERA. His average Game Score was 38.2, and he was 377 points below average (below 50) for the season. The -377 isn’t the worst season in the data; it’s the second-worst. Claude Willoughby in 1930 made 24 starts for the 1930 Philadelphia Phillies, finishing 4-17 with a 7.59 ERA. We only have 18 of those 24 starts in our data, but his total for those 18 starts was -406.
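The margin calculation in this section is simple enough to write down. This is a minimal sketch, with a function name and a default baseline that I have picked for illustration: sum each start's distance from 50.

```python
def season_margin(game_scores, baseline=50):
    """Total points above (or below) the baseline across a season's starts."""
    return sum(score - baseline for score in game_scores)

# A single Game Score of 86 counts as +36.
print(season_margin([86]))

# The same 56 average is worth more over 40 starts than over 30.
print(season_margin([56] * 40))
print(season_margin([56] * 30))
```

Because the margin is a sum rather than an average, it rewards workload: the 40-start pitcher in the example finishes 60 points ahead of the 30-start pitcher with the identical average.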
Margins Above a Truer Average
This method (above) assumes that the average Game Score is 50. Of course, the actual average Game Score is (a) higher in some seasons than in others, and (b) influenced by the park. To move forward from this point, we need to remove those biases from the data. Our list is dominated by 1963-1972 pitchers because those were pitching-dominated years.
To remove those biases, we begin by figuring the average Game Score at home and on the road for every team in the data. For example, the 1923 Philadelphia Phillies pitchers had an average Game Score, in their home games within my data, of 33.6. They had the worst pitching staff in the league, park-adjusted, and they also had a park factor of 144. The combination made the average Game Scores of their starting pitchers, for the season, 39.0: 33.6 at home, and 44.4 on the road. The 1968 Cleveland Indians had an average starting pitcher Game Score of 61.6, the highest in the data: 61.5 at home, and 61.8 on the road. Their starting rotation was Luis Tiant (21-9, 1.60 ERA), Sam McDowell (15-14, 1.81), Sonny Siebert (12-10, 2.97), Stan Williams (13-11, 2.50), and Steve Hargan (8-15, 4.15).
I figured the Average Game Score for every team’s pitchers, home and road, and also and equally important, the Average Game Score for every team’s opposing pitchers, home and road. Based on that data, we can calculate a Park Effect for each park in each season. These park effects work backwards from traditional park effects; that is, a hitter’s park leads to low Game Scores for the pitchers, thus a park effect below 100, whereas a pitcher’s park leads to high Game Scores, thus a park effect above 100. Also, of course, park effects derived from Game Scores are not a perfect linear match with Park Effects derived from runs scored.
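I have not spelled out the park effect formula here, so treat the following as one plausible reading rather than the actual calculation: take the average Game Score of both teams' starters in a team's home games, divide by the same average in its road games, and multiply by 100. The function name, the input layout and the numbers are all assumptions made for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def game_score_park_effect(home_games, road_games):
    """Assumed formula: 100 * (avg Game Score at home) / (avg Game Score on road).

    Each game is a pair (own_starter_game_score, opposing_starter_game_score),
    so both teams' starters count, as the text describes.
    """
    home_avg = mean([gs for game in home_games for gs in game])
    road_avg = mean([gs for game in road_games for gs in game])
    return 100 * home_avg / road_avg

# Made-up games for a team in a hitter's park: low scores at home.
home = [(34, 40), (33, 45)]
road = [(44, 52), (45, 55)]
print(round(game_score_park_effect(home, road), 1))
```

Under this reading a hitter's park comes out below 100 and a pitcher's park above 100, which is the backwards direction described above.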
With the park effects and the team averages, we can calculate the "actual" or "true" average for every game; that is, the expected Game Score for the start, based on the team and the park. This also adjusts for year-to-year differences, of course, since each team is within a season. The highest expected Game Score for any game(s) in the data is 63.48, which is the Expected Game Score against the 1965 New York Mets for their games played in Dodger Stadium. The Mets had a terrible offense, scoring just 495 runs that season despite playing in a hitter’s park. Dodger Stadium had a Park Effect of 76. The combination would be expected to yield very high average Game Scores. The second-highest average for any situation, 63.22, would be the 1965 Mets playing in the Astrodome.
On the other end of the spectrum, the lowest expected Game Score would be for the pitcher facing the 1936 New York Yankees in Sportsman’s Park in St. Louis. The 1936 Yankees had a famously intimidating offense, which scored 1,065 runs, and Sportsman’s Park was the best hitter’s park in the league, with a Park Effect of 120. (That is the park run effect; the park effect on Game Scores was .882.) The combination yields an expected Game Score, facing the 1936 Yankees in Sportsman’s Park, of 35.22.
OK, the expected Game Score can go as high as 63.48 or as low as 35.22, but that’s very unusual. Of the 329,988 games in my data, the expected Game Score was higher than 60 for only 1,134, and lower than 40 for only 454. More than 99.5% of the time, the expected Game Score was between 40 and 60. For 84% of games, the expected Game Score is between 45 and 55. It hangs around 50.
Applying this to Pitchers
The New York Mets played a double header against the Los Angeles Dodgers in Dodger Stadium on June 20, 1965, facing Sandy Koufax in the first game and Don Drysdale in the second. Koufax pitched a 1-hitter with 12 strikeouts. The one hit was a home run, but he won the game, 2-1. In the second game Drysdale pitched a complete game as well, but gave up 9 hits and 3 runs, all earned, and the Mets beat him, 3-2.
Koufax had a Game Score of 91, which would be +41 if compared to 50, but when you adjust for the fact that he is facing the 1965 Mets in Dodger Stadium, it isn’t +41; it is merely +27.52. Drysdale had a Game Score of 56, which would be a good game in a neutral situation (+6) but is actually a poor game under the circumstances, now scoring at negative 7.48.
On the other hand, Ivy Andrews of the Browns faced Joe DiMaggio, Lou Gehrig and company at Sportsman’s Park in St. Louis on July 22, 1936. He pitched a complete game but gave up 10 hits and 5 walks, had no strikeouts but managed to beat the Yankees 6-5. Giving up 10 hits and 5 runs in a game, no strikeouts, would not ordinarily be considered a strong performance, and yields a Game Score of only 42. Compared to a neutral average he would be -8, but considering that he was facing one of the greatest offenses of all time in a bandbox ballpark, it’s actually pretty good. Compared to expectations, it is +6.78.
We will treat Drysdale’s contribution to the team, then, at -7.48—he did lose the game, after all—and Ivy Andrews at +6.78.
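The arithmetic for those three starts is just the Game Score minus the expected Game Score for that opponent in that park. Here is a minimal sketch using the numbers quoted above; the lookup table and its keys are a hypothetical layout, not the way my files are actually organized.

```python
# Expected Game Scores quoted in the text, keyed by (opponent, park).
# The key labels are made up for this example.
EXPECTED = {
    ("1965 Mets", "Dodger Stadium"): 63.48,
    ("1936 Yankees", "Sportsman's Park"): 35.22,
}

def adjusted_margin(game_score, opponent, park, expected=EXPECTED):
    """Game Score minus the expected Game Score for this opponent and park."""
    return round(game_score - expected[(opponent, park)], 2)

print(adjusted_margin(91, "1965 Mets", "Dodger Stadium"))       # Koufax
print(adjusted_margin(56, "1965 Mets", "Dodger Stadium"))       # Drysdale
print(adjusted_margin(42, "1936 Yankees", "Sportsman's Park"))  # Andrews
```

A pitcher's season total is then the sum of these per-game margins, in place of the sum of margins above a flat 50.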
OK, now we can recalculate each pitcher’s contributions to the success of his team, game by game. The #1 season in our data is no longer Koufax or Gibson; it is now Pedro Martinez in the year 2000. Pedro made just 29 starts; he was 18-6 with a 1.74 ERA, struck out 284 batters in 217 innings, and walked only 32. Compared to expectations game by game, he was +796 points. These are the top three seasons in my data:
Year | First | Last | Margin
2000 | Pedro | Martinez | 795.7
1999 | Randy | Johnson | 699.9
1999 | Pedro | Martinez | 681.4
And these are the top six seasons in my data:
Year | First | Last | Margin
2000 | Pedro | Martinez | 795.7
1999 | Randy | Johnson | 699.9
1999 | Pedro | Martinez | 681.4
1965 | Sandy | Koufax | 679.7
1997 | Roger | Clemens | 659.9
1997 | Pedro | Martinez | 648.4
Sandy Koufax in 1965 still does very well; his is still the fourth-best season in the data, among 16,000+ pitcher/seasons. But the most dominant pitcher is no longer Koufax or Gibson, from the 1960s; it is now Pedro Martinez. Martinez had amazing numbers, pitching in Fenway Park in seasons in which the league ERA was close to 5.00.
These are the top 51 seasons in my data, arranged chronologically by pitcher:
Year | First | Last | Margin
1924 | Dazzy | Vance | 646
1928 | Dazzy | Vance | 495
1931 | Lefty | Grove | 511
1937 | Lefty | Gomez | 529
1939 | Bob | Feller | 553
1940 | Bob | Feller | 583
1939 | Bucky | Walters | 500
1953 | Robin | Roberts | 496
1963 | Sandy | Koufax | 577
1965 | Sandy | Koufax | 680
1966 | Sandy | Koufax | 613
1965 | Juan | Marichal | 508
1966 | Juan | Marichal | 505
1968 | Bob | Gibson | 600
1969 | Bob | Gibson | 505
1968 | Luis | Tiant | 511
1971 | Vida | Blue | 606
1971 | Tom | Seaver | 584
1973 | Tom | Seaver | 551
1972 | Steve | Carlton | 614
1980 | Steve | Carlton | 562
1972 | Gaylord | Perry | 534
1974 | Gaylord | Perry | 518
1973 | Nolan | Ryan | 569
1977 | Nolan | Ryan | 527
1978 | Ron | Guidry | 585
1985 | Dwight | Gooden | 612
1986 | Mike | Scott | 594
1986 | Roger | Clemens | 516
1997 | Roger | Clemens | 660
1998 | Roger | Clemens | 518
1993 | Randy | Johnson | 497
1995 | Randy | Johnson | 582
1997 | Randy | Johnson | 552
1999 | Randy | Johnson | 700
2000 | Randy | Johnson | 594
2001 | Randy | Johnson | 647
2002 | Randy | Johnson | 616
2004 | Randy | Johnson | 568
1995 | Greg | Maddux | 537
1997 | Pedro | Martinez | 648
1999 | Pedro | Martinez | 681
2000 | Pedro | Martinez | 796
2001 | Curt | Schilling | 497
2002 | Curt | Schilling | 516
2004 | Johan | Santana | 517
2006 | Johan | Santana | 503
2009 | Zack | Greinke | 497
2011 | Justin | Verlander | 521
2015 | Clayton | Kershaw | 530
2017 | Corey | Kluber | 514
I made it 51 so that I could sneak a second Dazzy Vance season onto the list. He makes the list twice although we are missing 2 starts for him in 1924 and one in 1928, so I figured he should catch a break.
Now we have an opposite problem. Whereas before the bias of the list favored a pitcher working in a pitcher’s park in a pitcher’s era, it now favors a pitcher working in a hitter’s park in a hitter’s era. Why? Because the average Game Scores are lower, which creates more "space" for the superior pitcher to work in, a larger canvas for him to paint on. When you make the scores larger, you make the differences between pitchers larger.
That’s a smaller problem than the other one that we had. This is LESS of a bias than the other bias, but it is still some bias. We’re making progress. We’re working on it.
The worst pitcher/season in the data? Still Jose Lima in 2005, at -330.8.