Re-Starting Runs Saved Against Zero
I am back to work on the project of estimating how many runs have been saved (prevented) by each pitcher and each defensive player since 1900. I started back to work on it over the weekend of June 6-7. Fortunately, when I was working on it before, I saved all of the articles in one Word file, so that I was able to read through what I had done beginning to end. That is a good research practice which is quite unlike me; ordinarily I would have to chase down all 20-some articles and try to piece together the sequence.
Anyway, reading through the articles or as much of them as I could stand to read, I immediately spotted several major errors in my approach, as you sometimes will when you walk away from your work and can look at it with fresh eyes. Making these various mistakes, I had entangled myself in a jungle of bad assumptions and false starts, to such an extent that I was confused, and, for the last 5 or 7 articles, just kind of wandering around aimlessly, hacking through the underbrush to no particular purpose.
In the previous set of articles I created two very different ways to estimate a team’s runs saved, both of which were frankly terrible. In re-reading the material, one of my first thoughts was "There has to be a better way to establish Team Runs Saved, so what is it?" This led to a new and much better Team Runs Saved method. I would go so far as to say that the previous two methods were wrong, and this one is right.
I suppose what I should emphasize now is that the theoretical zero point is set so that, for an average team, runs saved are equal to runs scored. Supposing that an average team in a given set of circumstances would allow 750 runs, then the zero point is 1500 runs allowed, and the average team is credited with saving 750. The new figure for Team Runs Saved is calculated thus:
1) Figure the League Average of Runs Scored Per Inning,
2) Apply that to the Team’s Innings Pitched to get the League Average Expected Runs (LAXR),
3) Park-Adjust that,
4) To figure the expected Runs Allowed for the team (TXR),
5) Divide the team’s innings pitched by two,
6) And subtract that number from Team Expected Runs Allowed (TXR).
7) The resulting figure is the Park Run Adjustment, or PRA. The Park Run Adjustment can be either positive or negative, and averages zero in theory, although it is not exactly zero in fact. The PRA is a positive number in a hitter’s park, and a negative number in a pitcher’s park. It would average zero in fact if the historical run average for all teams was 4.50 runs. Since the historical run average for all teams is not 4.50 but 4.47, the average PRA is not zero, but -6.
8) The new figure for Team Runs Saved is Innings Pitched, plus the PRA, minus the runs allowed by the team. In other words, we are assuming that, in a neutral park in an average league, a team with zero pitching and zero defense would allow one run per inning, or 9 runs per 9 innings. We adjust this for the PRA—which is actually a park AND LEAGUE run adjustment, since what we were adjusting is the league norm—and subtract the number of runs that they allowed. The difference is the number of runs that they saved.
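The eight steps above can be restated compactly as formulas. The abbreviations LAXR, TXR and PRA are from the text; the other symbols are shorthand introduced here only for illustration: IP_team and IP_lg for team and league innings pitched, R_lg for league runs scored, PF for the park adjustment factor, and RA_team for the runs the team actually allowed.

```latex
\[
\begin{aligned}
\mathrm{LAXR} &= \mathrm{IP}_{team} \times \frac{R_{lg}}{\mathrm{IP}_{lg}} \\
\mathrm{TXR} &= \mathrm{LAXR} \times \mathrm{PF} \\
\mathrm{PRA} &= \mathrm{TXR} - \frac{\mathrm{IP}_{team}}{2} \\
\text{Team Runs Saved} &= \mathrm{IP}_{team} + \mathrm{PRA} - \mathrm{RA}_{team}
\end{aligned}
\]
```

Dividing innings pitched by two is the same as assuming 4.50 runs per nine innings, which is why the PRA measures the distance between the team's run context and that historical norm.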
For illustration, the 1968 St. Louis Cardinals—the team for which Bob Gibson posted a 1.12 ERA—allowed 472 runs in 4,437 thirds of an inning, which is 1,479 innings. The league totals were 5,577 runs scored with 44,043 thirds of an inning. Given their innings pitched, which were almost exactly one-tenth of the league’s total—it was a ten-team league—we would have expected the Cardinals to allow 562 runs scored (561.8407).
However, Busch Stadium in that season functioned as a pitcher’s park, with a park adjustment factor of .925439. Based on that, we multiply the league average expected runs (LAXR), 562, by the park adjustment (.925439) to get the expected runs allowed by the team, which, for the 1968 Cardinals, is 520 (519.949).
If they had been expected to allow 4.50 runs per game, which is assumed to be the average number, that would be 739.5 runs. Their expected runs allowed is only 520, which is a difference of 220 runs. The team’s Park Run Adjustment is -220 (-219.551). What we are saying, in English, is that this team is expected to allow fewer than 4.50 runs per game because of their park and league.
The Cardinals pitched 1,479 innings. Had they had zero pitching and zero defense in a historically neutral park, then, we theorize that they would have allowed 1,479 runs. Because of the park and league, we reduce that number by 220 runs, making 1,259. We theorize that, in this run condition, had they had zero competent pitching and zero defense, they would have allowed 1,259 runs.
In fact, they allowed only 472 runs. The difference is 787 runs—1259, minus 472. So the Cardinals pitching and defense in 1968 "saved" 787 runs. This is a good number. Among all teams in baseball history since 1900, it ranks in the top 20%. It’s not a GREAT pitching-and-defense combination; it’s a very good pitching-and-defense combination, working in a very low-run environment.
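The Cardinals arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation as described in the text; the function name and parameter names are invented for the illustration, and the inputs are the 1968 figures quoted above (innings are passed as thirds of an inning, the way the totals are given).

```python
def team_runs_saved(team_outs, team_runs_allowed, league_outs, league_runs, park_factor):
    """Return (LAXR, TXR, PRA, runs saved) following the eight steps in the text.

    team_outs and league_outs are thirds of an inning, as quoted in the article.
    """
    innings = team_outs / 3
    league_runs_per_inning = league_runs / (league_outs / 3)  # step 1
    laxr = league_runs_per_inning * innings                   # step 2
    txr = laxr * park_factor                                  # steps 3-4
    pra = txr - innings / 2                                   # steps 5-7
    saved = innings + pra - team_runs_allowed                 # step 8
    return laxr, txr, pra, saved


# 1968 St. Louis Cardinals, using the numbers quoted in the text.
laxr, txr, pra, saved = team_runs_saved(
    team_outs=4437,
    team_runs_allowed=472,
    league_outs=44043,
    league_runs=5577,
    park_factor=0.925439,
)
print(round(laxr, 1), round(txr, 1), round(pra, 1), round(saved))
```

Carrying the decimals through gives about 787.4 runs saved, which rounds to the same 787 reached above by working with the rounded figures.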
This method of establishing Runs Saved by a Team is clearly better than either of the two methods that I tried before in this series of articles. This method does extremely well on three critical tests of what the Team Runs Saved should be:
1) The average for a team is almost the same as the average runs scored by a team,
2) The standard deviation, also, is almost the same, thus fulfilling the essential condition that offense and defense are equal components in success, and
3) Good teams do dramatically better in this measurement than weak teams.
There are 2,550 teams in my data, or five groups of 510 teams. The top 510 teams, the 510 teams which we credit with saving the most runs, have an average won-lost record of 91-68. The bottom 510 teams have an average won-lost record of 62-88.
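For anyone who wants to run the same kind of checks on their own team-season data, here is a minimal sketch. The dictionary keys (`runs_scored`, `runs_saved`, `win_pct`) and the function names are assumptions made for this example, and the three sample rows are made-up placeholders that only show the call; they are not from the study.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def check_runs_saved(teams):
    """Summarize the three tests: matching means, matching spread, and
    how strongly runs saved tracks winning percentage."""
    scored = [t["runs_scored"] for t in teams]
    saved = [t["runs_saved"] for t in teams]
    wpct = [t["win_pct"] for t in teams]
    return {
        "mean_scored": mean(scored),
        "mean_saved": mean(saved),
        "sd_scored": pstdev(scored),
        "sd_saved": pstdev(saved),
        "corr_saved_wpct": pearson(saved, wpct),
    }


# Made-up placeholder rows, only to show the call; not from the study.
sample = [
    {"runs_scored": 650, "runs_saved": 640, "win_pct": 0.420},
    {"runs_scored": 700, "runs_saved": 700, "win_pct": 0.500},
    {"runs_scored": 750, "runs_saved": 760, "win_pct": 0.580},
]
report = check_runs_saved(sample)
print(report)
```

A method passes the first two tests when `mean_saved` and `sd_saved` land close to `mean_scored` and `sd_scored`, and does better on the third the higher `corr_saved_wpct` comes out.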
The essential difference between this method and the method that I started with is that it applies the park/league run adjustment AS A NUMBER, whereas the earlier method applied the adjustment AS A PERCENTAGE. Applying it as a percentage, the earlier method applied the run adjustment TWICE, essentially applying it to the runs that they would have allowed if they had had an average defense, and then applying it again to the runs they would have allowed above that number, if they had very bad pitching and defense. This method applies the run adjustment only once. The earlier method created very high Runs Saved numbers for teams playing in high-run contexts, thus concluding that the team which saved or prevented the most runs, in all of baseball history, was the 2000 Colorado Rockies.
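The difference between adjusting "as a number" and "as a percentage" can be seen in a small sketch. The additive function follows the new method as described. The multiplicative function is only my reading of the description above, a stand-in for the earlier approach and not its actual formula: it scales the one-run-per-inning baseline by the ratio of expected runs to the 4.50 norm. The high-run team is hypothetical; the Cardinals figures are the ones from the text.

```python
def zero_point_additive(innings, txr):
    # New method: shift the one-run-per-inning baseline by the PRA, a run count.
    return innings + (txr - innings / 2)


def zero_point_multiplicative(innings, txr):
    # ASSUMED stand-in for the earlier approach: scale the baseline by the
    # ratio of expected runs to the 4.50 norm. Algebraically this is 2 * txr.
    return innings * (txr / (innings / 2))


teams = {
    "1968 Cardinals (figures from the text)": (1479.0, 519.949, 472),
    "hypothetical high-run team": (1440.0, 900.0, 850),
}

for name, (innings, txr, runs_allowed) in teams.items():
    additive = zero_point_additive(innings, txr) - runs_allowed
    multiplicative = zero_point_multiplicative(innings, txr) - runs_allowed
    print(f"{name}: additive {additive:.0f}, multiplicative {multiplicative:.0f}")
```

Under this assumed percentage version, the hypothetical high-run team gains 180 runs saved (950 against 770) while the low-run Cardinals lose about 220 (568 against 787). That is the same direction of distortion described above, with teams in high-run contexts pushed toward the top of the list.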
The 2000 Rockies DID save a lot of runs, and they still do very well in the new method. They’re just not #1 anymore. Before, I was crediting them with saving 1,227 runs, with the average figure being 699. Now, we’re crediting them with saving 880 runs, with the average being 705.
There are two facts which convince me that this new method is "right", and that the old method was, frankly, wrong. One is the standard deviation of runs saved. The standard deviation of runs scored by a team within the data is 90. By the new method, the standard deviation of runs SAVED (prevented) by a team is 89. By the old method, it was 127. In other words, the old method was saying that, in winning games, preventing runs was much more important than scoring them. How I failed to pick this up in my original work, I don’t know. I should have picked it up sooner.
Also, the new method, despite the lower standard deviation, creates a much higher correlation between Runs Saved and Winning Percentage. The new method does a much better job of making the good teams look good and the bad teams look bad. So I’m pretty much convinced that this is the method I should have used before. Under the new method, these are the 20 teams in the study which prevented the most runs:
YEAR | City | Team | Lg | Team Runs Saved
1926 | Philadelphia | A's | AL | 1135
1905 | Chicago | Cubs | NL | 917
2017 | Cleveland | Indians | AL | 991
2019 | Houston | Astros | AL | 1041
2018 | Houston | Astros | AL | 932
1955 | Boston | Red Sox | AL | 1103
2007 | Boston | Red Sox | AL | 1052
1970 | Chicago | Cubs | NL | 1065
1973 | Baltimore | Orioles | AL | 906
2002 | Atlanta | Braves | NL | 903
1906 | Chicago | Cubs | NL | 797
1913 | New York | Giants | NL | 887
1997 | Atlanta | Braves | NL | 908
1991 | Toronto | Blue Jays | AL | 952
1929 | Philadelphia | A's | AL | 1049
2011 | Texas | Rangers | AL | 1026
1969 | Baltimore | Orioles | AL | 829
1957 | Brooklyn | Dodgers | NL | 977
1904 | Cincinnati | Reds | NL | 938
1993 | Atlanta | Braves | NL | 879
Of those 20 teams, all 20 had winning percentages of at least .545, and all were at least 14 games over .500. 15 of the 20 teams won at least 90 games, and ten of the 20 teams won more than 100 games. The top 45 teams on the list all had winning records.
Among the teams showing as having saved the FEWEST runs, almost all came from the strike-shortened 1981 and 1994 seasons. Among teams which played a full season, the teams which show as having saved the fewest runs are almost all 100-loss teams.
In the end, if you can be patient enough to let me get there (which I know some of you will not be, but I hope most of you will), you will see that this "Team Runs Saved" number acts as a "cage" from which the wild animals cannot escape. The wild animals are the individual category numbers. The number for strikeouts is a wild animal; the number for home runs allowed is a wild animal. The Team Runs Saved number is the cage from which these wild animals cannot escape.
In the end, the category numbers will act as "claim points" against the fixed boundary of the team runs saved. What the system is essentially saying is, "We know that the St. Louis Cardinals in 1968 saved 787 runs as opposed to a team which had no ability to prevent runs. Who will step forward to claim those runs?" We WANT the individual numbers on the 1968 Cardinals to add up to 787, as much as possible—but we know that they won’t. We’ll have to adjust them to make them add up. But that’s way down the road. After we have struggled long and hard to make everything add up right, we will adjust for the fact that it doesn’t. The Team Runs Saved number is the "control". It is what we will adjust to, when we get to that point.