Saving Private Runs
We move on now to the second stage of our project, which is to estimate how many runs have been saved by each player in major league history. The first thing that we need to do here is to estimate the number of runs saved by each team.
In order to do that, we need five pieces of data for each team:
The league number of outs recorded (Innings Pitched × 3),
The league number of runs scored,
The team number of outs recorded,
The Adjusted Park Factor for the team, and
The number of runs allowed by the team.
Given that, it’s not complicated. In 2019 there were 21,690.2 innings pitched in the American League, and 11,859 runs scored (12,018 runs allowed; there is a small discrepancy one way or the other.) Anyway, 21,690.2 innings is 65,072 outs. The Houston Astros had an outstanding pitching staff led by Justin Verlander and Gerrit Cole. They pitched 1,462.1 innings, or 4,387 outs, so they accounted for 4387/65072 of the league’s outs, or .067418. One fifteenth, basically. They thus could be expected to allow .067418 of the league’s runs allowed; .067418 times 12,018 is 810.2251, so they could have been expected to allow 810 runs.
Except that they pitched in a hitter’s park, so the expected runs allowed is higher than that. With their Adjusted Park Factor of 1.037, their expected runs allowed go up to 840.32.
The "zero point" for them is twice that number. If they had allowed twice that number of runs, that would be 1680.64 runs allowed. They actually allowed only 640 runs, or 1040.64 runs less than they theoretically might have allowed, had they had zero talent on their pitching staff and in their defensive play.
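The whole calculation can be sketched in a few lines of Python. This is only an illustration of the steps just described; the function name is mine, the inputs are the Astros figures from the text, and the result differs from the 1,040.64 above by a fraction of a run because the park factor is rounded to three decimals.

```python
def team_runs_saved(league_outs, league_runs_allowed,
                    team_outs, park_factor, team_runs_allowed):
    """Estimate a team's runs saved against the 'zero point'.

    The team's share of the league's outs, times the league's runs
    allowed, times the park factor, gives expected runs allowed; the
    zero point is twice that, and runs saved is the zero point minus
    the runs the team actually allowed.
    """
    expected = league_runs_allowed * (team_outs / league_outs) * park_factor
    zero_point = 2 * expected
    return zero_point - team_runs_allowed

# 2019 Astros, using the figures in the text (21,690.2 IP = 65,072 outs;
# 1,462.1 IP = 4,387 outs; park factor 1.037; 640 runs allowed):
saved = team_runs_saved(65072, 12018, 4387, 1.037, 640)
print(round(saved))  # -> 1040
```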
This is a high number. It is, in fact, the 19th highest of all time. In this system, you will have a high number if
(a) You have good pitching and defense, and
(b) You play in a hitter’s park.
You will have a low number of runs saved against zero if
(a) You have bad pitching and defense, and
(b) You play in a pitcher’s park.
Later, at the end of the process, we will take the park effects back out of it, and put teams in a high-run environment on an equal footing with teams in a low-run environment. But you have to figure RUNS saved first, before you can park-adjust the runs.
Think of all of the greatest pitching staffs you can remember or you know about—the 1954 Cleveland Indians, the 1965 Dodgers, 1968 Cardinals, 1971 Orioles, the 1986 Mets, 1998 Braves, the 2011 Phillies, the 2017 Dodgers. Basically, very few or none of those teams are going to show up on the list of the teams saving the most runs, because almost all of them pitched in pitcher-friendly environments. The 2011 Phillies, although they are in the top 20% of teams by runs saved, actually saved fewer RUNS than the 1930 Phillies, the famous team that finished 52-102 despite gargantuan hitting achievements. The 1930 Phillies played in a league in which the average team scored 878 runs, and in a park that inflated the expectation by another 12%, so they had an expectation of allowing 1022 runs, and a "zero-talent based" expectation of 2044 runs. They actually allowed 1,199 runs, 177 more than expected, but 845 less than they potentially might have allowed. The 2011 Phillies played in a league in which the average team scored 668 runs, park neutral, so they had an expectation of allowing 668 runs; actually 676 if you watch the decimal points better. Twice that would be 1,352 runs; they actually allowed 529 runs, 147 less than expected, but only 823 less than their zero point. We’ll take the park and league out of it at the end of the road.
So, the ten teams which saved the most runs, compared to the zero point, are:
Year  City          Team       Lg  Exp RA  RA   Team Runs Saved  W   L
2000  Colorado      Rockies    NL  1062    897  1227             82  80
1900  Boston        Braves     NL   944    739  1150             66  72
1932  Philadelphia  A's        AL   944    751  1137             94  60
1926  Philadelphia  A's        AL   852    569  1135             83  67
1936  Boston        Red Sox    AL   946    764  1129             74  80
1930  Chicago       Cubs       NL   994    870  1117             90  64
1930  St. Louis     Cardinals  NL   948    784  1112             92  62
1955  Boston        Red Sox    AL   878    652  1103             84  70
1936  Cleveland     Indians    AL   970    862  1077             80  74
1970  Chicago       Cubs       NL   872    679  1065             84  78

And the 10 teams which saved the FEWEST runs, compared to the zero point, are:
Year  City          Team      Lg  Exp RA  RA   Team Runs Saved  W   L
1904  Washington    Senators  AL  542     743  341              38  113
1981  Cleveland     Indians   AL  392     442  342              52   51
1968  Washington    Senators  AL  509     665  352              65   96
1915  Philadelphia  A's       AL  621     888  354              43  109
1981  San Diego     Padres    NL  405     455  355              41   69
1908  New York      Yankees   AL  534     710  359              51  103
1981  Pittsburgh    Pirates   NL  394     425  364              46   56
1907  Washington    Senators  AL  527     690  365              49  102
1909  Washington    Senators  AL  512     655  368              42  110
1981  Texas         Rangers   AL  381     389  372              57   48

Saving only 341 runs in a full season is equivalent to scoring only 341 runs over the course of a season. The 1908 St. Louis Cardinals scored only 371 runs in 154 games, which I think is the lowest ever. Parallel accomplishments.
I’d spend more time on these lists of teams, but it’s not that significant an accomplishment. On average, the teams that ALLOW more runs also SAVE more runs, because the variation in runs caused by the combination of the park and era is larger than the variation caused by the performance of the team. The 510 teams saving the most runs had an average expectation of 797 runs allowed, allowed an average of 712, and thus saved an average of 881:

                         Exp RA  RA   Team Runs Saved  W   L
Most Team Runs Saved     797     712  881              86  71
Next Most                733     702  764              82  76
Average                  693     690  696              80  78
Not too Many Runs Saved  659     687  632              76  82
Fewest Runs Saved        599     673  525              67  84

The bottom 510 teams had an expectation of allowing 599 runs, actually allowed 673, and thus saved an average of 525.
What we have to do now is to make the categories add up. We have to figure how many runs were saved by strikeouts, how many were saved by control (not walking people), how many were saved by catchers preventing the running game, how many were saved by double plays, etc., and we have to make those numbers add up to the runs saved by the team, with a reasonable margin of error. If we can do that, then we will have succeeded. If we can’t, then the effort is a failure.
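As a sketch of what "adding up" means (my illustration, not the actual accounting used here; the category names and numbers below are invented), the check might look like:

```python
def reconcile(category_estimates, team_runs_saved, tolerance=0.05):
    """Check whether independent category estimates add up to the team total.

    category_estimates: dict of category name -> estimated runs saved.
    Returns (total, error), where error is the relative miss against the
    team figure; the process 'succeeds' if every team's miss is within
    a reasonable tolerance.
    """
    total = sum(category_estimates.values())
    error = abs(total - team_runs_saved) / team_runs_saved
    return total, error

# Hypothetical category breakdown for a team credited with 1,040 runs saved:
cats = {"strikeouts": 210, "control": 160, "hits on balls in play": 420,
        "home runs": 130, "double plays": 45, "running game": 30,
        "one-base mistakes": 20}
total, error = reconcile(cats, 1040)
print(total, round(error, 3))  # a miss of about 2.4% here
```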
Sort of a separate article here. I’ve made some changes/amendments/adjustments to the process here, which I need to explain. I think there are four changes that I need to announce, or admit to, or something.
1) Responding to W. T. Mons’ comment:
Baseball-Reference.com has caught-stealing data for the pre-1920 years. I believe it came from Pete Palmer, although where he got it I couldn't say. Anyway, it shows the 1900 Phillies allowing 273 steals and throwing out 185 runners.
I wasn’t aware that that data was there. I have now copied all of that data into my spreadsheets, and I will use that going forward. Not only going forward, but going backward as well; I’ll have to recalculate the stolen base stuff that I published a few days ago, which unfortunately is not the end of the damage control; having the additional data will also force me to recalculate expected Double Plays for each team, which means that I will have to recalculate Estimated Runners on First Base for each team. It’s a pain in the ass, but you know; you just have to do those things.
2) Following the query about whether I was using Decade Norms based on each calendar decade or on "rolling averages", I rethought that issue, and I think it might be better to go in the other direction. . . move it over to rolling averages. I haven’t done that yet. I’m not sure which one I am going to do, actually. There are obvious advantages to doing the rolling average, but at some point I am going to have to figure, for example, how many runs were saved by Bill Dickey in 1937. When I do that I will have to copy the "background level" for each player into every column. As it is set up now I have 12 "background levels", one for each decade; actually there are 132 of them, because there is a background level for each of 11 categories. If I change to a rolling average, then there are 120 background levels to work with, which is actually 1,320. It becomes a logistical struggle to keep track of all of those—and it doesn’t REALLY make that much difference as a practical matter; a player might once every so often gain or lose one run saved because we changed the background level, but that’s all, and it’s just an estimate, anyway; it’s not a hard fact. Now I think I have talked myself out of making the change.
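Concretely, the two schemes differ like this. The sketch below is my illustration only; the function names and the league rates are invented, not taken from the actual spreadsheets:

```python
def decade_norm(rates_by_year, year):
    """Background level = average rate over the year's calendar decade."""
    start = (year // 10) * 10
    vals = [r for y, r in rates_by_year.items() if start <= y < start + 10]
    return sum(vals) / len(vals)

def rolling_norm(rates_by_year, year, window=5):
    """Background level = average rate over year +/- `window` seasons."""
    vals = [r for y, r in rates_by_year.items()
            if year - window <= y <= year + window]
    return sum(vals) / len(vals)

# Made-up league rates drifting upward across 1930-1944:
rates = {y: 3.0 + 0.05 * (y - 1930) for y in range(1930, 1945)}
print(decade_norm(rates, 1937))   # averages 1930-1939
print(rolling_norm(rates, 1937))  # averages 1932-1942, centered on 1937
```

With a drifting rate, the rolling norm tracks the environment of the season itself, while the decade norm lags or leads depending on where in the decade the season falls.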
It’s a case of "the perfect is the enemy of the good." I have limited data-processing skills, always have had. What is important to me is the concept. If the idea works, if it attracts an audience, somebody who has better programming skills can straighten out the details later.
3) I have decided that, for purposes of this stage of the analysis, I am going to combine Wild Pitches, Passed Balls and Balks into one category, which we can call "OBM" or "One Base Mistakes". I thought about doing this earlier, but I was troubled by the fact that I’ll have to unbundle them later on to attribute them to individual players. But now I’ve decided that attributing them to individual players later on isn’t a big problem.
Individually, one team’s Passed Ball number is 8 standard deviations from the norm, and one team’s Balk number I think is 6 standard deviations from the norm. It’s kind of like a left fielder who sets up in left field 750 feet from the batter. The data is normally 250-280 feet, but there’s this one guy who sets up 750 feet away. Data that is that far out in left field is difficult to deal with, particularly when you have to establish the norms as a first step. I’m hoping (and expecting) that combining the data will normalize it somewhat.
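The hope that combining the categories tames the outliers can be shown with made-up numbers (the figures below are invented for the example, not real team data): a freakish passed-ball total sits further from its norm, in standard deviations, than the same team does once wild pitches and balks are folded in.

```python
from statistics import mean, pstdev

def z_score(values, i):
    """How many standard deviations value i sits from the group mean."""
    return (values[i] - mean(values)) / pstdev(values)

# Invented ten-team league; the last team has a freakish passed-ball total.
pb = [6, 7, 8, 5, 7, 6, 8, 7, 6, 30]
wp = [35, 50, 42, 48, 38, 44, 52, 36, 47, 40]
bk = [5, 6, 7, 4, 6, 5, 7, 6, 5, 6]
obm = [p + w + b for p, w, b in zip(pb, wp, bk)]  # "One Base Mistakes"

print(round(z_score(pb, 9), 2))   # the outlier, passed balls alone
print(round(z_score(obm, 9), 2))  # same team, combined category: milder
```

The ordinary spread of wild pitches dilutes the passed-ball spike, so the combined category is easier to set norms against.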
4) I realized that, in entering the "league" data, I had entered the league HITTER’S data in some areas in which I should have entered the league PITCHER’S data—hits by the hitters, rather than hits allowed by pitchers, for example. It doesn’t make any real difference; it’s the same one way or the other until interleague play, and not very much different after that, usually much less than a 1% discrepancy, but still . . . I fixed those.
The consequence of these four adjustments is that many of the findings I have previously reported here are now outdated, or, if you prefer, incorrect; they are not necessarily what the data would show if evaluated today.
It’s a normal part of research. You always find things that you didn’t do right early in the study, or data that you had missed, or data problems that you hadn’t planned for. It always happens; you always have to go back and redo things.