The High Cost of the Free Pass
Before I get to the real work to be done today, I needed to spend a couple of paragraphs updating you about the amendments outlined before.
In terms of double plays, some of the numbers changed a little bit when I recalculated Estimated Runners on First Base, but not really; all of the teams which were best at turning the double play or worst at turning the double play are still evaluated essentially the same.
In terms of stolen base defense, more surprisingly, the 1920 Braves hold on to the #1 spot of all time, regardless of how the question is analyzed, despite the addition to the data set of the caught stealing numbers from 1900 to 1919. A few of the 1900-1919 teams are among the Top 10 or Top 20 in terms of turning the opposition’s running game into a weapon for the defense, but the top 6 or 8 are the same on both ends.
I have combined Wild Pitches, Passed Balls and Balks into one category called "One Base Mistakes". Now that I have done it I am not sure what the practical benefit of that is; it just makes the data look stable and useful, rather than flukish. The best team ever at avoiding One Base Mistakes was the Baltimore Orioles in the strike-shortened 1994 season (Mike Mussina, Ben McDonald, Jamie Moyer, catcher Chris Hoiles.) They had 18 Wild Pitches (the fewest in baseball), one balk (tied for the fewest in baseball; the average was 6) and 5 Passed Balls (tied for fewest in the American League, although one National League team had only 3.) They finished 63-49, in second place behind the Yankees at the time that the calendar stopped. The second-best team ever in terms of avoiding one-base mistakes was the 1977 Red Sox team that I wrote about before in this series (Tiant, Fergie Jenkins, Carlton Fisk, etc.) The worst team ever was either the 1987-1988 Texas Rangers (in terms of the raw number of mistakes) or the 1936 Philadelphia A’s (if the raw number of mistakes is compared to period norms.) Combining them does normalize the data to an extent; the best teams (1994 Baltimore and 1977 Boston) are 2.6 standard deviations better than the norm, while the worst (1988 Texas and 1936 Philadelphia) are 4.8 standard deviations worse than the norm.
In terms of changing from a decade norm to a rolling-decade norm, I have not done that yet, although it is kind of looking like I might have to.
OK, walks allowed. The 1932 Cincinnati Reds walked only 276 batters; the 1915 Philadelphia A’s walked 827. The theory of this research is that the Cincinnati Reds prevented a lot of runs by not walking people. The 276 walks is not a record low; the 1933 Reds walked only 257, but the issue here is how many batters they DIDN’T walk, that they might conceivably have walked if they had been Ryne Duren or Bobby Witt or somebody. Steve Dalkowski. The 1932 Cincinnati Reds faced 220 more batters than they did in 1933. How many of those batters they MIGHT have walked, had they been Steve Dalkowski, varies as to whether we assume that the wildest pitcher is 3 standard deviations below the norm, 4 standard deviations below the norm, or 5 standard deviations below the norm, but in any case, it is enough that we regard the 1932 team as having prevented more walks than the 1933 team. Or any other team, such as the 1918 New York Giants, who walked only 228, but that was a war-shortened season, or the 1904 Red Sox, who walked only 233, but the league norms were lower in 1904 than in 1932, lower in the Cy Young era than in the Carl Hubbell era.
The Reds pitching staff in that era was led by the Nashville Narcissus, Red Lucas. Lucas was actually a pretty decent pitcher; in 1932 he was just 13-17 but had a 2.84 ERA in 269 innings. Lucas, when I first became a baseball fan, held the record for career pinch hits and pinch hitting at bats. He was a career .281 hitter who was used as a pinch hitter more than 500 times in his career. In early 1933 the Reds traded for Paul Derringer, who went 7-27 that season despite a 3.30 ERA (the league ERA was 3.34). The Reds were terrible in that era; they finished last every season from 1931 to 1934. It was miles to the outfield fences; there were very few home runs in that park, so most of their staff was just guys who weren’t actually major league pitchers; they just went out and threw the ball over the plate and let people hit it. This worked out sort-of-OK at home, but very poorly on the road; in 1933 they won 37 games at home but only 21 on the road. And, as I mentioned, they had no center fielder; in 1933 their center field player was Chick Hafey, who had played left field for the Cardinals for years, but had been traded to Cincinnati when he got to be 29 years old because that’s what Branch Rickey believed in; he always wanted to trade his players away before their value collapsed.
Anyway, how many batters did the Cincinnati Reds NOT walk, that they might reasonably have walked? It depends on whether we set the bar at 3 standard deviations below the norm, 4 standard deviations, 5 standard deviations, or some other deviant yet to be determined. A Deviant to be named later. (I always wanted to do that with the Red Sox. Sometimes when you have a player who is a pain in the ass, you trade him away for whatever you can get, and then later the other team will give you somebody that THEY can’t stand and their manager wants to get rid of. I always wanted us to make an announcement that Joe Schmuck had been traded to San Diego for a troublemaker to be named later.)
If we use 3 standard deviations below the norm, the 1932 Cincinnati Reds did NOT walk 467 batters that they might reasonably have walked, which would probably save them somewhere around 150 runs. If we use 4 standard deviations, then it would be 552 walks (about 180 runs), and if we use 5 standard deviations, then it would be 637 walks (about 210 runs).
On the other end of the Elephant, the 1915 Philadelphia Athletics were basically a team of teenagers and minor leaguers which had been pieced together to replace the stars who had been sold to the other American League teams so that they would not flee to the Federal League where they could make better money. In 1915 they walked 120 batters MORE than the standard of 3 standard deviations below the norm; in other words, they were worse than incompetent in this area. This means that we probably can’t use a standard of 3 standard deviations below the norm; we probably HAVE to go to at least 4. Not necessarily, not absolutely; there are only six teams in history which were worse than the 3SD cutoff, and none of the others was worse than 60, so the problem is kind of contained, but still, negative numbers play hell with an analysis of this nature, so you have to avoid them. Not that I think we would wind up using 3 standard deviations as the cutoff, anyway.
If we use 4 standard deviations as the misery-mark, then the 1915 A’s were 38 walks worse than terrible; if we use 5 standard deviations, then they were 44 walks better than the misery-mark.
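For anyone who wants to check the arithmetic, here is a rough Python sketch of the walks-not-walked calculation. Take it as an illustration, not as the actual worksheet: the per-team "norm" and "standard deviation" below are not figures from my files, they are simply backed out of the totals quoted above (the Reds' numbers move by 85 walks per standard deviation, the A's by 82), and the function name is made up for the example.

```python
def walks_avoided(actual_walks, norm, sd, k):
    """Walks NOT issued, measured against a cutoff k SDs worse than the norm.

    A negative result means the team was worse than the cutoff itself.
    """
    misery_mark = norm + k * sd  # for walks, "worse" means MORE walks
    return misery_mark - actual_walks


# (actual walks, implied norm, implied SD).
# ASSUMPTION: norm and SD are back-derived from the totals in the text.
REDS_1932 = (276, 488, 85)
AS_1915 = (827, 461, 82)

for k in (3, 4, 5):
    print(k, walks_avoided(*REDS_1932, k), walks_avoided(*AS_1915, k))
# 3 467 -120
# 4 552 -38
# 5 637 44
```

The negative numbers for the A's at 3 and 4 standard deviations are exactly the problem described above: a team that is worse than the cutoff gets credited with negative walks avoided.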
Transition here. Let me mark that appropriately. . .
Important Transition Here. My general thinking here, my first set of working assumptions, is that I might use 3 standard deviations below the norm as the zero point for categories in which the worst number you can post is zero, but 4 standard deviations below the norm as the zero point in categories in which the BEST number you can post is zero. In other words, 3 standard deviations below the norm in a category like strikeouts, where the bad teams have the lowest numbers, but 4 standard deviations in a category like walks, where the bad teams have the highest numbers. My previous research (previous in this series... the stuff I have posted over the past two weeks) has shown that in almost every area, the best/worst teams are not quite 3 standard deviations from the norm in the "zero-limited" direction, but almost 4 standard deviations from the norm in the "sky’s the limit" direction, the direction from the norm in which there is no zero. So my first inclination is to use 3 standard deviations below the norm when small numbers indicate bad performance, but 4 standard deviations worse than the norm when large numbers indicate bad performance.
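That working rule is simple enough to write down as code. What follows is a minimal Python sketch under the assumptions just stated (3 standard deviations when small numbers are bad, 4 when large numbers are bad); the function names are invented for the illustration, the 488 and 85 are backed out of the 1932 Reds walk figures quoted earlier, and the strikeout inputs are purely hypothetical.

```python
def zero_point(norm, sd, bad_is_high):
    """Return the zero point from which credit is counted.

    bad_is_high=False: small numbers are bad (strikeouts) -> norm - 3 SD.
    bad_is_high=True:  large numbers are bad (walks)      -> norm + 4 SD.
    """
    if bad_is_high:
        return norm + 4 * sd
    return norm - 3 * sd


def credit(actual, norm, sd, bad_is_high):
    """Events better than the zero point (negative if worse than it)."""
    zp = zero_point(norm, sd, bad_is_high)
    return zp - actual if bad_is_high else actual - zp


# Walks, 1932 Reds (norm and SD are back-derived assumptions).
print(credit(276, 488, 85, bad_is_high=True))   # 552

# Strikeouts, hypothetical team: norm 500, SD 60, actual 560.
print(credit(560, 500, 60, bad_is_high=False))  # 240
```

Nothing about the 3 and the 4 is settled; they are parameters that may well move once all of the categories have been run.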
Let us suppose that there is a "worksheet" for every team in major league history, on which we are tallying up their Runs Prevented. There will be 11 categories of Run Prevention on the sheet: Strikeouts, Control, Home Run Avoidance, Wild Pitch Avoidance, Hit Batsmen Avoidance, Balk Avoidance, Fielding Range (DER), Fielding Consistency (Fielding Percentage), Stolen Base Control, Double Plays, and Passed Ball Avoidance. Ultimately, we have to tag a number of Runs Prevented to each of those things. Like this:
Team: 1600 Merchants of Venice

Runs Prevented By:
Strikeouts
Control
Home Run Avoidance
Hit Batsmen Avoidance
Wild Pitch Avoidance
Balk Avoidance
Fielding Range (DER)
Fielding Consistency (F Pct)
Double Plays
Stolen Base Control
Passed Ball Avoidance

Sum of the Above
Actual Runs Prevented:
Error/Discrepancy:

At this point, we can fill in one element of this worksheet for each team, and we can make preliminary estimates about two others. We know what the actual Runs Prevented for each team are, because I explained that process. Let’s make four initial assumptions about strikeouts and walks:
1) That the lower boundary for strikeouts is 3 standard deviations below the norm,
2) That the lower boundary for walks is 4 standard deviations worse than the norm,
3) That each strikeout is worth .30 runs, and
4) That each walk avoided is worth .32 runs
Understanding that these initial assumptions will not control the future course of the research. If we make those four assumptions, then we can fill in the data for runs prevented by strikeouts and control. That would give us this worksheet, for the defending World Champions:
Team: 2019 Washington

Runs Prevented By:
Strikeouts                      206
Control                          58
Home Run Avoidance
Hit Batsmen Avoidance
Wild Pitch Avoidance
Balk Avoidance
Fielding Range (DER)
Fielding Consistency (F Pct)
Double Plays
Stolen Base Control
Passed Ball Avoidance

Sum of the Above                264
Actual Runs Prevented:          879
Error/Discrepancy:              615

We would thus be able to "explain" or "attribute" 264 of the 879 runs prevented by the Washington Nationals’ pitching and defense. The other 615 runs would still have to be explained by the other performance areas.
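The bookkeeping on the worksheet can be sketched in Python as follows. This is only a sketch of the tallying step, not of how any category's value is derived; the `tally` function and the dictionary layout are my own invention for the example, and the only numbers used are the ones from the Washington worksheet.

```python
CATEGORIES = [
    "Strikeouts", "Control", "Home Run Avoidance", "Hit Batsmen Avoidance",
    "Wild Pitch Avoidance", "Balk Avoidance", "Fielding Range (DER)",
    "Fielding Consistency (F Pct)", "Double Plays", "Stolen Base Control",
    "Passed Ball Avoidance",
]


def tally(filled, actual_runs_prevented):
    """Sum the categories filled in so far; report what is still unexplained."""
    unknown = set(filled) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    attributed = sum(filled.values())
    return {
        "attributed": attributed,
        "unexplained": actual_runs_prevented - attributed,
        "share": attributed / actual_runs_prevented,
    }


nationals_2019 = tally({"Strikeouts": 206, "Control": 58}, 879)
print(nationals_2019["attributed"], nationals_2019["unexplained"])  # 264 615
print(round(nationals_2019["share"], 2))                            # 0.3
```

As more of the 11 categories get filled in, the "unexplained" figure should shrink toward zero; whatever is left at the end is the Error/Discrepancy line.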
For all teams in baseball history since 1900, the number of runs prevented that we will have to attribute to somebody is 1,783,676. The number attributed by this process, for all teams, would be 457,184. That would be 26% of the whole.
Intuitively, that percentage feels like it is way too low. I would suspect that, over all of history, strikeouts and control would be maybe 50% of run prevention, wouldn’t you think? I think it is more than 26%.
It’s too early in the process to worry about that. We’ll run the numbers for all 11 categories, and then we’ll see where we are, and then we’ll see what we can do to move the numbers closer to the target. Thanks for reading; I hope this clarifies what I am trying to do at least a little bit at least for some people.