Strikeout Runs Saved
In order to figure the number of runs saved by each team’s pitchers’ strikeouts, we need to know four things:
How many batters the team’s pitchers faced,
How many they struck out,
How many of those strikeouts we should remove, as "background" or "zero competence level", and
The run-prevention value of each strikeout.
The first two of those we know now.
The fourth one we will not know for some time.
The purpose of this stage of the research is to generate some options for the third—the background level. The zero competence level. In order to generate those options, we need two additional pieces of information: the period norm for strikeouts per plate appearance, and the standard deviation of same, on the team level. I don’t know if that makes sense; it makes sense to me, which is all I can offer you at the moment.
So anyway, we need four pieces of information, then: the first two above, plus the period norm for strikeouts and the standard deviation of the same.
The 2018 Houston Astros hold the major league record for strikeouts in a season, with 1,687, out of 5,913 batters that they faced. Our question is, how many of those 1,687 strikeouts should we regard as merely background—batters who were going to strike out no matter what kind of a fool was on the mound. An average pitching staff in their era, given the same number of batters faced, would have struck out 1,216 batters.
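As a quick check on that 1,216 figure, here is a minimal sketch in Python. It assumes the average-staff number is simply batters faced times the period norm of .205664 strikeouts per plate appearance (the norm quoted a little further on). The function and variable names are mine, invented for illustration.

```python
# Figures quoted in the article for the 2018 Astros.
BATTERS_FACED = 5913
PERIOD_NORM_K_RATE = 0.205664  # period norm, strikeouts per plate appearance


def average_staff_strikeouts(batters_faced, norm_rate):
    """Strikeouts an average staff of the period would record (assumed: BF * norm)."""
    return batters_faced * norm_rate


expected = average_staff_strikeouts(BATTERS_FACED, PERIOD_NORM_K_RATE)
print(round(expected))
```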
But we are comparing them not to an average staff, nor to a replacement-level staff, but to a zero-value staff, which is a staff which would allow twice as many runs as the league average. That number (of background strikeouts) might be represented as 3 standard deviations below the norm (70), 4 standard deviations below the norm (60) or 5 standard deviations below the norm (50). The period norm for strikeouts is .205664 of plate appearances.
Doing the math, then, if we use 3 standard deviations below the norm as the background level, then 793 of the Astros’ strikeouts were background, and 894 were acts of run prevention. The average of strikeouts is .205664, and the standard deviation is .023830:
5913 * (.205664 – 3*.023830) = 793
1687 – 793 = 894
Carrying the decimals, those are 793.4 and 893.6. Anyway, if we wind up using 4 standard deviations below the norm as the background level, then the 2018 Astros will be credited with 1,035 strikeouts above the cutoff level:
5913 * (.205664 – 4*.023830) = 652
1687 – 652 = 1035
And if we use 5 standard deviations below the norm, we wind up with 1,175 "contributing" strikeouts:
5913 * (.205664 – 5*.023830) = 512
1687 – 512 = 1175
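The three calculations above all follow one pattern, which can be sketched in Python as below. This is only an illustration of the arithmetic, using the 2018 Astros figures and the period norm and standard deviation given in the text. The function names are hypothetical, and nothing here settles which cutoff is the right one.

```python
# Figures quoted in the article for the 2018 Astros and their period.
STRIKEOUTS = 1687
BATTERS_FACED = 5913
NORM_RATE = 0.205664  # period norm, strikeouts per plate appearance
SD_RATE = 0.023830    # team-level standard deviation of that rate


def background_strikeouts(batters_faced, norm_rate, sd_rate, n_sd):
    """Strikeouts treated as background at n_sd standard deviations below the norm."""
    return batters_faced * (norm_rate - n_sd * sd_rate)


def contributing_strikeouts(strikeouts, batters_faced, norm_rate, sd_rate, n_sd):
    """Strikeouts above the cutoff, i.e. those counted as acts of run prevention."""
    return strikeouts - background_strikeouts(batters_faced, norm_rate, sd_rate, n_sd)


# Candidate cutoffs under consideration: 3, 4 and 5 standard deviations.
results = {
    n_sd: contributing_strikeouts(STRIKEOUTS, BATTERS_FACED, NORM_RATE, SD_RATE, n_sd)
    for n_sd in (3, 4, 5)
}

for n_sd, contributing in results.items():
    background = STRIKEOUTS - contributing
    print(f"{n_sd} SD: background {background:.1f}, contributing {contributing:.1f}")
```

Unrounded, the 3 standard deviation case comes to about 893.6 contributing strikeouts, which is where the 894 comes from.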
At this point, we don’t know whether we are going to wind up using 3 standard deviations below the norm, 4 standard deviations below the norm, 5 standard deviations below the norm, or some other number entirely. We don’t know.
Also, at this point we don’t know what the value of a strikeout is, in this structure. One can estimate the value of a strikeout, using different approaches, at anywhere from .13 runs (negative .13 runs to the offense) up to .35 runs. We don’t know what the number is that will work in this structure. We’ll start with .30 runs as a test assumption, but. . . we don’t know.
We know, from the work we did yesterday, that the 2018 Astros saved 935 runs against a zero-value pitching-and-defense combination. That number is fixed; it will not be allowed to adjust as our analysis evolves. But how many of those 935 runs will we credit to the strikeouts?
At this point, we could reasonably get a number anywhere from 116 to 411. 116 is 894 * .13. 411 is 1175 * .35. That’s our operating range. We have to figure out what the number is in there that consistently gets the answer that we need to make this system work. The number that works logically HAS to be the "right" answer.
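To see the whole operating range at once, here is a small Python sketch that crosses the three candidate cutoffs with three candidate run values (.13, the .30 test assumption, and .35). The contributing-strikeout totals are the rounded ones from the text. The grid layout and the names are my own assumptions, just one way of laying out the options.

```python
# Fixed total from the earlier work: runs saved against a zero-value staff.
RUNS_SAVED_TOTAL = 935

# Contributing strikeouts for the 2018 Astros at each candidate cutoff (from the text).
CONTRIBUTING = {3: 894, 4: 1035, 5: 1175}

# Candidate run values per strikeout: low estimate, test assumption, high estimate.
RUN_VALUES = (0.13, 0.30, 0.35)


def strikeout_runs(contributing, run_value):
    """Runs credited to strikeouts for a given cutoff total and run value."""
    return contributing * run_value


# Every (cutoff, run value) combination and the runs it would credit.
grid = {
    (n_sd, value): strikeout_runs(k, value)
    for n_sd, k in CONTRIBUTING.items()
    for value in RUN_VALUES
}

low = min(grid.values())
high = max(grid.values())

for (n_sd, value), runs in sorted(grid.items()):
    share = runs / RUNS_SAVED_TOTAL
    print(f"{n_sd} SD at {value:.2f} runs: {runs:6.1f} runs ({share:.1%} of {RUNS_SAVED_TOTAL})")

print(f"operating range: {low:.0f} to {high:.0f}")
```

The two corners of that grid are the 116 and 411 above; everything else falls between them.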
It’s just like a runs created method, only backwards. The values that work, in a runs created method, are the numbers that tell you how many runs the team will score. The values that work, in a runs saved against zero method, would be or will be the numbers that tell you how many runs the team will allow.
The 2018 Astros are the #1 team in "relevant strikeouts" whether you use 3 standard deviations below the norm, 4 standard deviations or 5. They’re #1 any way you look at it.
But as to who has the FEWEST relevant strikeouts, there you get a different answer with each test. The 2003 Detroit Tigers (43-119; Mike Maroth and Jeremy Bonderman) struck out only 764 batters among the 6,376 who teed off against them. If we use the 3 standard deviation standard, then the Tigers are credited with only 25 strikeouts, the lowest total of all time.
But if we use the 4 standard deviation cutoff, then the 2003 Tigers scoot all the way up to the 4th-worst team ever, and the worst team ever would be the 1918 Philadelphia Athletics (52-76 in a war-shortened season). The Athletics struck out only 276 batters among the 4,834 that they faced, which would be 131 strikeouts above the level of 4 standard deviations below the norm.
And if we use the 5 standard deviation level, then we get a different answer again. Using the 5 standard deviation level, the worst team ever would be the 1925 Boston Red Sox (47-105). They struck out 310 of 6,025 opposing batters, which would be only 197 strikeouts better than five standard deviations below the norm.
We’re not arbitrarily choosing the right values. We’re trying to find the right values, through a research effort, the right values being those numbers that do the best job of predicting how many runs the team will allow. There is one set of numbers that will work better than any other set of numbers, and we won’t find it, exactly, because my skill set isn’t that good, but with luck, we’ll get pretty close.
Thanks for reading.