Keeping the Game Under Control

By Bill James

April 2, 2020

Double Plays and Stolen Base Prevention; these things keep the game under control. Our first task today is to estimate how many runs each team has prevented by turning the Double Play. The first assignment of THAT task is to estimate how many double plays the team should have been EXPECTED to turn. Fortunately, I have a really good set of formulas for that, and fortunately, I have already explained those formulas to you, so we can dispense with that. For illustration, we’ll use the 1941 Yankees, the greatest Double Play team of all time.

The 1941 Yankees turned 196 Double Plays. Had they been just average at turning the double play we would have expected them to turn 151, which is an above-average average; the average over time is 139. (The team which would have been expected to turn the most double plays, for whatever this is worth, is the 1983 California Angels, who could have been expected to turn 202 Double Plays, since (a) the team gave up a huge number of hits, and (b) they had an extreme ground ball staff. The Angels actually turned 190 Double Plays, only six fewer than the 1941 Yankees, but 12 below expectation in their case.)

So anyway, we re-state the 196 Double Plays for the 1941 Yankees as 145, meaning 45 more than expected (196 – 151 = 45, added to the baseline of 100). Every team is now stated on the 100-scale, so that the average for each year is 100, or some number very close to it. The Standard Deviation for the 1940s is 16.12.

We’re going to use three standard deviations below the norm as the zero-standard here. When you array the data by standard deviations (we learned this earlier in this series of articles), the most extreme teams are almost always just short of three standard deviations from the norm in the direction bounded by zero, and just short of four standard deviations from the norm in the direction which is not bounded by zero. I made a decision earlier that I would use three standard deviations below the norm as the zero-standard in an area in which higher numbers represented excellence, and four standard deviations below the norm as the zero-standard in an area in which higher numbers represented failure. But since higher numbers represent failure for all of the categories here except strikeouts and double plays, this means that we use three standard deviations below average as the zero-standard for strikeouts and double plays, but four standard deviations for all of the other categories.

This was a questionable decision, in the construction of the system, and we’ll revisit it at an appropriate point, but for now, I’m proceeding with 3 standard deviations below the norm as the zero-value standard for double plays. The standard deviation for the 1940s is 16.12—another questionable choice in there, by the way—so three standard deviations below the norm would be 52 double plays. (100 – 3 * 16.12 = 51.64.)

We have the Yankees now at 145 double plays—remember, we adjusted it for context—so the Yankees were 93 double plays better than the zero-value standard. But what is the value of a double play, in this context?

We have already given the defense credit for the first out, the forceout at second base; that would be included in DER. What is at issue here is the second out, and the removal of a baserunner. How do we value those things?

The removal of the baserunner by the second out has the value of a negative walk. Since a walk is valued at .32 runs, that’s .32 runs. The addition of an out to the scoreboard, in this context, seems to be the same as a strikeout, more or less. We value a strikeout at .30 runs, so that makes a total of .62 runs. So I’m valuing the double play at .62 runs.

Again, this is a questionable choice, and I will revisit it if I become aware of some better way to place a value on a double play (in this context). But for now, I will use .62 runs. So the Yankees are 93 double plays better than a zero-value defense, and each double play is valued at .62 runs. That’s 58 Runs (93 * .62 = 57.66). We will credit the Yankee defense (1941) with preventing 58 runs by their ability to turn the Double Play.
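For readers who want to check the arithmetic, here is a minimal sketch of the double play calculation for the 1941 Yankees. The function name and its parameters are hypothetical, and the additive 100-scale step (100 plus actual minus expected) is a reading of the 196-to-145 restatement, not a confirmed formula. The unrounded zero line (51.64) is used here, so the result comes out near 57.88 rather than 57.66; both round to 58.

```python
# Figures quoted in the article for the 1941 Yankees.
ACTUAL_DP = 196            # double plays actually turned
EXPECTED_DP = 151          # double plays expected of an average team in that context
SD_1940S = 16.12           # standard deviation on the 100-scale for the 1940s
ZERO_SDS = 3               # zero-value standard: 3 SDs below the norm
RUNS_PER_DP = 0.32 + 0.30  # "negative walk" plus an out valued like a strikeout


def dp_runs_prevented(actual, expected, sd,
                      zero_sds=ZERO_SDS, runs_per_dp=RUNS_PER_DP):
    """Hypothetical helper: runs credited for double plays above the zero line.

    Assumes the 100-scale restatement is additive: 100 + (actual - expected).
    """
    scaled = 100 + (actual - expected)   # 145 for the 1941 Yankees
    zero_line = 100 - zero_sds * sd      # 51.64 for the 1940s
    return (scaled - zero_line) * runs_per_dp


yankees_runs = dp_runs_prevented(ACTUAL_DP, EXPECTED_DP, SD_1940S)
print(round(yankees_runs))
```

Changing `zero_sds` to 4 shows how sensitive the credited runs are to the choice of zero-value standard, which is the decision flagged above as questionable.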

All 2,550 teams in the study are credited with an estimated 62,176 runs prevented by turning the double play in a competent fashion. A couple of teams are below zero.

Our next issue, then, is Stolen Base Control. Unlike everything else in this system, Stolen Base Control has already been stated as a number of runs, so we won’t need to make that translation toward the end of our process. We really have 12 categories of defensive contribution that we are studying here, but we combined stolen bases allowed and runners caught stealing into one at an earlier stage of the analysis, and we stated them in terms of run value at that time.

When we created this thing called "Stolen Base Value", however, we combined one category in which a high number is bad—stolen bases allowed—with one category in which a high number is good—caught stealing. This leaves it unclear whether we should use three standard deviations or four standard deviations below the period norm as the zero-value standard.

That issue, however, is easily enough resolved. We don’t like negative numbers of runs saved here. Negative numbers are going to be a damned nuisance later on in the process. Negative numbers are very often a nuisance when you are analyzing value. We can’t use three standard deviations below the norm as the zero-value standard if there are a significant number of teams which are below that standard.

There are 21 teams in history which are three standard deviations-plus below the period norm in terms of Stolen Base Prevention, or Stolen Base value. The 2007 Padres are a whopping 5.8 Standard Deviations below the norm, and one other team is also four standard deviations below the norm. We can deal with one or two teams being under water, but 21, no way. We obviously have to use 4 standard deviations as the misery line.

The best team ever at turning the Stolen Base Attempt into a weapon for the defense was the 1920 Boston Braves; we talked about them before, remember? Mickey O’Neil, catcher. The 1920 Braves gained about 44 runs from their opponents’ efforts to steal bases against them, but, since Stolen Value can be either positive or negative, this is 56 runs better than the zero-value line. We credit the 1920 Braves with 56 Runs Prevented by Stolen Base Attempts—or, actually, by good defense against the Stolen Base Attempt. The worst team ever, also by this measure: 2007 Stay Classy San Diego Padres.
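The Braves figures imply where the zero-value line sits for their period. A rough sketch, with hypothetical names, follows. The implied line of about -12 runs is inferred from the two rounded numbers quoted above (44 and 56), so it is approximate, and it would apply only to the Braves' own period, since the norm and standard deviation differ from period to period.

```python
# Figures quoted in the article for the 1920 Boston Braves (both rounded).
BRAVES_SB_VALUE = 44     # runs gained from opponents' steal attempts
BRAVES_ABOVE_ZERO = 56   # runs better than the zero-value line

# Inferred, not stated in the article: the zero-value line for that period.
implied_zero_line = BRAVES_SB_VALUE - BRAVES_ABOVE_ZERO


def sb_runs_prevented(sb_value, zero_line=implied_zero_line):
    """Hypothetical helper: Stolen Base Value measured from the zero line."""
    return sb_value - zero_line


print(implied_zero_line)
print(sb_runs_prevented(BRAVES_SB_VALUE))
```

Under this reading, a team in the same period whose Stolen Base Value was exactly zero would still be credited with about 12 runs prevented.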

For all of baseball history (1900 to the present) we estimate that 49,427 runs have been prevented by competent defense against the stolen base.

OK, our last category to be subjected to this preliminary estimates round will be Passed Balls—the most annoying weed in our garden, other than Balks. If we run Passed Balls through our standard protocols here, we reach the conclusions:

1) That the best team ever at avoiding Passed Balls was the 1909 Detroit Tigers, catchers Oscar Stanage and Boss Schmidt. The second-best team ever was the 1908 Detroit Tigers. The 1931 Yankees (Bill Dickey) had NO passed balls on the season, but this gets less credit for Run Prevention since Passed Balls were much less common in that era.

2) That the worst team ever was the 1987 Texas Rangers; you all know that story.

3) The 1909 Tigers Prevented about 9 runs from Passed Balls that did not get past. The 1987 Rangers actually have one of those dreaded negative numbers, negative 5. Which I’ll probably just ignore at some point in our analysis, since it’s such a bullshit stat, and only a few teams have negative numbers anyway, most of which are like negative two tenths of a run or something.

4) The sum total of Runs Prevented by Passed Ball avoidance is estimated at 9,661.

Adding all three of these categories into our data, our estimate is that we have accounted for 1,548,593 runs saved, which is only 13% off of our target. Our next task will be to make some corrections to move this number even closer to our target (1,783,676), which, since we have hundreds of options on how to proceed, should not be difficult. Thanks for reading.
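A quick check of the running totals, as a sketch using only the figures quoted above (the variable names are hypothetical):

```python
# Runs prevented in the three categories covered in this article.
DP_RUNS = 62_176   # double plays
SB_RUNS = 49_427   # stolen base control
PB_RUNS = 9_661    # passed ball avoidance

RUNNING_TOTAL = 1_548_593  # runs accounted for after adding these categories
TARGET = 1_783_676         # target number of runs saved

added_here = DP_RUNS + SB_RUNS + PB_RUNS
shortfall = TARGET - RUNNING_TOTAL
shortfall_pct = 100 * shortfall / TARGET

print(added_here)               # runs contributed by the three categories
print(shortfall)                # runs still unaccounted for
print(round(shortfall_pct, 1))  # percent short of the target
```

The three categories together contribute 121,264 runs, which leaves 235,083 runs, roughly 13% of the target, still to be accounted for.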

COMMENTS (13 Comments, most recent shown first)

djmedinah
Hotstatrat: You understand that the problem with negative numbers Bill is talking about is multiplying/dividing them, right?
8:16 PM Apr 6th

W.T.Mons10
Another, perhaps small, point is that a certain number of double plays will be of the strikeout-throw out variety, and are already counted in those stats. I see the Yankees had 8 of those last year, out of 102 DPs. It was probably more common when there was more stealing going on.
1:23 PM Apr 3rd

CharlesSaeger
I’m not even getting to the relative value of outs in the inning. I’m mostly talking about the value of not getting another batter to the plate, which is just R/BFP for the league that year.
12:02 PM Apr 3rd

willibphx
It might be a bit high. I looked at the base/out tables I had handy. Can't cite where I got them but it does depend on the situation. The most common I believe would be a runner on first.

With no outs, the run expectation from the table was .831 runs. Runner on first with one out was .489. No runners with two outs was .095. Thus the value of the force play getting the lead runner would be -.342. The value of getting both runners would be -.736, thus the incremental value of the second out would be -.394.

With one out and a man on first the incremental value of the DP would be -.214. More unusual situations would be higher. Man on 3rd gets doubled off with no outs was -.770. Bases loaded with no outs assuming man on 3rd scores and runner left on third would be -.727.
10:03 PM Apr 2nd

evanecurb
I said "well said" meaning that I remember very well "make sure to get the first out." Double plays are extremely rare in baseball's lower levels - almost unheard of except for the occasional 4-6-3 when there's a slow hitter at the plate. It's an extremely difficult play to execute for amateurs. The major leaguers make it look easy, as they do with most plays on defense.
8:52 PM Apr 2nd

evanecurb
Guillermo: Well said.
8:49 PM Apr 2nd

GuillermoMountain
Evan, I respectfully think you have it backwards: since we're already using defensive efficiency in calculating runs saved, I would think that we are double counting the first out, rather than the second out if we give credit for both outs in the double play. The second out is the true "bonus" here, and so the value of that out is where the "extra" credit should be given. And that lines up with how we think about it as fans too: "make sure you get at least the sure out."
7:41 PM Apr 2nd

evanecurb
With respect to the value of the second out on the DP, retiring the batter has the same value as preventing a walk. A walk is worth .32 runs, so that's .32 runs. Then he adds .30 for recording the out. That's how Bill gets to 0.62 for the value of the DP outside of the value already accounted for in DER.

I wasn't sure I understood why this is not double counting the second out. I was wondering if any of you had the same issue. I think I was able to explain it to myself using two scenarios - one where the DP is turned, the second where it's not and the batter reaches on a fielder's choice.

Scenario I: Batter hits into DP. Next batter walks.
Scenario II: Batter hits into FC. Next batter strikes out.

These two situations result in the same base/out situation, so I assume the values should add up. Under Bill's system, I think the values are like this:

Scenario I: 0.62 for the DP, minus 0.32 for the walk, equals 0.3 net runs prevented.
Scenario II: 0 for the DP, plus 0.3 for the strikeout, equals 0.3.
6:49 PM Apr 2nd

Guy123
Charles: I agree the value of the second out in a DP is higher than the average value of an out, because the play begins with one or more runners on base, which increases the leverage. But .62 seems high to me. The value of a CS is about .40-.45, depending on the run environment (see Ruane). Shouldn't the value of erasing the runner on a DP be similar, or am I thinking about it wrong?
4:48 PM Apr 2nd

CharlesSaeger
@Guy123: there is an additional cost for adding an out (fewer runs score later in the inning) and preventing another batter from coming up, but there’s no way those double the value.
3:40 PM Apr 2nd

BobGill
Several articles ago, didn't you say you were going to combine passed balls, wild pitches and balks into a single category, to make the numbers more substantial and reduce the number of outliers? I thought you said that, anyway, and it seemed like a good idea, but over the last couple of days you've been treating those three stats as separate categories again. Did something change your mind, or are you just planning to combine them later, once you've finished with the preliminary work?

2:57 PM Apr 2nd

Guy123
I'm not 100% sure, but I think you are double-counting in setting the value of removing the runner in the DP. Removing a runner from 1B and recording an out can't be worth any more than having retired him initially (approx. 0.3 runs). And the total value of a DP to the defense (including the batter's out) is traditionally worth around .75-.80 runs -- see Tom Ruane's data here: https://www.retrosheet.org/Research/RuaneT/valueadd_art.htm.

2:34 PM Apr 2nd

hotstatrat
Bill, are you really sure dealing with negative numbers will mess up your calculations down the road? The negative runs prevented in one area would just take away from the runs prevented from the whole enchilada of runs prevented when you combine the different areas - as it should - as far as I am guessing - but I don't know how you plan to combine these run preventing figures.

The reason I worry that you might be over-avoiding negatives is that under the concept of offense = defense heading towards runs created = runs prevented, you are messing your data up more by using more standard deviations away from 0 as you are towards 0.
12:01 PM Apr 2nd
