Legally Stolen Bases

By Bill James

March 26, 2020

Not all of the events recorded in the Stolen Base Attempts Column are actual Stolen Base Attempts. Some of them, particularly from the years before 1970, are busted Hit-and-Run tries, and some of them are plays when the runner is picked off first and breaks for second.

Also, it is rather an oddity that we take for granted, that a base taken legally within the rules of the game, a base to which the baserunner is entitled, is called a "stolen" base. It is rather as if "Sacks" in football were called "Assaults".

Runners Caught Stealing were not part of the official stat package until 1920. Some information about failed stolen base attempts prior to 1920 has been recovered, but it is sporadic, and not useful because it is sporadic. Up until 1920 stolen bases were systematically recorded, but not runners caught stealing.

Since 1920, 67.4% of events recorded as stolen base attempts have been successful steals. The success rate for actual stolen base attempts would be somewhat higher, but my point is, the break-even point for a stolen base attempt is probably higher than 67.4%. The sum total of all events recorded as stolen base attempts adds up to negative runs. Some teams have helped themselves score more runs by stealing bases; some teams have cost themselves runs by attempting to steal bases. On balance, it hasn’t helped teams score runs, at the major league level. Amateur baseball is different, of course.

The run values I am using here are +.153 runs for a successful stolen base, and -.428 runs for a caught stealing. Here is part of a run-value chart, copied from the Hardball Times:

Runners	0 Outs	1 Out	2 Out
None	.489	.263	.101
1	.858	.512	.221
2	1.073	.655	.319

If all stolen base attempts were attempts to steal second base with no other runner on base, the value of a stolen base would be .215 runs with no one out (1.073 minus .858), .143 runs with one out, and .098 runs with two out. The cost of a caught stealing would be .595 runs with no one out (.858 minus .263), .411 runs with one out, and .221 runs with two out.

If we assume, then, that 40% of stolen base attempts occur with no one out, 30% with one out, and 30% with two out, then the value of a stolen base would be .153 runs, and the cost of a caught stealing would be .428 runs. The break-even percentage would be 74%. But since we are studying these things from a defensive standpoint, the caught stealing is a positive (for the catcher), +.428 rather than -.428, and the stolen base allowed is a negative for the catcher, -.153 rather than +.153.
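
The break-even figure follows from the two run values: an attempt pays for itself when the expected gain from succeeding equals the expected loss from being caught. A minimal sketch of that arithmetic in Python, using the +.153 and -.428 values from the text; the function names are made up for illustration and are not part of any published method.

```python
SB_VALUE = 0.153   # runs gained by a successful steal (from the text)
CS_COST = 0.428    # runs lost on a caught stealing (from the text)

def break_even_rate(sb_value=SB_VALUE, cs_cost=CS_COST):
    """Success rate p at which p * sb_value equals (1 - p) * cs_cost."""
    return cs_cost / (sb_value + cs_cost)

def runs_per_attempt(success_rate, sb_value=SB_VALUE, cs_cost=CS_COST):
    """Expected runs added per recorded attempt, from the offense's side."""
    return success_rate * sb_value - (1.0 - success_rate) * cs_cost

print(f"break-even: {break_even_rate():.1%}")
print(f"runs per attempt at 67.4%: {runs_per_attempt(0.674):+.4f}")
```

The break-even rate comes out a little under 74%, and at the historical 67.4% success rate each recorded attempt costs the offense roughly .036 runs, which is why the sum of all recorded attempts is negative.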

Back to the point that not all of these things that are recorded in the Stolen Base Attempts column are actual stolen base attempts. There’s nothing I can do about that; I’m just working with the data that I’ve got. The question we are ultimately pursuing, in this area, is "How many runs did the catcher prevent by throwing out runners?" Throwing out runners implicitly discourages additional base stealing, so those runs prevented have to be counted as well.

Up until 1920, since we don’t have consistent or usable caught stealing data, the assumption is simply that the fewer stolen bases you have allowed, the more you have prevented. Since 1920 (and actually, before 1920 as well) we start with a "Stolen Base Value" against each team, stated from the catcher’s perspective, which is .428 * CS - .153 * SB. Before 1920 the Caught Stealing are always presumed zero, so the calculation is always negative from the catcher’s perspective. Since 1920, it is sometimes negative, but more often positive.

The 1900 Philadelphia Phillies allowed 272 stolen bases; what my source for this was, I do not know, but I must have had one, since I certainly didn’t make it up. The stolen base value of this, then, is -41.6 runs, from the catcher’s standpoint.

The 2019 Philadelphia Phillies allowed 66 stolen bases, and caught 50 runners stealing. The Stolen Base Value of this, then, is +11.3 runs, from the catcher’s standpoint.
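
Both Phillies figures come from the same one-line calculation. A small Python sketch, assuming the catcher's-standpoint sign convention used in the two examples above (caught stealing credited at .428 runs, stolen bases debited at .153 runs); the function name is illustrative only.

```python
def stolen_base_value(sb_allowed, caught_stealing, sb_value=0.153, cs_value=0.428):
    """Stolen Base Value in runs, from the catcher's standpoint."""
    return cs_value * caught_stealing - sb_value * sb_allowed

# 1900 Phillies: 272 steals allowed, caught stealing presumed zero before 1920.
phillies_1900 = stolen_base_value(272, 0)
# 2019 Phillies: 66 steals allowed, 50 runners caught.
phillies_2019 = stolen_base_value(66, 50)

print(round(phillies_1900, 1), round(phillies_2019, 1))  # prints -41.6 11.3
```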

This number, the Stolen Base Value, is the starting point of our process here. We divide the Stolen Base Value by the team’s Estimated Runners on First Base, getting the SBV/ROF. (The explanation of Estimated Runners on First Base was given in an earlier article.) Since this number is infinitesimal, starting with a couple of zeroes, I then convert it to Stolen Base Value per 1,000 runners on first (SBV/1000ROF). The conversion to 1000 Runners on First is purely cosmetic; when I do the later calculations, it will revert to per runner on first.

Anyway, we then form decade norms for SBV/1000ROF. We then compare the team to the decade norm, and divide the difference by the standard deviation for the decade.
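
The normalization steps can be sketched as follows. This is only an illustration under stated assumptions: a decade is taken to be the calendar decade, the standard deviation is the population form (the article does not say which form is used), and every team figure in the example is invented rather than drawn from the real data.

```python
from collections import defaultdict
from statistics import mean, pstdev

def sbv_per_1000_rof(sbv, runners_on_first):
    """Stolen Base Value per 1,000 estimated runners on first."""
    return 1000.0 * sbv / runners_on_first

def decade_of(year):
    """Calendar decade, e.g. 1927 -> 1920 (an assumption of this sketch)."""
    return year - year % 10

def decade_z_scores(teams):
    """teams: list of (year, name, sbv_per_1000_rof) tuples.

    Returns {(year, name): standard deviations above the decade norm}.
    """
    by_decade = defaultdict(list)
    for year, _name, rate in teams:
        by_decade[decade_of(year)].append(rate)

    scores = {}
    for year, name, rate in teams:
        rates = by_decade[decade_of(year)]
        norm = mean(rates)
        spread = pstdev(rates)  # population SD; sample SD is the other option
        scores[(year, name)] = 0.0 if spread == 0 else (rate - norm) / spread
    return scores

# Invented figures, for illustration only.
example = [
    (1920, "Team A", 10.0), (1921, "Team B", 20.0), (1929, "Team C", 30.0),
    (1930, "Team D", 5.0), (1931, "Team E", 15.0),
]
for key, z in decade_z_scores(example).items():
    print(key, round(z, 2))
```

Grouping by decade before taking the mean and standard deviation is what lets a team from a low-stealing era be compared with one from a high-stealing era on the same scale.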

In 1920 the Boston Braves, or whatever we want to call them at that moment in time—the National League Boston team—allowed 111 stolen bases, but caught 143 runners stealing. This is quite a remarkable ratio; it is, in fact, the most remarkable set of team totals of all time. The team had a gain of almost one baserunner per game—the 143 runners taken off the bases—against a cost of less than 20 runs all season. The net gain for the team was about 44 runs.

My ordinary process in stating that fact would be to lead you through a process of developing understanding, in which we said that the best team ever in terms of raw data was X, but if we look at it in this other way then the best team would be Y, but then if we considered this as well, the best team would be Z. But the 1920 Braves number is so remarkable that they are the X, Y and Z of the subject, and they force us to deal with them first.

It is not actually as much that the Braves are remarkable, as it is that the moment is remarkable. Up until 1920 there were few home runs in baseball, and teams relied heavily on stealing bases to try to move runners. Beginning in 1920 home runs accelerated rapidly.

But also, the National League (and the American League) prior to 1920 did not officially record runners caught stealing. The National League counted them then, for a few years (1920-1926) and then decided it was more trouble than it was worth; they resumed counting them in the 1950s. The American League kept the counts going.

In any case, both of these things strongly discouraged careless base stealing, so caught stealing totals dropped sharply after 1920. Teams around 1920 had CRAZY numbers of runners caught stealing, just absolutely insane. Not all of them were true caught stealing; teams also used the Hit and Run very extensively in that era, and that also sometimes led to runners recorded as caught stealing. Once home runs became more common, there was much less use of stolen base attempts, hit and run attempts, and sacrifice bunt attempts to try to move runners into scoring position.

But also, as you probably know, when you observe a thing, you very often change it because you have observed it. Teams prior to 1920 did not KNOW how many baserunners they were losing to failed stolen base attempts. Once they had a statistic which showed how often that happened, they began to understand that it was not really paying off. The number of stolen base attempts dropped off a cliff.

The 1920 Boston team, then, sits at a propitious moment in history: at the very end of the period of excessive and careless base stealing, and at the very beginning of the period when caught stealing are documented. If there were official caught stealing counts prior to 1920, very probably there would be some team with data even more remarkable than Boston’s. If stolen bases had not declined post-1920, there would be other teams with similar data. But as it is, the 1920 Braves’ data is the most remarkable of all time.

Well. . .there are other counts. There are unofficial but now known counts for some teams prior to 1920, and some of those are also extremely unusual, comparable to Boston’s. But we can’t use them, because they’re sporadic. Take Cleveland against Boston: if you count Caught Stealing for Cleveland but don’t count them for Boston, you’ve got nothing.

The Braves’ number one catcher in 1920 was a rookie named Mickey O’Neil. I’m guessing he was Irish. He wasn’t much of a hitter, but he was always a good defensive catcher. O’Neil—NOT played by Brad Pitt—threw out 84 of 153 potential base stealers. As I said before, Boston’s SB/CS has a net value of 44.2 runs to the Braves—the highest of all time.

YEAR	City	Team	Lg	SBA	OCS	SBV
1920	Boston	Braves	NL	111	143	44.2
1920	Philadelphia	A's	AL	103	133	41.2
1921	Boston	Braves	NL	90	119	37.2
1920	Chicago	Cubs	NL	115	117	32.5
1922	Chicago	Cubs	NL	79	104	32.4

When you normalize that to SBV/1000ROF (Stolen Base Value per 1000 Runners on First), that team is still first:

YEAR	City	Team	Lg	SBA	OCS	SBV	SBV/1000ROF
1920	Boston	Braves	NL	111	143	44.2	35.5
1920	Philadelphia	A's	AL	103	133	41.2	29.4
1921	Boston	Braves	NL	90	119	37.2	28.2
1920	Cincinnati	Reds	NL	108	111	31.0	26.7
1920	Chicago	Cubs	NL	115	117	32.5	26.5

And when you compare that to the decade norm and calculate how far it is above the decade norm, it is still in first:

YEAR	City	Team	Lg	SBA	OCS	SBV	Score
1920	Boston	Braves	NL	111	143	44.2	139
1963	Chicago	Cubs	NL	48	65	20.5	133
1937	Boston	Braves	NL	30	67	24.1	133
1943	Detroit	Tigers	AL	84	80	21.4	132
1993	Chicago	White Sox	AL	82	82	22.6	130

The data for the 1920 Boston Braves was 3.9 Standard Deviations better than the period norm. That’s actually not the END of the process of analyzing this data; that’s just the end of how far we are going in this round. We’ve got a lot more to do later on. The only one of the other four teams that is likely to mean anything to most of you is the 1993 White Sox. The White Sox catcher was a big albino named Ron Karkovice; I don’t know if he was truly albino, but he didn’t have much color. He did have a fantastic arm.

The really interesting catcher here is Dick Bertell. I remember Dick Bertell because, you know, I remember everybody; it’s my job. I had his rookie card and, being a pack rat, I am sure that I still have it. He hit .273 as a rookie in 1961, then hit .302 in 1962. Those were the years of the Cubs’ "college of coaches" experiment, so Bertell was in and out of the lineup, as some other players were.

I don’t remember that he was a top flight defensive catcher; maybe he was, but I don’t remember that anybody noticed. If he was, that would certainly help explain how Dick Ellsworth went 22-10 with a 2.11 ERA in 1963, kind of out of context with his career. But whatever the reason, Bertell was the Cubs’ regular catcher in 1963, and the team did have stunning data in the stolen base/caught stealing categories.

The Cubs either did not realize Bertell’s defensive value (just as they did not realize the value of their own leading base stealer, Lou Brock) or else they knew something that I don’t know. Bertell again had outstanding stolen base/caught stealing data in 1964, but didn’t hit much, and lost his job.

The 1937 Braves’ catcher was Al Lopez, the Hall of Fame manager, who was regarded as a premier defensive catcher. The 1943 Tigers’ catcher was Paul Richards, another famous manager; Richards was mentioned in the MVP voting in both 1944 and 1945, although he was a half-time player who didn’t hit. Thanks for reading.

COMMENTS (15 Comments, most recent shown first)

raincheck
As Casey Stengel allegedly said, "You have to have a catcher, otherwise you will have a lot of passed balls."
4:20 PM Mar 27th

Fireball Wenz
Bertell's backup on the 1963 Cubs was Jimmie Schaffer, whose expression on his 1968 Topps card with the Reds looks exactly like you would if you were a catcher and Johnny Bench just showed up. He threw out 16 of 24 base stealers that year (67%) - his career mark, of which that year was a large part, was 46%. Bertell's career mark was 48%. Bertell led the NL in passed balls in 61 and 62, which may be why his defensive rep wasn't great.

The Cubs in the 1960s employed a "scoreboard spy" to steal signs. Cuno Barragan, another backup catcher, spent a lot of time doing that one year. I wonder if the Cubs had cracked a lot of signs and were pitching out and nailing runners.
9:13 AM Mar 27th

bjames
"Bill is calculating the value of a steal and a caught stealing assuming all stolen base attempts take place at second base. This is not true. Looking at the 1927 AL, for example, 12.8% were at 3rd base and 8.2% at home."

Right. These are trivial differences, and the sort of thing you worry about the second or third or fourth time you study an issue. It's not something that is appropriate to focus on at this point in the research.
11:30 PM Mar 26th

W.T.Mons10
Bill is calculating the value of a steal and a caught stealing assuming all stolen base attempts take place at second base. This is not true. Looking at the 1927 AL, for example, 12.8% were at 3rd base and 8.2% at home (with a 41% success rate, certainly above break even). That's a bit over 1 in 5 at another base. Jumping ahead to the 1993 AL, there were still 12.1% of the attempts at 3rd, while steals of home dropped to 1.7% (and only 17.5% succeeded).
9:04 PM Mar 26th

W.T.Mons10
Baseball-Reference.com has caught stealing data for the pre-1920 years. I believe they came from Pete Palmer, although where he got them I couldn't say. Anyway, they show the 1900 Phillies allowing 273 steals and throwing out 185 runners.
8:34 PM Mar 26th

doncoffin
evanecurb:
"If Ivan Rodriguez or Tony Pena is behind the plate, only the best base stealers get the green light. This prevents steals but it also prevents extra outs from caught stealing. If Gene Tenace or Mike Piazza is catching, more guys take off, and the defense has more opportunities to catch them stealing. It's an optimization problem - what are the characteristics of a catcher/defense that will create the most value from caught stealing?"

So let's see. In his career, Mike Piazza was the catcher when 1400 bases were stolen, and he threw out 423. So, 1823 stolen base attempts.

I-Rod was catching when 830 bases were stolen, against 796 CS. 1626 stolen base attempts. So not much difference.

But Piazza caught 13,555 innings, and I-Rod caught 20,348. Let's normalize this to per-1000 innings:
------------------SB/1000-----CS/1000------Net SB/1000 (SB-CS)
Piazza------------103.28-------31.21---------------72.08
I-Rod--------------40.79-------38.63----------------2.16

So I-Rod actually threw out more baserunners per 1000 innings than Piazza did. Net, Piazza allowed 72 steals per 1000 innings; I-Rod allowed 2 (two).

I know these are probably the extreme cases, but there's no way that Piazza's defense against the stolen base actually saved more runs than did I-Rod's (and we haven't accounted for the run value of the bases gained by the steals when Piazza was catching).

7:14 PM Mar 26th

TJNawrocki
Two of those "worst-ever" teams had something in common: Pedro Martinez.
3:55 PM Mar 26th

Guy123
Sorry, wrote that wrong. The best SB-prevention teams were .488, while the worst SB-prevention teams were a bit better at .509.
3:47 PM Mar 26th

Guy123
This, as far as I know, is the first study of defense to approach the issue in that way: What happens if you don't have somebody who can do this?

At first blush, the answer appears to be "not much." The ten best teams identified here have a cumulative winning percentage of .488, while the ten worst teams had a winning percentage of .488. But maybe the larger samples tell a different story?

3:25 PM Mar 26th

bjames
A decade is the calendar decade. I see that in this article, I failed to post the list of the teams which were worst at preventing the stolen base.

The worst ever was the 2007 Padres, who allowed 189 or 190 stolen bases with only 20 runners caught stealing, a stolen base value of negative 20.4. The 10 worst teams, park and era adjusted: the 2007 Padres, 2012 Pirates, 1959 Cubs, 1988 Astros, 2019 Mets (139/22), 2001 Red Sox, 1959 Washington Senators, 1997 Expos, 2006 Padres, and 2009 Red Sox.

And you're completely wrong about Ivan Rodriguez--and this study is precisely designed to expose that flaw. You're thinking about him compared to the normative value, rather than compared to a complete failure--i.e., the 2007 Padres. Rodriguez's value is based on what happens if you DON'T have a catcher who can throw. The other team steals you blind. This, as far as I know, is the first study of defense to approach the issue in that way: What happens if you don't have somebody who can do this?
2:41 PM Mar 26th

SteveN
I'm unclear on what Bill means by decade. I think that he means calendar decades, such as the 80's, 90's and the aughts. But he might mean the 5 year spread before and after the year being examined.
2:16 PM Mar 26th

Jack
"The Braves’ number one catcher in 1920 was a rookie named Mickey O’Neil. I’m guessing he was Irish."

I did not see this coming, and it made me literally laugh out loud.

Thanks for this series of excellent, illuminating articles, Bill.
2:03 PM Mar 26th

mathias2
Thanks, evanecurb.
It has annoyed me for decades when some catcher like Ivan Rodriguez would be praised for throwing out a high percentage of runners and given extra credit for reducing attempts, when it is obvious that he would have MORE value if there were more attempts.
I appreciate you explaining that concept better than I ever have.
1:18 PM Mar 26th

evanecurb
We're so spoiled by the amount of data we have now that it's especially frustrating to try to deal with an area where no data exists. I'll be interested in seeing how Bill handles the years for which we have no data. I'm betting he comes up with something using catcher assists.

Perhaps the catcher / defense that would generate the most value would be one in which the catcher had a mediocre throwing arm (or the pitchers were so-so at holding runners on). Stay with me; it's actually pretty simple. If (big if) it's true that outstanding arms discourage base stealing, and that base stealing has a negative value for the offense, then it follows logically that the defense would want to encourage stolen base attempts. If Ivan Rodriguez or Tony Pena is behind the plate, only the best base stealers get the green light. This prevents steals but it also prevents extra outs from caught stealing. If Gene Tenace or Mike Piazza is catching, more guys take off, and the defense has more opportunities to catch them stealing. It's an optimization problem - what are the characteristics of a catcher/defense that will create the most value from caught stealing?
1:04 PM Mar 26th

chuck
Bill, many thanks for providing these daily pieces of the puzzle. It's great during this stay-at-home period to be able to check in to the site and see fresh, interesting stuff each day.

You wrote: "The question we are ultimately pursuing, in this area, is "How many runs did the catcher prevent by throwing out runners?" Throwing out runners implicitly discourages additional base stealing, so those runs prevented have to be counted as well."
It looks like this gets addressed with the SBV/ROF ratio.

I'm wondering if this ratio changes much depending on a park being more of a pitchers' park as opposed to a hitters' park. I would think in a run-challenged environment that teams are more likely to risk the stolen base than they would in a hitter-friendly park... that the park itself plays a role in the encouragement/discouragement of stealing. Perhaps that's too small of a thing to matter for what you're doing.
12:50 PM Mar 26th
