April 4, 2020

            Before I start the main work of the day, there are two questions that have long bothered me about baseball history, which we are now in a position to answer.   The two questions are:

1)     When in baseball history did Fielding Range (Expressed on the team level by DER) become more important than Fielding Percentage, and

2)     When did strikeouts become more important for a pitcher than control?


When did Fielding Range become more important than Fielding Percentage?

One of the first articles that I ever published, in the Baseball Digest in early 1976, was based on the premise that Fielding Range was much more important than Fielding Percentage, and actually that article overstated the extent to which this is true.   Not too long after I understood that, I understood that what had happened was that there had been a shift in that direction early in baseball history.  

When Organized Baseball began, when the statistical record was designed, there were no fielding gloves.   There were no masks or shin guards for catchers.  Fields were nothing like the manicured fields we play on now; they were just fields, with rocks and mole runs and mud patches after a rain.  Often the grass in the outfield might be 12 to 18 inches high.  People didn’t have lawn mowers then.  The first gas-powered lawn mower, like we have now, was marketed in 1919.  The baseballs were hand made and varied significantly in weight, and they were used until they wore out, nothing like the fresh, clean baseballs put in play now. 

Under those conditions, errors were understandably very common.  In the National League in 1876 there were 12 errors per game, 6 by each team (on average).   A game in which a team committed ten errors would not have been terribly uncommon.  Thus, when the statistics were laid out, it made sense to evaluate fielders by how successfully they could avoid mishandling what we would now see as routine plays.

But fielding percentages shot up quickly after 1876, so that at some point a fielder’s range became more significant than his fielding percentage.   What I have never known, until now, is when exactly this happened.

This method, even though it is primitive and not reliable for close calculations, can already give us a pretty clear and definitive answer to that question.   It happened before 1900.  The data we have does not cover the period in which it happened, but based on that data I believe it probably happened between 1888 and 1893.

In the first decade of the 20th century, 1900 to 1909, I estimate that there were 39,652 runs saved by range (DER), and 19,374 saved by fielding percentage, essentially a 2-to-1 ratio.  In the second decade (1910 to 1919) the ratio was nearly three to one (46,031-15,818); in the 1920s it was more than three to one.   Although the ratio continued to stretch out, it didn’t reach 4 to 1 until the 1980s, and has not really changed since then; it’s now about 4.1 to 1. 
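
The ratios in that paragraph are simple divisions of the decade totals, and it may help to see them worked out. This is only a sketch of the arithmetic using the two pairs of totals quoted above; the dictionary layout and the function name are my own illustration, not part of the method.

```python
# Estimated runs saved per decade, as quoted in the text above.
runs_saved = {
    "1900-1909": {"range": 39_652, "fielding_pct": 19_374},
    "1910-1919": {"range": 46_031, "fielding_pct": 15_818},
}


def range_to_pct_ratio(decade):
    """Runs saved by range (DER) per run saved by fielding percentage."""
    totals = runs_saved[decade]
    return totals["range"] / totals["fielding_pct"]


for decade in runs_saved:
    print(f"{decade}: {range_to_pct_ratio(decade):.2f} to 1")
```

That works out to about 2.05 to 1 for 1900-1909 and about 2.91 to 1 for 1910-1919, which is where "essentially 2-to-1" and "nearly three to one" come from.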

Drawing the graph backward from 1900, I believe the two lines (value of fielding percentage, value of fielding range) would have crossed somewhere in the 1888 to 1893 era.  



When did strikeouts become more important for a pitcher than control?


Growing up, I believed that, of the two categories strikeouts and walks, walks were more important (for a pitcher) than strikeouts.   Announcers would sometimes state directly that this was true, and (although I certainly questioned this traditional wisdom, as I try to question all traditional wisdom) my own observations tended to support that generalization. 

At some point, as strikeouts increased and increased and increased, it became obvious that this was no longer true.   But when, exactly, did this happen?   

It happened in 1951.   For purposes of this study, I considered Hit Batsmen and Wild Pitches to be part of Control, with their cost included with the cost of walks.   Using that method, strikeouts were actually more important than control in the 1900-1909 era, by a ratio of 7-5.

That changed in the next decade, however—control becoming then just slightly more important—and control was more valuable than strikeouts in the 1920s by a ratio of about 10-7.   Control continued to have the upper hand then until 1950, 10-9 in the 1930s and 10-9 in the 1940s. 

I am certain that I have read somewhere that the rule book was thoroughly revised after the 1951 season, although there is somebody on here who occasionally tells me that I am wrong about this.   In any case, there was a period just after World War II when walks surged to historic numbers.   In 1950 there were just short of 10,000 walks in the major leagues.  By 1952 this number had dropped to about 8700.  That was enough to make strikeouts more important than control.  As strikeouts have continued to increase in frequency, the strikeout has remained the more important of the two, by a value ratio of 12-10 in the 1950s, 13-10 in the 1960s, 12-10 again in the 1970s, 17-10 in the 1980s, 14-10 in the 1990s, 13-10 in the 2000s, and 20-10 (two to one) in our most recent decade, much higher at the end of the decade than at the beginning. 

Now, to get to the main point of today’s article. . .



            There is nothing unnatural about saying that when fewer runs are scored, more runs are prevented.  This is a very natural way to look at the data; indeed, this is THE natural way to look at the data.   Until Pete Palmer, Dick Cramer and I invented sabermetrics, that was the way that everybody looked at baseball data all the time. 

            Before Sabermetrics, basically all baseball statistics were interpreted as if all of the players were swimming in the same pool.  Essentially, everybody looked at baseball records as if standards did not change over time, and park differences were not relevant.   There is the traditional view of baseball statistics, and there is the sabermetric view. 

            The traditional view says, without explanation or apology, that the mid-1960s were dominated by fantastic pitchers, that there were a group of great pitchers active in the mid-1960s—Koufax, Gibson, Drysdale, Marichal, Bunning, Ford.   These pitchers dominated the game.

            The sabermetric view comes along and says, us being a bunch of nerdy assholes who want to spoil everyone’s fun, that "No no no; you can’t look at it that way.   Every league is a self-contained universe, and each league balances with itself.  If the number of runs scored in a league is low, it’s not because the pitchers are great or because the hitters are bad.  It’s just the conditions under which the games are played.  All of the numbers are only meaningful when compared to the league context."

            What I am trying to get to here is, we’re going to have to adopt the other viewpoint for purposes of this study.  In order to make this work, we’re going to have to at least temporarily set aside the sabermetric viewpoint, and build the data based on the traditionalist assumption.   I don’t see how we could avoid it.

            Well, actually, I DO see how we could avoid it, but that would be an even longer work-around.   We’ve got a choice here:  build the Panama Canal, and float the ships from one sea level to the other, or else sail all the way around South America.   We could work around the problem by adjusting the value of each run-prevention event to the potential run context in which it occurs, and then normalizing the data after the fact, but that would be an even more awkward and laborious process than what it looks like I am going to have to do instead.

            I’m still building this thing, still trying to think through how to build the Canal, but this is what I am thinking.   We’re going to need to state our Runs Prevented measurements here on three levels:

            The Natural Level, which assumes a universe like the pre-sabermetric universe, in which there is more or better pitching in 1968 than in 1930 or 1999,

            The Neutral Level, in which the numbers are moved to a neutral context so that there are as many runs prevented/game in every season as in every other season, and

            The Original Context Level, in which the runs prevented in a league rise and fall with the league runs scored.


            My assumption going into this process was that we would figure things on the Original Context Level, and then translate them to the Neutral Level for purposes of comparative evaluation.   I never envisioned using the Natural Level, at all.   But now I believe that we’re going to have to figure them on the Natural Level.  I am confident that we can successfully translate them back to the Neutral Level, but less confident that we will be able to carry them all the way back to the Original Context Level.   A projection between levels of runs scored loses accuracy if it gets too large.  You project 4.5 runs a game up to 5.0, an 11% projection, you don’t really lose anything.  You project 3.5 runs a game up to 6, a 71% projection, it magnifies small errors to the point that it often undermines the conclusions.
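
To put numbers on those two projections, here is a small sketch. The percentage is just the change in runs per game divided by the starting level. The second function rests on an assumption of mine that the text does not state: that a projection is a simple proportional scaling, so an error in the starting figure is scaled by the same factor. Both function names are made up for this illustration.

```python
def projection_pct(from_rpg, to_rpg):
    """Size of a projection as a percentage of the starting runs per game."""
    return (to_rpg - from_rpg) / from_rpg * 100


def projected_error(error_runs, from_rpg, to_rpg):
    """Error after projection, ASSUMING simple proportional scaling.

    The real translation between levels may not be purely proportional;
    this only shows how the size of the projection stretches an error.
    """
    return error_runs * to_rpg / from_rpg


print(f"4.5 -> 5.0: {projection_pct(4.5, 5.0):.0f}% projection")
print(f"3.5 -> 6.0: {projection_pct(3.5, 6.0):.0f}% projection")

# A hypothetical 0.20-run error in the starting figure, after each projection.
print(f"{projected_error(0.20, 4.5, 5.0):.2f}")
print(f"{projected_error(0.20, 3.5, 6.0):.2f}")
```

Under that assumption, the 0.20-run error is barely changed by the small projection and grows to roughly a third of a run under the large one.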

            On the natural level, the proposition that Runs Prevented are equal to Runs Scored is still true, only it doesn’t apply to every league.   What I was shooting for originally was a system in which Runs Prevented were equal to Runs Scored in every season.   What we’re going to have now is a system in which Runs Prevented are equal to Runs Scored throughout the history of baseball, but not in every season.   A season like 1968, there will be more Runs Prevented than Runs Scored.  A season like 1930 or 1999, there will be more Runs Scored than Runs Prevented.   
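
Here is one toy way to see how the totals can balance over history without balancing in any one season. It is not the method itself, which is not yet built. It assumes (my assumption, purely for illustration) that runs prevented are measured against a fixed baseline equal to twice the historical average of runs per game. The three seasons and their run levels are made-up round numbers, not real league figures.

```python
# Made-up runs per game for three hypothetical seasons (NOT real league data).
season_rpg = {"low-scoring": 3.5, "average": 4.5, "high-scoring": 5.5}

# Historical average across the toy seasons.
historical_rpg = sum(season_rpg.values()) / len(season_rpg)

# ASSUMPTION for this sketch: a fixed baseline of twice the historical average,
# so that prevented and scored runs balance over the whole history.
baseline = 2 * historical_rpg

# Natural-level runs prevented per game: baseline minus what was actually scored.
natural_prevented = {s: baseline - rpg for s, rpg in season_rpg.items()}

for season, rpg in season_rpg.items():
    print(f"{season}: scored {rpg}, prevented {natural_prevented[season]}")

print("total scored:   ", sum(season_rpg.values()))
print("total prevented:", sum(natural_prevented.values()))
```

In the low-scoring toy season there are more runs prevented than scored, in the high-scoring one the reverse, and the two totals come out equal across all three.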


            Well. . . that’s as much as I can tell you now; I’m not EXACTLY sure how I’m going to do this.  I may have something to report to you tomorrow, or I may not.   At this point I just can’t say. 


COMMENTS (4 Comments, most recent shown first)

When you say strikeouts are more important than walks, what do you mean by "more important"? Do you mean that strikeouts correlate better with runs prevented than walks (inversely, of course) do?

What I mean is that they play a larger role in the game. And, since they play a larger role in the game, it is more important for a pitcher to be able to get strikeouts than it is for him to avoid walks, as a generalization.
11:19 AM Apr 6th
When you say strikeouts are more important than walks, what do you mean by "more important"? Do you mean that strikeouts correlate better with runs prevented than walks (inversely, of course) do?

6:53 PM Apr 5th
This will be an interesting lens to view run prevention through. I must admit I was a little less concerned about being able to compare runs prevented across eras as I have thought that sooner or later this would develop into a much improved calculation of WS for pitchers and defense which would balance out player values across time.
3:20 PM Apr 4th
The reason for the disconnect between what's natural to think regarding relative numbers of Runs Prevented, between higher and lower run environments, and what is sabermetrically meaningful and workable in the manner being done here, is that the "natural" thought is based on a different meaning of the phrase. They're only dimly related; it's sort of a conceptual pun that the same phrase is used for both.
That's the reason so many of us had trouble grasping the idea, I think. (Certainly it was mine.) We couldn't suspend our usual concept of the meaning of the phrase.
12:40 PM Apr 4th