Why We Need Runs Saved Against Zero

By Bill James

June 23, 2020

In the normal course of events, we measure things in absolute terms before we make comparative statements about them. In the normal course of measurement, comparative statements rely upon absolute measurements. Suppose, for example, that you noted that your friend’s son seemed rather tall, so you asked, "How tall is he now?", and your friend said "He’s +3 inches."

"He’s +3 inches", you would say, "But how tall is he?"

"Three inches taller than an average fifth grader," your friend says.

"Well, yeah. But how tall is he? Is he 5-foot-4 or 5-foot-5? How tall is the average fifth grader?"

"That seems like a stupid question," says your good friend. "An average fifth grader is exactly as tall as an average fifth grader. The average is zero."

"Well, OK, but. . . how far is he from the ground?"

"You mean, above the level where his height is a problem? He’s 8 IAE. Eight Inches Above Embarrassment."

"No no no; not IAE. Inches altogether, counting everything. How tall is he from the soles of his feet?"

"We have no idea."

"You haven’t measured that?"

"It’s impossible. Measuring sticks are only 12 inches long. He’s several times that tall, so there’s no way of knowing. Anyway, why on earth would anyone need to know THAT? We know that he is +3 and that +3 at his age is +8 IAE; isn’t that really all we need to know?"

"Well, suppose that you were buying a bed for him, and you wanted to know whether the bed was long enough. Wouldn’t you want to be able to measure the bed?"

"If we’re buying a bed for him," your friend replies patiently, "we take him along with us, and we see how he fits."

"Suppose that he can’t come with you when you’re going to the store?"

"We’ll go some other time."

"Suppose you are buying sheets for the bed. Wouldn’t you need to know what size sheets to buy?"

"All sheets are about the same size. You just fold under the parts you don’t need."

"Suppose you are building a house," you reply, in patient confusion. "Don’t you need to know how high to make the doorway?"

"Everybody knows to make doorways taller than your head."

Do you see where I am going with this? Measuring a child’s height only in relative terms, without any measurement of height in absolute terms, would seem problematic, and would limit our ability to make comparisons beyond the field of his classmates. It would be a very serious limitation on what we actually know.

Or suppose that we did the same with weight; in that case, even more problematic. Suppose that you and your wife are worried that your son Charlie is getting a little bit fat, and you are trying to assess the problem. For the purpose of this one, we’ll assume that you come from a normal society, in which weight is measured in pounds, but you are living in your wife’s country, where they have no concept of absolute weight, merely relative weight.

"So what’s Charlie’s weight now?"

"It’s up a little. He’s +12."

"+12? I thought he was +14 last year? Surely he put on some weight in the last year?"

"Last year he was in a different group. All the kids are bigger now than they were a year ago."

"Well, OK, but it still looks to me like he has gotten a little heavier."

"He has."

"But you said he was +12 now and +14 a year ago. . ."

"No," she says. "I said he was +12 now. YOU said he was +14 a year ago."

"Wasn’t he?"

"He’s +12 compared to his HEIGHT. Last year he was +14 compared to his AGE. Different thing."

"What is he now, compared to his age?"

"He’s +26, but since his height is +3, his weight isn’t really so bad."

"Let’s try this another way. What is his weight, compared to that chair?"

"How would I know that? There is no chair in his class."

"How do you figure height and weight, anyway?"

"We have a balance, like what you call a teeter-totter. Charlie stands on one side, and the middle-sized kid stands on the other side. In order to make them balance, the middle-sized kid has to have 26 rocks in his pockets. So he’s +26 compared to the middle-sized kid."

Or suppose that you take a test, in school, and you get 71% of the answers correct, but rather than telling you that you’re 71%, the teacher tells you you are minus eleven. Would that not seem strange to you?

Or suppose that, in filing your IRS return, when the government asked what your income was last year, what they expected you to tell them was not what it was in absolute terms ($63,000, let’s say) but what it was relative to other people in your profession. You’d be having this conversation with your accountant:

"So, you’re a lawyer, Mr. Shyster?"

"No no. Shyster is my NAME. I’m an undertaker."

"You’re an undertaker?"

"That’s right."

"Oh, I’m sorry. My secretary told me you were a lawyer; I’m afraid I have got the wrong set of charts here." (Punches button on his phone.) "Mrs. Franklin, you brought me the wrong set of books. He’s not a lawyer; he’s an undertaker. Could you bring me the mortician books?"

"I wanted to go to law school," says the client. "But the law schools all thought my application was a prank, for some reason."

"Very sorry about that. OK, so do you charge more or less, per body, than the average undertaker?"

"We’re a first-class operation, silk-lined caskets with brass handles. So we charge a little more than the average."

"How much more?"

"We think about 17% more, on average."

"The average is what we’re concerned about here. Let’s say you are 17% above average. And how many bodies did you bury last year?"

"122."

"122?"

"Right, 122."

"That’s amazing. OK, so you do 122 more funerals a year than the average undertaker, and you charge 17% more, so your income must be. . . ."

"No no no; not 122 MORE funerals than average. 122 funerals."

"That’s not what I asked you. I asked you how many you did. What that means is, how many did you do compared to the average undertaker?"

"Why is that?"

"The government says that the average undertaker pays $14,557 a year in taxes. So we need to figure out whether you are above or below average."

Absolute values, measured from zero up, give structure to our thought. Absolute values make analysis possible. Without reference points, it is very, very difficult to create a map, because every little error magnifies itself over space, creating an uncertain relationship between distant points. "Surveying" is a matter of accurate measurement, yes, but it is also a system of creating multiple reference points. Every map starts with a zero point; you don’t know what it is, but the guy who created the map does. (For Google Earth, by the way, the zero point is in Lawrence, Kansas. Really.)

I understand, of course, why fielding is difficult to measure in absolute terms. It is difficult to measure in all sports. Offense is measured from the ground up. Defense is measured from the sky down. How tall is Charlie, measuring down from the sky? It’s a hard problem.

But the lack of a zero point, in measuring fielding, creates chaos in our ability to analyze fielding. It creates just as much confusion in measuring fielding as it would in measuring income, or in creating a map. And, because we have been measuring fielding in this odd way, there are lots and lots and lots of things about fielding that we SHOULD know, but we just have no way of knowing.

For example.

I am using "shortstops" here as a stand-in; what I say applies to any position. One of the things we don’t know is how the shortstops of one era compare defensively to the shortstops of another era. In the 1950s, 1960s, and 1970s there were many shortstops in baseball, like Roy McMillan, Mark Belanger, Ray Oyler, Dal Maxvill, Roger Metzger, Ed Brinkman and Bobby Wine, who were able to keep their jobs for years and years although they would often finish the season hitting .220 with zero home runs. Now, we don’t really have any players like that, but instead we have some shortstops who hit 30 homers a season.

But are the shortstops now making a defensive contribution equal to the shortstops of 1960, or have we substituted offense for defense at the position? One explanation for the change would be that, as strikeouts have increased, the overall reliance on fielding has decreased, so we put less emphasis on fielding, more on hitting. Another explanation would be that, because of weight training beginning at an earlier age and occurring on a more organized and systematic basis, the 200-pound shortstops of 2020 are actually just as quick and just as agile as the 160-pound shortstops of 1970. We don’t know. Since the shortstops of 2019 are compared only to the other shortstops of 2019, rather than being measured in absolute terms, we have no reasonably direct way of approaching the problem.

Let us accept the claim that the best defensive shortstop in the league is 20 runs better in some season than the average shortstop. But what is the relationship of their defensive value? Is the great shortstop 40% better than average at preventing runs, or is he 5% better than average? How many runs per inning does the shortstop have to prevent, to be average?
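To see why the question cannot be answered without an absolute baseline, here is a minimal arithmetic sketch. The +20 figure comes from the paragraph above; the two baselines (50 and 400 runs saved by an average shortstop) are purely hypothetical numbers, chosen only because they reproduce the 40% and 5% cases, and the function name is an illustration, not part of any published system.

```python
def percent_better(runs_above_average: float, average_runs_saved: float) -> float:
    """Express a relative edge as a percentage of an absolute baseline."""
    if average_runs_saved <= 0:
        raise ValueError("the absolute baseline must be positive")
    return 100.0 * runs_above_average / average_runs_saved


# The same +20 runs, read against two hypothetical absolute baselines.
for hypothetical_baseline in (50.0, 400.0):
    edge = percent_better(20.0, hypothetical_baseline)
    print(f"average saves {hypothetical_baseline:.0f} runs: +20 is {edge:.0f}% better")
```

The relative number is identical in both cases; only the absolute baseline, which we do not have, tells us which case we are in.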

What is the relative value of the different components of a shortstop’s defense? A shortstop does many different things, on the field; he makes double plays, he goes into the hole and makes long throws, he dives and gets quickly to his feet, he takes relay throws from center field and throws home.

Without any measurement of the SUM of a shortstop’s contributions, it is impossible to make accurate guesses about the value of each component. A scout sees a young shortstop; he has tremendous feet and fantastic quickness, but his arm is a little bit short. What is the relative value of one, against the other? It is impossible to know.

Measuring changes in the game, over time. . . . I believe, and I suspect that most people believe, that pitching has become more important to preventing runs than it was years ago. John McGraw estimated that baseball was 40% hitting, 10% baserunning, 30% pitching, 20% fielding. The "pitching" percentage has gone up over time. It was probably 35-37% in my youth; it may be over 40% now.

But we don’t know. Without having an absolute value for fielding, we have no way of knowing.

What percentage of a shortstop’s value is in his hitting, and what percentage is in his fielding?

You may have an opinion about that issue, and we have made efforts to measure it in the past—for example, by Win Shares. But if you know how many runs a player created—which we have known for 40 years—and you know how many he saved, then the issue of how much of his value is in offense and how much is in fielding becomes relatively clear and certain. A player creates 80 runs and saves 20, his value is 80% batting, 20% fielding.
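The arithmetic in that last sentence can be written out as a small sketch. It assumes, as the example does, that runs created and runs saved are both measured from zero and can simply be added; the function name is hypothetical.

```python
def value_split(runs_created: float, runs_saved: float) -> tuple[float, float]:
    """Return (batting share, fielding share) of a player's total value."""
    total = runs_created + runs_saved
    if total <= 0:
        raise ValueError("total runs must be positive")
    return runs_created / total, runs_saved / total


batting, fielding = value_split(80, 20)
print(f"batting {batting:.0%}, fielding {fielding:.0%}")
```

Note that the split is only meaningful with absolute numbers: a fielder who is "+5" relative to average has no defined share of anything.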

In a practical sense, the largest problem RIGHT NOW is that we can only guess at the relative value of defense between positions. We’re just guessing; we have no idea.

The relative value of defense between positions used in WAR is essentially derived from the logic of Pete Palmer in the 1970s. The idea was that, since we do not KNOW how many more runs the shortstop is saving than the first baseman, we must assume that the overall value of the two positions is equal. The average shortstop MUST be as valuable as the average first baseman, so if first basemen create 16 more runs on average, then shortstops must be preventing 16 more runs on average.
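As a sketch of that reasoning only (not of the actual WAR computation), the equal-value assumption turns an offensive gap directly into a defensive gap. The 16-run gap is from the paragraph above; the absolute offensive levels of 90 and 74 runs are hypothetical, as is the function name.

```python
def implied_defensive_gap(offense_pos_a: float, offense_pos_b: float) -> float:
    """Runs by which position B must out-defend position A,
    ASSUMING the two positions have equal total value."""
    return offense_pos_a - offense_pos_b


# Hypothetical averages: first basemen create 90 runs, shortstops 74.
first_base_offense = 90.0
shortstop_offense = 74.0
print(implied_defensive_gap(first_base_offense, shortstop_offense))  # 16.0
```

Nothing in this calculation measures defense. The defensive number is produced entirely by the assumption, which is the point at issue.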

Tom Tango, noting that there are certain problems with this approach, has tried to develop a different method to assign values to positions. For example, although everyone agrees that right field is a more demanding defensive position than left field, right fielders also hit better than left fielders. If you follow the logic that all positions must be of the same value combining offense and defense, you arrive at a conclusion that everyone acknowledges to be false: i.e., that left fielders are better defensive players than right fielders. Tom has created a method to avoid this problem.

But the REAL problem is, there is no evidence that the underlying principle is correct. There is no evidence that shortstops overall have the same value as first basemen. In football, do quarterbacks have the same value as tight ends? In basketball, does the small forward have the same value as the point guard?

When Pete Palmer first explained this idea to me, in a letter probably in 1976, I thought, "OK, well, that’s clever; we can use that until we figure out some way to determine the ACTUAL defensive value of a shortstop." The problem is, we never did. As Sabermetrics grew, it just skipped over that problem. Pete had offered a way to work around the absence of knowledge, so people said, "OK, that’s good; let’s go with that."

The proposition that a thing which cannot be measured directly can be measured instead by the reaction to it, assuming that the energy of the reaction is equal to the energy of the action, is used in many places in science, and is not a bad concept. That’s what Pete was doing; he was assuming that the size of the difference in offensive value was a reaction to the difference in defensive value, so it must be of the same size as the difference in defensive value. It is not a bad concept. It is not an infallible concept, either. Sometimes the energy of the reaction is diffused into some other place that you didn’t expect it to go. Shortstops and first basemen are drawn from separate markets with some crossover, but it is not clear that the talent available in one market equals the talent available in the next. There are market forces and random variables that can cause that NOT to be true in the real world.

But, because we never did get around to measuring defense in absolute terms, we have no way of knowing. We have no way of knowing YET. If I can make this thing work, we will have a way of knowing.

The REAL problem, though, the big problem, is that without absolute reference points, it is impossible to know whether your measurements are accurate or are not. In batting, we KNOW, we can say with great confidence, that a team which hits this number of singles, this number of doubles, this number of triples, etc., will score about this many runs. We KNOW that that is true, because we measure batting in absolute terms.

In fielding, we cannot say that a team which has this number of strikeouts, this number of walks, this number of double plays. . . .we cannot say how many runs they have prevented, and therefore we do not really know what the value of each defensive action is.

In John Dewan’s Runs Saved, there ARE absolute measurements hiding behind the data. John has a system that measures the probability that a shortstop will make a play on a ball hit this distance at this speed off the bat; it measures each of those events in absolute numbers.

Many, many people have worked hard at "solving" fielding statistics, and they have done much good work. I am not criticizing that work. I am not suggesting that it has no value. It has great value.

What I am saying is that, as a group, as a field of study, we skipped the BIG question that we should have asked. We whiffed on the big one. We struck out with the game on the line.

We skipped that problem because it was too hard. It’s too much work. It required that we make assumptions about things that we don’t actually know, and it’s very confusing trying to think it all through.

One way to understand what I am doing with this process is to return to the idea of measuring down from the sky. It is impossible to measure down from the sky, true, but suppose that we think of offensive and defensive value being in a room, with a ceiling. The average number of runs scored in a game—4.50, essentially—is the mid-point of the room. It’s a 9-foot ceiling; the midpoint is four and a half feet. Offense is measured up from the floor. Defense is measured down from the ceiling. That’s what I am doing: I am trying to measure everyone down from the ceiling, rather than up from the floor.
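Taken literally, the room metaphor can be sketched in a few lines. This is only a reading of the metaphor, assuming a ceiling at exactly twice the average runs per game; it is not a statement of the full method, and the names are hypothetical.

```python
LEAGUE_RUNS_PER_GAME = 4.50          # the mid-point of the room
CEILING = 2 * LEAGUE_RUNS_PER_GAME   # the 9-foot ceiling


def runs_saved_per_game(runs_allowed_per_game: float) -> float:
    """Measure defense down from the ceiling instead of up from the floor."""
    return CEILING - runs_allowed_per_game


print(runs_saved_per_game(4.50))  # an average team: 4.5
print(runs_saved_per_game(4.00))  # a better defensive team: 5.0
```

Under this reading, an average team scores 4.50 runs up from the floor and saves 4.50 runs down from the ceiling, so offense and defense sit on the same absolute scale. A team allowing more than 9 runs a game would come out negative, which is one reason the placement of the ceiling is an assumption and not a fact.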

I am not suggesting that my method firmly resolves all of these issues. I am not suggesting that my theoretical method is perfect, that all of my zero points are accurately established, or that all of my internal values are correct.

What I am suggesting is that, once we have a method to address these questions, then people will SEE the potential avenues of understanding to which they have previously been blind. Once people see what CAN be known, then they’ll start to invent better ways to get at those problems, better ways to estimate the number of runs saved by the shortstop, better ways to estimate the defensive difference between a shortstop and a first baseman. That is the difference between knowledge and bullshit. Bullshit is the same in one generation as it was in the previous one. Knowledge evolves.