Responding to Reader Comments
In an article posted here a few days ago, I discussed the issues of whether and how we should count “RBI Opportunities”. My proposal was to count RBI opportunities as the sum of actual RBI and missed RBI opportunities, with missed RBI opportunities being defined as:
1.00 for each runner left on third base with less than two out, plus
.70 for a runner left on second base or left on third base with two out, plus
.40 for a runner left on first base.
No missed RBI opportunities were counted, however, when no out was made. I asked readers to comment on that proposal, and the purpose of this article is to carry forward the discussion by responding to your comments.
The first responder, mskarpelos, said that he liked the general approach, but that “I suggest using adjustments based on the famous 24-node finite state Markov Chain.” I tend to regard the 24 states analysis sort of the way I regard music at the ballpark: I know that we need this, but I’m not sure that we need quite so much of it.
Statistics, like all other products, must serve the needs of the consumers as well as the inclinations of the statisticians. The public may eventually get the general idea of a 24-states analysis, but you’re asking an awful lot, to ask the public to sit through calculations of that length in order to get to something as simple as a count of RBI opportunities.
I have invented many stats that are now in common usage, and I have invented many more that are or have been forgotten. It is my view that, for a stat to have a chance to succeed, we have to be able to explain it to people whose patience with statistical analysis is extremely limited. We have to shoot for the day on which a radio announcer can say casually “Jackson has 48 RBI this year in 147 RBI opportunities, not a very good ratio,” and everybody will know what he means. We can never reach that point if I can’t explain to the radio announcer how RBI opportunities are counted. And I can’t explain it if I have to reference something that he doesn’t understand or want to understand, like the Markov Chain.
So I think what you’re really saying. . .I know you don’t mean it this way. . .but I think what you’re really saying is “Screw the consumer; let’s just make it up the way we like it.” But if you do that, you very quickly become irrelevant to the general discussion. I have worked hard to be relevant.
The same general issue arises in respect to Tangotiger’s suggestion that the opportunity loss from a batter failure with a runner on third, no out, is actually much less than the opportunity loss from a failure with a runner on third with one out, and thus that the grouping should not be the 1st and 2nd outs of the inning against the 3rd, but rather the 1st and 3rd outs against the 2nd.
I very much appreciate Tango’s research on the issue of how my outline relates to the run probabilities chart, and I appreciate the generally supportive comments. But that said. . .aren’t you kind of looking at the issue from the wrong angle here? Measuring changes in run states is a very useful exercise, but it doesn’t happen to be exactly what we’re doing right now. Exactly what we’re doing right now is measuring missed RBI opportunities.
A batter who strikes out or pops out with a runner on third and no one out has missed a very easy RBI opportunity, the same as he has if he strikes out or pops out with one out. True, the “damage” to his team, by the Markov Chain, may be greater with one out (just as one RBI may be of much more value than another RBI), but what the batter has done is the same. We don’t adjust RBI counts for the value of each RBI, at least while we are in the process of simply counting RBI. Does it really make sense to try to adjust Missed RBI for the value of the RBI that are missed?
Setting that issue aside, I go back to the earlier problem. I visualize myself on the radio, trying to explain this concept to some afternoon radio guy in Mobile, Alabama—and the reality is, if the afternoon radio guy doesn’t get the idea, then the public doesn’t get the idea. It seems to me that, by making the “missed RBI count” larger for a runner on third, one out, than for a runner on third, no one out, we’re putting a huge unnecessary boulder in the path of the explanation. We’re going to have three minutes on the air to explain the concept—and I’ll guarantee you, if we do this, we’re going to spend two of the three minutes trying to explain why striking out with a man on third and no one out is better than striking out with a man on third and one out. I just can’t feature doing that.
The suggestion to park-effect the stat, again, is a suggestion to complicate the computation to such an extent that no one will ever understand what we’re doing. I will also point out that this is contrary to standard statistical practice. Everything is related to park effects—including, for example, RBI themselves. But when counting RBI, we don’t count an RBI as .80 RBI in Coors Field and 1.15 in Shea Stadium—nor should we. We count them first, then we adjust for park effects. Same thing here: count first, adjust later.
Trailbzr asks “What is the purpose of the stat?”, to which the answer is “to measure each hitter’s RBI opportunities.” Sorry. . .I thought that was clear.
It is worth making the point: it is not the purpose of this stat to measure how good a hitter someone is, or to measure what his value is compared to another hitter. We are simply trying to put the RBI stat in the context of RBI opportunities—that’s all.
The discussion about Mike Schmidt seems to be leap-frogging the research, and I probably shouldn’t comment on that.
Martin suggests that “I suspect that this will go over well with the statheads and almost nobody else. Remember how much people despised GWRBIs? Well, they’ll view this the same way.”
This is just my opinion, but I think that kind of defeatist thinking is fantastically wrongheaded. First of all, I am very puzzled by how you can connect this to GWRBI, or why you would do this. Out of a universe of a billion failed stats and a few hundred successful ones which have been introduced since then, why did you choose to link RBI opportunities to a stat with which it seems to have virtually nothing in common?
Game Winning RBI failed for the best possible reason: it was horribly designed. It richly deserved to fail. Most new stats fail, including mine (many of which, in retrospect, have obvious design flaws), and people who work with me know that I am perpetually over-optimistic about the chances of making a new stat work. We’ll see, I guess. My feeling is that it can succeed, if we do enough things right.
Gregg Borgeson pointed out two missing elements in the stat: 1, that we shouldn’t charge an RBI opportunity for a successful sacrifice bunt, since the bat has been taken out of the batter’s hands, and 2, that we should do something with double play balls. He is correct on both points. . .he had a suggestion for what to do with GIDP, which I have not exactly adopted, but there is a need to do something there.
Ralph C. (Cramden?) asked what happens if multiple players are stranded. We just add up the totals. . .if you bat with the bases loaded, nobody out, and strike out, that counts as 2.1 missed RBI—1.00 for the runner on third, .70 for the runner on second, .40 for the runner on first.
OK, there’s one more issue here: RBI opportunities when somebody could hit a home run. My proposal before ignored these. I think there are four reasons not to ignore them:
1) Parallel construction. If a player gets an RBI for a solo home run, why is there no “opportunity” for an RBI if the bases are empty?
2) Mathematical consistency. We had this:
Runner on Third 1.00
Runner on Second .70
Runner on First .40
Doesn’t this fit kind of perfectly:
Runner on Third 1.00
Runner on Second .70
Runner on First .40
Runner at Home .10
3) Suppose that you compare these two players:
 AB    H   2B   3B   HR   BB   SO   RBI
400  136   32    4   24   70   50    90
500  136   32    4   24   40  100    90
In that case, are these two players the same as RBI men, or are they different? Were their RBI opportunities truly the same, or were they different?
They were different. The player who had 100 more at bats had 100 more chances to do something. It seems to me that it is inappropriate to entirely ignore those, even in the extreme and unusual case where the second player makes all of the marginal outs with the bases empty.
4) With all due respect to the people who wanted to drag the Markov chain into this, it seems to me that that’s really not the relevant “balance” that we should be looking for. What we should be looking for, it seems to me, is to balance the numbers so that an average hitter has essentially the same ratio of RBI to missed RBI in each situation. I don’t KNOW, but I would assume we’re pretty close to that. That was my intent, anyway.
Suppose that the second hitter above, with 500 at bats. . .let’s assume that he is kind of an average hitter, and let’s assume that his plate appearances are split:
135 with men on third
135 with men on second
135 with men on first
135 with the bases empty
Unrealistic, of course, but for illustration, and let’s assume that his performance is the same in each group—34 hits, 8 doubles, a triple, 6 homers, 10 walks.
With men on third, the player would probably drive in about 49 runs (one for each hit, plus 6 for the homers, plus about 7 that might score on fly balls and ground balls.) He would probably make about 84 outs that didn’t produce an RBI, for which he would be charged with 76 missed RBI (assuming that two-thirds of these outs would be the first or second out.) Thus, his expected RBI percentage would be between .350 and .400 (49 for 125, more or less).
With runners on second he probably would drive in about 36 runs (12 on the homers, one on the triple, 8 on the doubles, and about 15 on the singles) while making 91 outs that didn’t produce a run, for which we would charge him about 63.7 missed chances. Thus, his expected RBI percentage would be in the same general range, about .360 (36 for 99.7; you get different numbers when you estimate for different types of hitters).
With runners on first he probably would drive in about 19 runs (12 on the homers, one on the triple, 6 on the doubles) while making the same 91 outs that didn’t produce a run, for which we would charge him 36.4 missed chances. Thus, his expected RBI percentage would be in the same range—about .350 (19 for 55.4).
But with the bases empty, if we don’t charge anything for outs there, he would have 6 RBI vs. no missed RBI—a percentage of 1.000. That doesn’t seem right, and it causes bases-empty opportunities to distort the totals. If you charge him .10 missed RBI for each of his 91 outs, he’s back in the same range—about .400 (6 for 15.1). That seems to me to be better.
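The arithmetic in the four cases above can be checked with a short sketch. The RBI and out counts are the estimates given in the text; the function name and the layout of the table are just illustrative choices, and the .90 average weight for a runner on third follows from the stated assumption that two-thirds of the outs are the first or second out.

```python
# Sketch of the balance check above. RBI and out counts are the article's
# own estimates; names like expected_rbi_pct are illustrative only.

def expected_rbi_pct(rbi, outs, weight):
    """RBI divided by RBI opportunities (RBI plus missed RBI)."""
    missed = outs * weight
    return rbi / (rbi + missed)

# Runner on third: assume 2/3 of outs come with fewer than two out (1.00)
# and 1/3 come with two out (.70), for an average weight of about .90.
third_weight = (2 / 3) * 1.00 + (1 / 3) * 0.70

cases = {
    "runner on third": (49, 84, third_weight),
    "runner on second": (36, 91, 0.70),
    "runner on first": (19, 91, 0.40),
    "bases empty": (6, 91, 0.10),
}

results = {name: expected_rbi_pct(*args) for name, args in cases.items()}

for name, pct in results.items():
    print(f"{name:17s} {pct:.3f}")
```

Run as written, the four percentages land between roughly .34 and .40, which is the "same range" the proposal is aiming for; with a weight of zero for bases-empty outs, the last case would jump to 1.000.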
OK, this is what I’m going to do. . .and thank you all for your input. We’re going to figure RBI opportunities in this way:
1) RBI Opportunities are the sum of actual RBI and Missed RBI Opportunities.
2) Missed RBI Opportunities are tallied as follows:
1.00 for a runner left on third base with less than two out or for grounding into a double play,
.70 for a runner left on second base or for a runner left on third base with two out,
.40 for a runner left on first base,
.10 for a bases-empty out, HOWEVER
No Missed RBI Opportunities are charged when the batter does not make an out or hit into a forceout, and
No Opportunity is charged on a successful sacrifice bunt.
If a player leaves multiple runners on base he is charged with each missed opportunity. If a player Grounds into a Double Play with other runners on base, he is also responsible for the other missed opportunities. He is not charged with a missed RBI opportunity, however, for a runner who scores from third on the Double Play—no RBI, but no missed RBI for that runner, since that runner has scored.
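For anyone who wants to tally this from play-by-play data, here is a minimal sketch of the rules above for a single plate appearance. The function and parameter names are hypothetical. It also rests on two readings that the rules do not spell out: that "less than two out" refers to the outs when the batter came up, and that a double play costs a flat 1.00 plus the usual charge for any other runner left on base.

```python
# Minimal sketch of the final counting rules; names are hypothetical.
# Assumption: outs_before is the number of outs when the batter came up.
# Assumption: a GIDP is a flat 1.00, plus the charges for other runners left on.

def missed_rbi(left_on=(), outs_before=0, *, bases_empty=False,
               out_or_forceout=True, sacrifice_bunt=False, gidp=False):
    """Missed RBI opportunities charged to the batter for one plate appearance.

    left_on: bases (1, 2, 3) of runners left on base; runners who scored
             or were retired on the play are not listed.
    """
    # No charge on a successful sacrifice bunt, or when no out/forceout is made.
    if sacrifice_bunt or not out_or_forceout:
        return 0.0
    # A bases-empty out is charged .10.
    if bases_empty:
        return 0.10
    total = 1.00 if gidp else 0.0
    for base in left_on:
        if base == 3:
            total += 1.00 if outs_before < 2 else 0.70
        elif base == 2:
            total += 0.70
        elif base == 1:
            total += 0.40
        else:
            raise ValueError(f"unknown base: {base}")
    return round(total, 2)


def rbi_opportunities(rbi, missed):
    """RBI Opportunities are actual RBI plus Missed RBI Opportunities."""
    return rbi + missed


# Bases loaded, nobody out, strikeout: 1.00 + .70 + .40
print(missed_rbi([1, 2, 3], outs_before=0))
```

Season totals are then just the sum of `missed_rbi` over a hitter's plate appearances, added to his actual RBI. Park adjustments, if wanted, would come afterward: count first, adjust later.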