
More Strikeouts, More Walks

May 26, 2011

            I have some more data about strikeout and walk rates that I had better write up before I forget about it.    In my previous article about this I argued that

            1) One strikeout is not equal in value to one walk,

            2) Therefore the approach of evaluating strikeout and walk data by Strikeouts Minus Walks is not optimal, and

            3) We know this because, when you take a group of pitchers with a given rate of Strikeouts Minus Walks and split that group into high strikeout/high walk pitchers and low strikeout/low walk pitchers, the low strikeout/low walk pitchers are clearly and significantly better.  

            Tom Tango, responding to this in the reader comments, replied that "K rates have been on a steady rise, and there’s been a shift in batting average on balls in play between 1992 and 1994 (.280 pre-1992, .300-1994), (so) the groupings that Bill has done may have an era bias.   This is why I find it helpful to look at the data from 1993-present, because there’s been a pretty strong line drawn there, in terms of looking at unadjusted data."

 

            Well, OK; we can look at the data in that way if you prefer.   To keep the groups of substantial size I had to make the bands a little bit wider.   Whereas in the previous study I referred to all pitchers whose strikeouts exceeded their walks by 3.85 to 4.15 per nine innings as "+4", in this study I had to make the band stretch from 3.75 to 4.25, and whereas in the previous study I had 790 pitchers in the +4 group, in this version I had only 620, despite having stretched the band.   That stuff is interesting to nobody, but. . .those are the rules; we explain the details. Well, we explain some of the details; if it were real science we would have to explain all of the details, no matter how boring, but since it’s just fun science, we can explain some of the details and let a few of them go.
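For anyone who wants to try the grouping themselves, here is a minimal sketch of the banding described above. The band width (0.25 either side of a whole number) comes from the text; the function names are hypothetical, and the median split on walk rate is an assumption, since the article does not say exactly how each band was divided into halves.

```python
from typing import List, Optional, Tuple


def k_minus_bb_per_nine(strikeouts: int, walks: int, innings: float) -> float:
    """Strikeouts minus walks, scaled to nine innings."""
    return 9.0 * (strikeouts - walks) / innings


def band_label(margin: float, half_width: float = 0.25) -> Optional[int]:
    """Return the whole-number band (e.g. 4 for "+4") or None if outside every band."""
    nearest = round(margin)
    if abs(margin - nearest) <= half_width:
        return nearest
    return None


def split_band(
    pitchers: List[Tuple[str, float]]
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Split one band into (low_walk, high_strikeout) halves.

    Each pitcher is (name, walks_per_nine). Assumption: the split is at the
    median walk rate. Within a band, more walks implies more strikeouts.
    """
    ordered = sorted(pitchers, key=lambda p: p[1])
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]
```

With the narrower band of the earlier study you would pass `half_width=0.15` instead.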

 

 

            Anyway, this is the data.   Pitchers who walked as many as they struck out had a 5.42 ERA if they were in the "Low Walk" half of the group, a 6.08 ERA if they were in the "High Strikeout" half.    I made eight groups with "High Strikeout" and "Low Walk" subgroups for each, and in every group the "Low Walk" sub-group had both a lower ERA and a higher Winning Percentage, with one exception.   In the "+5" group the "Low Walk" sub-group had a higher Winning Percentage (.545 to .520), but also a slightly higher ERA (3.88 to 3.86), although, since the High Strikeout pitchers gave up more un-earned runs, the Low Walk pitchers were still a fraction better if we count those. The general conclusion is still that pitchers who have a given margin between strikeouts and walks are more effective if they have fewer walks than if they walk people but make up for it with strikeouts.

 
  
 

            If you aggregate the groups, the "High Strikeout" pitchers had 28% more strikeouts per inning pitched than the "Low Walk" pitchers, with:

            59% more walks,

            70% more wild pitches,

            19% more balks, and

            19% more hit batsmen.

 

            They had 6% fewer hits allowed but, interestingly enough, only 2% fewer home runs allowed.   The high strikeout pitchers, taking the groups as a whole, gave up 119 home runs per 1000 hits; the low walk pitchers, 115 per 1000 hits.
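The hit and home run figures can be checked against one another with a line of arithmetic. This is only a sketch: it uses the rounded numbers quoted above (6% fewer hits, 119 and 115 home runs per 1000 hits), and the function name is hypothetical.

```python
def relative_home_runs(hits_ratio: float, hr_per_1000_a: float, hr_per_1000_b: float) -> float:
    """Home runs allowed by group A relative to group B.

    hits_ratio is A's hits allowed divided by B's; the other two arguments
    are each group's home runs per 1000 hits.
    """
    return hits_ratio * hr_per_1000_a / hr_per_1000_b


# High Strikeout group relative to Low Walk group, from the rounded figures.
ratio = relative_home_runs(0.94, 119, 115)
print(f"{(1 - ratio) * 100:.1f}% fewer home runs")  # prints "2.7% fewer home runs"
```

That is in line with the roughly 2% reported, given that the inputs here are themselves rounded.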

 

            There’s a lot more data there, but first we have another problem to deal with.   The high strikeout/high walk pitchers, in this study, also had 91% more Saves, per inning pitched, than the low walk pitchers.   High strikeout/high walk pitchers were much more likely to be used in relief. 

            I call this the church ceiling effect.   One time when I was about 13 years old, our church hired my dad to re-paint the ceiling, which hadn’t been re-painted, we don’t think, since the church was built 70 years before.   It wasn’t an easy job; the vault of the ceiling was probably 35 feet high, but we put up a scaffolding system and painted the ceiling a glowing, crystal white, made the whole church brighter. 

            A couple of weeks later the minister came over, and said, "Well, I’ll say this for you, George; you sure made the rest of the church look like hell."   It did, too; once you re-painted the ceiling it focused attention on the dirty walls of the rest of the church—and once those were painted, the floors looked like they hadn’t been cleaned since Satan moved out. 

            In studies like this, once you remove one bias, it focuses attention on the next.    If you have multiple biases in a data set they will tend to counter-act one another and minimize their effects.   If you remove one, it will focus attention on the next so that you have to remove the next, and when you remove the second, it merely focuses attention on a third.    You wind up with a fourth as much data as you started out with.   Most people in the field like to sift the data to remove biases, but I generally don’t believe in it.   

            Anyway. . . .having removed the timeline bias, we now have to compare starters to starters, so let’s get to it.   I removed all pitchers from the data who had fewer than 3 starts.   In modern baseball the jobs of pitchers are so sharply defined that there are very few pitchers who make 3 starts unless they are starting pitchers; pitchers don’t really shift back and forth between starting and relief a whole lot anymore.   With the relief pitchers removed, this is the same data I showed you before.

  

            The Low Walk pitchers still have a better ERA and a better Winning Percentage in every sub-group than do the High Strikeout pitchers, except that the anomaly noted before persists.  

            In the interests of accommodating further research by anybody who is still into this, this chart presents some extended data from these sub-groups, like hits, runs, earned runs and home runs allowed:

 
 
            For what it is worth, this data does show that the performance gap between the High Strikeout and Low Walk groups is smaller than I first estimated in the earlier study.   Still, the difference is large enough that I would not choose to evaluate a pitcher’s strikeout and walk data by this approach.   I simply do not believe that a pitcher who strikes out 250 batters and walks 150 is likely to have the same level of effectiveness as a pitcher who strikes out 150 and walks 50.    That’s the last I’ll say about that; I’m not going to get emotionally invested in what metrics you choose to use; hell, I never liked OPS, either.

            Tom Tango, in a private communication, argued that


You can also do this:
1. Take any team, and calculate RC for that team.
2. Calculate RC/27 outs.
3. Add 100 walks and 100 strikeouts.
4. Repeat steps 1 and 2

You will get very close to the same number of RC/27 outs for both teams.

Actually, in saying this, he under-stated his argument; in my experiments, I found that you usually get a slightly lower RC/27 outs if you add 100 walks and 100 strikeouts.
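The experiment is easy to reproduce. The sketch below assumes the basic version of Runs Created, (H + BB) x TB / (AB + BB), and counts outs simply as at bats minus hits; neither the text nor Tango's note says which version of the formula was used. The team totals are made up for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TeamLine:
    at_bats: int
    hits: int
    walks: int
    total_bases: int


def runs_created(t: TeamLine) -> float:
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (t.hits + t.walks) * t.total_bases / (t.at_bats + t.walks)


def rc_per_27_outs(t: TeamLine) -> float:
    """Runs Created per 27 outs, counting outs as AB - H (a simplification)."""
    outs = t.at_bats - t.hits
    return 27.0 * runs_created(t) / outs


def add_walks_and_strikeouts(t: TeamLine, n: int = 100) -> TeamLine:
    """A strikeout is an at bat with no hit; a walk adds to BB but not to AB."""
    return replace(t, at_bats=t.at_bats + n, walks=t.walks + n)


# Illustrative team totals, not real data.
team = TeamLine(at_bats=5500, hits=1450, walks=550, total_bases=2300)
before = rc_per_27_outs(team)
after = rc_per_27_outs(add_walks_and_strikeouts(team))
print(f"before: {before:.2f}  after: {after:.2f}")
```

For this made-up team the figure drops slightly, from about 5.07 to about 5.03, which matches the direction described above.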

This brings up a terribly interesting question at the tail end of this project: Why?   If, in theory, a pitcher is just as effective if he adds 50 walks and 50 strikeouts, why is this noticeably untrue if studied in the way I have studied it here?

There are three theories that suggest themselves:

1) The Runs Created formula understates the value of a walk.  This is, of course, true; we have known this for many years, that the Runs Created formula slightly understates the value of a walk, because it assumes that a walk has no "RBI value", only the "table setting" value.   In fact, a walk does have a small value in advancing runners.  The more complicated versions of the Runs Created method adjust for this. 

2) The fellow travelers of Walks are damaging.   This, again, is no doubt true; pitchers who walk more hitters also throw more Wild Pitches, hit more batters with the pitch, commit more Balks, and also give up more Home Runs as a percentage of hits.    These costs are not insignificant.

3) Walks may have some tendency to form clusters.   This is the really interesting one, because we don’t know if it is true or not.   Offensive elements are particularly damaging when they form clusters.   It may well be that walks, perhaps because of the tendencies of umpires or because of other factors, are more inclined to form clusters than are, let us say, singles.   We could study this, for example, by asking what a pitcher’s walk rate is when he has walked one of the previous two hitters, and when he has not.   If the walk rate is higher when there has been a previous walk, then there is evidence of clustering.

I am intuitively inclined to believe that such an effect might well exist.  
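The clustering test suggested above could be sketched as follows. This assumes you have one pitcher's plate appearance outcomes in order, with walks coded as "BB"; the outcome codes and the function name are hypothetical. It also ignores inning and game boundaries, which a real study would probably want to respect.

```python
from typing import Optional, Sequence, Tuple


def walk_rates_by_context(
    outcomes: Sequence[str], lookback: int = 2
) -> Tuple[Optional[float], Optional[float]]:
    """Return (walk rate after a recent walk, walk rate with no recent walk).

    A "recent walk" means at least one of the previous `lookback` batters
    walked. Batters without a full `lookback` history are skipped.
    """
    after_walk = [0, 0]      # [walks, batters faced]
    no_recent_walk = [0, 0]  # [walks, batters faced]
    for i in range(lookback, len(outcomes)):
        recent = outcomes[i - lookback:i]
        bucket = after_walk if "BB" in recent else no_recent_walk
        bucket[1] += 1
        if outcomes[i] == "BB":
            bucket[0] += 1

    def rate(bucket):
        return bucket[0] / bucket[1] if bucket[1] else None

    return rate(after_walk), rate(no_recent_walk)
```

If the first number comes out reliably higher than the second across many pitchers, that would be evidence of clustering; a single pitcher's sequence is far too small a sample to say anything.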

 
 
