Random and Responsive Performance by Starting Pitchers: Part I

July 10, 2010

One of the central questions of sabermetrics, stated in the obscure, unemotional language favored by academics and bankers testifying before Congress, is to what extent player performance is responsive to conditions, and to what extent it occurs in random patterns. This dispute has hundreds of manifestations.

            The experts and analysts of my youth universally believed that player performance was responsive to conditions.   They believed that there were large-scale, omnipresent interactions between situations and performance.   They believed that some hitters hit much better in clutch situations than at other times, while others fell short in the same conditions.   They believed that some hitters were RBI men, meaning that they had the ability to hit better in RBI situations.   They believed that if a base stealer reached base, the next hitter would hit better because the pitcher’s attention was divided.   They believed that a power hitter would hit better if he had another good hitter behind him, to protect him.   If one team hit .240 and scored 750 runs while another hit .270 and scored 700 runs, they attributed this to timely performance and to subtle, unmeasured skills like hitting behind the runner and going from first to third on a single.   They believed that good teams won the close games because they executed when the game was on the line.  They believed that great players dominated in the heat of a pennant race.  If one pitcher went 18-11 with an ERA of 4.00 while a teammate went 9-15 with an ERA of 3.60, they believed absolutely and without question that this happened because the 18-11 pitcher had pitched better when the game was on the line.  

            They believed these things, and they were the experts, and so, in 1975, I believed these things.   I had been told all of my life that these things were true, and I accepted that they were.   When I began to study the game in a systematic way, I expected to find obvious and abundant evidence of these responsive performance interactions.

            I couldn’t find them.

            I couldn’t find any of them, basically.   I couldn’t find (nor could other researchers) any evidence that certain players had an ability to hit in the clutch, nor an ability to hit in RBI situations.    I couldn’t find that when a base stealer reached base, the next hitter hit better, because in fact he doesn’t.    I couldn’t find that any hitter hit meaningfully better with “protection” behind him, because in fact nobody does.

I did come to understand why one team would hit .240 and score 750 runs, while another would hit .270 but score 700 runs, but the explanation, nine times in ten, wasn’t clutch hitting or subtle skills; it was that the team that hit .240 had more power and more walks.   With regard to good teams and close games, I found that the opposite of the conventional wisdom was true:  that good teams don’t in fact win the close games.   .600 teams are .550 teams in one-run games.
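The arithmetic behind the power-and-walks explanation can be sketched with one common estimator, basic runs created: (H + BB) × TB / (AB + BB). Using that estimator here is an assumption, and the two team batting lines below are invented for illustration; only the .240 and .270 averages come from the text.

```python
from dataclasses import dataclass


@dataclass
class TeamBatting:
    """Season batting totals for one team (hypothetical numbers only)."""
    at_bats: int
    hits: int
    walks: int
    total_bases: int

    def batting_average(self) -> float:
        return self.hits / self.at_bats

    def runs_created(self) -> float:
        # Basic runs created: (H + BB) * TB / (AB + BB).
        on_base = self.hits + self.walks
        opportunities = self.at_bats + self.walks
        return on_base * self.total_bases / opportunities


# Invented lines: the low-average team has more walks and more total bases.
low_avg_team = TeamBatting(at_bats=5500, hits=1320, walks=650, total_bases=2300)
high_avg_team = TeamBatting(at_bats=5600, hits=1512, walks=420, total_bases=2130)

for name, team in (("low average", low_avg_team), ("high average", high_avg_team)):
    print(f"{name}: AVG {team.batting_average():.3f}, "
          f"estimated runs {team.runs_created():.0f}")
```

With these made-up totals the .240 team comes out roughly 50 estimated runs ahead of the .270 team, with no clutch hitting involved.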

And I did come to understand why one pitcher would go 18-11 with a 4.00 ERA, while a teammate would go 9-15 with a 3.60 ERA. It was offensive support. It was entirely offensive support, I argued. There was no evidence (or so I thought at the time) that it had anything at all to do with pitching well when the game was close, or, indeed, that there was any such ability.

I was widely attacked for saying these things, which was one of the best things that ever happened to me. The attacks were of two natures: 1) That I was ignorant for knowing nothing about the small skills and subtleties of the game which could not be measured by mere numbers, and 2) That by arguing that players’ performance did not change with game situations, I was arguing against character and courage. Players came through in the clutch, people chose to believe, because they were Men of Character. By arguing that there was no evidence of any such ability, I was seen to be arguing against the relevance of character.

Conventional wisdom in these areas is not what it once was.   What the experts and analysts of 40 years ago believed, a certain number of them still believe today, and that “certain number” is near 100% of the on-air experts.   Still, the situation is very different than it was, because everybody now has data.   Everybody now has batting averages with runners in scoring position, and run support numbers for pitchers, and other points of data that grind away daily at the misconceptions that loomed as monsters in days of yore.   It is different because everybody has data, and it is different as well because there is now another community of analysts and experts, a community whose expertise is based not merely on playing the game, but on actually studying it.

 

 

II.

 

There are, of course, some documented interactions between situations and performances.   It is not that these responsive performance variations don’t exist, but that many of them don’t exist at all, while those that do either are of much less significance than was once believed or act in the opposite manner from what was long assumed.

Among the heretical denunciations of my youth, however, the one with which I am least comfortable is the belief that pitchers have zero ability to pitch to the level of the game—that is, in the example above, that ALL of the difference between the 18-11 pitcher and the 9-15 pitcher is in offensive support, and that none of it is explained by the 18-11 pitcher pitching to the level that was required to win the game.

Pitching is different from hitting in this respect: that pitching is planned, whereas hitting is reacting. A hitter who tried to “bear down” at key moments of the game would almost certainly perform worse, not better, as he would just be putting a lot of tension in his swing, disrupting his normal mechanical actions. A pitcher, on the other hand, reasonably could maximize his outcomes by “bearing down” at key moments of the game. A pitcher could “set up” hitters by throwing secondary pitches with the bases empty, but using his best stuff when the game was on the line. A pitcher could coast when the game is out of hand, just working his way calmly through the innings without worrying too much about runs scoring. I’m not saying that it’s true, but it’s easier to visualize it working for a pitcher than for a hitter.

One of the arguments I made, thirty years ago, was this: that if you took the “A” and “B” parts of those sets (the 18-11 pitcher being the “A” part) and projected them forward, the “B” pitchers would perform better in subsequent seasons: not only better in terms of ERA, but better in terms of wins and losses as well. There was no evidence of a carryover effect here, thus no evidence of a real skill.

There were other arguments that I made, as well, but the problem with that argument is this.  If every pitcher pitched to the level of the contest, and did so regularly and on a large scale, then of course we could easily identify that, and demonstrate that it was true.   But what if just one pitcher had this skill?

If just one pitcher had this trait, the aggregate data would look the same as if no pitcher had this trait—but what if 15 or 20% of pitchers had this trait, and what if the trait itself was fairly small and difficult to document?  Would I have been able, under those circumstances, to recognize the differences between a universe of pitchers, none of whom had this trait, and a universe of pitchers, 15 or 20% of whom had this trait to some small degree?

Frankly, I could not have. I am not well trained in statistical methods. I think by analogy and I work by intuition, with a sort of bumper-sticker understanding of scientific methods. It seems to me that if only a minority of pitchers had this trait, and if the trait was of relatively modest scale in those cases, I would almost certainly have missed it.
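A rough sketch of why a minority trait of modest size would be easy to miss is given below. It assumes a simple model that is not in the original: each pitcher's seasonal "wins beyond run support" is a fixed true effect (zero for most pitchers) plus independent year-to-year noise. The share, effect size, noise level and sample size are all made-up illustrative values, and the standard-error formula is only the usual rough approximation for a correlation near zero.

```python
import math


def expected_repeat_correlation(share_with_trait: float,
                                effect_wins: float,
                                noise_sd_wins: float) -> float:
    """Expected year-to-year correlation of 'wins beyond run support'.

    Model (an assumption): each pitcher's seasonal residual is a fixed true
    effect (effect_wins for the share that has the trait, 0 otherwise) plus
    independent noise with standard deviation noise_sd_wins.
    """
    # Variance of a 0-or-effect trait spread across the population.
    true_variance = share_with_trait * (1 - share_with_trait) * effect_wins ** 2
    total_variance = true_variance + noise_sd_wins ** 2
    return true_variance / total_variance


def correlation_noise(n_pitchers: int) -> float:
    """Rough standard error of a sample correlation when the true value is near 0."""
    return 1 / math.sqrt(n_pitchers - 1)


# All inputs are hypothetical illustrative values, not estimates from data.
signal = expected_repeat_correlation(share_with_trait=0.20,
                                     effect_wins=1.0,
                                     noise_sd_wins=3.0)
noise = correlation_noise(n_pitchers=200)
print(f"expected correlation with the trait: {signal:.3f}")
print(f"sampling noise in that correlation:  {noise:.3f}")
```

Under these assumptions the expected carryover correlation is under 0.02, while the sampling noise in a 200-pitcher study is about 0.07, so a universe in which 20% of pitchers have the trait would look much like one in which nobody does.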

Well, is it reasonable to think that only 15 to 20% of pitchers would have a certain trait?

Sure it is.

What percentage of pitchers can throw an effective change-up?   What percentage of pitchers can throw 95?   What percentage of pitchers can throw left-handed?  What percentage of pitchers have a four-pitch mix?   What percentage of pitchers can pick a runner off first base?

Some pitchers have traits that other pitchers do not have.   It is not inherently unreasonable to think that some pitchers would have this trait and others would not, any more than it is unreasonable to think that some outfielders would be able to throw well enough to play right field, and others would not.

I’m not apologizing; I did honest work and reported on it honestly.   I am simply saying that, looking back at it from a distance of time, I find my own argument to be not entirely convincing.

 

 

III. The Curious Case of Mike Morgan

 

Recently, as Livan Hernandez moved past 2800 innings in his career, we veered into a discussion of whether he might be the worst pitcher ever to have such a long career.   In the course of this discussion Mike Morgan and his career 141-186 won-lost record were mentioned, probably by me.   It was pointed out to me in response that, while Morgan’s won-lost record was 45 games under water, Morgan was only 36 runs worse than an average pitcher, park and league adjusted—thus, that his won-lost record was basically attributable to the teams that he pitched for.

Or was it?

Actually, the teams that Mike Morgan pitched for weren’t all that bad. Morgan pitched for 25 teams in his career, counting each season with each club separately. Not counting Morgan’s decisions, 12 of those teams had losing records, 12 of them had winning records, and the other was exactly .500. The career winning percentage of Mike Morgan’s teams in games other than Morgan’s own decisions, weighted by Morgan’s number of decisions each year, was .494. Had Morgan had the same winning percentage as the rest of his team in every season and the same number of decisions that he actually had, his career won-lost record would have been 162-165. Which, by the way, would have made him Bump Hadley. Bump Hadley, most famous as the pitcher who skulled Mickey Cochrane, had a career record of 161-165, with a 4.24 ERA.
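The weighting described above can be sketched as follows. This is a minimal illustration, not the original calculation: the `TeamSeason` type, the function names and the two sample seasons are all hypothetical, and Morgan's actual season-by-season records are not reproduced here.

```python
from typing import NamedTuple


class TeamSeason(NamedTuple):
    team_wins: int       # team wins, all pitchers
    team_losses: int     # team losses, all pitchers
    pitcher_wins: int    # the pitcher's own wins
    pitcher_losses: int  # the pitcher's own losses


def rest_of_team_pct(s: TeamSeason) -> float:
    """Team winning percentage with the pitcher's own decisions removed."""
    wins = s.team_wins - s.pitcher_wins
    losses = s.team_losses - s.pitcher_losses
    return wins / (wins + losses)


def weighted_rest_of_team_pct(seasons: list[TeamSeason]) -> float:
    """Rest-of-team percentage, weighted by the pitcher's decisions each season."""
    decisions = sum(s.pitcher_wins + s.pitcher_losses for s in seasons)
    weighted = sum(
        rest_of_team_pct(s) * (s.pitcher_wins + s.pitcher_losses) for s in seasons
    )
    return weighted / decisions


def expected_record(seasons: list[TeamSeason]) -> tuple[int, int]:
    """Record at the teammates' percentage, keeping the actual decision count."""
    decisions = sum(s.pitcher_wins + s.pitcher_losses for s in seasons)
    wins = round(weighted_rest_of_team_pct(seasons) * decisions)
    return wins, decisions - wins


# Two invented seasons for a pitcher who went 8-12 and then 10-14.
seasons = [TeamSeason(90, 72, 8, 12), TeamSeason(70, 92, 10, 14)]
print(f"weighted rest-of-team pct: {weighted_rest_of_team_pct(seasons):.3f}")
print("expected record: %d-%d" % expected_record(seasons))
```

Applied to the figures in the text, a .494 weighted percentage over Morgan's 327 decisions works out to about 161.5 wins, which is consistent with the 162-165 record quoted above.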

Might it not be true that Morgan’s runs allowed rate was misleading, because Morgan failed to pitch well when he had a chance to win the game?

In fact, it is true.  Morgan did not pitch well when he had a chance to win, and his won-lost record does reflect his failure to do so.

In tomorrow’s installment of this short series, I will document that this is true, and edge us toward a method to make the appropriate adjustments based on that.

 
 

COMMENTS (3 Comments, most recent shown first)

monahan
Agreed. A fantastic and concise description of the role of sabermetrics.

Also, it seems that the "analysts of your youth" have been resurrected in modern politics as the Tea Party.
3:03 AM Jul 21st
 
THBR
Ditto, brother Ralph, and I'm going to print off 10 or 20 copies of it to hand out to friends. This is a BASIC and UNDERSTANDABLE version of what sabermetrics is all about: posing the questions, posing the questions PROPERLY, finding the data, and showing from the data that you were right, you were wrong, or you need another study, set up perhaps differently.

*I* have no training in statistics, either, but I can usually follow Bill James because a) he thinks logically and b) he can write clearly. I love this stuff!
9:21 PM Jul 10th
 
nettles9
What do I think? I think I'm looking forward to the rest of this short series.
5:43 PM Jul 10th
 
 
©2024 Be Jolly, Inc. All Rights Reserved.