I had a question in the “Hey, Bill” section on September 25, from someone signing in as Jake, asking about the “notion that ‘power pitcher’ are better in the postseason (this is often used to explain why Smoltz was better than Glavine or Maddux in the playoffs). Aside from the obvious that good pitchers tend to throw the ball harder than bad ones, this one doesn’t seem to hold much water, but I’ve never seen any data on it.”
I told him that I would try to study the issue. I had in mind a quick 5- or 7-hour study of the issue, but it turned into a time-consuming monster that probably took me 35 hours to do. . .like that affects your life or the value of the study. Anyway, I was not the first person to study this. Someone sent me a link to a study by Nate Silver, asking the more general question “What are the characteristics of teams which succeed in the post season?”, in which Silver found that teams with power pitchers did well.
We should say at the top that obviously this bias in the data could not explain the post-season effectiveness of a pitcher like Smoltz or Schilling. There may be a bias in post-season in favor of power pitchers, but obviously it could not be large enough to cause John Smoltz to go 15-4 with a 2.65 ERA in 207 post-season innings.
Anyway, I decided to study this with a matched set study, taking the quality of the pitchers out of the equation by very carefully matching power pitchers with equally good finesse pitchers. I started by identifying all pitchers in the years 1969 through 2001 who made at least one post-season start (starting in 1969 because that is when the playoffs started, giving us a more useful number of pitchers appearing in post-season play, and ending in 2001 because when I got through 2001 I knew that I had enough pitchers to make the study work). There were 587 pitchers who made a post-season start in those years. . .not 587 different pitchers, but 587 if you count Bob Welch, 1981, Bob Welch, 1983, Bob Welch, 1985, Bob Welch, 1988, Bob Welch, 1989, Bob Welch, 1990, and Bob Welch, 1992, as seven pitchers.
I assigned each of these pitchers a “power score”, by this formula:
2 Times Strikeouts
+ Walks
+ 2 Times (Strikeouts above the league average)
Per Nine innings.
Randy Johnson in 2001 had 372 strikeouts, 71 walks in 249.2 innings. An average pitcher in that league would have struck out 194 batters in 249.2 innings, so the Unit was +178 strikeouts. That makes his “Power Score”:
((2*372) + 71 + (2*178)) * 9 / 249.667
This makes a Power Score of 42.22, which was the highest of any pitcher in the study. The lowest Power Score of any pitcher in the study was 1.36, by Mike Flanagan in 1982.
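The Power Score arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula as stated; the function and parameter names are my own, and the league-average strikeout figure (194) is the rounded number quoted in the text, so the result may differ from the published figure in the second decimal place.

```python
def innings_to_float(ip_notation: str) -> float:
    """Convert box-score innings ("249.2" means 249 innings plus 2 outs) to a decimal."""
    whole, _, outs = ip_notation.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3.0


def power_score(strikeouts: float, walks: float,
                league_avg_strikeouts: float, innings: float) -> float:
    """2 x strikeouts + walks + 2 x (strikeouts above league average), per nine innings."""
    strikeouts_above_avg = strikeouts - league_avg_strikeouts
    total = 2 * strikeouts + walks + 2 * strikeouts_above_avg
    return total * 9 / innings


# Randy Johnson, 2001: 372 strikeouts, 71 walks, 249.2 innings,
# against a league-average expectation of 194 strikeouts in those innings.
johnson_2001 = power_score(372, 71, 194, innings_to_float("249.2"))
print(round(johnson_2001, 1))
```

With the rounded inputs this comes out at roughly 42.2, in line with the figure given above.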
I then started forming matched sets of pitchers with very different Power Scores, but nearly identical records in other respects. I matched pitchers on eleven criteria:
Year (So that, in general, a pitcher from the 1970s would be more likely to be paired with another pitcher from the 1970s)
Age
Games Started
Wins
Losses
Innings Pitched
Runs Saved compared to league average
Career Innings Pitched
Career Wins
Career Losses
Power Score
The system is set up so that if two pitchers were identical in all of these areas their Similarity Score would be 1000.000. For every difference between them points are subtracted, except that for a difference in the Power Score points are ADDED, rather than subtracted, so that the highest-scoring candidate to be compared to each pitcher is a pitcher with very similar wins, losses, innings pitched, ERA, etc., but a very different Power Score.
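To make the scoring idea concrete, here is a minimal sketch of a matching score of this kind. The weights are entirely hypothetical (the actual point deductions are not given here), and the two pitcher seasons are invented for illustration; only the overall shape, 1000 points minus penalties for differences plus a bonus for a Power Score gap, follows the description above.

```python
from dataclasses import dataclass, replace


@dataclass
class PitcherSeason:
    year: int
    age: int
    games_started: int
    wins: int
    losses: int
    innings: float
    runs_saved: float
    career_innings: float
    career_wins: int
    career_losses: int
    power_score: float


# HYPOTHETICAL points subtracted per unit of difference in each category.
PENALTY_WEIGHTS = {
    "year": 1.0, "age": 2.0, "games_started": 2.0, "wins": 5.0, "losses": 5.0,
    "innings": 0.5, "runs_saved": 3.0, "career_innings": 0.02,
    "career_wins": 0.5, "career_losses": 0.5,
}
# HYPOTHETICAL points added per unit of difference in Power Score.
POWER_BONUS_WEIGHT = 5.0


def match_score(a: PitcherSeason, b: PitcherSeason) -> float:
    """Start at 1000, subtract for differences, add for a Power Score gap."""
    score = 1000.0
    for field_name, weight in PENALTY_WEIGHTS.items():
        score -= weight * abs(getattr(a, field_name) - getattr(b, field_name))
    score += POWER_BONUS_WEIGHT * abs(a.power_score - b.power_score)
    return score


# Two invented seasons, identical except for Power Score.
finesse = PitcherSeason(1980, 28, 32, 15, 10, 210.0, 12.0, 900.0, 50, 45, 8.0)
power = replace(finesse, power_score=30.0)

print(match_score(finesse, finesse))  # identical records score exactly 1000.0
print(match_score(finesse, power))    # same record, different Power Score: above 1000
```

Under a scheme like this, the best partner for a given pitcher is simply the candidate with the highest score, which favors pitchers whose records look alike but whose Power Scores sit far apart.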
The best match in the study was Scott McGregor, 1979, representing the “finesse” camp, and Kerry Wood, 1998, representing the “power” department. Both pitchers were 13-6. McGregor gave up 65 earned runs in 174.2 innings, giving him a 3.35 ERA, while Wood gave up 63 earned runs in 166.2 innings, giving him a 3.40 ERA. The league ERAs were almost the same (4.23 and 4.24), so McGregor was 17 runs better than an average pitcher, Wood 16 runs. Kerry Wood was 21 years old at that time; McGregor was 25. The largest difference between them, other than power, was that McGregor had pitched 536 career innings with a 31-25 career record, whereas Wood was a rookie.
But whereas McGregor had struck out 81 men and walked 23, Wood had struck out 223 and walked 85. Huge difference.
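The ERA and runs-saved figures for this pair can be checked with a short calculation. The runs-saved formula below (the earned runs a league-average pitcher would allow in the same innings, minus the pitcher's actual earned runs) is my assumption about how the number was computed; it does reproduce the 17 and 16 quoted above.

```python
def era(earned_runs: float, innings: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return earned_runs * 9 / innings


def runs_saved(earned_runs: float, innings: float, league_era: float) -> float:
    """ASSUMED formula: league-average earned runs in these innings minus actual."""
    return league_era * innings / 9 - earned_runs


# Innings are written as whole innings plus thirds: 174.2 means 174 and 2/3.
mcgregor_ip = 174 + 2 / 3
wood_ip = 166 + 2 / 3

mcgregor_era = era(65, mcgregor_ip)
wood_era = era(63, wood_ip)
mcgregor_saved = runs_saved(65, mcgregor_ip, 4.23)
wood_saved = runs_saved(63, wood_ip, 4.24)

print(round(mcgregor_era, 2), round(mcgregor_saved))
print(round(wood_era, 2), round(wood_saved))
```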
The second-best match was Randy Johnson, 2001, against Tom Glavine, 1998. Johnson was 21-6, 2.49 ERA; Glavine was 20-6, 2.47 ERA. Glavine was 32 years old with a career record of 173-105; Johnson was 37 years old, with a career record of 200-101.
This was not the only time that Glavine and Johnson were paired in the study. Randy Johnson in 1998 was also paired with Glavine in 2000. Johnson in 1997 (20-4, 2.28 ERA) was paired with Greg Maddux the same year (19-4, 2.20). Randy Johnson in 1999 (17-9, 2.48 ERA) was paired with Maddux in 1998 (18-9, 2.22). Maddux in various other years was paired with Pedro Martinez, Curt Schilling and David Cone. Maddux in 1989 (19-12, 2.95 ERA) is paired with Tim Leary in 1988 (17-11, 2.91). It looks silly in retrospect, but Maddux at the time had a career ERA of 3.77, Leary 3.78. Maddux in 1999 (19-9, 3.57 ERA) is paired with David Cone in 1995 (18-8, also a 3.57 ERA). David Cone in 1998 (20-7, 3.55 ERA) is paired with Jamie Moyer in 2001 (20-6, 3.43), and Cone in 1988 (20-3, 2.22 ERA) is paired with Orel Hershiser in 1985 (19-3, 2.03). At various other times Cone is paired with Ed Figueroa, Tom Glavine and Rick Reuschel.
Back-of-the-rotation guys have their matches as well. Scott Sanders, 1996 (9-5, 3.38 ERA) is paired with his actual teammate the same season, Andy Ashby (9-5, 3.23). Floyd Bannister, 16-10, 3.35 ERA in 1983, is paired with Zane Smith in 1991 (16-10, 3.20). Both pitchers had career records, at that time, of 67-78. Nolan Ryan is paired with Tommy John, Mike Scott with John Tudor, Juan Guzman with Larry Gura. Tim Lollar in 1984 (11-13, 3.91 ERA) is paired with Rick Camp in 1982 (11-13, 3.77). Dave Righetti (8-4 with a 2.05 ERA in 15 starts in 1981) is paired with Tim Wakefield (8-1 with a 2.15 ERA in 13 starts in 1995). Lance Painter in 1995 (3-0 with a 4.37 ERA) is paired with Bob Wolcott the same season (3-2, 4.42). Diego Segui in 1971 (10-8 in 21 starts, 3.14 ERA) is paired with Ray Burris in 1981 (9-7 in 21 starts, 3.05 ERA); their career records at the time were 74-83 and 72-83. Jack Morris, 1991, is paired with Tommy John, 1980; Morris was 18-12 to John’s 22-9, but both pitchers had 3.43 ERAs, and their career records were 214-141 and 216-162. Not everybody has a match, of course, but the pitchers who don’t match anybody are left out of the study.
The lowest-scoring “match” that qualified for the study was Mike Cuellar, 1971, with Don Sutton, 1974. Cuellar, 34 years old, had made 38 starts, pitched 292 innings with a record of 20-9. Sutton, 29 years old, had made 40 starts, 276 innings with a record of 19-9. Cuellar’s ERA was 3.08 against a league norm of 3.47, so he had saved 13 runs vs. an average pitcher. Sutton’s ERA was 3.23 against a league norm of 3.63, so he had saved 12 runs vs. an average pitcher. Cuellar had a career record at that time of 109-69; Sutton had a career record of 139-113. Even their walks were similar—78 for Cuellar, 80 for Sutton—but Cuellar, with 124 strikeouts, was 52 strikeouts below the American League norm in 1971, whereas Sutton, with 179 strikeouts, was 22 strikeouts above the National League norm in 1974.
In individual cases there were differences between the pitchers. In the aggregate, there were almost no indications of a difference in quality. The finesse pitchers were a little bit older, averaging 29.1 vs. 28.4 for the power pitchers. There were 100 pitchers in each group. The power pitchers had an aggregate won-lost record of 1519-831; the finesse pitchers, 1514-838. The power pitchers had pitched an average of 210.0 innings; the finesse pitchers, 211.1. Both groups had given up an average of 85.5 runs, with 77.1 or 76.8 of those earned, leading to ERAs of 3.33 (for the power group) and 3.29 (for the finesse group). Both groups had relative ERAs (ERA divided by league ERA) of .813. The power pitchers had made an average of 199 career starts, with career records averaging 79-58 and career ERAs of 3.50. The finesse pitchers had made an average of 223 career starts, with records averaging 84-62 and career ERAs of 3.54.
Our essential goal was that there would be no observable difference in the quality of the pitchers in the two groups. But the power pitchers had averaged 183 strikeouts, 76 walks; the finesse pitchers had averaged 107 strikeouts, 57 walks. The two groups were nearly even in terms of home runs allowed (a few more for the power pitchers), but the finesse pitchers had given up, on average, 18 more hits. 18 more hits, 19 fewer walks, one fewer homer. . .the same results overall.
Having formed these two nearly-identical groups of 100 pitchers (all of whom had made at least one start in post-season play), I then looked up their records in post-season play. To cut to the chase without ceremony, the Power Pitchers did in fact perform better in post-season play than did their matched Finesse Pitchers. The Power Pitchers made 222 starts in post-season play, with a won-lost record of 85-67 and an ERA of 3.35. The Finesse Pitchers made 214 starts in post-season play, with a won-lost record of 73-87 and an ERA of 3.59.
This difference is NOT statistically significant. The winning percentage of the 200 pitchers, in post-season play, was .50641. The chance that a group of pitchers with that winning percentage would go 85-67 or better is 6%, and the chance that a group of pitchers with that winning percentage would go 73-87 or worse is 9%. The difference in ERA is more difficult to test for significance, but it appears to be obviously smaller than the difference in Won-Lost records, and thus unlikely to meet a higher standard of statistical significance.
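For readers who want to run this kind of check themselves, here is a minimal sketch using an exact binomial tail, treating each decision as an independent trial at the pooled winning percentage. That model and the function names are my assumptions; the method behind the percentages quoted above is not specified, so the output of this sketch need not match them exactly.

```python
from math import comb


def upper_tail(wins: int, decisions: int, p: float) -> float:
    """Probability of at least `wins` wins in `decisions` independent trials."""
    return sum(comb(decisions, k) * p**k * (1 - p) ** (decisions - k)
               for k in range(wins, decisions + 1))


def lower_tail(wins: int, decisions: int, p: float) -> float:
    """Probability of at most `wins` wins in `decisions` independent trials."""
    return sum(comb(decisions, k) * p**k * (1 - p) ** (decisions - k)
               for k in range(0, wins + 1))


# Pooled post-season record of both groups: 85-67 plus 73-87.
pooled_pct = (85 + 73) / (85 + 67 + 73 + 87)
print(round(pooled_pct, 5))

# Chance of 85-67 or better, and of 73-87 or worse, at the pooled percentage.
p_power = upper_tail(85, 85 + 67, pooled_pct)
p_finesse = lower_tail(73, 73 + 87, pooled_pct)
print(round(p_power, 3), round(p_finesse, 3))
```

Either way the calculation is set up, neither tail falls below the conventional 5% threshold, which is consistent with calling the difference not statistically significant.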
In post-season play the Power Pitchers in our study faced 6,043 batters, pitching 1424.2 innings—essentially one team/season’s worth of pitching. The Finesse pitchers faced 5,684 batters, pitching 1343 innings. The power pitchers limited opponents to a post-season batting average of .227, whereas the finesse pitchers allowed a post-season batting average of .263—larger than the batting average difference between the groups in regular season. This difference was only partially offset by walks. The power pitchers walked 3.35 per nine innings in post-season play; the finesse pitchers, 2.69.
My conclusion is that it appears probably true that Power Pitchers are more effective in post-season play, with an advantage in the neighborhood of 0.25 ERA, for pitchers of the same quality. However, the difference is small enough that it could be a random effect.
One other note from the study: Six of the power pitchers won the Cy Young Award, while only two of the matched finesse pitchers were given the award.