On Walker and Walks

May 23, 2011

            How’s this for a Hot Streak:  In a stretch of 19 games in 1947, Walker Cooper hit 14 homers and drove in 38 runs.    Anybody top that?    You got 39 RBI in 19 games anywhere?   40 RBI in 20 games?   

            I got interested in Walker Cooper after noticing something surprising about his career.   This is Cooper’s career record:

 

 

            His 1947 season with the New York Giants sticks out there like a sore thumb—or, if not a sore thumb, at least an oversized digit of some kind; 35 homers there, half as many anywhere else, 122 RBI there, no similar numbers anywhere else.

            A friend of mine is named "Walker" after Walker Cooper.   While Cooper was a very famous player with the Cardinals in the early 1940s, he was famous mostly as Mort’s brother.   He was also Don Blasingame’s father-in-law, but he’s not famous for that.  He didn’t play regularly until he was 27 years old, which is unusual; he lost a year to the war (1945), and he was a part-time player almost all of his career.    In spite of this he lasted until he was old enough for his daughter to be dating his teammates, and was a highly effective pinch hitter at an advanced age.

            The 35-homer, 122-RBI season is the pearl of his career, and there is a back story there that was familiar to fans of my generation, although it would mean little to people under 50.   The 1947 Giants hit 221 homers as a team.   The 221 homers were:

            a)  a major league record at the time, and

            b)  almost twice as many as 14 of the other 15 teams.   The Pirates were the only other major league team to hit more than 115.  

            The previous major league record for homers by a team was 182, by the 1936 Yankees.  The Giants obliterated that record, but finished fourth.    It was a famous event at the time, frequently cited as evidence that home runs are overrated.    Cooper was one of four players who hit most of those homers: Johnny Mize (51 homers, 138 RBI), Willard Marshall (36 homers, 107 RBI), Cooper, and Bobby Thomson (29 homers, 85 RBI).    Marshall’s year was more of a fluke than Cooper’s; other than that season he never hit more than 17 homers.   Thomson, although he later had good years, had only 2 homers before that season (he was basically a rookie), and even Mize, although he is in the Hall of Fame, had only one season prior to 1947 with more than 28 homers.

            Of course, later teams have pushed home run totals way beyond 221.  So anyway, you tend to write that season off as a fluke, and, in particular, you tend to write it off as a park illusion.    I was, then, quite surprised when I took a look at Cooper’s career home/road batting record:

 

 

          G    AB    R     H   2B  3B   HR  RBI   AVG   OBP   SLG
Total  1473  4699  573  1341  240  40  173  813  .285  .332  .464
Home    756  2341  272   652  122  20   71  388  .279  .325  .439
Away    717  2358  301   689  118  20  102  425  .292  .339  .489

 

            His career OPS was 64 points higher on the road than in his home parks—one of the larger home field liabilities I have ever seen.   Wow, I thought; that must mean that his career numbers were held down by other parks, and 1947 was the one year that he had the park working for him.  
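For anyone who wants to check that arithmetic, here is a minimal Python sketch. It assumes the usual definitions (slugging is total bases over at-bats, OPS is on-base plus slugging) and uses only the numbers in the table above; the function names are mine, for illustration.

```python
def slugging(ab, h, doubles, triples, hr):
    """Slugging percentage: total bases divided by at-bats."""
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab

def ops(obp, slg):
    """OPS is simply on-base percentage plus slugging percentage."""
    return obp + slg

# Career home and road lines, copied from the table above.
home_slg = round(slugging(ab=2341, h=652, doubles=122, triples=20, hr=71), 3)
away_slg = round(slugging(ab=2358, h=689, doubles=118, triples=20, hr=102), 3)

# OBP is taken as printed, since the table does not list walks.
home_ops = ops(0.325, home_slg)
away_ops = ops(0.339, away_slg)

gap_points = round((away_ops - home_ops) * 1000)
print(home_slg, away_slg, gap_points)
```

The slugging figures come back as .439 and .489, matching the table, and the road-minus-home gap is 64 points of OPS.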

            Wrong again:

 

 

         G   AB   R    H  2B  3B  HR  RBI   AVG   OBP   SLG
Total  140  515  79  157  24   8  35  122  .305  .341  .586
Home    68  239  32   62   6   5  12   44  .259  .300  .477
Away    72  276  47   95  18   3  23   78  .344  .376  .681

 

            His OPS in 1947, the fluke year, was 280 points higher on the road than it was in his home park (1.057 against .777).   Three-fourths of his doubles and almost two-thirds of his home runs and RBI were on the road. 

            Which, by the way, was not true for the other guys on the team; Willard Marshall hit 25 of his 36 homers in the Polo Grounds, and Mize hit 29 of 51 there.    So then I started looking at Cooper’s daily log.   I knew he hit homers in six straight games that year—I mentioned that in the Historical Abstract—but I really had no idea of the magnitude of his hot streak.   This is Cooper’s Game Log from June 9 to July 3, 1947:

 

 

 

 

M   D  #  Home/Road  Versus  AB  R  H  RBI  Notes
6   9     VS         PIT      4  2  2    5  2 Home Runs
6  10     VS         PIT      5  0  1    0
6  11     VS         PIT      4  0  0    0
6  14     AT         CIN      4  1  2    3  Homer
6  15  1  AT         CIN      4  2  2    0
6  18     AT         PIT      6  1  2    3  Homer
6  19     AT         PIT      3  1  1    0
6  20     AT         STL      4  1  1    2  Homer
6  21     AT         STL      4  1  1    2
6  22     AT         STL      5  2  3    3  2 Homers
6  23     AT         CHI      5  1  1    2  Homer
6  24     AT         CHI      5  2  2    2  Homer
6  25     AT         CHI      4  1  3    4  Homer
6  27     VS         PHI      4  1  1    3  Homer
6  28     VS         PHI      4  3  2    2  Homer
6  29  1  VS         BRO      3  1  0    0
7   1     VS         BOS      4  2  2    2  Homer
7   2     AT         BRO      4  0  1    0
7   3     AT         BRO      5  2  3    5  Homer

 

            Cooper hit "just" .370 in the 19-game stretch, but with 14 homers and 38 RBI.   Interesting year.
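If you want to go looking for a 39-RBI stretch somewhere else, the search is a simple sliding window over a player's game-by-game totals. The sketch below is only that, a sketch: the lists are the 19 games from the log above (I am reading each "Homer" note as one home run), and `best_stretch` is a name I made up. To search a full season you would pass in that season's per-game RBI instead.

```python
# Per-game lines from the June 9 - July 3, 1947 log above, in order.
AB  = [4, 5, 4, 4, 4, 6, 3, 4, 4, 5, 5, 5, 4, 4, 4, 3, 4, 4, 5]
H   = [2, 1, 0, 2, 2, 2, 1, 1, 1, 3, 1, 2, 3, 1, 2, 0, 2, 1, 3]
HR  = [2, 0, 0, 1, 0, 1, 0, 1, 0, 2, 1, 1, 1, 1, 1, 0, 1, 0, 1]
RBI = [5, 0, 0, 3, 0, 3, 0, 2, 2, 3, 2, 2, 4, 3, 2, 0, 2, 0, 5]

def best_stretch(per_game, length):
    """Return (start_index, total) for the best run of `length` straight games."""
    if length > len(per_game):
        raise ValueError("not enough games for a stretch that long")
    total = sum(per_game[:length])
    best_start, best_total = 0, total
    for start in range(1, len(per_game) - length + 1):
        # Slide the window: add the game entering, drop the game leaving.
        total += per_game[start + length - 1] - per_game[start - 1]
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total

print(len(AB), sum(HR), sum(RBI))      # 19 14 38
print(f"{sum(H) / sum(AB):.3f}")       # 0.370
print(best_stretch(RBI, 19))           # (0, 38)
```

Ties go to the earliest stretch, because the comparison is a strict "greater than."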

 

 

K Minus Walks

 

            Some of you may be aware that I’ve had a disagreement with a reader on the site about the use of Strikeouts Minus Walks as an indicator of the quality of the Strikeout and Walk combination, and I won’t mention the reader’s name because it is better not to personalize disagreements, but. . .it’s a highly esteemed reader that I don’t ordinarily argue with.

            The reader, however, argues that the best way to look at strikeouts and walks by a pitcher is to look at strikeouts minus walks, to look at the margin between the two.   That can’t be right, I argue, because to do that implies that one strikeout is roughly equal in value to one walk, which it isn’t; a strikeout is nowhere near equal in value to a walk. 

            OK, how do we study this?   Here’s what I did.   I took all pitcher/seasons from 1946 through 2009 (I haven’t updated that database yet to include 2010).   For all pitchers in that era I figured their Strikeouts Minus Walks per 9 innings.   Then I focused on narrow bands of Strikeouts Minus Walks:

 

            All pitchers with 0.88 to 1.12 Strikeouts Minus Walks Per 9 Innings (Representing 1.00)

            All pitchers with 1.88 to 2.12 Strikeouts Minus Walks Per 9 innings (Representing 2.00)

            All pitchers with 2.88 to 3.12 Strikeouts Minus Walks Per 9 innings (Representing 3.00)

            All pitchers with 3.85 to 4.15 Strikeouts Minus Walks Per 9 innings (Representing 4.00)

            All pitchers with 4.80 to 5.20 Strikeouts Minus Walks Per 9 innings (Representing 5.00)

            All pitchers with 5.75 to 6.25 Strikeouts Minus Walks Per 9 innings (Representing 6.00)
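The banding can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the code behind the study: the function names are invented, and innings are read in the usual baseball notation, where ".1" and ".2" mean one-third and two-thirds of an inning.

```python
# Band limits exactly as listed above; note that they widen as the target grows.
BANDS = {
    1.0: (0.88, 1.12),
    2.0: (1.88, 2.12),
    3.0: (2.88, 3.12),
    4.0: (3.85, 4.15),
    5.0: (4.80, 5.20),
    6.0: (5.75, 6.25),
}

def innings(ip_text):
    """Convert baseball innings notation ('232.2' = 232 and 2/3) to a float."""
    whole, _, thirds = ip_text.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3

def k_minus_bb_per_9(so, bb, ip_text):
    """Strikeouts minus walks, per nine innings."""
    return 9 * (so - bb) / innings(ip_text)

def band_for(so, bb, ip_text):
    """Return the band label a pitcher/season falls in, or None if in no band."""
    rate = k_minus_bb_per_9(so, bb, ip_text)
    for label, (low, high) in BANDS.items():
        if low <= rate <= high:
            return label
    return None

# Hypothetical pitcher/season: 200 strikeouts, 60 walks, 210 innings.
print(round(k_minus_bb_per_9(200, 60, "210.0"), 2), band_for(200, 60, "210.0"))
```

That made-up season works out to exactly 6.00 and lands in the 6.00 band. A pitcher at, say, 2.50 falls between bands and is left out of the study, which is the point of keeping the bands narrow.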

           

            There were 917 pitchers in Group 1.00,

1,154 in Group 2.00,  

1,008 in Group 3.00,

790 in Group 4.00,

613 in Group 5.00, and

391 in Group 6.00.

 

It is certainly true that the pitchers in Group 2.00—the pitchers who have 2.00 more strikeouts than walks per nine innings—are better than those in Group 1.00, Group 3.00 is better than Group 2.00, etc.   This is the data on that:

 

 

 

 

 

 

 

 

 

 

Group      G     W     L  WPct        IP     SO     BB   ERA  Cy Young Seasons   KP9   BB9
1.00   26684  4801  5202  .480   90061.1  45211  35236  4.16                 1  4.52  3.52
2.00   35776  6751  6877  .495  120986.0  69832  42933  4.03                 2  5.19  3.19
3.00   32162  6047  5688  .515  104322.1  69437  34672  3.86                 4  5.99  2.99
4.00   26477  4912  4157  .542   82050.0  62835  26459  3.65                 7  6.89  2.90
5.00   23134  3516  2878  .550   57539.0  50109  18258  3.43                 6  7.84  2.86
6.00   14425  2092  1523  .579   32446.2  31228   9688  3.18                15  8.66  2.69

 

Let me strip out some of the data there to emphasize the rest:

 

 

 

 

 

 

 

 

 

 

 

 

Group  WPct   ERA   KP9   BB9
1.00   .480  4.16  4.52  3.52
2.00   .495  4.03  5.19  3.19
3.00   .515  3.86  5.99  2.99
4.00   .542  3.65  6.89  2.90
5.00   .550  3.43  7.84  2.86
6.00   .579  3.18  8.66  2.69

 

As the margin between strikeouts and walks increases, the performance improves—the Winning Percentage improves, the ERA improves, and the number of Cy Young seasons increases.   This is not a surprise. 

That’s one question, but there is another.   Is 7 strikeouts, 3 walks really the same as 5 strikeouts and 1 walk?    Are these really legitimate groups?

The group "6.00" includes Nolan Ryan, 1973 (383 strikeouts, 162 walks) and Sam McDowell, 1968 (283 strikeouts, 110 walks), but also includes Greg Maddux, 1997 (177 strikeouts, 20 walks) and Roy Halladay, 2003 (204 strikeouts, 32 walks.)   The group "5.00" includes Herb Score, 1956 (263-129), Kerry Wood, 2002 (217-97), and Sam McDowell, 1970 (304-131), but also includes Ferguson Jenkins, 1974 (225-45), Rick Reed, 2001 (142-31), and Jimmy Key, 1993 (173-43).  Should these pitchers really be grouped together?

Each of these groups can be easily re-sorted into high-strikeout, high-walk and low-strikeout, low-walk sub groups, by sorting them by ((SO + BB)/IP) rather than ((SO – BB)/IP).   If these are valid groups—if one strikeout has roughly equal impact to one walk—then the top half of the 6.00 group should be equal in effectiveness to the bottom half of the 6.00 group.   I argue that this cannot be true, because one strikeout is NOT equal in impact to one walk; my friend argues that it is.
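Here is a rough Python sketch of that re-sort. The four pitcher/seasons are made up for illustration (each works out to about 6.00 strikeouts minus walks per nine innings), the names `Season`, `split_by_total_rate` and `pooled_wpct` are mine, and I am assuming that the top half takes the extra season when the group has an odd number of pitchers.

```python
from typing import NamedTuple

class Season(NamedTuple):
    name: str
    w: int
    l: int
    ip: float
    so: int
    bb: int

# Made-up seasons, for illustration only; none of these are real pitchers.
SEASONS = [
    Season("A", w=15, l=12, ip=250.0, so=280, bb=113),
    Season("B", w=18, l=8,  ip=240.0, so=190, bb=30),
    Season("C", w=12, l=13, ip=210.0, so=230, bb=90),
    Season("D", w=20, l=9,  ip=255.0, so=200, bb=30),
]

def split_by_total_rate(seasons):
    """Sort by (SO + BB) / IP, highest first, and cut the list in half."""
    ordered = sorted(seasons, key=lambda s: (s.so + s.bb) / s.ip, reverse=True)
    cut = (len(ordered) + 1) // 2   # top half takes the extra season when odd
    return ordered[:cut], ordered[cut:]

def pooled_wpct(seasons):
    """Winning percentage of the combined won-lost record."""
    wins = sum(s.w for s in seasons)
    losses = sum(s.l for s in seasons)
    return wins / (wins + losses)

top, bottom = split_by_total_rate(SEASONS)
print([s.name for s in top], round(pooled_wpct(top), 3))
print([s.name for s in bottom], round(pooled_wpct(bottom), 3))
```

If one strikeout really offset one walk, the two pooled winning percentages should come out about the same for a real group of seasons.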

Not so much.     This is the "6.00" group, sorted into the top half (Ryan and McDowell) and the bottom half (Maddux and Halladay):

Group           G     W     L  WPct       IP     SO    BB   ERA   KP9  BBP9
Group 6.00  14425  2092  1523  .579  32446.2  31228  9688  3.18  8.66  2.69
Top 196      7584   787   655  .546  12561.0  13443  5092  3.38  9.63  3.65
Bottom 195   6841  1305   868  .601  19885.2  17785  4596  3.06  8.05  2.08

 

The winning percentage of the Maddux/Halladay pitchers is .601.   The winning percentage of the Ryan/McDowell pitchers is .546.    This is the split for group 1.00:

 

Group           G     W     L  WPct       IP     SO     BB   ERA   KP9  BBP9
Group 1.00  26684  4801  5202  .480  90061.0  45211  35236  4.16  4.52  3.52
Top 459     12862  1806  2210  .450  36072.0  21813  17779  4.51  5.44  4.44
Bottom 458  13822  2995  2992  .500  53989.0  23398  17457  3.93  3.90  2.91

 

Again, there is a 50-point difference in the winning percentage between the top half of the group (the high strikeout/high walk pitchers) and the bottom half.   The difference between Group 1.00 and Group 2.00, in Winning Percentage, is only 15 points.   The difference between Group 1.00 and Group 3.00 is only 35 points of winning percentage.   The difference between the high strikeout/high walk pitchers and the low strikeout/low walk pitchers in these two groups is 50 points.  

This chart splits all six groups into their high walk/high strikeout and low walk/low strikeout components:


 

OK, the winning percentage difference is more like 40 points than 50, but still. . .it takes almost two strikeouts to offset one walk, not one to one.  

Another reader, trying to be helpful, suggests that the accuracy of "Strikeouts Minus Walks" supports the McCracken thesis, but actually, that’s the problem:  It clearly contradicts the McCracken thesis.   One strikeout only eliminates about .30 hits.  Three-tenths of a hit is obviously worth nowhere near as much as one walk—therefore, for strikeouts minus walks to be a valid way to look at the relationship, the strikeout pitcher would have to be gaining a large additional benefit from his strikeout tendency, beyond the direct benefit of having one less ball in play.   The McCracken thesis is that there is no such secondary benefit. 

            My experience is that posting research like this rarely ends a debate.   Other people will still see the data in other ways, and obviously they have a right to do so.   I don’t see that Strikeouts Minus Walks is an appropriate way to look at the Strikeout and Walk data.

 
 

COMMENTS (13 Comments, most recent shown first)

lvrotsos
It's a bit misleading how you called them the top and bottom halves of each K-BB group, seeing as you showed the bottom halves were a bit better.
5:37 PM May 24th
 
jwilt
The thing about Cooper is that he was a catcher, exclusively. Never played a defensive inning anywhere else. He actually played quite a bit for a catcher of his era, leading or nearly leading the 1940-55 era in games caught. 1947 was the one year he was healthy enough or was allowed to catch 130 games. By rate stats '47 was probably his best year (although you could argue for '55), but it doesn't stick out like a sore thumb.
1:32 PM May 24th
 
tangotiger
My response: http://www.insidethebook.com/ee/index.php/site/article/k_minus_bb_differential_or_ratio/
11:49 AM May 24th
 
slideric
Neat job. A great way to discuss, with data. Needs a response and you will get one. thanks
10:17 AM May 24th
 
tangotiger
sptaylor: What's your point? I've been using Tangotiger ever since I've been online for over a decade. Whether it's a name, or handle, or whatever, it's still a unique identity. There's a Tangotiger in Denmark I think, and me. In the whole internet world! I can turn into an a$$hole, and that will affect me personally, because I can't create a new identity called Tomahawk, and be able to talk about Win Expectancy and people not ask "aren't you that a$$hole tangotiger?". Just because you might think a handle is disposable, I don't.

In any case, Bill I presume meant to show he disagreed with my idea, and not me personally, and hence not to make it personal. And I said, he can use Tangotiger, and I still wouldn't make it personal anyway. He erred on the side of caution.

***

Charlie: right, it's more in-line with the linear weights model as the explanation.
4:57 PM May 23rd
 
CharlesSaeger
How many runs is a strikeout worth? Let's see ... turning an out into 0.30 hits. We'll set a hit at 0.56 runs, accommodating for extra bases; we'll make that 0.29 hits per ball in play; we'll ignore home runs, which might not be a good thing for this but what the hell. That's 0.16 runs, half a walk.

TomTom, would you need to add the value of an out to this derivation, making this a quarter run, approaching the value of a walk?
4:43 PM May 23rd
 
sptaylor70
Tom - Of course you don't mind if Bill uses your name. It's not really your name, but rather just a "handle."
3:03 PM May 23rd
 
sokho
Bill, Tom: It's a pleasure and a privilege to listen in on a dialogue between two such eminent analysts. Beggars can't be choosers, but one small quibble from the peanut gallery: there's a bit too much civility here. Can you guys do us the favor of tossing out the odd curse word here and there, maybe work in a comment about each others' mothers? Then, there'd be NO reason for me to keep reading other sabermetrics discussion boards...
2:24 PM May 23rd
 
Steven Goldleaf
Another, perhaps obvious, way to express the total oddity of Cooper's 1947 stats is that his 23 road HRs far eclipses his TOTAL season HRs in any other year (his high is 18), and his road RBIs almost does the same (one season he had three more total RBI than he had on the road in '47).
2:00 PM May 23rd
 
Robinsong
Thanks, Bill. I was the other reader and you corrected my thinking. Thank you for studying the issue and for explaining it well.

Why was Walker Cooper a part-time player? Was he platooned? Easily injured? Or was he Don Slaught?
1:52 PM May 23rd
 
tangotiger
Two additional points to mention, each also carried the issue of bias:
1. I had said it should be K minus BB per PA, not per IP. It's not that big a deal, but it still has some bias to it.
2. And of course, HR exploded at the same time BABIP jumped (between 1992-1994). And so, if you have more guys in the high K minus BB per PA group, you'd also have higher HR rates as well.

If Bill were to update his charts above to also show BABIP and HR, then we'd see that there is more to the groups than that they have differing K and BB rates.
12:40 PM May 23rd
 
tangotiger
I'm the unnamed person, and I don't mind being named. I wouldn't take this issue personally. I will say one important thing: because K rates have been on a steady rise, and there's been a shift in batting average on balls in play between 1992 and 1994 (.280 pre-1992, .300 post-1994), that the groupings that Bill has done may have an era bias. This is why I find it helpful to look at the data from 1993-present, because there's been a pretty strong line drawn there, in terms of looking at unadjusted data.

That said, I appreciate what Bill did, and it's exactly the way the study should have been set up. It's just a question of whether there's an era bias.
12:19 PM May 23rd
 
Brian
In the 1st 19 games of June, 1998, Sammy Sosa had 17 HRs and 35 RBIs. Completely unrelated thought: was Cooper ever tested for PEDs?
12:14 PM May 23rd
 
 