On Walker and Walks

By Bill James

May 23, 2011

How’s this for a Hot Streak: In a stretch of 19 games in 1947, Walker Cooper hit 14 homers and drove in 38 runs. Anybody top that? You got 39 RBI in 19 games anywhere? 40 RBI in 20 games?

I got interested in Walker Cooper after noticing something surprising about his career. This is Cooper’s career record:

His 1947 season with the New York Giants sticks out there like a sore thumb—or, if not a sore thumb, at least an oversized digit of some kind; 35 homers there, half as many anywhere else, 122 RBI there, no similar numbers anywhere else.

A friend of mine is named "Walker" after Walker Cooper. While Cooper was a very famous player with the Cardinals in the early 1940s, he was famous mostly as Mort’s brother. He was also Don Blasingame’s father-in-law, but he’s not famous for that. He didn’t play regularly until he was 27 years old, which is unusual; he lost a year to the war (1945), and he was a part-time player almost all of his career. In spite of this he lasted until he was old enough for his daughter to be dating his teammates, and was a highly effective pinch hitter at an advanced age.

The 35-homer, 122-RBI season is the pearl of his career, and there is a back story there that was familiar to fans of my generation, although it would mean little to people under 50. The 1947 Giants hit 221 homers as a team. The 221 homers were:

a) a major league record at the time, and

b) almost twice as many as 14 of the other 15 teams. The Pirates were the only other major league team to hit more than 115.

The previous major league record for homers by a team was 182, by the 1936 Yankees. The Giants obliterated that record but finished fourth. It was a famous event at the time, frequently cited as evidence that home runs are overrated. Cooper was one of four players who hit most of those homers: Johnny Mize (51 homers, 138 RBI), Willard Marshall (36 homers, 107 RBI), Cooper, and Bobby Thomson (29 homers, 85 RBI). Marshall’s year was more of a fluke than Cooper’s; other than that season he never hit more than 17 homers. Thomson, although he later had good years, had only 2 homers before that season (he was basically a rookie), and even Mize, although he is in the Hall of Fame, had only one season prior to 1947 with more than 28 homers.

Of course, later teams have pushed home run totals way beyond 221. So anyway, you tend to write that season off as a fluke, and, in particular, you tend to write it off as a park illusion. I was, then, quite surprised when I took a look at Cooper’s career home/road batting record:

	G	AB	R	H	2B	3B	HR	RBI	AVG	OBP	SLG
Total	1473	4699	573	1341	240	40	173	813	.285	.332	.464

Home	756	2341	272	652	122	20	71	388	.279	.325	.439
Away	717	2358	301	689	118	20	102	425	.292	.339	.489
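
The home/road gap in this table can be checked by adding OBP and SLG for each row. A minimal Python sketch; the function and variable names are illustrative and not from the article, and the inputs are just the rounded figures printed above:

```python
# (OBP, SLG) pairs copied from the career home/road table above.
CAREER_SPLITS = {"home": (0.325, 0.439), "away": (0.339, 0.489)}


def ops(obp, slg):
    """OPS is simply on-base percentage plus slugging percentage."""
    return obp + slg


def road_minus_home_points(splits):
    """Return the road-minus-home OPS gap in 'points' (thousandths)."""
    gap = ops(*splits["away"]) - ops(*splits["home"])
    return round(gap * 1000)


print(road_minus_home_points(CAREER_SPLITS))
```

Because the inputs are already rounded to three places, the result can differ by a point or so from one computed on the raw totals.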

His career OPS was 64 points higher on the road than in his home parks—one of the larger home field liabilities I have ever seen. Wow, I thought; that must mean that his career numbers were held down by other parks, and 1947 was the one year that he had the park working for him.

Wrong again:

	G	AB	R	H	2B	3B	HR	RBI	AVG	OBP	SLG
Total	140	515	79	157	24	8	35	122	.305	.341	.586

Home	68	239	32	62	6	5	12	44	.259	.300	.477
Away	72	276	47	95	18	3	23	78	.344	.376	.681

His OPS in 1947, the fluke year, was 280 points higher on the road than it was in his home park (1.057 against .777). Three-fourths of his doubles and almost two-thirds of his home runs and RBI were on the road.

Which, by the way, was not true for the other guys on the team; Willard Marshall hit 25 of his 36 homers in the Polo Grounds, and Mize hit 29 of 51 there. So then I started looking at Cooper’s daily log. I knew he hit homers in six straight games that year—I mentioned that in the Historical Abstract—but I really had no idea of the magnitude of his hot streak. This is Cooper’s Game Log from June 9 to July 3, 1947:

			Home/
M	D	#	Road	Versus	AB	R	H	RBI	Notes
6	9		VS	PIT	4	2	2	5	2 Home Runs
6	10		VS	PIT	5	0	1	0
6	11		VS	PIT	4	0	0	0
6	14		AT	CIN	4	1	2	3	Homer
6	15	1	AT	CIN	4	2	2	0
6	18		AT	PIT	6	1	2	3	Homer
6	19		AT	PIT	3	1	1	0
6	20		AT	STL	4	1	1	2	Homer
6	21		AT	STL	4	1	1	2
6	22		AT	STL	5	2	3	3	2 Homers
6	23		AT	CHI	5	1	1	2	Homer
6	24		AT	CHI	5	2	2	2	Homer
6	25		AT	CHI	4	1	3	4	Homer
6	27		VS	PHI	4	1	1	3	Homer
6	28		VS	PHI	4	3	2	2	Homer
6	29	1	VS	BRO	3	1	0	0
7	1		VS	BOS	4	2	2	2	Homer
7	2		AT	BRO	4	0	1	0
7	3		AT	BRO	5	2	3	5	Homer

Cooper hit "just" .370 in the 19-game stretch, but with 14 homers and 38 RBI. Interesting year.
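
The streak totals can be re-added from the game log. A minimal Python sketch, assuming the rows are transcribed as (month, day, AB, R, H, RBI, HR) tuples with the home run count read off the Notes column; the function names are illustrative:

```python
# Rows transcribed from the game log above:
# (month, day, AB, R, H, RBI, HR); HR comes from the Notes column.
GAME_LOG = [
    (6, 9, 4, 2, 2, 5, 2), (6, 10, 5, 0, 1, 0, 0), (6, 11, 4, 0, 0, 0, 0),
    (6, 14, 4, 1, 2, 3, 1), (6, 15, 4, 2, 2, 0, 0), (6, 18, 6, 1, 2, 3, 1),
    (6, 19, 3, 1, 1, 0, 0), (6, 20, 4, 1, 1, 2, 1), (6, 21, 4, 1, 1, 2, 0),
    (6, 22, 5, 2, 3, 3, 2), (6, 23, 5, 1, 1, 2, 1), (6, 24, 5, 2, 2, 2, 1),
    (6, 25, 4, 1, 3, 4, 1), (6, 27, 4, 1, 1, 3, 1), (6, 28, 4, 3, 2, 2, 1),
    (6, 29, 3, 1, 0, 0, 0), (7, 1, 4, 2, 2, 2, 1), (7, 2, 4, 0, 1, 0, 0),
    (7, 3, 5, 2, 3, 5, 1),
]


def streak_totals(log):
    """Sum the counting stats and compute the batting average."""
    at_bats = sum(row[2] for row in log)
    hits = sum(row[4] for row in log)
    return {
        "games": len(log),
        "ab": at_bats,
        "h": hits,
        "rbi": sum(row[5] for row in log),
        "hr": sum(row[6] for row in log),
        "avg": hits / at_bats,
    }


def longest_hr_streak(log):
    """Longest run of consecutive games with at least one home run."""
    best = current = 0
    for row in log:
        current = current + 1 if row[6] > 0 else 0
        best = max(best, current)
    return best


totals = streak_totals(GAME_LOG)
print(totals, longest_hr_streak(GAME_LOG))
```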

K Minus Walks

Some of you may be aware that I’ve had a disagreement with a reader on the site about the use of Strikeouts Minus Walks as an indicator of the quality of the Strikeout and Walk combination, and I won’t mention the reader’s name because it is better not to personalize disagreements, but. . .it’s a highly esteemed reader that I don’t ordinarily argue with.

The reader, however, argues that the best way to look at strikeouts and walks by a pitcher is to look at strikeouts minus walks, to look at the margin between the two. That can’t be right, I argue, because to do that implies that one strikeout is roughly equal in value to one walk, which it isn’t; a strikeout is nowhere near equal in value to a walk.

OK, how do we study this? Here’s what I did. I took all pitcher/seasons from 1946 through 2009 (I haven’t updated that database yet to include 2010). For all pitchers in that era I figured their Strikeouts Minus Walks per 9 innings. Then I focused on narrow bands of Strikeouts Minus Walks:

All pitchers with 0.88 to 1.12 Strikeouts Minus Walks Per 9 Innings (Representing 1.00)

All pitchers with 1.88 to 2.12 Strikeouts Minus Walks Per 9 innings (Representing 2.00)

All pitchers with 2.88 to 3.12 Strikeouts Minus Walks Per 9 innings (Representing 3.00)

All pitchers with 3.85 to 4.15 Strikeouts Minus Walks Per 9 innings (Representing 4.00)

All pitchers with 4.80 to 5.20 Strikeouts Minus Walks Per 9 innings (Representing 5.00)

All pitchers with 5.75 to 6.25 Strikeouts Minus Walks Per 9 innings (Representing 6.00)
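
The banding step can be sketched in Python. This is only an illustration of the grouping described above, not the code actually used for the study: the function names are made up, treating the band edges as inclusive is an assumption, and so is reading innings such as "90061.1" as 90061 and one-third.

```python
# Band edges copied from the list above: (label, low, high).
# Treating both edges as inclusive is an assumption.
BANDS = [
    (1.00, 0.88, 1.12),
    (2.00, 1.88, 2.12),
    (3.00, 2.88, 3.12),
    (4.00, 3.85, 4.15),
    (5.00, 4.80, 5.20),
    (6.00, 5.75, 6.25),
]


def innings_to_float(ip_text):
    """Convert innings notation ('232.2' = 232 and 2/3) to a float."""
    whole, _, thirds = ip_text.partition(".")
    return int(whole) + (int(thirds) / 3 if thirds else 0.0)


def k_minus_bb_per_9(so, bb, innings):
    """Strikeouts minus walks, scaled to nine innings."""
    return 9.0 * (so - bb) / innings


def assign_band(so, bb, innings):
    """Return the band label for a pitcher season, or None if it falls between bands."""
    margin = k_minus_bb_per_9(so, bb, innings)
    for label, low, high in BANDS:
        if low <= margin <= high:
            return label
    return None


# Hypothetical season: 150 strikeouts, 60 walks, 200 innings -> 4.05 per nine.
print(assign_band(150, 60, innings_to_float("200.0")))
```

Most pitcher seasons fall between the bands and are simply left out, which is why the groups are small relative to the full 1946-2009 sample.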

There were 917 pitchers in Group 1.00,

1,154 in Group 2.00,

1,008 in Group 3.00,

790 in Group 4.00,

613 in Group 5.00, and

391 in Group 6.00.

It is certainly true that the pitchers in Group 2.00—the pitchers who have 2.00 more strikeouts than walks per nine innings—are better than those in Group 1.00, Group 3.00 is better than Group 2.00, etc. This is the data on that:

									Cy Young
Group	G	W	L	WPct	IP	SO	BB	ERA	Seasons	KP9	BB9
1.00	26684	4801	5202	.480	90061.1	45211	35236	4.16	1	4.52	3.52
2.00	35776	6751	6877	.495	120986.0	69832	42933	4.03	2	5.19	3.19
3.00	32162	6047	5688	.515	104322.1	69437	34672	3.86	4	5.99	2.99
4.00	26477	4912	4157	.542	82050.0	62835	26459	3.65	7	6.89	2.90
5.00	23134	3516	2878	.550	57539.0	50109	18258	3.43	6	7.84	2.86
6.00	14425	2092	1523	.579	32446.2	31228	9688	3.18	15	8.66	2.69

Let me strip out some of the data there to emphasize the rest:


Group	WPct	ERA	KP9	BB9
1.00	.480	4.16	4.52	3.52
2.00	.495	4.03	5.19	3.19
3.00	.515	3.86	5.99	2.99
4.00	.542	3.65	6.89	2.90
5.00	.550	3.43	7.84	2.86
6.00	.579	3.18	8.66	2.69

As the margin between strikeouts and walks increases, the performance improves—the Winning Percentage improves, the ERA improves, and the number of Cy Young seasons increases. This is not a surprise.
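
The rate columns in these tables follow directly from the raw totals, which makes them easy to recheck. A short Python sketch, assuming the ".1" and ".2" in the IP column mean thirds of an inning; the names are illustrative:

```python
# Raw totals copied from the full table above: group -> (W, L, IP, SO, BB).
GROUP_TOTALS = {
    "1.00": (4801, 5202, "90061.1", 45211, 35236),
    "2.00": (6751, 6877, "120986.0", 69832, 42933),
    "3.00": (6047, 5688, "104322.1", 69437, 34672),
    "4.00": (4912, 4157, "82050.0", 62835, 26459),
    "5.00": (3516, 2878, "57539.0", 50109, 18258),
    "6.00": (2092, 1523, "32446.2", 31228, 9688),
}


def innings_to_float(ip_text):
    """Convert innings notation ('32446.2' = 32446 and 2/3) to a float."""
    whole, _, thirds = ip_text.partition(".")
    return int(whole) + (int(thirds) / 3 if thirds else 0.0)


def derived_rates(wins, losses, ip_text, so, bb):
    """Return (winning percentage, strikeouts per 9, walks per 9)."""
    innings = innings_to_float(ip_text)
    return wins / (wins + losses), 9 * so / innings, 9 * bb / innings


for group, totals in GROUP_TOTALS.items():
    wpct, k9, bb9 = derived_rates(*totals)
    print(f"{group}  WPct {wpct:.3f}  K/9 {k9:.2f}  BB/9 {bb9:.2f}  margin {k9 - bb9:.2f}")
```

The last column printed is the group's aggregate strikeout-minus-walk margin, which should land close to the group label.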

That’s one question, but there is another. Is 7 strikeouts, 3 walks really the same as 5 strikeouts and 1 walk? Are these really legitimate groups?

The group "6.00" includes Nolan Ryan, 1973 (383 strikeouts, 162 walks) and Sam McDowell, 1968 (283 strikeouts, 110 walks), but also includes Greg Maddux, 1997 (177 strikeouts, 20 walks) and Roy Halladay, 2003 (204 strikeouts, 32 walks.) The group "5.00" includes Herb Score, 1956 (263-129), Kerry Wood, 2002 (217-97), and Sam McDowell, 1970 (304-131), but also includes Ferguson Jenkins, 1974 (225-45), Rick Reed, 2001 (142-31), and Jimmy Key, 1993 (173-43). Should these pitchers really be grouped together?

Each of these groups can be easily re-sorted into high-strikeout, high-walk and low-strikeout, low-walk sub groups, by sorting them by ((SO + BB)/IP) rather than ((SO – BB)/IP). If these are valid groups—if one strikeout has roughly equal impact to one walk—then the top half of the 6.00 group should be equal in effectiveness to the bottom half of the 6.00 group. I argue that this cannot be true, because one strikeout is NOT equal in impact to one walk; my friend argues that it is.

Not so much. This is the "6.00" group, sorted into the top half (Ryan and McDowell) and the bottom half (Maddux and Halladay):

Group	G	W	L	WPct	IP	SO	BB	ERA	KP9	BB9
Group 6.00	14425	2092	1523	.579	32446.2	31228	9688	3.18	8.66	2.69
Top 196	7584	787	655	.546	12561.0	13443	5092	3.38	9.63	3.65
Bottom 195	6841	1305	868	.601	19885.2	17785	4596	3.06	8.05	2.08

The winning percentage of the Maddux/Halladay pitchers is .601. The winning percentage of the Ryan/McDowell pitchers is .546. This is the split for group 1.00:

Group	G	W	L	WPct	IP	SO	BB	ERA	KP9	BB9
Group 1.00	26684	4801	5202	.480	90061.0	45211	35236	4.16	4.52	3.52
Top 459	12862	1806	2210	.450	36072.0	21813	17779	4.51	5.44	4.44
Bottom 458	13822	2995	2992	.500	53989.0	23398	17457	3.93	3.90	2.91

Again, there is a 50-point difference in the winning percentage between the top half of the group (the high strikeout/high walk pitchers) and the bottom half. The difference between Group 1.00 and Group 2.00, in Winning Percentage, is only 15 points. The difference between Group 1.00 and Group 3.00 is only 35 points of winning percentage. The difference between the high strikeout/high walk pitchers and the low strikeout/low walk pitchers in these two groups is 50 to 55 points.
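
The re-sort into halves can be sketched as follows. This is a minimal Python illustration, not the code behind the tables: the three toy seasons are hypothetical, the function names are made up, and giving the extra season to the top half when the count is odd is an assumption chosen to match the 196/195 and 459/458 splits shown above.

```python
import math


def split_by_k_plus_bb(seasons):
    """Split seasons into high and low (SO + BB) / IP halves.

    With an odd count the extra season goes to the top half (an assumption).
    """
    ranked = sorted(seasons, key=lambda s: (s["so"] + s["bb"]) / s["ip"], reverse=True)
    cut = math.ceil(len(ranked) / 2)
    return ranked[:cut], ranked[cut:]


def winning_pct(seasons):
    """Aggregate winning percentage over a list of season rows."""
    wins = sum(s["w"] for s in seasons)
    losses = sum(s["l"] for s in seasons)
    return wins / (wins + losses)


# Hypothetical seasons, each about 6.0 strikeouts minus walks per nine innings.
TOY_SEASONS = [
    {"name": "A", "w": 12, "l": 12, "ip": 200.0, "so": 250, "bb": 117},
    {"name": "B", "w": 16, "l": 8, "ip": 200.0, "so": 170, "bb": 36},
    {"name": "C", "w": 10, "l": 11, "ip": 180.0, "so": 215, "bb": 95},
]
top, bottom = split_by_k_plus_bb(TOY_SEASONS)
print([s["name"] for s in top], [s["name"] for s in bottom])

# Published Group 6.00 halves from the table above, as single aggregate rows.
TOP_600 = {"w": 787, "l": 655}
BOTTOM_600 = {"w": 1305, "l": 868}
gap = winning_pct([BOTTOM_600]) - winning_pct([TOP_600])
print(round(gap * 1000), "points")
```

If one strikeout really offset one walk, the two halves of a band would be expected to post about the same winning percentage, so the size of the gap is the thing to watch.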

This chart splits all six groups into their high walk/high strikeout and low walk/low strikeout components:

OK, the winning percentage difference is more like 40 points than 50, but still . . . it takes almost two strikeouts to offset one walk, not one to one.

Another reader, trying to be helpful, suggests that the accuracy of "Strikeouts Minus Walks" supports the McCracken thesis, but actually, that’s the problem: It clearly contradicts the McCracken thesis. One strikeout only eliminates about .30 hits. Three-tenths of a hit is obviously worth nowhere near as much as one walk—therefore, for strikeouts minus walks to be a valid way to look at the relationship, the strikeout pitcher would have to be gaining a large additional benefit from his strikeout tendency, beyond the direct benefit of having one less ball in play. The McCracken thesis is that there is no such secondary benefit.

My experience is that posting research like this rarely ends a debate. Other people will still see the data in other ways, and obviously they have a right to do so. I don’t see that Strikeouts Minus Walks is an appropriate way to look at the Strikeout and Walk data.

COMMENTS (13 Comments, most recent shown first)

lvrotsos
It's a bit misleading how you called them the top and bottom halves of each K-BB group, seeing as you showed the bottom halves were a bit better.
5:37 PM May 24th

jwilt
The thing about Cooper is that he was a catcher, exclusively. Never played a defensive inning anywhere else. He actually played quite a bit for a catcher of his era, leading or nearly leading the 1940-55 era in games caught. 1947 was the one year he was healthy enough or was allowed to catch 130 games. By rate stats '47 was probably his best year (although you could argue for '55), but it doesn't stick out like a sore thumb.
1:32 PM May 24th

tangotiger
My response: http://www.insidethebook.com/ee/index.php/site/article/k_minus_bb_differential_or_ratio/
11:49 AM May 24th

slideric
Neat job. A great way to discuss, with data. Needs a response and you will get one. thanks
10:17 AM May 24th

tangotiger
sptaylor: What's your point? I've been using Tangotiger ever since I've been online for over a decade. Whether it's a name, or handle, or whatever, it's still a unique identity. There's a Tangotiger in Denmark I think, and me. In the whole internet world! I can turn into an a$$hole, and that will affect me personally, because I can't create a new identity called Tomahawk, and be able to talk about Win Expectancy and people not ask "aren't you that a$$hole tangotiger?". Just because you might think a handle is disposable, I don't.

In any case, Bill I presume meant to show he disagreed with my idea, and not me personally, and hence not to make it personal. And I said, he can use Tangotiger, and I still wouldn't make it personal anyway. He erred on the side of caution.

***

Charlie: right, it's more in-line with the linear weights model as the explanation.
4:57 PM May 23rd

CharlesSaeger
How many runs is a strikeout worth? Let's see ... turning an out into 0.30 hits. We'll set a hit at 0.56 runs, accommodating for extra bases; we'll make that 0.29 hits per ball in play; we'll ignore home runs, which might not be a good thing for this but what the hell. That's 0.16 runs, half a walk.

TomTom, would you need to add the value of an out to this derivation, making this a quarter run, approaching the value of a walk?
4:43 PM May 23rd

sptaylor70
Tom - Of course you don't mind if Bill uses your name. It's not really your name, but rather just a "handle."
3:03 PM May 23rd

sokho
Bill, Tom: It's a pleasure and a privilege to listen in on a dialogue between two such eminent analysts. Beggars can't be choosers, but one small quibble from the peanut gallery: there's a bit too much civility here. Can you guys do us the favor of tossing out the odd curse word here and there, maybe work in a comment about each others' mothers? Then, there'd be NO reason for me to keep reading other sabermetrics discussion boards...
2:24 PM May 23rd

Steven Goldleaf
Another, perhaps obvious, way to express the total oddity of Cooper's 1947 stats is that his 23 road HRs far eclipse his TOTAL season HRs in any other year (his high is 18), and his road RBIs almost do the same (one season he had three more total RBI than he had on the road in '47).
2:00 PM May 23rd

Robinsong
Thanks, Bill. I was the other reader and you corrected my thinking. Thank you for studying the issue and for explaining it well.

Why was Walker Cooper a part-time player? Was he platooned? Easily injured? Or was he Don Slaught?
1:52 PM May 23rd

tangotiger
Two additional points to mention, each also carried the issue of bias:
1. I had said it should be K minus BB per PA, not per IP. It's not that big a deal, but it still has some bias to it.
2. And of course, HR exploded at the same time BABIP jumped (between 1992-1994). And so, if you have more guys in the high K minus BB per PA group, you'd also have higher HR rates as well.

If Bill were to update his charts above to also show BABIP and HR, then we'll see that there is more to the groups than that they have differing K and BB rates.
12:40 PM May 23rd

tangotiger
I'm the unnamed person, and I don't mind being named. I wouldn't take this issue personally. I will say one important thing: because K rates have been on a steady rise, and there's been a shift in batting average on balls in play between 1992 and 1994 (.280 pre-1992, .300 post-1994), that the groupings that Bill has done may have an era bias. This is why I find it helpful to look at the data from 1993-present, because there's been a pretty strong line drawn there, in terms of looking at unadjusted data.

That said, I appreciate what Bill did, and it's exactly the way the study should have been set up. It's just a question of whether there's an era bias.
12:19 PM May 23rd

Brian
In the 1st 19 games of June, 1998, Sammy Sosa had 17 HRs and 35 RBIs. Completely unrelated thought: was Cooper ever tested for PEDs?
12:14 PM May 23rd
