Research Reports

February 8, 2010

1.  The Willie Davis Study

 

            I have a couple of small research studies to report here.   The first study, which we could call the Willie Davis study, has to do with the significance of walk rates as a growth predictor for young players.    Willie Davis was an immensely talented young player who came up about 50 years ago.   In defense of Willie, he was also a very good player, and a very underrated player.   He played his prime years in a quite exceptional run context, with very low league ERAs, and in a park with a ridiculously low park factor.   If you created 4.00 runs per 27 outs in that park in that time, that was a lot.   It’s hard to adjust for this intuitively, so Willie’s numbers look worse than they really were.

            But at the same time, Willie was not an easy man to like.   He was self-centered and hard-headed, and he doesn’t come off well in books written by his teammates.    Also, he didn’t walk.    He walked 14 times in 1965, 15 times in 1966, playing regularly.   Not saying these facts are related.

            Anyway, I was wondering:   does a low walk rate predict a failure to develop as a hitter?   Because I can see it either way.   I can see that a low walk rate for a young player could be an impediment to development, but I can also see how a low walk rate might be predictive of development, in this way:  that the hitter who walks more, as a young player, can be seen as a more finished product, and therefore as a player who has less room to develop.   There’s an extra door open for the undeveloped hitter.

            I studied the issue in this way.   I took all players in history who

            1)  Were 22 to 24 years old on June 30 of the season in question,

            2)  Had played 300 to 400 games in their careers at the end of the season,

            3)  Had at least 1,000 career plate appearances, and

            4)  Had done so no more recently than 2004.

            There were 548 qualifying players meeting those standards.  (If players met those standards more than once, I used the more recent season.)   Those 548 players, I coded by their career batting average and their career slugging percentage, and by their career walk rate.    I then looked for “sets” of players or “pairs” of players who had the same codes for batting average and slugging percentage, but significant separations in their walk rates.   Garry Templeton, 1978, was coded “IJ 2”, for example, while Gene Richards, 1978, was coded “IJ 10”.   The “I” code indicates that the player’s career batting average at that point was between .290 and .299.  (Actually, they had both hit .299).  The “J” indicates that his slugging percentage was between .400 and .420.    The “2” for Templeton indicates that he rarely walked.   The “10” for Richards indicates that he walked a lot.   These are the career records of those two players at that time:

 

Last       AGE    G    AB  HR  RBI   BB   SO  SB  CS   Avg   OBA  SPct   OPS
Templeton   22  361  1481  11  143   44  190  73  42  .299  .319  .405  .724
Richards    24  300  1080   9   77  124  160  93  29  .299  .373  .406  .778

 

Another match was Harold Baines, 1982 (Code LH 5) with John Olerud, 1992 (Code LH 10).

 

Last    AGE    G    AB  HR  RBI   BB   SO  SB  CS   Avg   OBA  SPct   OPS
Baines   23  384  1379  48  195   80  201  18   9  .268  .306  .449  .755
Olerud   23  394  1278  47  182  195  221   1   4  .269  .364  .440  .804

 

            Altogether I had 85 matched sets in the study.   Then I looked to see how those players performed in the rest of their careers.  
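The matching step can be sketched roughly as follows. This is only an illustration, not the procedure actually used for the study: the article does not say how large a gap in walk codes counted as "significant", so the `min_gap` threshold of 5 is an assumption (the Baines/Olerud pair, coded 5 and 10, suggests a gap of 5 qualified), and the function name, parameter names, and tuple layout are all hypothetical.

```python
from collections import defaultdict

def match_pairs(players, min_gap=5):
    """Pair players who share an average/slugging code but differ in walk code.

    players: list of (name, avg_slg_code, walk_code) tuples.
    min_gap: assumed minimum walk-code separation for a "significant" gap.
    Returns a list of (avg_slg_code, low_walk_name, high_walk_name).
    """
    buckets = defaultdict(list)
    for name, code, walk_code in players:
        buckets[code].append((walk_code, name))

    pairs = []
    for code, members in buckets.items():
        members.sort()  # lowest walk code first
        lo, hi = 0, len(members) - 1
        # Greedily pair the lowest remaining walker with the highest one.
        while lo < hi and members[hi][0] - members[lo][0] >= min_gap:
            pairs.append((code, members[lo][1], members[hi][1]))
            lo += 1
            hi -= 1
    return pairs

# The four players named in the article, with the codes given there.
players = [
    ("Templeton", "IJ", 2),
    ("Richards", "IJ", 10),
    ("Baines", "LH", 5),
    ("Olerud", "LH", 10),
]
pairs = match_pairs(players)
print(pairs)
# [('IJ', 'Templeton', 'Richards'), ('LH', 'Baines', 'Olerud')]
```

Pairing the lowest walker with the highest in each bucket is just one plausible way to form the sets; the article also mentions "sets" larger than pairs, which this sketch does not try to model.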

 

            Short answer:   No difference whatsoever, other than that the “high walk” players continued to walk more often than the low walk players.   Otherwise, they remained essentially perfectly matched to the end of their careers.   These are the average career totals of the two groups of players through the base year of the study.   I’ll break this into several lines so that I can carry more information.   The first group of lines summarizes the SEASON stats of the players in the base year:

 

Group         G   AB   R    H  2B  3B  HR  RBI  BB  SO  SB  CS   Avg
Low Walks   118  451  63  124  21   7   8   54  24  46  12   3  .275
High Walks  128  448  67  123  21   5   8   53  60  50  14   4  .275

 

Group         Age  OUTS   PA   OBA  SPct   OPS
Low Walks   23.27   342  483  .310  .398  .707
High Walks  23.37   345  520  .358  .391  .749

 

            The second group of lines summarizes their career totals at that point in their careers:

 

Group       Years    G    AB    R    H  2B  3B  HR  RBI   BB   SO  SB  CS   Avg
Low Walks    3.87  346  1302  177  353  59  19  19  143   59  127  31   9  .271
High Walks   3.75  346  1184  174  320  53  15  19  134  150  139  36  10  .270

 

Group       OUTS    PA   OBA  SPct   OPS
Low Walks    985  1384  .306  .390  .696
High Walks   912  1364  .355  .389  .744

 

            These lines compare the “rest of career” performance of the two groups of players:

 

Group       Years     G    AB    R     H   2B  3B  HR  RBI   BB   SO   SB  CS   Avg
Low Walks    9.14  1007  3690  531  1049  178  44  75  502  273  350  107  25  .284
High Walks   9.13  1016  3413  518   947  162  32  77  436  468  376   93  30  .277

 

Group       OUTS    PA   OBA  SPct   OPS
Low Walks   2775  4049  .337  .418  .754
High Walks  2615  3967  .367  .411  .778

 

            And these lines compare the final career totals of the two groups of players, combining the “before” and “after” lines given above:

 

Group       Years     G    AB    R     H   2B  3B  HR  RBI   BB   SO   SB  CS   Avg
Low Walks   13.01  1353  4992  707  1402  237  63  95  646  332  477  138  34  .281
High Walks  12.88  1362  4597  692  1267  215  47  96  570  618  515  128  40  .276

 

Group       OUTS    PA   OBA  SPct   OPS
Low Walks   3760  5433  .329  .410  .739
High Walks  3527  5331  .364  .405  .769

 

            Essentially, there is no reason to believe that the walk rate plays any predictable role in the future development of a young player.  

 

2.  The Chris Young Study

 

            The second little study arose from the Gold Mine 2010, on which we have been working feverishly here behind your screen.   Someone, and I’m sorry to say that I don’t know who, had submitted the following “nugget” about Chris Young:

 

Chris Young in 2009 was the most extreme flyball hitter in the majors. 56% of his balls in play were fly balls (the highest figure among major league qualifiers) and only 26% were ground balls (the lowest). The rest were line drives.

 

To this, I had suggested that we add a second paragraph, as follows:

 

            You’ll notice that he’s not really a good hitter.   Most hitters who are extreme one way or the other (flyball or ground) are not good hitters, because they hit a low percentage of line drives, and good hitters mostly live on the line drives.   If a hitter is an extreme flyball hitter he’s off-center, and it’s likely that a high percentage of the fly balls are just pop outs.

 

            Dave Studenmund argued that there was no evidence that this was true, and this led to an extended back-and-forth between us about whether it was or was not.   Eventually I did a little study.  I took all hitters since 2006 (that is, 2006 through 2009), and found for each player the number of Ground Balls, Line Drives and Fly Balls that he had put in play over the four-year period.   I then figured to what extent each hitter was an extreme flyball or extreme groundball hitter, by this method.   I multiplied the Ground Balls hit by 1, the Line Drives by 5, and the Fly Balls by 9, and then divided by the total of the three.   I then subtracted 5, and stated the result as an absolute value.

            For example, Jacque Jones over the four years hit 505 ground balls, 161 line drives, and 225 fly balls.   That's a total of 891.   If you multiply the ground balls by 1, the line drives by 5, and the fly balls by 9 and divide by 891, you get 3.74.   Subtract 5, that's -1.26.   Stated as an absolute, that's 1.26.

            Second example, David Dellucci.   Dellucci hit 269 ground balls, 119 line drives, 269 fly balls.   Follow the same process, you get 5.00.   Subtract 5, that's zero.   Stated as an absolute, that's zero.

            Third example, Chris Young.  Young over the four years hit 486 ground balls, 233 line drives, 608 fly balls.   Follow the same process, you get 5.37.  Subtract 5, you have 0.37.   Stated as an absolute, that's 0.37.
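The arithmetic in the three examples above can be written out as a short Python sketch. The weights (1, 5, 9) and the batted-ball counts come straight from the text; the function name `extremity` and its parameter names are made up for illustration, and this is not the spreadsheet or program actually used for the study.

```python
def extremity(ground_balls, line_drives, fly_balls, fly_weight=9.0):
    """Return (weighted_average, extremity_score) for one hitter.

    Ground balls count 1, line drives 5, fly balls `fly_weight` (9 in the
    original version of the study). The extremity score is the absolute
    distance of the weighted average from 5.
    """
    total = ground_balls + line_drives + fly_balls
    weighted = (1 * ground_balls + 5 * line_drives + fly_weight * fly_balls) / total
    return weighted, abs(weighted - 5)

# Batted-ball counts as given in the article.
examples = {
    "Jacque Jones": (505, 161, 225),
    "David Dellucci": (269, 119, 269),
    "Chris Young": (486, 233, 608),
}
for name, counts in examples.items():
    weighted, score = extremity(*counts)
    print(f"{name}: weighted {weighted:.2f}, extremity {score:.2f}")
# Jacque Jones: weighted 3.74, extremity 1.26
# David Dellucci: weighted 5.00, extremity 0.00
# Chris Young: weighted 5.37, extremity 0.37
```

Keeping the fly ball weight as a parameter makes it easy to rerun the same calculation with a different weight without touching the rest of the logic.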

            I then removed from the study all players who put fewer than 500 balls in play over the four years.    That left us with 382 players.   I sorted them into three groups---127 "extreme" players, with the highest absolute scores, 127 "centrist" players, with the lowest, and 128 players who were neither extreme nor centrist.

I then looked up the OPS (over the four-year period) of each of those players.

            As a group--as I expected--the centrist hitters were the best hitters, the "in-between" hitters were next, and the "extreme" hitters were the weakest hitters.   The differences were relatively large.  

            Of the "extreme" hitters, more had an OPS under .700 than over .800.   33 of the 127 had an OPS under .700.   Only 25 had an OPS between .800 and .899, and only three had an OPS over .900.

But of the "centrist" hitters, more had an OPS over .900 than under .700--not merely over .800, but over .900.    Here's a spectrum of OPS for the three groups:

 

 

 

                     OPS      OPS       OPS      OPS
                   Under  .700 to   .800 to     Over
                    .700     .799      .899     .900
Extreme Groups        33       66        25        3
Neither/Nor Group     12       68        40        8
Centrist Group        11       64        36       16

 

            There is no question that there is a difference, and that this IS consistent with my essential claim that the best hitters are the "centrist" hitters.   The average OPS of the "extreme hitters" above was .732; of the second group, .786, and, for the centrist hitters, .798.

            However, I wasn't EXACTLY right, myself.   What I didn't anticipate when I set up this study is that there are significantly more ground balls than fly balls.   Because there are more ground balls than fly balls, the "norm" for the weighted average wasn't 5.00, as I intended, but 4.77.   Significant difference.

            Because of that the great majority of the hitters that I had measured as "extreme" hitters were extreme GROUND BALL hitters.   Actually, of the 127 players in that group, 106 were extreme ground ball hitters, whereas only 21 were extreme fly ball hitters.  The weighted average for those hitters, the extreme hitters, was 4.32.

            To correct for this, first I re-sorted the 382 players into 5 groups, in ascending order of their "weighted average", 76 players in each group except the third group, which had 78. The most extreme groundball hitters were in the "1" group, and the most extreme flyball hitters were in the "5" group.   

            Sorted in this way, the data shows that the advantage is not to "centrist" hitters as much as it is to flyball hitters.    This is the OPS of the five groups:

 

Group 1   Ground Ball Hitters   .731
Group 2                         .746
Group 3   Mid Range             .781
Group 4                         .806
Group 5   Fly Ball Hitters      .795

 

            So the real separator was not “extreme” hitters versus “centrist” hitters, but fly ball hitters versus ground ball hitters.   However, I also decided to correct the flaw of the original study by multiplying fly balls not by 9.00, as I originally did, but by 9.59.   This causes the average to be 5.00, which causes the “extreme” group to be more or less evenly divided between fly ball hitters and ground ball hitters.
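One way to arrive at a corrected fly ball weight is to solve for the value that puts the pooled weighted average exactly at 5.00. The sketch below assumes that "the average" means the weighted average over all balls in play pooled together, which the article does not spell out. The league-wide totals behind the 9.59 figure are not given either, so the totals used here are invented purely to show the arithmetic, and the function name is hypothetical.

```python
def centering_fly_weight(ground_balls, line_drives, fly_balls, target=5.0):
    """Solve (1*GB + 5*LD + w*FB) / (GB + LD + FB) = target for w.

    Assumes the ground ball weight stays at 1 and the line drive weight at 5,
    as in the original scheme. With target = 5 this reduces to
    w = 5 + 4 * GB / FB, so the more ground balls outnumber fly balls,
    the higher the fly ball weight has to be.
    """
    total = ground_balls + line_drives + fly_balls
    return (target * total - 1 * ground_balls - 5 * line_drives) / fly_balls

# Made-up pooled totals (NOT from the study), with more grounders than flies.
gb, ld, fb = 1200, 400, 1000
w = centering_fly_weight(gb, ld, fb)
pooled_average = (1 * gb + 5 * ld + w * fb) / (gb + ld + fb)
print(w, pooled_average)
```

With these invented totals the weight works out to 9.8 rather than 9.59; the point is only that any surplus of ground balls pushes the centering weight above 9.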

            When we do this, it still appears to be more true than not that the best hitters are “centrist” hitters.    The average OPS of the “extreme” hitters was .763.   The average OPS of the “mid-range” group was .770, and of the “centrist” hitters, .783.    The OPS breakdown, as given before, was:

 

 

 

                     OPS      OPS       OPS      OPS
                   Under  .700 to   .800 to     Over
                    .700     .799      .899     .900
Extreme Groups        23       63        33        7
Neither/Nor Group     21       68        31        7
Centrist Group        12       66        37       12

 

            So... is it strongly true that “extreme” hitters, like Chris Young, tend to be not-very-good hitters?  No.   But is it somewhat true?   Yes, it is.   Ultimately, we decided to remove the questionable paragraph, but it was not entirely untrue in my opinion.

 
 

COMMENTS (8 Comments, most recent shown first)

izzy24
I find it very strange that the low walk guys scored more runs than the high walk guys. Considering how close their batting average and slugging pct. is I would think the higher obp guys would score more runs.
8:28 PM Dec 7th
 
hotstatrat
I made a similar analysis (different method) as the walk rate study several years ago for www.scoresheetwiz.com and came up with the same conclusion. Yet the orthodoxy that players who can command the strike zone have a better chance of developing is so strong that I've been doubting my study lately. Thanks for this.

On another note, one rule of thumb you made once, Bill, that a significant drop in strikeouts is a better indication of a pitcher getting old than his actual age - or something along that line, needs an updated study. How true is that? What is significant? From mere empirical evidence, I'm finding more pitchers than I would expect bouncing back from hefty strikeout drops and other pitchers who just get worse and worse in their 30s. Perhaps, improved training and medicine has changed this rule of thumb?
10:06 PM Feb 9th
 
CharlesSaeger
Yeah, I should have noticed good, old-fashioned means regression staring me in the face. Actually, other than walks, these are two very ordinary groups of players, so that's the only place where it would happen.
5:15 PM Feb 9th
 
tangotiger
The phenomenon Bill is referring to is called "regression toward the mean". It basically says that anything you OBSERVE is in TRUTH closer to the mean. It works on anything. If the average score on a test is 75%, and some random person you see had a 90%, then you know that their "true" score (their next test, or their previous test, neither of which you know anything about) will actually be somewhere between 75% and 90%. In a general sense, any good score has some good luck associated with it, and any bad score has some bad luck associated with it.
5:11 PM Feb 9th
 
bjames
Responding to Charles Saeger, the walk rate of the "Low Walk" group went up more than the walk rate of the High Walk group because all groups are pulled toward the center. If you compared 40-homer players and 10-homer players, among those who were good enough to stay in the league, the home run rates of the 40-homer players would go down, while those of the 10-homer players would go up, because everybody tends toward the center. The "gravity" of the numbers pulls the low-walk players' walk rates up, and keeps the high-walk players' numbers down.
12:28 AM Feb 9th
 
birtelcom
I'm still very skeptical of the underlying data that divides balls in play among fly balls, ground balls and line drives. I worry that the observers doing the characterizations will often tend to treat two balls in play with identical trajectories differently depending on whether they result in safe hits or not. Any such bias would presumably throw off studies like this.
7:21 PM Feb 8th
 
CharlesSaeger
1) I assume you meant Harold Baines, 1982, not 1992. If not, I assume that I was watching a teenager play baseball in the 1980s; Joe Nuxhall had nothing on that.

2) The walk rate of the low walk group went up more than that of the high walk group. I don't know if it's significant or not, and it doesn't close the gap, but it might be that some of the low walk group figured out how to take a pitch.
3:45 PM Feb 8th
 
Trailbzr
I'm surprised by the results of the Willie Davis study. From my sabermetrically-formative years, I would have expected Tim Raines vs. Willie Wilson, Ozzie Smith vs. Garry Templeton and Eddie Murray vs. Cecil Cooper were matched sets about the value of walks as a measure of maturity and discipline and hence future value.
1:15 PM Feb 8th
 
 