By Bill James

September 25, 2007

Let me ask you a question. Are there more pitchers, would you suppose, who win 15 games in a season, or 16? Are there more pitchers who win 7 games, or 8? Are there more pitchers who win 13 games, or 12?

Yes, of course; there are more pitchers who win 15 games in a season than 16, more who win 7 than 8, and more who win 12 than 13. The most common number of wins for a pitcher in a season is zero. The second most-common is one, the third most-common is two. Of course.

But then let me ask you another question. Are there more pitchers who win 19 games in a season, or 20?

Yes, of course; there are more pitchers who win 20 games than 19. You would know this by intuition; I am just pointing out to you the implications of what you already know. The number of pitchers with x wins in a season decreases constantly as x rises from zero to 26—except that there are more pitchers with 20 wins than 19. For an obvious reason. Twenty wins is a target. This actually is a count of all the pitchers with x wins in a season from 1900 through 2006:

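| Wins (x) | Pitchers | Wins (x) | Pitchers | Wins (x) | Pitchers | Wins (x) | Pitchers |
|---|---|---|---|---|---|---|---|
| 41 | 1 | 29 | 5 | 19 | 216 | 9 | 1092 |
| 40 | 1 | 28 | 11 | 18 | 314 | 8 | 1158 |
| 37 | 1 | 27 | 27 | 17 | 353 | 7 | 1312 |
| 36 | 1 | 26 | 24 | 16 | 477 | 6 | 1439 |
| 35 | 1 | 25 | 44 | 15 | 592 | 5 | 1649 |
| 34 | 1 | 24 | 57 | 14 | 687 | 4 | 1961 |
| 33 | 4 | 23 | 83 | 13 | 750 | 3 | 2208 |
| 32 | 1 | 22 | 113 | 12 | 812 | 2 | 2748 |
| 31 | 7 | 21 | 167 | 11 | 913 | 1 | 3841 |
| 30 | 3 | 20 | 251 | 10 | 965 | 0 | 8110 |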
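One way to make the "target" visible is to scan the counts for any win total that has more pitchers than the total just below it. The sketch below does that for the zero-to-26 range, using the figures from the table above; the names `WIN_COUNTS` and `find_bumps` are illustrative and not part of the original study.

```python
# Pitcher-seasons by win total, 1900-2006, copied from the table above
# (restricted to 0 through 26 wins, the range discussed in the text).
WIN_COUNTS = {
    0: 8110, 1: 3841, 2: 2748, 3: 2208, 4: 1961, 5: 1649, 6: 1439,
    7: 1312, 8: 1158, 9: 1092, 10: 965, 11: 913, 12: 812, 13: 750,
    14: 687, 15: 592, 16: 477, 17: 353, 18: 314, 19: 216, 20: 251,
    21: 167, 22: 113, 23: 83, 24: 57, 25: 44, 26: 24,
}


def find_bumps(counts):
    """Return every x whose count is higher than the count at x - 1."""
    return [
        x for x in sorted(counts)
        if x - 1 in counts and counts[x] > counts[x - 1]
    ]


print(find_bumps(WIN_COUNTS))  # [20]
```

Within that range, 20 is the only win total that breaks the downward run.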

I was thinking about that, and I got to wondering what other “targets” there are out there which disrupt the normal frequency curves. Are there more .300 hitters, over time, than .299 hitters? There must be, but. . .how many more?

I looked at this phenomenon in a general way. This is what I found:

*Hits*

There are 49 players in history who wound up the season with 200 hits, whereas there are only 27 players who wound up with 199 hits—another obvious example of targeting. (Counts for hitting stats, by the way, go back all the way to 1876.)

Other than that, I found no evidence of targeting any hit level. You don’t have more players with 100 hits than with 99, for example, because 100 hits is not a standard of excellence, so it doesn’t exert any “pull” on the players. Also, the distribution of hits actually goes *down* slightly from 135 to about 100. There are 1,887 players in history with 130-139 hits, but only 1,768 players with 100 to 109 hits. Again, for an obvious reason, that being that the length of the schedule causes hit totals to cluster just under 150.

*Home Runs*

I found no evidence of successful “targeting” of any particular home run level. There are, for example, only five seasons in history of exactly 50 home runs, whereas there are twenty seasons of 49 home runs. There are more 39-homer seasons than 40s, more 29-homer seasons than 30s. It could be, perhaps, that when players try to force that last home run, they tend *not* to hit it.

*Runs Scored and RBI*

This one is puzzling. There is clear evidence of targeting in the RBI column. This is the distribution of RBI seasons between 105 and 95:

| RBI | Seasons |
|---|---|
| 105 | 69 |
| 104 | 73 |
| 103 | 78 |
| 102 | 103 |
| 101 | 93 |
| 100 | 100 |
| 99 | 79 |
| 98 | 81 |
| 97 | 88 |
| 96 | 86 |
| 95 | 109 |

Obviously, 100 RBI is a goal for many hitters, and hitters who are close to that level find a way to get there, causing a “bubble” in the data right above 100 RBI. The magnetic influence of magic number RBI is so strong, actually, that it appears to jog players all along the spectrum. There are more players with 90 RBI, for example, than with 89, more players with 70 RBI than with 69, more players with 60 RBI than with 59, etc. It doesn’t hold at every level, but on balance, it appears that players who are on the verge of changing the first digit on their RBI count tend to push, in the last days of the season, to get over the hump. The guy who HAS 101 RBI takes the last day of the season off so that the guy with 89 RBI can bat cleanup and try to get to 90.

If this is true of RBI it must be true of runs scored, right? But no. . .there is no apparent evidence of “targeting” runs at any level. There is no bubble at 100 runs scored, not at all, no matter how you look at the data, and there is no evidence of targeting ten-levels. Either it is more difficult to manipulate the runs column, or it is less of a goal for players, or. . .something. You figure it out.

*Stolen Bases*

Since stolen bases involve stolen base *attempts* (that is to say, they involve the player’s judgment or discretion), I would have guessed that stolen bases might be particularly vulnerable to manipulation by players attempting to reach goals. It is hard to say whether this is true or not, for two reasons. First, there really are no magic numbers in the stolen base arena, except perhaps 100 stolen bases, and that has been achieved so infrequently as to be useless for this type of study. Second, the downward slope of the stolen base line as it moves from zero to 100 is so severe that it masks aberrations in the data. Compare that to runs scored: there are 2,047 seasons in history with 80 to 89 runs scored, and 3,194 seasons with 50 to 59 runs scored. It’s a *reasonably* flat line between them, which reveals bumps or irregularities.

But in stolen bases, there are 31 seasons in history with 80 to 89 stolen bases, and 234 seasons with 50 to 59—a seven-to-one ratio. The steep slope of the line tends to hide any little uptick along the way.
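The difference in slope can be put in numbers by dividing the count in the 50-to-59 band by the count in the 80-to-89 band for each category. This is only a quick arithmetic sketch using the four figures quoted above; the function name `drop_ratio` is made up for illustration.

```python
# Season counts quoted in the text.
RUNS_50_TO_59, RUNS_80_TO_89 = 3194, 2047      # runs scored
STEALS_50_TO_59, STEALS_80_TO_89 = 234, 31     # stolen bases


def drop_ratio(lower_band, upper_band):
    """How many seasons fall in the lower band for each one in the upper band."""
    return lower_band / upper_band


print(round(drop_ratio(RUNS_50_TO_59, RUNS_80_TO_89), 2))      # 1.56
print(round(drop_ratio(STEALS_50_TO_59, STEALS_80_TO_89), 2))  # 7.55
```

The runs line drops by about half again over that stretch, while the stolen base line drops by a factor of more than seven.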

Anyway, it does appear that there is targeting of ten-levels. Despite the steep slope of the line there are eleven seasons in history of exactly 70 stolen bases, whereas there are only three seasons of 69 stolen bases. There are more seasons of 40 than 39, more of 30 than 29. On balance, it does appear there is some targeting in this category.

I remember a story about Rocky Colavito stealing a base. It was a nothing game, sort of getting out of hand, and Rocky took off for second, and they gave him a stolen base (the defensive indifference rule was virtually never applied in those days). Asked about it after the game, Rocky joked that he liked to get into every category sometime during the year, and he had stolen bases taken care of now.

So you see, targeting *could* affect the distribution at any level. That’s targeting, too. Rocky’s target was one stolen base.

*Batting Average*

This is the big one. There are, in baseball history, 195 players who have finished the season hitting .300 (in 400 or more plate appearances), whereas there are only 107 players who have finished the season hitting .299, the largest “targeting” discrepancy that I found in this study. The “.300 lure” is strong enough that it affects all of the data points around .300:

| Average | Players |
|---|---|
| .304 | 147 |
| .303 | 123 |
| .302 | 156 |
| .301 | 157 |
| .300 | 195 |
| .299 | 107 |
| .298 | 128 |
| .297 | 138 |
| .296 | 123 |

Altogether, there are 508 players who have finished the season hitting .300 to .302, as opposed to 373 players who have finished the season hitting .297 to .299.
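Those two totals can be checked directly from the counts listed above. The snippet below is a small sketch of that addition; `AVG_COUNTS` and `window_total` are illustrative names, with averages keyed in thousandths (300 means .300).

```python
# Players finishing at each batting average, keyed in thousandths,
# copied from the list above.
AVG_COUNTS = {
    304: 147, 303: 123, 302: 156, 301: 157, 300: 195,
    299: 107, 298: 128, 297: 138, 296: 123,
}


def window_total(counts, low, high):
    """Sum the player counts for every average from low to high, inclusive."""
    return sum(counts[avg] for avg in range(low, high + 1))


just_made = window_total(AVG_COUNTS, 300, 302)    # .300 to .302
just_missed = window_total(AVG_COUNTS, 297, 299)  # .297 to .299
print(just_made, just_missed)  # 508 373
```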

Batting average, of course, is the Godfather of baseball stats, and there appears to be batting average targeting at other levels as well (.350 and .400, for example). But there the numbers are so small that you can’t really be sure.

This data (on .300 hitters) was so dramatic that I asked myself whether this could be impacted by the fact that there is a simple combination—3 for 10—that figures up to exactly .300. In other words, could the number spike upward at .300 because the hitter has a chance every tenth at bat to land on exactly .300?

It could be, but. . .it isn’t. You have a chance to land on exactly .300 every tenth at bat, but you have a chance to land on .286 every seventh at bat, as 2 for 7 is .286. If this caused a spike at .300, it should cause a larger spike at .286. But in fact there have been fewer .286 hitters in history than .285 hitters or .287 hitters. There is a second reason that I know that this isn’t what’s causing the spike in the data. . .I’ll explain in a moment.

There is also a “ten-pull” in batting average. There are more players who hit .280 than .279, more players who hit .270 than .269, etc. Seems pretty clear.

Brooks Robinson used to tell this story on himself, don’t know if he still does. He had a miserable year in 1963, and went into his last at bat of the season hitting exactly .250—147 for 588. If he made an out, he wound up the season hitting under .250—but he got a hit, and wound up at .251. He said it was the only hit he got all season in a pressure situation.

That’s my point. . .players WANT to wind up the season hitting .250, rather than in the .240s. They tend to make it happen.

*On Base and Slugging*

I found no evidence whatsoever of “targeting” in the categories on-base percentage, slugging percentage and OPS.

On Base and Slugging only became focus statistics for players in the late 1970s and the 1980s, and it may also be that these statistics are difficult to manipulate by deliberate actions. A player has limited options to manipulate his stats at season’s end, and, to the extent that he can do that, he’s going to tend to focus on batting average, rather than on-base percentage or slugging percentage.

The other way that we know that the spike in batting average at .300 isn’t caused simply by players going 3 for each 10 is by looking at the .400 slugging percentage. If this phenomenon caused a spike in the data at this point, it would cause a huge spike in slugging percentage at .400, since a player has the opportunity to land on exactly .400 every FIVE at bats—twice as often as he can land on .300. Further, since the data around .400 is thinner, a “chance spike” of the same size would appear larger.

But in fact, there is no spike in slugging percentage whatsoever at .400. A player in 500 at bats has the opportunity to slug exactly .400, or .402, or .404, or .406. In 501 at bats, he can land on .399, or .401, or .403, or .405. So that effect completely washes out when a player has 400 or more plate appearances.
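The wash-out is easy to see by listing which three-digit slugging percentages a whole number of total bases can actually produce at each at-bat total. The sketch below does that for the neighborhood of .400; the function name `reachable_slugging` and the 399-to-406 window are choices made for this illustration, and values are expressed in thousandths.

```python
def reachable_slugging(at_bats, low=399, high=406):
    """Rounded slugging percentages (in thousandths) that some whole number
    of total bases produces in the given number of at bats."""
    values = set()
    # A batter can total anywhere from 0 to 4 bases per at bat.
    for total_bases in range(4 * at_bats + 1):
        thousandths = round(1000 * total_bases / at_bats)
        if low <= thousandths <= high:
            values.add(thousandths)
    return sorted(values)


print(reachable_slugging(500))  # [400, 402, 404, 406]
print(reachable_slugging(501))  # [399, 401, 403, 405]
```

One extra at bat shifts the whole set of reachable values, so over a population of players with different at-bat totals no single figure such as .400 gets a built-in advantage.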

*Pitchers*

Other than wins, the only place where there seems pretty clear evidence of “targeting” in pitchers’ stats is in strikeouts. There are seven pitchers (since 1900) who have finished the season with 300 to 304 strikeouts, as opposed to zero pitchers with 295 to 299 strikeouts.

There are also 38 pitchers who have finished the season with 200 to 202 strikeouts, as opposed to 27 pitchers with 197 to 199 strikeouts. This seems to indicate possible “targeting” in the strikeout columns.

Other than that, I was unable to see evidence of targeting in any pitchers’ totals. I looked at innings pitched, and. . .there could be some targeting there, but I can’t really say. There are more pitchers with 200-219 innings pitched than with 180-199, but that very well could just be the way the data is. There are also more with 180-199 than with 160-179. . .the numbers flatten out in the 150 area.

I looked at ERAs, and again, there could be some targeting here, but I couldn’t really say. There have been more pitchers with ERAs of 1.97 to 1.99 (that is, just under the 2.00 line) than with ERAs of 2.00 to 2.02 (33 to 27). There are more pitchers with ERAs of 2.97 to 2.99 than with ERAs of 3.00 to 3.02 (145 to 126). But there are no more pitchers with ERAs just under 4.00 than just over 4.00, and there are actually more pitchers with ERAs just over 5.00 than just under 5.00. There could be some “targeting” effects here, and I kind of think there are but. . .you can’t really tell.

*Saves*

I would have thought, because of the nature of the Save stat, that it would be a natural for this kind of manipulation. There is really no evidence of pitchers being pushed to meet threshold levels in Saves:

| Saves | Pitchers | Saves | Pitchers |
|---|---|---|---|
| 50 | 1 | 49 | 1 |
| 40 | 11 | 39 | 11 |
| 30 | 26 | 29 | 29 |
| 20 | 44 | 19 | 42 |

*The Time Line*

This study was actually done *after* the studies on Cigar Points, which are written as a companion piece. This study naturally goes before that one in a logical sequence, but the work was done backward.

So I was telling John Dewan about the Cigar Points study, before I wrote it up, and I asked him to guess who had the most “Cigar Points” of any player in history. A cigar point is a point you get for ALMOST reaching a target, but not quite—close, but no cigar.

John guessed that whoever had the most cigar points would have to be a player who played a long time ago. *Why?* I asked, genuinely puzzled.

“Well,” he said, “it seems like now, when a player gets close to those numbers, he’s going to keep going until he gets there.”

We weren’t quite communicating. I was talking about magic numbers, and John thought that what I meant was things like 3,000 hits, 500 home runs, 300 wins—career cumulative totals. I also did consider those career milestones in the cigar points studies, but what I *mostly* had in mind was season standards like a .300 batting average, 100 RBI, 200 hits. . .the things we have discussed here.

That was just a misunderstanding based on my failure to communicate clearly, but then I got to thinking about the issue this raises. Is it true that there is more of this now than there was years ago? Are players *more* goal-oriented now than they were forty years ago?

When I got to that question, I almost immediately remembered lots of stories about players from my childhood making explicit efforts to meet statistical goals—like the Rocky Colavito and Brooks Robinson stories that I told earlier. Ron Santo talked publicly about his desire to hit .300 with 30 home runs, 100 RBI every year. Willie Stargell one year announced a goal of hitting .320 with 40 homers, 120 RBI.

In 1964 Tony Cloninger won his 19th game of the season with a couple of days left on the schedule. On the last day of the season the Braves led [...] 20th win.

“No,” said Cloninger. “When I win 20 games, I want to do it on my own.” He did win twenty the next year (24, in fact). It’s like the Ted Williams story on a small level: Williams being offered a day off to keep his batting average at .400.

Speaking of Ted Williams. . .Williams held the American League record for home runs by a rookie, 31, until 1950, when Walt Dropo hit 34. (Al Rosen in 1950 was not considered a rookie by the standards of the time.) In 1959 Bob Allison had 27 homers by the end of July, on pace for 40+, but hit only two in August and one in September, and missed the record. Again, his manager (Cookie Lavagetto) talked openly about how disappointed he was that Allison had failed to break the record.

Allison, Callison. When Johnny Callison was at .301 with two days left in the season in 1962, his manager, Gene Mauch, gave him only one at bat over the last two days of the season so that he would stay at .300. He explained to the media that he wanted Callison, a young player, to go through the winter thinking of himself as a .300 hitter.

It seems to me, and this is just my intuition, but it seems to me that there was at least as much focus on players meeting statistical standards then as there is now. So then I got to thinking. . .has this really changed? And how could I measure that?

We can look at that by looking at the ratio of players who just meet a standard to those who just miss the standard. If the focus on meeting these records has increased, it should cause an increase in the ratio of those who just meet the standard to those who just miss, right? It seems to me that it should.

So I took the six (or seven) standards which most clearly are subject to targeting effects, which are:

200 hits

100 RBI

10-levels in stolen bases

a .300 average

20 wins

200 strikeouts and 300 strikeouts

I looked at the number of players just meeting and just missing these targets in each decade (except that I continued to ignore pitchers’ stats in the 19th century; twenty wins did not become a standard of excellence for a pitcher until about 1920).

Anyway, in the 1800s there is no evidence that players had any interest whatsoever in meeting statistical magic numbers. Altogether, in the 1880s, 49 players just reached a magic number, while 55 players just missed one:

1880s 49 made 55 missed .471 percentage

1890s 99 made 98 missed .503 percentage

There is no data for the 1870s, since no player in the 1870s had 100 RBI, 200 hits, or 400 plate appearances.

In the first four decades of the twentieth century there is little evidence of motivation to meet these statistical standards:

1900s 97 made 90 missed .519 percentage

1910s 95 made 77 missed .552 percentage

1920s 98 made 118 missed .454 percentage

1930s 99 made 92 missed .518 percentage

Players throughout this era were essentially as likely to hit .299 as to hit .300, as likely to win 19 games as to win 20, which I take to be evidence that there was little focus on these magic numbers.

And then suddenly, in the 1940s, the ratios changed very dramatically:

1940s 76 made 52 missed .594 percentage

1950s 101 made 63 missed .616 percentage

1960s 96 made 73 missed .568 percentage

1970s 176 made 114 missed .607 percentage

Despite what people might assume, the focus on these magic numbers actually *decreased* somewhat once the free agent era began:

1980s 159 made 119 missed .572 percentage

1990s 174 made 129 missed .574 percentage

2000s 142 made 104 missed .577 percentage (through 2006)
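The percentage in each row is simply the number who just made a target divided by the number who just made or just missed one. The sketch below recomputes that figure for every decade listed above; `DECADES` and `made_share` are illustrative names, and the counts are copied from the rows above.

```python
# (decade, just made, just missed), copied from the rows above.
DECADES = [
    ("1880s", 49, 55), ("1890s", 99, 98),
    ("1900s", 97, 90), ("1910s", 95, 77),
    ("1920s", 98, 118), ("1930s", 99, 92),
    ("1940s", 76, 52), ("1950s", 101, 63),
    ("1960s", 96, 73), ("1970s", 176, 114),
    ("1980s", 159, 119), ("1990s", 174, 129),
    ("2000s", 142, 104),
]


def made_share(made, missed):
    """Share of near-target seasons that landed on or just above the target."""
    return made / (made + missed)


for decade, made, missed in DECADES:
    print(f"{decade}  {made:>3} made  {missed:>3} missed  {made_share(made, missed):.3f}")
```

A share near .500 means players were about as likely to fall just short as to get there; the further above .500, the stronger the apparent pull of the target.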

Isn’t that cool? It’s so rare that we pose a question like that and actually find the answer, but we have it here: Clear, definitive evidence that the focus on magic numbers began about 1940.

Which makes sense when you think about it. In 1927, when Ty Cobb got his 4,000th career hit, it was a matter of slight interest to the baseball public. In 1934 Sam Rice retired with 2,987 career hits.

But in 1938 the Washington Senators deliberately pushed Taffy Wright to the 100-game level, pinch-hitting him in the last week of the season so that (they believed) he would be eligible for the batting title—the first time that I know of such a thing being done. Over the next five years there would be several more controversies about batting titles and ERA titles. In 1941 there occurred the Ted Williams story—an oft-told tale in which Williams is portrayed as a hero for successfully meeting his personal statistical goal. The Joe DiMaggio story, also 1941, is also a story about a player performing purely personal statistical heroics.

In 1939 the Hall of Fame opened, and this certainly increased the focus on players’ statistical accomplishments. Yes, players did have a demonstrated interest in personal statistics before 1940, but it certainly does seem that there was a ramping up of that interest right about that time. In fact, I think that I have written before that this occurred at about that time, without knowing what I now know, what I have learned while writing this article. The ratio of players with 200 hits to players with 199 hits, players hitting .300 to those hitting .299, players winning 20 games to those winning 19. . .those ratios are actually much MORE one-sided if you focus only on the years since 1940.

Obviously one cannot draw reliable inferences about the ethics of the game from a study of this nature. But for whatever it is worth, I am willing to believe that the emphasis on magic numbers has decreased since free agency. Free agency, in a sense, made selfishness less tolerated within the game—simply because it had to be that way.

There is a very healthy ethic within baseball today that “yes, you can leave when the season is over. If you can make $40 million signing a contract to play for another team, good for you. But as long as you are a part of this team, you need to do what is best for this team.” That ethic hasn’t gone away, during the free agency era, and it may well have gotten stronger.

*(My appreciation to Retrosheet for help with the details in re Brooks Robinson, Bob Allison, Johnny Callison and Tony Cloninger.)*


## COMMENTS (3 Comments, most recent shown first)

CharlesSaeger

Can you break down the 1930s by halves or thirds or something? If this began with the talk of the Hall of Fame, we'd expect a definite increase in the latter half of the decade.

4:08 PM Mar 15th

studes

Oh, I see. Never mind. Well, I would have said more pitchers with 19! Silly me.

9:10 PM Feb 21st

studes

I'm not sure you meant this, did you?

"Yes, of course; there are more pitchers who win 20 games than 19."

First sentence, fourth paragraph.

9:08 PM Feb 21st