Ex RBI

June 12, 2007
This study begins by asking what is essentially an archaic question, albeit an archaic question to which I want to know the answer. The question is, “When a player drives in more runs than you would expect, or fewer runs than you would expect, to what extent should we suspect that this is because he hit well in RBI situations, and to what extent should we suspect that this happens just because he had a lot of RBI opportunities?” Let’s take Al Kaline, 1956. Kaline hit .314 with 27 homers, 128 RBI. A player who hits .314 with 27 home runs would not ordinarily drive in 128 runs; I’ll document that later, but one knows it immediately by a baseball fan’s intuition. Did he hit .400 that year with runners in scoring position, or did he drive in all of those runs that year because he happened to hit in a lot of RBI situations?

As I said, it’s sort of an archaic question because, in the modern world, it’s easy to find out what a player has hit with runners in scoring position, and how many times he has batted with runners in scoring position. It’s still an interesting question, at least to me, so I’m still going to chase it.


I. Establishing Expectations

In order to study whether deviations from expected RBI occur more because of performance or more because of opportunity, we first have to establish how many RBI a player should be expected to have, given his other stats. How do we do that?

We can assume, as a starting point, that RBI must bear some predictable relationship to total bases. I have a spreadsheet that has in it the batting stats of about 80% of the regular hitters in baseball history. . .in a few weeks I’ll have everybody, but that doesn’t really matter, and I’m interested in this today. I took that spreadsheet, and I eliminated from it all the players who had fewer than 250 Plate Appearances, because frankly they’re just a nuisance for the present study. This left me with a file of 20,806 player seasons.

I then eliminated all the players from the 1882-1890 era for whom we have no RBI counts. This left me with 20,355 player seasons, with RBI ranging from 191 to 5 (Ernie Fazio, 1963), and total bases ranging from 457 to 35 (Bill Bergen, 1911). The players in the study averaged 55 RBI and 176 Total Bases, so they averaged .31 RBI per total base (.311580, if you want to be technical about it). Let’s call it .3116.

We can thus make a first estimate of each player’s expected RBI by simply assuming that his RBI should be .3116 times his total bases. Al Kaline, 1956, had 327 total bases, so we would expect that he would have 102 RBI. Since he actually drove in 128, we have an error of 26. By this first-cut method, we have a total error of 197,964 RBI for all players, an average error of 9.72. By this method, the biggest RBI over-achiever in my data sample was Sam Thompson, 1887. Thompson had 311 total bases, which should have led to 97 RBI. He actually had 166, or +69. . .an error of 69. The biggest under-achiever was Lloyd Waner, 1927. Waner had 258 total bases, which should have led to 80 RBI, but he actually drove in only 27.
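The first-cut method is just one multiplication, so it is easy to check by hand or in a few lines of Python. What follows is a minimal sketch, not the spreadsheet actually used for the study; the function names are made up for illustration, and the only inputs are the .3116 rate and the three seasons quoted above.

```python
# First-cut estimate: expected RBI as a constant share of total bases.
RBI_PER_TOTAL_BASE = 0.3116  # all-seasons average quoted in the text


def expected_rbi_first_cut(total_bases, rate=RBI_PER_TOTAL_BASE):
    """Expected RBI under the single-constant model."""
    return rate * total_bases


def rbi_error(actual_rbi, total_bases, rate=RBI_PER_TOTAL_BASE):
    """Signed error: positive means more RBI than expected."""
    return actual_rbi - expected_rbi_first_cut(total_bases, rate)


# (total bases, actual RBI) for the three seasons discussed above
seasons = {
    "Al Kaline 1956": (327, 128),
    "Sam Thompson 1887": (311, 166),
    "Lloyd Waner 1927": (258, 27),
}

for name, (tb, rbi) in seasons.items():
    expected = expected_rbi_first_cut(tb)
    print(f"{name}: expected {expected:.0f}, error {rbi_error(rbi, tb):+.0f}")
```

Rounded to whole runs, this should give back the figures in the paragraph above: 102 expected for Kaline (+26), 97 for Thompson (+69), and 80 for Waner (-53).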

We can do a little bit better than that pretty easily. The next question was, “Is this figure fairly constant over time, or does it go up and down radically as the game changes?”

Answer: It’s reasonably constant except for the 1890s. It does go up and down with run scoring levels, but, except for the 1890s, it stays within an acceptable range. . .what seems to me like an acceptable range. This is the norm for each decade in baseball history:

1870s .307 1920s .317 1970s .301
1880s .329 1930s .327 1980s .302
1890s .371 1940s .315 1990s .316
1900s .299 1950s .310 2000s .311
1910s .295 1960s .296  

It goes up and down some, but except for the 1890s it stays within a range from .295 RBI per total base to .329. I think I’m going to deal with the aberrant data for the 1880s/1890s by simply throwing the 19th century out of the study. The average for all of baseball history since 1900 is .3079, but in any case we will continue to treat this as a constant throughout baseball history.

Our aggregate error now is 175,877 RBI. . .see, we’re already making progress. That’s not real progress, of course; we’ve reduced the aggregate error by shrinking the study, but the average error for 1900 to the present is 9.52, so we have at least reduced the average error a little. The largest errors now are Hank Greenberg, 1937 and Hack Wilson, 1930, both over their estimate by 60.76 RBI. Waner is still the furthest under, followed by Richie Ashburn in 1958.
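As a quick check on the post-1900 constant, here is a small Python sketch that applies the .3079 rate to the four seasons just named. The function name is an illustrative assumption; the total bases and RBI are the ones shown in the tables that follow.

```python
POST_1900_RATE = 0.3079  # RBI per total base, 1900 to the present, per the text


def error_vs_constant(actual_rbi, total_bases, rate=POST_1900_RATE):
    """Actual RBI minus the constant-rate expectation."""
    return actual_rbi - rate * total_bases


# (total bases, actual RBI)
cases = {
    "Hank Greenberg 1937": (397, 183),
    "Hack Wilson 1930": (423, 191),
    "Lloyd Waner 1927": (258, 27),
    "Richie Ashburn 1958": (271, 33),
}

for name, (tb, rbi) in cases.items():
    print(f"{name}: {error_vs_constant(rbi, tb):+.2f}")
```

Greenberg and Wilson both come out at +60.76 to two decimals, which is why they share the top spot; Waner and Ashburn land near -52 and -50.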

It is apparent from glancing at the data that home run hitters tend to be “over” their expected RBI, and singles hitters under. Well, here, let me show you some of the data. . .these are the fifteen hitters who are furthest over their expected RBI:

Player YEAR G AB R H 2B 3B HR RBI Avg   TB EX RBI Error
Hank Greenberg 1937 154 594 137 200 49 14 40 183 .337   397 122 61
Hack Wilson 1930 155 585 146 208 35 6 56 191 .356   423 130 61
Manny Ramirez 1999 147 522 131 174 34 3 44 165 .333   346 107 58
Lou Gehrig 1931 155 619 163 211 31 15 46 184 .341   410 126 58
Vern Stephens 1949 155 610 113 177 31 2 39 159 .290   329 101 58
Jimmie Foxx 1938 149 565 139 197 33 9 50 175 .349   398 123 52
Zeke Bonura 1936 148 587 120 194 39 7 12 138 .330   283 87 51
Hank Greenberg 1935 152 619 121 203 46 16 36 170 .328   389 120 50
Hack Wilson 1929 150 574 135 198 30 5 39 159 .345   355 109 50
Vic Wertz 1949 155 608 96 185 26 6 20 133 .304   283 87 46
Luke Appling 1936 138 526 111 204 31 7 6 128 .388   267 82 46
Jimmie Foxx 1930 153 562 127 188 33 13 37 156 .335   358 110 46
Joe DiMaggio 1948 153 594 110 190 26 11 39 155 .320   355 109 46
Ted Williams 1949 155 566 150 190 39 3 43 159 .343   368 113 46
Don Hurst 1932 150 579 109 196 41 4 24 143 .339   317 98 45


And these are the fifteen who are furthest under:

Player YEAR G AB R H 2B 3B HR RBI Avg   TB EX RBI Error
Lloyd Waner 1927 150 629 133 223 17 6 2 27 .355   258 79 -52
Richie Ashburn 1958 152 615 98 215 24 13 2 33 .350   271 83 -50
Patsy Dougherty 1904 155 647 113 181 18 14 6 26 .280   245 75 -49
Snuffy Stirnweiss 1944 154 643 125 205 35 16 8 43 .319   296 91 -48
Luis Castillo 2000 136 539 101 180 17 3 2 17 .334   209 64 -47
Johnny Mostil 1926 148 600 120 197 41 15 4 42 .328   280 86 -44
Don Blasingame 1959 150 615 90 178 26 7 1 24 .289   221 68 -44
Juan Pierre 2006 162 699 87 204 32 13 3 40 .292   271 83 -43
Tim Raines 1985 150 575 115 184 30 13 11 41 .320   273 84 -43
Ralph Garr 1975 151 625 74 174 26 11 6 31 .278   240 74 -43
Ralph Garr 1971 154 639 101 219 24 6 9 44 .343   282 87 -43
Willie Wilson 1980 161 705 133 230 28 15 3 49 .326   297 91 -42
Matty Alou 1966 141 535 86 183 18 9 2 27 .342   225 69 -42
Pete Rose 1970 159 649 120 205 37 9 15 52 .316   305 94 -42
Matty Alou 1967 139 550 87 186 21 7 2 28 .338   227 70 -42


It is apparent that home run hitters tend to be over their expected RBI and singles hitters under. This probably isn’t a direct effect of home runs as opposed to singles, in the main; it’s caused by the fact that singles hitters bat leadoff and home run hitters in the middle of the order. But that’s not our problem; we’re not looking for some way of figuring expected RBI based on batting order position. We’re just looking for expected RBI based on batting stats. As far as we know, home runs lead to extra RBI.

OK, so let’s cut out the home runs from the herd, and have two weights: Home Runs, and OTB (Other Total Bases. Or Off-Track Betting, whichever you prefer.) What weight should we put on HR, and what weight on OTB?

It is apparent that the weight on a HR cannot be less than 1.24, because below 1.24, we would not be increasing the weight given to home runs; we would be decreasing it. Let’s start with a weight of 1.30 RBI per home run. If we have 1.30 RBI per home run, the weight for a non-home run total base drops to .3031. . .this can be simply derived from the data, of course. So let’s try this formula:
1.30 HR + .3031 OTB = RBI

That reduces the average error of the RBI estimates to 9.31. . .a much larger gain in accuracy than I was expecting, frankly. Let’s try:
1.40 HR + .2961 OTB = RBI

This drops the average error to 9.03. Moving on:
1.50 HR + .2892 OTB = RBI

By the way. . .we all know that Babe Ruth holds the record for Total Bases in a season, 457. But who holds the record for Total Bases, not including Home Runs? The answer appears to be. . .unless it is someone who is not in my study. . .Ty Cobb, 1911. 335 OTB. Anyway. . .our average error on this try drops to 8.79, so we move on:
1.60 HR + .2822 OTB = RBI

That has an error of 8.60. I am now violating one of the cardinal rules of writing up research: Don’t tell them about the process of your research. Tell them about the product of your research. My apologies. . .I was just trying to explain to you how I come up with these formulas. I’ll get to the point. The formula I settled on is
2 HR + .25 OTB = RBI

Which can also be written as:
TB/4 + HR = RBI

This actually is not the most accurate formula I could find; others are a hair more accurate. But this is simple and makes intuitive sense. You drive in other runners at a rate of one RBI per four bases, plus, if you hit a home run, you drive in yourself.
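The trial-and-error search described above can be sketched in Python. This is an illustration under stated assumptions, not the actual procedure: it assumes the OTB weight for each trial is the one that makes total expected RBI equal total actual RBI (which is consistent with the weight pairs quoted above), and it assumes "average error" means the mean absolute error. The four sample rows are copied from the tables earlier in this article; four extreme seasons are far too few to reproduce the 9.31, 9.03, 8.79 and 8.60 figures.

```python
def other_total_bases(hr, tb):
    """Total bases that did not come from home runs (OTB)."""
    return tb - 4 * hr


def balancing_otb_weight(hr_weight, seasons):
    """OTB weight that makes total expected RBI equal total actual RBI."""
    total_rbi = sum(rbi for _, _, rbi in seasons)
    total_hr = sum(hr for hr, _, _ in seasons)
    total_otb = sum(other_total_bases(hr, tb) for hr, tb, _ in seasons)
    return (total_rbi - hr_weight * total_hr) / total_otb


def average_error(hr_weight, otb_weight, seasons):
    """Mean absolute gap between actual and expected RBI."""
    gaps = [
        abs(rbi - (hr_weight * hr + otb_weight * other_total_bases(hr, tb)))
        for hr, tb, rbi in seasons
    ]
    return sum(gaps) / len(gaps)


# (HR, TB, RBI) rows from the tables above; illustrative sample only
sample = [
    (40, 397, 183),  # Hank Greenberg 1937
    (56, 423, 191),  # Hack Wilson 1930
    (2, 258, 27),    # Lloyd Waner 1927
    (2, 271, 33),    # Richie Ashburn 1958
]

for hr_weight in (1.30, 1.40, 1.50, 1.60):
    w = balancing_otb_weight(hr_weight, sample)
    err = average_error(hr_weight, w, sample)
    print(f"{hr_weight:.2f} HR + {w:.4f} OTB -> average error {err:.2f}")
```

Run against the full file of player seasons, a loop like this would be one way to see the error shrink as the home run weight rises, and to compare the simple 2 HR + .25 OTB formula against the slightly more accurate alternatives.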

Based on this formula, we would expect these ten hitters to be the best RBI men within my study:

Player YEAR G AB R H 2B 3B HR RBI Avg   OTB   Ex RBI Error
Babe Ruth 1921 152 540 177 204 44 16 59 171 .378   221   173 2
Sammy Sosa 2001 160 577 146 189 34 5 64 160 .328   169   170 10
Sammy Sosa 1998 159 643 134 198 20 0 66 158 .308   152   170 12
Jimmie Foxx 1932 154 585 151 213 33 9 58 169 .364   206   168 2
Babe Ruth 1927 151 540 158 192 29 8 60 164 .356   177   164 0
Sammy Sosa 1999 162 625 114 180 24 2 63 141 .288   145   162 21
Hack Wilson 1930 155 585 146 208 35 6 56 191 .356   199   162 29
Luis Gonzalez 2001 162 609 128 198 36 7 57 142 .325   191   162 20
Lou Gehrig 1927 155 584 149 218 52 18 47 175 .373   259   159 16
Rogers Hornsby 1922 154 623 141 250 46 14 42 152 .401   282   155 3


They weren’t the ten best RBI men, exactly, but they weren’t too shabby, either, driving in an average of 162 runs.

Remarkably enough, the two biggest RBI over-achievers in my study were both on the same team: Zeke Bonura, 1936 Chicago White Sox (+55.25), and Luke Appling, also 1936 Chicago White Sox (also +55.25). . .actually a third player ties with them, that being Pie Traynor on the 1928 Pirates:

Player YEAR G AB R H 2B 3B HR RBI Avg   TB   Ex RBI Error
Zeke Bonura 1936 148 587 120 194 39 7 12 138 .330   283   82.75 55.25
Luke Appling 1936 138 526 111 204 31 7 6 128 .388   267   72.75 55.25
Pie Traynor 1928 144 569 91 192 38 12 3 124 .337   263   68.75 55.25
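The three-way tie is easy to confirm. The sketch below writes the formula both ways (TB/4 + HR, and 2 HR + .25 OTB) and applies it to the three rows in the table above; the function names are illustrative only.

```python
def expected_rbi(total_bases, home_runs):
    """Expected RBI as TB/4 + HR."""
    return total_bases / 4 + home_runs


def expected_rbi_weighted(total_bases, home_runs):
    """Same formula written as 2 HR + .25 OTB, where OTB = TB - 4 HR."""
    otb = total_bases - 4 * home_runs
    return 2 * home_runs + 0.25 * otb


# (total bases, home runs, actual RBI) from the table above
players = {
    "Zeke Bonura 1936": (283, 12, 138),
    "Luke Appling 1936": (267, 6, 128),
    "Pie Traynor 1928": (263, 3, 124),
}

for name, (tb, hr, rbi) in players.items():
    exp = expected_rbi(tb, hr)
    assert exp == expected_rbi_weighted(tb, hr)  # the two forms agree
    print(f"{name}: expected {exp:.2f}, error {rbi - exp:+.2f}")
```

All three come out at +55.25, and the assertion confirms that the two ways of writing the formula give the same expectation.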


I had never realized that the ’36 White Sox were such a unique team. They had no power hitters, obviously, so they used Bonura and Appling in the middle of the order, hit .292 as a team, drew a lot of walks, and scored six runs a game; somebody had to drive them in. All of the biggest “+ RBI” guys in history are singles hitters who somehow were stranded in the middle of the batting order, while the two biggest “negative RBI” guys are power hitters who happened to hit leadoff:
  1. Felipe Alou, 1966
  2. Alfonso Soriano, 2006
After that the low-RBI guys are a mix of leadoff power hitters and leadoff guys who just didn’t drive in any runs, like Waner in ’27 and Patsy Dougherty in 1904. The ten biggest RBI under-achievers were all guys who scored 100-plus runs, except for Lou Brock, 1966.

So this provides us a first answer to the question: extraordinary gaps in RBI (vs. expected RBI) appear, at first look, to be driven by context.


II. Back To Kaline for a Second

I promised to show that Kaline’s RBI count was out of line. . . .I looked up all players in baseball history who had
  • 190-199 hits
  • 25-29 home runs
  • .300-.330 batting average
There are 19 such players. . .Bobby Thomson, 1949, Yogi Berra, 1950, Frank Robinson, 1957, Willie Mays, 1960, etc. Kaline is in the middle of this group in hits, home runs, in a three-way tie for 8th-9th-10th in total bases—but first in the group in RBI. The other 18 players averaged 101 RBI. By our little formula, Kaline is +19.
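For the record, the two yardsticks give different sizes for Kaline's surplus. A short Python sketch of the arithmetic, using only numbers quoted in this article (the variable names are illustrative):

```python
def expected_rbi(total_bases, home_runs):
    """Expected RBI as TB/4 + HR."""
    return total_bases / 4 + home_runs


KALINE_1956 = {"total_bases": 327, "home_runs": 27, "rbi": 128}
PEER_AVERAGE_RBI = 101  # average of the other 18 comparable players, as quoted

over_peers = KALINE_1956["rbi"] - PEER_AVERAGE_RBI
over_formula = KALINE_1956["rbi"] - expected_rbi(
    KALINE_1956["total_bases"], KALINE_1956["home_runs"]
)

print(f"vs. comparable players: {over_peers:+d}")
print(f"vs. TB/4 + HR:          {over_formula:+.2f}")
```

Against the comparable players he is 27 over; against the formula, which expects 108.75, he is 19.25 over.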


III. The Test Group

OK, I’m not really interested in extraordinary gaps in RBI, which are obviously caused by things like Adam Everett hitting cleanup, but in the more routine discrepancies where somebody just drives in about 20 more runs than you would think they ought to. And, since I then want to check these guys out and see why they drove in more runs than they ought to have, I’ll need to focus on players since 1957, the seasons for which Retrosheet data is now available. And I’ll have to skip 1999. . .sorry, Manny.

I chose 20 “matched sets” of players. I identified 20 players who had unexpectedly high RBI totals. Then, for each player, I identified
  1. a player with very comparable batting stats for the season, but a normal RBI total, and
  2. a player with very comparable batting stats for the season, but a notably LOW RBI total.
These were the 20 sets, listed alphabetically by the Alpha player (the high-RBI man in each set):

George Bell and Andre Dawson were the MVPs in 1987, having similar seasons and leading the two leagues in RBI. In other years, however, they match in other ways. Bell had almost the same stats in 1992 that Andre Dawson had had in 1978—but drove in 40 more runs. We’ll sandwich another guy who didn’t deserve the MVP award either in between them:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
George Bell 1992 155 627 74 160 27 0 25 112 31 97 5 2 .255 262 .294 .418 .712
Don Baylor 1982 157 608 80 160 24 1 24 93 57 69 10 4 .263 258 .329 .424 .754
Andre Dawson 1978 157 609 84 154 24 8 25 72 30 128 28 11 .253 269 .299 .442 .740


Barry Bonds in 1991 had almost the same number of singles, doubles, triples, home runs and at bats that Wally Moon had had in 1957, but whereas Moon drove in only 73 runs and had a normal-sized head, Bonds drove in 116:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Barry Bonds 1991 153 510 95 149 28 5 25 116 107 73 43 13 .292 262 .410 .514 .924
Gordy Coleman 1961 150 520 63 149 27 4 26 87 45 67 1 3 .287 262 .341 .504 .845
Wally Moon 1957 142 516 86 152 28 5 24 73 62 57 5 6 .295 262 .367 .508 .875
Barry Bonds 1988 144 538 97 152 30 5 24 58 72 82 17 11 .283 264 .368 .491 .859


An irony here being that Bonds had another year, 1988, in which he also had about the same numbers—but drove in even fewer runs than Moon, and half as many as he would in ’91. We all know what this proves: driving in runs is proof of steroid use. No, seriously. . .Bonds in ’88 was a young player and a leadoff hitter. By ’91, although he was essentially the same player, he had been moved to the middle of the order and had begun to irritate journalists in a serious way. But Wally Moon wasn’t a leadoff hitter, so that’s something else. . .we’ll look at what in the next section of the study.

Jeff Burroughs won the MVP Award in ’74, essentially because he drove in a lot of runs for a team that had a surprisingly good season. Whether he deserved the MVP Award is another question, but Burroughs drove in 42 runs more than did Ellis Valentine in ’78, having essentially the same season:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Jeff Burroughs 1974 152 554 84 167 33 2 25 118 91 104 2 3 .301 279 .397 .504 .901
Chipper Jones 2003 153 555 103 169 33 2 27 106 94 83 2 2 .305 287 .402 .517 .920
Ellis Valentine 1978 151 570 75 165 35 2 25 76 35 88 13 8 .289 279 .330 .489 .820


This article isn’t about clutch hitting, but Joe Carter was a famous clutch hitter for a lot of reasons, among them his RBI numbers, even when his other stats were down:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Joe Carter 1990 162 634 79 147 27 1 24 115 48 93 22 6 .232 248 .290 .391 .681
Eric Karros 1993 158 619 74 153 27 2 23 80 34 82 0 1 .247 253 .287 .409 .696
Max Alvis 1965 159 604 88 149 24 2 21 61 47 121 12 8 .247 240 .308 .397 .706


Donn Clendenon, Attorney at Law, joined the Mets in the middle of their miracle, hitting cleanup for them much of September ’69 although he never got much attention for it among Seaver, Koosman, Agee, Cleon, Hodges, and the great Al Weis. In 1970 Clendenon just had a remarkable season, driving in almost 100 runs in fewer than 400 at bats, despite not really hitting at a fantastic level. Sixto Lezcano had about the same hitting numbers later in the decade, but drove in half as many runs:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Donn Clendenon 1970 121 396 65 114 18 3 22 97 39 91 4 1 .288 204 .348 .515 .863
Eddie Murray 1981 99 378 57 111 21 2 22 78 40 43 2 3 .294 202 .360 .534 .895
Sixto Lezcano 1977 109 400 50 109 21 4 21 49 52 78 6 5 .273 201 .358 .503 .861


Chili Davis had an interesting career. He was hyped so much as a young player that he spent several years being considered a major disappointment, although, in retrospect, he wasn’t half bad. He hit .300 three times, usually hit around .280, hit 30 homers once, 29 once, 28 once, 27 once, 26 once. But he drove in 100 runs only once—in a season when he hit just .243 and had easily the highest strikeout total of his career. Don Baylor, who had a similar career to Davis’ and who also drove in 100 runs only once although he was famous as an RBI man, is on the other end of this set:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Chili Davis 1993 153 573 74 139 32 0 27 112 71 135 4 1 .243 252 .327 .440 .767
Greg Walker 1987 157 566 85 145 33 2 27 94 75 112 2 1 .256 263 .346 .465 .810
Don Baylor 1977 154 561 87 141 27 0 25 75 62 76 26 12 .251 243 .334 .433 .768


Tommy Davis in 1962 drove in more runs in a season than any other player between 1950 and 1997—and did it without hitting either 30 home runs or 30 doubles. He won the batting title and had 230 hits, but with only 27 homers and only 27 doubles—remarkably low totals for a player driving in 153 runs.

In a way, I really should not have included this set of players in the study, since they violate a few unwritten rules underlying the exercise. First, I tried to avoid including players in the study if I actually understood why their RBI count was out of line. I tried to avoid including in the study power hitters who were used as leadoff men, for example, or players who hit in the middle of the order because they usually hit 25 homers a year but just happened to have seasons when they only hit 8. I avoided using Tommy Herr’s 1985 and 1987 seasons, when he had inflated RBI totals because he was batting behind Vince Coleman, who was stealing 100 bases a year.

Also, I tried to avoid including in the study players whose batting stats were so unusual that they were hard to match. I wouldn’t include Mark McGwire, 1996, for example, even though his RBI total is clearly wrong for his numbers (.312 with 52 homers, 113 RBI), because his overall batting stats are so odd that you can’t really find a good match for him.

Davis violates both of these rules. His inflated RBI count is also, in part, obviously explained by batting behind a player stealing 100+ bases, and his numbers are so unusual that they are hard to match. Felipe Alou, 1966, also violates the first rule, because I certainly know that he was a leadoff hitter that year, and that this largely explains his low RBI total.

I decided to include them anyway, because:
  1. Davis in ’62 is in a sense the epitome of the type of player I am interested in—the guy who has an RBI total which is just obviously out of line with what you would expect,
  2. these players were part of my childhood, and I am interested in them, and
  3. although I am quite interested in both seasons, I had never before noticed that Felipe Alou in ’66, with 74 RBI, and Davis in ’62, with 153, actually have quite similar numbers. By the formula, one might have expected Alou to have had more RBI (120) than Davis (116).

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Tommy Davis 1962 163 665 120 230 27 9 27 153 33 65 18 6 .346 356 .374 .535 .910
Kirby Puckett 1988 158 657 109 234 42 5 24 121 23 83 6 7 .356 358 .375 .545 .920
Felipe Alou 1966 154 666 122 218 32 6 31 74 24 51 5 7 .327 355 .361 .533 .894
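To see how a matched set looks through the TB/4 + HR formula, here is a small Python sketch applied to the three rows above. The "high", "normal" and "low" labels and the function names are illustrative assumptions; the stats come from the table.

```python
def expected_rbi(total_bases, home_runs):
    """Expected RBI as TB/4 + HR."""
    return total_bases / 4 + home_runs


# (label, player, total bases, home runs, actual RBI)
matched_set = [
    ("high", "Tommy Davis 1962", 356, 27, 153),
    ("normal", "Kirby Puckett 1988", 358, 24, 121),
    ("low", "Felipe Alou 1966", 355, 31, 74),
]


def set_report(players):
    """Return (label, player, expected RBI, actual minus expected) rows."""
    rows = []
    for label, name, tb, hr, rbi in players:
        exp = expected_rbi(tb, hr)
        rows.append((label, name, exp, rbi - exp))
    return rows


for label, name, exp, diff in set_report(matched_set):
    print(f"{label:>6}  {name:<20} expected {exp:6.2f}  diff {diff:+6.2f}")
```

The expectations are 116.00, 113.50 and 119.75, so the three seasons are nearly interchangeable by the formula, while the differences run from +37.00 for Davis to -45.75 for Alou.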


Pedro Guerrero hit 27 to 33 home runs four times in his career, but set a career high for RBI in 1989, hitting only 17 home runs, although it was still a very good year. Actually, it was about the same year that Carl Yastrzemski had had in 1963, driving in 68 runs:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Pedro Guerrero 1989 162 570 60 177 42 1 17 117 79 84 2 0 .311 272 .391 .477 .868
Chet Lemon 1979 148 556 79 177 44 2 17 86 56 68 7 11 .318 276 .391 .496 .887
Carl Yastrzemski 1963 151 570 91 183 40 3 14 68 95 72 8 5 .321 271 .418 .475 .894


In 1998 Superman’s older brother Jeff Kent played 137 games, batted only 526 times, hit .297. He did hit 31 homers, but one would have expected him to get over 100 RBI—maybe; actually my formula says 104. He drove in 128. The Crime Dog had done basically the same thing ten years earlier, but drove in 82:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Jeff Kent 1998 137 526 94 156 37 3 31 128 48 110 9 4 .297 292 .359 .555 .914
Tino Martinez 1995 141 519 92 152 35 3 31 111 62 91 0 0 .293 286 .369 .551 .920
Fred McGriff 1988 154 536 100 151 35 4 34 82 79 149 6 1 .282 296 .376 .552 .928


When you hit .238 with only 129 hits you don’t really expect to drive in 112 runs even if you hit 40 homers—but Jeff King did it hitting 28.

King was a funny guy. . .he was a good guy, but he was perhaps the most negative major league player since Pink Hawley. I was talking to a couple of ex-Royals about the 1999 team; one guy claimed he was standing next to Jeff King during the opening-day ceremonies. “Man,” said King. “I hate this song.”

“Are you crazy?” said the other guy. “This is the National Anthem. How can you hate the National Anthem?”

“Every time they play this song,” King explained, “I have a bad day.”

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Jeff King 1997 155 543 84 129 30 1 28 112 89 96 16 5 .238 245 .341 .451 .792
Frank Thomas 2002 148 523 77 132 29 1 28 92 88 115 3 0 .252 247 .361 .472 .834
Ron Cey 1980 157 551 81 140 25 0 28 77 69 92 2 2 .254 249 .342 .452 .794


Darren Lewis in ’96 had a season that you could easily miss, because he batted only 337 times and hit only .228 with no power, yet somehow he drove in 53 runs. It’s not a lot of RBI, but if you only bat 337 times and hit .228 with no power, that’s a lot of RBI. Just ask Jimmy Piersall:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Darren Lewis 1996 141 337 55 77 12 2 4 53 45 40 21 5 .228 105 .321 .312 .632
Jimmy Piersall 1959 100 317 42 78 13 2 4 30 25 31 6 3 .246 107 .303 .338 .641
Larry Brown 1966 105 340 29 78 12 0 3 17 36 58 0 1 .229 99 .309 .291 .600


Tino Martinez in 1998 and J. T. Snow in 1997 were both first basemen, both Gold Glove-type fielders. They had the same number of at bats (531), same hits (149), same homers (28). The differences between them are few and subtle, but Martinez drove in 123 runs, Snow 104.

Snow walked 96 times, Martinez 61. One can say, about this, that Snow didn’t drive in runs because he was a guy who wouldn’t expand the strike zone in an RBI situation. But if that’s the explanation, what do you say about Jacque Jones in 2006? He expanded the strike zone too much?

That’s the reason I’m printing all these stats, to give you a fair chance to look for the small differences between the players that might explain the RBI discrepancies:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Tino Martinez 1998 142 531 92 149 33 1 28 123 61 83 2 1 .281 268 .355 .505 .860
J.T. Snow 1997 157 531 81 149 36 1 28 104 96 124 6 4 .281 271 .387 .510 .898
Jacque Jones 2006 149 533 73 152 31 1 27 81 35 116 9 1 .285 266 .334 .499 .833


Lee May and George Scott were profoundly similar hitters, although May couldn’t carry Scott’s glove. Like Barry Bonds earlier, Johnny Bench, the “C” player in this study, also had a very similar season in which he did drive in runs:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Lee May 1976 148 530 61 137 17 4 25 109 41 104 4 1 .258 237 .312 .447 .759
George Scott 1971 146 537 72 141 16 4 24 78 41 102 0 3 .263 237 .317 .441 .758
Johnny Bench 1971 149 562 80 134 19 2 27 61 49 83 2 1 .238 238 .299 .423 .722
Johnny Bench 1973 152 557 83 141 17 3 25 104 83 83 4 1 .253 239 .345 .429 .774


Bench outshone the other outstanding catchers who were born in 1947, one of whom was Thurman Munson. But Munson had his moments, and he drove in some runs:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Thurman Munson 1975 157 597 83 190 24 3 12 102 45 52 3 2 .318 256 .366 .429 .795
Mark Loretta 2003 154 589 74 185 28 4 13 72 54 62 5 4 .314 260 .372 .441 .814
Brady Clark 2005 145 599 94 183 31 1 13 53 47 55 10 13 .306 255 .372 .426 .798


One year when I was a kid, Floyd Robinson drove in 109 runs with only 11 homers. That season, like Tommy Davis the same year, had a huge impact on me. I can still remember the “experts” at that time—the radio guys, who passed on their wisdom with the assurance of oracles—talking about this, saying that this proved that a line drive hitter could be a good RBI guy if he got the chance.

That was Robinson’s first full year as a regular—he had played his way into the lineup in ’61—and I, as a 12-year-old kid, fully expected that Robinson would be a consistent player of this type, hitting 40+ doubles and using them to drive in 100+ runs year after year. After all, the experts said that he had proven that he could do this.

He could never do it again, however, and I always wondered why. Kind of obsessively. Since then we have had periodic players who have had very similar seasons to this: 10 homers, 45 doubles, 100+ RBI, All-Star attention. But none of them have ever been able to repeat it. Wes Parker in ’70 hit .319 with 47 doubles, only 10 homers, but 111 RBI. But the next season, like Robinson, he just couldn’t do it; he dropped off by about 50 RBI. Actually, as Parker’s ’70 season is much like Robinson’s ’62 season, so Parker’s ’71 looks much like Robinson’s ’63. Keith Hernandez had a very similar season in ’79, winning half of an MVP Award with 48 doubles, 11 homers, 105 RBI, .344 average. Hernandez was a better player than Floyd Robinson or Parker, but, although he hit more home runs later on (16, 18, 15, etc.), he never again drove in 100 runs.

Jeff Cirillo, 2000. Cirillo, aided some by Coors Field, upped the doubles ante to 53, 11 homers like the other guys, .326 average, 115 RBI. But, like the other guys, he was unable to sustain it. He hit .326 again the next year, with more homers but with only 83 RBI, and then basically evaporated.

On back to Adam Comorosky and Joe Vosmik, Edgar Renteria in 2003, Willie McGee in ’87, hitters of this type may drive in 100 runs, and people will always think if they have done it once they can do it again, but they never do. It’s one of those things. . .it fascinates me because I don’t really understand it, but I sort of understand it. Nobody is actually good enough to drive in 100 runs by finding gaps in the outfield. One year you may hit 45 balls in the gap, and you look great doing that, but it’s not a real skill; you can’t really rely on hitting line drives where the fielders aren’t playing. And, even if you can do it again, you still can’t get the hits you need in RBI situations. It’s like the year that Derek Lowe had in 2002: it’s wonderful, but you can never do it again because you can never get the ground balls to go right at people like they did that year.

Anyway, the Floyd Robinson set:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Floyd Robinson 1962 156 600 89 187 45 10 11 109 72 47 4 2 .312 285 .384 .475 .859
Robin Yount 1988 162 621 92 190 38 11 13 91 63 63 22 4 .306 289 .369 .465 .834
Shannon Stewart 2001 155 640 103 202 44 7 12 60 46 72 27 10 .316 296 .371 .462 .834


In 1969 Ron Santo and Le Grand Orange (Rusty Staub) were playing in the same division, the NL East, and had almost the same batting stats—except that Santo drove in a lot more runs:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Ron Santo 1969 160 575 97 166 18 4 29 123 96 97 1 3 .289 279 .384 .485 .869
Ken Boyer 1959 149 563 86 174 18 5 28 94 67 77 12 6 .309 286 .384 .508 .892
Rusty Staub 1969 158 549 89 166 26 5 29 79 110 61 3 4 .302 289 .426 .526 .952


Staub had a bad RBI year in 1969, but he had a real good one in 1978, driving in 42 more runs with fewer homers and fewer total bases:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Rusty Staub 1978 162 642 75 175 30 1 24 121 76 35 3 1 .273 279 .347 .435 .782
Roy Smalley 1979 162 621 94 168 28 3 24 95 80 80 2 3 .271 274 .353 .441 .794
Cal Ripken 1986 162 627 98 177 35 1 25 81 70 60 4 2 .282 289 .355 .461 .816


Not that Mike Sweeney is headed for the Hall of Fame or anything, but in 2000 Sweeney had the same number of homers and the same number of total bases that Frank Robinson had had in 1957, but drove in almost twice as many runs. In fact, Sweeney drove in more runs that summer than either Frank Robinson or Steve Garvey ever did:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Mike Sweeney 2000 159 618 105 206 30 0 29 144 71 67 8 3 .333 323 .407 .523 .930
Steve Garvey 1979 162 648 92 204 32 1 28 110 37 59 3 6 .315 322 .351 .497 .848
Frank Robinson 1957 150 611 97 197 29 5 29 75 44 92 10 2 .322 323 .376 .529 .905


Miguel Tejada in 2004 had 203 hits, 40 doubles, 34 homers. . .great numbers, but you would have expected him to drive in 121 runs. He drove in 150.
I paired Tejada with Cecil Cooper in 1982, who had similar numbers but did drive in 121, and Billy Williams in 1964, who had almost the same numbers but missed the 100-RBI level:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Miguel Tejada 2004 162 653 107 203 40 2 34 150 48 73 4 1 .311 349 .360 .534 .894
Cecil Cooper 1982 155 654 104 205 38 3 32 121 32 53 2 3 .313 345 .342 .528 .870
Billy Williams 1964 162 645 100 201 39 2 33 98 59 84 10 7 .312 343 .370 .532 .901


Joe Torre in 1971 hit nothing but line drives, picking up an MVP Award for driving in 137 runs. He doesn’t have a lot of good matches, and maybe, like Tommy Davis in ’62, I should have excluded him from the study. But I didn’t:

Player YEAR G AB R H 2B 3B HR RBI BB SO SB CS Avg TB OBA SPct OPS
Joe Torre 1971 161 634 97 230 34 8 24 137 63 70 4 1 .363 352 .421 .555 .976
Cecil Cooper 1980 153 622 96 219 33 4 25 122 39 42 17 6 .352 335 .387 .539 .926
Darin Erstad 2000 157 676 121 240 39 6 25 100 64 82 28 8 .355 366 .409 .541 .951



IV. Test Group Summary Data

Taken as individual comparisons, the players in these 20 matched sets are similar but there are small differences between them in walks, strikeouts, at bats, etc., and large differences between them in RBI. Taken as groups, the groups are similar but there are really small differences in the other categories, and really large differences in RBI.

The Alpha players—the RBI men—had 471 homers, 584 doubles, 60 triples, but drove in 2,351 runs, which is 118 RBI apiece.

The Beta group—the middle group—had 468 homers, 595 doubles, 62 triples, but drove in only 1,865 runs, which is 93 apiece.

The Charlie group—the girlie men who didn’t drive in runs—had 471 homers, 596 doubles, 64 triples, but drove in only 1,412 runs, or 71 apiece. (Incidentally, the Retrosheet data accounted for all of the RBI for all of the players in both the Alpha and Charlie groups. There was a little bit of missing data in the Beta group.)

There are some differences between the Alpha and Charlie groups. . .the familiar problem of quality leakage. Even though you try to control the quality of the players selected, you never quite make it. The Alpha players did have more walks than the Charlie group (63-56 on average) and fewer strikeouts (81-84). The Charlie group, because it contained players used more often in the leadoff spot, did score more runs (87-85) and steal more bases (11-9). These are very small differences.

The Alpha group did have:
  1. Many, many more sacrifice flies (182 to 65, or 9 to 3 on average. The Beta group had 118 sac flies, an average of 6.)
  2. Significantly more GIDP. The GIDP totals were 331 for the Alpha group (average 17), 260 for the Beta group (13), 252 for the Charlie group (13).
  3. More intentional walks. The Alpha group had 176 IBB (average of 9), the Beta group had 160 (8), and the Charlie group had 106 (5).
The Alpha Group had a .478 slugging percentage, the Beta group .481, the Charlie Group .478. The Alpha Group hit .291; the Charlie Group hit .290. There are other differences between the groups, but these differences are infinitesimal compared to the separation of almost 1,000 RBI between the Alpha and the Charlie players.
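
The per-player averages quoted in this section follow directly from the group totals, since each group has 20 players. Here is a quick check in Python; the dictionary layout and the variable names are just for illustration, and the numbers are the ones given in the text.

```python
# Group totals as quoted in the text; each group contains 20 player seasons.
PLAYERS_PER_GROUP = 20

totals = {
    "Alpha":   {"RBI": 2351, "SF": 182, "GIDP": 331, "IBB": 176},
    "Beta":    {"RBI": 1865, "SF": 118, "GIDP": 260, "IBB": 160},
    "Charlie": {"RBI": 1412, "SF": 65,  "GIDP": 252, "IBB": 106},
}

# Per-player averages, rounded to the nearest whole number as in the text.
averages = {
    group: {stat: round(value / PLAYERS_PER_GROUP) for stat, value in stats.items()}
    for group, stats in totals.items()
}

# Total RBI separation between the Alpha and Charlie groups.
rbi_gap = totals["Alpha"]["RBI"] - totals["Charlie"]["RBI"]

for group, stats in averages.items():
    print(group, stats)
print("Alpha minus Charlie RBI:", rbi_gap)
```

The gap comes out to 939 RBI, which is the "almost 1,000" referred to above.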


V. Trait Retention

Does the characteristic of driving in more runs than you would have been expected to drive in tend to “stay with” a player in the following season, or does it disappear?

It tends to be retained at about 40% of the original strength. In other words, if you take a group of players who drive in an average of 25 runs more than expected in year X, they will probably average about 10 more RBI than expected in year X+1. I did not study whether driving in fewer runs than expected was also a retained trait.
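
As a rough sketch of what that retention rate means in practice, assuming the 40% figure applies as a simple multiplier (the function name and the constant are mine, not part of the study):

```python
RETENTION = 0.40  # approximate share of the RBI surplus retained, per the text

def expected_carryover(surplus_rbi, retention=RETENTION):
    """Expected RBI above expectation in year X+1, given the surplus in year X."""
    return retention * surplus_rbi

# A group averaging 25 RBI over expectation in year X.
print(expected_carryover(25))
```

This only describes players who exceeded their expected RBI; the study did not check whether the same multiplier holds for under-achievers.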


VI. Career Data

The career data isn’t especially interesting. . . just shows persistent biases in the formula of which one would generally be aware anyway. The biggest career over-achievers in RBI vs. expected RBI are all 19th century guys. . . Cap Anson, +767, Jake Beckley, +453, George Davis, +450, etc. The top over-achiever since 1900 is Pie Traynor, who hit third for years despite not hitting home runs, followed by Joe Cronin, who is sort of the same (and who could well have been Traynor’s teammate for half of his career.) Then Bobby Veach, Ty Cobb, Al Simmons, Sam Crawford, Harry Heilmann. . . a bunch of dead ball era and 1920s stars. The top player who is essentially post-World War II is Enos Slaughter (+235, 46th place), and the top player more recent than Slaughter is Vic Wertz (+196, 78th place). Ted Simmons (+193) is the only player who is fully within the last 50 years who is among the top 100. . . the only one who has noticeably more RBI than one would expect. After him the next player from the last two or three generations is Rusty Staub (+128). . . a lot of Rusty Staub notes here.

On the other end are lead-off hitters who had a little power, and one interesting name:

1. Rickey Henderson -329
2. Lou Brock -309
3. Pete Rose -284
4. Willie Mays -274
5. Craig Biggio -268


I wouldn’t be inclined to draw any conclusions from that.


VII. Analysis of RBI Differentials
Of Players in Study Group

Getting finally to the question under review, which is: why did these players over-achieve in the RBI column? 42% of the separation in RBI between these two groups of equally productive players was caused by the fact that the RBI men were more productive in RBI situations (and less productive in non-RBI situations.) Actually, .4217. 37% of the separation in RBI between these two groups was caused by the fact that the RBI men had more RBI opportunities presented to them. Actually, .3749. 20% of the separation in RBI between these two groups was caused by the interaction of these two causes. Actually, .2034.
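
The text does not spell out how the three percentages were computed. One common way to split a gap like this, shown here only as an assumption about the method, treats RBI as rate times opportunities and breaks the difference into a performance piece, an opportunity piece, and an interaction piece. The inputs below are hypothetical round numbers, not data from the study.

```python
def decompose(rate_lo, opp_lo, rate_hi, opp_hi):
    """Split the RBI gap between two groups into three pieces.

    rate_* is RBI per opportunity, opp_* is the number of opportunities.
    The three pieces add up to rate_hi * opp_hi - rate_lo * opp_lo.
    """
    d_rate = rate_hi - rate_lo
    d_opp = opp_hi - opp_lo
    performance = d_rate * opp_lo   # better hitting, same opportunities
    opportunity = rate_lo * d_opp   # same hitting, more opportunities
    interaction = d_rate * d_opp    # better hitting in the extra opportunities
    return performance, opportunity, interaction

# Hypothetical inputs, NOT taken from the study.
perf, opp, inter = decompose(rate_lo=0.25, opp_lo=100, rate_hi=0.35, opp_hi=140)
total_gap = 0.35 * 140 - 0.25 * 100

for name, value in [("performance", perf), ("opportunity", opp), ("interaction", inter)]:
    print(f"{name}: {value:.1f} RBI ({value / total_gap:.1%} of the gap)")
```

The interaction piece exists because the high-RBI group gets both advantages at once: the extra opportunities are themselves converted at the higher rate.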

Allocating the interactive effects in the same proportion as the direct effects, 53% of the separation in RBI is caused by hitting better in RBI situations (.5294), and 47% is caused by having more RBI opportunities to work with (.4706). The “unexpected RBI” effect is essentially evenly attributable to these two causes.
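
The allocation in the last paragraph is simple arithmetic, and can be checked from the three shares given above. The variable names here are mine:

```python
# Shares of the RBI separation, as given in the text.
performance = 0.4217   # hitting better in RBI situations
opportunity = 0.3749   # having more RBI opportunities
interaction = 0.2034   # interaction of the two

# Hand the interaction out in proportion to the two direct effects.
direct = performance + opportunity
performance_share = performance + interaction * performance / direct
opportunity_share = opportunity + interaction * opportunity / direct

print(round(performance_share, 4))
print(round(opportunity_share, 4))
```

Because the three original shares sum to 1, this is the same as dividing each direct effect by the sum of the two direct effects.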

The players who had high RBI totals had:
5,588 at bats with the bases empty,
5,727 at bats with men on base,
and 3,458 at bats with runners in scoring position.

The players who had low RBI totals had:
6,697 at bats with the bases empty,
4,607 at bats with men on base,
and 2,544 at bats with runners in scoring position.

I will just leave out the data for the Beta group, because it is essentially non-instructive.
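
To put the opportunity gap in proportional terms, here is a small sketch that converts those at-bat counts into shares. It assumes that the at bats with runners in scoring position are a subset of the at bats with men on base, so that bases empty plus men on base gives the group total; the text does not say so explicitly, but the counts are consistent with it.

```python
# At-bat counts for the two groups, as given in the text.
at_bats = {
    "high RBI": {"empty": 5588, "men_on": 5727, "risp": 3458},
    "low RBI":  {"empty": 6697, "men_on": 4607, "risp": 2544},
}

shares = {}
for group, ab in at_bats.items():
    # Assumption: RISP at bats are already counted inside the men-on total.
    total = ab["empty"] + ab["men_on"]
    shares[group] = {
        "men_on": round(ab["men_on"] / total, 3),
        "risp": round(ab["risp"] / total, 3),
    }

# Extra at bats with runners in scoring position for the high-RBI group.
extra_risp = at_bats["high RBI"]["risp"] - at_bats["low RBI"]["risp"]

print(shares)
print("Extra RISP at bats:", extra_risp)
```

Under that assumption the two groups have almost the same number of total at bats, but the high-RBI group took about half of theirs with men on base, against about 41% for the low-RBI group.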

The players with good RBI totals averaged:
.262 with the bases empty,
.318 with men on base (which would be .309 if the Sacrifice Flies were counted as at bats),
.335 with runners in scoring position (.324 adjusted for the Sac Flies),
.374 with the bases loaded (.324 adjusted for the Sac Flies.)

The players with poor RBI totals hit:
.299 with the bases empty,
.280 with men on base (which would be .276 if we counted the Sac Flies as at bats),
.262 with runners in scoring position (.255, Sac Fly adjusted),
.235 with the bases loaded (.221 adjusted).

The RBI men homered 30% more often with men on base than with the bases empty—36 home runs per 1000 at bats with the bases empty, 47 with men on base.

The poor-RBI group homered 19% less often with men on base—45 homers per 1000 at bats with the bases empty, 36 with men on base.
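
The percentages can be approximately recovered from the rounded per-1000 rates. The rounded rates give roughly +31% and -20% rather than exactly 30% and 19%; presumably the quoted figures came from unrounded rates, though that is an assumption on my part.

```python
def relative_change(with_men_on, bases_empty):
    """Relative change in home run rate with men on base vs. bases empty."""
    return with_men_on / bases_empty - 1

# Home runs per 1000 at bats, as rounded in the text.
rbi_men = relative_change(with_men_on=47, bases_empty=36)
poor_rbi = relative_change(with_men_on=36, bases_empty=45)

print(f"RBI men: {rbi_men:+.1%}")
print(f"Poor-RBI group: {poor_rbi:+.1%}")
```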

The RBI men averaged:
.934 RBI per plate appearance when the bases were loaded,
.391 RBI per plate appearance when there were runners in scoring position but the bases were not loaded,
.123 RBI per plate appearance when there was a runner on first base only,
.033 RBI per plate appearance when the bases were empty.

The poor-RBI group averaged:
.579 RBI per plate appearance when the bases were loaded,
.276 RBI per plate appearance when there were runners in scoring position but the bases were not loaded,
.097 RBI per plate appearance when there was a runner on first base only,
.041 RBI per plate appearance when the bases were empty.
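
One way to read those two lists side by side is as a ratio, situation by situation. This is only a convenience sketch using the rates quoted above; the labels are mine.

```python
# RBI per plate appearance by base situation, as given in the text.
situations = ["bases loaded", "RISP, not loaded", "runner on first only", "bases empty"]
rbi_men = [0.934, 0.391, 0.123, 0.033]
poor_rbi = [0.579, 0.276, 0.097, 0.041]

# Ratio of the high-RBI rate to the poor-RBI rate in each situation.
ratios = {
    situation: round(hi / lo, 2)
    for situation, hi, lo in zip(situations, rbi_men, poor_rbi)
}

for situation, ratio in ratios.items():
    print(f"{situation}: {ratio}")
```

The advantage grows with the number of runners available to drive in, and reverses with the bases empty, where the poor-RBI group actually did better.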


VIII. As Individuals

Almost without exception, every hitter that we selected as having a “good” RBI total had a higher batting average with men on base than with the bases empty, and a higher batting average with runners in scoring position than with men on base. There are just a few small exceptions to that rule, among these 20 players.

In general, the “non-RBI” men had the opposite pattern, although it is slightly less notable among the non-RBI men.
Thus, if you see a player who is +20 in RBI vs. expected RBI, it appears that one can assert with little risk of error that that player has hit well with runners in scoring position.

In this study, every RBI man (without exception) had more at bats with runners in scoring position than did his “non-RBI” match.

A few of the more notable cases:

Chili Davis in 1993 hit .182 with the bases empty (287 at bats), but .325 with runners in scoring position (163 at bats.)

Pedro Guerrero in 1989 hit .400 with runners in scoring position (68/170).

Darren Lewis in 1996 hit .177 with the bases empty, but .243 with men on first base only, .318 with runners in scoring position, and .500 with the bases loaded (6/12).

Larry Brown in 1966, who had essentially the same batting stats as Lewis except for RBI, hit .261 with the bases empty, but .200 with men on first, and .149 with runners in scoring position.

Lee May in 1976 hit .181 with the bases empty, but .350 with men on base.

Frank Robinson in 1957 hit .351 with the bases empty, but .252 with runners in scoring position. And he hit 21 of his 29 homers with the bases empty (57% of his at bats came with the bases empty).

Andre Dawson in 1978, despite having a poor RBI season, actually hit for a better average with runners in scoring position (.252) than with the bases empty (.242). But almost all of his power (18 of 25 homers) was with the bases empty.

Felipe Alou, who had the odd stats as a leadoff man in ’66, actually did hit better with runners on base (.354) and in scoring position (.359) than with the bases empty (.325). His shortfall in RBI is entirely caused by a lack of RBI opportunities. Tommy Davis, who had comparable batting stats to Alou but twice the RBI, had 213 at bats with runners in scoring position—an almost fantastic number. Alou had 103.

Miguel Tejada (2004) is interesting. Tejada did hit 22 of his 34 homers with men on base, which is a very high percentage even for the high-RBI group. But most of these players cleaned up bases-loaded situations. Tejada, almost unbelievably, went 2-for-17 in bases-loaded situations (2-for-21 if you count the sac flies as at bats). He did drive in 14 runs in those situations.

Bill James
Brookline, Massachusetts
June 9, 2007
 
 

COMMENTS (1 Comment)

bbbilbo
I know it wasn't where you were going, but I can't help wanting to see:
The Team OBA for each player's year.
The OBA of the three hitters who usually batted in front of each player.
Is that available?
2:06 PM Apr 11th
 
 