This study begins by asking what is essentially an archaic question, albeit an archaic question to which I want to know the answer. The question is, “When a player drives in more runs than you would expect, or fewer runs than you would expect, to what extent should we suspect that this is because he hit well in RBI situations, and to what extent should we suspect that this happens just because he had a lot of RBI opportunities?” Let’s take Al Kaline, 1956. Kaline hit .314 with 27 homers, 128 RBI. A player who hits .314 with 27 home runs would not ordinarily drive in 128 runs; I’ll document that later, but one knows it immediately by a baseball fan’s intuition. Did he hit .400 that year with runners in scoring position, or did he drive in all of those runs that year because he happened to hit in a lot of RBI situations?
As I said, it’s sort of an archaic question because, in the modern world, it’s easy to find out what a player has hit with runners in scoring position, and how many times he has batted with runners in scoring position. It’s still an interesting question, at least to me, so I’m still going to chase it.
I. Establishing Expectations
In order to study whether deviations from expected RBI occur more because of performance or more because of opportunity, we first have to establish how many RBI a player should be expected to have, given his other stats. How do we do that?
We can assume, as a starting point, that RBI must bear some predictable relationship to total bases. I have a spreadsheet that has in it the batting stats of about 80% of the regular hitters in baseball history. . .in a few weeks I’ll have everybody, but that doesn’t really matter, and I’m interested in this today. I took that spreadsheet, and I eliminated from it all the players who had fewer than 250 Plate Appearances, because frankly they’re just a nuisance for the present study. This left me with a file of 20,806 player seasons.
I then eliminated all the players from the 1882-1890 era for whom we have no RBI counts. This left me with 20,355 players, with RBI ranging from 191 to 5 (Ernie Fazio, 1963), and total bases ranging from 457 to 35 (Bill Bergen, 1911). The players in the study averaged 55 RBI and 176 Total Bases, so they averaged .31 RBI per base--.311580, if you want to be technical about it. Let’s call it .3116.
We can thus make a first estimate of each player’s expected RBI by simply assuming that his RBI should be .3116 times his total bases. Al Kaline, 1956, had 327 total bases, so we would expect that he would have 102 RBI. Since he actually drove in 128, we have an error of 26. By this first-cut method, we have a total error of 197,964 RBI for all players, an average error of 9.72. By this method, the biggest RBI over-achiever in my data sample was Sam Thompson, 1887. Thompson had 311 total bases, which should have led to 97 RBI. He actually had 166, or +69. . .an error of 69. The biggest under-achiever was Lloyd Waner, 1927. Waner had 258 total bases, which should have led to 80 RBI, but he actually drove in only 27.
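As a rough sketch of this first-cut method, the arithmetic can be written out in a few lines of Python. The function names are made up for illustration; the .3116 rate and the three player lines come from the paragraph above.

```python
# First-cut estimate: expected RBI = .3116 * total bases.
RBI_PER_TOTAL_BASE = 0.3116  # all-seasons rate quoted in the text


def first_cut_expected_rbi(total_bases: int) -> float:
    """Expected RBI under the single-constant model."""
    return RBI_PER_TOTAL_BASE * total_bases


def first_cut_error(total_bases: int, actual_rbi: int) -> float:
    """Actual RBI minus expected RBI (positive = over-achiever)."""
    return actual_rbi - first_cut_expected_rbi(total_bases)


# (total bases, actual RBI) as given in the text
seasons = {
    "Al Kaline 1956": (327, 128),
    "Sam Thompson 1887": (311, 166),
    "Lloyd Waner 1927": (258, 27),
}

for name, (tb, rbi) in seasons.items():
    expected = first_cut_expected_rbi(tb)
    print(f"{name}: expected {expected:.0f}, actual {rbi}, "
          f"error {first_cut_error(tb, rbi):+.0f}")
```

Run as written, this should give 102 expected RBI for Kaline, 97 for Thompson and 80 for Waner, matching the figures above.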
We can do a little bit better than that pretty easily. The next question was, “Is this figure fairly constant over time, or does it go up and down radically as the game changes?”
Answer: It’s reasonably constant except for the 1890s. It does go up and down with run scoring levels, but, except for the 1890s, it stays within an acceptable range. . .what seems to me like an acceptable range. This is the norm for each decade in baseball history:
1870s .307 | 1920s .317 | 1970s .301
1880s .329 | 1930s .327 | 1980s .302
1890s .371 | 1940s .315 | 1990s .316
1900s .299 | 1950s .310 | 2000s .311
1910s .295 | 1960s .296 |
It goes up and down some, but except for the 1890s it stays within a range from .295 RBI per total base to .329. I think I’m going to deal with the aberrant data for the 1880s/1890s by simply throwing the 19th century out of the study. The average for all of baseball history since 1900 is .3079, but in any case we will continue to treat this as a constant throughout baseball history.
Our aggregate error now is 175,877 RBI. . .see, we’re already making progress. That’s not real progress, of course; we’ve reduced the aggregate error by shrinking the study, but the average error for 1900 to the present is 9.52, so we have at least reduced the average error a little. The largest errors now are Hank Greenberg, 1937 and Hack Wilson, 1930, both over their estimate by 60.76 RBI. Waner is still the furthest under, followed by Richie Ashburn in 1958.
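For anyone who wants to check the arithmetic, here is a minimal sketch of the error measure. It assumes that "aggregate error" means the sum of absolute misses and "average error" means that sum divided by the number of player seasons; the text does not spell this out. The function names are hypothetical, and the four seasons are only a tiny sample from the tables in this article, so the sample average is nowhere near the 9.52 for the full study.

```python
RATE_SINCE_1900 = 0.3079  # RBI per total base, 1900 to the present


def abs_error(total_bases: int, actual_rbi: int,
              rate: float = RATE_SINCE_1900) -> float:
    """Size of the miss between actual RBI and rate * total bases."""
    return abs(actual_rbi - rate * total_bases)


def average_error(seasons, rate: float = RATE_SINCE_1900) -> float:
    """Aggregate absolute error divided by the number of seasons."""
    return sum(abs_error(tb, rbi, rate) for tb, rbi in seasons) / len(seasons)


# (total bases, actual RBI): Greenberg 1937, Wilson 1930, Waner 1927, Ashburn 1958
sample = [(397, 183), (423, 191), (258, 27), (271, 33)]

for tb, rbi in sample:
    print(tb, rbi, round(abs_error(tb, rbi), 2))
print("sample average error:", round(average_error(sample), 2))
```

Greenberg and Wilson both come out at 60.76, which is the tie described above.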
It is apparent from glancing at the data that home run hitters tend to be “over” their expected RBI, and singles hitters under. Well, here, let me show you some of the data. . .these are the fifteen hitters who are furthest over their expected RBI:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | Avg | TB | EX RBI | Error
Hank Greenberg | 1937 | 154 | 594 | 137 | 200 | 49 | 14 | 40 | 183 | .337 | 397 | 122 | 61
Hack Wilson | 1930 | 155 | 585 | 146 | 208 | 35 | 6 | 56 | 191 | .356 | 423 | 130 | 61
Manny Ramirez | 1999 | 147 | 522 | 131 | 174 | 34 | 3 | 44 | 165 | .333 | 346 | 107 | 58
Lou Gehrig | 1931 | 155 | 619 | 163 | 211 | 31 | 15 | 46 | 184 | .341 | 410 | 126 | 58
Vern Stephens | 1949 | 155 | 610 | 113 | 177 | 31 | 2 | 39 | 159 | .290 | 329 | 101 | 58
Jimmie Foxx | 1938 | 149 | 565 | 139 | 197 | 33 | 9 | 50 | 175 | .349 | 398 | 123 | 52
Zeke Bonura | 1936 | 148 | 587 | 120 | 194 | 39 | 7 | 12 | 138 | .330 | 283 | 87 | 51
Hank Greenberg | 1935 | 152 | 619 | 121 | 203 | 46 | 16 | 36 | 170 | .328 | 389 | 120 | 50
Hack Wilson | 1929 | 150 | 574 | 135 | 198 | 30 | 5 | 39 | 159 | .345 | 355 | 109 | 50
Vic Wertz | 1949 | 155 | 608 | 96 | 185 | 26 | 6 | 20 | 133 | .304 | 283 | 87 | 46
Luke Appling | 1936 | 138 | 526 | 111 | 204 | 31 | 7 | 6 | 128 | .388 | 267 | 82 | 46
Jimmie Foxx | 1930 | 153 | 562 | 127 | 188 | 33 | 13 | 37 | 156 | .335 | 358 | 110 | 46
Joe DiMaggio | 1948 | 153 | 594 | 110 | 190 | 26 | 11 | 39 | 155 | .320 | 355 | 109 | 46
Ted Williams | 1949 | 155 | 566 | 150 | 190 | 39 | 3 | 43 | 159 | .343 | 368 | 113 | 46
Don Hurst | 1932 | 150 | 579 | 109 | 196 | 41 | 4 | 24 | 143 | .339 | 317 | 98 | 45
And these are the fifteen who are furthest under:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | Avg | TB | EX RBI | Error
Lloyd Waner | 1927 | 150 | 629 | 133 | 223 | 17 | 6 | 2 | 27 | .355 | 258 | 79 | -52
Richie Ashburn | 1958 | 152 | 615 | 98 | 215 | 24 | 13 | 2 | 33 | .350 | 271 | 83 | -50
Patsy Dougherty | 1904 | 155 | 647 | 113 | 181 | 18 | 14 | 6 | 26 | .280 | 245 | 75 | -49
Snuffy Stirnweiss | 1944 | 154 | 643 | 125 | 205 | 35 | 16 | 8 | 43 | .319 | 296 | 91 | -48
Luis Castillo | 2000 | 136 | 539 | 101 | 180 | 17 | 3 | 2 | 17 | .334 | 209 | 64 | -47
Johnny Mostil | 1926 | 148 | 600 | 120 | 197 | 41 | 15 | 4 | 42 | .328 | 280 | 86 | -44
Don Blasingame | 1959 | 150 | 615 | 90 | 178 | 26 | 7 | 1 | 24 | .289 | 221 | 68 | -44
Juan Pierre | 2006 | 162 | 699 | 87 | 204 | 32 | 13 | 3 | 40 | .292 | 271 | 83 | -43
Tim Raines | 1985 | 150 | 575 | 115 | 184 | 30 | 13 | 11 | 41 | .320 | 273 | 84 | -43
Ralph Garr | 1975 | 151 | 625 | 74 | 174 | 26 | 11 | 6 | 31 | .278 | 240 | 74 | -43
Ralph Garr | 1971 | 154 | 639 | 101 | 219 | 24 | 6 | 9 | 44 | .343 | 282 | 87 | -43
Willie Wilson | 1980 | 161 | 705 | 133 | 230 | 28 | 15 | 3 | 49 | .326 | 297 | 91 | -42
Matty Alou | 1966 | 141 | 535 | 86 | 183 | 18 | 9 | 2 | 27 | .342 | 225 | 69 | -42
Pete Rose | 1970 | 159 | 649 | 120 | 205 | 37 | 9 | 15 | 52 | .316 | 305 | 94 | -42
Matty Alou | 1967 | 139 | 550 | 87 | 186 | 21 | 7 | 2 | 28 | .338 | 227 | 70 | -42
It is apparent that home run hitters tend to be over their expected RBI and singles hitters under. This probably isn’t a direct effect of home runs as opposed to singles, in the main; it’s caused by the fact that singles hitters bat leadoff and home run hitters in the middle of the order. But that’s not our problem; we’re not looking for some way of figuring expected RBI based on batting order position. We’re just looking for expected RBI based on batting stats. As far as we know, home runs lead to extra RBI.
OK, so let’s cut out the home runs from the herd, and have two weights: Home Runs, and OTB (Other Total Bases. Or Off-Track Betting, whichever you prefer.) What weight should we put on HR, and what weight on OTB?
It is apparent that the weight on a HR cannot be less than 1.24, because below 1.24, we would not be increasing the weight given to home runs; we would be decreasing it. Let’s start with a weight of 1.30 RBI per home run. If we have 1.30 RBI per home run, the weight for a non-home run total base drops to .3031. . .this can be simply derived from the data, of course. So let’s try this formula:
1.30 HR + .3031 OTB = RBI
That reduces the average error of the RBI estimates to 9.31. . .a much larger gain in accuracy than I was expecting, frankly. Let’s try:
1.40 HR + .2961 OTB = RBI
This drops the average error to 9.03. Moving on:
1.50 HR + .2892 OTB = RBI
By the way. . .we all know that Babe Ruth holds the record for Total Bases in a season, 457. But who holds the record for Total Bases, not including Home Runs? The answer appears to be. . .unless it is someone who is not in my study. . .Ty Cobb, 1911. 335 OTB. Anyway. . .our average error on this try drops to 8.79, so we move on:
1.60 HR + .2822 OTB = RBI
That has an error of 8.60. I am now violating one of the cardinal rules of writing up research: Don’t tell them about the process of your research. Tell them about the product of your research. My apologies. . .I was just trying to explain to you how I come up with these formulas. I’ll get to the point. The formula I settled on is
2 HR + .25 OTB = RBI
Which can also be written as:
TB/4 + HR = RBI
This actually is not the most accurate formula I could find; others are a hair more accurate. But this is simple and makes intuitive sense. You drive in other runners at a rate of one RBI per four bases, plus, if you hit a home run, you drive in yourself.
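One way to see how the trial weights hang together: assume each OTB weight was chosen so that the formula still adds up to the same total RBI for the whole study. That is my reading of "derived from the data", not something the text states. Under that assumption the (HR weight, OTB weight) pairs must fall on a straight line. The sketch below fits that line to the four trial pairs quoted above and extends it to an HR weight of 2; the helper names are made up for illustration.

```python
# (HR weight, OTB weight) pairs from the four trial formulas in the text
TRIALS = [(1.30, 0.3031), (1.40, 0.2961), (1.50, 0.2892), (1.60, 0.2822)]


def fit_line(points):
    """Ordinary least-squares line through (x, y) points: returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


SLOPE, INTERCEPT = fit_line(TRIALS)


def otb_weight_for(hr_weight: float) -> float:
    """OTB weight that the fitted line pairs with a given HR weight."""
    return INTERCEPT + SLOPE * hr_weight


def neutral_otb_weight() -> float:
    """Weight w where a homer counts exactly 4w, i.e. no extra HR credit.

    Solves w = INTERCEPT + SLOPE * (4 * w) for w.
    """
    return INTERCEPT / (1 - 4 * SLOPE)


print(round(otb_weight_for(2.0), 4))    # balanced OTB weight at 2 RBI per HR
print(round(neutral_otb_weight(), 4))   # should land near the .3079 constant
```

On these four points the line reaches roughly .254 at an HR weight of 2, so the .25 in the settled formula looks like a rounding for simplicity, and the no-extra-credit point comes back to about .3079, the post-1900 constant.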
Based on this formula, we would expect these ten hitters to be the best RBI men within my study:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | Avg | OTB | Ex RBI | Error
Babe Ruth | 1921 | 152 | 540 | 177 | 204 | 44 | 16 | 59 | 171 | .378 | 221 | 173 | 2
Sammy Sosa | 2001 | 160 | 577 | 146 | 189 | 34 | 5 | 64 | 160 | .328 | 169 | 170 | 10
Sammy Sosa | 1998 | 159 | 643 | 134 | 198 | 20 | 0 | 66 | 158 | .308 | 152 | 170 | 12
Jimmie Foxx | 1932 | 154 | 585 | 151 | 213 | 33 | 9 | 58 | 169 | .364 | 206 | 168 | 2
Babe Ruth | 1927 | 151 | 540 | 158 | 192 | 29 | 8 | 60 | 164 | .356 | 177 | 164 | 0
Sammy Sosa | 1999 | 162 | 625 | 114 | 180 | 24 | 2 | 63 | 141 | .288 | 145 | 162 | 21
Hack Wilson | 1930 | 155 | 585 | 146 | 208 | 35 | 6 | 56 | 191 | .356 | 199 | 162 | 29
Luis Gonzalez | 2001 | 162 | 609 | 128 | 198 | 36 | 7 | 57 | 142 | .325 | 191 | 162 | 20
Lou Gehrig | 1927 | 155 | 584 | 149 | 218 | 52 | 18 | 47 | 175 | .373 | 259 | 159 | 16
Rogers Hornsby | 1922 | 154 | 623 | 141 | 250 | 46 | 14 | 42 | 152 | .401 | 282 | 155 | 3
They weren’t the ten best RBI men, exactly, but they weren’t too shabby, either, driving in an average of 162 runs.
Remarkably enough, the two biggest RBI over-achievers in my study were both on the same team: Zeke Bonura, 1936 Chicago White Sox (+55.25), and Luke Appling, also 1936 Chicago White Sox (also +55.25). . .actually a third player ties with them, that being Pie Traynor on the 1928 Pirates:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | Avg | TB | Ex RBI | Error
Zeke Bonura | 1936 | 148 | 587 | 120 | 194 | 39 | 7 | 12 | 138 | .330 | 283 | 82.75 | 55.25
Luke Appling | 1936 | 138 | 526 | 111 | 204 | 31 | 7 | 6 | 128 | .388 | 267 | 72.75 | 55.25
Pie Traynor | 1928 | 144 | 569 | 91 | 192 | 38 | 12 | 3 | 124 | .337 | 263 | 68.75 | 55.25
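The three-way tie is easy to verify. The sketch below applies the settled formula in both of its forms, TB/4 + HR and 2 HR + .25 OTB, taking OTB to be total bases minus four per home run. The function names are made up for illustration; the player lines are from the table above.

```python
def other_total_bases(total_bases: int, home_runs: int) -> int:
    """OTB: total bases not accounted for by home runs."""
    return total_bases - 4 * home_runs


def expected_rbi(total_bases: int, home_runs: int) -> float:
    """Settled formula, short form: TB/4 + HR."""
    return total_bases / 4 + home_runs


def expected_rbi_weighted(total_bases: int, home_runs: int) -> float:
    """Same formula in its two-weight form: 2 HR + .25 OTB."""
    return 2 * home_runs + 0.25 * other_total_bases(total_bases, home_runs)


# (total bases, home runs, actual RBI)
tied = {
    "Zeke Bonura 1936": (283, 12, 138),
    "Luke Appling 1936": (267, 6, 128),
    "Pie Traynor 1928": (263, 3, 124),
}

for name, (tb, hr, rbi) in tied.items():
    exp = expected_rbi(tb, hr)
    # The two forms are algebraically identical, so they must agree.
    assert exp == expected_rbi_weighted(tb, hr)
    print(f"{name}: expected {exp:.2f}, actual {rbi}, over by {rbi - exp:.2f}")
```

All three come out exactly 55.25 over, and the same function gives 173.25 for Ruth in 1921, the 173 shown in the earlier table.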
I had never realized that the ’36 White Sox were such a unique team. They had no power hitters, obviously, so they used Bonura and Appling in the middle of the order, hit .292 as a team, drew a lot of walks, and scored six runs a game; somebody had to drive them in. All of the biggest “+ RBI” guys in history are singles hitters who somehow were stranded in the middle of the batting order, while the two biggest “negative RBI” guys are power hitters who happened to hit leadoff:
- Felipe Alou, 1966
- Alfonso Soriano, 2006
After that the low-RBI guys are a mix of leadoff power hitters and leadoff guys who just didn’t drive in any runs, like Waner in ’27 and Patsy Dougherty in 1904. The ten biggest RBI under-achievers were all guys who scored 100-plus runs, except for Lou Brock, 1966.
So this provides us a first answer to the question: extraordinary gaps in RBI (vs. expected RBI) appear, at first look, to be driven by context.
II. Back To Kaline for a Second
I promised to show that Kaline’s RBI count was out of line. . .I looked up all players in baseball history who had
- 190-199 hits
- 25-29 home runs
- .300-.330 batting average
There are 19 such players. . .Bobby Thomson, 1949, Yogi Berra, 1950, Frank Robinson, 1957, Willie Mays, 1960, etc. Kaline is in the middle of this group in hits and home runs, and in a three-way tie for 8th-9th-10th in total bases, but he is first in the group in RBI. The other 18 players averaged 101 RBI. By our little formula, Kaline is +19.
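The comparison-group lookup amounts to a three-part filter. A minimal sketch follows, using a handful of seasons rather than the full spreadsheet. The `Season` record and function names are hypothetical. Kaline's 194 hits are taken from the public record, not from this article; the other lines come from the tables and text here.

```python
from dataclasses import dataclass


@dataclass
class Season:
    player: str
    year: int
    hits: int
    home_runs: int
    avg: float
    total_bases: int
    rbi: int


def in_kaline_group(s: Season) -> bool:
    """190-199 hits, 25-29 home runs, .300-.330 batting average."""
    return (190 <= s.hits <= 199
            and 25 <= s.home_runs <= 29
            and 0.300 <= s.avg <= 0.330)


def expected_rbi(s: Season) -> float:
    """Settled formula from Part I: TB/4 + HR."""
    return s.total_bases / 4 + s.home_runs


seasons = [
    Season("Al Kaline", 1956, 194, 27, 0.314, 327, 128),   # hits: public record
    Season("Frank Robinson", 1957, 197, 29, 0.322, 323, 75),
    Season("Tommy Davis", 1962, 230, 27, 0.346, 356, 153),  # too many hits
    Season("Jeff King", 1997, 129, 28, 0.238, 245, 112),    # average too low
]

group = [s for s in seasons if in_kaline_group(s)]
for s in group:
    print(f"{s.player} {s.year}: {s.rbi} RBI, {s.rbi - expected_rbi(s):+.2f} vs formula")
```

Only Kaline and Robinson pass the filter in this small sample, and Kaline shows as +19.25, the "+19" cited above.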
III. The Test Group
OK, I’m not really interested in extraordinary gaps in RBI, which are obviously caused by things like Adam Everett hitting cleanup, but in the more routine discrepancies where somebody just drives in about 20 more runs than you would think they ought to. And, since I then want to check these guys out and see why they drove in more runs than they ought to have, I’ll need to focus on players since 1957, the seasons for which Retrosheet data is now available. And I’ll have to skip 1999. . .sorry, Manny.
I chose 20 “matched sets” of players. I identified 20 players who had unexpectedly high RBI totals. Then, for each player, I identified
- a player with very comparable batting stats for the season, but a normal RBI total, and
- a player with very comparable batting stats for the season, but a notably LOW RBI total.
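The matches in this study were picked by eye. For anyone who wanted to hunt for candidate matches mechanically, one possible approach (an assumption on my part, not the method used here) is to score two seasons by how far apart they are in at bats, singles, doubles, triples and home runs. The names below are hypothetical; the stat lines are from tables later in this article.

```python
def profile(ab: int, hits: int, doubles: int, triples: int, home_runs: int):
    """Reduce a stat line to (AB, 1B, 2B, 3B, HR)."""
    singles = hits - doubles - triples - home_runs
    return (ab, singles, doubles, triples, home_runs)


def distance(a, b) -> int:
    """Sum of absolute differences; smaller means more comparable seasons."""
    return sum(abs(x - y) for x, y in zip(a, b))


bonds_1991 = profile(510, 149, 28, 5, 25)
moon_1957 = profile(516, 152, 28, 5, 24)
lewis_1996 = profile(337, 77, 12, 2, 4)

print(distance(bonds_1991, moon_1957))   # very close seasons
print(distance(bonds_1991, lewis_1996))  # nothing alike
```

An unweighted sum like this lets at bats dominate the score, so a real search would probably want to scale the categories; it is only meant to show the idea.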
These were the 20 sets, listed alphabetically by the Alpha player:
George Bell and Andre Dawson were the MVPs in 1987, having similar seasons and leading the two leagues in RBI. In other years, however, they match in other ways. Bell had almost the same stats in 1992 that Andre Dawson had had in 1978—but drove in 40 more runs. We’ll sandwich another guy who didn’t deserve the MVP award either in between them:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
George Bell | 1992 | 155 | 627 | 74 | 160 | 27 | 0 | 25 | 112 | 31 | 97 | 5 | 2 | .255 | 262 | .294 | .418 | .712
Don Baylor | 1982 | 157 | 608 | 80 | 160 | 24 | 1 | 24 | 93 | 57 | 69 | 10 | 4 | .263 | 258 | .329 | .424 | .754
Andre Dawson | 1978 | 157 | 609 | 84 | 154 | 24 | 8 | 25 | 72 | 30 | 128 | 28 | 11 | .253 | 269 | .299 | .442 | .740
Barry Bonds in 1991 had almost the same number of singles, doubles, triples, home runs and at bats that Wally Moon had had in 1957, but whereas Moon drove in only 73 runs and had a normal-sized head, Bonds drove in 116:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Barry Bonds | 1991 | 153 | 510 | 95 | 149 | 28 | 5 | 25 | 116 | 107 | 73 | 43 | 13 | .292 | 262 | .410 | .514 | .924
Gordy Coleman | 1961 | 150 | 520 | 63 | 149 | 27 | 4 | 26 | 87 | 45 | 67 | 1 | 3 | .287 | 262 | .341 | .504 | .845
Wally Moon | 1957 | 142 | 516 | 86 | 152 | 28 | 5 | 24 | 73 | 62 | 57 | 5 | 6 | .295 | 262 | .367 | .508 | .875
Barry Bonds | 1988 | 144 | 538 | 97 | 152 | 30 | 5 | 24 | 58 | 72 | 82 | 17 | 11 | .283 | 264 | .368 | .491 | .859
An irony here being that Bonds had another year, 1988, in which he also had about the same numbers—but drove in even fewer runs than Moon, and half as many as he would in ’91. We all know what this proves: driving in runs is proof of steroid use. No, seriously. . .Bonds in ’88 was a young player and a leadoff hitter. By ’91, although he was essentially the same player, he had been moved to the middle of the order and had begun to irritate journalists in a serious way. But Wally Moon wasn’t a leadoff hitter, so that’s something else. . .we’ll look at what in the next section of the study.
Jeff Burroughs won the MVP Award in ’74, essentially because he drove in a lot of runs for a team that had a surprisingly good season. Whether he deserved the MVP Award is another question, but Burroughs drove in 42 runs more than did Ellis Valentine in ’78, having essentially the same season:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Jeff Burroughs | 1974 | 152 | 554 | 84 | 167 | 33 | 2 | 25 | 118 | 91 | 104 | 2 | 3 | .301 | 279 | .397 | .504 | .901
Chipper Jones | 2003 | 153 | 555 | 103 | 169 | 33 | 2 | 27 | 106 | 94 | 83 | 2 | 2 | .305 | 287 | .402 | .517 | .920
Ellis Valentine | 1978 | 151 | 570 | 75 | 165 | 35 | 2 | 25 | 76 | 35 | 88 | 13 | 8 | .289 | 279 | .330 | .489 | .820
This article isn’t about clutch hitting, but Joe Carter was a famous clutch hitter for a lot of reasons, among them his RBI numbers, even when his other stats were down:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Joe Carter | 1990 | 162 | 634 | 79 | 147 | 27 | 1 | 24 | 115 | 48 | 93 | 22 | 6 | .232 | 248 | .290 | .391 | .681
Eric Karros | 1993 | 158 | 619 | 74 | 153 | 27 | 2 | 23 | 80 | 34 | 82 | 0 | 1 | .247 | 253 | .287 | .409 | .696
Max Alvis | 1965 | 159 | 604 | 88 | 149 | 24 | 2 | 21 | 61 | 47 | 121 | 12 | 8 | .247 | 240 | .308 | .397 | .706
Donn Clendenon, Attorney at Law, joined the Mets in the middle of their miracle, hitting cleanup for them much of September ’69 although he never got much attention for it among Seaver, Koosman, Agee, Cleon, Hodges, and the great Al Weis. In 1970 Clendenon just had a remarkable season, driving in almost 100 runs in fewer than 400 at bats, despite not really hitting at a fantastic level. Sixto Lezcano had about the same hitting numbers later in the decade, but drove in half as many runs:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Donn Clendenon | 1970 | 121 | 396 | 65 | 114 | 18 | 3 | 22 | 97 | 39 | 91 | 4 | 1 | .288 | 204 | .348 | .515 | .863
Eddie Murray | 1981 | 99 | 378 | 57 | 111 | 21 | 2 | 22 | 78 | 40 | 43 | 2 | 3 | .294 | 202 | .360 | .534 | .895
Sixto Lezcano | 1977 | 109 | 400 | 50 | 109 | 21 | 4 | 21 | 49 | 52 | 78 | 6 | 5 | .273 | 201 | .358 | .503 | .861
Chili Davis had an interesting career. He was hyped so much as a young player that he spent several years being considered a major disappointment, although, in retrospect, he wasn’t half bad. He hit .300 three times, usually hit around .280, hit 30 homers once, 29 once, 28 once, 27 once, 26 once. But he drove in 100 runs only once—in a season when he hit just .243 and had easily the highest strikeout total of his career. Don Baylor, who had a similar career to Davis’ and who also drove in 100 runs only once although he was famous as an RBI man, is on the other end of this set:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Chili Davis | 1993 | 153 | 573 | 74 | 139 | 32 | 0 | 27 | 112 | 71 | 135 | 4 | 1 | .243 | 252 | .327 | .440 | .767
Greg Walker | 1987 | 157 | 566 | 85 | 145 | 33 | 2 | 27 | 94 | 75 | 112 | 2 | 1 | .256 | 263 | .346 | .465 | .810
Don Baylor | 1977 | 154 | 561 | 87 | 141 | 27 | 0 | 25 | 75 | 62 | 76 | 26 | 12 | .251 | 243 | .334 | .433 | .768
Tommy Davis in 1962 drove in more runs in a season than any other player between 1950 and 1997—and did it without hitting either 30 home runs or 30 doubles. He won the batting title and had 230 hits, but with only 27 homers and only 27 doubles—remarkably low totals for a player driving in 153 runs.
In a way, I really should not have included this set of players in the study, since they violate a few kind-of-unwritten rules underlying the exercise. First, I tried to avoid including players in the study if I actually understood why their RBI count was out of line. I tried to avoid including in the study power hitters who were used as leadoff men, for example, or players who hit in the middle of the order because they usually hit 25 homers a year but just happened to have seasons when they only hit 8. I avoided using Tommy Herr’s 1985 and 1987 seasons, when he had inflated RBI totals because he was batting behind Vince Coleman, who was stealing 100 bases a year.
Also, I tried to avoid including in the study players whose batting stats were so unusual that they were hard to match. I wouldn’t include Mark McGwire, 1996, for example, even though his RBI total is clearly wrong for his numbers (.312 with 52 homers, 113 RBI), because his overall batting stats are so odd that you can’t really find a good match for him.
Davis violates both of these rules. His inflated RBI count is also, in part, obviously explained by batting behind a player stealing 100+ bases, and his numbers are so unusual that they are hard to match. Felipe Alou, 1966, also violates this rule, because I certainly know that he was a leadoff hitter that year, and that this largely explains his low RBI total.
I decided to include them anyway, because:
- Davis in ’62 is in a sense the epitome of the type of player I am interested in: the guy who has an RBI total which is just obviously out of line with what you would expect,
- these players were part of my childhood, and I am interested in them, and
- although I am quite interested in both seasons, I had never before noticed that Felipe Alou in ’66, with 74 RBI, and Davis in ’62, with 153, actually have quite similar numbers. One might have expected Alou to have had more RBI (120) than Davis (116).
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Tommy Davis | 1962 | 163 | 665 | 120 | 230 | 27 | 9 | 27 | 153 | 33 | 65 | 18 | 6 | .346 | 356 | .374 | .535 | .910
Kirby Puckett | 1988 | 158 | 657 | 109 | 234 | 42 | 5 | 24 | 121 | 23 | 83 | 6 | 7 | .356 | 358 | .375 | .545 | .920
Felipe Alou | 1966 | 154 | 666 | 122 | 218 | 32 | 6 | 31 | 74 | 24 | 51 | 5 | 7 | .327 | 355 | .361 | .533 | .894
Pedro Guerrero hit 27 to 33 home runs four times in his career, but set a career high for RBI in 1989, hitting only 17 home runs, although it was still a very good year. Actually, it was about the same year that Carl Yastrzemski had had in 1963, driving in 68 runs:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Pedro Guerrero | 1989 | 162 | 570 | 60 | 177 | 42 | 1 | 17 | 117 | 79 | 84 | 2 | 0 | .311 | 272 | .391 | .477 | .868
Chet Lemon | 1979 | 148 | 556 | 79 | 177 | 44 | 2 | 17 | 86 | 56 | 68 | 7 | 11 | .318 | 276 | .391 | .496 | .887
Carl Yastrzemski | 1963 | 151 | 570 | 91 | 183 | 40 | 3 | 14 | 68 | 95 | 72 | 8 | 5 | .321 | 271 | .418 | .475 | .894
In 1998 Superman’s older brother Jeff Kent played 137 games, batted only 526 times, hit .297. He did hit 31 homers, but one would have expected him to get over 100 RBI—maybe; actually my formula says 104. He drove in 128. The Crime Dog had done basically the same thing ten years earlier, but drove in 82:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Jeff Kent | 1998 | 137 | 526 | 94 | 156 | 37 | 3 | 31 | 128 | 48 | 110 | 9 | 4 | .297 | 292 | .359 | .555 | .914
Tino Martinez | 1995 | 141 | 519 | 92 | 152 | 35 | 3 | 31 | 111 | 62 | 91 | 0 | 0 | .293 | 286 | .369 | .551 | .920
Fred McGriff | 1988 | 154 | 536 | 100 | 151 | 35 | 4 | 34 | 82 | 79 | 149 | 6 | 1 | .282 | 296 | .376 | .552 | .928
When you hit .238 with only 129 hits you don’t really expect to drive in 112 runs even if you hit 40 homers—but Jeff King did it hitting 28.
King was a funny guy. . .he was a good guy, but he was perhaps the most negative major league player since Pink Hawley. I was talking to a couple of ex-Royals about the 1999 team, and one guy claimed he was standing next to Jeff King during the opening-day ceremonies. “Man,” said King. “I hate this song.”
“Are you crazy?” said the other guy. “This is the National Anthem. How can you hate the National Anthem?”
“Every time they play this song,” King explained, “I have a bad day.”
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Jeff King | 1997 | 155 | 543 | 84 | 129 | 30 | 1 | 28 | 112 | 89 | 96 | 16 | 5 | .238 | 245 | .341 | .451 | .792
Frank Thomas | 2002 | 148 | 523 | 77 | 132 | 29 | 1 | 28 | 92 | 88 | 115 | 3 | 0 | .252 | 247 | .361 | .472 | .834
Ron Cey | 1980 | 157 | 551 | 81 | 140 | 25 | 0 | 28 | 77 | 69 | 92 | 2 | 2 | .254 | 249 | .342 | .452 | .794
Darren Lewis in ’96 had a season that you could easily miss, because he batted only 337 times and hit only .228 with no power, yet somehow he drove in 53 runs. It’s not a lot of RBI, but if you only bat 337 times and hit .228 with no power, that’s a lot of RBI. Just ask Jimmy Piersall:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Darren Lewis | 1996 | 141 | 337 | 55 | 77 | 12 | 2 | 4 | 53 | 45 | 40 | 21 | 5 | .228 | 105 | .321 | .312 | .632
Jimmy Piersall | 1959 | 100 | 317 | 42 | 78 | 13 | 2 | 4 | 30 | 25 | 31 | 6 | 3 | .246 | 107 | .303 | .338 | .641
Larry Brown | 1966 | 105 | 340 | 29 | 78 | 12 | 0 | 3 | 17 | 36 | 58 | 0 | 1 | .229 | 99 | .309 | .291 | .600
Tino Martinez in 1998 and J. T. Snow in 1997 were both first basemen, both Gold Glove-type fielders. They had the same number of at bats (531), same hits (149), same homers (28). The differences between them are few and subtle, but Martinez drove in 123 runs, Snow 104.
Snow walked 96 times, Martinez 61. One can say, about this, that Snow didn’t drive in runs because he was a guy who wouldn’t expand the strike zone in an RBI situation. But if that’s the explanation, what do you say about Jacque Jones in 2006? He expanded the strike zone too much?
That’s the reason I’m printing all these stats, to give you a fair chance to look for the small differences between the players that might explain the RBI discrepancies:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Tino Martinez | 1998 | 142 | 531 | 92 | 149 | 33 | 1 | 28 | 123 | 61 | 83 | 2 | 1 | .281 | 268 | .355 | .505 | .860
J.T. Snow | 1997 | 157 | 531 | 81 | 149 | 36 | 1 | 28 | 104 | 96 | 124 | 6 | 4 | .281 | 271 | .387 | .510 | .898
Jacque Jones | 2006 | 149 | 533 | 73 | 152 | 31 | 1 | 27 | 81 | 35 | 116 | 9 | 1 | .285 | 266 | .334 | .499 | .833
Lee May and George Scott were profoundly similar hitters, although May couldn’t carry Scott’s glove. Like Barry Bonds earlier, Johnny Bench, the “C” player in this study, also had a very similar season in which he did drive in runs:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Lee May | 1976 | 148 | 530 | 61 | 137 | 17 | 4 | 25 | 109 | 41 | 104 | 4 | 1 | .258 | 237 | .312 | .447 | .759
George Scott | 1971 | 146 | 537 | 72 | 141 | 16 | 4 | 24 | 78 | 41 | 102 | 0 | 3 | .263 | 237 | .317 | .441 | .758
Johnny Bench | 1971 | 149 | 562 | 80 | 134 | 19 | 2 | 27 | 61 | 49 | 83 | 2 | 1 | .238 | 238 | .299 | .423 | .722
Johnny Bench | 1973 | 152 | 557 | 83 | 141 | 17 | 3 | 25 | 104 | 83 | 83 | 4 | 1 | .253 | 239 | .345 | .429 | .774
Bench outshone the other outstanding catchers who were born in 1947, one of whom was Thurman Munson. But Munson had his moments, and he drove in some runs:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Thurman Munson | 1975 | 157 | 597 | 83 | 190 | 24 | 3 | 12 | 102 | 45 | 52 | 3 | 2 | .318 | 256 | .366 | .429 | .795
Mark Loretta | 2003 | 154 | 589 | 74 | 185 | 28 | 4 | 13 | 72 | 54 | 62 | 5 | 4 | .314 | 260 | .372 | .441 | .814
Brady Clark | 2005 | 145 | 599 | 94 | 183 | 31 | 1 | 13 | 53 | 47 | 55 | 10 | 13 | .306 | 255 | .372 | .426 | .798
One year when I was a kid, Floyd Robinson drove in 109 runs with only 11 homers. That season, like Tommy Davis the same year, had a huge impact on me. I can still remember the “experts” at that time—the radio guys, who passed on their wisdom with the assurance of oracles—talking about this, saying that this proved that a line drive hitter could be a good RBI guy if he got the chance.
That was Robinson’s first full year as a regular—he had played his way into the lineup in ’61—and I, as a 12-year-old kid, fully expected that Robinson would be a consistent player of this type, hitting 40+ doubles and using them to drive in 100+ runs year after year. After all, the experts said that he had proven that he could do this.
He could never do it again, however, and I always wondered why. Kind of obsessively. Since then we have had periodic players who have had very similar seasons to this: 10 homers, 45 doubles, 100+ RBI, All-Star attention. But none of them have ever been able to repeat it. Wes Parker in ’70 hit .319 with 47 doubles, only 10 homers, but 111 RBI. But the next season, like Robinson, he just couldn’t do it; he dropped off by about 50 RBI. Actually, as Parker’s ’70 season is much like Robinson’s ’62 season, so Parker’s ’71 looks much like Robinson’s ’63. Keith Hernandez had a very similar season in ’79, winning half of an MVP Award with 48 doubles, 11 homers, 105 RBI, .344 average. Hernandez was a better player than Floyd Robinson or Parker, but, although he hit more home runs later on (16, 18, 15, etc.), he never again drove in 100 runs.
Jeff Cirillo, 2000. Cirillo, aided some by Coors Field, upped the doubles ante to 53, 11 homers like the other guys, .326 average, 115 RBI. But, like the other guys, he was unable to sustain it. He hit .326 again the next year, with more homers but with only 83 RBI, and then basically evaporated.
Going back to Adam Comorosky and Joe Vosmik, and on through Willie McGee in ’87 and Edgar Renteria in 2003, hitters of this type may drive in 100 runs, and people will always think if they have done it once they can do it again, but they never do. It’s one of those things. . .it fascinates me because I don’t really understand it, but I sort of understand it. Nobody is actually good enough to drive in 100 runs by finding gaps in the outfield. One year you may hit 45 balls in the gap, and you look great doing that, but it’s not a real skill; you can’t really rely on hitting line drives where the fielders aren’t playing. And, even if you can do it again, you still can’t get the hits you need in RBI situations. It’s like the year that Derek Lowe had in 2002: it’s wonderful, but you can never do it again because you can never get the ground balls to go right at people like they did that year.
Anyway, the Floyd Robinson set:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS
Floyd Robinson | 1962 | 156 | 600 | 89 | 187 | 45 | 10 | 11 | 109 | 72 | 47 | 4 | 2 | .312 | 285 | .384 | .475 | .859
Robin Yount | 1988 | 162 | 621 | 92 | 190 | 38 | 11 | 13 | 91 | 63 | 63 | 22 | 4 | .306 | 289 | .369 | .465 | .834
Shannon Stewart | 2001 | 155 | 640 | 103 | 202 | 44 | 7 | 12 | 60 | 46 | 72 | 27 | 10 | .316 | 296 | .371 | .462 | .834
In 1969 Ron Santo and Le Grand Orange (Rusty Staub) were playing in the same division, the NL East, and had almost the same batting stats—except that Santo drove in a lot more runs:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS | Ron Santo | 1969 | 160 | 575 | 97 | 166 | 18 | 4 | 29 | 123 | 96 | 97 | 1 | 3 | .289 | 279 | .384 | .485 | .869 | Ken Boyer | 1959 | 149 | 563 | 86 | 174 | 18 | 5 | 28 | 94 | 67 | 77 | 12 | 6 | .309 | 286 | .384 | .508 | .892 | Rusty Staub | 1969 | 158 | 549 | 89 | 166 | 26 | 5 | 29 | 79 | 110 | 61 | 3 | 4 | .302 | 289 | .426 | .526 | .952 |
Staub had a bad RBI year in 1969, but he had a real good one in 1978, driving in 42 more runs with fewer homers and fewer total bases:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS |
Rusty Staub | 1978 | 162 | 642 | 75 | 175 | 30 | 1 | 24 | 121 | 76 | 35 | 3 | 1 | .273 | 279 | .347 | .435 | .782 |
Roy Smalley | 1979 | 162 | 621 | 94 | 168 | 28 | 3 | 24 | 95 | 80 | 80 | 2 | 3 | .271 | 274 | .353 | .441 | .794 |
Cal Ripken | 1986 | 162 | 627 | 98 | 177 | 35 | 1 | 25 | 81 | 70 | 60 | 4 | 2 | .282 | 289 | .355 | .461 | .816 |
Not that Mike Sweeney is headed for the Hall of Fame or anything, but in 2000 Sweeney had the same number of homers and the same number of total bases that Frank Robinson had had in 1957, yet drove in almost twice as many runs. In fact, Sweeney drove in more runs that summer than either Frank Robinson or Steve Garvey ever did:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS |
Mike Sweeney | 2000 | 159 | 618 | 105 | 206 | 30 | 0 | 29 | 144 | 71 | 67 | 8 | 3 | .333 | 323 | .407 | .523 | .930 |
Steve Garvey | 1979 | 162 | 648 | 92 | 204 | 32 | 1 | 28 | 110 | 37 | 59 | 3 | 6 | .315 | 322 | .351 | .497 | .848 |
Frank Robinson | 1957 | 150 | 611 | 97 | 197 | 29 | 5 | 29 | 75 | 44 | 92 | 10 | 2 | .322 | 323 | .376 | .529 | .905 |
Miguel Tejada in 2004 had 203 hits, 40 doubles, 34 homers. . .great numbers, but you would have expected him to drive in 121 runs. He drove in 150.
I paired Tejada with Cecil Cooper in 1982, who had similar numbers but did drive in 121, and Billy Williams in 1964, who had almost the same numbers but missed the 100-RBI level:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS |
Miguel Tejada | 2004 | 162 | 653 | 107 | 203 | 40 | 2 | 34 | 150 | 48 | 73 | 4 | 1 | .311 | 349 | .360 | .534 | .894 |
Cecil Cooper | 1982 | 155 | 654 | 104 | 205 | 38 | 3 | 32 | 121 | 32 | 53 | 2 | 3 | .313 | 345 | .342 | .528 | .870 |
Billy Williams | 1964 | 162 | 645 | 100 | 201 | 39 | 2 | 33 | 98 | 59 | 84 | 10 | 7 | .312 | 343 | .370 | .532 | .901 |
Joe Torre in 1971 hit nothing but line drives, picking up an MVP Award for driving in 137 runs. He doesn’t have a lot of good matches, and maybe, like Tommy Davis in ’62, I should have excluded him from the study. But I didn’t:
Player | YEAR | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | Avg | TB | OBA | SPct | OPS |
Joe Torre | 1971 | 161 | 634 | 97 | 230 | 34 | 8 | 24 | 137 | 63 | 70 | 4 | 1 | .363 | 352 | .421 | .555 | .976 |
Cecil Cooper | 1980 | 153 | 622 | 96 | 219 | 33 | 4 | 25 | 122 | 39 | 42 | 17 | 6 | .352 | 335 | .387 | .539 | .926 |
Darin Erstad | 2000 | 157 | 676 | 121 | 240 | 39 | 6 | 25 | 100 | 64 | 82 | 28 | 8 | .355 | 366 | .409 | .541 | .951 |
IV. Test Group Summary Data
Taken as individual comparisons, the players in these 20 matched sets are similar, but there are small differences between them in walks, strikeouts, at bats, etc., and large differences between them in RBI. Taken as groups, the groups are similar, with really small differences in the other categories and really large differences in RBI.
The Alpha players—the RBI men—had 471 homers, 584 doubles, 60 triples, but drove in 2,351 runs, which is 118 RBI apiece.
The Beta group—the middle group—had 468 homers, 595 doubles, 62 triples, but drove in only 1,865 runs, which is 93 apiece.
The Charlie group—the girlie men who didn’t drive in runs—had 471 homers, 596 doubles, 64 triples, but drove in only 1,412 runs, or 71 apiece. (Incidentally, the Retrosheet data accounted for all of the RBI for all of the players in both the Alpha and Charlie groups. There was a little bit of missing data in the Beta group.)
There are some differences between the Alpha and Charlie groups. . .the familiar problem of quality leakage. Even though you try to control the quality of the players selected, you never quite make it. The Alpha players did have more walks than the Charlie group (63-56 on average) and fewer strikeouts (81-84). The Charlie group, because it contained players used more often in the leadoff spot, did score more runs (87-85) and steal more bases (11-9). These are very small differences.
The Alpha group did have:
- Many, many more sacrifice flies (182 to 65, or 9 to 3 on average). The Beta group had 118 sac flies, an average of 6.
- Significantly more GIDP. The GIDP totals were 331 for the Alpha group (average 17), 260 for the Beta group (13), 252 for the Charlie group (13).
- More intentional walks. The Alpha group had 176 IBB (average of 9), the Beta group had 160 (8), and the Charlie group had 106 (5).
The Alpha Group had a .478 slugging percentage, the Beta group .481, the Charlie Group .478. The Alpha Group hit .291; the Charlie Group hit .290. There are other differences between the groups, but these differences are infinitesimal compared to the separation of almost 1,000 RBI between the Alpha and the Charlie players.
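The per-player averages above are easy to check from the group totals. This is only a small sketch; the one assumption in it is that "20 matched sets" means 20 player seasons in each group, and the names used are mine.

```python
# RBI totals for each group as quoted in the text.
# Assumption: 20 matched sets means 20 player seasons per group.
PLAYERS_PER_GROUP = 20
GROUP_RBI = {"Alpha": 2351, "Beta": 1865, "Charlie": 1412}


def rbi_per_player(total_rbi, players=PLAYERS_PER_GROUP):
    """Average RBI per player season for one group."""
    return total_rbi / players


averages = {name: rbi_per_player(total) for name, total in GROUP_RBI.items()}
alpha_charlie_gap = GROUP_RBI["Alpha"] - GROUP_RBI["Charlie"]

for name, avg in averages.items():
    print(f"{name}: {avg:.1f} RBI per player")
print(f"Alpha minus Charlie: {alpha_charlie_gap} RBI in total")
```

The gap works out to 939 RBI, which is the "almost 1,000" referred to above.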
V. Trait Retention
Does the characteristic of driving in more runs than you would have been expected to drive in tend to “stay with” a player in the following season, or does it disappear?
It tends to be retained at about 40% of the original strength. In other words, if you take a group of players who drive in an average of 25 runs more than expected in year X, they will probably average about 10 more RBI than expected in year X+1. I did not study whether driving in fewer runs than expected was also a retained trait.
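As arithmetic, the retention estimate amounts to a single multiplication. The sketch below assumes the roughly 40% figure can be applied as a flat multiplier to a group's average surplus; it is not meant as a forecast for any individual player, and the function name is hypothetical.

```python
# Assumption: the roughly 40% retention quoted in the text is applied
# as a simple multiplier to a group's average RBI surplus.
RETENTION_RATE = 0.40


def expected_carryover(surplus_rbi, retention=RETENTION_RATE):
    """Projected RBI above expectation in year X+1 for a group."""
    return surplus_rbi * retention


print(f"{expected_carryover(25):.0f} RBI above expectation in year X+1")
```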
VI. Career Data
The career data isn’t especially interesting. . .it just shows persistent biases in the formula of which one would generally be aware anyway. The biggest career over-achievers in RBI vs. expected RBI are all 19th century guys. . .Cap Anson, +767, Jake Beckley, +453, George Davis, +450, etc. The top over-achiever since 1900 is Pie Traynor, who hit third for years despite not hitting home runs, followed by Joe Cronin, who is sort of the same (and who could well have been Traynor’s teammate for half of his career.) Then Bobby Veach, Ty Cobb, Al Simmons, Sam Crawford, Harry Heilmann. . .bunch of dead ball era and 1920s stars. The top player who is essentially post-World War II is Enos Slaughter (+235, 46th place), and the top player more recent than Slaughter is Vic Wertz (+196, 78th place). Ted Simmons (+193) is the only player who is fully within the last 50 years who is among the top 100. . .the only one who has noticeably more RBI than one would expect. After him the next player from the last two or three generations is Rusty Staub (+128). . .lot of Rusty Staub notes here.
On the other end are lead-off hitters who had a little power, and one interesting name:
1. Rickey Henderson | -329 |
2. Lou Brock | -309 |
3. Pete Rose | -284 |
4. Willie Mays | -274 |
5. Craig Biggio | -268 |
I wouldn’t be inclined to draw any conclusions from that.
VII. Analysis of RBI Differentials of Players in Study Group
Getting finally to the question under review, which is: why did these players over-achieve in the RBI column?
42% of the separation in RBI between these two groups of equally productive players was caused by the fact that the RBI men were more productive in RBI situations (and less productive in non-RBI situations.) Actually, .4217.
37% of the separation in RBI between these two groups was caused by the fact that the RBI men had more RBI opportunities presented to them. Actually, .3749.
20% of the separation in RBI between these two groups was caused by the interaction of these two causes. Actually, .2034.
Allocating the interactive effects in the same proportion as the direct effects, 53% of the separation in RBI is caused by hitting better in RBI situations (.5294), and 47% is caused by having more RBI opportunities to work with (.4706). The “unexpected RBI” effect is essentially evenly attributable to these two causes.
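The proportional allocation is simple enough to reproduce. The sketch below takes the three shares quoted above and splits the interaction term in the same ratio as the two direct effects; the variable names are mine, and only the three input figures come from the study.

```python
# Shares of the RBI separation quoted in the text.
PERFORMANCE = 0.4217   # hitting better in RBI situations
OPPORTUNITY = 0.3749   # having more RBI opportunities
INTERACTION = 0.2034   # joint effect of the two

# Split the interaction in proportion to the two direct effects.
direct_total = PERFORMANCE + OPPORTUNITY
performance_total = PERFORMANCE + INTERACTION * (PERFORMANCE / direct_total)
opportunity_total = OPPORTUNITY + INTERACTION * (OPPORTUNITY / direct_total)

print(f"performance: {performance_total:.4f}")
print(f"opportunity: {opportunity_total:.4f}")
```

Because the three shares sum to 1, this is the same as dividing each direct effect by the sum of the two direct effects.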
The players who had high RBI totals had:
5,588 at bats with the bases empty,
5,727 at bats with men on base,
and 3,458 at bats with runners in scoring position.
The players who had low RBI totals had:
6,697 at bats with the bases empty,
4,607 at bats with men on base,
and 2,544 at bats with runners in scoring position.
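To put those counts on a common scale, one can turn them into shares of each group's at bats. This sketch assumes that at bats with runners in scoring position are a subset of at bats with men on base, so that a group's total is bases empty plus men on base; that is the usual convention, but it is my assumption here, not something stated in the study.

```python
# At bats by base state, as quoted in the text.
# Assumption: at bats with runners in scoring position are a subset of
# at bats with men on base, so total = bases empty + men on base.
AT_BATS = {
    "high-RBI": {"empty": 5588, "men_on": 5727, "risp": 3458},
    "low-RBI": {"empty": 6697, "men_on": 4607, "risp": 2544},
}


def opportunity_shares(ab):
    """Share of a group's at bats with men on base and with RISP."""
    total = ab["empty"] + ab["men_on"]
    return {
        "total": total,
        "men_on_share": ab["men_on"] / total,
        "risp_share": ab["risp"] / total,
    }


shares = {group: opportunity_shares(ab) for group, ab in AT_BATS.items()}
risp_ratio = AT_BATS["high-RBI"]["risp"] / AT_BATS["low-RBI"]["risp"]

for group, s in shares.items():
    print(f"{group}: {s['total']} AB, "
          f"{s['men_on_share']:.1%} with men on, {s['risp_share']:.1%} with RISP")
print(f"RISP at bats, high-RBI vs. low-RBI: {risp_ratio:.2f}x")
```

Under that assumption the two groups have almost the same number of at bats in total, but the high-RBI group had about half of theirs with men on base, against about 41% for the low-RBI group.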
I will just leave out the data for the Beta group, because it is essentially non-instructive.
The players with good RBI totals averaged:
.262 with the bases empty,
.318 with men on base (which would be .309 if the Sacrifice Flies were counted as at bats),
.335 with runners in scoring position (.324 adjusted for the Sac Flies),
.374 with the bases loaded (.324 adjusted for the Sac Flies.)
The players with poor RBI totals hit:
.299 with the bases empty,
.280 with men on base (which would be .276 if we counted the Sac Flies as at bats),
.262 with runners in scoring position (.255, Sac Fly adjusted),
.235 with the bases loaded (.221 adjusted).
The RBI men homered 30% more often with men on base than with the bases empty—36 home runs per 1000 at bats with the bases empty, 47 with men on base.
The poor-RBI group homered 19% less often with men on base—45 homers per 1000 at bats with the bases empty, 36 with men on base.
The RBI men averaged:
.934 RBI per plate appearance when the bases were loaded,
.391 RBI per plate appearance when there were runners in scoring position but the bases were not loaded,
.123 RBI per plate appearance when there was a runner on first base only,
.033 RBI per plate appearance when the bases were empty.
The poor-RBI group averaged:
.579 RBI per plate appearance when the bases were loaded,
.276 RBI per plate appearance when there were runners in scoring position but the bases were not loaded,
.097 RBI per plate appearance when there was a runner on first base only,
.041 RBI per plate appearance when the bases were empty.
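One way to read the two lists together is as a ratio, high-RBI group over low-RBI group, in each base state. The sketch below uses only the rounded rates quoted above, so the ratios are approximate; the labels and names are mine.

```python
# RBI per plate appearance by base state, as quoted in the text:
# (high-RBI group, low-RBI group). Rates are already rounded.
RBI_PER_PA = {
    "bases loaded": (0.934, 0.579),
    "scoring position, not loaded": (0.391, 0.276),
    "runner on first only": (0.123, 0.097),
    "bases empty": (0.033, 0.041),
}

# Ratio of the high-RBI rate to the low-RBI rate in each situation.
ratios = {state: high / low for state, (high, low) in RBI_PER_PA.items()}

for state, ratio in ratios.items():
    print(f"{state}: {ratio:.2f}x")
```

The advantage of the RBI men grows as the situation gets richer, and it reverses with the bases empty, where the low-RBI group was the more productive of the two.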
VIII. As Individuals
Almost without exception, every hitter that we selected as having a “good” RBI total had a higher batting average with men on base than with the bases empty, and a higher batting average with runners in scoring position than with men on base. There are just a few small exceptions to that rule, among these 20 players.
In general, the “non-RBI” men had the opposite pattern, although it is slightly less notable among the non-RBI men.
Thus, if you see a player who is +20 in RBI vs. expected RBI, it appears that one can assert with little risk of error that that player has hit well with runners in scoring position.
In this study, every RBI man (without exception) had more at bats with runners in scoring position than did his “non-RBI” match.
A few of the more notable cases:
Chili Davis in 1993 hit .182 with the bases empty (287 at bats), but .325 with runners in scoring position (163 at bats.)
Pedro Guerrero in 1989 hit .400 with runners in scoring position (68/170).
Darren Lewis in 1996 hit .177 with the bases empty, but .243 with men on first base only, .318 with runners in scoring position, and .500 with the bases loaded (6/12).
Larry Brown in 1966, who had essentially the same batting stats as Lewis except for RBI, hit .261 with the bases empty, but .200 with men on first, and .149 with runners in scoring position.
Lee May in 1976 hit .181 with the bases empty, but .350 with men on base.
Frank Robinson in 1957 hit .351 with the bases empty, but .252 with runners in scoring position. And he hit 21 of his 29 homers with the bases empty (57% of his at bats came with the bases empty).
Andre Dawson in 1978, despite having a poor RBI season, actually hit for a better average with runners in scoring position (.252) than with the bases empty (.242). But almost all of his power (18 of 25 homers) was with the bases empty.
Felipe Alou, who had the odd stats as a leadoff man in ’66, actually did hit better with runners on base (.354) and in scoring position (.359) than with the bases empty (.325). His shortfall in RBI is entirely caused by a lack of RBI opportunities. Tommy Davis, who had comparable batting stats to Alou but twice the RBI, had 213 at bats with runners in scoring position—an almost fantastic number. Alou had 103.
Miguel Tejada (2004) is interesting. Tejada did hit 22 of his 34 homers with men on base, which is a very high percentage even for the high-RBI group. But most of these players cleaned up in bases-loaded situations. Tejada, almost unbelievably, went 2-for-17 in bases-loaded situations, or 2-for-21 if you count the sac flies as at bats. He did drive in 14 runs in those situations.
Bill James
Brookline, Massachusetts
June 9, 2007