Typical League-Leading Figures

January 23, 2012

To a quite astonishing extent, the numbers typical of the best hitters in baseball today have returned to their historic norms.  These numbers, of course, became very atypical (compared to most of baseball history) during the steroid era, 1993-2004. Since the banning of steroids in the middle of the last decade, these numbers have returned quite precisely to their previous norms.

We have to begin by asking here, "How do we determine a typical league-leading figure?"  There are a million ways to approach that question, of course. "What is a typical number to lead the league in home runs, in our era? What was it in 1965?   What is a typical league-leading batting average?  What does it take to lead the league in doubles?"

There are a lot of different reasons why we need to be able to answer those questions. There is a fairly stable and predictable relationship between league-leading figures and the highest career totals posted in a generation. Some player in each generation will post a number which is 15 to 18 times the normal league-leading figure, so that, for example, if the normal league-leading RBI total was 110, some player in that generation would drive in 1,650 to 1,980 runs, but it is not likely that anyone would drive in significantly more than that. The all-time records tend to hover in the range of 18 to 21 times the normal league-leading figure, so if the normal league-leading RBI total was 120, then the record for career RBI would probably be at least 2,160 (18 times 120) but less than 2,520 (21 times 120).
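The arithmetic behind that rule of thumb can be sketched in a few lines of Python. The function name and its parameters are illustrative assumptions, not part of the study; the multipliers (15 to 18 for the best career total in a generation, 18 to 21 for the all-time record) are the ones given in the paragraph above.

```python
def expected_range(league_leading_figure, low_multiple, high_multiple):
    """Return the (low, high) career totals implied by a normal league-leading figure."""
    return (league_leading_figure * low_multiple,
            league_leading_figure * high_multiple)


# Best career total in a generation: 15 to 18 times the league-leading norm.
generation_best = expected_range(110, 15, 18)

# All-time record: 18 to 21 times the league-leading norm.
all_time_record = expected_range(120, 18, 21)

print(generation_best)   # (1650, 1980)
print(all_time_record)   # (2160, 2520)
```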

If the normal league-leading figure in a category goes up to where the ratio between the career record and the normal league-leading total is less than 18-1, then the record becomes vulnerable, and is likely to be broken. In 1983, for example, the record for career stolen bases was 938, and the typical league-leading stolen base total was 86. That’s a ratio of less than 11-1, and this made it absolutely clear that the career stolen base record was likely to be broken, as of course it was. On the other hand, if that ratio is 21-1 or greater, then the record can be considered "safe" for the present time. As of now the stolen base record is 1,406 and the normal league-leading figure for stolen bases is 57, a ratio of almost 25-1, so Rickey Henderson’s career stolen base record is safe for the time being, since a player would have to lead the league in stolen bases for 25 years to break the record, and that ain’t happenin’.
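Here is a minimal sketch of that test, using the two stolen base examples above. The function names and the "in between" label are my own placeholders; only the 18-1 and 21-1 thresholds and the four input numbers come from the text.

```python
def seasons_to_break(career_record, league_leading_figure):
    """Ratio of the career record to the normal league-leading figure."""
    return career_record / league_leading_figure


def classify(ratio):
    """Label a record using the 18-1 and 21-1 thresholds described above."""
    if ratio < 18:
        return "vulnerable"
    if ratio >= 21:
        return "safe"
    return "in between"  # placeholder label for the 18-to-21 range


stolen_bases_1983 = seasons_to_break(938, 86)
stolen_bases_now = seasons_to_break(1406, 57)

print(round(stolen_bases_1983, 1), classify(stolen_bases_1983))  # 10.9 vulnerable
print(round(stolen_bases_now, 1), classify(stolen_bases_now))    # 24.7 safe
```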

But how do we establish the normal league-leading figure?   What I used to do, years ago, before we had personal computers and telephones and rubber tires, was just to take the average of the last ten players who led the league in whatever it was. This method is simple and has the advantage of directly representing what we call the stat, but it also has numerous disadvantages.   It will sometimes happen, for example, that the three best home run totals in a five-year span all occur in the same league in the same season, so that the 17th-best figure in the five-year span is included in the calculation, but the 2nd-best and 3rd-best figures are not. Also, the size of leagues changes over time, so that, for example, in comparing 2011 to 1955, we are making an apples-to-oranges comparison, comparing the best figures on eight teams to the best figures on 14 or 16 teams.      

A third major disadvantage of doing it that way is that a strike year will screw up the numbers big-time; a strike season gives you a five-year period of seriously non-representative numbers.  

There are lots of other ways to approach the question and there are problems with all of them, but this is what I did.    I took all of the players active in each five-year period, and combined them into one group.    I then looked at the top n number of players, where n was one player for each 8 teams that played in those five years.    For the 2007 to 2011 era, there were 30 teams per season, five seasons of 30 teams is 150 teams, 150 divided by 8 is (almost) 19, so I considered the top 19 figures to be "league leading" numbers.   This basically gets rid of the strike-year problem, and it enables us to compare 2011 and 1955 on a more even basis.   
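The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used for the study: the function names are invented, rounding to the nearest whole number is inferred from the 150-teams example (how an exact half would be handled is not stated), and averaging the top n figures is my reading of how the "typical" number is produced.

```python
def leaders_needed(team_seasons, teams_per_leader=8):
    """Number of top figures to keep: one per eight team-seasons, rounded."""
    # Assumption: round to the nearest whole number (150 / 8 = 18.75 -> 19).
    return round(team_seasons / teams_per_leader)


def typical_league_leading_figure(season_totals, team_seasons):
    """Average the top n individual season totals in a five-year window."""
    n = leaders_needed(team_seasons)
    top_n = sorted(season_totals, reverse=True)[:n]
    return sum(top_n) / len(top_n)


# 2007-2011: five seasons of 30 teams.
print(leaders_needed(5 * 30))  # 19

# Made-up totals, purely to show the mechanics (not real player data).
made_up_totals = list(range(1, 41))  # 1, 2, ..., 40
print(typical_league_leading_figure(made_up_totals, team_seasons=40))  # 38.0
```

Because every player-season in the window goes into one pool, a short strike season simply contributes fewer high totals instead of dragging down a per-season average.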

These are the normal league-leading figures for the 2007-2011 period:

     
 

Category               League-Leading Figure
Games                  162
At Bats                672
Runs                   123
Hits                   212
Doubles                50
Triples                14
Home Runs              44
RBI                    132
Walks                  114
Strikeouts             190
Stolen Bases           57
Batting Average        .346
Total Bases            357
GIDP                   27
Hit By Pitch           21
On Base Percentage     .436
Slugging Percentage    .616
OPS                    1.042

 

Ten years ago, the typical league-leading figures in the Triple Crown stats were 57 home runs, 149 RBI, and a .357 average.  If baseball had stabilized at those numbers, the career record for home runs would have moved up close to 1,000, actually probably over 1,000.  Now we’re back to 44, 132 and .346—historically normal numbers as we shall see.

I figured the typical league-leading figures in all of these categories for every five-year period back to 1876 (that is, 1876-1880, 1877-1881, 1878-1882, etc.)   This enables us to look very closely at trends in motion, how the game is changing, and also, of course, enables us to evaluate the vulnerability of various records in a macro sense (that is, not their vulnerability due to the presence of a single player who could break the record or a group of players who may be threatening the record, but their vulnerability based on the conditions of the game.)    Let’s start with trends.
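For anyone who wants to reproduce the overlapping periods, a small sketch like this would generate them. The function name is hypothetical; the 1876 start, the five-year span, and the 2007-2011 end point are from the text.

```python
def five_year_windows(first_season, last_season, span=5):
    """Yield (start, end) pairs for every overlapping window of `span` seasons."""
    for start in range(first_season, last_season - span + 2):
        yield (start, start + span - 1)


windows = list(five_year_windows(1876, 2011))

print(windows[:3])   # [(1876, 1880), (1877, 1881), (1878, 1882)]
print(windows[-1])   # (2007, 2011)
print(len(windows))  # 132
```

Each window is labeled in the discussion below by its final season, so "1965" means 1961-1965.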

 

Trend Lines

The typical league-leading number for games played is 162, and this basically doesn’t change.   From 1962 to 1970 it was 163, but it has been 162 since 1971.

The typical league-leading number for at bats is dropping sharply at the present time. In the first five-year period that can be created (1876-1880) the typical league-leading at bat total was 386, since teams played fewer than 100 games per season. This number increased steadily until 1894, due mostly to lengthening of the schedule; by 1894 the typical league-leading at bats total was 631. After 1894 this number dropped due to reductions in overall offensive totals and a shorter schedule. In 1910 the number was at 610. It didn’t get above 631 until the lively ball era began. By 1921 the typical league-leading at bat total was 639, and it continued to grow, on up to 665 in 1936.

It went backward again after 1936 due to reductions in offense; reductions in offense mean fewer runners reaching base, therefore fewer at bats.    By 1946 the number was down to 640, and the 1936 peak of 665 was not surpassed until the 1961-1962 expansion added eight games to the schedule.  In 1962 the number went to 669, and peaked at 675 in 1965.   

When we say "in 1965", that means in the five-year period ending in 1965.     Anyway, after 1965 this number went into a sustained decline, dropping to 647 in 1995.  (The 1994-1995 labor dispute shaved a couple off of it, but the number had been hovering around 650 since 1990.)   After 1995 the number exploded again as all of the hitting numbers did, reaching an all-time record 692 in 2007.   In the last four years it has receded to 672.

The history of at bat totals, then, has eight movements:  up until 1895, down until 1919, up until 1936, down until 1961, up sharply to 1965, down until 1995, up from 1995 to 2007, and down since 2007.  

Runs Scored started out at 92 in 1880, dropped briefly, then increased to an all-time record 168 in 1895.  After 1895 baseball edged toward a pitcher’s era, and the normal league-leading runs total dropped to 109 by 1910—down 59 runs in fifteen years—then began to recover.   It reached 129 by 1915 (Ty Cobb era), then slackened again, then reached 130 in 1921, 140 in 1922, and 150 in 1930.   In 1931 runs scored reached a modern peak of 156, dropped to 126 by 1945, rallied briefly after the war, then went into a sustained decline.   By 1975 a typical league-leading runs total was just 115.   

Then began the era of the great leadoff men, and by 1986 the total was back up to 126, slipped again, and fell to 116 by 1992.    By 2000 the typical league-leading runs total had exploded to 138—the highest total since 1941.   Since 2000 the number has been falling, and we’re now at 123.   Eight movements again—up until 1895, down until 1910, up until 1931, down until 1975, up until 1986, down until 1992, up from 1992 to 2000, down since 2000.  

We are in the eighth movement of baseball history; this is one thing we learn by doing this.  Hits, now; hits are perhaps the most interesting category, and what is interesting here is the phenomenal consistency in typical league-leading hit totals across baseball history.   Yes, the number has gone up and down with the other movements in the game, but not very much.   A typical league-leading hits total is 212 now, it was 213 ten years ago, it was 209 twenty years ago (1991), it was 213 in 1981, and it was 211 in 1971.    It was 210 in 1962, 212 in 1950, 216 in 1941, 220 in 1920, 210 in 1916, 210 in 1905, 215 in 1894, and 203 in 1889.   

A typical league-leading hits total has swung in very narrow circles for more than 120 years.   It did reach a peak of 227 in 1899, fell to 198 in 1919 (the last time it has been under 200), went up to 240 in 1931 (its all-time high), dropped to 203 in 1958, and dropped to 202 in 1994.    Still, one can essentially say that a typical league-leading hits total has not changed since 1890.    And, a remarkable side-note:  although the 1931 season was the all-time high-water mark for hits, there was only one player active in 1931 who would have 3,000 hits in his career (Paul Waner)—whereas in 1910, although the league-leading figure was 206, there were five active players who would get 3,000 career hits, and in 1957, although the league-leading norm was only 203, there were also five.   The inflated hit totals of the 1930-1931 era didn’t last long enough to seriously inflate the career totals for players of that era.

A league-leading doubles figure started out at 27 in 1880, increased to 46 by 1898, dropped to 40 by 1909 and, although increasing briefly after the cork center ball was introduced in 1911, was still at 41 by 1919.    We can thus note the slightly remarkable fact that the all-time doubles leader, Tris Speaker, played out his prime in a relatively low-doubles era, although he did last long enough to take some advantage of the lively ball era. Post-1920 doubles increased, to 45 by 1921, 50 by 1926, 55 by 1934, and to an all-time record 57 by 1936. 

After 1936, however, doubles went into a very prolonged decline.  In 1960, the year I became a baseball fan, Tito Francona led the American League in doubles with 36, and Vada Pinson led the National League with 37. These numbers are very low by historic standards—either before 1960 or after.  After the high of 57 in 1936 the typical league-leading doubles total dropped below 50 in 1941, below 45 in 1951, and below 40 in 1958, to a modern low of 39.    Totals increased slightly in the early 1960s, to a high of 44, but were back at 39 as a typical league-leading figure by 1973.

The introduction of artificial turf drove the numbers up sharply in the late 1970s, up to a high of 46 in 1979. The numbers dipped slightly after that, but remained in the 40s and were at 45 in 1992, before the steroid era really began. By 2003 the typical league-leading figure was up to 52. Since then it has declined slowly; the 2011 average of 50 (49.63) is the lowest since 1998.

Triples react more radically to changes in the game than do hits or doubles.  Since 1890 the average figure for hits is 215 with a standard deviation of slightly less than 10, or less than 5%.   The average figure for doubles is 46 with a standard deviation of 4.4, or slightly less than 10%, whereas the average figure for triples is 17.2 with a standard deviation of 4.2, or almost 25%.   (Standard deviation is a measure of how much the number jumps around.)
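The percentages in that comparison are just the standard deviation divided by the average. A quick sketch, using only the numbers quoted above (the hits deviation is given only as "slightly less than 10", so 10 is used here as an upper bound, and the variable names are my own):

```python
# (average, standard deviation) since 1890, as quoted in the text.
# The hits deviation is an upper bound: the text says "slightly less than 10".
figures = {
    "hits": (215, 10),
    "doubles": (46, 4.4),
    "triples": (17.2, 4.2),
}

# Relative spread: standard deviation as a share of the average.
relative_spread = {stat: sd / mean for stat, (mean, sd) in figures.items()}

for stat, share in relative_spread.items():
    print(f"{stat}: {share:.1%}")
```

This prints roughly 4.7% for hits, 9.6% for doubles, and 24.4% for triples, which matches the "less than 5%", "slightly less than 10%", and "almost 25%" in the text.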

Triples started out at 14 as a typical league-leading number, reached 15 by 1883, reached 20 by 1885, reached 25 by 1894, and peaked at 27—which is still the all-time high—in 1897.   In 1897 a lot of major league games were still played in parks with no outfield fences, or with temporary fences that were a long way from home plate and separated center field from a corn field, a warehouse or a trash dump.   The outfield grass was always high, since teams did not yet have gasoline-powered lawn mowers to mow the grass, and cutting grass was hard work; they would mow the infield short or skin the infield and play on dirt, but in the outfield the grass might be six inches high. These conditions led to more triples than homers.

League-leading triples dropped to 19 by 1910, recovered to 25 by 1915 (after the cork-centered baseball was introduced in 1911) and were back at 19 by 1919.   Triples increased only very slightly after the lively ball era started.   Outfield fences raced toward home plate, contributing to an explosion of home runs, so triples went up only slightly as other hitting totals exploded.    Triples were back below 20 by 1933, and dropped to 17 in 1940, the lowest they had been since 1883, and much lower per game played than they were in 1883.    After World War II they resumed their downward march, dropping to 16 in 1950, below 15 in 1952, below 14 by 1959, below 13 by 1962, and to a low of 12.00 in 1973, at the start of the artificial turf era.

Artificial turf brought speed back into the park, and triples increased to 15 (as a league-leading figure) by 1981.   This number hovered around 15 in the early 1980s, then began to drop again in the 1990s, reaching a low of 11.8 by 1999—the lowest figure in the history of baseball, even NOT adjusting for the greatly increased length of the schedule over time.  The number then increased until 2008, reaching a peak slightly over 15, and has decreased since 2008 to its current level a hair below 14.

Home runs, of course, are the glamour stat.   Home runs historically have the volatility of triples but the scale of doubles.   Since 1890 an average league-leading figure has been 37, but with a standard deviation of 13.4; in other words, the volatility is more than a third of the norm.

                Home runs in 1876 were kind of "super triples"; they were not balls hit out of the park, but balls hit so far into the long grass of the outfield that the runner could circle the bases before the outfielder could retrieve the orb.   A typical league-leading home run number started out at six, but increased to 18 by 1888, in large part because there was one park in that era—the Chicago park—with short fences.   In 1884 in the National League there were 322 home runs—197 in Chicago, and 125 in the other seven parks combined.

                The Chicago park of the 1880s is a sort of "dog that didn’t bark" in baseball history, like Firpo Marberry.  In 1924 the Washington Senators took a hard-throwing kid (Marberry) and made him into something very close to a modern closer, and won two consecutive pennants with what had long been a sub-.500 team.   This should have had enormous impact on the game, but it didn’t; managers looked at the innovation and rejected it, converted Marberry into a starter, and turned bullpens back into sanctuaries for old men with weird deliveries who came in after the starter was knocked out.   The hard-throwing closers didn’t take over the game until the 1970s.  Marberry was 50 years ahead of his time.

                Same thing here; the close dimensions of the Chicago park, in their own time, were regarded as making a mockery of the sport, and were rejected by the other teams and abandoned by Chicago.   It would be more than 30 years before they would make a comeback.  

                Home runs, then, went from 6 to 18 between 1880 and 1888, and then went into decline, down to a low of 10 in 1909, the beginning of the Home Run Baker era.   Home runs shot up beginning in 1911, due to the cork-center ball, and then, building on that, continued to increase in the late teens due mostly to Babe Ruth.  A typical league-leading home run number went from 10 in 1909 up to 15 by 1913, to 20 by 1920, and to 40 by 1923.   Home Runs had more than doubled, nearly tripled, in five years—the most radical change in how the game was played in the twentieth century.

                From 40 home runs the typical league-leading figure continued to increase for a decade, up to 49 by 1931. This number then fell until 1945, to a low of 33, then increased again for another twenty years, returning to 49 by 1965. By "1965" we mean the 1961-1965 era; in those years we have 98 teams in five years (18 in 1961, 20 a year from 1962-1965), so the typical league-leading figure is defined by the top twelve totals in that five-year span. Maris hit 61 in 1961 (oh, you knew that?), Mantle hit 54 the same year, Mays hit 47 in 1964 and 52 in 1965, Killebrew hit 46, 48, 45 and 49, Aaron hit 44 or 45 a year, Colavito hit 45, Gentile hit 46, Cepeda hit 46, McCovey hit 44... there were a lot of home runs in that era, even though Conigliaro and Cash actually led the American League in 1965 with 32, which is a very low league-leading figure. The 49.0 in 1965 is the highest average until the steroid era.

                After 1965 home runs went into a long, slow slide as parks gradually got larger; by the late 1970s Mike Schmidt was leading the National League in home runs with 36 or 38 a year, although occasionally somebody would bust out with 44 to keep the average up.   The typical league-leading figure dropped to a low of 38 in 1985, the lowest point of the graph since 1948, then began gradually to move back up under the leadership of Mark McGwire, Frank Thomas and others.   By 1991—before the steroid era really started—the number was 43. In the late 1990s it was increasing by one or two a year, reaching a peak of 57 in 2001, eight higher than the previous well-established record of 49.   2001 was the year that Bonds hit 73 homers, and the 1997-2001 years include all of the McGwire/Bonds/Sosa waltz.

                Since 2001 the number has been falling. It is now down to 44, its lowest point since 1995, and there is every reason to believe that it will continue to fall, but there’s no reason to get ahead of ourselves on that. 

                RBI were not figured in 1876; the numbers we have now are later re-constructions.   Reconstructing, RBI leaders started out at 64, reached 100 by 1885, and were up to 130 by 1890.   They reached a peak of 142 in 1895, dropped (along with everything else) to 106 in 1907, recovered with the cork-center ball to 119 (1913), then dropped back to 106 in 1919.  

                With the advent of the lively ball era RBI counts exploded, up to an all-time record of 170 in 1931.    We never got back to that level, even in the steroid era, in large part because in the steroid era everyone was hitting home runs.   In the Hack Wilson/Lou Gehrig era, offenses consisted of one or two power hitters surrounded by six guys whose job was to get on base.   In the steroid era everyone was hitting home runs, so no one player had as many RBI opportunities in 2001 as in the late 1920s and early 1930s. 

                After 1933 the RBI number dropped almost every year until the end of World War II, reaching a low of 124 in 1945.   It jumped to 143 after the war (1950), then slid again down to 128 in 1960.    It jumped again to 139 in 1963; the 1961-1962 expansion added eight games to the schedule and brought two tremendous hitter’s parks into the American League.   After 1963 the trend resumed the slide that it had been in, really, since 1933.    By 1976 the typical league-leading figure was down to 117, the lowest it had been since the start of the lively-ball era.  

                It recovered after that, but very, very slowly; by 1992 it was still only 122 RBI, and had never gone higher than 125.    Then the steroid era started, of course, and the typical league-leading figure shot up to 149 by the year 2000—the highest it had been since the start of World War II.    Since 2000 the typical league-leading RBI count has been in decline, and we are now at 132. 

                Walks by hitters, like RBI, were not counted in the 19th century, in addition to which the game was very different; it took more than four balls to issue a walk, pitchers threw underhanded, and the batter could call for a pitch at the height that he preferred.    The typical league-leading walk count started at 24 in 1880, but was up to 61 by 1884, and reached 100 in 1889.   With the exception of one very brief period, the average has never been under 100 since 1889.

                Walks are almost like hits, phenomenally stable across history. Hits are historically stable at 210 as a league-leading figure; walks are equally stable at 120. Since 1890 the average league-leading figure is 119, with a standard deviation of 10.6. We can find trends up and down in the last 120 years, but what is most remarkable is how timid these trends have been. There was a peak of 120 in 1894. The number dropped to a low of 98 in 1910; the only years since 1890 in which the figure has been under 100 are 1904, 1908, 1909 and 1910, and in all of those years it is barely under 100. There was a peak of 115 in 1915, a low of 102 in 1920, a peak of 129 in 1930, and a low of 120 in 1944. After the war there was that brief period where every team had two guys named Eddie whose job was to walk, the Eddie Stanky/Eddie Joost/Ed Yost era, and the number went up to 146 in 1950. That remains the all-time high. The powers that be "fixed" the strike zone in 1952 and again in 1963, and walks declined to 103 (as a league-leading figure) by 1967. There was a peak of 126 in 1971, and a low of 107 in 1985. The number was back up to 117 before the steroid era began, and it went all the way up to 138 by 2004, when opposing pitchers would walk Barry Bonds three times before the National Anthem was finished. Since 2004 the number has dropped to 114, actually the most rapid contraction of walks by walk leaders in the history of baseball.

                Strikeouts have the simplest trend line in baseball history: They only go up. That’s not 100% true, of course. We don’t have usable counts of batters’ strikeouts until 1916 (1912-1916), and the trend line starts at 94. It goes down to 82 (1920), but up to 110 (1941). It goes down to 99 (1949) but up to 160 (1971). It goes down to 145 (1983), but up to 169 (1990). It goes down to 152 (1995) but up to 190 (so far). The trend line is still going up. It appears that we will hit 200, as a typical league-leading figure, within five years.

                The interaction of strikeouts with other stats is an interesting one.   It may be possible to calculate, based on other internal relationships, where the point will occur at which strikeouts can no longer go up, but must go down.   That’s a man-sized piece of research, and we’ll save that for another time.

                Stolen bases began at about 52 (1885) and increased to 100+ before 1890.   The definition of a stolen base was a little different then, and these counts included a handful of plays on which players went from first to third on a single or moved up on outs.     In 1891 a league-leading stolen base number was 113, the highest it has ever been.    In the next ten years half of these stolen bases disappeared, for three reasons:

1)       The elimination of other base running incidents from stolen base counts,

2)       The development of catcher’s gloves and catcher’s gear, which increased the ability of catchers to throw out runners, and

3)      The rapid contraction of offense generally, which reduced the number of base runners.

 

A contraction of offense reduces stolen bases in the short run, by reducing baserunners, but increases stolen bases in the longer run, once strategy shifts and managers realize that they need to be more aggressive to generate offense.  By 1904 (Honus Wagner era) a league-leading stolen base total was down to 53.   This bounced up to 75 by 1913 (Ty Cobb era) due to the interaction of changing strategy and the cork center ball.  This number had already dropped to 59 before the lively ball era started (by 1919), and dropped steadily for twenty years after the lively ball era began.  By 1942 a league-leading stolen base count was down to 31—forty percent of what it had been in the Ty Cobb era.   Stolen bases came back into play during World War II, due to the shortage of power, and reached a peak of 42 in 1946, then began to decline again, dropping to an all-time low of 28 by 1956 (1952-1956).   

Luis Aparicio started to bring stolen bases back; Aparicio handed off the ball to Wills, Wills to Lou Brock, Brock to Rickey Henderson.   From 1956 to 1986 there was almost constant increase in the numbers of stolen bases, from 28 up to 93—an increase substantially more than 200%.    From 1986 to 2004, as the Raines/Henderson/Vince Coleman era ended and the steroid era began, stolen bases for league leaders dropped from 93 to 52.    When the steroid era ended stolen bases began to creep back into the game, increasing to 59 by 2009.   Our current figure is 57, down a notch, although it seems more likely that this is a random measurement decline than an actual new direction for the category.  Our current number is essentially the historic average since 1890 (58), with a standard deviation of 19.3. 

In 1876 a ball that landed in fair territory and rolled foul was a fair ball. In 1876, using the fair/foul bunt, Ross Barnes hit .424. Levi Meyerle had hit .492 in 1871, in the National Association. The rule was changed in the winter of 1876-1877, eliminating the fair/foul bunt, and so batting averages were probably declining before our measurements start, before 1880. Our first measurement of a typical league-leading batting average (1880) is .386. By 1892 this was down to .361. Then they moved the pitcher’s mound back, from 45 feet to 60.5, and batting averages jumped to .413 (1892-1896 and 1893-1897). .413 is the all-time high; we’ve been back to .400, but never back to .413. Due to subsequent rule changes, and to pitchers discovering that they could throw curve balls from 60 feet that would have been impossible from 45, the league-leading batting average dropped back to .359 by 1908.

The cork-center ball drove averages up to .393 by 1913 (Ty Cobb era).  Gravity pulled them back down to .363 by 1918, and then the lively ball era—Hornsby, Heilmann and Sisler—sent them skyrocketing again, up to .403 by 1924.   The early 1920s are the only era since 1900 in which the typical league-leading batting average has been over .400.

Although power flourished until 1930, league-leading batting averages had already dropped to .389 by 1930.   They fell rapidly after that, driven down by the increases in strikeouts; the norm was down to .370 by 1940, to .357 by 1950, to .344 by 1956 and to .332 by 1966, ignoring a small uptick in the very early 1960s.    Between 1924 and 1966—42 years—a typical league-leading batting average had fallen more than 70 points.

Baseball tried to bring offense back into the game in 1969, lowering the pitcher’s mounds, and artificial turf drove averages up a little bit; by 1977 (Rod Carew era) the figure was up to .346.   It faded a little bit, rallied to .351 by 1988 (Wade Boggs/Tony Gwynn era), then faded again.   By 1997 the typical league-leading batting average was up to .361—the highest it had been since 1943.   Since 1997 league-leading batting averages, under pressure from increasing strikeouts, have dropped to .346.

Well, I could do Total Bases, but maybe I am going long here. Grounding-into-double-play totals have been essentially stable since the category was introduced; the entire range of figures there is barely over 10% from top to bottom. One could almost say there are no trends in GIDP. Hit Batsmen were at 38 as a league-leading figure in 1900, and declined constantly until 1949, reaching a low of 9 in 1949. The introduction of batting helmets allowed the batters to crowd the plate with less fear, and reversed the trend; by 1971 we were back up to 21. This dropped to 12 by 1981, due probably to policies about no tolerance for bean balls (ejecting pitchers who threw at batters, etc.), then resumed its ascent; by 2001 we were back up to 25. We’re now at 21.

On base percentages have always been in the .400s, but have ranged from the top to the bottom of the .400s. They started at .412 (1880), went up to .499 (1891), down to .423 (1904), back up to .498 (1920), down to .447 (1942), up to .477 (1954), down to .410 (1967), up to .436 (1971), down to .420 (1984), all the way back to .482 (2004), and now down to .436 (2011).

Slugging percentages have reached peaks of .640 (1895), .725 (1930), .660 (1958), and .703 (2004), offset by lows of .507 (1909), .582 (1946), and .559 (1986).   We are now at .613, which is near the historic norm of .622.  

OK...well, I can’t tell you how disappointed I am that I got through that without getting the opportunity to use the words "nadir" and "zenith". I feel like I wasted my high school English classes.

 

On the Vulnerability of Records

This all started, really, because of doubles.  I rather rashly predicted here, several years ago, that the career record for doubles would be broken by someone before 2020.   Tris Speaker has held the career record for doubles for an exceedingly long time, about 90 years as of now; close to 90.    A few years ago, when we had several league leaders who had figures of 55 and up, I predicted that someone would make a run at this record.

Who?   Don’t know.   It is rather curious:  the conditions to make a run at this record exist and have existed since about 1995, but no one has stepped forward to make a challenge.   The percentage of hits which are doubles is higher now than it has been through almost all of baseball history.   This relates to the strikeouts.   Modern hitters accept strikeouts in order to hit the ball harder.    This leads to more home runs, some decrease in batting averages, but the decrease in balls in play is offset by higher in-play batting averages—and also, incidentally, more doubles as a percentage of hits.  

The career record for doubles is 792 or 793, and the normal league-leading doubles total is 49.6, so the record represents only sixteen years (15.96 years) worth of league-leading type performance.   Historically, that marks doubles as a vulnerable record; not actually a DOOMED record, but vulnerable.   If the ratio gets significantly below 15, I regard the record as doomed.   We’re not quite there.

We are there, in terms of strikeouts and GIDP; the career record for strikeouts (2,597, by Reggie) is only fourteen years’ worth of league-leading performance, so that could go at any time. (Don’t tell the government; they’ll want to add it to the endangered species list, and some bureaucrat will show up at Comerica Park, telling Justin Verlander he is no longer allowed to pitch to Mark Reynolds.) GIDP are even worse; the career record is only 350 (Cal Ripken), and the typical league-leading figure is 27, so that’s only thirteen seasons’ worth.

Of course, strikeouts and GIDP are negative, so the players who lead the league in those things may not tend to stick around as long, and that may cause those relationships to be different than the normal career-to-season ratios in positive categories. You don’t want to lead the league in strikeouts and double plays. You may not want to lead the league in being hit with the pitch, either; in any case the record for that (287, still held by 19th century shortstop Ee-yah Jennings) is now only 13.5 seasons’ worth of league-leading type performance. Let’s chart this:

| Category | Record | Record Holder | League Leading Figure | Seasons Required to Break Record |
|---|---|---|---|---|
| GIDP | 350 | Ripken | 27 | 12.8 |
| Hit By Pitch | 287 | Hughie Jennings | 21 | 13.5 |
| Strikeouts | 2597 | Reggie | 190 | 13.7 |

Just establishing the form.   Doubles is the most vulnerable of the positive records:

| Category | Record | Record Holder | League Leading Figure | Seasons Required to Break Record |
|---|---|---|---|---|
| GIDP | 350 | Ripken | 27 | 12.8 |
| Hit By Pitch | 287 | Hughie Jennings | 21 | 13.5 |
| Strikeouts | 2597 | Reggie | 190 | 13.7 |
| Doubles | 792 | Tris Speaker | 50 | 16.0 |

Followed by home runs and RBI:

| Category | Record | Record Holder | League Leading Figure | Seasons Required to Break Record |
|---|---|---|---|---|
| Home Runs | 762 | Sterry Bonds | 44 | 17.2 |
| RBI | 2297 | Hank Aaron | 132 | 17.4 |

The records for home runs and RBI are somewhat vulnerable, in that the ratio is less than 18-1, but these records are becoming very rapidly LESS vulnerable, since home run totals are in rapid decline. There was a window that opened to break these records, and Bonds made it through that window, but the window is now closing.  The records for runs scored, hits, total bases and at bats are in that 18-to-21 years range where records cannot be said to be beyond reach, but certainly would not be described as vulnerable:

| Category | Record | Record Holder | League Leading Figure | Seasons Required to Break Record |
|---|---|---|---|---|
| Runs | 2295 | Rickey Henderson | 123 | 18.7 |
| Total Bases | 6856 | Aaron | 357 | 19.2 |
| Hits | 4256 | Pete Rose | 212 | 20.0 |
| At Bats | 14053 | Pete Rose | 672 | 20.9 |

While the other records are, at least for the moment, out of reach:

| Category | Record | Record Holder | League Leading Figure | Seasons Required to Break Record |
|---|---|---|---|---|
| Plate Appearances | 15861 | Pete Rose | 748 | 21.2 |
| Games | 3562 | Pete Rose | 162 | 22.0 |
| Triples | 309 | Sam Crawford | 14 | 22.2 |
| Walks | 2558 | Bonds | 114 | 22.5 |
| Stolen Bases | 1406 | Rickey Henderson | 57 | 24.6 |

So doubles is the vulnerable record—despite the fact that nobody is mounting any kind of a charge against it.
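As a rough sketch of the arithmetic behind these tables, the snippet below divides each career record by its league-leading figure and sorts the result into a bucket. The cutoffs of 15, 18 and 21 come from the text, but treating "significantly below 15" as a hard cutoff at exactly 15 is an assumption, and so are the function names and bucket labels. Because the league-leading figures printed in the tables are rounded, a few computed ratios differ slightly from the printed ones (GIDP comes out near 13.0 rather than 12.8).

```python
# Career record and (rounded) league-leading figure, as printed in the tables.
RECORDS = {
    "GIDP": (350, 27),
    "Hit By Pitch": (287, 21),
    "Strikeouts": (2597, 190),
    "Doubles": (792, 50),
    "Home Runs": (762, 44),
    "RBI": (2297, 132),
    "Runs": (2295, 123),
    "Total Bases": (6856, 357),
    "Hits": (4256, 212),
    "At Bats": (14053, 672),
    "Plate Appearances": (15861, 748),
    "Games": (3562, 162),
    "Triples": (309, 14),
    "Walks": (2558, 114),
    "Stolen Bases": (1406, 57),
}


def seasons_required(record, league_leading):
    """Seasons of league-leading performance needed to reach the record."""
    return record / league_leading


def classify(ratio):
    """Bucket a ratio; the hard cutoff at 15 is an assumed reading of the text."""
    if ratio < 15:
        return "doomed"
    if ratio < 18:
        return "vulnerable"
    if ratio < 21:
        return "within reach, not vulnerable"
    return "out of reach"


def ranked(records=RECORDS):
    """Return (category, ratio) pairs, most vulnerable first."""
    rows = [(name, seasons_required(rec, lead)) for name, (rec, lead) in records.items()]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, ratio in ranked():
        print(f"{name:<18} {ratio:5.1f}  {classify(ratio)}")
```

Keep in mind the caveat above: the "doomed" label on the three negative categories (GIDP, hit by pitch, strikeouts) may overstate things, since players who lead the league in those tend not to last as long.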

 

The Pace of Change, or

Where Are We Right Now?

OK, this is the neat part of the study; aren’t you glad you stayed with me?   As I was doing this, it occurred to me that I could measure the pace at which baseball was changing by comparing the standards for each season to the standards from previous seasons.   I used a similarity score.   If the league-leading standards are similar to those of earlier seasons, then the pace of change is slow; if they’re different, the pace of change is more rapid.  
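The formula for the similarity score is not spelled out here, so the following is only a hypothetical illustration of the general idea: start at 1000, subtract a penalty for each category in proportion to how far apart the two seasons' league-leading standards are, and floor the result at zero. The function name, the `scale` value and the relative-difference penalty are all assumptions, and the comparison season in the usage lines is invented for illustration.

```python
def similarity_score(standards_a, standards_b, scale=1000.0):
    """Hypothetical score: 1000 minus scaled relative differences, floored at 0.

    Each argument maps a category name to that season's league-leading standard.
    This is NOT the article's actual formula, only a sketch of the concept.
    """
    penalty = 0.0
    for category in standards_a.keys() & standards_b.keys():
        a, b = standards_a[category], standards_b[category]
        midpoint = (a + b) / 2
        if midpoint > 0:
            # Relative difference, so that triples and at bats are comparable.
            penalty += scale * abs(a - b) / midpoint
    return max(0, round(1000 - penalty))


# Three of the 2011 standards from the tables in this article.
season_2011 = {"HR": 44, "2B": 50, "SB": 57}
# An invented comparison season, for illustration only.
other_season = {"HR": 44, "2B": 45, "SB": 57}

same = similarity_score(season_2011, season_2011)
close = similarity_score(season_2011, other_season)
```

Under this scheme identical standards score 1000, and the score drops toward zero as the standards diverge, which is the behavior the text relies on when it reads a high score as a slow pace of change.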

This chart compares the league-leading standards of 2011 to those of 1971:

Records_One

Is that uncanny, or what?   With the end of the steroid era, almost every number has gone back to precisely what it was 40 years ago.    With a huge detour, baseball has returned to the norms that were current in the game in 1971.  There’s an increase in doubles, a little change in the strikeouts and walks, but almost all of the numbers are back where they were the year I graduated from college.

What this means is, records are now legitimate again. This era we have had, where nobody wants to look at the numbers because they’re out of whack with the history of the game... it’s over. In all of the history of baseball, there is no other 40-year period in which the game so closely resembles the game of 40 years earlier; in fact, there is nothing remotely close to that. The similarity score, comparing 1971 to 2011, is 889. The highest previous similarity score, comparing any two seasons 40 years apart, was 838, which was based on comparing 1970 to 2010. But for context, let’s look at some other 40-year gaps:

Records_Two

Over time, baseball has become more and more stable. Statistical standards in each generation have evolved more slowly than they did in the previous generation. We can demonstrate that this is a true statement by the following method. We can figure the similarity score for each season compared to the season 3 years earlier, 5 years earlier, 10 years earlier, 20 years earlier, 30 years earlier, 40 years earlier, and 50 years earlier. Then we can average out those similarity scores over a 20-year period to see the trend line in the pace of change (looking for zeniths and nadirs). Looked at in that way, the 20-year stability score for 2011 is the highest of all time, breaking the previous record set in 2010, breaking the previous record set in 2009, breaking the previous record set in 2008. Almost every year in baseball history has represented a new high-water mark in terms of the long-term stability of statistical standards.
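The averaging method just described can be sketched roughly as follows. The lags (3, 5, 10, 20, 30, 40 and 50 years) and the 20-year window come from the text; the function names, the use of a trailing window ending in the target year, and the choice to skip lags that reach back before the first available season are assumptions. The `similarity` argument stands in for whatever similarity score is being used.

```python
from statistics import mean

LAGS = (3, 5, 10, 20, 30, 40, 50)


def season_stability(year, similarity, first_year, lags=LAGS):
    """Mean similarity of `year` to the seasons `lags` years earlier.

    Lags that reach back before `first_year` are skipped (an assumption).
    """
    scores = [similarity(year, year - lag) for lag in lags if year - lag >= first_year]
    return mean(scores) if scores else None


def stability_trend(year, similarity, first_year, window=20, lags=LAGS):
    """Average the per-season stability over the `window` seasons ending in `year`."""
    values = []
    for y in range(year - window + 1, year + 1):
        s = season_stability(y, similarity, first_year, lags)
        if s is not None:
            values.append(s)
    return mean(values) if values else None


def toy_similarity(year_a, year_b):
    """Stand-in score for demonstration only: loses one point per year of gap."""
    return 1000 - abs(year_a - year_b)


trend_2011 = stability_trend(2011, toy_similarity, first_year=1876)
```

With a real similarity function plugged in, calling `stability_trend` for each year in turn would give the trend line whose high-water marks are described above.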

This is not to say that there are no aberrations in the data; of course there are.   The ten-year similarity of 2011 as compared to 2001 is only 672, and the ten-year similarity of 2002 as compared to 1992 was only 518—whereas 1990 compared to 1980 was at 933, and even 1973 as compared to 1963 was at 837.   The steroid era was, of course, a huge departure from baseball’s historic norms; we all know that.    We can think of it as baseball having an affair with its secretary.   It has now gone back to its wife.  My point is that, despite this, the changes in each generation, as a whole, are smaller than the changes in the generation before.  Let me chart the five-year changes over time:

  

 

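| Year | Compared To | Similarity | Year | Compared To | Similarity |
|---|---|---|---|---|---|
| 2011 | 2006 | 812 | 1947 | 1942 | 766 |
| 2010 | 2005 | 807 | 1946 | 1941 | 629 |
| 2009 | 2004 | 784 | 1945 | 1940 | 579 |
| 2008 | 2003 | 805 | 1944 | 1939 | 653 |
| 2007 | 2002 | 786 | 1943 | 1938 | 685 |
| 2006 | 2001 | 830 | 1942 | 1937 | 851 |
| 2005 | 2000 | 871 | 1941 | 1936 | 913 |
| 2004 | 1999 | 854 | 1940 | 1935 | 951 |
| 2003 | 1998 | 892 | 1939 | 1934 | 842 |
| 2002 | 1997 | 796 | 1938 | 1933 | 839 |
| 2001 | 1996 | 786 | 1937 | 1932 | 787 |
| 2000 | 1995 | 703 | 1936 | 1931 | 777 |
| 1999 | 1994 | 707 | 1935 | 1930 | 808 |
| 1998 | 1993 | 702 | 1934 | 1929 | 877 |
| 1997 | 1992 | 706 | 1933 | 1928 | 809 |
| 1996 | 1991 | 776 | 1932 | 1927 | 777 |
| 1995 | 1990 | 838 | 1931 | 1926 | 725 |
| 1994 | 1989 | 840 | 1930 | 1925 | 747 |
| 1993 | 1988 | 841 | 1929 | 1924 | 847 |
| 1992 | 1987 | 862 | 1928 | 1923 | 902 |
| 1991 | 1986 | 792 | 1927 | 1922 | 842 |
| 1990 | 1985 | 842 | 1926 | 1921 | 753 |
| 1989 | 1984 | 882 | 1925 | 1920 | 502 |
| 1988 | 1983 | 920 | 1924 | 1919 | 297 |
| 1987 | 1982 | 918 | 1923 | 1918 | 364 |
| 1986 | 1981 | 830 | 1922 | 1917 | 438 |
| 1985 | 1980 | 867 | 1921 | 1916 | 607 |
| 1984 | 1979 | 863 | 1920 | 1915 | 745 |
| 1983 | 1978 | 881 | 1919 | 1914 | 678 |
| 1982 | 1977 | 886 | 1918 | 1913 | 689 |
| 1981 | 1976 | 848 | 1917 | 1912 | 729 |
| 1980 | 1975 | 872 | 1916 | 1911 | 805 |
| 1979 | 1974 | 916 | 1915 | 1910 | 687 |
| 1978 | 1973 | 821 | 1914 | 1909 | 676 |
| 1977 | 1972 | 841 | 1913 | 1908 | 670 |
| 1976 | 1971 | 801 | 1912 | 1907 | 745 |
| 1975 | 1970 | 844 | 1911 | 1906 | 871 |
| 1974 | 1969 | 918 | 1910 | 1905 | 835 |
| 1973 | 1968 | 866 | 1909 | 1904 | 818 |
| 1972 | 1967 | 881 | 1908 | 1903 | 720 |
| 1971 | 1966 | 880 | 1907 | 1902 | 729 |
| 1970 | 1965 | 855 | 1906 | 1901 | 704 |
| 1969 | 1964 | 850 | 1905 | 1900 | 704 |
| 1968 | 1963 | 765 | 1904 | 1899 | 666 |
| 1967 | 1962 | 755 | 1903 | 1898 | 705 |
| 1966 | 1961 | 749 | 1902 | 1897 | 683 |
| 1965 | 1960 | 749 | 1901 | 1896 | 756 |
| 1964 | 1959 | 790 | 1900 | 1895 | 793 |
| 1963 | 1958 | 770 | 1899 | 1894 | 861 |
| 1962 | 1957 | 813 | 1898 | 1893 | 710 |
| 1961 | 1956 | 885 | 1897 | 1892 | 628 |
| 1960 | 1955 | 887 | 1896 | 1891 | 684 |
| 1959 | 1954 | 861 | 1895 | 1890 | 730 |
| 1958 | 1953 | 852 | 1894 | 1889 | 751 |
| 1957 | 1952 | 851 | 1893 | 1888 | 793 |
| 1956 | 1951 | 857 | 1892 | 1887 | 812 |
| 1955 | 1950 | 869 | 1891 | 1886 | 583 |
| 1954 | 1949 | 886 | 1890 | 1885 | 603 |
| 1953 | 1948 | 865 | 1889 | 1884 | 299 |
| 1952 | 1947 | 820 | 1888 | 1883 | 81 |
| 1951 | 1946 | 695 | 1887 | 1882 | 0 |
| 1950 | 1945 | 691 | 1886 | 1881 | 183 |
| 1949 | 1944 | 773 | 1885 | 1880 | 157 |
| 1948 | 1943 | 864 | | | |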

 

         

 

In this chart blue indicates stability over the five-year period, red indicates instability or rapid change over the five-year period, and dizziness or confusion indicates color blindness.   The chart highlights the periods of rapid change in baseball history—1880 to 1897, 1916 to 1925, the war years, and, to an extent, the steroid era.   But even the changes in the steroid era, while they were the largest since the end of World War II, were not as rapid as some of the changes that took place earlier in baseball’s history. 
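One way to pull periods of rapid change out of the five-year similarity figures is to look for runs of consecutive seasons that fall below some cutoff. The sketch below does that for the 1915 to 1930 seasons, using the similarity values listed in the chart. The cutoff of 700 and the function name are assumptions chosen for illustration; the article does not state where stability ends and rapid change begins.

```python
# Five-year similarity scores for 1915-1930, copied from the chart
# (each season compared to the season five years earlier).
FIVE_YEAR_SIMILARITY = {
    1915: 687, 1916: 805, 1917: 729, 1918: 689,
    1919: 678, 1920: 745, 1921: 607, 1922: 438,
    1923: 364, 1924: 297, 1925: 502, 1926: 753,
    1927: 842, 1928: 902, 1929: 847, 1930: 747,
}


def rapid_change_runs(scores, threshold=700):
    """Return (first_year, last_year) runs of consecutive seasons below `threshold`.

    The threshold is an assumed cutoff, not one taken from the article.
    """
    runs = []
    start = prev = None
    for year in sorted(scores):
        if scores[year] >= threshold:
            continue
        if start is None:
            start = year
        elif year != prev + 1:
            # A gap means the previous run ended; record it and start a new one.
            runs.append((start, prev))
            start = year
        prev = year
    if start is not None:
        runs.append((start, prev))
    return runs


runs = rapid_change_runs(FIVE_YEAR_SIMILARITY)
```

With this cutoff the longest run in the sample is 1921 through 1925, which falls inside the 1916 to 1925 period of rapid change noted above; a different cutoff would draw the boundaries differently.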

 
 

COMMENTS (7 Comments, most recent shown first)

jollydodger
I'm quite a bit younger than Bill, but I first started memorizing stats from the backs of baseball cards in the late 80s/early 90s. It's good to see league-leading stats come back down to earth.

I was as enamored as anyone with what the Bonds/McGwire/Sosa crew did in their time, but the numbers are absurd.

I guess just in my mind's eye, it's more fun to compare guys from different eras when their numbers are closer.....Dawson, Strawberry, and Murphy vs. Pujols, Cabrera, and Kemp.

I don't know, it feels good. It's probably healthy for the game, too.
7:43 PM Jan 26th
 
doncoffin
Is there a way to present the "standards" graphically? (For those of us who like graphs.) Seeing the way in which (for example) the "standard" league-leading total for doubles has changed over time might make the point more clearly. (Just a thought.)
1:08 PM Jan 25th
 
Robinsong
I think this article sheds light on several debates about steroids. The return to historical norms is consistent with the hypothesis that all of the increase in offense during 1993-2004 was due to PEDs - the other factors - bats, balls, strength training, stadiums - are all about the same now as during 1993-2004 (with the exception of the Rockies humidor), yet driving the steroids out restored the norms. It also provides significant support for Bill's argument that steroids have been largely eliminated. Finally, the explosion from 1993 to 2004 in league-leading numbers and the return to historical levels show that steroids do affect top performance. Bill has suggested that a primary effect was lengthening careers by improving the performance of old players. The problem is that old active players during the steroid era were far more likely to be using PEDs than young players for several reasons: desire to stay employed, selection bias (non-users retire earlier since they hit replacement level sooner), and learning about using steroids and sources of them. Steroids may flatten the age curve, but it is very hard to know how much unless you know who was using what PED and when they started. I think if Bonds had been using the same drugs in the early 90s, he would have hit truly astonishing performance levels.
12:50 PM Jan 24th
 
Trailbzr
The difficulty with projecting record-breaking is that some categories have less consistent leaders than others. BillJ posted a couple of years ago to the effect that the record for doubles (792 by Tris Speaker) should fall because it would be broken by 20 seasons of 40 apiece, and something like 15 players a year hit that many. BUT... it's not consistent; the only player who had hit more than 205 doubles the previous five seasons as of then was Brian Roberts (228), who's hit 21 in reduced playing time since.

Craig Biggio is the leader among players recently retired, with 668. He had four full-time seasons of less than 25. The season after leading the league with 51 and 56, he hit 13 in 101 games (projects to 21/162). Todd Helton has hit 44/162 for 15 seasons. He's still 238 away from the record at age 37.
10:26 AM Jan 24th
 
studes
The tables have been added back in. Sorry about that.
9:29 PM Jan 23rd
 
rtayatay
Does an increasing number of players/teams over time impact the evaluation of stability? If so, that would highlight the steroid era differences even more.
8:18 PM Jan 23rd
 
Robinsong
Great article, but the first two charts in the last section do not show up. Can this be fixed?

I found it particularly interesting that stolen bases is now the least vulnerable of the batting records, since I first became aware of Bill's work from the SI article predicting its demise.
5:11 PM Jan 23rd
 
 