By Bill James

January 30, 2019

**On Babe Ruth**

**Lost In Time**

Joe Posnanski posted a poll a day or two ago asking about Babe Ru. . . well, here; I’ll just copy the poll:

What do you think is Babe Ruth’s LIKELY level if you plucked him out of 1927 and put him in today’s big leagues.

Superstar

Star (after adjustments)

Average to below average

Could not make it

The poll got 8,298 votes. 21% (including me) said "Superstar", 25% said "Star (after adjustments)", 24% said "Average to below average", and 30% said "Could not make it".

I am quite certain that most people are wrong about this, and actually more certain about that than I was at the time I voted, but that’s getting ahead of myself. Accompanying my vote, I posted the following:

What people don't get is that the difference between TEAMS is much less than the diff between PLAYERS, & the diff between LEAGUES is less than teams. So there is enormous improvement since 1927, but on the scale of LEAGUES. On the scale of PLAYERS it isn't nearly as large.

My post got 8,299 responses; take that, Joe. Just kidding. Anyway, one response to my post was:

**I’ve read this tweet four times and it’s the syntax -- that’s what I don’t understand.**

Well, I’d help you out, dude, but honestly I have never understood what "syntax" is. Seriously. If you held a gun to my head I couldn’t define the word "syntax". It has something to do with words, I think. But I posted this just in case it helps:

I’ve been trying to explain it to baseball people for 30 years and they rarely understand. Let’s say an average team is .500, a pennant-winning team is .600, and Babe Ruth is .850. If the average TEAM moves from .500 to .600, that’s an enormous change, but it doesn’t do much to Babe Ruth.

All of that is by way of background. I’ve done a little study here or half-study or thought experiment or modelling exercise or something; whatever it is, it relates to the issue and breaks new ground for me, so I hope it is worth sharing.

The three questions I am trying to get to here are:

What is the standard deviation of performance level (or winning percentage) among **players**?

What is the standard deviation of performance level (or winning percentage) among **teams**?, and

What is the standard deviation of performance level (or winning percentage) among **leagues**?

It is very easy to know what the standard deviation of winning percentage is for __teams__. You can calculate that for a season, you can calculate it for a decade, or for all of baseball history; that’s easily done, has been done thousands of times by thousands of people. That’s easy.

The standard deviation for players is not nearly as easy to measure. It is difficult to measure for at least three reasons. First, given a player’s performance over the course of a season, his stats, it is difficult to determine exactly what his effective winning percentage was. Different experts could address that problem, and they would come up with somewhat different answers, because there are unknowns in the process sufficient to cause meaningful discrepancies. Second, even given what a player’s effective winning percentage was, it is difficult to know to what extent this represented his true skill level. You could take 50 players of exactly the same skill level and let them play out a season; some of them would hit .300 and some of them would hit .250, just because that’s what happens, that’s baseball. What we need to know is the standard deviation of skill level. And third, you would get a very different answer to the question of "what is the standard deviation of the effective winning percentage of players" if you studied only players who had 600 plate appearances than you would if you studied players with 200 plate appearances, or with 50 plate appearances.

The standard deviation of skill level for players is very difficult to measure, but it could theoretically be measured. But the standard deviation of skill level for *leagues* is almost impossible to ascertain from data. Since one player always fails when another player succeeds, since the winning percentage of each league is .500, all leagues measure as being the same, even if they are not actually the same. You can gain some insight into the problem by studying interleague play, but that data sample is too small to be remotely reliable, and doesn’t address the problem of 2008 compared to 2018, which is what we are interested in. You can gain some insight into the problem by studying players moving between leagues, but that’s highly speculative, and you can gain some insight into the problem by studying college baseball, where teams from different leagues routinely play against one another, but that’s almost entirely irrelevant to the discussion of major league teams. Basically, we don’t have a clue what the standard deviation of performance levels between different leagues is. There’s no way to calculate it from the data.

BUT.

Perhaps we could calculate it from a model? We cannot calculate what the standard deviation of performance level for different major leagues IS, but perhaps we can estimate from a model what it SHOULD BE. We can estimate what we would expect it to be, based on what we know.

Suppose that we represent the performance level of an average player at .500, and the performance level of an individual player as a random number between zero and one. That’s his winning percentage, let us say. In a spreadsheet, I thus "created" 2 million players, which is vastly larger than the actual number of players in major league history, but each player was just one cell of the spreadsheet, one random number.

Suppose, then, that each 15 players represents a team. Please don’t debate with me the number "15"; I know that a major league team has 25 players, not 15, but a regular player, healthy all year, is much more than 1/25th of his team. If I had used 25 players to represent a team, then the "diminishing improvement" implication which results from the model would be stronger than it is, and readers would correctly point out that I had over-stated the results by using too many players to represent a team.

And suppose, then, that each 15 *teams* represents a *league*. There are 15 teams in each league now, used to be 8, but that still kind of works, because two eight-team leagues would be 16 teams, and thus our measurement reasonably approximates the random difference between 1934 and 1935, assuming nothing big happened in the winter of 1934/35.

Given a sample of 2 million "players" with an average of .500, the standard deviation of skill level was .288, actually .288573, if you need to know.

Let us assume that each fifteen players, generated at random, represent a team. What, then, is the standard deviation of skill level for teams?

It is .068.

Now that’s a really interesting figure. When I saw that figure, I realized that I would have to publish the study.

Why is that a really interesting number? Some of you already know.

Because that’s almost exactly what the standard deviation of winning percentage for teams actually is. Over the last five years (2014 to 2018) the standard deviation of winning percentage for teams is .070.

I didn’t expect that to happen. It’s just a really simple model with some obviously unrealistic assumptions; I assumed that when I calculated the output from the model we would get a set of obviously unrealistic numbers, and then we could get into a discussion of how the output of the model relates to real life. But there is only one output data point from the model that we know what the real-life number ought to be, and the way it relates is: it’s the same.

That doesn’t answer ALL of the questions about how the data in the model relates to real life, not by miles and miles, but it advances the ball. It takes away certain issues of scale. There are still a lot of other problems.

Anyway, moving ahead. . . . What is the standard deviation of winning percentage for *leagues*?

It’s .017.

Standard deviation of Winning Percentage for players: .288

Standard deviation of Winning Percentage for teams: .068

Standard deviation of Winning Percentage for leagues: .017
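The model is simple enough to re-run outside a spreadsheet. What follows is a minimal sketch in Python, not the original spreadsheet: it assumes each player is an independent random number between zero and one, that a team is the plain average of 15 players, and that a league is the plain average of 15 teams. The function names, the seed, and the sample size (450,000 players rather than 2 million, to keep it quick) are all choices made for illustration.

```python
import math
import random


def sd(values):
    """Population standard deviation of a list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def simulate(n_leagues=2000, teams_per_league=15, players_per_team=15, seed=0):
    """Return (player SD, team SD, league SD) for randomly generated players."""
    rng = random.Random(seed)
    players, teams, leagues = [], [], []
    for _ in range(n_leagues):
        league_teams = []
        for _ in range(teams_per_league):
            # Each player is one uniform random number: his "winning percentage".
            roster = [rng.random() for _ in range(players_per_team)]
            players.extend(roster)
            # Assumption: a team is the simple average of its players.
            league_teams.append(sum(roster) / players_per_team)
        teams.extend(league_teams)
        # Assumption: a league is the simple average of its teams.
        leagues.append(sum(league_teams) / teams_per_league)
    return sd(players), sd(teams), sd(leagues)


sd_players, sd_teams, sd_leagues = simulate()
print(f"players: {sd_players:.3f}")
print(f"teams:   {sd_teams:.3f}")
print(f"leagues: {sd_leagues:.3f}")
```

Under these assumptions, the textbook rule that an average of n independent draws has a standard deviation of σ/√n gives about .075 for teams and .019 for leagues. A rerun should land in the same neighborhood as the figures above without necessarily matching them to the third decimal. The shrinkage from players to teams to leagues is the point.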

Now, agreeing for the sake of argument that the quality of play in major league baseball improves steadily over time and has improved quite significantly since 1927. . . agreeing for the sake of argument, and also because it is pretty obviously true. But the next question is, does the improvement in play over time take place on the scale of PLAYERS, or on the scale of LEAGUES?

It seems obvious that it takes place on the scale of differences between *leagues*. Since the game of baseball is even larger than a league, it could reasonably be argued that the scale of change is even smaller than is reflected in this "league" number.

Comparing 1970 to 1960, for example. . . well, bad example. Baseball expanded from 16 teams to 24 in the 1960s, which probably caused a step backward in the overall level of play, rather than a step forward. Comparing 1980 to 1970, for example, it would seem to be almost certainly true that the quality of play in the National League in 1980 was higher than it was in 1970.

But is it likely that every TEAM in 1980 was better than any TEAM in 1970? The best team in the National League in 1970 was the Cincinnati Reds, 102-60. The worst team in the National League in 1980 was the Chicago Cubs, 64-98. You don’t really think that the 1980 Chicago Cubs could beat the 1970 Cincinnati Reds, do you?

No, of course you don’t, because changes in the quality of play don’t operate on that scale. To suggest that every player in the National League in 1980 was better than any player in the National League in 1970 would be another gigantic leap. No one would argue that.

It seems apparent to me that changes in the game overall operate on the scale of leagues, but how large are they, on that scale?

Let us suppose that the improvement in play in baseball in each ten-year period is one standard deviation of league quality. That would be an ENORMOUS change. It is actually hard to believe that the change could be that large, except under unusual conditions such as comparing 1953 to 1943. If we assume that that was true, that would suggest that the improvement from 1927 to the present was about 9 standard deviations.

A difference of 9 standard deviations is. . . well, numbers don’t come that large. It never happens. It’s not a one-in-a-million difference; it is vastly beyond that. A one-in-a-million difference is like 4 standard deviations or something, I forget exactly. Five standard deviations would be many times larger than four standard deviations. If you measured every human being on earth by height, weight, hair quantity, blood pressure and 50 other measurements, I would doubt that any human being would be nine standard deviations above the norm in any category. The 600-pound woman would not be nine standard deviations above the norm in terms of weight; she might be as much as six, maybe. Somebody will probably post and tell me I am wrong; she would actually be eleven. Whatever; I am just trying to explain that it is an enormous difference.
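To put rough numbers on how rare these differences are, here is a small sketch that assumes the thing being measured follows a normal (bell-shaped) curve, which real measurements such as weight only approximate. The helper name `upper_tail` is made up for this example; `math.erfc` is the standard library's complementary error function.

```python
import math


def upper_tail(z):
    """Chance that a standard normal value falls more than z SDs above the mean."""
    return 0.5 * math.erfc(z / math.sqrt(2))


for z in (4, 4.75, 5, 6, 9):
    p = upper_tail(z)
    print(f"{z} standard deviations: about 1 in {1 / p:,.0f}")
```

On that assumption, one in a million on the high side falls at about 4.75 standard deviations, and nine standard deviations is rarer than one in a billion billion.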

But this is my point: that, in the model that I established, Babe Ruth would be at .999, and one standard deviation above the norm **for a player** would be at .788. If the norm moves forward on the scale of players, the difference is relevant to Ruth.

But on the scale of leagues, Babe Ruth is .999 compared to .500. If the league moves forward by one standard deviation, he is at .999 compared to .517. If it moves forward by two standard deviations, he is at .999 compared to .534. It makes very little difference to Babe Ruth. It is impossible to see how the norm __for a league__ could move forward by such a large margin that it buries Babe Ruth. It just can’t happen, once the system is organized. It can happen before the system is organized—that is, before 1890—but not after the system is organized.
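The arithmetic here is easy to extend to the nine-standard-deviation case. This is only a sketch using the model's own figures (.999 for Ruth, .017 for a league standard deviation, .288 for a player standard deviation); the constant and function names are invented for the example.

```python
RUTH = 0.999        # Ruth's value in the model
BASE_NORM = 0.500   # the average player
LEAGUE_SD = 0.017   # model's standard deviation for leagues
PLAYER_SD = 0.288   # model's standard deviation for players


def norm_after(steps, step_size):
    """Where the norm sits after moving forward `steps` standard deviations."""
    return BASE_NORM + steps * step_size


for steps in (1, 2, 9):
    league_norm = norm_after(steps, LEAGUE_SD)
    print(f"{steps} league SDs: norm {league_norm:.3f}, "
          f"Ruth ahead by {RUTH - league_norm:.3f}")

# For contrast, a single step on the scale of players.
player_norm = norm_after(1, PLAYER_SD)
print(f"1 player SD: norm {player_norm:.3f}, Ruth ahead by {RUTH - player_norm:.3f}")
```

Even nine league-sized steps only move the norm to .653, still far short of .999.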

Years ago, in a SABR publication, somebody posted a "proof" that baseball quality is constant over time, which went something like this; I forget the exact details.

Ty Cobb hit .350 in 1907, leading the league in hitting, and hit .357 in 1927.

Eddie Collins hit .347 and .324, as a regular, in 1909 and 1910, and hit .360 and .349, as a regular, in 1923 and 1924.

Babe Adams was 18-9 in 1910, and 14-5 in 1921, with a better ERA relative to the league in 1921 than in 1910.

Pete Alexander had a .683 winning percentage and a 2.57 ERA in 1911, and had a .677 winning percentage and a 2.52 ERA in 1927.

Eppa Rixey was 19-18 in 1921, and 19-18 in 1928.

Huck Betts was 3-7 in 1921, but 17-10 in 1934.

Ted Lyons was 12-11 with a 4.87 ERA in 1924, and was 14-6 with a 2.10 ERA in 1942.

Freddie Fitzsimmons was 14-10 with a 2.88 ERA in 1926, and 16-2 with a 2.82 ERA in 1940.

Al Benton had a 4.56 ERA in 1939, 4.44 in 1940, but 2.12 in 1949 and 2.37 in 1952.

Ted Williams was the best hitter in baseball in 1941, and the best hitter in baseball in 1960.

Pee Wee Reese hit .229 and .255, as a regular, in 1941 and 1942, but hit .309 and .282 in 1954 and 1955.

Warren Spahn was 21-10 in 1947, and 21-10 in 1960.

Bob Lemon was 20-14 in 1948, and 20-14 in 1956.

Warren Spahn was 23-7 in 1953, and 23-7 in 1963.

Ted Abernathy was 5-9 with a 5.93 ERA in 1955, but was 10-3 with a 2.59 ERA in 1970, and posted a 1.72 ERA in 1972.

Bob Gibson from 1959 to 1961 had ERAs of 3.32, 5.59 and 3.24. In 1972 and 1973 he had ERAs of 2.46 and 2.73.

Steve Carlton was 14-9 with a 2.98 ERA in 1967, and 23-11 with a 3.10 ERA in 1982.

Bert Blyleven was 16-15 with a 2.81 ERA in 1971, and 17-5 with a 2.73 ERA in 1989.

George Brett hit .333 in 1976, winning the batting title, and .329 in 1990, winning the batting title. He hit .282 in 1974, and .285 in 1992.

Paul Molitor hit .273 in 1978, and .281 in 1998, with more than 500 at bats each year.

Tony Gwynn hit .309, .351 and .317 from 1982 to 1984, and .372, .321 and .338 from 1996 to 1998.

Chuck Finley had a 4.11 ERA in 1988, and a 4.11 ERA in 2000.

Andy Pettitte was 21-8 in 1996, and 21-8 in 2003.

Bartolo Colon had a 3.71 ERA in 1998, and 3.43 in 2016.

I could do thousands more, but you get the point. If baseball improved dramatically, he asked, when did it improve?

There is a lot wrong with that argument, and when he published that argument in the mid-1970s I made fun of him for it, because, well, I’m kind of a jackass sometimes. There is a lot wrong with the argument, but there is something right about it, too. IF baseball was improving on the scale of teams—that is, if it was improving by .050 in a decade or something like that—then these comparisons would not be possible. It would not be possible for a player to remain dominant in his decline phase, seasons separated by 15 years or more, except in very rare situations where a player actually improved greatly over time. The fact that this happens is not evidence that baseball is not improving over time, but it IS evidence that the league is not taking large steps forward in each decade. The only way that Babe Ruth could NOT be a star player today is if the league was taking large steps forward in each decade.

**Late Night Addendum 1**

Expanding a little bit on the study published a couple of hours ago—

In this study there is (A) the standard deviation of talent among players, (B) the standard deviation of talent among teams, and (C) the standard deviation of talent among leagues.

B is a known variable and reasonably near a constant,

A is a partially known variable, and

C is completely unknown.

The critical question of the study is, What is the relationship of B to C? That’s REALLY what the study is about: If B is known, what is C likely to be?

In my model, it is extremely likely that the estimate of A is completely wrong, and that (because of that) the relationship of A to B in this study is mis-stated or mis-calculated.

A is mis-estimated in this model, because using a random number to represent the player’s value (A) creates a straight-line distribution, when in reality it is likely that the distribution of A represents either a bell-shaped curve or the right-hand portion of a bell-shaped curve. The standard deviation of winning percentage among players is almost certainly NOT .288, but some number significantly lower than that. I would suggest that it is probably .120 to .150, but that’s just a guess.

With regard to Babe Ruth, the model assumes that he is +.499 vs. an average player, but also that he is less than two standard deviations above the norm. In reality, he is almost certainly MORE than two standard deviations above the norm. He might be 3, he might be 3 and a half, but he’s more than 2. He’s not a .999 player; he is more like an .850 player, but against a standard deviation of .120, that’s close to three standard deviations above the norm.

Why, then, does the model hit the mark in regard to B?

Because the model assumed that the winning percentage of the team is the average of the winning percentage of the 15 players, when in reality it probably represents an exponent of the average. It’s probably the average to the power 2 vs (1-A), or to the power 4, or something. Depending on how you calculate the winning percentage of the player, which is a partially understood problem.

Basically, it’s off-setting errors with regard to A vs. B. A is wrong, and the relationship of A to B is wrong, but they happen to off-set to hit the target, so that B is what B ought to be. This was totally unexpected. I had no idea that B would turn out to be the actual B, so I wasn’t worried about the relationship of A to B.
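The off-setting can be checked with a rough sketch. Everything specific in it is an assumption: players are drawn from a bell curve centered on .500 with a standard deviation of .135 (the middle of the guessed .120 to .150 range), and "the average to the power 2 vs (1-A)" is read as avg² / (avg² + (1 - avg)²). That is one possible reading, not a settled formula, and the function names are hypothetical.

```python
import math
import random


def sd(values):
    """Population standard deviation of a list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def team_pct(avg, exponent):
    """Assumed team winning percentage: avg**k / (avg**k + (1 - avg)**k)."""
    return avg ** exponent / (avg ** exponent + (1 - avg) ** exponent)


def simulate_team_sd(player_sd=0.135, exponent=2, n_teams=30000, roster=15, seed=0):
    rng = random.Random(seed)
    results = []
    for _ in range(n_teams):
        # Bell-shaped player values centered on .500, clamped to the 0-1 range.
        players = [min(1.0, max(0.0, rng.gauss(0.5, player_sd)))
                   for _ in range(roster)]
        results.append(team_pct(sum(players) / roster, exponent))
    return sd(results)


team_sd = simulate_team_sd()
print(f"team SD, player SD .135, exponent 2: {team_sd:.3f}")
```

Near .500 this formula stretches differences by roughly the size of the exponent, so the expected result is about .135 / √15 × 2, or close to .070. A narrower spread among players combined with an exponent of 2 gets to the same place as a too-wide spread with plain averaging.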

BUT. Big But.

What is critical to the study is not the relationship of A to B, but the relationship of B to C. Is there any reason to believe that this problem has any impact on the relationship of B to C?

I can’t see that there is. It doesn’t seem to me that this problem has anything to do with the actual relationship of B to C. That’s why the study was exciting to me—because if B is the actual B, then it seems reasonable, based on the model, to assume that C is the actual C.

**The Other Problem**

Another issue here is that changes in HOW the game is played are not necessarily changes in how WELL the game is played. Not all changes in the game are changes for the better. The famous evolutionary biologist Stephen Jay Gould, who was a huge baseball fan, would make this point constantly: that evolution does not produce a BETTER species, it merely produces a DIFFERENT species which is better adapted to its own survival. I didn’t understand what the hell he was talking about at the time, but I get it now. Wish he was still around so I could tell him that.

Anyway, SOME of the changes that have taken place in baseball are not forward developments, but merely variations. Sideways evolution.

Suppose that in 1927, for some reason, baseball had split into two camps, with players NOT going back and forth from one to the other. Over time, the games played by the two camps would go in different directions. 90 years later, it would almost certainly be true that a dominant player from the West Coast league would not be equally dominant in the East Coast game—and vice versa. DIFFERENCE is not the same as QUALITY. The Japanese game, for example. The Japanese are baseball-obsessed at a much, much deeper level than Americans are, but the players from Japan who come to the states are sometimes at a disadvantage because they’re playing a significantly different game.

The "heavy bat" problem is not a QUALITY problem; it’s a DIFFERENT GAME problem. If Adam Ottavino went back to 1927, he almost certainly would have a six-plus ERA because his managers would make him try to pitch 9 innings a game, and he couldn’t do it. That’s the way the game was played then.

We moved away from the 9-inning starters because there is a competitive advantage in using relievers, but that advantage rests in large part on tactics that would be unsustainable in 1927. You can’t call players to the majors at the drop of a hat in 1927. If you’re playing in St. Louis and your top minor league team is playing in Rochester, you can’t get a pitcher there for tonight’s game, or tomorrow night’s game. You can’t watch video of his last start to see how he looks. You can’t check the internet to see how well he has been pitching. The strategies that dominate in 2019 wouldn’t work in 1927.

©2019 Be Jolly, Inc. All Rights Reserved.

## COMMENTS (47 Comments, most recent shown first)

FrankD

guy123 - thanks for the response. I did not know the pitchers-as-hitters difference has changed over such a long period of time.

2:58 PM Feb 7th

Guy123

*I don't think pitchers-as-hitters reveal much useful data since HS and College went to DHs.*

That's a reasonable hypothesis. But it turns out to be incorrect. Most of the decline in pitcher hitting took place before the introduction of the DH, which can't be a function of pitchers not hitting when they are young. Moreover, the decline in pitcher hitting (relative to position players) started 120 years ago, and it continues under every imaginable condition. It also tracks pretty well with the shrinking variation of player offense, another measure of improved overall quality. Both show relatively rapid increases in league quality in the first half of the 20th century, then slower improvement in more recent decades.

Pitchers in 2018 may not be exactly equal in hitting ability to the pitchers of 1927 (they are, after all, much taller and stronger on average), but there isn't any serious doubt that the general downward trend in pitchers' hitting is a reflection of the rising quality of position players.

9:06 PM Feb 4th

FrankD

I don't think pitchers-as-hitters reveal much useful data since HS and College went to DHs. And using track/field historical data has inherent problems that need to be resolved: mainly the different field conditions in the past vs. now. Yes, times for running races have fallen but the equipment including running surface has greatly improved, i.e., Jesse Owens ran on cinders and did not use a starting block. And it's not "Ryan threw fast"; it's that analysis indicates Ryan had thrown the fastest pitch ever timed. And, overall SAT scores have fallen over the last several years- does that indicate that today's players are dumber? Of course not, there have to be many corrections to SAT data as to why these scores in aggregate have fallen. If we don't make adjustments for conditions and other changes, when comparing data from different eras we are comparing different data sets, not subsets of data drawn from the same overall data set. I'm not rejecting that there are meaningful statistical differences in the overall population or the population of MLB players of say Ruth's time and ours - even after corrections. And these differences probably indicate a change in overall physical ability toward modern players being 'better'. I just think that without adjustments based on conditions we will greatly mis-estimate these differences.

7:34 PM Feb 4th

Guy123

The difference between Ruth being a "superstar" (150+ wRC+) today, as opposed to merely "very good," is actually quite small in terms of the speed of change being suggested. If players are improving at 0.3% per year, then Ruth would be about a 150 wRC+ hitter now; but if players are improving at 0.5% per year, then he would be about 125. That's because 90 years is a long time, so the cumulative impact of small differences becomes meaningful. But over a 10-year MLB career, that means talent is rising only 3-5%, which is very hard to observe and measure when mixed in with aging and various era-specific changes in the game.

Now, I certainly don't know for sure whether players are improving at a rate of 0.5% or 0.3%, or somewhere in between. But it's not a question you can answer with evidence like "Hank Aaron wasn't that big," or "Nolan Ryan threw really fast," or "Jose Altuve is just 5'-6" tall." You have to look at a lot of data -- on the shrinking variance of player performance over time, on size and weight of players, on changes in hitter performance from year-to-year, on pitchers-as-hitters. Every one of these methods tells us that players have gotten better over time -- faster in 1920s through 1960, more slowly since -- but can careful analysis find a rough consensus on the *pace* of change? That's what needs to be determined.
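The compounding in this comment can be sketched as follows. Two things are assumptions, not statements from the comment: that the starting point is a wRC+ of about 197 (roughly Ruth's career figure), and that the adjustment divides by the league's compounded improvement. The function name is invented for the example.

```python
RUTH_WRC_PLUS = 197  # assumption: roughly Ruth's career wRC+, not given in the comment
YEARS = 90


def era_adjusted(rating, annual_gain, years):
    """Deflate a rating against a league improving `annual_gain` per year, compounded."""
    return rating / (1 + annual_gain) ** years


for gain in (0.003, 0.005):
    adjusted = era_adjusted(RUTH_WRC_PLUS, gain, YEARS)
    decade = (1 + gain) ** 10 - 1
    print(f"{gain:.1%} per year: about {adjusted:.0f} today, "
          f"{decade:.1%} improvement per decade")
```

With those assumptions the two rates come out near the 150 and 125 quoted above, and a single decade moves the league by only 3 to 5 percent.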

7:33 AM Feb 3rd

Guy123

*The problem with this analysis is that it confuses the standard deviation of height AMONG ATHLETES with the standard deviation of height IN THE GENERAL POPULATION. Athletes, as stated by the poster, are pre-screened to represent for size. They are disproportionately selected from the largest people in the population; thus, they have a much smaller standard deviation of size than does the general population.*

I'm afraid Bill is mistaken here. The standard deviations I reported -- 2.1 inches for a player, or 0.14 for a league -- were calculated for MLB position players in 1927 (300+ PA). (Interestingly, the SD in 2018 was also 2.1 inches -- no change). So the gain in height over 90 years was about 18 league SDs. By Bill's reasoning, that gain should be virtually impossible. But we know it happened. And that is because there is no necessary mathematical relationship between the variance of player height in one year and the amount of change that can occur over a century. That is true for baseball talent as well.

And no, the SD for height does not appear to be smaller among players than the general population. I can't find exact data for U.S. males, but it looks to be about 1.5 inches. Presumably the SD for players is a bit larger because they are selected for defense as well as offense -- the difference in average height by fielding position is 2-3 inches, ensuring a certain amount of variance. It is true, as Bill says, that the increase in mean height reflects a change in the larger male population, but I can't see how that changes the fact that today's players are in fact bigger and stronger.

7:15 AM Feb 3rd

KaiserD2

Neither Mays, nor Mantle, nor Henry Aaron was an especially large player by the standards of their own time. Mays lists at 5' 11", 170 lbs; Aaron, 6', 180 lbs; Mantle, 5'11", 195 lbs (which seems a bit heavy at least in the early stages of his career.) And this hasn't changed all that much. Mookie Betts is 5' 9", 180 lbs (and really reminds me of Aaron, physically.) Yelich is bigger--6'3", 195 lbs--and Trout is the real outlier, 6' 2", 235 lbs. Barry Bonds, says baseball-reference, was 6' 1", 185 lbs--presumably early in his career.

Another thing to think about: as I've said, many fewer young men get a real chance at professional baseball these days and I suspect a lot of them are screened out based on small size. Pedroia has proved you can be small and a superior hitter but not too many guys get the chance to do that.

David K

3:39 PM Feb 2nd

FrankD

Maybe way off base here statistically but maybe we should be comparing great ballplayers as to size, height, vs their cohorts in time. Were Ruth, Mays, Trout the largest of their era? Were the biggest players better than the smaller players in the same era? Then let's take all the 'greatest' and then run stats on this group for size, height, etc. Does height or weight seem to point to the opinion of who's the greatest? Ruth was bigger than Trout, does that make Ruth better? Many ways to slice the data. This would answer size vs. ranking.

I think we may be chasing a Chimera: it looks different to each observer. In every group of ball players in a time segment some are by far the 'best'. But is there anything that is deferentially measurable that all these 'best' players have in common? Without doing an autopsy to measure fast-twitch vs. slow-twitch muscles I don't think we'll ever find it. Which makes watching/arguing baseball fun: we can never 'prove' who was the greatest. But I find Bill James' stat analysis very interesting though I'm not so ready to accept that Gaussian distribution fits all the time - though at present I have no proof to not accept it.

10:37 PM Feb 1st

FrankD

I've posted this before:

https://tedsummaries.com/2014/05/03/david-epstein-are-athletes-really-getting-faster-better-stronger/

Based on this argument there hasn't been that much improvement in easily measured athletic performance other than technology. The easily measured, directly comparable results are times in races and distances in jumping and throwing. And all have been directly affected by technology improvements. Running on cinders sans starting block vs running on a modern, springy track.

Then look at

https://fastballmovie.com/the-film

which posits Nolan Ryan as the fastest ever. Please watch these and then comment.

We can't bring the old guys up to today, but we could have the modern athlete perform in yesterday's conditions: have Usain Bolt run on cinders. Have Trout swing a heavy bat. Have fielders play with a tiny glove on an uneven field. I know this is a test that probably won't happen but it would answer a lot of questions.

10:07 PM Feb 1st

bjames

*I'm not sure how much this analysis can tell us about the likely change in talent across time. Let's do the same analysis on another relevant player dimension, height. The SD of player height is 2.1 inches. So using Tango's formulas, at the team level we would expect a SD of 0.54 inches, and at the 15-team league level we expect a SD of just 0.14 inches. At the level of an entire league, we wouldn't expect mean player height to vary much. So, for players to grow considerably taller over a 90-year period -- let's say 2.6 inches -- would require an increase of not 9 SDs but actually 18 SDs. That seems wildly improbable. And yet, that is of course what happened: the average position player in 1927 was 70.5 inches and in 2018 was 73.1 inches. (Repeat this analysis at the nation-state level, and you will find that U.S. males have grown more than 9,000 SDs in that time!)*

The problem with this analysis is that it confuses the standard deviation of height AMONG ATHLETES with the standard deviation of height IN THE GENERAL POPULATION. Athletes, as stated by the poster, are pre-screened to represent for size. They are disproportionately selected from the largest people in the population; thus, they have a much smaller standard deviation of size than does the general population.

When the size of the general population increases, this pushes the size of the athletes upward as well. But the "push" is being provided not by the athletes, but by the general population. Thus, the poster is measuring the effect not by its source, but by that portion of the population which is pre-screened to prevent the gap from getting wider.

9:27 PM Feb 1st

FrankD

An interesting read, with a long discussion about Ruth and baseball then vs baseball today, is the book "The Year Babe Ruth Hit 104 Home Runs" ..... this is a hagiography of Ruth but I'd still posit that Ruth would be a star today. Why did Ruth hit so many long balls whereas nobody seems to be able to do that today? And Ruth wasn't small, on the order of 6'2", 200 lbs during younger playing days. In most baseball measurements Ruth is an extreme statistical outlier. The only way to bring him 'down' is to claim that the competition then was way worse than now - and I don't think that can be shown.

7:06 PM Feb 1st

chuck

When trying to figure what kind of stats Ruth might put up in the present game, I try first to estimate what his strikeout rate would be. Bill, I've thrown this idea at you before, and this is another topic where it may be of use (or not- I'd like to hear your thoughts).

Because hitting approach (and equipment) has changed over the past 90-100 years, to the present trend towards uber-aggressive, plane-conscious swinging, comparing strikeout rates between the distant eras is complicated. One way to take the hitters' influence on things out of the mix is to look only at strikeout rates for pitcher-hitters. To my mind, this is much more of a control group, as they sucked at hitting in the 1920s and suck now. I would not expect them, as a group, to exhibit the same trend to aggressive swinging that position players have undergone.

Taking one season- 1926- BB-Reference splits say that AL pitcher-hitters struck out in 19.2% of plate appearances. Jump forward to 2017, and we see the NL pitcher-hitters' strikeout rate at 38.2%. (I included SH in the PA, as strikeouts on bunt attempts by pitchers are fairly common.) As pitcher-batters seldom are in the game against relievers, these numbers mostly reflect, then, the increase in strikeout rate and strikeout environment for starting pitching. It has essentially doubled, rising by 19 percentage points between those 2 seasons.

One might then compare Ruth's 1926 rate (11.65%) to the pitcher-batters of 1926 (19.16%) and do one of two things, and here I don't know which would be right or better:

1) take Ruth's ratio (11.65/19.16) and apply it to the modern day 38.2%, which would spit out a 23.23% strikeout rate; this would put Ruth 37th among hitters in 2018, just behind Matt Carpenter's rate....

-or-

2) take Ruth's 11.65% and ADD 19% to it, getting 30.65%. Intuitively I would think Ruth's strikeout rate would be fairly high, considering the velocity, the range of offspeed pitches. 30.65% would put him 5th in the majors in 2018, just ahead of Stanton.

Notes:

I used just NL pitcher-batter rate because they bat on a much more regular basis than AL pitchers now. And I would add 1 or 2 percentage points to whichever Ruth modern SO percentage one estimates above when factoring in that he'd be facing relievers, and a very different breed of them, in around 40% of his PA's.

Factoring in a bit higher percentage (+2%) for relievers in 40% of PA's, and using 650 PA's, Ruth would have 156 strikeouts with the lower rate above and 204 strikeouts with the higher rate. Just looking through hitters with a 30%+ strikeout rate over the past few seasons:

OPS+

110 ... Chris Davis, 2016

096 ... Chris Davis, 2017

050 ... Chris Davis, 2018

118 ... Joey Gallo, 2017

107 ... Joey Gallo, 2018

097 ... Yoan Moncada, 2018

110 ... Teoscar Hernandez, 2018

126 ... Giancarlo Stanton, 2018

084 ... Trevor Story, 2017

171 ... Aaron Judge, 2017

131 ... Khris Davis, 2017

125 ... Eric Thames, 2017

103 ... Mark Reynolds, 2017

113 ... Chris Carter, 2016

080 ... Mike Napoli, 2016

Given a 30% strikeout rate, it wouldn't be impossible for Ruth to be a superstar like Judge, but I'd say the odds are against it at that rate, though good chances he would be above average. If one uses the lower estimate of 23% for him, then he would have a better chance of being a superstar, assuming his HR/batted ball rate is still great.

Also, that increase in pitcher-batter SO rate over the years may well have as a side factor the increased size of today's pitchers, giving them a somewhat larger strike zone in square inches, when hitting. One could do a study on SO rate by pitcher height in the two eras, as well as find the average height increase.
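As a quick check on the arithmetic in chuck's comment, here is a minimal Python sketch of the two estimates. The function names are hypothetical, and one reading is assumed: that the "+2%" reliever bump is blended over 40% of plate appearances (0.4 × 2 = 0.8 percentage points), which is the reading that reproduces the 156 and 204 strikeout figures quoted above.

```python
# Rates are percentages of plate appearances, as quoted in the comment.
RUTH_1926 = 11.65        # Ruth's 1926 strikeout rate
PITCHERS_1926 = 19.16    # AL pitcher-hitters, 1926
PITCHERS_2017 = 38.2     # NL pitcher-hitters, 2017


def ratio_estimate(player, old_baseline, new_baseline):
    """Method 1: keep the player's rate in proportion to the baseline."""
    return player / old_baseline * new_baseline


def additive_estimate(player, increase_points):
    """Method 2: add the baseline's rise, in percentage points."""
    return player + increase_points


def season_strikeouts(rate, pa=650, reliever_share=0.40, reliever_bump=2.0):
    """Blend in the reliever bump (assumed reading), then convert to a count."""
    adjusted = rate + reliever_share * reliever_bump
    return adjusted / 100 * pa


low = ratio_estimate(RUTH_1926, PITCHERS_1926, PITCHERS_2017)
high = additive_estimate(RUTH_1926, 19.0)  # the comment rounds the rise to 19

for label, rate in (("ratio", low), ("additive", high)):
    print(f"{label}: {rate:.2f}% -> about {season_strikeouts(rate):.0f} strikeouts")
```

The gap between the two methods is the whole question: a multiplicative adjustment treats Ruth's contact skill as a fixed share of the environment, while an additive one treats the environment's rise as falling equally on everyone.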

10:28 AM Feb 1st Guy123

I think discussions of Ruth (or other old-time greats) get sidetracked by focusing on that individual player, and debating what he could or couldn’t do. Could Ruth hit a slider, or 98-mph fastballs? How much would Ruth gain from closer fences? Both sides can cite many ways Ruth would be helped or hurt by modern conditions, and many of the questions are unanswerable. But we KNOW how much better Ruth was than the average player of his day (+97% on offense), given equal conditions and training they all faced. If we can determine how many runs an average 1927 position player would create in today’s game, we can then estimate Ruth pretty accurately. And while that’s challenging, we do have some objective data to help us do that.

One option would be to estimate the impact of height and weight on offense, and then estimate how much better that likely makes today’s hitters. That would be a great project for some young saberist to undertake. You would have to look at players in the lower minors and perhaps college, to overcome the huge selection bias in MLB (where even small players are often great hitters). But I think you could build a model that would give us a pretty good sense of how much better today’s players are – on average! - simply by virtue of their greater size and strength.

Another approach is to look at pitchers as hitters. This is about the only constant we have in baseball – hitters who are good athletes and did a lot of hitting as boys and young men, but who were not selected to play in MLB in any way based on their hitting ability. Their offensive production has dropped steadily for over 100 years, reflecting the rising level of talent in the game. In 1927, pitchers created about 29 runs per 600 PA. Today they create about 18 runs per 600 PA. So we can infer that a player with talent X would create about 62% as many runs today as they did in 1927. Applying this 38% depreciation to Ruth, we would expect him to produce about 96 RC per 600 PA, which would make him a 123 wRC+ in today’s game. That is, a very good hitter, but not superstar production from a RF. If you want to argue he would be a 140 hitter, I’m not going to argue with you (although I’d say 110 is just as likely), but it does seem unlikely he would be a 150+ wRC+ hitter (superstar level).
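The depreciation figure in this comment can be reproduced in a few lines of Python. This is only a sketch of the arithmetic: the two "implied" values at the end are back-calculated from the quoted 96 RC and 123 wRC+ and are not figures the commenter stated.

```python
# Pitcher batting output, in runs created per 600 PA, as quoted above.
PITCHER_RC_1927 = 29.0
PITCHER_RC_TODAY = 18.0

# Share of 1927 production that the same talent would retain today.
retention = PITCHER_RC_TODAY / PITCHER_RC_1927
depreciation = 1.0 - retention


def translate(rc_1927):
    """Scale a 1927 runs-created rate to today's game using pitcher hitting."""
    return rc_1927 * retention


# Back-calculated (assumed, not stated in the comment):
RUTH_RC_TODAY = 96.0
RUTH_WRC_PLUS_TODAY = 123.0
implied_ruth_rc_1927 = RUTH_RC_TODAY / retention
implied_average_rc_today = RUTH_RC_TODAY / (RUTH_WRC_PLUS_TODAY / 100)

print(f"retention {retention:.0%}, depreciation {depreciation:.0%}")
print(f"implied Ruth 1927 rate: {implied_ruth_rc_1927:.0f} RC per 600 PA")
print(f"implied modern average: {implied_average_rc_today:.0f} RC per 600 PA")
```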

10:16 AM Feb 1st smbakeresq

KaiserD2,

0 correct on the Birdie Tebbetts comment. When Tony Gwynn was coaching college I heard him say that to improve guys' hitting in high school, turn the batting machine to 85+ to get them used to the speed. This also goes the other way: Denny Neagle was able to win because he threw so much softer than everyone else, especially considering his size.

As far as strength there is this idea that somehow all the older players were not as strong as players today because today they lift weights. That might be true on a general level but certainly not true in some cases. Older players grew up working, building strength that way, not in gyms and showcases. Farms, mining, construction, etc. all build strength.

10:00 AM Feb 1st KaiserD2

Regarding Walter Johnson, the Netflix documentary, Fast Ball (I think that's the name), used modern technology to measure his fastball. I don't remember the exact figure but it was well into the 90s. I don't see how anyone could doubt that Rube Waddell would have been a formidable pitcher in any era, either. Or, clearly, Bob Feller, or Lefty Grove.

I do think perhaps the single biggest change in baseball since the 1950s, though, is that pitchers are throwing a lot harder. I gather there is technology that could measure from film how fast guys in the 1950s and earlier are throwing. But I'm not sure how important that is.

The reason is a conversation I had with Birdie Tebbetts around 1990 or so when I was writing Epic Season. He had had a catcher's and manager's eye view of two generations of pitchers and at that time he was still an active scout. He remarked spontaneously that if everyone could throw as hard as Nolan Ryan, then Nolan Ryan wouldn't be hard to hit. He may well have been right. What makes a guy's pitches harder to hit is their difference from what you are used to, and that could just as easily be 90 instead of 85, as 100 instead of 95.

On another front: the best hitters of previous generations were strong enough to hit the ball out of very large ball parks. I think there's some evidence, too, that they could pull the ball more consistently. The Yankees worked on pulling the ball in batting practice in the 1930s-1940s because of the dimensions of Yankee Stadium.

David Kaiser

7:31 AM Feb 1st garywmaloney

Ruth was an extraordinary talent - the rare two-way player, and it's easy to forget he was also a fast and daring runner before he packed on the pounds.

His core ability -- hitting a baseball -- would have remained with him. This would be tempered somewhat by the proliferation of faster pitchers, but . . . Ruth adapted to the fast pitchers of HIS day (e.g. Walter Johnson, an exact contemporary).

I would give him a fair shot at adjusting to that problem -- as Williams likely would have, and (incidentally) as Aaron actually did, prospering in the pitcher-dominant 60s and 70s after his conquering performances in the more hitter-friendly 50s.

10:36 PM Jan 31st jayodum

My thought, which is so rudimentary that it probably doesn't add much, let alone advance the conversation (then why am I posting it?) relates to the size/evolution of athletes issue. Baseball today is played in several of the same parks as it was played in back in Babe Ruth's day. The bigger, stronger, faster players of today have not made those parks obsolete. Baseball retains its wonderful symmetry of 90 feet between the bases, which still works (and I know that is partly, if not wholly, due to the fact that players position themselves to give them the maximum range while still being able to catch fleet runners going down the line). I'm not sure where the ballpark issue fits, if it does at all, but I find it odd that players haven't outgrown the parks (and yes, I know about the small dimensioned parks of the past, etc.)

8:56 PM Jan 31st sprox

I believe the ghost of Walter Johnson would love to invite all of you back to the early 20th century and challenge you to hit any of his pitches.

Also, let's try and remember that pitchers in 1927 knew who they were facing when Babe Ruth stepped up to the plate - they weren't holding anything back like they might have if facing the backup 2nd baseman who just got called up.

So anyway, what does it take to hit a pitched baseball 400 feet with any consistency?

Strength?

Flexibility?

Eye-hand coordination?

Muscle Memory?

Pitch recognition?

Ability to accelerate the bat through the hitting zone?

Does anyone really think these skills don't translate from 1927 to 2019?

8:10 PM Jan 31st tangotiger

When Guy is talking about size, he's not saying it's a necessity, just that it's an important factor.

If you start with all American male adults 21-40, and break them up by height and weight, you will see a disproportionate share of those at ~6'2", 210 are MLB pitchers, compared to those 5'8", 180. Billy Wagner, and David Cone and Pedro are the exceptions that prove this Rule of Disproportionality.

Same applies, to a slightly smaller extent with nonpitchers. Certainly Altuve and Judge stand shoulder to shoulder. But Altuve is much more the exception than Judge among superstar players.

We have no problem accepting this in NHL and NFL that size plays a large role, even if Gretzky was the greatest ever.

***

In any case you have to make a distinction between plucking the player, and plucking his grandparents. This becomes a discussion with a huge set of assumptions. You have to state them all, and whoever you are talking to has to accept them all before you can even take the first step.

2:03 PM Jan 31st Marc Schneider

This may not be relevant to this discussion, but is it possible to discern the effect of integration on Ruth? I see a lot of people that simply discount what Ruth did because he played only against white players. And I have no doubt that the major leagues would have been better had there been African-American players included. But how much better? The implicit assumption seems to be that every Negro League player would have displaced a white player and that, therefore, someone like Ruth would not have been as good because he would be facing better players. But, presumably, there were Negro League players that could not have made the majors. So, how much of an effect would integration have had on Ruth? Obviously, there are a significant number of white MLB players that would have been displaced by Negro League players, but it cannot have been every one.

1:53 PM Jan 31st phorton01

A couple of additional comments:

1. Some wonder what "pluck out of 1927" means. For what it's worth, I think it should clearly mean the following:

- you're not assuming baby Babe Ruth grows to maturity and enters MLB with the advantages of a lifetime of today's nutrition, etc.

- you're not assuming he is 25; he was 32 and fat (not really) in 1927. He would be 32 and have the same physique today --- WHEN he got here

2. Bill points out in his more recent comment that the game of today is DIFFERENT than the game of 1927. So he would be facing a succession of relievers throwing 100 MPH.

3. Therefore, I think it is only reasonable to assume he would HAVE to make adjustments, in some cases significant ones in order to succeed at the same level. The question then becomes how quickly he can make those adjustments, and are they enough to compensate for issues noted previously (along with the physical debilitation caused by time travel.)

4. Someone made the point that peak performance is really rare, but that the lower level players have gotten much better. Not sure if this is true or not, but it might explain why the peaks of today are not as extreme (relative to the league) as they were in Ruth's time. Wilt Chamberlain averaged 27 rebounds a game for a couple of seasons. He couldn't do that today for various reasons -- among them that he wouldn't be able to play 48 minutes a game, and there are lots more 7 footers in the league now.

1:08 PM Jan 31st George.Rising

I posted accidentally before I was done. Except for Ruth to Cravath, it's hard to say that the "more modern" player was the better HR hitter.

12:05 PM Jan 31st George.Rising

This is a great article: An innovative take on an important topic! Thanks Bill! And thanks to Tom Tango for his comment, too.

Like phorton01, I'm not sure if I understand how Ruth is .999 (or .850). It seems to have to do with standard deviations.

I tend toward the stance of "improvement is not as much as many claim." I think this is especially true after baseball became professionalized and the national pastime, as Bill pointed out in one of his books. I'm not sure when that was, but I think it would be 1920 at the latest.

Less scientifically, I think the relative stability of talent/results can be viewed by looking at some of the top HR hitters in history. For example, Giancarlo Stanton (b. 1989) is a great power hitter, but is he that much better in hitting HRs than Ryan Howard (b.1979) at his peak? It's close, despite Stanton being "more modern." How about Howard and Mark McGwire (b. 1963)? Howard's "more modern," but was he better at hitting HRs? No.

To continue the pattern:

| Player | Birth | Apart | HR | AB | HR% |
|---|---|---|---|---|---|
| G. Stanton | 1989 | | 305 | 4194 | 7.3% |
| Ryan Howard | 1979 | 10 | 382 | 5707 | 6.7% |
| Mark McGwire | 1963 | 16 | 583 | 6187 | 9.4% |
| Mike Schmidt | 1949 | 14 | 548 | 8352 | 6.6% |
| Harmon Killebrew | 1936 | 13 | 573 | 8147 | 7.0% |
| Ralph Kiner | 1922 | 14 | 369 | 5205 | 7.1% |
| Jimmie Foxx | 1907 | 15 | 534 | 8134 | 6.6% |
| Babe Ruth | 1895 | 12 | 714 | 8399 | 8.5% |
| Gavvy Cravath | 1881 | 14 | 119 | 3951 | 3.0% |

12:04 PM Jan 31st smbakeresq

First, most ballplayers are "below average," as Bill once said in his iceberg example.

Second, stars are stars because of how you perceive them against most other players, which is how they compare to "below average" players. That background makes them stand out.

Third, I think that in every sport, the bottom of the roster, the worst players, has improved over time far more than the stars have improved over time.

Fourth, in 1927, there were more places to play baseball and make good money for that time than just MLB. Players want to 1) play and 2) make money.

Fifth, because of the above I think the bottom of baseball rosters is much better than it was in MLB in 1927. Since the bottom of the rosters is better, it's harder for today's players to stand out as much against that background as in earlier times.

I do think Ruth today would be a star. He would be a great star if he got the benefits of today such as food, coaching, video, lifestyle management, single games, etc. But right out of 1927 there would be a period of adjustment but the great ones are great because they adjust.

11:28 AM Jan 31st Guy123

Steve: if you are right then MLB teams are all doing this very wrong. Because for every 6-2/210 guy there are probably five or ten 6-0/180 players. So if height and weight aren’t so important, then there are hundreds (maybe thousands) of players out there just as good as the current MLB players, but smaller. Some GM should put together a low-cost championship team this way.

11:21 AM Jan 31st stevebogus

Guy123-

I wasn't aware that athletic ability has a linear relationship with size. Imagine how much better Joe Morgan would have been if he was seven feet tall. And why did they make the baseball diamond so large when the tiny little guys of the 1800s could barely throw the ball from third to first?

Okay, enough sarcasm.

Size and strength are somewhat related. But a larger body needs larger muscles just to accelerate its own mass. Think of power to weight ratio. I have no doubt that the average player today is physically stronger than the average player 100 years ago. But what does that translate to? In a contact sport mass can be a useful advantage (football linemen). There is much less opportunity in baseball to use extra body mass to your advantage. From a power standpoint, the key is acceleration of the ball or bat. From a foot speed perspective, there are big people who can run fast and many who cannot, just the same as smaller people.

11:12 AM Jan 31st Fireball Wenz

I never know how to answer these questions. What does it mean to "pluck him out" of his time? Does it mean he uses the same equipment, the same diet, same training methods? Does it mean he's seeing splitters, cutters, etc. for the first time? Has he had no access to film, etc.? Or are we assuming some kind of adjustment period, and his adoption of today's methods?

10:27 AM Jan 31st Guy123

I think folks are underestimating how inferior Ruth's competition was. In 1927, only 5% of the position players were at least 6 feet tall and 200 lbs, while today that describes 61% of the players. Two-thirds of the 1927 position players weighed under 180 lbs. -- only 5% of today's players are that small. And we know that height and weight play a huge role in determining offensive ability.

Ruth was twice as good a hitter as the average player of his day (197 wRC+), but he was playing against the physical equivalent of today's high school players. If Ruth was transported to 2018, we'd basically be telling him to "pick on someone your own size." And the notion that he would still be a superstar against that competition seems rather farfetched.....

9:45 AM Jan 31st MattGoodrich

Also depends on how you put Ruth in today's baseball. If you took him as a baby and raised him in this day and age, I can see him being a baseball superstar (assuming he even played baseball). Take a 25-year-old Ruth and time travel him to today's game and I don't think he'd dominate. Current people are simply bigger, stronger, and faster than 100 years ago, for various reasons.

Of course, with Ruth's tendency toward excess and our modern day ease of 'over nutrition', a Ruth growing up today would probably be too overweight and sedentary to be a baseball player. Maybe he'd be an offensive lineman.

9:34 AM Jan 31st KaiserD2

I think Bill is certainly right here about Ruth. About what has happened to baseball, I'm not so sure, and I've offered a contrary argument in Baseball Greatness. If you study the seasons of great players, using any reasonable metric, you find that people of the quality of the very greatest players, of whom Ruth is one, are incredibly rare. Taking a much lower bar than Ruth/Mays/Aaron/Bonds etc.--5 seasons of 4 WAA or more--well, about 100 players since 1901 have done that, which is .5% of all MLB players. Top ability is extremely rare. I think the same is almost surely true in any complex field of endeavor, but the beauty of baseball is that we can measure it. Ruth was one of the very greatest (tied with Bonds in fact with 17 seasons of 4 WAA or more, including two as a pitcher). I think it's clear that he would have been a great player in any era.

About the improvement in the game, though, I'm less sure. I believe the quality of players in MLB is a function of the number of young men who are given the chance to make an all-out effort to develop their baseball skills by playing full seasons of organized baseball. And that number peaked, by a very large margin, around 1941, because the minor leagues were so huge. And neither breaking the color line, nor opening up Latin America, has had nearly enough impact to compensate for the huge decline in the minors in the 1950s. We have nearly twice as many MLB teams today but we have fewer young men spending most of their waking hours on baseball. We all know that the Dominican Republic, with about 10 million people I believe, probably has more young men per capita playing seriously than ever. But the US probably had just as many per capita in 1941 and it was well over 10 times bigger, even leaving out the black population. That had to have an impact.

This isn't just speculation on my part. The number of superstar seasons (4 WAA or more) has fallen a great deal in the last 15 years or so--it's at an historic low for that period.

But going back to the beginning, I agree, Ruth would have been a superstar in any era, without question.

David Kaiser

7:29 AM Jan 31st bermange

It is the age-old discussion among us cricket statisticians too.

Don Bradman played international cricket between 1928 and 1948 and had a batting average of 99.94, scoring 6996 runs in the process.

His career batting average is almost 39 runs higher than the next batsman who scored more than 2,000 runs. For these batsmen the mean test batting average was 40.42 with a standard deviation of 9.19.

This gives Bradman’s batting average of 99.94 a Z-score of 6.48. If we only select the best test batsmen - say those with a career batting average of over 50 who have scored 1,000 or more runs - then Bradman’s Z-score drops to a "mere" 5.4.

By any measure, streets ahead of the competition. But how would he do today?
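For readers who want to check the first Z-score, a minimal sketch using only the figures quoted in this comment (the function name is illustrative). The second figure, 5.4, cannot be reproduced here because the mean and standard deviation of the over-50 group are not given.

```python
def z_score(value, mean, sd):
    """Distance from the mean, measured in standard deviations."""
    return (value - mean) / sd


# Bradman against all batsmen with more than 2,000 runs, as quoted above.
bradman_z = z_score(99.94, mean=40.42, sd=9.19)
print(f"Bradman Z-score: {bradman_z:.2f}")
```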

7:02 AM Jan 31st Guy123

I'm not sure how much this analysis can tell us about the likely change in talent across time. Let's do the same analysis on another relevant player dimension, height. The SD of player height is 2.1 inches. So using Tango's formulas, at the team level we would expect a SD of 0.54 inches, and at the 15-team league level we expect a SD of just 0.14 inches. At the level of an entire league, we wouldn't expect mean player height to vary much. So, for players to grow considerably taller over a 90-year period -- let's say 2.6 inches -- would require an increase of not 9 SDs but actually 18 SDs. That seems wildly improbable. And yet, that is of course what happened: the average position player in 1927 was 70.5 inches and in 2018 was 73.1 inches. (Repeat this analysis at the nation-state level, and you will find that U.S. males have grown more than 9,000 SDs in that time!)

What Bill's SD analysis tells us is that, if 2018 MLB players are much better than 1927 MLB players, then it is extremely unlikely the two leagues were randomly selected from a single, trans-historical pool of professional baseball players. And that's certainly true. Today's players come from an entirely different pool of young men, one that is not only much larger but also much taller and stronger on average. And that makes it at least possible that Ruth played against substantially inferior competition than today's players.
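The height numbers in this comment follow from dividing a standard deviation by the square root of the group size. Here is a small sketch, assuming (as the quoted 0.54 and 0.14 imply) 15 players per team and 15 teams per league; the variable names are illustrative.

```python
import math

PLAYER_SD_INCHES = 2.1     # SD of individual player height, as quoted
PLAYERS_PER_TEAM = 15      # assumed; reproduces the quoted 0.54
TEAMS_PER_LEAGUE = 15      # the "15-team league" in the comment


def sd_of_mean(sd, group_size):
    """SD of the average of `group_size` independent draws."""
    return sd / math.sqrt(group_size)


team_sd = sd_of_mean(PLAYER_SD_INCHES, PLAYERS_PER_TEAM)
league_sd = sd_of_mean(team_sd, TEAMS_PER_LEAGUE)

growth = 73.1 - 70.5               # inches, 1927 to 2018
growth_in_league_sds = growth / league_sd

print(f"team SD {team_sd:.2f} in, league SD {league_sd:.2f} in")
print(f"growth of {growth:.1f} in is about {growth_in_league_sds:.1f} league SDs")
```

The calculation assumes independent draws from one fixed pool, which is exactly the assumption the comment argues does not hold across 90 years.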

6:39 AM Jan 31st wdr1946

One way to measure this is to look at changes (i.e., improvements) in league fielding averages over time, since this focuses on errors, and is a measure of incompetence. Of course there are other factors involved (groundskeeping, gloves, etc.), but one could measure when these improvements took place, and by how much. My guess is that fielding probably reached something like its present levels in the late 1930s.

1:22 AM Jan 31st tangotiger

Bill,

If you were to cap the win% at .150 to .850, you would get an almost perfect uniform distribution using the data I posted. That would make it 70% of .288 or .20 as the standard deviation.

Therefore I think you can approximate using a uniform distribution of 0.1 to 0.9. That gives us close to .24
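The figures here come from the standard deviation of a uniform distribution, which is its width divided by the square root of 12. A short sketch of that formula (the function name is illustrative):

```python
import math


def uniform_sd(low, high):
    """Standard deviation of a uniform distribution on [low, high]."""
    return (high - low) / math.sqrt(12)


full = uniform_sd(0.0, 1.0)       # the .288 used in the article's model
capped = uniform_sd(0.15, 0.85)   # 70% of the full width
wide = uniform_sd(0.10, 0.90)     # 80% of the full width

print(f"[0, 1]: {full:.3f}")
print(f"[.150, .850]: {capped:.3f}")
print(f"[.100, .900]: {wide:.3f}")
```

By this formula the 0.1 to 0.9 range works out to roughly .23, in the neighborhood of the .24 cited.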

11:45 PM Jan 30th tangotiger

Because Babe Ruth (and Bonds and Mays and Trout) have more wins than "game space", their win% is above 1.000. Annoying yes, but we gotta live with it. (Every Ruth added to a team of Ruth eventually has diminishing returns.)

Anyway, I have Ruth with a career 1.300 win%.

I have the true talent team at 1sd = 0.060. So the player SD I would suggest is 0.240.

So, Ruth is, using this, 3.33 SD from the MLB population mean of the 1920s.

Ted Williams has a career 1.160 (2.75 SD), Mays 1.080 (2.42 SD), ARod 0.935 (1.81 SD).

I'd LIKE to believe that all these are equivalent. But that's just an initial belief.
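The SD counts in this comment can be checked with a few lines of Python. The sketch assumes a population mean of .500, which is what the quoted 3.33 for Ruth implies; the dictionary and variable names are illustrative.

```python
PLAYER_SD = 0.240        # suggested player SD, as quoted
POPULATION_MEAN = 0.500  # assumed; implied by Ruth's quoted 3.33

career_win_pct = {
    "Ruth": 1.300,
    "Williams": 1.160,
    "Mays": 1.080,
    "ARod": 0.935,
}

# How many player SDs each career win% sits above the assumed mean.
sds_above_mean = {
    name: (pct - POPULATION_MEAN) / PLAYER_SD
    for name, pct in career_win_pct.items()
}

for name, z in sds_above_mean.items():
    print(f"{name}: {z:.2f} SD")

# Dividing by the square root of 16 players recovers the quoted team SD.
team_sd = PLAYER_SD / 16 ** 0.5
print(f"team SD: {team_sd:.3f}")
```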

10:58 PM Jan 30th bjames

Late Night Addendum 1

Expanding a little bit on the study published a couple of hours ago—

In this study there is (A) the standard deviation of talent among players, (B) the standard deviation of talent among teams, and (C) the standard deviation of talent among leagues.

B is a known variable and reasonably near a constant,

A is a partially known variable, and

C is completely unknown

The critical question of the study is, What is the relationship of B to C? That’s REALLY what the study is about: If B is known, what is C likely to be?

In my model, it is extremely likely that the estimate of A is completely wrong, and that (because of that) the relationship of A to B in this study is mis-stated or mis-calculated.

A is mis-estimated in this model, because using a random number to represent the player’s value (A) creates a straight-line distribution, when in reality it is likely that the distribution of A represents either a bell-shaped curve or the right-hand portion of a bell-shaped curve. The standard deviation of winning percentage among players is almost certainly NOT .288, but some number significantly lower than that. I would suggest that it is probably .120 to .150, but that’s just a guess.

With regard to Babe Ruth, the model assumes that he is +.499 vs. an average player, but also that he is less than two standard deviations above the norm. In reality, he is almost certainly MORE than two standard deviations above the norm. He might be 3, he might be 3 and a half, but he’s more than 2. He’s not a .999 player; he is more like an .850 player, but against a norm of .120, that’s close to three standard deviations above the norm.

Why, then, does the model hit the mark in regard to B?

Because the model assumed that the winning percentage of the team is the average of the winning percentage of the 15 players, when in reality it probably represents an exponent of the average. It’s probably the average to the power 2 vs (1-A), or to the power 4, or something. Depending on how you calculate the winning percentage of the player, which is a partially understood problem.

Basically, it’s off-setting errors with regard to A vs. B. A is wrong, and the relationship of A to B is wrong, but they happen to off-set to hit the target, so that B is what B ought to be. This was totally unexpected. I had no idea that B would turn out to be the actual B, so I wasn’t worried about the relationship of A to B.

BUT. Big But.

What is critical to the study is not the relationship of A to B, but the relationship of B to C. Is there any reason to believe that this problem has any impact on the relationship of B to C?

I can’t see that there is. It doesn’t seem to me that this problem has anything to do with the actual relationship of B to C. That’s why the study was exciting to me—because if B is the actual B, then it seems reasonable, based on the model, to assume that C is the actual C.
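To make the A, B, C relationship concrete, here is a rough Monte Carlo sketch of a model of this shape, not the study's actual code. It uses a uniform random number for each player and a team value equal to the average of 15 players, as described above. The number of teams per league (15), the number of simulated leagues, and all names are assumptions made for illustration.

```python
import random
import statistics


def simulate(n_leagues=2000, teams_per_league=15, players_per_team=15, seed=0):
    """Return (A, B, C): the SD of player, team, and league values."""
    rng = random.Random(seed)
    players, teams, leagues = [], [], []
    for _ in range(n_leagues):
        team_values = []
        for _ in range(teams_per_league):
            # Each player's value is a uniform random number in [0, 1).
            roster = [rng.random() for _ in range(players_per_team)]
            players.extend(roster)
            # Team value is the plain average of its players.
            team_values.append(statistics.fmean(roster))
        teams.extend(team_values)
        # League value is the plain average of its teams.
        leagues.append(statistics.fmean(team_values))
    return (
        statistics.pstdev(players),
        statistics.pstdev(teams),
        statistics.pstdev(leagues),
    )


a, b, c = simulate()
print(f"A (players) = {a:.3f}")
print(f"B (teams)   = {b:.3f}")
print(f"C (leagues) = {c:.3f}")
print(f"B / C       = {b / c:.1f}")
```

Under plain averaging, the ratio of B to C depends only on the number of teams per league (it is close to the square root of that number), not on how A is distributed, which is consistent with the point that an error in A need not disturb the B-to-C relationship.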

The Other Problem

Another issue here is that changes in HOW the game is played are not necessarily changes in how WELL the game is played. Not all changes in the game are changes for the better. The famous evolutionary biologist Stephen Jay Gould, who was a huge baseball fan, would make this point constantly: that evolution does not produce a BETTER species, it merely produces a DIFFERENT species which is better adapted to its own survival. I didn’t understand what the hell he was talking about at the time, but I get it now. Wish he was still around so I could tell him that.

Anyway, SOME of the changes that have taken place in baseball are not forward developments, but merely variations. Sideways evolution.

Suppose that in 1927, for some reason, baseball had split into two camps, with players NOT going back and forth from one to the other. Over time, the games played by the two camps would go in different directions. 90 years later, it would almost certainly be true that a dominant player from the West Coast league would not be equally dominant in the East Coast game—and vice versa. DIFFERENCE is not the same as QUALITY. The Japanese game, for example. The Japanese are baseball-obsessed at a much, much deeper level than Americans are, but the players from Japan who come to the states are sometimes at a disadvantage because they’re playing a significantly different game.

The “heavy bat” problem is not a QUALITY problem; it’s a DIFFERENT GAME problem. If Adam Ottavino went back to 1927, he almost certainly would have a six-plus ERA because his managers would make him try to pitch 9 innings a game, and he couldn’t do it. That’s the way the game was played then.

We moved away from the 9-inning starters because there is a competitive advantage in using relievers, but that advantage rests in large part on tactics that would be unsustainable in 1927. You can’t call players to the majors at the drop of a hat in 1927. You’re playing in St. Louis and your top minor league team is playing in Rochester, you can’t get him there for tonight’s game, or tomorrow night’s game. You can’t watch video of him pitching his last start, to see how he looks. You can’t check the internet to see how well he has been pitching. The strategies that dominate in 2019 wouldn’t work in 1927.

9:30 PM Jan 30th tangotiger

Even more ammunition to the uniform distribution, here's the frequency not based on NUMBER of players, but by Individualized Games (iG):

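| win% | freq_iG | freqN |
|---|---|---|
| 1.000 | 5% | 4% |
| 0.900 | 6% | 5% |
| 0.800 | 9% | 8% |
| 0.700 | 14% | 13% |
| 0.600 | 16% | 16% |
| 0.500 | 17% | 16% |
| 0.400 | 14% | 15% |
| 0.300 | 10% | 11% |
| 0.200 | 6% | 7% |
| 0.100 | 3% | 3% |
| 0.000 | 1% | 2% |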

So this adds a bit more to the top-end. And selection bias depressed the bottom-end. Add in those guys, and it's starting to look a bit more uniform.

Just wonderful!

8:37 PM Jan 30th tangotiger

Bill:

So this keeps getting more impressive. As you know, I have The Indis, which is somewhat akin to Win Shares and Loss Shares (and importantly Game Shares). So, I have iW, iL, iG.

So I took the 9600 players with the most "Individualized Games" in the last 20 years (16 players x 30 teams x 20 seasons). I have their iWin%. I simply took their standard deviation. And it's .238. Let's say .24. Which if we divide by root16 gives us .06.

These are the counts I have of the players:

| win% | freq |
|---|---|
| 1.000 | 4% |
| 0.900 | 5% |
| 0.800 | 8% |
| 0.700 | 13% |
| 0.600 | 16% |
| 0.500 | 16% |
| 0.400 | 15% |
| 0.300 | 11% |
| 0.200 | 7% |
| 0.100 | 3% |
| 0.000 | 2% |

While not uniform, it's also not so tightly centered either. Not to mention that because I selected the 9600 players who have the most iG, I kind of lose out on the lower talented players, that would further spread this out a bit more.

All to say that using the uniform distribution is reasonable enough.

Can you produce a chart similar to above using WinShares win%?
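As a rough cross-check of the .238 figure, the standard deviation can be approximated from the binned frequencies in this comment alone. This is only a sketch: it treats every player in a bin as sitting exactly at the bin's win%, so it will not match a calculation on the raw player data.

```python
import math

# Binned win% -> share of players (percent), copied from the comment's table.
freq = {
    1.0: 4, 0.9: 5, 0.8: 8, 0.7: 13, 0.6: 16, 0.5: 16,
    0.4: 15, 0.3: 11, 0.2: 7, 0.1: 3, 0.0: 2,
}

total = sum(freq.values())
mean = sum(w * f for w, f in freq.items()) / total
variance = sum(f * (w - mean) ** 2 for w, f in freq.items()) / total
player_sd = math.sqrt(variance)

# Averaging 16 independent players shrinks the SD by the square root of 16.
team_sd = player_sd / math.sqrt(16)

print(f"mean win% {mean:.3f}, player SD {player_sd:.3f}, team SD {team_sd:.3f}")
```

The binned estimate comes out a little under the reported .238, which is the direction one would expect from rounding every player to the nearest tenth and rounding the shares to whole percents.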

8:32 PM Jan 30th BlueRulez

Understanding that I'm picking nits, but it might be more interesting to transplant the Ruth from 1914 to now, rather than 1927. While the leagues would move forward from 1914 to 1927, they wouldn't move much. This would give the nineteen-year-old Babe Ruth a chance to adapt to today's game, with his attendant growth potential still yet to be tapped.

8:21 PM Jan 30th bjames

Replying to Bob Gill. . . I think you're asking questions that no one knows the answer to. I think my article took a step toward better understanding, but. . . there's a lot of steps.

In the case of, let's say, Harold Baines. . . and I'm sorry to pick on Harold, because there has been too much of that. But Harold, over the course of his career, is probably not a .550 player, probably not close to. He may be more like .530--and he's a Hall of Famer, and he's probably not the worst Hall of Famer. So obviously a realistic movement over time, a 20-point movement even, is very relevant to him. Johnny Bench, maybe not as much. Barry Bonds vs. Babe Ruth, even less.

8:03 PM Jan 30th

phorton01

Conceding that I don't fully understand the math/statistics arguments around applying a performance number (like .999) to Ruth and then assuming some standard deviation devolution from there [I don't think you can assign .999 to Ruth just because he is the best, as if Gehrig is .998 and so on] -- I still think it is silly to assume that he would be a superstar without adjustments. Unfortunately the poll question only had one answer that included adjustments (star).

I agree with @chrisbodig -- you would have to adjust to 100 mph fastballs, etc. An analogy I would make would be to basketball - the best three point shooters when the rule was introduced were probably 5 percentage points worse than they are now, on a much lower number of shots. But after five years, percentages steadily moved up -- because people made adjustments. And now, every kid shoots thousands of them every year for 10 years before they ever get to the NBA. Ruth would not have the luxury of many years of muscle memory to improve his technique, etc. He would certainly make adjustments, some quite quickly (like bat size) and I think he would be a very good player.

But not without significant adjustments. And this says nothing about league-wide changes in nutrition, weight training, diet, etc. I get that the league performance moves slowly -- but EVERY player he plays against will have 80 years of improvements in all of those things.

8:00 PM Jan 30th

shthar

What do you think is Babe Ruth’s LIKELY level if you plucked him out of 1915 and put him in 1927?

7:54 PM Jan 30th

BobGill

Excellent. One question, though: Granting that the likes of Ruth, Cobb and Wagner would still be great today, what about star players who weren't so far above the norm -- Bob Meusel, say, or Joe Judge? Would 0.17 a year drag them down to mediocrity, or lower? Maybe a better way to put it is this: What's the line a player from the 1930s or whenever had to be above in order to make it likely that he could hold a job in the major leagues today?

7:42 PM Jan 30th

tangotiger

Anyway, it is darn impressive that using a uniform distribution bounded by 0 and 1, and having ~16 players per team, gives us the spread in team talent. This really opens up a more straightforward path for analysis.

Thanks Bill for paving yet another road.

7:33 PM Jan 30th

chrisbodig

When I saw Joe's Tweet about Ruth, my first thought went to your book "Whatever Happened to the Hall of Fame" when you noted that Olympic swimmers of the 1930's posted times that wouldn't win a high school regional meet today (which was the 1990's when you wrote it).

So I pulled the book off my bookshelf and, lo and behold, I had forgotten the point. You went on to note that it's easier to improve performance in a one-dimensional skill than in something complex like hitting a baseball.

There is one aspect of baseball that is analogous, however, to swimming speed or sprint speed and that is the speed of a fastball.

Today's hitters have to adapt to a parade of relievers coming in throwing close to 100 MPH. Ruth didn't have to deal with that. Of course, Aaron and Mays didn't either.

I'm inclined to think that, if Ruth were plucked out of 1927 baseball, he would struggle mightily in the beginning but eventually figure it out and become the superstar that he was.

7:15 PM Jan 30th

tangotiger

A few equations.

1. Standard deviation of a uniform distribution (from 0 to 1) is 1/root12, or 0.288675. So, this is close to the random number one Bill used.

2. The standard deviation of the average of a collection of such numbers is the individual standard deviation divided by rootN, where N = number of players in this case. I usually use 16, Bill used 15. Root of 15 is 3.873, so .288675/3.873 = .0745. A little higher than Bill's. And as Bill noted, .072 or so is close to what we've observed in MLB.

3. And naturally, a league is a collection of teams, so 15 teams (*) gives us .0745/3.873 = .0192

(*) Root16 makes life a lot easier!
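The three steps above can be checked numerically. The following is a minimal sketch, assuming each player's level is an independent draw from a uniform distribution on 0 to 1 and that a team's level is the plain average of its 15 players. The function names, the number of simulated teams, and the random seed are all illustrative choices, not anything from the article.

```python
import math
import random
import statistics


def sd_of_mean(individual_sd, n):
    """SD of the average of n independent values that share one SD."""
    return individual_sd / math.sqrt(n)


def simulate_team_sd(players_per_team=15, n_teams=20000, seed=1927):
    """Draw uniform(0, 1) players, average them by team, return the SD of teams."""
    rng = random.Random(seed)
    team_levels = [
        statistics.fmean([rng.random() for _ in range(players_per_team)])
        for _ in range(n_teams)
    ]
    return statistics.pstdev(team_levels)


player_sd = 1 / math.sqrt(12)          # step 1: uniform(0, 1)
team_sd = sd_of_mean(player_sd, 15)    # step 2: 15 players per team
league_sd = sd_of_mean(team_sd, 15)    # step 3: 15 teams per league
simulated_team_sd = simulate_team_sd()

print(f"player SD           = {player_sd:.6f}")
print(f"team SD (formula)   = {team_sd:.4f}")
print(f"team SD (simulated) = {simulated_team_sd:.4f}")
print(f"league SD (formula) = {league_sd:.4f}")
```

The simulated team figure should fall close to the formula value, which is the point of step 2: averaging 15 independent players shrinks the spread by a factor of root 15, and averaging 15 teams shrinks it by the same factor again.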

6:57 PM Jan 30th

Manushfan

I have a few things that come to my mind whenever I see the argument that comes across that Ruth would be a AAA player at best now, or Dave Kingman or whatever.

If you're saying this about Ruth, then you are saying this about DiMaggio, Ted Williams and Stan Musial, who played 5 minutes after Ruth got done, and you have to include guys like Mays, Aaron and Frank Robinson, Mantle and Kaline, who came up within a decade of them.

I just don't see any argument that would convince me that, for example, Willie Mays of 1951 rookie fame, couldn't cut it today. It boggles the mind. Or Gehrig. Or a Paul Waner. It's silly.

I agree of course the game is BETTER and the players faster, stronger, better trained, all of that, sure, and the pitchers throw harder. But when you see these clowns saying 'duh with that tentpole bat Ottavino would knock it outta his hands and k him every time out--', you're assuming Ruth was an idiot who couldn't be coached, couldn't figure out a smaller lighter bat was the way to go, you're assuming a guy who went from the Dead Ball era to the Lively Ball era, who went from being a great pitcher in 1915 to being a great hitter as late as '33, somehow couldn't adjust or adapt. That's just flat out ignoring history and ignoring what happened.

Plus, face it: any sport where guys like Rick Dempsey, Jamie Quirk and Darren Oliver can play 20 years plus or minus isn't exactly 'getting harder' all that fast...

I'll let the other more coherent arguments and contributors have at it here now.

6:44 PM Jan 30th

bjames

I meant to say any measurement in which one human being would be NINE standard deviations above the norm, not 4.

6:36 PM Jan 30th