Comparing a Player Outside His Era

By Bill James

January 17, 2018

Dialogue

X-plenashun

The Andruw Jones article which appeared here a few days ago was excerpted from an e-mail conversation with my friends Joe Posnanski and Tom Tango. After sending that e-mail I received a 163-word comment from Tom, to which I responded with a 2,908-word lecture. Tom could have taken offense at my verbose response or its tone, but being the great man that he is he did not, so what follows is a three-part exchange. (1) is Tom Tango’s e-mail, sent sometime late last week. (2) is my very long response to this, written on Sunday and Monday (January 14 and 15, 2018). (3) is Tom’s response to mine. Thanks for reading.

1. Tom

In my view, we should be happy to compare players to their peers. Andruw was born April, 1977. His peers are born 1972-1982. You can go one more generation either way if you like, so as far as players born 1962-1992. After that, we get into the issue Bill is showing that the amount of information is different, so the uncertainty levels will be different. Older players have more depressed stats because of it.

You have about 12-15 nonpitchers in the HOF for any 10-11 year time period. If Andruw is one of the top 12 nonpitchers among his peers, great. If he’s one of the 40 nonpitchers born 1962-1992, great. (Though these numbers can be argued based on expansion.)

I don’t really see the benefit of starting to compare to Willie Mays. We’d never do that in hockey. And I think it’s unhelpful to compare current players to Bill Russell.

This way, we don’t have to make "timeline" adjustments. That’s me though.

2. Bill

I’d have to say that I 100% disagree with your main point here, and let me try to explain why. I would go so far as to say that I don’t think you can defend your position by saying "That’s me though". I think that position is untenable. I think it’s simply wrong. Let me try to explain. First, though, I think that this part of your e-mail is very useful:

You have about 12-15 nonpitchers in the HOF for any 10-11 year time period. If Andruw is one of the top 12 nonpitchers among his peers, great. If he’s one of the 40 nonpitchers born 1962-1992, great. (Though these numbers can be argued based on expansion.)

This is useful, because it suggests a way to inform the Andruw/HOF debate in the short run which we have not been using, and which we very much need.

If you believe that cross-generational comparisons in baseball are unnecessary and inherently flawed, then you have blocked the Andruw Jones-for-the-Hall-of-Fame argument, which Andruw’s advocates would not accept and I would not accept. Their argument is that

1) Andruw is an all-time great defensive center fielder, and

2) His contributions to the success of his team (as measured in wins) are equal to those of other players who have been elected; therefore justice and equity demand that he should also be elected.

Your position, if accepted, would deny to Andruw’s advocates the right to make either of these arguments. If you can’t make cross-generational comparisons, you can’t argue (1) above, and if there are no cross-generational standards, then you can’t argue (2) above.

But these are very legitimate arguments, and, in my view, there is no reason that Andruw’s advocates should not be allowed to make them. The problem is that they may not be true.

The first reason that I think that your position is just completely untenable is that I don’t think you realize, or I don’t think you have thought about, what this would do to the Hall of Fame debate, if we were to adopt your position. Your position would not invalidate a certain limited number of Hall of Fame arguments, 15% or 20%; it would invalidate almost all of the central arguments for Hall of Fame selection which have always been made. It would invalidate 80 or 90 or 95% of the Hall of Fame arguments which have always guided selection, and require that people develop new and different approaches in representing their favorite players. Thus, it would set the careful and rational interpretations of Hall of Fame arguments, the sabermetric arguments. . .it would set the clock backward by 40 years, and require us to start over, developing new and different approaches.

But. . . and I was trying to point out where you were right here, and lapsed into explaining where I think you were wrong. The value in this is that it points to a different argument that Andruw’s advocates could be making, which would help them to maneuver around this roadblock of not being able to use his "modern" defensive values in contrast to older defensive players. From my standpoint, if I were to study Jones’ WAR compared to a tight cadre of his generational competitors and find that he ranked at or near the front of that group, that would help me to accept that he might be worthy of selection. I doubt that that is what we would find, but I haven’t studied it in that way, so I don’t know.

Let me try to summarize what it is that you are wrong about:

1) (Already noted) Your position, if accepted, would invalidate almost all of the arguments which have always been used to sort out who belongs in the Hall of Fame and who does not, thus setting the relevant field of understanding backward by generations.

2) You have tremendously overstated the difficulty of making cross-generational comparisons in baseball. Yes, of course it is difficult to make cross-generational comparisons in baseball, but you ignore the fact that we have a constant frame of reference which applies across time and which stabilizes the data across time: Wins. IF you can state a player’s contribution to his team in terms of wins and losses, then you have a valid basis for comparing a player from 1968 to a player in 2008—leaving only the large problem of the relative quality of play over time, and the relatively small issues of the length of the schedule and what to do with post-season play.

And, in fact, we HAVE 85% solved the problem(s) of stating a player’s contribution to his team in terms of wins and losses. Yes, we still have the WAR/Wins and Losses Issue and the many problems inherent in WAR, but what we DO know about this issue now is much larger than what we do not know—unlike 1975, when we had not even reached the understanding that we needed to state these contributions in constant terms, and that wins was the constant unit which guides the debate.

3) You have overstated . . .again, tremendously overstated. . .the extent to which the Andruw Jones problem is a general problem, as opposed to a specific problem which complicates this particular player’s case, but has little to do with anyone else. Your e-mail says that "After that (after getting beyond comparisons within a generation), we get into the issue Bill is showing that the amount of information is different, so the uncertainty levels will be different. Older players have more depressed stats because of it." But this is only a problem if a player’s Hall of Fame case rests on claims that cannot be documented for players from the previous generation(s), and in reality this is very rarely a problem.

We should acknowledge that our statistician great-great-grandfathers did a hell of a job, in the 1870s, of setting up a record-keeping system for baseball. We do have, in fact, relatively good and. . .I shouldn’t say complete. We have relatively robust records for players and for games played from 140 years ago. I actually don’t know of any other player, other than Andruw, whose Hall of Fame case rests directly on issues which cannot be sorted out one way or another based on the information which exists, although I am sure that there must be and will be other cases in which this is a problem.

4) Your position, if accepted, would close the door to lines of analysis which are quite interesting and potentially quite valuable. There remains in front of us the issue of the relative quality of play over time. What you are in essence saying, it seems to me, is that "Oh, well; that’s an unsolvable problem. We don’t want to waste our time trying to figure that one out, because there is no way to answer all of the questions."

But that’s not true; the question of the relative quality of play over time is (a) a very interesting question, (b) an issue the better understanding of which is vital to our true understanding of the game, and (c) a set of questions which absolutely can be addressed by study.

At risk of getting lost chasing rabbits, there are three ways of approaching this issue:

1. Internal data comparisons,

2. Characteristic of Quality analysis, and

3. Modeling.

Internal data comparisons. In 1975 (I think. . .could be wrong about the year). . .in 1975 or thereabouts Dick Cramer published a study of the quality of play over time, using this approach. He assumed (or "proved", if you accept his demonstration that this is true). . .he assumed that a player’s ability was constant over time, other than that players decline sharply in their last year in the majors. Using this assumption, one can find a player who played in the National League in 1936 and in the American League in 1942. If he was 7% better-than-league in 1936 and 11% better-than-league in 1942, that would suggest that the National League in 1936 was 4% tougher than the American League in 1942. If you make all possible such comparisons (ignoring final seasons), then you can evaluate the strength of each league relative to every other league.
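The pairwise comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Cramer's actual procedure: the function name, the data layout, and the sample percentages are all assumptions made up for the example, and the caller is assumed to have already dropped each player's final season.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_league_gaps(player_seasons):
    """Estimate how much tougher one league-season was than another.

    player_seasons maps a player to {league_season: pct_better_than_league}.
    Assuming a player's ability is constant, a player who stands less far
    above average in league-season A than in B implies that A was the
    tougher league by that difference. The gaps are averaged over every
    player who appeared in both.
    """
    diffs = defaultdict(list)
    for seasons in player_seasons.values():
        for a, b in combinations(sorted(seasons), 2):
            # Positive value: league-season `a` was tougher than `b`.
            diffs[(a, b)].append(seasons[b] - seasons[a])
    return {pair: sum(vals) / len(vals) for pair, vals in diffs.items()}


# Hypothetical figures: percent better than league average.
sample = {
    "Player A": {"1936 NL": 7.0, "1942 AL": 11.0},
    "Player B": {"1936 NL": -2.0, "1942 AL": 4.0},
}
gaps = pairwise_league_gaps(sample)
```

With these made-up figures, Player A suggests a 4-point gap and Player B a 6-point gap, so the sketch rates the 1936 National League as 5 points tougher than the 1942 American League. A fuller treatment would weight by playing time and chain the pairwise gaps into a single scale.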

While there is of course a long list of problems with this approach, Cramer’s analysis yielded reasonable results on many issues. It showed that the quality of play in the major leagues dropped sharply during World War II, that it increased sharply after the breaking of the color line, that the American League was the stronger league from 1920 until the late 1940s but the National League was the stronger league from the late 1940s until the late 1960s, and that the overall quality of competition dropped after expansion.

The study was too far ahead of its time. In order to do the analysis, Cramer had to take fixed positions on issues which couldn’t really be understood at the time. No one ever really followed up on his analysis, addressing the inherent problems—but it could still be done, and it should still be done.

Characteristic of Quality Analysis. While almost all of sabermetrics is based on study of accomplishments relative to the norms, the fact is that the norms themselves are sometimes indicative of quality. At least fifteen generalizations about the quality of play over time are known to be true—for example, the percentage of runs which are driven in is consistently higher at higher levels of competition than at lower levels of competition. By developing this type of analysis, we could evaluate the quality of play over time.

Modeling. We could build computer models which represent the problems involved. . . .for example, (a) how much below average are the players introduced into the league by expansion, (b) how many such players are introduced into the player population, and (c) how much does the expansion lower the overall quality of competition?
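As a toy version of the expansion questions (a), (b) and (c), one could treat league quality as a simple average over the player population. The sketch below does only that; the function name, the equal-playing-time assumption, and the sample numbers are mine, chosen for illustration, and a real model would need to be far more careful about who the new players are and how much they play.

```python
def expansion_quality_drop(incumbents, newcomers, newcomer_deficit):
    """Drop in league-average quality after newcomers join the population.

    incumbents: number of players already in the league, taken as average (0).
    newcomers: number of players added by expansion.
    newcomer_deficit: how far below the old average each newcomer is,
        in whatever unit quality is measured (here, a fraction).
    Assumes every player gets equal playing time (a simplification).
    """
    if incumbents < 0 or newcomers < 0 or incumbents + newcomers == 0:
        raise ValueError("player counts must be non-negative and not both zero")
    return newcomers * newcomer_deficit / (incumbents + newcomers)


# Hypothetical: 400 incumbents, 50 newcomers who are each 20% below average.
drop = expansion_quality_drop(incumbents=400, newcomers=50, newcomer_deficit=0.20)
```

Under those assumed inputs the league average falls by a little over 2%, which shows why the answer to (c) depends so heavily on the answers to (a) and (b).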

If we combine these different approaches and work on them until they converge on a common point, then we have a solid basis for understanding of the issue.

5) You are treating what is in fact an advantage for baseball as opposed to other sports as if this was a problem for baseball, and as if baseball was not allowed to have peculiar charms which other sports may lack. Well, first of all, I would argue that the history of the game—the history of any game—is the foundation of its appeal, and that one should never and can never sever the history of a sport from its current impact. The easiest demonstration that this is true is to ask yourself, "If we were to invent a game which is superior to baseball, football or basketball in every conceivable aspect, what would be the immediate impact of that?" It would have NO immediate impact, because the game would have no history, thus no following, thus no appeal.

This seems obvious when you put it that way, but the failure to understand this damaged baseball tremendously between 1952 and 1972. A few years ago I had a guy on Bill James Online arguing that it was necessary and inevitable that teams like the Pittsburgh Pirates, in older and declining cities, must inevitably move to younger and larger cities like San Antonio and Orlando. But this ignores an obvious reality: that 130 years have been invested in making people Pittsburgh Pirate fans. There are PEOPLE in San Antonio, lots of people, but they are not Pirates’ fans. If you move the Pirates to San Antonio, you are starting over in that long, long process of making people care about the team because they grew up caring about the team, and their father grew up caring about the team, and their grandfather grew up caring about the team. Between 1952 and 1972 baseball owners hopscotched around the country willy-nilly, abandoning their fans, because they failed to think this through. They confused the burst of attendance that follows a city’s introduction to major league baseball with real devotion to the team—and during that period, baseball per-game attendance declined steadily and alarmingly.

Not SOME of baseball’s appeal rests on its history, but ALL of its appeal. If you cut the game off from its history in the way that you suggest, blocking comparisons of Jose Altuve to Joe Morgan and Kris Bryant to Mike Schmidt, you are in fundamental conflict with the essence of the game.

If hockey does not do this, then. . .well, Hockey is an Ass. But it is not really necessary to make that argument because each sport has charms that the other sports lack.

I was watching football yesterday, and thinking of the unique appeal of football, part of which is that every play is shaped by the immediate decisions of the offensive team. The team DECIDES to run in a specific direction or to throw the ball to this player, or that one if this one is covered. These decisions have an immediate impact on the game which is not really paralleled in any other sport, although basketball teams and to a lesser extent baseball teams do plan and execute designed plays, but in basketball only a limited number of plays are designed, and those usually break down and require the players to improvise the plans from second to second. Anyway, my point is that football—and every sport—has unique elements of its appeal, or elements of its appeal which are weak in one sport but stronger in another.

In baseball, history is a large part of its appeal. The super-organized nature of the sport enables you to make a record of the game as you are watching it, take that record home with you and study the game, re-evaluate the game, in ways that are not possible in sports which depend on continuous action improvised from moment to moment. This is part of the appeal of the game.

We cannot write intelligently about baseball or baseball history while ignoring this part of the game’s appeal. That’s what the Hall of Fame IS; it is an argument about how players stand relative to the history of the game. It is a valid approach toward that issue to say "How does the player stand with respect to HIS OWN time?", but the debate does not end there and cannot end there and will not end there. If you pretend that it does, you are cutting YOURSELF off from the debate.

And if sabermetrics pretends that the debate can be conducted without this part of the discussion, then we are eliminating OURSELVES from the discussion. This is pointless, and it is wrong.

I would also harangue you a little bit about the statement that "that’s just me" or that’s just my way of thinking about the issue. I remember a sportswriter that I very much admired as a young boy, a Topeka, Kansas writer named Bob Hentzen. Hentzen would give his opinion about something, but then, when confronted with an opposing opinion, would use one of two expressions: We all understand that the newspaper is just fish wrap, or Whatever we say, it’s lining the bird cage tomorrow morning.

Sports writing—as opposed to sports analysis—sports writing spins its wheels. The debates about who was the Most Valuable Player were the same in 1970 as they were in 1950, except that the names were changed, but the lines of analysis were exactly the same, unmoved by what anyone said. The accepted convention of sportswriters was that they did not change the debate by what they said. I have my opinion, you have yours; it is all the same.

I am not an arrogant person, I don’t think, but that is the difference between them and us—that we DO accept the challenge of permanently changing the debate. It’s NOT fish wrap tomorrow, what we say about the issue is supposed to change the debate.

Of course it is appropriate, in debating the Hall of Fame credentials of Andruw Jones or Mike Mussina, or in debating the MVP candidacy of Aaron Judge or Giancarlo Stanton, to have as much respect for the opinions of others as for your own; this is just how I see the issue, and I acknowledge that there are other equally valid ways to look at the problem. But sabermetrics is not a collection of individual opinions, one as good as the other; it is a commitment to resolve the issues by finding the compelling logic on the underlying problems. This is an underlying problem. I don’t think that we have the freedom to shrug our shoulders here and say that one opinion is as good as the next.

3. Tom

Bill:

Your points are all well-balanced, so feel free to post it as-is, even if it can be superficially characterized as being "anti-Tango" (and for those who do that, well, that's on them). And you can post my prior email, as well as this one to bookend it if you like.

My larger point is that BEFORE we start comparing say Scott Rolen to 3B born 50-100 years before he was, let's compare him to his peers (all nonpitchers born within 5 years) and his overlapping generations (nonpitchers born within 15 years). I would say I'm part of the very quiet minority that does this.

Inherent in the wins-as-the-great-equalizer debate is that it treats winning percentage of .500 as a constant. This forces a specific mindset: (a) do you adjust this across time, so that a .500 win% in 1917 is not the same as in 2017? (b) do you accept it is the same, solely for the purposes of peer-comparisons?

If you accept (b), we're holding the players accountable to their era only. And so any comparison of Ty Cobb to Rickey Henderson is limited to their comparisons to their direct peers, and then players of the early 1900s, as a group, are being treated as equals to the players of the 1980s and of today (.500 = .500). So, we're actually comparing Cobb and Rickey through this "common baseline".

If you accept (a), you need to accept that players have gotten better across time, just like in every other sport, and so, the average player in Cobb's time would likely be a bench player today, at best. I say this not only because players must have gotten physically stronger and faster, but also because 3/8ths of the best players today would have been precluded from playing during the segregation era. If we use players of the 1900s as any kind of benchmark, by this assumption, we'd be voting in ten players every year.

In my view, before we jump through the hoops that a careful timeline comparison would require (including establishing which of the two assumptions to follow, on which no one will agree anyway), let's at least do the easy part and compare players to their direct peers.

If Scott Rolen is one of the 12 (or so) best nonpitchers of his peers, great, put him in.

If Scott Rolen is one of the 40 (or so) best nonpitchers of his overlapping generations, then great, put him in.

(And I'd rather see a debate as to what IS a good range to have.)

If Scott Rolen is NEITHER of the above, THEN, let's look at the history of baseball. Then you can try to find a bone for him to put him in the hall of fame.

My point is that if Rolen qualifies under one of the two main tests, we can't knock him out because he fails the final test. He just needs to pass any of them.

And same applies for Andruw Jones. His case does not rest on him being the best fielding outfielder of all-time. It's possible he can pass either of the peer or generational tests. Let's focus on the main points first. If he fails both, then we can have the spirited debate of Andruw, Willie Mays, and Paul Blair. In my view, we jumped over the main points directly to the Mays/Blair arguments.

I've been relying heavily on a player's birth date, more for logistical reasons. But what I would do is find the player's "peak" value point (say, take 7 consecutive years, and mark year #4 as his central year). This makes the most sense for "peers" like Randy Johnson and Dwight Gooden, who were born in proximity, but whose peaks are 10-15 years apart.

Then, find every player whose central years are within 5 years of that. That's the "peer" group. Then find every player whose central years are within 15 years of that. That's the "generational" group. So, RJ and Gooden are not part of the same peer group, but they are part of the same generational group.
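Here is one way the central-year grouping could be sketched in Python. Everything specific in it is an assumption made for illustration: the function names, the use of a generic yearly value (WAR or anything similar), treating "within 5 years" as inclusive, and the sample players and numbers, which are hypothetical.

```python
def central_year(value_by_year, window=7):
    """Return the middle year of the best run of `window` consecutive seasons."""
    years = sorted(value_by_year)
    first, last = years[0], years[-1]
    last_start = max(first, last - window + 1)
    best_start, best_total = first, float("-inf")
    for start in range(first, last_start + 1):
        # Seasons missing from the record count as zero value.
        total = sum(value_by_year.get(y, 0) for y in range(start, start + window))
        if total > best_total:
            best_start, best_total = start, total
    # Year #4 of a 7-year window, capped at the last season actually played.
    return min(best_start + window // 2, last)


def group_around(player, central_years, radius):
    """Other players whose central year is within `radius` years (inclusive)."""
    center = central_years[player]
    return sorted(
        name for name, year in central_years.items()
        if name != player and abs(year - center) <= radius
    )


# Hypothetical season values and central years, for illustration only.
seasons = {2000: 1, 2001: 2, 2002: 5, 2003: 6, 2004: 7, 2005: 6,
           2006: 5, 2007: 6, 2008: 5, 2009: 1, 2010: 0}
centers = {"A": 1986, "B": 1999, "C": 2001, "D": 2020,
           "E": central_year(seasons)}
peers = group_around("B", centers, radius=5)
generation = group_around("B", centers, radius=15)
```

In this made-up data, player E's best seven-year run is 2002-2008, so his central year is 2005. That puts him six years from player B: outside B's peer group but inside B's generational group, the same relationship described for Johnson and Gooden.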

Feel free to come up with better names.

Once you do that, we can then decide around "how many" players you'd want to honor. And decide how segregation and expansion impact things.

Tom

COMMENTS (8 Comments, most recent shown first)

tangotiger
ks: it is only done when creating these all-time lists, usually to sell books or magazines.

When it comes to the Hall of Fame for example, you will NEVER see those comparisons being made, unless you are talking about an all-timer. You might see Mario Lemieux being compared to Jean Beliveau for example and usually just for one facet of their play.

Take the most recent inductees:
www.hhof.com/LegendsOfHockey/jsp/LegendsMembersByYear.jsp?type=Player

Dave Andreychuk
Paul Kariya
Mark Recchi
Teemu Selanne

These are great players, but not all-timers. The ONLY players they get compared against are their peers, the guys they actually competed against. You don't see Kariya being compared to Yvan Cournoyer for example (or anyone from the pre-Kariya era).

That's because we all realize that hockey has changed so much, not only in their size and speed, but also just style of play.

9:34 AM Jan 22nd

PeteRidges
Do we need to consider both views at the same time?

If we are looking at pitchers, an all-time comparison would suggest that 19th century pitchers did far more to help their teams win than 21st century pitchers- but we don't want to keep modern pitchers out of the Hall.

But if we say we want the same number of pitchers from each cohort, that means that the Hall of Fame should have as many starting pitchers who were born in the 50s as any other decade: but there really weren't many Hall-worthy pitchers between Blyleven and Clemens.

Why not try looking at both as far as we can?
2:59 PM Jan 21st

ksclacktc
Tango: I'm a huge hockey fan and you're being overly definitive with regards to player comps across eras in hockey. The NHL itself has created top 50 and top 100 lists for all time, and you have to compare players from different eras to do this.
www.thehockeynews.com/news/article/ken-campbell-ranks-the-top-100-nhl-players-of-all-time It is done all the time!
10:44 AM Jan 20th

MarisFan61
I think you're both 100% right. Truly.

It's definitely fun and interesting, and I think also meaningful, to compare players of different eras.
But, although meaningful, it's not at all clear what it means.

I compare them all the time, and it has meaning for me.
At the same time, I realize that there is no way to know what it really means. It's not even clear what we think we're talking about when we talk about it.

Do we mean taking the Ty Cobb of 1910 and imagining that exact player in some other era? As Tom says, there's a high chance that in any reasonably modern era the guy would have been a bench player at most. Do we want to take into account the better training, nutrition, health etc. etc. of modern times, and imagine what Ty Cobb would be, which includes that he'd probably be bigger physically? Do we just mean how good he was in his own time relative to his peers, whether with or without what we consider to be some meaningful time adjustment?

I'm sure that to many, this will seem like a meaningless Clintonian thing, like many of my comments must seem. :-)
But really, folks -- this depends big-time on what we think we mean.

And I'm saying that Bill and Tom are both 100% right.
11:31 AM Jan 19th

FrankD
Gentlemen:
Thanks for the interesting and thoughtful debate. I think the issue of 'normalizing' data for players from different eras to allow for meaningful comparisons of said players is difficult but doable and necessary. It is a core baseball fan appeal to compare players of different eras. Even Tom's statements of identifying the best players of a peer group requires 'normalization' - in this case spatial normalization to allow comparison of players from different leagues. Somehow the analyst in choosing the best players of an era must perform some correction for league differences. These corrections may be small compared to time era corrections, but these corrections still have to be made to quantify who are the best players.
11:05 AM Jan 19th

Guy123
There are a few methods in addition to those cited by Bill that may allow us to measure the changing level of quality of play over time:

1) Pitchers as hitters. One of the clearest signs of improving play is that the offensive performance of pitchers, relative to position players, has declined rather steadily for 100 years. Because their ability to hit plays no role in their selection to pitch in MLB, PAH represent a rather stable measure of ability over time that can be used as a benchmark for comparisons. Today's pitchers may be somewhat better, or somewhat worse hitters than those in the 1920s (we would definitely want to adjust for their hitting less frequently in recent decades), but the change is likely to be modest.

2) Absolute measures. We can compare the size, strength, and speed of players over time, and also estimate the impact of these characteristics on baseball performance. Height and weight info is notoriously inaccurate, but not useless, and could be supplemented by better sources (military, HS/college, family medical records, NFL combine, etc.). We could develop a "forensic sabermetrics" that uses old film and video to measure the actual velocity of pitches, runners, and balls off the bat from earlier eras.

3) Variance and quality. The higher the quality of play, the less variation we see among players. That is why, famously, we have no .400 hitters today. We have excellent tools for measuring variance in each time period, and that allows us to reverse-engineer an implied level of play. (This may fall under Bill's point #2.)

I don't believe any one of these measures alone can answer these questions. But to the extent several measures provide fairly consistent answers, they may collectively tell us quite a bit.

7:09 AM Jan 19th

tangotiger
No one compares Steve Yzerman to Stan Mikita. Or Peter Stastny to Toe Blake.

You are talking about the all-time greats. If you wanted to compare Barry Bonds to Ted Williams, fine. But, guys at that level are not the issue.
9:58 PM Jan 18th

Rich Dunstan
Not sure why Tom says we don't compare recent greats to long-ago greats in hockey. I've seen and heard a lot of discussion as to whether Wayne Gretzky or Sidney Crosby was greater than Gordie Howe or Maurice Richard. I once interviewed a former teammate of Howe's (Paul Henderson) and asked him who was greater, Gordie or Gretzky? He seemed to think that was a natural question, and told me he thought Gretzky was better.
3:59 PM Jan 18th
