Chet Lemon and Kirk Gibson were teammates on the Detroit Tigers from 1982 through 1987.
Being teammates removes a lot of potential variables in trying to assess their relative contributions. We don’t have to adjust for league or park effects because they played in the same league, the same park. In evaluating their defense, we don’t have to adjust for the groundball/flyball tendencies of their respective pitching staffs. As fellow outfielders…granting that one was a centerfielder and the other a corner outfielder…we don’t have to worry significantly about positional adjustments.
Here they are as offensive contributors:
Player | PA | R | H | 2B | 3B | HR | RBI | SB (CS) | BB
Lemon | 3244 | 419 | 755 | 154 | 22 | 113 | 393 | 8 (19) | 315
Gibson | 3133 | 461 | 735 | 124 | 31 | 131 | 439 | 142 (36) | 351
And their triple-slash lines:
Player | BA | OBP | SLG | OPS+
Lemon | .267 | .352 | .457 | 121
Gibson | .272 | .358 | .485 | 130
They are similar offensive players. Gibson has a slight edge in the power department, and he was a better baserunner than Chet Lemon, but their contributions as hitters are very close.
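The power and baserunning edges can be checked against the counting stats in the first table. What follows is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the 600-PA season used for scaling is an illustrative assumption.

```python
# Counting stats copied from the first table (Tigers seasons, 1982-1987).
PLAYERS = {
    "Lemon": {"PA": 3244, "HR": 113, "SB": 8, "CS": 19},
    "Gibson": {"PA": 3133, "HR": 131, "SB": 142, "CS": 36},
}


def sb_success_rate(sb, cs):
    """Share of stolen-base attempts that succeeded."""
    return sb / (sb + cs)


def per_600_pa(count, pa):
    """Scale a counting stat to a 600-PA season (600 is an assumed, illustrative length)."""
    return count * 600 / pa


for name, s in PLAYERS.items():
    rate = sb_success_rate(s["SB"], s["CS"])
    hr = per_600_pa(s["HR"], s["PA"])
    print(f"{name}: SB success {rate:.1%}, HR per 600 PA {hr:.1f}")
```

Lemon was successful on roughly 30 percent of his steal attempts and Gibson on roughly 80 percent, while the homerun rates come out to about 21 and 25 per 600 plate appearances: a wide gap on the bases, a modest one in power.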
As defensive players, their margins are further apart. Lemon didn’t win any Gold Gloves, but he was thought of as a very good defensive outfielder. Gibson, though fast, was not a natural outfielder: he was a college football star who came into baseball late. He had a less-than-stellar throwing arm.
The comparison between Gibson and Lemon is, as you can guess, a continuation of an article I wrote earlier about Dave Winfield and Dwight Evans. I didn’t expect that article to generate such a useful conversation, but a lot of people ended up chiming in, and I thought it would be good to continue along.
Winfield and Evans both had good defensive reputations. They had similar defensive reputations: they were judged to be decently mobile corner outfielders with very strong throwing arms.
The advanced analytics…the analytics that WAR is founded on…concluded that Evans was a good defensive outfielder (+80 runs saved, according to Total Zone), and that Winfield was something of a butcher (-97 runs saved).
What do the metrics say about Lemon and Gibson?
Player | FanGraphs Total Zone
Lemon | 67
Gibson | 8
Does that seem accurate?
Sure. Of course. That seems like a very reasonable measure of the defensive contributions of Chet Lemon and Kirk Gibson from 1982 to 1987. Lemon was a fine defensive outfielder, and Gibson, though not excellent, was at least capable of getting to a few tough outs.
From that, we get the following breakdown of each player’s contribution, and their respective WAR:
Player | Off | Def | fWAR
Lemon | 81.4 | 62.6 | 25.6
Gibson | 121.8 | -25.2 | 20.6
‘Off’ stands for offense, and it is a combination of hitting and baserunning. ‘Def’ is Defense, and it measures defensive runs above the league average, making an additional positional adjustment. That’s explained in the other article, but I’m repeating myself for those who missed it.
Tallying both sides of the equation, the FanGraphs WAR tally credits Lemon as being a better player than Gibson during their years on the Tigers. Not much better, but a little better. Gibson was a slightly better hitter and baserunner, but Lemon was way ahead on defense, and he played a tougher position. Added up, Lemon is ahead.
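The tally can be roughed out from the Off and Def columns alone. This is a back-of-the-envelope sketch under stated assumptions, not FanGraphs’ actual calculation: it uses a rule-of-thumb ten runs per win, and it ignores the league and replacement-level adjustments that full fWAR includes, on the assumption that they roughly cancel for two players with similar playing time. The names below are hypothetical.

```python
RUNS_PER_WIN = 10.0  # assumption: a common rule of thumb, not FanGraphs' exact yearly figure

# Off, Def, and fWAR copied from the table above.
VALUES = {
    "Lemon": {"off": 81.4, "def": 62.6, "fwar": 25.6},
    "Gibson": {"off": 121.8, "def": -25.2, "fwar": 20.6},
}


def runs_above_average(player):
    """Offense plus defense, both expressed in runs relative to average."""
    return player["off"] + player["def"]


lemon_runs = runs_above_average(VALUES["Lemon"])
gibson_runs = runs_above_average(VALUES["Gibson"])
run_gap = lemon_runs - gibson_runs
approx_win_gap = run_gap / RUNS_PER_WIN
listed_win_gap = VALUES["Lemon"]["fwar"] - VALUES["Gibson"]["fwar"]

print(f"Lemon:  {lemon_runs:.1f} runs above average")
print(f"Gibson: {gibson_runs:.1f} runs above average")
print(f"Gap: {run_gap:.1f} runs, about {approx_win_gap:.1f} wins")
print(f"Gap in listed fWAR: {listed_win_gap:.1f} wins")
```

Lemon comes out about 47 runs ahead, or a little under five wins at ten runs per win, which sits close to the five-win gap between the listed fWAR figures. The remainder is plausibly the adjustments this sketch leaves out.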
Do I buy that?
Of course I do. The numbers fit the story. Chet Lemon was a good defensive outfielder. Gibson wasn’t a great, instinctive player, but he could run at that point in his career. He had been a wide receiver in college: we can chalk his positive defensive contributions up to speed.
Not only do I buy it: I think it’s beautiful. It is a beautiful distillation of a wide range of variables that comes to a conclusion that elucidates something useful. It lets in some light.
There is a logic to the story that WAR is trying to tell, one that takes what observers noted about each player, and gives those observations a kind of depth by diving into the numbers.
We can understand Lemon and Gibson’s defensive contributions in two ways, and they align:
Player | Consensus Opinion | Statistical Dive
Chet Lemon | An excellent defensive outfielder who was probably underrated b/c there were some strong OF’s in the AL during his peak | +66 Total Zone Runs
Kirk Gibson | A fast defensive OF who could play CF in a pinch but was better suited for corner OF. Not a great throwing arm. | +8 Total Zone Runs
The metrics that WAR uses get all of this exactly right. They successfully identify Gibson as being the slightly better offensive player, and they successfully understand Lemon as having an edge defensively. Overall, they give Lemon a nod over Gibson. It all lines up.
I know I sometimes come across as anti-WAR. I’m not. I don’t want to get that reputation, that interpretation. A lot of people put a great deal of work into making WAR a tremendous statistic, and I wouldn’t want to disparage those efforts. Trying to tackle a question like ‘What was Chet Lemon like as a defensive player’ forty years after his career started is a monumentally difficult task, and I am not trying to dishonor that work.
What gets my hackles up, I suppose, is certainty. What concerns me is that there are a lot of people in our community who have a tendency to accept any conclusion of WAR as Gospel Truth.
The distinction between Lemon and Gibson is useful because it shows the general accuracy of the metrics that FanGraphs and Baseball-Reference use, and it gives us a decent frame for what we might expect a comparison of Winfield and Evans to look like.
Dwight Evans, like Lemon, was a natural outfielder. Considering that Evans played in a tougher park and had a better arm, and considering that Evans was winning a few of the Gold Glove awards that Lemon didn’t win, we’d expect him to rate a little higher than Lemon. As he does.
And if we think Winfield is an overrated defensive player…as I agree he was…we can imagine Gibson as a decent enough parallel. Gibson, like Winfield, wasn’t a natural outfielder, but both men had gifts that made managers think, "I can live with this guy in centerfield in a pinch. It’s not ideal, but I can live with it."
The Total Zone metric credits Gibson as being a few ticks in the positive. +8.
But Winfield is massively in the negative, and he didn’t tally his demerits only in his late years in the outfield. Those years ding him significantly, but he rates as below average for most of the 1980’s.
So we get this:
Player | Consensus Opinion | Statistical Dive
Dwight Evans | Reputationally excellent defensive OF famous for having a strong throwing arm. Awarded eight Gold Gloves. | +80 Total Zone Runs
Dave Winfield | Reputationally good defensive OF famous for having a strong throwing arm. Tended to play deep to steal homers. Awarded seven Gold Gloves. | -96 Total Zone Runs
While Lemon and Gibson were very obviously different defensive players, Evans and Winfield had identical strengths. Both had extremely strong arms which pushed them to the corner outfield. Both were rangy enough to patrol big fields. Evans probably had better positioning and a better read on balls, but Winfield was a little faster, and his extra five inches of height and background as a basketball player meant he could jump a fair bit better than Evans.
I maintain that there is a disconnect here. Evans and Winfield were similar players by reputation, and utterly dissimilar by the advanced defensive metrics. The gap between them isn’t a gap that makes intuitive sense, as the gap between Lemon and Gibson does. It is significantly at odds with their reputations.
And that small disconnect…leads to a much broader conclusion: that Dwight Evans was the greater player.
Do I buy that?
I’ll say, first, that the excellent work that Charles Saeger did in the comments section is very convincing. Charles went deep into the statistical database, and concluded that Evans had 192 more putouts than expected, while Winfield had a shortfall of 226 putouts. His research also finds that Winfield, strong arm aside, had a below average assist rate. Charles’ work is so thorough that I have no interest in trying to nit-pick at his points. You should read his work and take his word for it. There’s every reason to believe he’s correct.
I remain skeptical that Winfield was so poor a defensive outfielder that it makes up for the other differences between him and Dwight Evans. If I was asked to take one player for their whole career, I’d take Winfield.
But I am closer to buying the conclusion that WAR outlines than I was a week ago. That should read as a credit to Charles’ work, and Tom’s work. To everyone else who defended WAR, too.
There are a few stray questions worth asking, which I will mention to keep the conversation going. I don’t have a conclusion about these questions, just my biases and leans, but I wanted to share them with you.
1. If the metrics are right, how in the world did the observers get Winfield so wrong?
I know that the Gold Glove awards in the 1970’s and 1980’s were dubiously slanted towards star hitters. And certainly, Winfield had the type of skills (ability to leap tall buildings, ability to throw baseballs at cannon velocities) that would make him look impressive. Still, it is impressive that such a poor defensive player would manage to fool people for so long. It’s a neat trick.
2. How much do co-outfielders impact defensive metrics?
Reader ‘rtayatay’ pointed out that Winfield played with Devon White in Anaheim, two years where Winfield rated as a very poor defensive outfielder. White also had a poor defensive year, by his standards, in 1990, and then he bounced back in 1991. Winfield played with Rickey in New York, who was another good outfielder. How much does the presence of a Devon White or Henderson impact the scores of other outfielders?
3. How much blame can we pin on managers or GM’s?
Dwight Evans was moved to first-base half-time in 1987, giving him the chance to rest his legs every other day. He was a full-time DH by 1990. Winfield, who had many more miles on his legs, stayed in the outfield for every year until 1992, when the Jays finally made him a full-time DH. He went back out in 1993 and 1994.
Those ‘old’ years – those years when a player with less of a reputation would’ve been moved to 1B or DH – significantly cut into Winfield’s reputation. Is it fair to knock him down because managers and GM’s reasoned he could handle the position, when Evans had a team that better recognized his limits?
4. The ‘robbing dingers’ question.
The reputation on Winfield is that he played deep to steal homeruns. Charles did a good job showing that Winfield didn’t do a great job of limiting doubles, but what about homeruns? Is there a reasonable way to calculate how many homers Winfield might have turned into outs, or is the event so rare that we can’t get close to approximating that value?
5. Where are other outliers between reputation and statistical analysis, and what meaning can be gleaned in looking at parallels?
I’m interested in Evans and Winfield because…repeating myself…they were defensively similar players: corner OF’ers who could play center in a pinch, who had great throwing arms. They are similar, too, in the subjective opinions of the people who watched them. They are dissimilar in the deep statistical dive into their records.
Assuming that the statistical record is correct, there is useful information to unpack. What were the factors that allowed observers to draw such conclusions? What other players might be underrated or overrated by these factors, and how can we right the subjective judgements of the past?
* * *
Wrapping this up, I suppose some of you…those of you who have skepticism of WAR…will read this as a capitulation. Some of you who are pro-WAR will count this as a victory.
That’s fine: I have no control over how anyone reacts to what I write. But I don’t view it as either a capitulation to better reasoning or a failure of skepticism. What drives my thinking – all of my thinking – is a strong rejection of certainty.
WAR is a useful statistic, but its effort to answer all of the questions leads to a mode of thinking that allows a kind of certainty to creep in, one that other baseball statistics don’t allow. If you say that Tony Gwynn is the best player because he hit .370 last year, I can counter with Andre Dawson’s 49 homeruns. If you say, however, that Tony Gwynn had a WAR of 8.6, there is no statistical counter to it. WAR has claimed all of the pieces.
And it has every right to. I don’t object to the effort to develop an all-encompassing statistic: what I object to is the mode of thinking that a statistic like WAR brings forward: a mode of analysis that is convinced of its certainty. This mode of thinking is becoming pervasive in our community, and it shuts down better conversations. It narrows the avenues we can take towards understanding.
Am I certain that Dave Winfield is a better player than Dwight Evans? I am not. But I am still not certain that WAR gets it right, either. I still believe there are gaps in the narrative, errors in the code.
David Fleming is a writer living in western Virginia. He welcomes comments, questions, and suggestions here and at dfleming1986@yahoo.com.