Matt Kemp and the Limits of WAR

By Dave Fleming

March 28, 2017

This is my second consecutive article about the Braves, which breaks my previous record of consecutive articles about the Braves by exactly two.

I’ve meant to write other articles in between these, but life has a habit of getting in the way of things. I wanted to get something written about the Red Sox, and their lefty-heavy rotation. I wanted to check in with Billy Hamilton. And Russell Martin deserves a little bit of a write up. I’ve never written a word about Russell Martin, which I should maybe do something about. Russell Martin is a really interesting player.

But we’re back on the Braves. What can you do? I have to write where the spirit leads me.

* * *

Matt Kemp is slated to be the Braves leftfielder in 2017. He is coming off a season during which he hit .268 with 35 homers and 108 runs batted in. He scored 89 runs. He hit 39 doubles. He did that while splitting time between two teams: San Diego and Atlanta. San Diego is pretty rough on power hitters. You all know that.

I don’t want to imply that everything was positive about Matt Kemp last year. He drew just 36 walks, and had an on-base percentage of .305. That’s not good. The NL’s cumulative on-base percentage was .322 last year, and that is including the pitchers. Matt Kemp wasn’t great at getting on base.

And he was a pretty bad defensive player. He only made three errors, but his range was abysmal, and the advanced defensive metrics all give him negative marks. He’s getting old. It happens to all of us.

As a baseball player, Matt Kemp has one strength, and a bunch of weaknesses. His strength is that he hits for power. Last year he finished 31st in the majors in Isolated Power. His .231 mark was a few ticks below Mike Trout (.235) and level with Robinson Cano (.231). He was ahead of guys like Votto and J.D. Martinez and Hanley Ramirez. That’s his one positive: he cranks out the extra-base hits.

And then there are the negatives. He doesn’t get on base. He’s no longer a good defensive outfielder. He attempted one stolen base last year, which is a far cry from the 51 attempts he made in 2011.

So that’s who Matt Kemp is. We can all agree on those general parameters. He can hit a baseball much farther than I’ve ever hit a baseball, but purely as a major league player, Matt Kemp rates as a negative at every other facet of the game. I don’t think any of us disagree about this.

* * *

Which brings us to a recent article posted at FanGraphs, listing the rankings of the left-fielders of all major league teams by position. This is one part of an absolutely essential and brilliant series projecting team rankings at all positions, including the rotations and bullpens of each team. I look forward to this every year, and I encourage all of you to check the whole thing out at FanGraphs. I think they’re wrapping up the bullpens today.

What I want to talk about is where the Braves, and Matt Kemp, rank among major league left fielders.

There are thirty teams in major league baseball. Of those thirty teams, FanGraphs ranks the Atlanta Braves 29th by anticipated production from their leftfielders. Impressively (or un-impressively), the Braves clock in with one of the few negative totals at a position in all of baseball. Here’s their table:

Name	PA	AVG	OBP	SLG	Bat	BsR	Fld	WAR
Matt Kemp	630	.266	.315	.455	1.2	-0.8	-13.1	0.1
Emilio Bonifacio	35	.247	.296	.315	-1.5	0.1	-0.3	-0.1
Jace Peterson	28	.243	.325	.348	-0.6	0	-0.1	0
Mel Rojas Jr.	7	.227	.287	.354	-0.3	0	0	0
Total	700	.263	.314	.443	-1.2	-0.7	-13.5	-0.1

FanGraphs projects Matt Kemp getting the bulk of playing time in leftfield, and they project that he will be about the same player that he was in 2016: a decent power hitter, a less-than-ideal baserunner, and a poor defensive outfielder.

When I saw the Braves listed 29th in baseball in leftfield, my brain processed the information in two distinct ways. I thought:

That’s probably right. Matt Kemp isn’t a very good ballplayer anymore. He can’t get on base, and he’s terrible in the field and on the base paths. All he does is hit dingers.

And I thought:

Home runs matter. And just who the hell was going to drive in runs for the Braves if not Matt Kemp?

Let’s unpack these responses for a second.

The first response is the one that I’ve been conditioned to think from my years of immersion in sabermetric-leaning baseball writing. Home runs matter, but so does defense. So does a player’s baserunning. So does getting on base. That’s important. That’s the most important thing.

I think this thinking is right. On-base percentage matters. Catching liners in the outfield matters. Matt Kemp might’ve walloped a few dingers last year, but he used up a lot of outs, too, and gave back runs on defense.

So there is a part of my brain that thinks it is absolutely reasonable to think that Matt Kemp, playing full-time, is a ‘zero’ player. There is a part of me that doesn’t question valuing a 35-HR, 108 RBI season as being worth 0.7 Wins Above Replacement, because I understand that 35 homers and 108 runs batted in don’t cover a wide percentage of events that Matt Kemp was involved in during the 2016 season.

But another part of me thinks a team just needs home runs, damn it.

A baseball season is a long thing. There are times when you can win with singles and doubles, and there are times when you can win by a well-executed sacrifice bunt, or by a monster bullpen. But there are times when you need instant runs, and when those times happen, it helps to have a couple guys who can hit the ball out of the park, even if they don’t do anything else.

This is where WAR comes off the rails for me, frankly. This is where I think the metric whiffs in its effort to encompass the totality of a player. Saying that Matt Kemp has less value to the Braves than, say, a platoon of Eddie Rosario and Robbie Grossman have for the Twins strikes me as not only wrong, but silly.

It’s silly in part because it is understanding Matt Kemp in a vacuum: what does he do, and how does it compare to what other individual players do? There is no understanding of whether or not his specific skills or flaws matter to his team. Matt Kemp is a 0.7-win player on the Braves, and he’d be a 0.7-win player if he played on the New York Mets.

I don’t buy this. The Braves have one decent power hitter on their roster (Freeman), and a string of singles and doubles hitters. Getting Matt Kemp helped them because he gave them a second guy capable of hitting a homerun every now and then, even if that second guy was a crummy defender. That was an important add to the Braves offense.

And it wouldn’t have mattered at all to a team like the Mets. What would the Mets have done with another low-OBP slugging OF? They had already cornered the market on those guys. They could put one on every position on the diamond, and have a spare to talk to David Wright.

Matt Kemp played 56 games for the Braves last year, posting a .280/.336/.519 batting line. He was third on the team in homers and fifth in RBI’s, despite playing a third of a season. He was the Atlanta version of Gary Sanchez.

For that, Baseball-Reference tallied his WAR at 0.0. If you expand the decimal down, it’s actually somewhere in the negatives. Baseball-Reference calculated Kemp as having had negative value for the Braves in 2016.

Let’s unpack that further. Chase d’Arnaud, in about the same number of plate appearances for the Braves, posted a .245/.317/.335 batting line. He hit one homer. He is credited with a WAR of 0.3. Jeff Francoeur was at .249/.290/.381, in roughly the same number of plate appearances as Kemp. He’s credited with 0.7 WAR. Unless Baseball-Reference gives extra credit for players with French names, this doesn’t pass the smell test.

Name	PA	AVG	OBP	SLG	WAR
Matt Kemp	241	.280	.336	.519	-0.0
Jeff Francoeur	276	.249	.290	.381	0.7

This is ridiculous, right? We can all see that this is ridiculous. Jeff Francoeur could be the reincarnation of peak Willie Mays defensively, but there is no way that difference makes up for a forty-six point gap in on-base percentage, and a 138-point gap in slugging percentage, not unless Matt Kemp lay down and took a nap every time he played left. Jeff Francoeur absolutely did not help the Braves win more games than Matt Kemp last year, and any metric that argues that his value significantly outpaced Kemp’s contribution to the Braves is missing something big.
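
For anyone who wants to check the arithmetic, here is a minimal Python sketch that uses only the two slash lines from the table above. The names `SlashLine` and `point_gap` are my own labels for illustration, and isolated power is taken to be slugging percentage minus batting average.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlashLine:
    """A batting line as listed in the table: AVG / OBP / SLG."""
    name: str
    avg: float
    obp: float
    slg: float

    @property
    def iso(self) -> float:
        # Isolated power: slugging percentage minus batting average.
        return self.slg - self.avg


def point_gap(a: float, b: float) -> int:
    # Difference in "points" (thousandths), rounded to absorb float noise.
    return round((a - b) * 1000)


# 2016 lines with the Braves, copied from the table.
kemp = SlashLine("Matt Kemp", 0.280, 0.336, 0.519)
francoeur = SlashLine("Jeff Francoeur", 0.249, 0.290, 0.381)

obp_gap = point_gap(kemp.obp, francoeur.obp)
slg_gap = point_gap(kemp.slg, francoeur.slg)
iso_gap = point_gap(kemp.iso, francoeur.iso)

print(f"OBP gap: {obp_gap} points")
print(f"SLG gap: {slg_gap} points")
print(f"ISO gap: {iso_gap} points")
```

The first two numbers are the gaps quoted in the paragraph. The third is not from any WAR calculation; it just isolates how much of the slugging gap is pure extra-base power.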

And we know that Matt Kemp improved the Braves: the record strongly suggests that this is the case. Atlanta went 37-68 before they acquired Matt Kemp at the deadline…and they went 31-25 after he came over. What changed?

It wasn’t pitching and defense. The Braves allowed more runs per game after Kemp arrived, which suggests that Kemp’s defense might’ve hurt the team:

2016 Braves	W-L	Pitcher's ERA
Before Kemp	37-68	4.39
After Kemp	31-25	4.76

But if the Braves lost a bit on the defensive side of the coin, they more than made up for that loss on offense:

Before/After	W-L	Batter's Runs Per Game
Before Kemp	37-68	3.4
After Kemp	31-25	5.2
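
To put the two stretches on a common scale, the won-lost records in the tables above can be turned into winning percentages and 162-game paces. This is a rough sketch: the helper names are mine, and a "pace" here is nothing more than the winning percentage multiplied by 162, not a projection.

```python
def win_pct(wins: int, losses: int) -> float:
    """Share of decisions won."""
    return wins / (wins + losses)


def pace(wins: int, losses: int, games: int = 162) -> float:
    """Wins over a full schedule if the same winning percentage held."""
    return win_pct(wins, losses) * games


# Records from the tables: before and after Kemp joined the Braves.
splits = {"Before Kemp": (37, 68), "After Kemp": (31, 25)}

for label, (w, l) in splits.items():
    print(f"{label}: {win_pct(w, l):.3f} winning pct, {pace(w, l):.0f}-win pace")
```

By this arithmetic the Braves played at about a 57-win pace before the trade and about a 90-win pace after it.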

The team’s improved offense isn’t all about Kemp, of course. Freddie Freeman had a monster second-half, and Swanson started to get his legs under him as a major league player. Matt Kemp, alone, didn’t turn the Braves from a 100-loss team to a borderline contender.

But he quite obviously did help. He contributed in one area where the Braves really needed help (slugging), and the team started to win games. His presence had to have mattered. The change in the Braves offensive performance can’t be credited entirely to coincidence. Matt Kemp changed the Braves for the better. Our best metrics don’t credit him for any of that improvement.

* * *

I didn’t follow the Braves last year. I suspect that most of you didn’t follow them too closely, either. And I had sort of forgotten about Matt Kemp entirely….I doubt that I ever thought about him last season. I don’t think I watched a single at-bat.

If I judged his 2016 season entirely through the lens of the WAR's tallied on FanGraphs or Baseball-Reference, I’d assume that Kemp had a mediocre year. I’d think, too, that the Braves made a mistake spending money on a player like Kemp, who has no ‘value’ to a baseball team, only cost.

But it turns out that’s not right. The Braves team experienced a dramatic turn-around at the start of August, and Matt Kemp seems like he was a key part of that turn-around. Although the range of his contributions as a baseball player is limited to power hitting, his presence on the Braves improved the team’s offense, and any metric that suggests Matt Kemp was a net-zero for Atlanta last year seems to be missing the wider picture.

I’m not trying to knock WAR. I reference the metric in just about every article I write for this site. What I’m trying to address is the divide that occurs in my head when I think about the metric. What I’m trying to understand is the way that it sometimes blinds me to other possibilities.

I think that the blindness is an individualized one. I think that the flaw in WAR…the big limit of the metric…is that it tries to reduce a lot of variables into separate ones. It sees individual players, but it does not adjust to the wider contexts of a team’s structure. There is no adjustment made for the fact that baseball teams need some guys to hit homeruns, just as some teams need left-handed bullpen arms or a good defensive shortstop.

In a way, this distinction echoes the old debate about ‘best’ and ‘most valuable.’ Matt Kemp is not a great baseball player, or even a particularly good one. But Matt Kemp had value for the Braves in 2016. He was an important part of the team last year, and if he’s healthy he’ll be important again in 2017.

I think that matters. I think that’s a distinction that has some importance in our struggle to understand the game, and I think it’s important to the story we tell about what happened in a year. Matt Kemp, from what I can tell, was an integral part of the Braves turnaround last year, and I didn't read a single article in any saber-leaning site that covered it, or even considered the possibility. Matt Kemp was a zero. Matt Kemp was washed up. Matt Kemp doesn’t matter.

He did matter. There’s a good chance that Matt Kemp helped turn around a last-place team. That’s an interesting story, in that it might tell us things we don’t know about batting order, or lineup structure, or what makes an offense efficient or inefficient. It's interesting because it gives us new avenues to understand the game. And it’s a fun story, because it’s always fun when a bad team starts to play good baseball.

We’re missing these stories. We shouldn’t.

Dave Fleming is a writer living in New Zealand. He welcomes comments, questions, and suggestions here and at dfleming1986@yahoo.com

COMMENTS (67 Comments, most recent shown first)

MarisFan61
Here's a little follow-up about the Rob Wood/Tom Tango discussion of Win Shares, whose link was posted by Studes.
I realize that nobody might still be looking here, but, since I did start looking at it and since nobody else has said anything about it, here are some initial notes. I won't follow up further right here unless there's some interest; I would put anything further on Reader Posts (and I'll put a notice over there about this post). Although actually, I may not have much extra to say, in view of these initial impressions.

-----------------------------------

What is clear from the git-go is that they do have a different 'world view' than Bill does about how a large metric should be viewed and approached, Tom more so than Rob; in fact, to be fair, I'm not sure that Rob's 'world view' of this is much different from Bill's, and he seems very well able to see the system from Bill's standpoint. Tom immediately shows his differing view, with a difference so basic that I'm not sure how terribly revealing his specific quarrels with the system can be. He seems simply to see differently what player value is about:

1. CONCEPT OF WIN DISTRIBUTION
A. Do players "contribute wins" or do they "contribute towards winning" or do they "contribute towards trying to win"?
I think in a team sport, a player can't contribute a win. The best he can do is contribute towards a win. However, this would mean that a pitcher pitching a no-hit 1-0 loss can't contribute towards a win, since his team didn't win. So, I rely on the concept that a player's performance contributes towards trying to win.

That makes totally perfect sense -- but it shows a different basic view than Bill; I'd call it a basic disbelief in the system's premise, and as the discussion proceeds (from the little bit further than I've gone so far), while Tom acknowledges what Bill is doing, it seems this disbelief keeps him from assessing it for what it is and for what it's doing.

Rob counters it, and tries to get the discussion indeed to look at the system on its own terms. Tom says he accepts that -- but adds that if you move away from "contribute towards trying to win," you are "introducing some error somewhere" -- although he does add further, "And that going to 162 games, these errors sorta/kinda balance out."

While it "introduces some error" in a technical sense, I see it more as making a fitting compromise between the technical and the "real"; by the latter I mean actual wins. To a large extent, this is about the willingness to take actual account of what many call "luck" (but which I do not, which doesn't mean I deny any element of that). As I see it, a big part of the beauty (literally beauty) and genius of Win Shares is this 'seasoning' of the technical with this big dose of pragmatics. Some may see this as involving "error" in a pure technical sense. I see it as countering the inaccuracy that results from a 'purer' technical approach, one in which all the math agrees with everything that you think it should.

Rob and Tom then get into a discussion of the relation between marginal win contributions and actual wins, particularly the lack of total correlation between them in Bill's system: the total of the marginal win contributions of a team's players is more than the actual win total. Rob suggests that maybe it's because of the absence of Loss Shares (as of then).

Recognizing that I'm light years behind all the principals in that discussion (Rob, Tom, and Bill in absentia), an amateur among pros, and therefore that my impressions may be without sound basis.....
My impression is that Rob and Tom are worrying about a mathematical issue that isn't germane to Win Shares and which the Win Shares system sails past, and in fact they seem to acknowledge some confusion about it, or at least uncertainty. It's an interesting thing they're bringing up (and I have to admit I don't fully understand it), but it's not evident to me that the thing matters. The system totals up the players' contributions and converts them to Win Shares, with results which (to me) generally meet the smell test extremely well. I'm not sure it matters if such-and-such, which in theory should equal this-and-that, doesn't.

I note with some surprise, and great satisfaction, that much of what Rob states is what I would have stated, and have stated. Like, in this discussion which came before Bill's adding Loss Shares, Rob says -- as I have said loudly -- that he's not sure Loss Shares is needed; it's presumably just "the flip side" of Win Shares, provided we keep in mind a player's playing time.

In sum, so far: My impression is that the discussion doesn't by any means poke big holes in the Win Share system. It reflects on the system interestingly, but more than anything else, I think it boldly shows how Win Shares was a cosmic break from anything prior in sabermetrics, defying and bypassing some of the previously assumed principles.
2:25 AM Apr 13th

Brock Hanke
Maris - That's interesting. I was taught that "rigor" meant that the proof (the term applied to proofs when I was taking math) was absolutely, positively, logically proven, with no chance for any sort of error or wiggle room. The broader concept that you cited was only implicitly there - if you don't know what you're doing, or if what you're doing doesn't make sense, then it's not logical - it lacks rigor. That is, no one ever bothered to try to teach us that broader concept on its own. It was just considered a byproduct of rigor. Applied math paid no attention to that sort of thing at all, because you're not proving things. You're doing word problems.
7:10 AM Apr 9th

MarisFan61
Brock: Well said -- but, I have to add, "rigor" had a different meaning, an importantly broader one, in what I learned.
It wasn't only about the numbers and the mathematical expressions and the steps that you took with them. It was also -- in fact began with -- an understanding of what you were doing and an overriding principle of making sense.
10:19 AM Apr 8th

Brock Hanke
Maris - I pretty much agree with what you said about significant digits. The important thing to me is that Bill, by rounding off to integers, made the implicit statement, "We're going to try to do honest statistics here. Honest statistics, being applied math, have to sweat significant digits. So, we're going to sweat significant digits, and that means no decimal points." The important underlying point is that baseball statistical analysis should use the rules and constraints of mathematical statistical analysis. One of the biggest constraints is significant digits.

I have less problem with the first decimal point of a WAR system, because, representing one tenth of a win, it represents one run. At least it represents one whole thing, rather than just a part of a thing.

In my experience, one of the hardest things to do, when talking with a sabermetrician who had all his math training in a liberal arts college, is to get him to free himself of "Rigor." In theoretical math, Rigor is the Gold Standard, if not the God Standard. In applied mathematics, it's assumed that Rigor is impossible, so we don't use it here. There are sabermetricians who have a LOT of problems abandoning rigor. I can't blame them; it's what they were taught. It's what I was taught during the half of my college career when I was taking liberal arts math. Fortunately for my perspective, I had taken the first half of my career in the Engineering school, taking Applied Math, so I became familiar with the differences. There are very serious differences between the two approaches, and statistics are, without doubt, applied math. Significant Digits. Not Rigor.
12:54 AM Apr 8th

MarisFan61
A couple of little things about Steve's post:

-- I bet you sometimes do feel 1 degree Fahrenheit!
If you just mean 'in the abstract,' i.e. judging whether the temp is 70 or 71, sure. But in terms of feeling the difference between 1 degree higher or lower, I'd bet you do. Probably this is most clear when we're working with indoor adjustments of thermostats. It's very common to feel chilly at a certain temp, then to feel fine after you've turned it up 1 degree; and if you turn it up 2 degrees because you think it's way too cool, usually you wind up baking. It might be thought that the actual change from these thermostat adjustments is greater than the amount that you turned it up, but for what it's worth, the thermometer on the thing generally does indicate that the temp increase is exactly that amount.
I don't know that this helps anything about Win Shares :-) but this next thing does.

-- "3 Win Shares" does feel pretty palpable to me, especially in the middle of the scale. Well, at the bottom too: There's a large difference between 1 and 4 but it doesn't matter because both players have done basically nothing. In the middle range, to me there's quite a difference between 12 and 15, and between 15 and 18.
If it's about position player regulars, 12 is pretty mediocre or worse; 15 is quite OK -- you can get by with him; 18, that's a pretty good player.
5:12 PM Apr 7th

MarisFan61
I'd be interested in other people's takes on the "Walk Through" of the Win Shares system.
(Studes' last post gives a link to that blog.)
Or your take on my take. :-)
Studes feels the "Walk Through" is a good critique of the system. I think it's a very good review of the system, and raises some reasonable questions. I think it doesn't poke much of a hole in the system, and mostly just shows the take of someone with a different attitude about baseball.
12:38 AM Apr 7th

MarisFan61
Brock: (re using just integers) Indeed that's exactly 'all' that Bill is doing -- but I regard it as an important philosophical and representational thing. To me it's an emblem, a label of sorts, loudly proclaiming, 'please, let's not have any bull about this.' The inclusion of decimals by other systems feels like putting forth an illusion of precision, and whether or not originally intended, an invitation to have plenty of bull about it -- and in any event, plenty of bull is what's done with it. (I know, of course, that many of you see it differently.)
10:40 AM Apr 6th

steve161
Interesting analogy by Dave relating WAR/Win Shares to Celsius/Fahrenheit. It's not only the basic scale, though, it's also the 'size' of the units. One WAR is a lot, one Win Share (or even three) doesn't feel like much.

Similarly I can feel a Celsius degree, but not a Fahrenheit one, so I prefer the former. Dave, your expatriate credentials are hereby revoked.
7:34 AM Apr 6th

Brock Hanke
Doesn't presenting Win Shares as integers do nothing more than acknowledge the lack of significant digits? I mean, a tenth of a Win Share is a 30th of a win, or about a third of a run. Baseball statistics do not support significant digits at that level. I admit to being sensitive to significant digits because 1) I had my college math in an Engineering school, where significant digits are an obsession, and 2) I'm so old that we only had slipsticks, not even calculators, and significant digits are a major issue with slipsticks. But sabermetrics is, really, a branch of applied (engineering) math, not theoretical math. We don't spend a lot of time here proving theorems.
4:46 AM Apr 6th

MarisFan61
P.S. Let me just say, MANY MANY THANKS for the Tango/Rob Wood PDF!!

I just glanced at it, and I see that I'll be having some fun hours with it.
12:24 AM Apr 6th

MarisFan61
Well.....

I have to note some things from those links.

From "Part I" of the "Walk Through": (who's the writer??? Does he maybe only go by "Patriot"?
BTW the 'Walk Through' is an excellent summary of the method.)

The area of Win Shares which James takes most pride in is the fielding system he developed, which first credits the team fielding as a whole and then distributes the team fielding Win Shares to fielding positions, and then finally to individual fielders. Coincidentally, fielding statistics are the area of sabermetrics that I am least qualified to comment on (if you consider me qualified to comment on any). So the fielding system may well be a conceptual breakthrough, although again I have my doubts.

So -- this guy says he didn't necessarily know that much about assessing defensive metrics.

Besides that, Part I is mostly just an intro.

Much of the writer's criticism in general relates to what I'd call "interactional" aspects, which are a big part of what I love about the Win Shares system and which I think are lacking in other methods that I've seen.

Part II doesn't have any major criticisms, just a couple of minor quibbles. It's mostly quite positive on Win Shares.

Part III has criticisms which to me look debatable either way. What he says about "Runs Created" appears to be debatable, and I don't see a problem with using "marginal runs to distribute absolute wins." I do see the issue of "zeroing out negative performers," but I don't think it's at all clear that on balance this is more of a negative than a positive, including because I think some of the stated argument is questionable: I don't think transferring the ignored negative value to middle-of-the-order hitters is a valid way to examine the question, because I don't agree that it does "nothing to change the run scoring of the team." (Granted, what I'm saying depends somewhat on thinking that batting order makes more of a difference than most of sabermetrics allows.)

Part IV has what appear to be some good questions about why Bill did certain things about pitching. Much of it is just that Bill doesn't explain why he decided so-and-so; those particular things don't bother me, because they don't look unreasonable.

Part V has a number of quibbles and questions re what Bill does on pitching, some of it phrased strongly but none of which to me seems really more than a shrug. I know that others might see it otherwise.

Part VI, the first of 2 parts on fielding, again has a series of quibbles that don't hit my radar. The writer just has a different 'world view' of baseball, and to some extent, of logic, than Bill did when he created the Win Share system, and than I do, and this shows itself again and again. The best/worst example is here in this part (VI):

"In discussing the .200 subtraction, James says “Intuitively, we would assume that one player who creates 50 runs while making 400 outs does not have one-half the offensive value of a player who creates 100 runs while making 400 outs.” This is either true or not true, depending on what you mean by “value.”"

Is he kidding??? "This is either true or not true, depending on what you mean by......."?
OF COURSE it's true. It's true by any common-sensical meaning of value. It's only 'not necessarily true' if you're locked into some robotic meaning of "value." Part of what I'd call the brilliance of the Win Share system is that Bill kept thinking of the common-sense actual reality of things and trying to discern how to get at it. I don't know the exact context of the quote from Bill, but it looks like he was probably coming from exactly such a place: Starting with the actual reality of a phenomenon, and then using it to help create a metric that is as meaningful as possible. What bothers me about other systems can be pretty well captured by saying that they don't do enough of this. Bill does it here, and this guy doesn't know what he's talking about. Or, he sort of does, but writes it off as being 'maybe true and maybe not true.'

Part VII: Continuing on fielding, besides saying again that he's not real qualified to criticize about it, he says that this part of the system (divvying the fielding WS among a team's players) seems OK.

Summing up, he re-states some quibbles, including (oddly, IMO) a complaint about Bill rounding off Win Shares to integers -- which is yet another aspect that I applaud and on which I un-applaud other systems, but that writer disagrees.

As said above, this "Walk Through" is a good summary of the Win Share system. I honestly don't think his criticisms amount to very much, and I think many of them show a way of analytical thinking that I would criticize.

I'll also go and look at the Tango/Rob Wood PDF paper.
I won't necessarily take up more space talking about that. :-)

Studes, thanks again for the links. Need I say, no need for you to reply. We know that you see all of this differently, and that you know a lot more about the stuff than I do.
12:16 AM Apr 6th

MarisFan61
Thanks for the extra posts, and the links!!!
11:07 PM Apr 5th

studes
I tried to find a good link or two encapsulating the Win Shares/WAR debate, and they are hard to find. However, I have found a couple of good links that critique Win Shares. One is Patriot's Win Shares Walkthrough. It's a multi-part series, but here is Part Two. You can find links to the other parts on the side column:

walksaber.blogspot.com/2005/12/rate-stat-series-pt-2.html

The other is the old Tango/Rob Wood PDF paper. Here is the link to the PDF:

www.tangotiger.net/winshares.pdf

I'm not going to debate these points with you folks. This was over for me ten years ago.

5:29 PM Apr 5th

studes
I understand that many people here prefer Win Shares, but I guess I want to state again that the majority of influential baseball analysts don't agree. There were many, many articles about the subject over the years, at THT, Tango's site and many other places, and I don't feel like rehashing them here. But it's not just happenstance or publicity that made WAR come out on top.

Yes, WAR is more mainstream because of Fangraphs and BRef, but there is a reason those sites chose WAR instead of Win Shares. The sabermetric community had come to a consensus and those sites followed suit.

By the way, those same debates led to WPA coming out of deep freeze and is also now featured at both sites. So we have Bill's original Win Shares publication to thank for that too.
4:59 PM Apr 5th

DaveFleming
I'm guessing that the biggest reason that WAR entered the mainstream while Win Shares hasn't caught on is tied to the access and distribution of WAR on sites like FanGraphs and BB-Ref. It's just much, much easier to access a sortable database of WAR than Win Shares, so people reference it more frequently. I like Win Shares a lot more than WAR, but I almost always use WAR as my back-of-napkin check because it's more readily available to source through.

I also think the 0-10 scale of WAR has a certain intuitive/aesthetic appeal. It's why Fahrenheit is the much better measure of outside temperature for human beings: a 1-100 scale makes more sense than whatever the hell Celsius is trying to do. I've lived in a Celsius country for seven years and it's still a mystery to me...
4:04 PM Apr 5th

MarisFan61
Studes: Thank you!

About "WAR" vs. Win Shares, I know that you understand them in far greater depth than I do, but from where I stand, I don't agree at all that WAR is a better system. Even considering that it's easier to work with (which is certainly true, even for folks like me) and even if its current results tend to seem more correct than those of Win Shares (which I don't agree with), I find the basic approach of Win Shares to be far more appealing and to offer greater potential. IMO it has the right idea in starting with actual wins, and its ways of dividing up the credit are more creative, including going way outside the box when there seem to be outside-the-box ways of getting at something.
11:37 AM Apr 5th

studes
You can do some things to WAR directly off Fangraphs or Baseball Reference. For instance, you can insert RE24 instead of batting above average if you value clutch hitting. Or you can make all the WAR equal to team wins on your own, in a spreadsheet, after downloading the vital stats from Fangraphs.

To do something more fundamental, all the data is available via Retrosheet. You do need Excel and/or (more likely) Access skills.

In the beginning, Win Shares received a lot of publicity, but more than a few analysts had problems with it and proposed their own framework, which is how WAR came into existence. Win Shares lost the battle among the majority of high-profile baseball analysts.

It didn't have much of anything to do with publicity or getting the word out. Shoot, I did more than just about anyone to get Win Shares out and about in the baseball community, but it just isn't as good a system as WAR.
10:19 AM Apr 5th

MarisFan61
Studes: What do you mean about how we can "make" WAR have this-and-that? (Like, clutch hitting, for example.)
Do you mean that readers/users can somehow do it (y'know, maybe like how anyone can futz with photos on iPhoto), or are you talking only about people who work with it in a technical way? I'm figuring it's probably the latter.

It has seemed to me that the main reason "WAR" has taken hold to such a greater extent than Win Shares is that Bill hasn't done as much to get it out there (notwithstanding the Win Shares book and his use of the system in the New Historical Abstract).
12:35 AM Apr 5th

studes
Just throwing some opinions in the mix:

I like the word "chance" better than luck too. Too bad we didn't start using that a long time ago when discussing random stuff.

MGL has said many times that he wishes Fangraphs and BRef would regress individual UZR (or DRS) to the mean each season. Given that we're still figuring this stuff out, I think that's good advice.

Having said that, Kemp's defensive rating last season was completely in line with his most recent previous seasons. Also, Win Shares essentially caps the range of defensive stats, but does it too much, IMO. See the difference between Kemp and Kiermaier (just half a win) in fielding Win Shares for a good example.

I'm still totally against the notion of dWAR. Replacement level should be applied to players in total, not individual components. I prefer to reference individual components vs. average.

WAR is more of a framework than Win Shares, and when we're criticizing WAR, we're really criticizing the current implementations of it at BRef and Fangraphs. You can make WAR fit each team's actual record pretty easily if you want to. You can add clutch hitting (as in Win Shares) if you want to. Etc.

Win Shares is less of a framework because Bill presented very specific instructions when rolling it out. He encouraged people to play with the assumptions (and some of us did) but those never really took off.

For a couple of years, we ran Win Shares at The Hardball Times while other sites were rolling out WAR, but Win Shares lost the competition. I fought it for a while, but in the end I had to agree that WAR is superior. And its inherent flexibility makes it a more likely platform for future innovation.

I agree with Dave's basic point that straight WAR on a team level misses some important team synergies, and we overlook those synergies at our peril. Still, it's tough to identify those synergies in any mathematical way. This article takes a stab but doesn't really prove anything.

Nice breakout stats on wins and losses, MarisFan. Love it.
1:57 PM Apr 4th

steve161
I share Maris' uneasiness with the term 'luck', while recognizing that factors other than skill matter. But it's not a simple either-or thing: if a pitcher makes a good pitch and the hitter manages to squib it past the infield, the pitcher was unlucky but the hitter, just because he got his bat on the ball, was skilled. If the hitter knocks the snot out of the ball but a fielder makes a brilliant play, the hitter was unlucky, the pitcher was lucky and the fielder was skilled.

There's also the somewhat negative connotation of the word 'luck', implying something undeserved. 'Chance' might be a more neutral equivalent.
9:27 AM Apr 4th

Brock Hanke
Since luck is exactly what I'm talking about, I probably should go into more detail. When you have a system for analyzing baseball, and your system does not agree with the actual results on the field, you have a disparity. Let's call the disparity "hidden value", just to have a term for it. There are two possible reasons for the disparity: luck and skill. If you think about it, there can't be any other choices. WAR assumes that all discrepancies, all of the hidden value, between the WAR system and actual results is due to luck, and so reconciles the actual results to the WAR results. Bill's WAR methods from before Win Shares do this just as much as any WAR system today. Win Shares, however, attributes all this Hidden Value to skill, and therefore reconciles the system results to the actual recorded game results.

The important point is to realize that luck and skill are pretty much your only choices. WAR makes one of those choices. Win Shares makes the other. It is very important to realize that luck and skill are the only options. So, no, I'm not willing to quibble on the question of "luck." Instead, I will dig in my heels and defend my position. Maris means no harm, but he did, very mildly, attack the most important part of my thinking on this. I'm always going to defend that. And so will you, if I inadvertently underestimate the core of what you are saying.
1:03 AM Apr 4th

MarisFan61
What Brock said about how "luck" is treated (although I'd quibble strongly about his calling it "luck") is exactly why it's a very significant misunderstanding if one fails to realize that the Win Share system uses actual wins as the basis. It assumes that actual wins, regardless of how they occurred, are the basic thing -- and then takes it from there.

That means, among other things, that it allows for possible importance of what we sometimes call "intangibles." BTW, need I say, I don't know if Bill conceives of it in such a way.
10:53 AM Apr 3rd

Brock Hanke
Dave - Thanks! -13.6 defensive runs is a perfectly reasonable score for a truly lousy outfielder. It's -1.36 defensive WAR, though.
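
The conversion behind that last line can be sketched in a few lines of Python. This is only an illustration: the figure of 10 runs per win is an assumed rule of thumb that matches the -13.6 to -1.36 step above, not a value taken from FanGraphs or Baseball Reference, and `runs_to_wins` is a made-up helper name.

```python
# Assumption: roughly 10 runs equal one win. This is a rule of thumb chosen
# to match the -13.6 runs -> -1.36 wins step in the comment, not a published constant.
RUNS_PER_WIN = 10.0


def runs_to_wins(runs: float, runs_per_win: float = RUNS_PER_WIN) -> float:
    """Convert a run total (negative means runs cost) into wins."""
    return runs / runs_per_win


kemp_defensive_runs = -13.6  # figure quoted in this thread
kemp_defensive_wins = runs_to_wins(kemp_defensive_runs)

print(f"{kemp_defensive_runs} runs is about {kemp_defensive_wins:.2f} wins")
```

Under that assumption, thirteen runs of bad defense is a little over one win, not thirteen.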
6:46 AM Apr 3rd

DaveFleming
Just to clarify on Brock's post...when a site like FanGraphs lists dWAR, they are actually listing 'defensive runs'. So when it lists Matt Kemp's dWAR as -13.6, they mean that his defense cost his teams 13.6 runs, not 13.6 wins.

Their defensive metrics, especially with outfielders, are still baffling....there are a lot of OF's with good defensive reputations who rate in the negatives by their metrics. But they don't credit poor Matt Kemp with thirteen whole games worth of terrible defense....just thirteen runs.
3:31 AM Apr 3rd

Brock Hanke
I suspect that a very large part of the hitter split between games won and games lost is due to the difference in opposing pitcher quality. If you were to take a hitter and look at his games from a point of view of what the opposing pitcher did the rest of the year, you might find that nearly all the hitting split disappears.

The big difference between Win Shares and WAR taken as approaches comes down to how they treat luck. All WAR systems, including the ones that Bill devised in the early 1980s, work by comparing team WAR to actual team W/L. But, if there is a difference, it's assumed that the team WAR is correct and the difference between it and actual Games W/L is due to luck. 100% luck. 0% skill. We know that can't be totally true, but no one came up with a WAR system that worked around it.

Win Shares assumes that, if there is a difference between system results and team W/L, then the difference is all due to skill, and the system results should be adjusted to fit the team's W/L. 100% skill. 0% luck. That can't be totally true, either, but so far, no one has figured out how to split the middle.

Unless I have missed several system changes, WAR measures things in terms of Wins. Matt Kemp had, in one year alone, a dWAR (defensive WAR) of -13.6. THIRTEEN GAMES? Lost to his team because of ONE player's DEFENSE? Barry Bonds didn't win games at that rate with his bat. I'd believe -3.6, but THIRTEEN? Can't be real.

I participate in another project where we try to rank all the players in MLB for each individual year. I compare players' WAR to their Win Shares for the year. The one most serious and consistent difference between WAR and Win Shares has been that WAR systematically overrates pitchers. Or Win Shares underrates them. My own personal opinion is that Win Shares is correct about this, but that's not the point. The point was just to let you know that the largest disconnect between WAR and Win Shares is in the ratings of pitchers.

2:44 AM Apr 3rd

MarisFan61
Dave: Yes indeed -- all of that, for sure.

But I view the basis on a team's actual wins to be a very key aspect of it, and I would indeed say that thinking it's based on some theoretical version of the team's wins rather than actual wins is a mistaken idea on a very basic aspect of the system.
Don't take it personally -- you've got lots of very good company :-) and I don't doubt your great knowledge about the system.
5:20 PM Apr 1st

DaveFleming
Saying that I have 'mistaken ideas' about Win Shares is like saying I don't know about Lou Gehrig because I can't remember if his AL RBI record is 183 or 184. There's a big difference between knowing a specific data point and understanding a system. A scientist might not know how much rain fell in my backyard last night, but he still understands where rain comes from.

I messed up the Pythag/actual W-L thing: sure. But I think that my wider answer to the original question demonstrates an understanding of the principles which drive the Win Shares metric, and the central difference between Win Shares and WAR.
2:10 PM Apr 1st

steve161
Maybe it's because there are so many WARs. They've got Win Shares outnumbered.
7:05 AM Apr 1st

MarisFan61
Well :-) ....since you usually know what you're talking about and you said the other thing.......

Thanks for clarifying.

The main thing I get from what happened is confirmation of the irony that I've pointed out many times:
Despite this being Bill's site, pretty few people here have a great knowledge or appreciation of the Win Share system, and even the most expert among our members may have mistaken ideas about it.

Although Bill wrote a whole book about the system, and used it prominently in The New Historical Abstract, he hasn't done nearly as much as the "WAR" people to get his system out there -- so they're way ahead in getting their system out there.
4:23 AM Apr 1st

DaveFleming
C'mon Maris....you can just figure the answer to your question out yourself. Do you need someone else to confirm it? We'll take your word for it if you show your math.

Anyway...to answer the question: Win Shares goes by actual win-loss record, not Pythag W-L. The Twins had 59 actual wins last year, 66 expected wins. Our site credits them with 176.8 Win Shares, which is probably a few decimals not extended far enough to get it exactly right. Not sure why I thought it was the pythag record, but it's the actual record. Which I prefer, incidentally.
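
A quick way to run that check, as a rough sketch: it assumes three Win Shares per team win (the same ratio implied by 67 wins and 201 Win Shares elsewhere in this thread) and uses only the Twins numbers quoted above. The function name is invented for the example.

```python
# Assumption: a team receives three Win Shares per win.
WIN_SHARES_PER_WIN = 3


def team_win_shares(wins: int) -> int:
    """Total Win Shares a team would have to distribute for a given win total."""
    return wins * WIN_SHARES_PER_WIN


listed_total = 176.8    # Twins total as listed on the site, per the comment
actual_wins = 59        # actual 2016 wins, per the comment
expected_wins = 66      # Pythagorean (expected) wins, per the comment

from_actual = team_win_shares(actual_wins)
from_expected = team_win_shares(expected_wins)

# Whichever basis lands closer to the listed total is the one being used.
gap_actual = abs(listed_total - from_actual)
gap_expected = abs(listed_total - from_expected)
basis = "actual" if gap_actual < gap_expected else "Pythagorean"

print(f"From actual wins: {from_actual}, from expected wins: {from_expected}")
print(f"Listed total {listed_total} matches the {basis} record")
```

The listed total sits within a fraction of a Win Share of the actual-record figure and about 21 away from the Pythagorean one.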
3:26 AM Apr 1st

MarisFan61
So -- nobody knows (or wants to say) the answer to that question about the 'wins' basis of Win Shares??

For mine own part :-) .....unless securely said otherwise, I'm assuming it continues to be actual wins, not the Pythagorean projection of wins.

-----------------------

Neat post by Marc, below.
8:12 PM Mar 31st

Marc Schneider
I never thought of sabermetrics as being a precise roadmap on how to play each and every game. It seems to me it's a guide to how to structure the team and to the generally most optimal strategies. IMO, you don't simply apply sabermetric analysis blindly without considering any other factors, including human psychology. At best, if you apply sabermetrics to line-up construction or in-game strategy, you might add a few runs per year, maybe a game or two. There is still room for some human intuition, although it should be informed by the information that sabermetrics provides. But if relief pitchers are convinced that they need a structured environment to perform best, it might be a self-fulfilling prophecy, but it's not something to simply ignore. I keep thinking about Robert McNamara and how, according to statistical analysis, the US won the Vietnam War.
1:48 PM Mar 31st

OldBackstop
Another example is the vet. Jeter batting second, Wright batting third....the Captain.
8:39 AM Mar 31st

MarisFan61
Kudos about the ego/comfort part!! Isn't that a great point, that if the sequencing per se doesn't make a difference of more than just a few runs a year, why not heed the emotional aspects and ego aspects? And in fact, it may well be that at least occasionally, those ego and comfort aspects do make a positive difference of at least that kind of amount.
9:35 PM Mar 30th

tangotiger
Dave: I've noted many times that more important than the sequencing is a player's ego or comfort, because whatever 2 run or 3 run gain you can have by optimization can get undone by 10 runs of dissatisfied performance. I'm very well aware that this is not a pure math problem.

If people want to argue the math, they are wrong. It's like arguing facts.

If people want to suggest that Ichiro batting 2nd instead of 1st would be a mistake, because batting 1st is a huge honor in Japan, and Ichiro would feel slighted: fine, I can buy it. Quantify that effect, and if it's more than 2 or 3 runs, then, fine, leave Ichiro as leadoff hitter.

8:26 PM Mar 30th

MarisFan61
Yes -- Dave asserted that it's hewed to the Pythagorean projection. If true, it's a shift I hadn't been aware of, and I think many others haven't been either.

Dave or anyone else -- please help!
2:45 PM Mar 30th

Rich Dunstan
Dave, what's the answer to Maris's question about whether Win Shares are currently based on a team's Pythagorean record or its actual record? It was certainly actual record in the original book--among other things, that issue is a key point in Bill's essay "The Snider/Mays Dilemma" (p 188), in which Snider's win shares for 1954 include his share of the credit for the fact that the Dodgers' actual record greatly exceeded their Pythagorean record. (The word "Pythagorean" doesn't actually appear in the article, but the concept does.)
2:13 PM Mar 30th

ventboys
Feeeeeeelings ....

Woah wo-wo feeeeeeeeelings ...

I think we all get sidetracked sometimes because we think of SABR as a science and we dismiss anything that isn't numerical in nature. We don't look for formulas or statistics, though; we look for evidence. Often evidence comes in statistical form, but not always.

Are feelings evidence? Of course. But that doesn't mean we can quantify them, or get them admitted into the courtroom. Casey Stengel interviews are evidence, too, but good luck coming up with a (Malaprop)/(kernel of wisdom) ratio.

What ever happened to that guy - Morris Albert? Something like that - who sang "Feelings"? If Karma is real, he died and came back as a feminine hygiene product, but I would guess he's still alive down in Brazil, playing Bocce with Julio Iglesias and the reanimated corpse of Robert Goulet.
11:43 AM Mar 30th

MarisFan61
re "With all this talk about players' feelings being affected by where they're batting in the lineup, we are in danger of turning into an anti-sabermetric site":

Bill wrote (and if he's seen some of the hundred-or-so times that I've cited it, he might be getting sick of it) :-) that sabermetrics isn't numbers; it's the search for better information. Sometimes that search may involve thinking about things that aren't in the numbers, or can't be in the numbers; sometimes it might lead to additional things we can do with data, that may shed light on what we wondered about, whether it's feelings or something else; sometimes it might leave us just realizing there are things that might be of importance that aren't in the numbers.

BTW in this case people did give numbers and they suggested that the feelings didn't much matter, so those of us talking about the feelings pretty much shut up about it. :-)
11:05 AM Mar 30th

dbutler69
This article reminds me of another (in my mind) limitation on WAR. There was a starting pitcher last year (unfortunately I can't remember who) who, in the first month or two of the season, had had several pretty good starts, and one really dreadful start that made his overall numbers look bad. He had a negative WAR at that point in the season. However, my thinking is that, no matter how bad he was in that game, no matter how many runs he gives up, that only counts as one loss. Once he gives up more than, say 5 runs (to make up a number) the team's win probability isn't going to drop by very much for each subsequent run he gives up, yet I think his WAR is still getting dinged because it's only looking at his overall numbers for the season. He had a bunch of games where he added value to his team, and one lousy game, and winds up with a negative WAR. Something like that probably doesn't happen often, and over a player's career, perhaps it always evens out, but it seems to me that, at least for pitchers, sometimes WAR doesn't tell the whole story.
9:57 AM Mar 30th

DaveFleming
The story of Murphy/Henderson is interesting.

Murphy started the 1979 season as the A's leadoff hitter. He did pretty good, posting a .301/.459/.437 line in the leadoff spot. The team brought up Rickey halfway through and slotted him in the leadoff spot, Murphy second. Murphy's numbers plummeted (.235 as a #2 hitter). He gradually moved down in the order...hitting 3rd, 4th, 6th, 7th, 8th, 9th. That's hitting every branch as you fall out of the tree.

1980 - Rickey hit 1, Murphy hit 2, all year.
1981 - Same.

In 1982, they did the same thing through the first 110 games, Rickey leadoff and Murphy second. Murphy scuffled, hitting .232, and probably complained about hitting behind Rickey. The manager decided to drop him down to third, and Murphy came alive, posting a .324/.419/.574 line.

But it didn't work out the next year. Getting away from Rickey wasn't a miracle cure, and Murphy hit like he always hit, albeit out of the #3 spot. In 1984 Murphy hit mostly 2nd or 3rd and did better in the #2 spot....turns out he missed Rickey a little. Next year Rickey was in the Bronx, inflating Mattingly's RBI count.
3:57 PM Mar 29th

OldBackstop
Dwayne Murphy batted .252, OBP .360 in the second spot and .253, .359 OBP in the third slot.

So shut up and sit down, Dwayne.

I'm sure there are cases where someone who runs their mouth about a position or who they are in front of or behind actually has poorer/better performance, but every time I've looked it up there haven't been any traces.

I'm sure psychologically...if Rickey Henderson isn't a friend, if he keeps thinking your name is Wayne and reintroducing himself every spring training, then watching him dance around while you have to take pitches or bunt could pisss you off. Likewise, if you are Lou Gehrig and keep having to see Ruth sweep the bases clear and then (the later fat Ruth) not score from second on a smash off the wall, that could make you crabby.

Although the numbers don't show that one either.
3:40 PM Mar 29th

MarisFan61
Dave: Good comment.
I just want to add, it isn't just the "psychological component" in how a player feels about being in a given spot in the batting order.

I think the thing of Dwayne Murphy hating to hit right behind Rickey Henderson is a terrific example for this. Can't we be quite sure this wasn't just psychological? I'd suggest it was very little psychological, if at all; it was strategic. Isn't it more likely that it was that hitting right behind Rickey affected these two very real and concrete things:

-- The way he was pitched, and
-- How he approached his at-bats.

I realize that this doesn't necessarily mean that having him in the #2 hole rather than elsewhere was worse for the team, because, to whatever extent it may have been disadvantageous for him to be there, it may have been equally disadvantageous for whoever else would be put there.

BUT -- and I'd say this is a terrific example for another thing we've discussed on here: IF it's disadvantageous for various players to be in such a place in the batting order, in terms of impairing their production and therefore their value, that's yet another argument for the traditional belief that it's best to have a large variety of different kinds of hitters in the lineup, and that certain kinds of players who might not be shown to have much calculatable value by usual metrics might actually have value in a particular role in the lineup -- like, with this example, if the player's 'natural game' is well suited to that spot, and especially if the things that he does have more value there than they do on the average.
3:21 PM Mar 29th

OldBackstop
Hmmm.....Matt Kemp wins:

Lowest dWAR 2010-2016 -13.6 (next is Prince UnFielder at -12.0)
Lowest dWAR 2012-2016 -9.8 (next is Shin Bark Ow -8.1)
Lowest dWAR 2014-2016 -8.1 (Next is Ryan Awkward -5.6)
Lowest dWAR 2015-2016 -4.9 (Next is Yesmany Errowas -5.0)

so...that's bad.

What's dWAR? Sounds French...

Personally, I think Dave is getting cold feet on his Braves pick for surprise team :-)

3:19 PM Mar 29th

DaveFleming
Well...I think Tom Tango knows a helluva lot more about batting order than I do, and I think that his argument boils down to a strict math perspective: in the realm of pure numbers, it probably doesn't matter all that much. I think that's all they're claiming...'this is what the math says.'

That leaves open, I think, the question of harder-to-quantify stuff. It might not matter how you stack a lineup, but certain hitters obviously prefer hitting in certain spots....we know this is true, right? Like Dustin Pedroia...I think, just an impression, that he really likes hitting 1 or 2, and getting an at-bat in the first inning.

Or Votto. Votto said that he really likes hitting behind Billy Hamilton. And Dwayne Murphy hated hitting second behind Rickey Henderson. Those are individual preferences, and we can't really count what Votto gains with Hamilton on base, or what Murphy loses hitting behind Rickey. We can guess, but it's tough to nail it down.

I don't think Bill or Tom discount the psychological parts of it....I think they are just speaking to the math end of the question, and people interpret it as a final word on the subject. But I shouldn't speak for them...can't get the accents right.

2:50 PM Mar 29th

MarisFan61
Dave: I commend you for having the [whatever we should call it] to be highlighting the thought that there may be things about batting order and lineup structure, things of importance, beyond what sabermetrics currently thinks -- or, in some instances, allows :-) :-) (like, how the boss of this site tends to dismiss and ridicule any such suggestion).

Bill is admirably open to just about anything, including questioning and re-questioning his own work. This thing of batting order seems like an odd kind of outlier on which to be so dismissive.
1:48 PM Mar 29th

Mike137
I think that the research done on lineups shows pretty clearly that teams don't need specific types of hitters. So I don't think that weird result in comparing Kemp to Francoeur has anything to do with undervaluing Kemp's home runs.

Baseball reference gives Francoeur an oWAR of -0.1 and Kemp an oWAR of 1.2 for his time in Atlanta. About 40% of a season each, so that is a difference of about 3 wins over a full season. I don't think that indicates that Kemp's offensive contributions are not being fully valued. But the dWAR's are 0.4 for Francoeur and -1.5 for Kemp, making Francoeur about 5 wins better than Kemp over a full season. That is the part that does not pass the smell test.
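
The full-season arithmetic in that paragraph can be sketched as follows. The 0.40 season share is the "about 40%" approximation from the paragraph itself, and `full_season_gap` is a hypothetical helper, so treat the output as rough.

```python
# Assumption: each player's Atlanta stint covers about 40% of a season,
# as stated in the comment. All WAR figures are the ones quoted there.
SEASON_SHARE = 0.40


def full_season_gap(better: float, worse: float, share: float = SEASON_SHARE) -> float:
    """Scale a partial-season WAR difference up to a full season."""
    return (better - worse) / share


kemp_owar, francoeur_owar = 1.2, -0.1
kemp_dwar, francoeur_dwar = -1.5, 0.4

offense_gap = full_season_gap(kemp_owar, francoeur_owar)   # in Kemp's favor
defense_gap = full_season_gap(francoeur_dwar, kemp_dwar)   # in Francoeur's favor
net_gap = defense_gap - offense_gap                        # in Francoeur's favor

print(f"Offense: Kemp ahead by about {offense_gap:.2f} wins per season")
print(f"Defense: Francoeur ahead by about {defense_gap:.2f} wins per season")
print(f"Net of the two: Francoeur ahead by about {net_gap:.2f} wins per season")
```

On these figures the "about 5 wins" is the defensive gap taken alone; netting out the offensive gap leaves Francoeur roughly a win and a half ahead.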

I have noticed this before: the defensive evaluations often seem exaggerated. I don't know if that is due to noise causing some valuations to be way too high or low, or if there is some systematic thing giving too much value to the defensive metrics.
1:48 PM Mar 29th

jollydodger
This reminds me of the discussion of the 2008 Rays improvement in the Reader Posts section. We still have no precision when it comes to offense vs. defense (defense being pitching + fielding).

And yeah, maybe taking into account what a player does that the rest of the team doesn't do should come into play. idk
1:02 PM Mar 29th

OldBackstop
What it is hard to remember, since it was waaay back last year, is that the Braves didn't just need home runs, they needed them in a way that was leading journalists to batsjtt articles like this:

45 MLB players have more home runs than the Atlanta Braves

ftw.usatoday.com/2016/04/atlanta-braves-not-good-oof-homers-sad-mlb

Great work as always, Dave.
12:44 PM Mar 29th

MarisFan61
I agree also that in general Win Shares does better in most situations (and I also find it more appealing in its theory and approach).

BUT, we ought to mention, in this case, while Kemp shows better in his Win Shares than on "WAR," it's not much better. It's still pretty mediocre.

His 5 Atlanta Win Shares prorated to a full season, come to just under 15. That's 'okay' at best. It feels like this also somewhat underrates his value to the team.
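
For anyone who wants to check the prorating, here is a minimal sketch. It assumes the 56 Atlanta games cited in another comment in this thread and a 162-game season; `prorate` is an illustrative name.

```python
SEASON_GAMES = 162


def prorate(total: float, games_played: int, season_games: int = SEASON_GAMES) -> float:
    """Scale a partial-season total to a full-season pace."""
    return total * season_games / games_played


atlanta_win_shares = 5   # Kemp's Win Shares with Atlanta, per this thread
atlanta_games = 56       # assumption: games played with Atlanta, per another comment

full_season_pace = prorate(atlanta_win_shares, atlanta_games)
print(f"Full-season pace: {full_season_pace:.1f} Win Shares")
```

That works out to roughly 14.5, consistent with "just under 15."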
12:21 PM Mar 29th

smbakeresq
WAR in a vacuum is incorrect to me in many respects. Availability is a key component of a player's career, being injury free and ready to play isn't a skill but is important to a team.

Kemp clearly has more value than a platoon of replacement players, having a player that can play 150 games makes your bench bigger. That's not accounted for in WAR, but it's a contribution.

Kemp is also a known quantity, replacement level players are available but not necessarily known.

I have always thought some of the WAR defensive credits are overstated, because of defense you get some players listed as the "best" players in the league because they led in WAR, like Nick Markakis and Ben Zobrist. Those players are fine players with great seasons, but I think WAR overstates their value somewhat.

It's a valuable tool in conjunction with many other things, but that's all.
11:27 AM Mar 29th

ventboys
Sooner or later position adjustments will go away, and these things won't happen. I think - someone who understands the formula better than me can correct me if I'm wrong - what happened with Kemp is the double-whammy nature of defensive metrics AND position adjustments.

Kemp's offensive "demand" was higher because he played left field, and his defensive "penalty" was higher because he was playing left field badly. Usually they sort-of balance each other out - the various factors that combine into "defensive position" and "defensive production" - but in this case they all pointed in the same direction, creating one of those logical fallacies that happen at the edges of any single-number metric.

Brooks Robinson's defensive stats didn't fit in Pete Palmer's formulas either, or Johnny Bench, or a few others ... I think you pretty much gave the answer I would normally give, Dave - any formula that says 35 homers, 36 walks and a .268 batting average HAS to be worth something, but I might add a small addition:

A formula can't both penalize a hitter for playing a right-end defensive position AND punish him for playing it poorly as much as it would if he was playing a position on the left.

As always, beautifully written, Dave.
11:25 AM Mar 29th

evanecurb
Excellent article, Dave. I agree that Win Shares is a better measure than WAR, but I would caution that I don't think we're anywhere close to having a magic stat that can measure total value to the nth degree. We need to look at all of the things that Dave and the other posters below have pointed out, including team context and the fact that defensive metrics are still not as reliable as offensive metrics.
11:02 AM Mar 29th

MarisFan61
(It's in Win Shares.)
11:01 AM Mar 29th

astros34
Don't forget, Kemp hit into 9 DP in those 241 PA and 56 games with Atlanta, which extrapolates to about 26 in 162 games. That has a big negative effect on WAR. I assume that's also factored into the calculation of Win Shares? It's been so long since I've read the book.
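
The extrapolation is simple enough to verify directly. This sketch uses only the three numbers in the comment and assumes a 162-game season; the per-PA rate is added for context and is not something the comment states.

```python
double_plays = 9
plate_appearances = 241
games = 56
SEASON_GAMES = 162

# Pace over a full season, scaling by games played.
dp_per_season = double_plays * SEASON_GAMES / games

# Share of plate appearances that ended in a double play.
dp_rate_per_pa = double_plays / plate_appearances

print(f"Full-season pace: {dp_per_season:.1f} double plays")
print(f"Double plays per plate appearance: {dp_rate_per_pa:.1%}")
```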
10:51 AM Mar 29th

MarisFan61
Dave: Are you sure it is a team's Pythagorean projection, not its actual record, that Win Shares is hewed to? (If so, that's a change from the original basis.)
10:41 AM Mar 29th

DaveFleming
To comment on Gary's post: the big edge that Win Shares has is that it is correlated to a team's win-loss record (actually, their pythag W-L), instead of being correlated to the production at a position on the diamond.

This is a useful 'check' because it has to hew to actual performance, instead of layers of math. A team like Atlanta (67 Pythag Wins) gets 201 Win Shares to distribute. Matt Kemp is credited with five of those, which approximates to 1.66 'wins' for the Braves. That makes sense to me...that seems like a reasonable measure of his influence. It's a lot more reasonable, at least in my vantage, than -0.0.
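
As a rough illustration of that bookkeeping, assuming three Win Shares per win (the ratio implied by 67 wins and 201 Win Shares) and using only the figures above:

```python
# Assumption: three Win Shares per team win, as implied by 67 wins -> 201 Win Shares.
WIN_SHARES_PER_WIN = 3

team_wins = 67          # Atlanta win total used in the comment
player_win_shares = 5   # Kemp's Win Shares with Atlanta

team_pool = team_wins * WIN_SHARES_PER_WIN              # Win Shares to distribute
player_wins = player_win_shares / WIN_SHARES_PER_WIN    # player's credit in wins
player_share = player_win_shares / team_pool            # fraction of the team pool

print(f"Team pool: {team_pool} Win Shares")
print(f"Player credit: {player_wins:.2f} wins ({player_share:.1%} of the pool)")
```

Because every Win Share comes out of a fixed team pool, a player's total can never go below zero, which is part of why the two systems disagree on a season like this one.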

I do think that the defensive stuff really skews the numbers. We're getting really AMAZING breakthroughs in the stuff we can count defensively, but this is, at least in my opinion, causing us to overvalue that stuff. Like pitch-framing....right now everyone is talking about pitch framing, which is really terrific and fascinating stuff, and it probably matters more than we've ever realized. But how MUCH does it matter? What are the parameters of it? How many more runs (or outs, or whatever you want to count) can a great framing catcher save over an average one? Or a bad one? Where are the edges?

I think we're trying to find those edges, and I think we'll come close to finding them, but we should tread carefully, and we should question conclusions that seem like they can't pass objective analysis.

2:08 AM Mar 29th

MarisFan61
Sorry, Gary!! Don't mean to trample on your post!

Everybody, look below that post of mine to see Gary's -- which is actually relevant to the article. :-)
1:36 AM Mar 29th

MarisFan61
BTW, let's check out that last thing I said a little bit:

The average player, in his team's "games won," is a Hall of Fame hitter.

Here were those stats:

Average slash line, all major leaguers, 2016, in their teams' "games won"
.292/.361/.494

I won't even make it easy for that assertion by just using a borderline Hall of Famer as the comparison. Let's use an average Hall of Fame hitter.

What's the slash line for an average Hall of Famer?

Of course let's ignore pitchers, as well as people who didn't mainly get in as players, like Leo Durocher and Connie Mack. But for now I won't toss out players who got in mostly on fielding, like Maranville.

Using the list of Hall of Famers as shown on baseball-reference.com:
www.baseball-reference.com/awards/hof_batting.shtml
......and putting them in order according to OPS -- which makes it easy to eliminate the pitchers and managers etc., although you can't just do it mechanically and take everyone who's 'above the line,' because then you still have guys like Ned Hanlon, Wilbert Robinson, Bucky Harris, Miller Huggins, and Jocko Conlan, not to mention Al Spalding who actually had a half-decent slash line, as did Billy Southworth...... doing my best to make sense -- that leaves us with about 159 players. (I threw out 11 guys, including four who actually ranked surprisingly well on OPS among these Hall of Fame position players: Casey Stengel, Red Ruffing, Whitey Herzog [!] and Dick Williams [!!]. I didn't throw out John McGraw and Joe Torre -- arbitrary choices.)

Who are the guys in the middle, i.e. the median offensive Hall of Fame position players?

Well actually, now that I've gone through this whole exercise, I look in the middle of the list and I see that baseball-ref has done the work for us:

"Average Batting HOFer..... .302/.376/.463" :-)

But as long as I've done that work, here's the guys I find in the middle:

Hugh Duffy .326/.386/.451
Kirby Puckett .318/.360/.477
Eddie Murray .287/.359/.476
Roberto Clemente .317/.359/.476
Elmer Flick .313/.389/.445

These numbers show about the same as the "average batting HOFer" figures given by that site (as would be expected for the median of the list), but these guys are below-average Hall of Fame hitters for the right end of the defensive spectrum, i.e. for players who are basically hitters.

Let's see what we get if we look at the guys who are average for Hall of Famers who are basically hitters.....

It's fuzzy to try to see that. Using the listing that you get when you sort by OPS, the first 102 are essentially hitters, then you start running into guys like Buck Ewing and Joe Sewell, but it's still mainly Hall of Fame-ish hitters pretty solidly down to about #130.
So, let's look at the guys around #65:

Jim Rice: .298/.352/.502
Eddie Collins: .333/.424/.429
Joe Kelley: .317/.402/.451
Billy Williams: .290/.361/.492
Orlando Cepeda: .297/.350/.499

The average major league player last year, in their teams' "games won," pretty much did match those numbers.
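
As a rough check on that comparison, here is a minimal Python sketch that turns the slash lines quoted in this comment into OPS (on-base plus slugging) and puts them side by side. The function and variable names are made up for illustration, and OPS is only one way to compare the lines; it ignores era and park.

```python
def ops(slash_line):
    """Return OPS (on-base plus slugging) from an 'AVG/OBP/SLG' string."""
    _avg, obp, slg = (float(part) for part in slash_line.split("/"))
    return obp + slg


def mean_ops(slash_lines):
    """Average OPS over a collection of slash-line strings."""
    values = [ops(line) for line in slash_lines]
    return sum(values) / len(values)


# Slash lines exactly as quoted in the comment above.
GAMES_WON_2016 = ".292/.361/.494"   # all major leaguers, team wins
HOF_AVERAGE = ".302/.376/.463"      # site's "Average Batting HOFer" line
MIDDLE_HITTERS = {
    "Jim Rice": ".298/.352/.502",
    "Eddie Collins": ".333/.424/.429",
    "Joe Kelley": ".317/.402/.451",
    "Billy Williams": ".290/.361/.492",
    "Orlando Cepeda": ".297/.350/.499",
}

if __name__ == "__main__":
    print(f"2016 games won:     {ops(GAMES_WON_2016):.3f}")
    print(f"Average HOF batter: {ops(HOF_AVERAGE):.3f}")
    for name, line in MIDDLE_HITTERS.items():
        print(f"{name:<19} {ops(line):.3f}")
    print(f"Mean of the five:   {mean_ops(MIDDLE_HITTERS.values()):.3f}")
```

On these figures the "games won" line comes to an .855 OPS, the five hitters around #65 sit between .849 and .854, and the site's average Hall of Fame line comes to .839.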
1:35 AM Mar 29th

garywmaloney
Matt Kemp had 16 Win Shares in 2016 -- 11 with SD, 5 with ATL.

For comparison, Chris Davis of BAL had 17 WS. Nick Markakis, also ATL, had 17. As did Trea Turner (WAS), in 2/3 of a season.

Lonnie Chisenhall and Mike Napoli of CLE had only 14 each.

I am avoiding comparisons with pitchers . . . but if you wanted to go down that road, Jeurys Familia (NYM) also had 16. As did Jake Arrieta (CHC).

But bWAR for Davis was 3.0; for Turner, 3.5; for Markakis, 1.7. Chisenhall, 1.4. Napoli, 1.0.

Kemp's offensive bWAR is +1.8, but his defensive bWAR is -2.6.

Isn't that really it, Dave -- WAR (as did the old original Linear Weights) often has the fielding component completely dominating and outweighing the offensive side? Kemp is docked massively for defense, and the 35 HR / 108 RBI / near-.500 SA aren't valued enough?

IMO, Win Shares remains superior, because it has better balance between offense and defense.
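
To put the two systems on one scale, here is a small Python sketch using only the figures quoted above. It relies on the Win Shares convention of three Win Shares per team win. The names are hypothetical, and the gap it prints is a simplification: Win Shares are counted from a near-zero baseline while WAR is counted from replacement level, so some positive gap is expected for every regular.

```python
WIN_SHARES_PER_WIN = 3  # Win Shares convention: three shares per team win

# (Win Shares, bWAR) as quoted in the comment above.
PLAYERS = {
    "Chris Davis": (17, 3.0),
    "Nick Markakis": (17, 1.7),
    "Trea Turner": (17, 3.5),
    "Lonnie Chisenhall": (14, 1.4),
    "Mike Napoli": (14, 1.0),
}


def win_shares_to_wins(win_shares):
    """Convert Win Shares to wins credited (not wins above replacement)."""
    return win_shares / WIN_SHARES_PER_WIN


def gap(win_shares, bwar):
    """Wins credited by Win Shares minus bWAR; baselines differ, so this is rough."""
    return win_shares_to_wins(win_shares) - bwar


if __name__ == "__main__":
    for name, (ws, bwar) in PLAYERS.items():
        print(f"{name:<18} WS {ws:>2}  wins {win_shares_to_wins(ws):.2f}  "
              f"bWAR {bwar:.1f}  gap {gap(ws, bwar):+.2f}")
    # Kemp's 16 Win Shares on the same scale; his total bWAR is not quoted here.
    print(f"Matt Kemp          WS 16  wins {win_shares_to_wins(16):.2f}")
```

Players with the same Win Shares but a bigger gap (Markakis versus Turner, for example) are the ones the two systems disagree about most.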
1:31 AM Mar 29th

MarisFan61
P.S. Since that split of stats in Games Won/Lost is interesting, and (I think) since the "normal" is not at all well known (I only have a rough idea; I think the 'normal' is just a little less extreme than the one shown for Kemp) .....here are a couple of other ones I looked up.

Babe Ruth, 1927
Games Won: .398/.528/.887
Games Lost: .252/.373/.497

Mickey Mantle, 1956
Games Won: .390/.506/.769
Games Lost: .290/.389/.600 (i.e. probably an unusually narrow split)

And also (now I'm about to really learn something)....

OVERALL MAJOR LEAGUE FIGURES, 2016
Games Won: .292/.361/.494
Games Lost: .217/.280/.338

WOW! Never would have thought it was quite like that.
The average player, in his team's "games won," is a Hall of Fame hitter.
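
For anyone who wants to size up these splits, here is a minimal Python sketch that measures each one two ways: the OPS gap between games won and games lost, and the ratio of the two. The helper names are invented for illustration, and OPS is just a convenient shorthand for the full slash line.

```python
def ops(slash_line):
    """Return OPS (on-base plus slugging) from an 'AVG/OBP/SLG' string."""
    _avg, obp, slg = (float(part) for part in slash_line.split("/"))
    return obp + slg


def split_size(won_line, lost_line):
    """Return (OPS gap, OPS ratio) between games won and games lost."""
    won, lost = ops(won_line), ops(lost_line)
    return won - lost, won / lost


# (games won, games lost) slash lines as quoted in the comment above.
SPLITS = {
    "Babe Ruth, 1927": (".398/.528/.887", ".252/.373/.497"),
    "Mickey Mantle, 1956": (".390/.506/.769", ".290/.389/.600"),
    "Major leagues, 2016": (".292/.361/.494", ".217/.280/.338"),
}

if __name__ == "__main__":
    for label, (won, lost) in SPLITS.items():
        gap, ratio = split_size(won, lost)
        print(f"{label:<20} won {ops(won):.3f}  lost {ops(lost):.3f}  "
              f"gap {gap:.3f}  ratio {ratio:.2f}")
```

The two measures don't rank the splits the same way. In raw OPS points the league-wide gap is the smallest of the three, but as a ratio Mantle's split is the narrowest, which fits the guess that his was unusually narrow.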
11:39 PM Mar 28th

DaveFleming
That split (between a player's triple-slash in wins and losses) is common across all players, and it generally doesn't matter if the hitter is the lone guy in the offense (a la Freeman), or part of a juggernaut offense. David Ortiz was 1.225 OPS in wins, .737 in losses, and the Red Sox had plenty of hitters last year.

Partially, it suggests that offense is interconnected more than we tend to think it is. For all our talk that baseball is one man standing by himself, facing off against a pitcher, it's possible that there is a greater team influence than we currently credit.
11:36 PM Mar 28th

DaveFleming
And for what it's worth:

Freeman before Kemp: .881 OPS
Freeman with Kemp: 1.146 OPS

Protection might not matter statistically, but I think there's sometimes a mental element that comes into play. Freeman sure seemed to like hitting in front of Kemp a lot more than Nick Markakis.
11:18 PM Mar 28th

MarisFan61
(typo: Didn't mean "player's comp," but split)
11:17 PM Mar 28th

MarisFan61
Don: Not sure what exactly is the point you're trying to make, including because I don't think such a split is unusual.

I figured I'd check that split on the first other player that came to mind: Yoenis Cespedes. As luck would have it, his split is almost identical to Kemp's. (I'll make sure to check and post at least one other player's comp.)

Cespedes, 2016
Games won: .333/.408/.634
Games lost: .205/.275/.385

Let's see, who else.....trying not to cherry-pick but to be relevant....

Kemp's teammate.....

Freddie Freeman, 2016
Games won: .372/.479/.688
Games lost: .251/.338/.482
11:16 PM Mar 28th

doncoffin
Not intending to be contrarian, but I wondered how well your thesis would hold up if we looked at which games Kemp delivered the most in. So here's his slash line in the games he played that the Braves won -- and lost:

Games won: .333/.381/.650/1.030
Games lost: .216/.280/.327/.607

Games with SD: .262/.285/.489/.774

So it does appear that he not only hit better in Atlanta than in SD, but he hit a hell of a lot better in games the Braves won.
10:57 PM Mar 28th

MarisFan61
This is exactly what one of my quibbles has been about the large metrics.

The homogenization of everything into a single datum imagines that the totality gives an accurate picture. I do think that for the most part, the good large metrics come pretty close, but they don't account for things like what you're saying.

Baseball has evolved in such a way that a lineup typically contains various roles: Guys who get on base, guys who put their bat on the ball (I know that sabermetrics mostly discounts that), guys with speed, guys who hit with power. The traditional thinking is that it's best to have some of all those things. The large metrics lump all the elements together in terms of their individual calculable values, as though "value is value," period.

I see it as far more complicated. Sometimes a player who seems from the large metrics to be without value can be (apparently) very important for a given team, as you are noting. But, there's got to be a limit to how mediocre he can be on an overall basis and still be valuable. Where is that limit? How poor would Kemp's on-base average or defense have needed to be in order to keep him from being of value? I'm not proposing to suggest an answer -- although, what the heck, why not: I'd say that if his on-base average with the Braves had been .290 rather than .336, or if he'd struck out in 40% of his at-bats with them rather than 26%, he would have been a negative, even with that many HR's. But the main thing I'm saying is just that it's complicated, and it can't easily be assessed in a single metric.

This reminds me, sort of in reverse, of a thing in one of Bill's old annuals (probably late '80's) about the Cleveland lineup: They had like 7 guys who all did the same thing -- a little of everything, and all mediocre-ly. I think that in such a lineup, each of those players had less 'value' than they would have had in better-constructed lineups. Maybe it also means that such players in general are less valuable than the sum of their elements; I don't know. Bill also said (elsewhere, talking about the fallacy of "linear weights") that baseball offense isn't linear, it's geometric. I would suggest that a big part of what enables the geometry is the effective mixture of roles.

I don't mean either that 'specialization' necessarily gives more value to an offensive player. There's also great value in having lots of different things in a single package, like Mike Trout or Willie Mays or Mickey Mantle.

It's just complicated, and it's always good to wonder about what kinds of things aren't encompassed in metrics.

Thanks for this article.
10:49 PM Mar 28th

COMMENTS (67 Comments, most recent shown first)
