By Bill James

August 19, 2018

John Dewan introduced six candidates for the MVP race in an article posted a couple of days ago. I just got back from vacation and realized that I have NO idea who should win the NL MVP race, so I thought I would spend a few minutes looking at John’s six candidates.

1) The Carpenter is hitting just .233 with runners in scoring position and has homered almost exactly twice as often with the bases empty as with men on base. I think those are legitimate issues. 27 of his 33 home runs are with the bases empty, a split that is influenced but not created by at bats. On the other hand Carpenter is hitting .250 with 11 homers at home, .288 with 22 homers on the road, OPS .828 and 1.090.

The issue in an MVP race is not how Carpenter as an individual is impacted by the park; it is the value of what the player has done. Suppose that there was a neutral offensive park, but that park was very favorable to a left-handed hitter, but very difficult for a right-handed hitter. Suppose that in that park there were two MVP candidates, a right-handed hitter who was badly hurt by the park, and a left-handed hitter who was helped by the park. Overall their runs created are the same. Is the park relevant to the MVP debate?

It is not, or not really. The issue is the value of what the player did, not what he would have done or might have done in some other context. If two players create 100 runs each in the same offensive context, the same number of wins for their team are likely to result, so the value is the same, since the "value" is the number of wins that result.

One may argue that in another park, the right-handed hitter would have been more valuable than the left-handed hitter, and that may be true. It’s not relevant. What a player might have done, could have done, would have done in some other set of circumstances is not relevant and cannot be relevant, because (a) that kind of analysis leads to endless speculation on many different issues, and (b) we don’t actually know. We don’t KNOW what the player would have done in some other park. We’re trying to stick, as much as possible, to the facts, rather than speculation.

Of course, if you want to say, "these guys are equal in value in these circumstances, but the one guy is really better on an underlying level than the other guy, so I’m going to vote for the guy who was hurt by the park," sure. It’s not wildly irrational to think in those terms as a tie-breaker, small-change type of thing. It’s outside the lines of strictly rational value, but then, many things which are true are outside the lines of strictly rational thought, until rational thought catches up with them.

But as long as we are staying within the lines of those issues which have been thoroughly thought through, the issue is not how Carpenter as an individual is affected by the park, but the relationship between his runs created and the number of runs necessary to win a game in his environment. So what is the Park Factor for St. Louis this year?

The Cardinals this year (through Saturday) have scored and allowed 506 runs in 61 games at home, 570 runs in 63 games on the road. That calculates to a raw park factor of .922, adjusted park effect of .964. Baseball Reference lists the St. Louis Park factor this year at .98, but I don’t know how they’re getting that number. It probably has something to do with their using run elements (singles, doubles, triples, etc.) rather than run totals, I don’t know. If that’s what they are doing, I would argue that it’s not what is most relevant to the MVP race. A park factor based on run elements would be more accurate than a park factor based on runs in projecting what happens in the future, but an MVP contest is not about what happens in the future; it is about the value of what happened in the past. What happened in the past is real runs, so you use real runs to calculate the park factors in an MVP debate, I would argue.

So let’s say the Park Effect is .964. The National League average is 4.41 runs per game, and the Cardinals have played 124 games, so that’s a "context" of 527 runs (124 times 4.41 times .964). Actually, since there are two teams playing in each game, it’s a context of 1054 runs, so Matt Carpenter has created 103 runs in a context of 1,054 runs, or 9.8% of the runs in the context.
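The context arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation as described; the function names `context_runs` and `share_of_context` are made up for this sketch, and the inputs are the figures quoted in the paragraph.

```python
LEAGUE_RUNS_PER_GAME = 4.41  # NL average runs per team per game, as quoted above


def context_runs(games, league_rpg, park_effect):
    """Park-adjusted runs expected from BOTH teams across a club's games."""
    return games * league_rpg * park_effect * 2


def share_of_context(runs_created, context):
    """Fraction of the run context accounted for by one player."""
    return runs_created / context


# Cardinals: 124 games, adjusted park effect of .964 (figures from the text)
cardinals_context = context_runs(124, LEAGUE_RUNS_PER_GAME, 0.964)
carpenter_share = share_of_context(103, cardinals_context)

print(round(cardinals_context))   # 1054
print(f"{carpenter_share:.1%}")   # 9.8%
```

The same two functions apply to any of the other hitters by swapping in that team's games played, park effect and the player's runs created.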

2) Trevor "What’s The" Story. The Rockies this year have a raw Park Factor of 1.20, adjusted park effect of 1.093, so that’s 4.82 runs a game. They’ve played 123 games, so that’s a context of 1,186 runs (123 * 4.41 * 1.093 * 2). Story has created 90 runs, so that’s 7.6% of runs in context.

You’ve got 18 hitters in a game, so the average hitter has created 5.6% of the runs in context, or 1 out of 18. Carpenter is at 9.8%, Story is at 7.6%, so Story isn’t even remotely comparable to Carpenter on that level.

Story is hitting .277 with runners in scoring position, so that’s not anything notable. He has, however, hit 15 bombs with men on base as opposed to 10 with the bases empty, so you’d have to give him a couple of points for that.

3) Freddie Freeman leads the National League in hitting at the moment at .320. He has created about 104 runs. The park is almost perfectly neutral, adjusted park effect of 1.002, so in 122 games that’s a context of 1079 runs (122 * 4.41 * 1.002 * 2). Freeman has created 104 runs, so that’s 9.6% of context runs. About the same as Carpenter.

4) Nolan Arenado. Colorado context is 1,186 runs, as we established earlier. His walk rate is better this year than it has been, so his on base percentage is up to .391, and he has 14 homers, 41 RBI on the road, although his average is still almost 100 points higher at home than on the road, as it often has been in the past. His OPS+ this year is 143, which is a career high by about ten points.

Anyway, RA Nado has about 99 Runs Created, so that’s 8.3% of context. It’s a good number but it’s not really an MVP number.

5) Javier Baez leads the National League in RBI with 89. We have him figured with about 78 Runs Created. The Cubs this year have allowed 282 runs at home, only 215 on the road, so they have a Park Factor of 1.168, adjusted of 1.078. That creates a context of 1,160 runs for the Cubs (122 * 4.41 * 1.078 * 2). 78 Runs Created out of 1,160 is 6.7% which is. . .well, if 8.3% isn’t really an MVP type number, then clearly 6.7% isn’t.

Javier has a strikeout/walk ratio of 116 to 18, not real good. I know he has some special skills in terms of defense, but I think to talk about him as the MVP, you’re kind of falling into old style thinking, ignoring the Park Effects and the on base percentage, which is .319.

6) Jacob de Grom Grom. The Mets this year have a raw Park Factor of .699, adjusted Park Effect of .860. Context runs for a Mets player would be 122 (games) times 4.41 times .860 times 2, or 926 runs.

The hitters, we have been comparing to zero. The equivalent number for a pitcher would be 2 times the norm. . . in other words, suppose that the context for a hitter is 100. If the hitter is at 120 he is +120 from zero. If a pitcher is at 80 he has equal impact, and he is +120 from 200. The zero baseline for runs NOT allowed is twice the norm of runs allowed.

De Grom is 104 runs better than a pitcher allowing twice the league norm in the Mets’ run context. (The norm for the Mets, park-adjusted, is 3.793 runs per 9 innings. Twice that would be 7.586. A pitcher allowing 7.586 runs per nine innings would allow 142 runs in 168 innings. De Grom has allowed 38 in 168 innings, so he is 104 runs better than a zero performance level.)

104 runs in a 926-run context is 11.2%, or a higher number than Carpenter or Freeman. The wrinkle is that of the runs prevented, some are prevented by the fielders. How many?
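The pitcher-side calculation can be sketched the same way. The function name `runs_vs_zero_baseline` is invented for this sketch; the zero point of twice the park-adjusted norm and all of the inputs come from the paragraphs above.

```python
LEAGUE_RUNS_PER_GAME = 4.41
METS_PARK_EFFECT = 0.860


def runs_vs_zero_baseline(runs_allowed, innings, norm_per_nine):
    """Runs 'not allowed' relative to a pitcher giving up twice the norm."""
    zero_level_runs = 2 * norm_per_nine * innings / 9
    return zero_level_runs - runs_allowed


# Park-adjusted norm for the Mets, about 3.793 runs per nine innings
mets_norm = LEAGUE_RUNS_PER_GAME * METS_PARK_EFFECT

# deGrom: 38 runs allowed in 168 innings (figures from the text)
degrom_runs = runs_vs_zero_baseline(38, 168, mets_norm)

# Mets run context: 122 games, both teams
mets_context = 122 * LEAGUE_RUNS_PER_GAME * METS_PARK_EFFECT * 2
degrom_share = degrom_runs / mets_context

print(round(degrom_runs))       # 104
print(f"{degrom_share:.1%}")    # 11.2%
```

Note that this figure still includes whatever share of the run prevention belongs to the fielders; the sketch makes no attempt to separate that out.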

We don’t know. The Mets, I would guess, are not a good fielding team compared to the average, but the average is not zero. Let’s say that an average team allows 700 runs in a season; then it is axiomatically true that an average team allows 700 runs less than twice the average. Of those 700 runs, not ALL are prevented by the pitchers; some are prevented by the pitchers, some by the fielders. An average team is + or – zero compared to the average, but not + or – zero compared to zero. We have to have some way to remove the fielders from the 104 runs "saved" by de Grom, which we don’t have, since we don’t have a zero-point calculation system for fielders.

In this contest, the Rockies players have an advantage over the other players, which is that the Rockies have won 67 games with individual stats which would ordinarily produce only 60 wins. They are 67-56, should be 60-63 based on their runs scored and allowed. That means that each run they are producing is worth 11% more in terms of wins than it "ought" to be or would be expected to be.

So that boosts Story’s number, 7.6% of context, up to about 8.5%, and Arenado, at 8.3%, up to about 9.2%, putting them in better shape as MVP candidates. The Mets are about 5% short, so that would cut de Grom back by about 5%, but we don’t know 5% of what anyway, so we don’t know what to apply the 5% to.
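As a rough sketch of that adjustment, assuming the boost is simply the ratio of actual wins to expected wins (the helper name `win_multiplier` is hypothetical, and the expected-win total of 60 is taken from the text rather than recomputed from runs):

```python
def win_multiplier(actual_wins, expected_wins):
    """How much more (or less) each run has been worth in wins than expected."""
    return actual_wins / expected_wins


rockies_boost = win_multiplier(67, 60)      # a bit under 1.12

story_adjusted = 7.6 * rockies_boost        # percent of context, win-adjusted
arenado_adjusted = 8.3 * rockies_boost

print(f"{rockies_boost:.3f}")
print(f"{story_adjusted:.2f} {arenado_adjusted:.2f}")
```

Using the unrounded ratio lands within about a tenth of a point of the rounded figures quoted above; the small differences come from whether the boost is rounded to 11% before it is applied.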

Baseball-Reference WAR lists the NL leaders as Scherzer, Nola and deGrom, all in the range of 7.8 to 8.4, no position players over 5.3, and lists Lorenzo Cain first among position players at 5.3. I don’t know that that’s a credible ranking.

Just as a general comment. . . I don’t think John’s "Total Runs" system is intended to be an ultra-sophisticated method; I think it is intended to be more of a shortcut. The system mixes up zero-based calculations (Runs Created) with average-based calculations (Runs Saved), and also does not "normalize" batting stats by adjusting for the offensive context, which obviously we know needs to be done, and John would do that if he was actually making an argument that Matt Carpenter was the MVP, rather than just using this method to focus on the candidates. But mixing up zero-based values with average-based values (a) is theoretically improper, and (b) causes real, real, real serious problems in fact.

I suspect that the reason that Baseball Reference shows the top pitchers as ridiculously far ahead of the position players is that they ALSO are mixing up zero-based calculations with average-based calculations. I don’t know that; I don’t really understand the system, but I suspect that what they are doing is calculating the runs saved by pitchers against a zero point (twice the league norm), and then adjusting that for the AVERAGE defensive performance, thus implicitly assuming that defense has zero value in the average case. That was the problem with the Pete Palmer Linear Weights system; it implicitly confused performance averages with zero-based numbers. I would suppose that WAR, which is a descendant of Linear Weights, still has that problem. But I don’t actually understand their system well enough to say. I suspect that the people who created the system probably don’t understand it that well, either. It’s an inherently confusing process, combining different measurements into one, and almost all systems that attempt to do so wind up accidentally adding apples to grapefruit.


## COMMENTS (17 Comments, most recent shown first)

Max_Fischer

Interesting to come back and read these articles well after the fact. The eventual 2018 NL MVP winner was someone not even mentioned: Christian Yelich, after a scorching hot September.

11:20 AM May 11th

klamb819

Citing the typical lack of pitchers on WAR leaderboards does not refute Bill's conclusion that WAR's calculation overstates the value of the best pitchers. Considering WAR allocates 60% of wins to position players and 40% to pitchers (600 and 400 for a full season), then with 1.5 times as many wins as pitchers, position players should dominate the WAR leaderboards.

The problems with average-based values are many. One is that there is no way to know whether the average is improving or declining year-to-year. Another is that when you add up all the fielding value on every team in MLB, WAR and DRS tell us that MLB's total value for fielding is none at all. Adding that 0 to values based on zero or replacement level generates an incomprehensible result.

Let me try to explain the problem in a different way. A person has a total savings of $10,000, with $5,000 in the bank and $5,000 in a 401(k). But all he is permitted to know is that he has $5,000 in the bank and $1,017 above average in his 401(k). How does that $6,017 sum help him know that his total savings is $10,000?

7:58 PM Aug 22nd

Riceman1974

Tango is correct that the Pthag numbers are wacked this year. This seems to be happening more often. Is this due to the HR or K offense strategy? Seems to me that HRs are hit in bunches, then comes a drought. Then another bunch of HRs, followed by another drought, ad infinitum. The result is a string of 10-5 victories followed by 2-1 defeats. If the HR is your main weapon, it automatically creates a feast or famine environment, which could explain the Pthag madness we are witnessing. Just a guess.

6:29 AM Aug 22nd

JohnPontoon

It's my presumption that if all current factors remain steady through the season's end, we should default to my contention, commented in John Dewan's NL-MVP article, that Matt Carpenter should be MVP. Also, then, I should be declared a prophet, or infallible at the very least.

10:30 PM Aug 21st

MarisFan61

Guy: You seem to be absolutely assuming things to be totally "luck" and "good fortune" that arguably aren't. A lot of things in life that involve some degree of chance (which I'd say is a better word for it), even a high degree of it, may also involve some degree of ability, or other personal qualities, or both.

10:29 AM Aug 21st

Guy123

I'm broadly sympathetic to Bill's preference for focusing on what happened over hypotheticals. But we still have to decide which aspects of what happened *matter* for our analysis. Let's say teams scored 8% more runs in Busch III without increasing their OBP or SLG, to use Bill's example. How should that change our assessment of Carpenter? Bill suggests we should value his runs created less (about 4%), by applying a reality-based park factor.

But is that reality? Do we have any reason to think Busch3 caused the more fortuitous timing of offensive events that happened there? I can't see any reason to believe that. If StL and their opponents scored 8% more runs in Fri/Sat/Sun games than the rest of the week, would we discount Carpenter's production because of this "advantage?" Of course not. I don't see any reason to treat a home/away difference differently, absent some evidence that the park actually made a difference.

Similarly, I can see the case for giving players credit for their team doing a good (or bad) job of timing their offensive events, such that they scored more runs than they "should" have. But *how* we allocate this good fortune to individual players still needs thought. Win Shares distributes it proportional to a player's context-neutral production. But is there any evidence that luck accrues to players in proportion to their productivity? That is, if we compare WPA to wOBA, do we find that over-performers (i.e. most lucky players) are also the best hitters? I don't believe there is any evidence this is true. And if so, value from good timing should be allocated based on playing time, not productivity.

8:42 AM Aug 21st

MarisFan61

.....I thought it would be useful to note that there seems to be a split in what we're talking about, and probably in how different people may be seeing these articles.

Is it who we think WILL BE the MVP (or maybe better to put it, who would be if we assume that the candidates would look at the end of the season about like they look now), or who we think SHOULD BE, according to the criteria or measures that we think to be most relevant?

Need I say, those are separate things.

I've been looking at it in terms of who would or will be; no regard at all to who I think "should" be. I'm not going on any standards or criteria of my own, or at least trying not to; trying to look at it just in terms of what I gather to be voting history and voting tendencies.

It looks to me like John and Bill, to the extent that they were thinking of it in terms of one or the other, were doing the other.

10:44 PM Aug 20th

MarisFan61

Perhaps the most extreme manifestation of the opposite view is how "WAR," from what I understand, doesn't even recognize actual numbers of wins (!), but just theoretical numbers of wins based on runs, to the extent the system 'thinks' of a team's actual wins at all.

(If this is incorrect, or no longer correct, please let's hear that!)

12:02 PM Aug 20th

bjames

Tango, Guy; I appreciate your comments and your information.

On the issue of real vs. expected runs, I have come to clarity in my own mind in the last year, although I don't know whether I will be able to make others see it the way that I do. I now believe absolutely that in discussing past accomplishments, only what really happened is relevant, and what could have been expected to happen is absolutely and universally irrelevant.

This issue permeates every corner of sabermetric discussions, but take, for example, park effects. There could be (and would be, somewhere) a park in which the batting average, on base percentage and home run rates for the season, the slugging percentage. . .there could be a park in which these numbers were not affected by the park, but nonetheless the number of runs scored in the park was 8% higher than on the road, simply because the run elements at home were combined in more efficient patterns than they were on the road. Which then do you use: what the park run effect SHOULD have been, or what it actually WAS?

It is clear to me now, was not a year ago, that you should use what actually happened in every case in discussing the past.

Thanks.

11:57 AM Aug 20th

MarisFan61

Guy: I noted also that baseball-ref doesn't generally show pitchers at the top on "WAR" and that I think it's been rare, so, as you say, it's a thing specifically of this year.

But, anything about why the pitchers show so much more favorably this year on "WAR" than on Win Shares?

(As I noted, all 3 of the mentioned pitchers plus also Freeland are ahead of any position players on baseball-ref (and a couple are far ahead) while on Win Shares, all those pitchers are behind all the named position players.)

10:50 AM Aug 20th

tangotiger

Just to piggy back on Guy, and considering I was one of those that spearheaded the versions of WAR you see on BR and Fangraphs: all the calculations are based on "average", with the final step being to convert to "above replacement".

Sidenote: This last step is really not that relevant, since I could have (or SHOULD HAVE) converted it into a W-L record. Once you have it as a W-L record, you can easily convert to "above replacement". It's not so clean the other way.

The more interesting thing is how to convert runs to wins, which is the main issue Bill had last year. This year, it's going to be an issue again, because we have an ENORMOUS gap in observed and Pythag W/L records. Mariners are ALREADY +13 wins. The Astros/Redsox are a gap between each other of 13 wins.

10:27 AM Aug 20th

Guy123

B-R WAR does not make the error that Bill is attributing to it. This is clear if you just look at total WAR for position players (451) and pitchers (315). Pitchers are receiving credit for 41% of wins. In fact, the system is designed to average 40% for pitchers.

The WAR calculation for pitchers does not make this error because it doesn't work from a baseline of twice runs allowed (or any similar estimate). Pitchers are compared to average, and then "replacement value" is added based on playing time. The same is true for hitters. And these two replacement values are set so that position players receive 60% of wins -- quite close to Bill's own estimate.

While 4 of the top 5 WAR players at the moment are pitchers, this is not typical. Last year, only 1 of the top 5 was a pitcher. The current pitcher domination is presumably just a function of how players have played so far this year.

6:49 AM Aug 20th

MarisFan61

It was a puzzling way to put it (although completely correct!!). Most people wouldn't get it. They'd also think it's a non-sequitur.

(".......He says, as though he knows what most people would get.") :-)

I would have also thought it was just a mistake, if you didn't re-assert it.

It's right, but hey, what we're used to is that when people say "twice as often" it means twice as much, i.e. twice as many times.

How to have said it clearer??

Only more awkwardly. :-)

Like, "at exactly twice as high a rate."

------------

The MVP race does seem uncertain and wide open. Maybe somebody breaks from the pack in the last few weeks. If not, it might be a very close 4 or 5 man race.

There have been at least a few of those before, maybe many. Like, in 1955 it happened sort of in both leagues.

N.L.: Campanella and Snider were 5 points apart; Banks, Mays, Robin Roberts not far behind

A.L. Berra, Kaline, Smith all between 200 and 218 points.

A.L. '57 (sorry, love those '50's): Mantle, Williams, Sievers, Fox, McDougald all had 4 to 6 first place votes.

N.L. '57: Aaron, Musial, Schoendienst all between 221 and 239, Mays not far behind, Spahn very strong too; all got first place votes.

--------------

Bill, that's great how you clearly point out such a fundamental seeming glitch about combining "zero-based" and "average-based" calculations.

I had no idea why it is that Baseball Reference shows pitchers so dominantly at the top of the "WAR" rankings, with those 3 pitchers you mentioned plus Kyle Freeland ahead of any position player, and the highest position players so far behind the leading pitchers. (The pitchers don't show nearly like that on Win Shares; all the mentioned ones are behind all the mentioned position players.) The explanation offered here seems real possible, although it would still leave the question of why the "WAR" system shows in such a way this year whereas (I think) it rarely showed like this before.

Very interesting to look at MVP in the terms looked at here, including run % "in context" and situational performance.

Of course we'd want to consider other things besides offense, especially in such a possibly-close race. It could count big for someone like Baez, especially if the Cubs finish basically where they are. For someone like him, I think his impressive offensive numbers, taken together with (as mentioned here) his "special skills" on defense, could make him stand out in the crowd. The bare fact of his offensive numbers could make the voters not mind how they show on the finer analysis.

I think Carpenter's being able to fill in all around the diamond could help him too in such a close race.

I think the only one of the pitchers that might figure majorly in the MVP vote is Nola, because he's the only one on a team that's in it, which I think will count a lot for the pitchers in particular.

11:17 PM Aug 19th

bjames

Trying again to explain the problem referenced above. . . . suppose that in a league an average team scores and allows 700 runs. To represent defense as being of equal value to offense, pitchers and fielders combined must also be represented as saving 700 runs on average, as opposed to a zero-value point.

Actually that is not inevitably true, and I’ll get back to that point in a moment. But John’s method and, I assume and believe, BB-Ref WAR use this model: that if there are 700 runs created on offense, there must be 700 “saved” on defense. The problem is that of those 700 runs saved by pitching and defense, some number—about 500—are saved by the pitchers, and some number—about 200—are saved by the fielders.

But these methods do not represent the runs saved by fielders as coming out of the “pitchers and fielders share”; rather, they credit positive runs to a good fielder and charge negative runs to a poor fielder, thus, in effect, crediting the runs saved by a good fielder as coming at the expense of a poor fielder, and thus arguing that the sum total of runs saved by the fielders is zero. But if the sum total of runs saved by the fielders is zero, then the sum total of runs saved by the pitchers remains at 700. The problem that this creates is not that it fails to credit fielding; it does credit fielding. The problem is that it fails to reduce the number of runs that are saved by the pitchers.

In reality, about 50% of baseball is hitting + baserunning, about 38% is pitching, and about 12% is fielding. But BB-Ref WAR, I believe, is using a structure that assumes that pitching is not 38% of the game, but 50% of the game. This exaggerates the values given to pitchers—exaggerates them by failing to diminish them when they logically should be diminished—and this causes the WAR values of top pitchers to be higher than they should be. In some seasons, like this one, this causes all of the top players in the league to be shown as pitchers.

Back to this point:

To represent defense as being of equal value to offense, pitchers and fielders combined must also be represented as saving 700 runs on average, as opposed to a zero-value point.

But that is not literally true. One could represent their WIN impact as being equal but represent their RUN impact as being unequal. Let’s say that you assume that replacement level is .294 or whatever they assume that it is; let’s say .294. If an average hitter has a replacement level of .294 and an average team scores 700 runs, then a replacement-level offense is 452 runs, or 248 runs below average.

But if an average PITCHING STAFF (and defense) allows 248 runs more than 700, that doesn’t create a winning percentage of .294; it creates a winning percentage of .352. To get to .294, you need to add another 137 runs, making it 700 against 1085.

Thus, one COULD make pitching and defense equal to offense not by assuming that the number of RUNS was equal on each side, but that the number of WINS was equal on each side, and thus assuming that 248 runs on offense are equal in win impact to 385 runs of pitching and defense. If you did that, then you would have more space to work with, and then you could discount the Win Value of runs saved by pitchers without reducing the run value.
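The run totals in this comment can be checked with a short sketch. It assumes the standard Pythagorean win estimate with an exponent of 2; the comment does not name the formula, but that exponent reproduces the quoted numbers. The function names are invented for the sketch.

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2):
    """Pythagorean winning percentage (exponent of 2 is an assumption)."""
    rs = runs_scored ** exponent
    return rs / (rs + runs_allowed ** exponent)


def runs_scored_for_pct(pct, runs_allowed, exponent=2):
    """Runs an offense must score to reach pct against a fixed runs allowed."""
    return runs_allowed * (pct / (1 - pct)) ** (1 / exponent)


def runs_allowed_for_pct(pct, runs_scored, exponent=2):
    """Runs a defense must allow to fall to pct with a fixed runs scored."""
    return runs_scored * ((1 - pct) / pct) ** (1 / exponent)


AVERAGE_RUNS = 700
REPLACEMENT_PCT = 0.294

# Replacement-level offense paired with average pitching and defense
repl_offense = runs_scored_for_pct(REPLACEMENT_PCT, AVERAGE_RUNS)
offense_gap = AVERAGE_RUNS - repl_offense

# Average offense with a defense that is worse by the same number of runs
same_gap_pct = pythag_pct(AVERAGE_RUNS, AVERAGE_RUNS + round(offense_gap))

# Runs a defense has to allow before an average offense falls to .294
repl_defense = runs_allowed_for_pct(REPLACEMENT_PCT, AVERAGE_RUNS)
defense_gap = repl_defense - AVERAGE_RUNS

print(round(repl_offense), round(offense_gap))    # 452 248
print(f"{same_gap_pct:.4f}")                      # 0.3528
print(round(repl_defense), round(defense_gap))    # 1085 385
```

The asymmetry comes from the shape of the estimate: taking 248 runs away from an offense costs more wins than adding 248 runs to a defense, so it takes roughly 385 extra runs allowed to match the same winning percentage.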

11:13 PM Aug 19th

BobGill

Oh, I get it.

8:02 PM Aug 19th

bjames

Both statements are correct.

7:20 PM Aug 19th

BobGill

About Carpenter, you said he "has homered almost exactly twice as often with the bases empty as with men on base." But then you give the breakdown: 27 homers with nobody on, only six with men on base, which is obviously far from a 2-to-1 ratio. Was the first statement supposed to be about something other than home runs?

6:57 PM Aug 19th