A response about WAR

By Bill James

September 29, 2022

From Tom Tango

Just a general point regarding WAR v Win Shares, which we can bypass altogether if we just focus on Win Probability Added (WPA), which has the advantage of guaranteeing everything adds up, not only at the game level, but at the individual play level.

And if you look at Pedro's WPA, he comes in at +51 wins above average for his career.

His W/L record is 219-100, or +119, or +59.5 wins above average.

His runs allowed rate is 66% of league average, and Pythag (using 1.82 exponent) says that's close to a .680 record, or +58 wins above average.
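
The Pythag step can be checked with a few lines of Python. This is only a sketch: it assumes league-average run support, and it assumes the number of games is simply innings divided by nine (the text does not say which denominator was used), so it lands near +57 rather than exactly on the +58 quoted above. The function and variable names are made up for illustration.

```python
def pythag_win_pct(ra_ratio: float, exponent: float = 1.82) -> float:
    """Expected winning percentage when runs allowed are `ra_ratio` times
    league average and run support is league average."""
    return 1.0 / (1.0 + ra_ratio ** exponent)


pedro_pct = pythag_win_pct(0.66)      # roughly .680
nine_inning_games = 2827 / 9          # assumption: games = IP / 9
pedro_waa = (pedro_pct - 0.5) * nine_inning_games

print(f"{pedro_pct:.3f} winning percentage, {pedro_waa:+.1f} wins vs. average")
```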

So, trying to come to terms with how good Pedro is is pretty straightforward, as we have good agreement using multiple methods. He's +50 to +60 wins above average. This is good enough for my illustration below.

So, if we were to create an "Individualized Won Loss Record" for Pedro, it should be pretty straightforward: let's give out for each pitcher a "game slice" of .42 games for each 9 innings. Pedro's 2827 IP is 314 9-inning games and so he'd get 132 game slices. The average is obviously 66-66.

Since Pedro is about +50 to +60 wins above average, using any method you choose (and using +50 in this illustration), then his Individualized Won-Loss record will come in at 116-16 or so. If you chose .37 games for each 9 innings, then it's 108-8 record. It doesn't matter (too much) what you use, whether .37 or .42 or whatnot.

It will matter (a bit) when you compare to the ".300 level" pitcher, or whatever baseline you choose. A 116-16 record is 76 WAR and 108-8 is 73 WAR.
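
For readers who want to try other game-slice rates or baselines, the arithmetic above can be sketched as follows. The function name and its parameters are hypothetical; the inputs (2827 IP, .42 or .37 games per 9 innings, +50 wins above average, a .300 baseline) are the ones used in the illustration.

```python
def individualized_record(ip, slice_per_nine, waa, baseline=0.300):
    """Return (wins, losses, WAR) for a pitcher.

    ip             -- innings pitched
    slice_per_nine -- game slices credited per 9 innings (e.g. 0.42)
    waa            -- wins above average, however you estimated it
    baseline       -- winning percentage of the comparison pitcher
    """
    slices = slice_per_nine * ip / 9      # total games the pitcher "owns"
    wins = slices / 2 + waa               # centered on a .500 record
    losses = slices / 2 - waa
    war = wins - baseline * slices        # wins above the baseline pitcher
    return wins, losses, war


for rate in (0.42, 0.37):
    w, l, war = individualized_record(2827, rate, waa=50)
    print(f"{rate:.2f} per 9 IP: {w:.0f}-{l:.0f}, {war:.0f} WAR")
```

Changing the slice rate moves the record and the WAR only a little, which is the point made above.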

The key point is that I can make everything add up at the season, game, or play level. And I can do so by using the centering point of .500. And I really, really, really think the entire problem of WAR v Win Shares is we are not talking about it using two dimensions. Because if either of them is appreciably different from this 108-8 or 116-16 record, then we'd have something more tangible to talk about that would actually move the argument forward.

Bill, can you provide the Win Shares / Loss Shares for Pedro's career?

Bill: No editorial responses here, because I don’t want this to become a debate exactly, but I can’t produce Pedro’s Win Shares/Loss Shares right now because I haven’t used that spreadsheet in a couple of years and don’t remember what it was called, where it is or how to use it. I’ll look into it, but the next three weeks are the busiest time of the year for me, because this is when we write the annual Bill James Handbook. But I’ll try to remember to get to that.

COMMENTS (38 Comments, most recent shown first)

Brock Hanke
Tom - Thanks for the response. I figured that there had to be a way for WPA to deal with this; I just didn't know what it was. As for Reader Posts, I've never used it before, and am not quite sure I know how. I'll try reading a few and see what I can figure out. Thanks Again!
8:48 AM Oct 6th

tangotiger
I agree, reader posts is best. That said:

"That produces a positive change in Win Probability, and the credit for it goes to the hitter, as far as I know about WP systems"

That is not true. It's not REQUIRED for win probability that ALL of the change in the event goes to the batter. It's just EASIER to do that, because it's a lot of work otherwise.

You can certainly break it up into a 40% chance of taking the extra base and 55% chance of not, and 5% chance of being thrown out, and so that's what goes to the batter, the "average" expectation of what the runner will do. That's an "intermediate" state. Then, from that intermediate state to the actual final state, you give that change to the runner.

There was a more recent example where Stanton got a hit off the Monster, and the runner from 2B was thrown out, and Stanton cruised to 2B on the throw. From a run expectancy standpoint, they went from runner on 2B and 0 outs to runner on 2B and 1 out. So it LOOKS like Stanton blew it. Again, that's just easiness/laziness in coding.

If you think in terms of "segments" of a play, then run expectancy (and win expectancy) can totally and absolutely properly handle these scenarios, without question.
3:09 PM Oct 5th
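
A rough sketch of the "segments" idea from the comment above: the batter is credited up to an intermediate state (the average expectation of what a runner does), and the runner gets the change from that intermediate state to what actually happened. The 40% / 55% / 5% split comes from the comment; every win expectancy value below is a made-up placeholder, not real data, and the function name is hypothetical.

```python
def split_credit(we_before, outcomes, actual):
    """Split a play's win expectancy change between batter and runner.

    outcomes -- dict of outcome name -> (probability, win expectancy after)
    actual   -- the outcome that really happened
    """
    total_p = sum(p for p, _ in outcomes.values())
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    # Intermediate state: what an average runner would be expected to produce.
    we_intermediate = sum(p * we for p, we in outcomes.values())
    batter_credit = we_intermediate - we_before
    runner_credit = outcomes[actual][1] - we_intermediate
    return batter_credit, runner_credit


# Hypothetical win expectancy numbers, for illustration only.
outcomes = {
    "takes_extra_base": (0.40, 0.66),
    "holds": (0.55, 0.61),
    "thrown_out": (0.05, 0.52),
}
batter, runner = split_credit(0.55, outcomes, "takes_extra_base")
print(f"batter {batter:+.4f}, runner {runner:+.4f}")
```

The two pieces always sum to the full change on the play, so nothing is lost by splitting it this way.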

jgf704
Hey Brock Hanke...

You really should consider joining Reader Posts! It is a much better venue for back-and-forth discussions.
10:42 AM Oct 5th

Brock Hanke
Hi, Tom Tango. This is Brock Hanke, the guy whose comment was part of starting all this.

I'm not sure that this is the place for this, because what I have is about offense, not pitching, but there is a question about Win Probability Added that I've wanted an answer to for several years, so here it is.

Here's the issue: You have a very fast baserunner, and he hits a double. That produces a positive change in Win Probability, credit for which goes to the hitter. Now the next hitter hits a single right at the RF, which most runners could not score on, but the fast runner does. That produces a positive change in Win Probability, and the credit for it goes to the hitter, as far as I know about WP systems.

But that's wrong. Some of the credit should go to the runner, who did something that most runners could not do. The question is how much of the credit should go to the runner. You can take the WP change for men on first and third but no run scored and subtract it from the WP change for man on first and a run, but I'm not completely comfortable with that. There are a lot of situational issues in there.

So, my question. How does the version of WP that you (Tom Tango) use deal with this sort of thing? I'm not here to argue, but just to get an answer. I haven't been able to come up with anything I'm happy with. I'd like to know what you have. You have had a lot more people working on Win Probability than I ever have.
5:20 AM Oct 5th

hotstatrat
Sorry to come to this discussion so late - been on the other side of the Atlantic in countries I could not pull up BJOL.

Sorry also, if my comment is unhelpful or too off topic. I'm struggling to understand the point of this discussion.

Of course, a two dimensional statistic will give a clearer picture of a player's career value than a single stat - just as two stats are better than one (if they are the most appropriate stats).

The beauty of WAR is that it is a single stat which allows all sorts of studies that would be far more cumbersome with a multi-dimensional stat.

Anyway, whatever new stat you are coming up with, it needs to be tested against many different types of pitchers to see if it makes sense. What made me lose interest in (I think it was Pete Palmer's) Wins Above Average (or something similar) was that it showed Sandy Koufax to be a fairly mediocre pitcher over his career, because he had a bunch of years that were below average which diminished his great ones. Let's see if what you are doing makes sense for the Pedros, the Koufaxes, the long and merely very good career of Warren Spahn, and many more types of careers.

9:24 PM Oct 4th

tangotiger
I'd be happy to talk about the share for SP now and in the past, as well as the split between SP and RP.

But that really takes away from the thrust of this small article I wrote, which is to focus on Wins Above Average, and how that is perfectly able to be central to the methodology.

So, can we save that for some other thread?
6:09 PM Oct 3rd

jgf704
abiggoof... Strikeouts are not the reason that Greinke has a higher rWAR total than Ford, despite having a higher ERA.

Check out the player comparison: https://stathead.com/tiny/uNgUh

Ford is +0.68 on ERA compared to Greinke. But Greinke has the advantage (i.e. things that make him look good relative to Ford) in every other area. Greinke...

* pitched in a higher run-scoring era
* had a worse defense behind him
* gave up fewer unearned runs
* pitched in an era where SP had it relatively tougher than RP (compared with Ford's era)
* pitched in hitters parks

Note that my main point here is that Greinke's higher WAR has nothing to do with strikeouts.
5:58 PM Oct 3rd

abiggoof
So if today’s top starters have almost the same share, is it that relievers today get the shaft despite pitching more innings as a group, or — and I would not be shocked if this is the case — the back end of the rotation is far worse than days of old?

If WAR says relief pitching today provides a smaller share, that’s folly. If it says the difference between Scherzer and your fifth starter is where things have changed, I could buy that. Also if it shows a similar difference between top bullpen aces and the dozen guys arriving annually on each team’s minor league shuttle.

But I still don’t buy that a 1.70 ERA in 200 innings beats a 2.10 in 300 innings, because the difference (about 2.90 over 100 innings) has massive value. I suspect the gap between the top, the middle, and the bottom is widening, and WAR blows that out of proportion compared to WS because of that replacement level notion.

Can someone with better math skills than me either confirm or dispel the idea that the disparities today and the single season sample size (see Bill’s recent series on that) distorts WAR? As I have said before, it can usually pass the smell test compared to other results that year, and year to year for consistent players. That doesn’t mean it works when you pull back. I find that WS does.

I think Tom is brilliant, but this just does not seem to add up across eras.
5:36 PM Oct 3rd
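
The "about 2.90 over 100 innings" figure in the comment above can be reproduced by asking what ERA the extra 100 innings must carry. A small sketch, with hypothetical function names:

```python
def earned_runs(era, ip):
    """Earned runs implied by an ERA over a number of innings."""
    return era * ip / 9


def marginal_era(era_a, ip_a, era_b, ip_b):
    """ERA over the innings that pitcher B threw beyond pitcher A's total."""
    extra_runs = earned_runs(era_b, ip_b) - earned_runs(era_a, ip_a)
    extra_ip = ip_b - ip_a
    return 9 * extra_runs / extra_ip


marginal = marginal_era(1.70, 200, 2.10, 300)
print(f"ERA over the extra 100 innings: {marginal:.2f}")
```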

DrDoom
Any of those who are confused about how WARs can be so high for modern pitchers, I think the article Tango recently wrote would be really informative. It's simple work, deals only with W-L records and IP, and explains the same insight that WAR is capturing:

tangotiger.com/index.php/site/article/war-quality-x-quantity
3:15 PM Oct 3rd

abiggoof
Maybe it looks good on the granular level, but from what I gather, WAR fails spectacularly due to the strikeout obsession to compare career value across eras.

Ford: 3170 IP, 133 ERA+, 53.6 WAR, 255.6 WS (pitching only)
Greinke: 3241, 123, 71.4, 235.7
That’s crazy.

I want to see more numbers to better gauge if this assertion is true, but you can’t give starters more credit when the whole pie is the same and they pitch less.
3:02 PM Oct 3rd

tangotiger
Based on another thread, it seems that starting pitchers are getting about 0.30 game shares per 9 IP. That would give Pedro about 94 game slices, making a .500 pitcher 47-47. At +50 WAA, that's 97 W and negative 3 L. So, 291 expected Win Shares in that case. At +60 WAA, that's 107 W and negative 13 L, or 321 expected Win Shares.

That 0.30 is also an issue to consider.
5:23 PM Oct 2nd

tangotiger
For those who want to try their hand at estimating their own WAA for these other three top pitchers, go for it. Here's the data and what we get as their Won-Loss Records.

Clemens (546 9-inning games)
+85, based on 354-184 W-L record
+86, based on pythag of ERA 70% of league average
+77, using WPA

Maddux (556 9-inning games)
+64, based on 355-227 W-L record
+68, based on pythag of ERA 76% of league average
+55, using WPA

R.J. (459 9-inning games)
+68, based on 303-166 W-L record
+59, based on pythag of ERA 75% of league average
+54, using WPA

Individualized Won Loss Records (using .36 games per 9 IP):
181-15 Clemens (+83 WAA)
162-38 Maddux (+62 WAA)
143-23 R.J. (+60 WAA)

***

In terms of Win Shares, this would imply (with actual WSh in parens):
543 Clemens (437)
486 Maddux (398)
428 R.J. (326)

9:23 PM Sep 30th
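
The numbers in the comment above can be reproduced with a short script. This is a sketch under the comment's own assumptions: .36 games per 9 innings, the WAA chosen for each pitcher, and 3 Win Shares per win. Small differences from the listed figures come from rounding. All names in the code are made up for illustration.

```python
GAMES_PER_NINE = 0.36       # game slices per 9 IP, as in the comment
WIN_SHARES_PER_WIN = 3

# name: (9-inning games, actual W, actual L, chosen WAA, actual Win Shares)
PITCHERS = {
    "Clemens": (546, 354, 184, 83, 437),
    "Maddux": (556, 355, 227, 62, 398),
    "R.J.": (459, 303, 166, 60, 326),
}


def waa_from_record(wins, losses):
    """Wins above a .500 pitcher with the same number of decisions."""
    return (wins - losses) / 2


def individualized(games, waa):
    """Individualized wins and losses centered on .500."""
    slices = GAMES_PER_NINE * games
    return slices / 2 + waa, slices / 2 - waa


results = {}
for name, (games, w, l, waa, actual_ws) in PITCHERS.items():
    ind_w, ind_l = individualized(games, waa)
    implied_ws = WIN_SHARES_PER_WIN * ind_w
    results[name] = (ind_w, ind_l, implied_ws)
    print(f"{name}: W-L WAA {waa_from_record(w, l):+.1f}, "
          f"record {ind_w:.0f}-{ind_l:.0f}, "
          f"implied WSh {implied_ws:.0f} (actual {actual_ws})")
```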

jgf704
willibphx: "504" was my grandfather. I'm "704" :)

I agree that decreased IP for SP contributes to a drop in SP value, all things being equal. OTOH, we know that one reason SP's are pitching fewer innings is to increase their effectiveness when they *are* pitching. And in fact, the data shows this. From 1970 to 2022, IP by the top 10 SP (by ERA-) has decreased from 237 to 181. OTOH, the ERA- of these top 10 pitchers has *also* decreased from 70 to 60. The overall runs saved vs. average has actually increased a little.

That said, the bigger issue, as I see it, is that there are ZERO pitchers in the top 30-35 WS in any of the last 10 years. If you go back further in history, the situation is not as extreme, but still exists (see https://bit.ly/3SkJwl3). There are relatively few pitchers among the top WS values. Ultimately, IMO, it is because WS compresses the range of pitcher values.
9:08 PM Sep 30th

tangotiger
kaline: Pedro has a career 2.93 ERA and 2.91 FIP. I don't know that we need to try to unravel his team's W/L record.
8:02 PM Sep 30th

kaline09
I didn't mean to say that Pedro Martinez is 47.5 WAA (or to make a distinction between 47 and 50 - fine to call it 50). Rather, I meant that his team was 50 WAA in his starts. Which I think begs the question of whether the team was otherwise average, and all 50 WAA were Pedro, or the rest of the team was above average, and some of the 50 WAA should be allocated to others. If the rest of the team were 10-15 WAA over the course of his games then I could see how 35-40 WAA for Pedro is plausible.
6:21 PM Sep 30th

watcan
Tom, understood, and I appreciate the focus.
5:39 PM Sep 30th

tangotiger
watcan: thank you for that.

I think it's important at this stage not to try to decide whether the Reference or Fangraphs or ANYONE's approach to wins above average is better. The point here is that using Wins Above Average is the key to making sure everything works out.

We can debate HOW to calculate WAA in another thread. Right now I'm just asking everyone to come up with their own WAA for Pedro, in whatever manner they want.
5:19 PM Sep 30th

watcan
Tom, your logic here showing how Win Shares undervalues all-timers like Pedro is impeccable. Once you add the second dimension, it's clear that WAR can add up to match overall team W-L records the same as Win Shares.

Where I run into trouble with WAR outputs, however, is where I don't accept the wins above average for a given pitcher. However we look at Pedro, he'll wind up between 50-60 WAA for his career. But Zack Wheeler last year... if he was 5.7 WAA (where BBRef has him), then when you figure his game responsibility he's going to wind up at the top of the WAR leaderboard. If you have him closer to 3.6 wins added (where WPA has him), then he's still a Cy Young candidate but not in the top ten compared to position players. That's a big discrepancy, and it's not unusual.

So it's two different issues. Win Shares doesn't give enough credit to elite starters. But the way some implementations of WAR account for defensive support to pitchers may conversely be overrating some of those top starters (not Pedro).
5:09 PM Sep 30th

willibphx
To jgf504.

The decline in WS for the top pitcher vs the top hitters is entirely driven by the decrease in the number of BF by pitchers over the last 50 years. A quick list of the avg WS for the top 10 pitchers vs the avg of the top 10 pitchers in BF.

2021: 802 BF, 16.5 WS, 49 BF per WS
2010: 961 BF, 20.0 WS, 48 BF per WS
2000: 999 BF, 20.9 WS, 48 BF per WS
1990: 1001 BF, 20.5 WS, 49 BF per WS
1980: 1144 BF, 22.2 WS, 52 BF per WS
1970: 1236 BF, 24.1 WS, 51 BF per WS

The BF by the top 20 pitchers has dropped by 35% since 1970 and the top 10 WS pitchers have dropped by 32%.

The number of WS is just being spread across a far greater group of pitchers.

4:54 PM Sep 30th
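
The ratios and percentage drops in the comment above follow directly from the listed BF and WS averages. A minimal sketch that recomputes them (the variable names are made up; the data is copied from the list):

```python
# year: (average BF, average WS) for the top 10 pitchers, from the comment
TOP_PITCHERS = {
    2021: (802, 16.5),
    2010: (961, 20.0),
    2000: (999, 20.9),
    1990: (1001, 20.5),
    1980: (1144, 22.2),
    1970: (1236, 24.1),
}

bf_per_ws = {year: bf / ws for year, (bf, ws) in TOP_PITCHERS.items()}
for year, ratio in bf_per_ws.items():
    print(f"{year}: {ratio:.0f} BF per WS")

bf_1970, ws_1970 = TOP_PITCHERS[1970]
bf_2021, ws_2021 = TOP_PITCHERS[2021]
bf_drop = 1 - bf_2021 / bf_1970     # share of batters faced lost since 1970
ws_drop = 1 - ws_2021 / ws_1970     # share of Win Shares lost since 1970
print(f"BF down {bf_drop:.0%}, WS down {ws_drop:.0%}")
```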

tangotiger
kaline: I only used the Pitcher Won-Loss record because of its ubiquity. To include the team's Won-Loss record would mean including the won-loss portion of games he was a no-decision on, which now starts to really call into question the use of the metric at all. You are also ignoring his 11-3 record as a reliever.

That said, I provided three different ways to look at it so that I don't get locked into one method. And I also don't just want it to be about Pedro or not.

I'd be happy to repeat all this for Clemens, RJ, Maddux, and Seaver. But I don't want to get bogged into the minutiae here, since I've isolated something very fundamental about the way Win Shares is behaving.

That THAT said: your work says 47.5 WAA for SP. Fine, we'll accept it. Please provide a WAA for his 11-3, 2.18 record in 107 relief innings.
4:50 PM Sep 30th

kaline09
Thank you for the info about pLi.

I see the approaches many comments are taking.

Is this approach also plausible:

The teams Pedro pitched on went 252-157 in his starts. 37-42% of 252 is 93-106. But Pedro pitched less than 7 innings per start. Those 93-106 wins should to some extent be credited to the pitchers of those other 2 innings. To attribute 108-116 wins to him, it seems that both the other pitchers have to be attributed negative wins and the pitchers overall attributed more than 42% of the 252 wins.
I see why Tango gets to 116-16, or 108-8, or otherwise at least 50 wins better than average. But I also see that in his career, Pedro’s teams were only 47.5 wins above average in his starts notwithstanding that he pitched on good teams that were otherwise above .500 in games he did not start. From that perspective, it seems harder to believe that he alone was worth more than 50 wins above average, despite the run prevention values, and it seems useful to ask how to explain why one approach yields 108-116 wins and another yields some amount less than 93-106.

4:37 PM Sep 30th

tangotiger
Royal: it's totally fine to use ER instead of R. I wouldn't do that, but you can certainly present a reasonable case to do so.

Note that runs-per-win really work on runs allowed, not ER allowed (though it doesn't invalidate it, just a wrinkle).

In addition you are giving 10 runs per win, which is on the high end for someone of Pedro's run environment.

That said, even with all those factors going against him, you still had him at +50 wins, which is probably as low as you'll be able to make him.

Looking forward to others sharing their work.
3:44 PM Sep 30th

ForeverRoyal
I had Pedro at 50. I used the same calculation as Rally but using Earned Runs since ERA+ uses Earned Runs. Pedro gave up 919 earned runs which is roughly 500 runs better.
2:39 PM Sep 30th

tangotiger
Rally: thank you for that.

And one can certainly say that Pedro depresses the run environment so much that +543 would translate at 9:1 runs to win, or +60 wins. Nonetheless, Rally has shown his WAA expectation. Looking forward to others showing what Pedro's WAA is, in their own work.
12:45 PM Sep 30th

Rallymonkey5
Pedro gave up 1006 runs. His ERA+ is 154 on baseball-reference, meaning that the league average runs allowed was 54% more than Pedro. He seems to have a normal ratio of unearned to earned runs. ERA+ takes care of adjusting for league and ballpark.

So average for his innings would be 1549 runs, making Pedro +543, or about 54 wins better than average.
12:26 PM Sep 30th
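
The calculation in the comment above is easy to redo with your own inputs. A sketch, assuming (as the comment does) that ERA+ can be applied to total runs allowed and that runs convert to wins at a fixed rate; 10 runs per win is the rate used in the comment, and 9 is the alternative raised elsewhere in the thread. Function names are hypothetical.

```python
def runs_vs_average(runs_allowed, era_plus):
    """Runs saved relative to a league-average pitcher in the same innings."""
    league_runs = runs_allowed * era_plus / 100
    return league_runs - runs_allowed


def runs_to_wins(runs, runs_per_win=10.0):
    """Convert runs saved to wins at a fixed runs-per-win rate."""
    return runs / runs_per_win


pedro_runs_saved = runs_vs_average(1006, 154)
print(f"runs saved: {pedro_runs_saved:+.0f}")
print(f"at 10 runs per win: {runs_to_wins(pedro_runs_saved):+.0f} wins")
print(f"at  9 runs per win: {runs_to_wins(pedro_runs_saved, 9):+.0f} wins")
```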

tangotiger
Thank you for that Rally, it's an excellent way to get to how many Game Shares Pedro would get. And it's consistent with my "guess" as to simply applying the .36 game shares per 9IP. Not so much a guess as consistent with how Bill has Win Shares working. But I didn't want to presume it. But your method works well.

And to clarify for others, Rally quoted Baseball Reference with a WAA (wins above average) of 61 for Pedro and 0 for Flanagan (which I've rounded because the decimals get in the way).

I would also encourage everyone out there to come up with their own estimate of Pedro's WAA, however they want to. I've suggested around +50 to +60 is what anyone will find.
11:52 AM Sep 30th

Rallymonkey5
I posted this in another thread, Tom asked me to post it here:

I guess we could get a sense of how many game shares are going to pitchers. Look for an average pitcher with about the same number of innings as Pedro. For example, Mike Flanagan. 2770 innings, 100 ERA+.

A few others who are close, within 75 innings of Pedro, ERA+ between 95-105, and at least fairly recent, 1960s or later:

Javier Vazquez
Ken Holtzman
Bob Forsch
Mike Morgan
Mike Moore

Flanagan had 158 win shares, and presumably 158 loss shares, or something close to that, so 53-53 in terms of real wins. So about .34 game shares per game, 106/308.

Or we could see Pedro is 98 win shares, or 33 wins better than Flanagan. Bbref has 61.4 for Pedro, 0.4 for Flanagan, so nearly twice the difference.
11:43 AM Sep 30th
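
The Flanagan estimate above can be written out as a short calculation. This sketch keeps the comment's assumptions: 3 Win Shares per win, and Loss Shares equal to Win Shares for an average pitcher. The names are hypothetical.

```python
WIN_SHARES_PER_WIN = 3


def game_shares_per_nine(win_shares, ip):
    """Game shares per 9 innings for a pitcher assumed to be exactly average."""
    wins = win_shares / WIN_SHARES_PER_WIN
    slices = 2 * wins            # assumes loss shares equal win shares
    return slices / (ip / 9)


flanagan_rate = game_shares_per_nine(158, 2770)
win_share_gap = (256 - 158) / WIN_SHARES_PER_WIN   # Pedro minus Flanagan, in wins
bbref_gap = 61.4 - 0.4                             # same gap in BBRef WAA

print(f"game shares per 9 IP: {flanagan_rate:.2f}")
print(f"Win Shares gap: {win_share_gap:.0f} wins, BBRef gap: {bbref_gap:.0f} wins")
```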

jgf704
Jacob deGrom, NL CYA winner, 1.70 ERA in 217 IP, also earned 20 Win Shares.
11:22 AM Sep 30th

jgf704
maodorica: I think WS and WAR, and most analytical systems, correlate very highly on great players.

Yes, but a big disconnect with Win Shares is that it undervalues top pitchers compared to top hitters. Each season of the last decade there are roughly 35 hitters who earn more Win Shares than the top pitcher. It's not as stark when you go back further in history, but it is still disparate.

A specific example: 2018 Asdrubal Cabrera, Met 2B, traded at deadline for a nobody pitcher (Franklyn Kilome), earned 20 Win Shares, whereas AL Cy Young winner and runner up Blake Snell and Justin Verlander earned 21 and 20 pitching Win Shares respectively.
11:15 AM Sep 30th

tangotiger
jgf: Thank you for that illustration as to possible impact on various Loss Shares for Pedro, if we stick with the published Win Shares of 256, which everyone can see here:
bjprofiles.bisdata.com/StatisticsReport_new.aspx?Type=114&Team=0&Player=1&men=2

I agree with your conclusion that "something's gotta give". I know what has to happen, but first I'd like to see Bill work thru my request in his own way.
10:57 AM Sep 30th

tangotiger
CTR: yes, for purposes of this discussion, let's accept that the percentages are as you are presuming. However, don't let those percentages be a roadblock, since I have just shown that we can get past it anyway (for the most part).
10:53 AM Sep 30th

tangotiger
Thank you to jgf on the response for Leveraged Innings. He is correct that the average LI for relievers (and SP) is right close to 1. This is not an issue.
10:52 AM Sep 30th

jgf704
kaline09:
Can the 2827 innings translate into 132 game slices if relievers get credit for pitching leveraged innings? Or, to make room for the leveraged innings, do the 2827 innings have to translate into fewer game slices?

FWIW, overall reliever leverage index is around 1.0. See the pLI column in Fangraphs: https://bit.ly/3SsKNGG

So no need to discount starting pitcher innings.
9:36 AM Sep 30th

jgf704
Bill has not published Loss Shares, but we can do some what ifs...

BJOL shows Pedro with 256 Win shares, or 256/3 = 85 wins.

If you agree that Pedro is 50 wins above average, this implies an "Individualized Win Loss" record of 85-(-15). Negative losses are fine, but this further implies only 85-15=70 game slices, or 0.22 per 9 innings, much lower than the .35 Win Share allocation to pitchers in the Win Shares book, or the 0.41 Bill has mentioned in recent "Hey Bill's".

Alternatively, if you agree with 116 game slices, this implies an Individualized Win Loss record of 85-31, which is only 27 wins above average (58-58). With 132 game slices (and 85 wins), Pedro's Win-Shares-implied individual record is 85-47, only 19 wins above average (66-66).

So something's gotta give, and in a big way, for 256 Win Shares to be correct.
9:15 AM Sep 30th
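
The two what-ifs in the comment above are the same identity solved in different directions: wins plus losses equals game slices, and wins minus losses equals twice the wins above average. A sketch with hypothetical function names, using the comment's rounded 85 wins:

```python
PEDRO_WINS = round(256 / 3)        # 256 Win Shares at 3 per win, about 85
NINE_INNING_GAMES = 2827 / 9


def slices_given_waa(wins, waa):
    """Fix wins and WAA; return the implied (losses, game slices)."""
    losses = wins - 2 * waa
    return losses, wins + losses


def waa_given_slices(wins, slices):
    """Fix wins and game slices; return the implied (losses, WAA)."""
    losses = slices - wins
    return losses, (wins - losses) / 2


losses, slices = slices_given_waa(PEDRO_WINS, 50)
print(f"+50 WAA: {PEDRO_WINS}-({losses}), {slices} slices, "
      f"{slices / NINE_INNING_GAMES:.2f} per 9 IP")

for total in (116, 132):
    losses, waa = waa_given_slices(PEDRO_WINS, total)
    print(f"{total} slices: {PEDRO_WINS}-{losses}, {waa:+.0f} WAA")
```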

kaline09
Can the 2827 innings translate into 132 game slices if relievers get credit for pitching leveraged innings? Or, to make room for the leveraged innings, do the 2827 innings have to translate into fewer game slices?
8:10 AM Sep 30th

maordorica
I think WS and WAR, and most analytical systems, correlate very highly on great players. I think the most interesting differences are on players like Harold Baines / Dave Parker who overall had solid, but mostly offensively driven careers, and on down from there. I think this is when Bill's point about the little differentials makes a big difference on which system you prefer. I think it is possible that WAR has the effect of showing a graph with a truncated Y axis, that can overinflate differences near the X axis.
9:31 PM Sep 29th

TheRicemanCometh
Tom:

That's the greatest, most simplified version of WAR I've ever seen. Thank you. It's incredibly helpful.

And the .37/.42 numbers represent the % number for pitching (37%/42% pitch / 13%/8% field / 50% hitting). Is that correct?

Thanks again
-CTR
5:16 PM Sep 29th
