
Porlander and Vercello

By Bill James

November 22, 2016


In regard to the American League’s 2016 Cy Young vote, a notion has arisen that a vote for Rick Porcello was an old-line, traditionalist vote based on the won-lost record, whereas a vote for Justin Verlander was a modern, forward-looking vote based on sophisticated analysis. This is not an accurate perception, and, in the third part of this article, I will attempt to show why it is not. But first, I need to clear the decks in regard to a couple of related issues.

Part 1, trying to test the proposition that a pitcher as far behind in the won-lost record as Verlander is (from Porcello) would have had less than zero chance to be noted in the Cy Young Voting prior to about 2005 or 2009, but that this is no longer true.

My old, first, crude Cy Young formula from the 1970s was 2W – L (Two times wins, minus losses.) Porcello has a 2W – L score of 40 (44 - 4); Verlander, a score of 23 (32 – 9). Historically, a pitcher with a score of 40 or 41 has a 58% chance of winning the Cy Young Award, whereas a pitcher with a score of 22 or 23 has a 1% chance to win a Cy Young Award. This is the full chart:

2W – L Gp	Count	Winners	Vote Pct	Win Pct
42 or more	19	16	77%	84%
40–41	12	7	63%	58%
38–39	20	10	63%	50%
36–37	27	8	51%	30%
34–35	59	20	48%	34%
32–33	60	7	28%	12%
30–31	111	13	21%	12%
28–29	112	2	11%	2%
26–27	150	6	8%	4%
24–25	179	3	4%	2%
22–23	210	2	2%	1%
20–21	284	0	1%	0%
18–19	283	1	1%	0%
16–17	291	0	0%	0%
15 or less	647	1	0%	0%

First, I have to explain my study group. I studied all pitchers:

From 1956 to 2015

Who made 20 or more starts

And who had a 2W – L score which was no more than 22 points worse than the best such record of any pitcher eligible for that particular Cy Young Award

Except that I eliminated the data from the leagues in which the Cy Young Award went to a reliever.

I didn’t include 2016 because I don’t have the 2016 data blended into my files yet, and I eliminated the years when a reliever won the Cy Young Award because I figured that those weren’t helpful to understanding Porcello vs. Verlander. When I publish this I will hear from two or three people who skip the details and want to tell me that I have 12 pitchers who score at 40 or 41 when actually there are 43 such pitchers in history, so what’s wrong with my study. Anyhoo. . .explaining. There are 12 pitchers in the study who have 2 W – L scores of 40 or 41, of whom 7 won the Cy Young Award, or 58%. The pitchers had an average vote share of 63%, meaning that they got 63% of the maximum possible total of Cy Young votes. On the other hand, there are 210 pitchers in the study who had 2 W – L scores of 22 or 23, of whom two won Cy Young Awards, or 1%, and they had an average vote percentage of 2%. So we can see, historically, that a pitcher who has a won-lost record of 22-4 will usually win the Cy Young Award, whereas a pitcher who has a record of 16-9 has a 99% chance of NOT winning the Cy Young Award—not that this means anything about who SHOULD win the Cy Young Award, but who might.

But this, of course, demonstrates in hard numbers something that we all knew to be true, anyway, so it’s not that helpful. I looked at how this has changed over time. Since 2006 a pitcher who has a 2 W – L score of 22 or 23 has a 5% chance to win the Cy Young Award, and a pitcher with a score of 40 has a 100% chance to win, but again. . .not all that helpful.

Next I looked at the margin between competing pitchers in this area. Since Porcello has a 2 W – L score of 40 and Verlander of 23 the margin between them is 17; Verlander can be coded as -17, and Porcello is coded at "0", meaning that he has the best won-lost record of any pitcher competing for the award. Each full game by this system is three points, so a point can be seen as a third of a game. In other words, 21-7 is three points better than 20-8 (35 to 32), 20-8 is three points better than 19-9 (32 to 29), 19-9 is three points better than 18-10 (29 to 26), etc.
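The scoring and margin coding described here can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the text; the function names `cy_score` and `margin_behind_best` are made up for the example, and the group contains just the two pitchers under discussion rather than the full field of candidates. The margin is reported as a positive number of points behind the leader, which matches the "Margin Gp" column used in the tables.

```python
def cy_score(wins: int, losses: int) -> int:
    """Crude 2W - L score: two points per win, minus one point per loss."""
    return 2 * wins - losses


def margin_behind_best(records: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Points each pitcher trails the best 2W - L score in the group (0 = best)."""
    scores = {name: cy_score(w, l) for name, (w, l) in records.items()}
    best = max(scores.values())
    return {name: best - score for name, score in scores.items()}


# 2016 won-lost records quoted in the article: Porcello 22-4, Verlander 16-9.
records_2016 = {"Porcello": (22, 4), "Verlander": (16, 9)}
margins = margin_behind_best(records_2016)
print(margins)  # {'Porcello': 0, 'Verlander': 17}

# One full game in the standings is worth three points: 21-7 vs. 20-8.
print(cy_score(21, 7) - cy_score(20, 8))  # 3
```

A win adds two points and the loss it replaces removes one, which is why swapping a loss for a win always moves the score by three.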

What is most interesting in this chart is the impact of the first game, the first three points. There are 111 pitchers in the study who had the best 2W – L score in their league (or in the two leagues in the years 1956-1966). . .obviously counting ties as both having the best. That is entered in the chart below as "0", meaning that that pitcher is 0 points away from having the best won-lost record in the league. Those 111 pitchers had an average Vote Percentage of 66%, and 54% of them actually won the award. But when we look at pitchers who are just ONE GAME OR LESS below that level, their chances of winning the Cy Young Award are more than cut in half, dropping to 20%. The sum of the "1", "2" and "3" pitchers below is 17 for 85, exactly 20%:

Margin Gp	Count	Win Tot	Vote %	Win Pct
0	111	60	66%	54%
1	22	6	40%	27%
2	25	6	30%	24%
3	38	5	27%	13%
4	40	6	28%	15%
5	40	1	16%	3%
6	31	4	20%	13%
7	53	2	11%	4%
8	63	2	8%	3%
9	79	2	7%	3%
10	97	0	3%	0%
11	99	0	4%	0%
12	99	0	3%	0%
13	122	0	2%	0%
14	134	0	1%	0%
15	156	1	2%	1%
16	152	0	1%	0%
17	163	0	1%	0%
18	176	0	0%	0%
19	190	0	0%	0%
20	184	0	0%	0%
21	192	1	1%	1%
22	198	0	0%	0%

If a pitcher’s won-lost record is more than one game and up to two games worse than the best in the league (margin groups 4 through 6), his chance of winning the Cy Young Award drops to 10% (11 of 111); if he is two to three games below that level (groups 7 through 9), it drops to 3% (6 of 195). And if he is more than 3 games below the level of the best in the league, his chance of winning the award basically drops to zero, historically. In recent years it is slightly above zero, but it is still very, very low. In the time period 1956 to 1980, the pitcher who had the best won-lost record in the group had a 62% chance to win the award. From 1981 to 2005, this drops to 55%, and from 2006 to 2015, it drops to 40%. This is the data from 2006 to 2015:

Margin Gp	Count	Win Tot	Vote %	Win Pct
0	25	10	63%	40%
1	5	2	45%	40%
2	6	1	21%	17%
3	13	0	20%	0%
4	7	1	18%	14%
5	9	0	8%	0%
6	9	2	36%	22%
7	14	1	12%	7%
8	19	0	7%	0%
9	25	1	8%	4%
10	21	0	1%	0%
11	27	0	4%	0%
12	41	0	3%	0%
13	31	0	4%	0%
14	26	0	0%	0%
15	36	0	3%	0%
16	37	0	1%	0%
17	33	0	2%	0%
18	43	0	1%	0%
19	48	0	1%	0%
20	44	0	0%	0%
21	40	1	3%	3%
22	58	0	1%	0%

We can still see, even in the most recent data, that having THE BEST won-lost record in the league, rather than the second-best, is tremendously important in Cy Young voting. If your won-lost record is two to three games behind the best in the league, your chance of winning the award is still only 3 to 4%.
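For readers who want to check the one-game bands against the two margin tables, the pooling can be sketched as below. The (count, winners) pairs are copied from margin groups 0 through 9 of each table; the names `ALL_YEARS`, `RECENT` and `band_win_rate` are invented for this sketch and are not part of the original study.

```python
# (count, winners) for margin groups 0..9, copied from the two tables above.
ALL_YEARS = [(111, 60), (22, 6), (25, 6), (38, 5), (40, 6),
             (40, 1), (31, 4), (53, 2), (63, 2), (79, 2)]
RECENT = [(25, 10), (5, 2), (6, 1), (13, 0), (7, 1),
          (9, 0), (9, 2), (14, 1), (19, 0), (25, 1)]


def band_win_rate(rows, lo, hi):
    """Pool margin groups lo..hi (inclusive): (winners, pitchers, win rate)."""
    pitchers = sum(count for count, _ in rows[lo:hi + 1])
    winners = sum(won for _, won in rows[lo:hi + 1])
    return winners, pitchers, winners / pitchers


# Three points equal one game, so groups 1-3 are "one game or less behind",
# 4-6 are "one to two games behind", and 7-9 are "two to three games behind".
for label, rows in (("1956-2015", ALL_YEARS), ("2006-2015", RECENT)):
    for lo, hi in ((0, 0), (1, 3), (4, 6), (7, 9)):
        winners, pitchers, rate = band_win_rate(rows, lo, hi)
        print(f"{label} margin {lo}-{hi}: {winners} of {pitchers} = {rate:.1%}")
```

Pooled this way, the full study gives 17 of 85, 11 of 111 and 6 of 195 for the three bands, and the 2006 to 2015 data gives 2 of 58 for the two-to-three-game band, which is where the "3 to 4%" figure comes from.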

OK, those were the things I did that didn’t actually work all that well. The thing that I did that DID work pretty well was this. I took the "Season Score" formula that I use, a formula which predicts the Cy Young vote with a good deal of accuracy, and created a version of it which takes out all references to wins, losses, and saves. This creates a "NOWOL Effectiveness Score", NOWOL standing for No Won Lost.

Then I ranked all of the Cy Young candidates in each group based on (1) the won-lost record, and (2) their NOWOL score. When you take the Wins and Losses out what is left in the formula is innings pitched, runs and earned runs allowed, strikeouts, walks and hit batsmen. I was trying to measure "How much have things changed in this area?" How much different is the voting NOW as opposed to other years? For illustration, in 1956 (the first Cy Young vote) Don Newcombe has the best won-lost record of any candidate (27-7), but the 7th-best NOWOL score, behind Warren Spahn, Early Wynn, Herb Score, Johnny Antonelli, Lew Burdette and Whitey Ford. In 1957 Warren Spahn, the winner, has the second-best won-lost record (behind Jim Bunning) and also the second-best NOWOL score (also behind Jim Bunning). In 1958 Bob Turley won the Cy Young Award; he had the best won-lost record in the majors but the 8th-best NOWOL score, with his teammate Whitey Ford having the best and Warren Spahn the second-best.

OK, so that’s the method. In the first group in my study, 1956 to 1980, the Cy Young Winner had an average rank of 1.56 in the won-lost record, but an average rank of 2.91 in the NOWOL score. In the second group, 1981 to 2005, the Cy Young winner had an average rank of 1.86 in the Won-Lost record, but 2.55 in the NOWOL score. And in the third group in my study, the years 2006 to 2015, the Cy Young winner had average rank of 3.30 in the Won-Lost record, but 1.50 in the NOWOL score.

That’s a LITTLE BIT misleading, because we are only talking about 20 contests, so one anomalous result has a large impact. There is an anomalous result, which is Felix Hernandez winning the Cy Young Award with the 27th best won-lost record in the league in 2010. But even if you throw out that one anomalous result, the scores are still 2.05 and 1.50. The NON-won-lost record elements of the pitcher’s record are more important in determining the Cy Young Award, in the last ten years, than the won-lost record.
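The effect of throwing out the Hernandez result on the won-lost average can be checked from the published figures alone. The sketch below assumes only what the text states: 20 contests, an average won-lost rank of 3.30, and one winner ranked 27th. The helper name `mean_without` is hypothetical, and because the published average is rounded to two decimals, the result is approximate.

```python
def mean_without(mean: float, n: int, outlier: float) -> float:
    """Mean of the remaining n - 1 values after removing one known value."""
    return (mean * n - outlier) / (n - 1)


# 20 awards (two leagues, 2006-2015), average won-lost rank 3.30,
# with Felix Hernandez's 2010 award (27th-best record) removed.
adjusted = mean_without(3.30, 20, 27)
print(round(adjusted, 2))  # 2.05
```

The 20 ranks sum to about 66; taking out the single rank of 27 leaves 39 over 19 awards, or about 2.05.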

Looking back by this method, we can actually see that the voting was starting to change before 2006. . .the numbers were 1.56 (won-lost) and 2.91 (NOWOL) from 1956 to 1980, but 1.86 (Won-Lost) and 2.55 (NOWOL) from 1981 to 2005, so the thinking of voters was changing BEFORE 2006. But when did it really change? If you divide the 1981 to 2005 period into two groups, 1981 to 1991 and 1992 to 2005, you see that the average ranks were 1.81 (Won-Lost) and 3.14 (NOWOL) in the first half of that time period, but 1.91 (Won-Lost) and 1.95 (NOWOL) in the second half of that group. So from 1992 forward, other factors were basically AS IMPORTANT as the won-lost record in determining the Cy Young vote.

Here’s what really happened, I think. You will remember that in the first half of the 1980s there was a series of "Bad Years" and bad votes for the Cy Young Award. This is basically because there aren’t any great pitchers in that era; it is the Era of No Great Pitchers. Steve Stone won in 1980 with pretty mediocre NOWOL stats (5th best in the league). In 1982 Pete Vuckovich won the AL Award with the best won-lost record in the league but the tenth-best NOWOL score. In 1983 LaMarr Hoyt won the Award with the best won-lost record in the AL but the 6th-best NOWOL score. The NL was similar although not as bad.

Basically, what was happening was that the voters were looking for dominant pitchers like Seaver, Koufax, Gibson, Carlton, Palmer and Guidry, but, not finding any dominant pitchers, didn’t really know what to do. But here is what I didn’t understand before doing this study.

FOLLOWING that "Era of No Great Pitchers" there was a series of non-instructive awards, non-instructive meaning that they don’t tell us anything about the relative importance of the won-lost record vs. the other elements of a pitcher’s record. In 1984 a reliever won in the American League (Willie Hernandez) and Rick Sutcliffe won in the National League, which is a weird award because he came over in mid-season and was 16-1 and led the Cubs to the division championship. In the National League in 1985 Dwight Gooden had both the best won-lost record AND the best NOWOL score, so that’s a non-instructive example. In 1986 Roger Clemens had both the best won-lost record and the best NOWOL score, so that’s non-instructive. In 1987 and 1989 relievers won in the National League, so those are non-instructive as to how voters weigh the won-lost record. In 1988 Orel Hershiser has both the best won-lost record in the National League (tied with Danny Jackson) AND the best NOWOL score. In 1989 in the American League Bret Saberhagen has both the #1 won-lost record and the #1 NOWOL score. In 1985 Saberhagen is second in both areas, second behind Ron Guidry in the Won-Lost record and second behind Bert Blyleven in the NOWOL score, so again, that is a non-instructive example.

In 1990 the American League Award went to Bob Welch, who had the 5th-best NOWOL score in the league, but then, he did win 27 games, so you kind of have to cut the voters a little slack there. Doug Drabek in the National League was #1 in both areas. But the thing is that after this series of non-instructive examples. . .well, after the Era of No Great Pitchers followed by the era of non-instructive examples, things had clearly changed. In 1991 Roger Clemens won the American League Award; he had the fifth-best won-lost record in the league but the best stats in the league other than the won-lost record. Tom Glavine won in the National League; he did not have the best won-lost record in the league. Then Greg Maddux won in the National League in 1992, 1993 and 1994; he did not have the best won-lost record in the league in any of those three seasons, but he did have the best NOWOL score. In 1996 Pat Hentgen won, although Andy Pettitte had a better won-lost record. In 1997 Roger Clemens won in the American, although Randy Johnson had a better won-lost record, and Pedro Martinez won in the National, although four other pitchers had better won-lost records. The Won-Lost record was no longer the king of the library. From 1992 to 2005 other statistics were basically AS important in the Cy Young voting as the won-lost record, and since 2006 the other stats have been MORE important than the won-lost record.

OK, let’s drill down now on Porcello vs. Verlander, 2016. Other than the fact that Porcello’s won-lost record is better than Verlander’s by a margin which would have been 100% decisive 100% of the time up to 2005, their summary stats appear to be almost dead even. Porcello pitched 223 innings; Verlander pitched 227.2. Porcello faced 890 batters; Verlander faced 903. Porcello allowed 78 earned runs; Verlander allowed 77. Porcello allowed 238 runners to reach base; Verlander allowed 236. Porcello allowed opponents a .635 OPS, with a .268 on base percentage and .367 slugging; Verlander allowed opponents a .630 OPS, with a .263 on base percentage and .368 slugging.

Those are all summary stats, looking at what the individual categories add up to. In individual categories they are not as close. Verlander has way more strikeouts than Porcello, but almost twice as many walks. Verlander gave up 30 home runs; Porcello gave up only 23.

The effects of these things are folded into the summary stats. Verlander has a small advantage in the summary stats, but it is too small to be very meaningful. But when we look at the park effects, things would appear to swing strongly in Porcello’s direction. Porcello pitched in Fenway Park, which had a park factor of 120. Verlander pitched in Comerica, which had a park factor of 102. If you adjust their ERAs for the parks in which they pitched, Verlander goes from 3.04 to 3.01, but Porcello goes from 3.15 to 2.89. Porcello appears to be ahead.
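The raw ERAs quoted above follow directly from the earned runs and innings given earlier, and a rough park adjustment can be sketched alongside them. The adjustment below is an assumption, not the author's stated method: it uses the common convention that a pitcher works about half his games at home, so only half of the park factor's distance from 100 is applied. The function names are invented for the example.

```python
def innings(whole: int, extra_outs: int = 0) -> float:
    """Convert baseball notation (227.2 = 227 innings plus 2 outs) to a number."""
    return whole + extra_outs / 3


def era(earned_runs: int, ip: float) -> float:
    """Earned runs allowed per nine innings."""
    return 9 * earned_runs / ip


def park_adjusted(era_value: float, park_factor: float) -> float:
    """ASSUMPTION: half of the games are at home, so half the park effect applies."""
    return era_value / ((100 + park_factor) / 200)


porcello = era(78, innings(223))        # 78 earned runs in 223 innings
verlander = era(77, innings(227, 2))    # 77 earned runs in 227.2 innings
print(round(porcello, 2), round(verlander, 2))  # 3.15 3.04

# Fenway park factor 120, Comerica 102, as quoted in the article.
print(round(park_adjusted(porcello, 120), 2),
      round(park_adjusted(verlander, 102), 2))
```

Under this assumed convention Verlander lands at about 3.01, matching the text, while Porcello lands near 2.86 rather than the 2.89 quoted, so the exact adjustment used in the article evidently differs in some detail. The direction of the result is the same either way: the park adjustment moves Porcello ahead.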

Why, then, is there a perception that Verlander has better analytical numbers than Porcello does? I should say, at some point, that had Verlander won the award, that would have been reasonable; Red Sox fans could not have complained from a position of strength. It would have been like the MVP Award; Mookie was deserving, too, but you can’t really complain about Trout winning it. But back to the question: why is there a perception that Verlander has better analytical numbers than Porcello does?

What I learned from Joe Posnanski, in discussing this before either of us published anything about it, is the clear answer to that question. As Joe put it:

So, you will ask, why was there a perception that Verlander had the better advanced metrics season?

Answer: Baseball Reference.

Baseball Reference WAR

Verlander 6.6

Porcello 5.0

Now, let me pause here to say: Baseball Reference is a miracle. It is the joy of my life and the joy of most baseball writers’ lives. If forced to give up Baseball Reference or a family member, well, it would depend on which family member. But I am convinced that the main reason Justin Verlander got 14 first place Cy Young votes to Porcello’s 8 is because of that fairly sizable gap in Baseball Reference WAR. There might be other factors, but I would wager that this is by far the biggest one.

I say that because Baseball Reference WAR is absolutely the biggest reason I thought that Verlander had the better statistical season.

Hey, I check Baseball Reference WAR every single day of the season. Well, I’m on the site every single day — I imagine many baseball writers are on the site every single day — and WAR is on a front page box, updated constantly. That Verlander lead in Baseball WAR absolutely played in my mind all season long. Everything else about the two pitchers was so close so for me it came down to Porcello’s won-loss record or Verlander’s 1.6 win edge on Baseball Reference.

Until Joe explained that to me, I didn’t understand what was going on here. Sure, I use Baseball Reference every day, as most edjactated baseball writers do, but I don’t pay ANY attention to their WAR, or to Fangraphs WAR, or to any WAR unless the government is threatening to draft one of my sons to fight in it. Honestly, I didn’t realize that Baseball Reference HAD a front page. I just bypass that and look up whatever I am trying to look up. Also Tom Tango, who was participating in the discussion with us, helped us both to understand how the pieces were fitting together.

I don’t pay any attention to their WAR for pitchers in part because I don’t believe one should substitute evaluative numbers for educated judgment, but also in part because I know from past studies that their WAR estimate is just not that good; my apologies to Sean, but it isn’t. But I am getting ahead of myself; the next question in the logical sequence is, if Porcello has a better park-adjusted ERA than Verlander in basically the same number of innings, why in the world does Verlander have a WAR which is 32% higher? Isn’t that kind of a big discrepancy?

A little bit of it comes from small details. I gave you their ERAs before and park-adjusted ERAs, but I believe that Baseball Reference uses ALL runs allowed, rather than EARNED runs allowed. Porcello allowed three more un-earned runs than Verlander did, so there’s that. Another little thing is that I adjusted their ERAs based on the 2016 park factors, but I believe that Baseball Reference uses a multi-year park adjustment. The multi-year park factor for Fenway is smaller than the 2016 number. Neither is right or wrong; there are problems created by either option, and it’s just not clear which gives you a truer read on the effect of the park.

But the big thing is this. Baseball Reference, I am told, adjusts the pitcher’s performance for the park in which he pitched, and also for the quality of the defense behind him. I am going to say that again, because it turns out to be a really big thing: and also for the quality of the defense behind him.

According to Baseball Info Solutions, the Fielding Bible and John Dewan, the Boston Red Sox’ defense in 2016 was 54 runs better than average. The Tigers’ defense was 50 runs WORSE than average. It’s 104 runs in 162 games. It’s, well. . .a lot of runs.

The logic of the Baseball Reference WAR analysis is that, given the same defense behind them, the same park, Justin Verlander WOULD HAVE allowed significantly fewer runs than Rick Porcello. The question this pushes us to is, Is this actually a reasonable thing to believe?

No, it isn’t. Maybe it is a reasonable adjustment in theory, I don’t know. Maybe if we compared 100 different pitchers, this would be a useful and instructive adjustment in the other 98 cases; I don’t know. But we’re talking about this case.

Verlander faced 903 batters, of whom 349 either homered, walked, struck out or were hit by a pitch. 554 put the ball in play.

Porcello faced 890 batters, of whom 257 homered, walked, struck out or were hit by a pitch, and 633 put the ball in play.

This defensive impact adjustment, then, must occur on these balls in play, right? But of the 554 balls in play against Verlander, 141 were hits, a .255 average. Of the 633 balls put in play against Porcello, 170 were hits, a .269 average.
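The balls-in-play arithmetic in the last three paragraphs can be laid out explicitly. This is a small sketch using only the counts given in the text; `in_play_average` is a name made up for the example, and "not in play" means the home runs, walks, strikeouts and hit batsmen the article groups together.

```python
def in_play_average(batters_faced: int, not_in_play: int, hits_in_play: int):
    """Return (balls in play, batting average on those balls in play)."""
    balls_in_play = batters_faced - not_in_play
    return balls_in_play, hits_in_play / balls_in_play


v_bip, v_avg = in_play_average(903, 349, 141)   # Verlander
p_bip, p_avg = in_play_average(890, 257, 170)   # Porcello
print(v_bip, f"{v_avg:.3f}")  # 554 0.255
print(p_bip, f"{p_avg:.3f}")  # 633 0.269
```

The defense can only affect the balls that are put in play, and on those Verlander's fielders allowed the lower average of the two, which is the opposite of what the team-wide defensive ratings would suggest.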

Nine players reached on error while Porcello was on the mound (actually eight players, but Max Kepler reached on errors twice while Porcello was pitching. Love Max Kepler.) Only four batters reached on error while Verlander was pitching.

There were 35 doubles against Verlander, 36 against Porcello. There were 4 triples against Verlander, 5 against Porcello.

There were 5 bases stolen against Verlander, with 6 runners caught stealing, whereas there were 7 bases stolen against Porcello, with 3 runners caught stealing.

Also, according to Posnanski:

The biggest difference in the two defenses was in right and centerfield. The Red Sox centerfielder and rightfielder saved 44 runs, because Jackie Bradley and Mookie Betts are awesome. The Tigers centerfield and rightfielder cost 49 runs because Cameron Maybin, J.D. Martinez and a cast of thousands are not awesome.

But the Tigers outfield certainly didn’t cost Verlander. He allowed 216 fly balls in play, and only 16 were hits. Heck, the .568 average he allowed on line drives was the lowest in the American League. I find it almost impossible to believe that the Boston outfield would have done better than that.

Joe says it’s 13 runs, whereas I figure it must be 16, but 13 runs, 16 runs. . .it’s a bunch. Sixteen runs is the difference between Rick Porcello and Jeremy Hellickson or Miguel Gonzalez or Liam Hendriks or Erasmo Ramirez or Jeff Samardzija or Sean Manaea. My point is, you can’t just infer something like that without evidence. The proposition that Verlander WOULD HAVE allowed significantly fewer runs than Porcello given an equal defense is just not reasonable, given the facts on the ground.

We’ll get into the issue of whether we should COMPLETELY ignore wins and losses another time. Justin Verlander is a great pitcher, probably a Hall of Fame pitcher; Rick Porcello will need to have two or three more dominant seasons before he enters that conversation. I haven’t seen Rick Porcello’s wife or lady friend and don’t know if he has one, but if she’s hotter than Kate Upton, that would be surprising.

But there is no reasonable sabermetric argument that Justin Verlander had a significantly better season in 2016 than Rick Porcello, or even, really, that he had a better season at all. The idea that he did was created by flawed analysis.

COMMENTS (28 Comments, most recent shown first)

steve161
They don't even need to be egregious. They just need to be in that area where the human eye isn't as reliable as multiple cameras, capable of slowing the action down and showing it from different angles.

What we've learned from a couple of years of replay is that there is on average such a call every two or three games. Some are shown to be correct, some are too close to call, some are shown to be incorrect (I wanted to post percentages, but the Replay-O-Meter has disappeared from the Web). In some cases--rarely if ever included in the statistics--the team declines to challenge.

All taken with all, I think replay has worked remarkably well. In baseball, I mean: I really don't care about the other sports.
5:25 PM Nov 24th

flyingfish
Brian: I think they are two different things. To evaluate someone's performance, you need to know if the officiating was fair in general, not unfair in one or two cases. To decide to use instant replay, the decision criterion is different: it is whether there are some cases where the officiating seems to be so egregious that we need to have a method of rectifying it during the game. One can reasonably conclude that the officiating is generally fair, so it's not a factor in evaluating someone's performance, and at the same time there are some egregious miscalls that need to be corrected.
11:30 AM Nov 24th

jollydodger
Brian, MLB making a rule which specifies the earliest a vote can be submitted would be great, and if that date was the last day of the regular season, that'd be better than whenever it is now. But without that rule in place already, we can't complain about the 2016 vote, can we?

If a voter turned in their ballot early with Tanaka on it, but was allowed to do so, tough shit, right? The voter didn't do anything wrong, he was within the rules.

As for a rules change going forward, great idea.
7:29 AM Nov 24th

jollydodger
So the pitcher with the worse defense behind him actually had better defense during the 20% of his team's season that he pitched than the other guy?

Okay. I'm just thankful we can find that out. I'm not sure why this Cy Young vote seems to be getting more attention than any other. There was no obvious choice, which means, there were several acceptable choices, one of which was the winner.

Verlander finished 2nd. Cool. It's not some tragedy. He got the most 1st-place votes and finished 2nd. Great. We knew that was a possibility before the first vote came in.

What's so noteworthy about all this?
7:25 AM Nov 24th

Brian
I was responding to this: "If you are complaining about the refereeing in a game, you can't say "the issue isn't the refereeing; it is those two plays where he called holding (or a charging foul) when the video doesn't show holding (or a charging foul.)" But if you're talking about the refereeing, then the issue isn't about one or two calls; it is about whether the refereeing was overall fair, or whether it changed the outcome of the game."

So it wasn't my analogy.
3:23 PM Nov 23rd

MarisFan61
You don't think that's way different?

The reason for replay review is specifically to correct incorrect calls, so of course it's because of that. It's hard to see even a little piece of an analogy to this.
3:17 PM Nov 23rd

Brian
BTW, it can be argued that the issue that forced instant replay into baseball was not the overall umpiring, but the occasional isolated bad call - such as the Galarraga near-perfect game.
3:05 PM Nov 23rd

Brian
I don't know if I'm supposed to be the person who "keeps saying the issue isn't the vote, but the 2 people who left him off the ballot." If so, I actually only said it once. And a reading of what I actually said indicates that the only problem I had with any of it was the writer who admitted to turning his ballot in before Verlander's last 2 starts, and I believe putting Tanaka on instead. That's just wrong, no matter who wins. Whether or not it caused Verlander to lose is irrelevant to me. As I said, I voted for Britton, and could see any of the 4 (Kluber included) winning. I also said I could even see a ballot without Verlander, if Happ and Sanchez were on it.

I just think it's lazy and wrong to turn in a ballot before the races were decided, and shouldn't be allowed. Does anybody have a serious problem with that proposition?
2:54 PM Nov 23rd

flyingfish
This is one of the more helpful articles and discussions I've been privileged to read on BJOL. Thanks to all. It seems to me--and perhaps I haven't thought about this enough--that the criteria that the voters seem to use for Cy Young are clearer than those for MVP. As an example, I haven't heard any complaints that Porcello won because voters weren't using the criteria that the complainer would have used, but I HAVE heard such complaints about Trout's MVP award.
1:06 PM Nov 23rd

bjames
Responding to the fellow who keeps saying that the issue isn't the vote; it is the two voters who left Verlander off the ballot. . .well, no, the issue is the vote. If you are complaining about the refereeing in a game, you can't say "the issue isn't the refereeing; it is those two plays where he called holding (or a charging foul) when the video doesn't show holding (or a charging foul.)" But if you're talking about the refereeing, then the issue isn't about one or two calls; it is about whether the refereeing was overall fair, or whether it changed the outcome of the game.
11:08 AM Nov 23rd

bjames
Regarding the suggestion that this defensive support adjustment is exactly the same thing as a park adjustment, it seems to me that it is not at all the same.

What we are adjusting for with a park adjustment is the value of the run. One run scored in Citi Field in New York has a different (and greater) impact on the likelihood of a win than one run scored in Coors Field. If bread is cheaper in Mobile, Alabama than in Minneapolis, we adjust the value of money for the cost of bread. This is that kind of adjustment; it has to do with the value of a run.

But an adjustment for defensive support has nothing at all to do with the value of a run; it is a completely different TYPE of adjustment, and has to be thought through in a different way. The value of the run is the same whether it is allowed by the pitcher or the defense. The issue THERE is how many extra runs the defense may have prevented or allowed.
10:44 AM Nov 23rd

bjames
Well, suppose that there was a pitcher who pitched for a team with a poor offense, scoring only 3.80 runs a game for the season, but that we know that, FOR THIS PARTICULAR STARTING PITCHER, they happen to have scored Six runs a game. Would we pay any attention at all to the GENERAL offense of the team, or would we be justified in focusing 100% on the offense of the team in the games which this pitcher started?

Would it be a reasonable argument to say that, even though this particular pitcher benefitted from six-run-a-game offense, we should still put a 75% or 80% weight on the GENERAL offense of the team?

Why is this different? If we cannot demonstrate that Verlander HIMSELF suffered from a poor defense, why would we be justified in applying ANY adjustment for that issue at all, based on things that may have happened to other pitchers on his team?
9:51 AM Nov 23rd

CharlesSaeger
Bill, some of the park adjustment will be on balls in play. You need to take that into account.
7:50 AM Nov 23rd

CharlesSaeger
Hey, mgl, do you have a link to that The Book thread? I've been developing a newer version of CAD and comparing to UZR, DRS, and Total Zone with Hit Location. The first two don't have as strong a relationship with the traditional fielding stats as TZ-Hits (I'm not using the older TZ for my sample), and DRS in particular has some weird ratings, weird enough that I don't trust it for outfielders. I'd like a better understanding of why, if nothing else; I've attributed the gap with UZR and the traditional stats (PO, A, E, you know the drill) with the fact that UZR excludes many plays as being not meaningful to a fielder's rating.
7:40 AM Nov 23rd

steve161
Good point, Maris, on empty magnification: it follows from what MGL says about the confidence intervals on defense. How large are they? They are rarely if ever published, certainly not in The Fielding Bible.
6:46 AM Nov 23rd

MarisFan61
BTW (side point), I think that citing the decimal points in Win Shares goes against what Bill thinks of stuff like that, unless he's changed his mind, and I hope he hasn't.

I think it's a bad idea, and not mainly because the decimals are probably meaningless but because it seems to be claiming a precision for the system that doesn't exist.
2:27 AM Nov 23rd

brewer09
WIN SHARES

Verlander, 20.2
Kluber, 19.6
Porcello, 19.3
Britton, 19.3
Miller, 18.9

All of these pitchers are so close that a reasonable case can be made for any of them. But Verlander is still first.
2:02 AM Nov 23rd

mgl
Just to be clear, if we didn't know Verlander's and Porcello's BABIP, then it would be 100% correct to do the type of adjustment that B-R did with respect to the teams' respective defense. It is exactly the same thing as doing a park adjustment.

Given that we DO know their respective BABIP, do we want to ignore the team DRS or UZR? No. We still want to use it, but now it becomes a Bayesian problem to estimate the effect of the defense on each pitcher. Tango estimates (and I don't know if he is right, off the top of my head) that we actually want to assume that Verlander had a slight edge in defense when he pitched even though we think his team was much worse in defense over the season.

There is also the chicken/egg problem. Given that pitchers have SOME influence on BABIP (certainly much less than the defense) we have to assume that Verlander himself played some part in his low BABIP even though he had a poor defense behind him (but maybe they didn't play poorly when he was on the mound or didn't even have a chance to play poorly or well because they got lots of routine balls).

That being said, one thing Tango and I (and others) discussed on the thread on The Book blog was that the numbers that Fangraphs uses for team defense (DRS and UZR) need to be shrunk anyway if we want an estimate of how the team defense actually did in 2016. The methodology that each system uses is flawed as I describe in the thread. So really that 104 run gap should probably be more like 75 or 80.
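
To put rough numbers on that shrinkage: a minimal sketch, assuming the adjustment is a simple linear regression toward zero with a single factor. The function names and the idea of one uniform factor are illustrative assumptions, not the method used by DRS, UZR, or anyone in the thread; the factor is just back-solved from the 104 and 75-to-80 figures above.

```python
def shrink(observed_gap, factor):
    """Regress an observed run gap toward zero by a factor between 0 and 1.

    The single linear factor is an assumption made for illustration only.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    return observed_gap * factor


def implied_factor(observed_gap, shrunk_gap):
    """Back-solve the factor that turns observed_gap into shrunk_gap."""
    return shrunk_gap / observed_gap


observed = 104  # reported team-defense gap, in runs
low_factor = implied_factor(observed, 75)   # about 0.72
high_factor = implied_factor(observed, 80)  # about 0.77

print(f"implied factor range: {low_factor:.2f} to {high_factor:.2f}")
print(f"gap at the midpoint factor: {shrink(observed, (low_factor + high_factor) / 2):.1f}")
```

In other words, calling the gap "more like 75 or 80" amounts to keeping roughly three quarters of the reported difference and discarding the rest as noise.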

One more thing. When a certain metric has a relatively low certainty, i.e., the confidence interval is large (as with defense), as is the case with many "implicit" metrics, it is OK to give them less weight even if they are deemed accurate on the average. For example, let's say that a pitcher has a BABIP of .310 AND his team had a terrible UZR or DRS. It would seem to be OK to adjust his RA9 or ERA downwards and it is. It is likely that his poor BABIP and thus RA was largely due to a poor defense. But we don't know that for sure. Maybe they played OK defense behind him and he just allowed lots of hard hit balls so that we shouldn't adjust his RA9 or ERA downward. Well, even though the downward adjustment is correct to do from a mathematical perspective, it is OK to give more weight to something that we are more certain of, like a pitcher's W/L record or his K or BB rate, etc.
11:21 PM Nov 22nd

MarisFan61
As I said (actually diatribed) under Dave Fleming's article, I was completely puzzled by the vote, maybe because I don't much look at "WAR" either. (I mean, I do look at it, but don't take it as other than a rough indicator.) It was hard for me to see why Porcello wasn't a near-unanimous pick.

About the historical shifts: To me it seemed the watershed moment, more than anything about who won the award in any year, was 1987, when Nolan Ryan got some tangible support despite going 8-16.
I attributed it mainly to a growing awareness of Bill's work (by then, lots of people had been reading the Annuals for a few years), maybe aided by Palmer & Thorn's book.
7:41 PM Nov 22nd

steve161
wovenstrap, people would have talked more about Kluber if he had a hot girl friend with a Twitter account. The squeaky wheel and all that.

Tango, thanks for the link. The discussion following the article is outstanding. I have one comment: MGL talks about the offensive side of the argument and notes that everybody likes pitches down the middle and nobody likes pitches on the black. True, but some hitters are good at taking pitches on the outside black the other way and some ('dead pull hitters') are not. Just one more complication in a sea of them.
4:33 PM Nov 22nd

DaveNJnews
Brian, even if the 2 writers who omitted Verlander had him on the ballot in fifth place, he wouldn't have won.
4:18 PM Nov 22nd

tangotiger
Just as a footnote, here is Poz's article:

joeposnanski.com/porcello-v-verlander/

And this is my little bit:

tangotiger.com/index.php/site/comments/pre-statcast-lab-how-much-different-fielding-support-do-pitchers-on-the-sam

Though I would urge you to also read the comments by MGL.
2:49 PM Nov 22nd

wovenstrap
I don't really understand why the word 'Kluber' isn't mentioned in this article, given that Kluber has the best NOWOL in the AL in 2016.

I'm an Indians fan. When the season ended and Kluber hadn't reached 20 wins, I shrugged and figured Porcello would win, even though Porcello had a lot more run support. Then the vote was announced, and all the talk was about Verlander.

Kluber finishing 3rd is a bigger scandal than Porcello winning the Cy Young only by a little bit.
1:41 PM Nov 22nd

ksclacktc
OldBackstop
...and you understate it.

I'm not so sure.

If I use his Season Score, park-adjust ER/RA, and add one little twist, which is to count ERA and FIP at 50/50 like Tango does, I get the list below. BTW, this helps with the fielding thing. Everything else stays the same.

Player SCORE
Rick Porcello 314
Corey Kluber 262
J.A. Happ 253
Justin Verlander 239
David Price 236
Aaron Sanchez 231
Chris Sale 227
Masahiro Tanaka 223
Cole Hamels 214

Of course this includes Wins/Losses, but even looking at Bill's NOWOL below it is very close! I think Bill is correct that the defensive method is the issue.

Player NOWOL
Corey Kluber 163
Rick Porcello 158
Justin Verlander 156
David Price 145
Chris Sale 141
Masahiro Tanaka 131
Jose Quintana 124
Aaron Sanchez 121
Cole Hamels 119

1:08 PM Nov 22nd

Brian
I'm a Tigers fan. Had Verlander over Porcello, but voted for Britton over both. As far as I'm concerned, any of those 3 plus Kluber would have been good choices.

I think the issue with Verlander, though, was the 2 writers who completely left him off their ballot. Even that is defensible, because Happ and Sanchez were out there. However, one writer admitted that he turned his ballot in early. I believe that writer also had Tanaka on his ballot. He should lose his vote. No, that's probably too strong, but there should be a rule that no ballots can be turned in until all playoff positions are set.

Either way, I don't feel bad for Verlander. It seems like he has a pretty good life. He'll get by without this award.

Now, Alan Trammell and the 1987 MVP race, on the other hand...

12:24 PM Nov 22nd

OldBackstop
...and you understate it. The full standings were:

Verlander 6.6
Corey Kluber 6.4
Masahiro Tanaka 5.4
Jose Quintana 5.2
Hamels 5.0
Chris Sale 5.0
Porcello 5.0

So, that may not only be instructive on the binary decisions voters made, but also how others slot into the standings.

In the broader topic of how voters thank 'bout stuff, I think an analogy might be made to the ages of MVPs of late, discussed in a Reader's Post thread cleverly entitled "MVP's ages."

The last 21 MVPs were 30 and under....this compares to a stretch in the early '90s where 8 out of 12 NL MVPs were 30 or older, even after taking Bonds and Sosa out.

The question, I guess, is whether the power of sabermetrics caught up with W/L in the Cy Young race around the same time that it caught up with name recognition/established stars in the MVP race.....
12:15 PM Nov 22nd

OldBackstop
Hi Bill,

I think Joe has a point about the BBR WAR
11:58 AM Nov 22nd

steve161
As Winston Churchill said, "Jaw, jaw is better than war, war." (Using his accent, they rhyme.) What that means here is that simply relying on a number without working out what it means is lazy and simple-minded.

I thought Porcello's best case was his season-long consistency. Verlander's first half was undistinguished; his second half was stellar. But overall the W-L record reflects that consistency.
11:49 AM Nov 22nd

Porlander and Vercello

COMMENTS (28 Comments, most recent shown first)
