The MVP Vote Bias Detector Part 3

By Bill James

November 23, 2011

Other Elements That Can Also be Tested

In addition to batting statistics, I also tested the MVP voting performance of seven other factors. Two of those I have already discussed; those are the effects of playing a key defensive position, and the effects of team performance. The other five factors that I analyzed were:

a) Youth versus age,

b) The impact of the Run Environment,

c) The effects of playing in a bigger city,

d) Pitchers vs. Non-Pitchers, and

e) The impact of race.

Let’s start with race, since that gets a clear non-result. In an interview someplace, Henry Aaron explained his winning only one MVP Award by saying that when he came into the league, the writers would find some way to give the award to a white player if there was any way they could. I ridiculed this comment by listing the actual MVPs in the first nine years that Henry Aaron was in the National League: Willie Mays, 1954 (black), Roy Campanella, 1955 (black), Don Newcombe, 1956 (black), Aaron, 1957 (black), Ernie Banks, 1958 (black), Ernie Banks, 1959 (still black), Dick Groat, 1960 (white), Frank Robinson, 1961 (black), and Maury Wills, 1962 (black). Eight black MVPs in the first nine years that Aaron was in the league.

Of course, these players did not win the MVP Awards because they were black. They were great players. The only year in there in which the MVP perhaps should have been a white guy was 1956, when Duke Snider may have been more valuable than Don Newcombe, but Snider had a "down" year in terms of RBI, which probably killed any chance he had to win the award. Plus he walked too much. He led the league in walks. Voters don’t like that.

But when you get into the 1960s, the picture does change. In the 1960s there actually are several MVP Awards that were won by white players in preference to black players with more Win Shares. In 1963 Sandy Koufax won the award, when the best player in the league by Win Shares was Aaron (Aaron 41, Koufax 32). In 1964 the MVP Award went to a white player, Ken Boyer, although a black player (Dick Allen) appears to have been far better (41-28). In 1965 the American League MVP Award went to Zoilo Versalles with 32 Win Shares, although a teammate (Tony Oliva) had 33 Win Shares. Versalles and Oliva were both Cuban, but Oliva was much darker than Versalles, so I suppose we could count that one. In 1969 the American League MVP Award was won by a white player, Harmon Killebrew, but the best player in the league was probably a black guy (Reggie). 1973 National League, Rose vs. Morgan; Rose was white and won the award, Morgan was a better player, but black. In 1981 and 1985 the best player in the American League was probably Rickey Henderson, but the MVP Award went to white players (Rollie Fingers and Don Mattingly.)

Still, on balance, there is no evidence whatsoever of a black/white bias in MVP voting. The winning percentage of black MVP candidates against white ones is .506 (160-156). Black players have won 19 MVP awards in seasons when white players had better seasons. Jimmy Rollins won the award when David Wright was probably better. Sammy Sosa won the MVP Award in 1998 although his home-run hitting rival, Mark McGwire, had more real value; Sosa was black at that time. Mo Vaughn won over Edgar Martinez in 1995; Frank Thomas won over John Olerud in 1993.

White players have won 18 MVP awards in seasons when black players were better; we mentioned several of those earlier. Justin Morneau won over Derek Jeter. Taking all awards as a group, there isn’t the slightest evidence of a bias going in either direction.

That’s the only one of this package that gets no finding. On the issue of the value of a pitcher, let me re-explain the method just to be sure we’re all on the same page. In 1931 Lefty Grove won the first American League MVP Award. Grove also led the league (and led the majors) in Win Shares, so that’s a non-event, no evidence of bias for or against pitchers.

In 1933 Carl Hubbell won the National League award. Hubbell had 33 Win Shares; Wally Berger had 36. We count that as 6 "wins" for "Pitchers" in "Pitchers vs. Hitters"—3+3. In 1934 Dizzy Dean won the NL award. Dean had 37 Win Shares, Mel Ott had 38, so we count that as 4 wins for pitchers, 3+1.

In the American League in 1935, on the other ear, Hank Greenberg won the AL award with 34 Win Shares, although Wes Ferrell, a pitcher, had 35. We count that as 4 wins for "hitters" in the Pitchers vs. Hitters competition. In 1940 Greenberg won the award again, although, once again, we believe that a pitcher (Bob Feller) was actually better (34-31). We count that as 6 wins for hitters (because a hitter won the award), and that makes us even at 10-10.
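The scoring in those four examples can be sketched in a few lines of Python. This is only an illustration of the rule as it appears from the "3+3" and "3+1" arithmetic above: a flat 3 wins for a disputed award, plus the Win Shares margin of the player who led the league but lost the vote. The names `BASE_WINS` and `bias_wins` are made up for this sketch; they are not taken from the research file.

```python
# Assumed rule, inferred from the worked examples in the text:
# a disputed award is worth 3 wins plus the Win Shares margin
# of the league leader over the actual award winner.
BASE_WINS = 3


def bias_wins(winner_ws: int, leader_ws: int) -> int:
    """Wins credited to the award winner's side in one disputed contest.

    Only meant for contests where the winner and the Win Shares leader
    are different players (a tie in Win Shares still counts for 3).
    """
    if leader_ws < winner_ws:
        raise ValueError("the Win Shares leader cannot trail the winner")
    return BASE_WINS + (leader_ws - winner_ws)


# (side that won the award, winner's Win Shares, leader's Win Shares)
contests = [
    ("pitchers", 33, 36),  # 1933 NL: Hubbell over Berger
    ("pitchers", 37, 38),  # 1934 NL: Dean over Ott
    ("hitters", 34, 35),   # 1935 AL: Greenberg over Ferrell
    ("hitters", 31, 34),   # 1940 AL: Greenberg over Feller
]

tally = {"pitchers": 0, "hitters": 0}
for side, winner_ws, leader_ws in contests:
    tally[side] += bias_wins(winner_ws, leader_ws)

print(tally)  # {'pitchers': 10, 'hitters': 10}
```

A case like Grove in 1931, where the winner also led in Win Shares, never reaches this function; it is simply skipped as a non-event.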

That was the last time pitchers and hitters were even in this contest. The last time a pitcher who was the "true" MVP failed to win the MVP award was 1972, when Johnny Bench won the National League award although Steve Carlton—27-10 with a last-place team—may have been the most valuable player in the league; anyway Carlton leads in Win Shares, 40-37. Not counting the 2011 award to Verlander, the won-lost record of pitchers in pitcher vs. hitter MVP competitions is 152-22, an .874 winning percentage. If we did count the 2011 award, it would be 166-22, an .883 percentage. It appears to us that MVP voters have significantly over-valued and over-rated pitching, giving pitchers several more MVP awards than they deserved.

You can’t take that conclusion too seriously. We have to judge our own statistical calculations by the standard of their ability to convince an honest skeptic. If a skeptic were to argue that it was we who had under-valued pitchers, rather than the MVP voters who had over-valued them, the logical loop that we would have to lead him through to show him why we had reached our conclusion would be so long, and the potential jumping-off points so numerous, that our argument would have little ability to persuade. We have done the best we can to properly value pitching, and we think we’ve got it about right, but. . .we’ll just have to leave it at that.

A similarly "clear-but-ambiguous" result comes from the study of Big City vs. Small City MVP competitions. Let’s go back to the 2011 awards. In the American League the MVP Award winner (Verlander) and the "true" MVP (Cabrera) played on the same team. Those are the only two candidates that we consider—the elected MVP and the player with the most Win Shares—so we would ignore that contest in this calculation.

In the National League, on the other hand, Ryan Braun and Matt Kemp earned 37 Win Shares each; thus, the victory for Braun over Kemp can be looked at as evidence of bias. Kemp plays in a very large city, Braun in a much smaller one, so we would count this as 3 "wins" for Small Cities in the Big Cities vs. Small Cities competition. The evidence of bias in this one example, if there is any such evidence, would be bias in favor of players who didn’t play in big cities.

Let’s walk it back a little bit. In 2010 Joey Votto won the National League MVP Award with 33 Win Shares, although Adrian Gonzalez of San Diego had 35. San Diego is a much, much bigger city than Cincinnati, so this, again, would be evidence of bias in favor of the small city, as opposed to the large city.

In the American League in 2010 Josh Hamilton won the MVP Award with 30 Win Shares, although Robinson Cano of New York and Jose Bautista of Toronto had 34 Win Shares each. Dallas is a huge city, but both New York and Toronto are even bigger than Dallas, so, again, the bias in the vote, if there was one, favored the small city. We give Small Cities 3 wins for Braun over Kemp, 5 Wins for Votto over Gonzalez, and 7 Wins for Hamilton over Cano and Bautista.

In 2008 Albert Pujols won the National League MVP Award, although Lance Berkman had more Win Shares (36-34). Houston is bigger than St. Louis, so, again, that’s an entry for the Small Cities. On the other eyebrow, Dustin Pedroia won the American League award in 2008 over Joe Mauer although Mauer had more Win Shares (30-26), and Boston is bigger than Minneapolis and St. Paul, so that one goes to the Big Cities.

Over the history of the vote—I am surprised to see this, but—over the history of the vote, there is some evidence of a bias in favor of the players who play in Big Cities. The score for all 160 elections is Big Cities 440, Small Cities 326, a .575 winning percentage for the Urban Monsters. (Adding Braun’s victory to the count would change the total, but not really. . .it’s just 3 wins.) There have been 45 MVP awards that went to Big City players when players in smaller cities were having better years. A few of these include the National League in 2006 (Ryan Howard vs. Albert Pujols, Philadelphia over St. Louis), the National League in 1998 (Sammy Sosa over Mark McGwire, Chicago over St. Louis), the American League in 1998 (Juan Gone over Albert Belle, Dallas over Cleveland), the American League in 1996 (Juan Gone over A-Rod, Dallas over Seattle), the American League in 1995 (Mo Vaughn over Edgar Martinez, Boston over Seattle), the National League in 1988 (Kirk Gibson over Will Clark, Los Angeles over San Francisco), the National League in 1987 (Andre Dawson over Tim Raines, Chicago over Montreal), the American League in 1987 (George Bell over Alan Trammell, Toronto over Detroit), the American League in 1984 (Willie Hernandez over Cal Ripken, Detroit over Baltimore), the American League in 1976 (Thurman Munson over George Brett, New York over Kansas City), and the National League in 1974 (Steve Garvey over Mike Schmidt, Los Angeles over Philadelphia.)

I thought, heading into this study, that I would have an issue with determining which city was really "bigger"—you can get different reports from using different definitions of a metropolitan area—but, in fact, there were almost no cases in which that turned out to be a real issue, might have been one or two. I thought I might have a cross-correlation issue with awards going to bigger cities because bigger cities more often win the pennant, and I can’t say for certain that there was no such cross-correlation problem. It may be that there was. It isn’t evident. Looking over the list above. . .George Bell didn’t win in 1987 because of Toronto’s performance. Detroit won that race; Toronto finished second. Andre Dawson didn’t win in 1987 because of the Cubs’ performance; the Cubs finished last. Mo Vaughn didn’t win in 1995 because of Boston’s performance; the Red Sox won their division that year, but so did Seattle. Juan Gonzalez didn’t win in 1998 because of Texas’ performance; Texas won their division that year, but so did Cleveland. Thurman Munson did not win in 1976 because of team performance; the Yankees won their division that year, but so did Kansas City. It just is not evident that this is a cross-correlation because bigger cities win more often.

I am surprised at this finding. The MVP voting is "controlled"—two ballots to each city—to prevent the voting from being dominated by big-city writers. I had always believed, and I had expected to find, that there is no Big City bias in the voting. The example I always use to demonstrate the lack of bias is the Mets. The Mets have never had an MVP Award winner, even though they have had quite a few good teams and some players who might well have won the Award, like Seaver in 1969. If anybody would benefit from a Big City bias in MVP voting, wouldn’t you think it would be the Mets? I thought there was no bias. Maybe there is; maybe there isn’t—but Big City players have stolen awards from Small City players more often than has happened the other way around.

Environmental bias. . .by "environmental bias", what I mean is the park helping the player’s statistics. As long as we’re comparing hitter vs. hitter, this is just asking "did the player who played in the better hitter’s park win?" When you’re comparing a pitcher to a hitter, then you have to flip it around and ask who was working in a more "friendly" environment: that pitcher or that hitter. Using the 2011 season for illustration again (although this research was completed before the 2011 awards were announced), compare Verlander vs. Miguel Cabrera. Comerica Park in 2011 functioned as a hitter’s park, which means that it helped Cabrera pile up impressive numbers, but hurt Verlander. In spite of that, Verlander won the award—so that counts against the environment. On the other hand, Miller Park is a better hitting park than Dodger Stadium, so the victory of Braun over Kemp is a victory for the environment.

The surprise here is to discover that, in the first two decades of MVP voting, the environment did not help players at all. In the first 21 years of MVP voting (1931-1951), the won-lost record of the environment was 87-129, a .403 winning percentage. Players who played in "positive" environments tended not to win the Award.

In the early years of MVP voting, voters were so willing to toss out hitting statistics, and so inclined to give the award to the best-hitting up-the-middle performer on the championship team, whoever that was, that they were not impacted at all, on balance, by players playing in big-hitting parks. I was very surprised by that.

Since 1952, this has not been the case. Since 1952 the won-lost record of the player in the better run environment is 420-182, a winning percentage just short of .700. Including the early data, the winning percentage of run environments in MVP voting is .620 (507-311).
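For anyone who wants to check that arithmetic, here is a minimal sketch that adds the two eras together and recomputes the winning percentages. The `Record` class is a made-up helper for this illustration; the won-lost figures are the ones quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A won-lost record for 'the environment' in MVP contests."""
    wins: int
    losses: int

    def __add__(self, other: "Record") -> "Record":
        # Combining two eras is just adding wins to wins and losses to losses.
        return Record(self.wins + other.wins, self.losses + other.losses)

    @property
    def pct(self) -> float:
        return self.wins / (self.wins + self.losses)


early = Record(87, 129)    # 1931-1951, as quoted in the text
modern = Record(420, 182)  # since 1952, as quoted in the text
overall = early + modern

for label, rec in [("1931-1951", early), ("since 1952", modern), ("overall", overall)]:
    print(f"{label}: {rec.wins}-{rec.losses}, {rec.pct:.3f}")
# 1931-1951: 87-129, 0.403
# since 1952: 420-182, 0.698
# overall: 507-311, 0.620
```

The two eras do sum to the 507-311 total, so the split and the overall figure are consistent with each other.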

Citing a few recent cases to illustrate the point. . .Joey Votto won over Adrian Gonzalez in 2010. Votto was playing in a much better hitter’s park. Dustin Pedroia won over Joe Mauer in 2008. Pedroia was playing in a better hitter’s park. Jimmy Rollins won over David Wright in 2007. Rollins was playing in a better hitter’s park. Ryan Howard won over Albert Pujols in 2006. Howard was playing in a better hitter’s park. Justin Morneau won over Derek (Dirty Rotten) Jeter in 2006. Morneau was playing in a better hitter’s park.

Between 1995 and 1999 there were nine awards that went to a player in a better hitter’s park, in preference to a better player in a worse hitter’s park. There were no awards (in that era) that went the other way. Three of those were the awards to players in Texas (Juan Gonzalez and Ivan Rodriguez). Larry Walker won an MVP award playing in Colorado. The MVP Awards to Barry Larkin, Sammy Sosa and Ken Griffey Jr. were facilitated by their playing in better hitter’s parks than their MVP competitors.

In other respects I have noted that we have gotten better at finding the "real" MVPs. Voters are no longer as likely to be thrown off course by the performance of the team, no longer as likely to romanticize and exaggerate defensive value. In this respect, I should note that we have gotten worse. Voters in the last twenty years have actually done a worse job of filtering out the environmental biases in the statistics than in any other era. Which I’m really not surprised by; I’ve noticed in other areas, like Cy Young votes, that people don’t seem to be paying any real attention to park effects.

We come, finally, to the issue of younger players versus older players. This was the second-biggest surprise in the voting, behind the accidental discovery of the importance of strikeouts in MVP contests.

I have always believed that MVP voters tended to be attracted by shiny objects—to state it in the most negative way—and thus tended to discriminate in favor of players who surprised them, and against players who just played at their usual level. I formed this opinion in the 1970s, based in part on the example of Steve Garvey. Steve Garvey had the same statistics every year—200 hits, 35 doubles, 25 homers, 110 RBI, 35 walks, .315 batting average. The first time he did that he won the MVP Award, because nobody had expected him to play like that, so everybody focused on him. The longer he kept doing the same thing, the less attention anybody paid to him.

Terry Pendleton won the MVP Award in 1991 not because he was great, but because he was so much better than anybody expected him to be. So did Ken Caminiti in 1996. Frank Thomas had a fantastic year in 1996 (other league), hitting .349 with 40 homers, 134 RBI, .459 on base percentage, 1.085 OPS, but, because he had played at that level for several seasons, nobody paid any attention. He actually finished 9th in the MVP voting—with a season that was just as good as his MVP seasons of two or three years earlier.

Miguel Cabrera in 2011 is actually a really good example of this. Cabrera had a remarkable season, but few people even talked about him as an MVP candidate, because. . .well, Miguel Cabrera always has fantastic seasons. Jacoby Ellsbury also had a fantastic season, but he finished second in the MVP voting because in his case it was unexpected. Mickey Mantle in the years 1958-1961 and Stan Musial in the years 1949-1952 are further examples; they were just doing what everybody expected them to do, so other players—and lesser players—were given the MVP Awards. Josh Hamilton in 2010. . .he won the MVP Award because nobody had expected him to be so good. Fred Lynn was actually better in 1979 than he was in 1975, but he won the MVP Award in 1975 because he caught everybody by surprise. A moving object draws the eye.

Because the MVP voters are attracted to players who play much better than they were expected to play, they tend to discriminate in favor of young players and against older players; this was what I thought before doing this study.

Dead wrong, actually. I’m not giving up on the idea that there is a "surprise effect" in MVP voting—I think obviously there is—but it clearly does not translate to a bias in favor of younger players. In the early years of the voting—again, I did not know this until I did this study—but in the early years of the voting there was a strong, strong bias in the voting in favor of older players, and against younger ones. Let me run down a few of those:

In 1931 Frankie Frisch (aged 32) won the MVP Award over Wally Berger (aged 25) although Berger was actually a much better player. Berger actually was not mentioned in the voting in 1931; 29 players were mentioned in the voting, but Berger—the best player in the league—was not, in part because he played for a bad team, and in part because he played in a pitcher’s park.

In 1935 Gabby Hartnett (aged 34) won the MVP Award over Arky Vaughan (aged 23), although Vaughan was obviously much more valuable.

In 1937 Charlie Gehringer (aged 34) won the Award over Joe DiMaggio (aged 22), even though DiMaggio’s team won the pennant, and DiMaggio drove in 167 runs.

In 1940 Hank Greenberg (aged 29) won the MVP Award over Bob Feller (aged 21), although Feller had gone 27-11 and was in my opinion at least equally deserving of the award.

In 1941 Dolph Camilli (aged 34) won the MVP Award over his teammate Pete Reiser (aged 22), even though Reiser was probably a better player, and would for years after that be lionized by writers as the greatest young player the game had ever seen.

Ted Williams lost an MVP contest that he might have won in 1941 and a contest that he should have won in 1942 to Joe DiMaggio and Joe Gordon, both of whom were four years older than Williams.

In the National League in 1942 the "true" MVP was Enos Slaughter, who was 26. The award went to his teammate Mort Cooper, who was 29.

From 1931 to 1959 disputed awards almost always went to the older player. From 1931 to 1959 the won-lost record of the older players in MVP contests was 291-64, an .821 percentage. (I ignored contests in which one player was just months older than the other.)

This began to change about 1960, when Maris took the two awards from Mantle (although, since Berra had also taken two awards from Mantle, this may have nothing to do with ages, and more to do with The Mick.) But since 1960 age has been fairly even in MVP voting, with only a minor bias in favor of established players. Through 2010 the record of the older player in MVP contests was 584-311, a .652 percentage. There have been 54 MVP Awards that were won by an older player over a younger player who was better. There have been 38 Awards that went the other way.
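One way to see how much of the all-time edge comes from the early years is to back the 1931-1959 record out of the total. This is a rough sketch only, and it rests on an assumption the text does not spell out: that the "through 2010" record of 584-311 covers every contest from 1931 on, so that subtracting the 291-64 record leaves the contests from 1960 through 2010. The function and variable names are made up for this illustration.

```python
def winning_pct(wins: int, losses: int) -> float:
    """Winning percentage for a won-lost record."""
    return wins / (wins + losses)


# Records for the older player, as quoted in the text.
all_time = (584, 311)  # ASSUMED to cover 1931 through 2010
early = (291, 64)      # 1931 through 1959

# Under that assumption, whatever is left over belongs to 1960-2010.
later = (all_time[0] - early[0], all_time[1] - early[1])

print(f"1960-2010 (derived): {later[0]}-{later[1]}, "
      f"{winning_pct(*later):.3f}")
# 1960-2010 (derived): 293-247, 0.543
```

If the assumption holds, the older player's edge since 1960 is on the order of .540, which fits the description of a minor bias in favor of established players.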

I think what happens in some cases is that it just takes a little time for people to focus on a young player and accept that he is the player that he is, particularly if his skills are a little bit subtle. In 1973 Joe Morgan was the best player in the National League, with 26 homers, 67 stolen bases and 111 walks, but it took people a couple of years to catch up to the kind of player he had become. Mike Schmidt became a great player in 1974 (like Garvey), but didn’t start winning MVP Awards until six years later. Carl Yastrzemski was perhaps the best player in the American League by 1963, but didn’t win an MVP Award until 1967. Reggie Jackson was the best player in the American League in 1969, but didn’t win the MVP Award until 1973; I would compare Reggie in 1969 to Joe DiMaggio in 1937. George Brett was the best player in the American League in 1976, didn’t win an MVP Award until 1980. Dick Allen was the best player in the National League in 1964, didn’t win an MVP Award until 1972. Rickey Henderson was the best player in the American League in 1981, didn’t win an MVP Award until 1990. Some players, it just takes time for the writers to focus on what they can do, and these offset the cases like Garvey and Lynn, who win the Award in their first big season.

Thank you all for reading, and enjoy your holiday. I have posted my research file in the Stats Depository (look under Stats up above), so that you all have a chance to find my mistakes. Thanks. Bill

COMMENTS (14 Comments, most recent shown first)

tbell
Regarding older candidates beating out more deserving younger ones ... might be a "Career Achievement Award" factor pushing votes to the older guy, as opposed to the younger guy who is probably viewed as having his whole career still ahead of him.

I wonder what the winning percentage is of those who had already won an MVP vs. those who hadn't.
12:35 PM Dec 3rd

Brian
So when Trammell lost the MVP award in 1987, not only was he the better player but the most important biases (better team, defensive position) should have worked in his favor.

Go figure.

I wonder which of these biases carry over to Hall Of Fame voting...

8:00 PM Nov 26th

CharlesSaeger
So, did you come up with an MVP predictor?
2:39 PM Nov 26th

bjames
However, still responding to SFisher's suggestion, I studied all votes comparing the performance of the MORE WESTERN performer against the more eastern. This involves all kinds of borderline and irrelevant calls about which is more eastern, Montreal or Philadelphia and which is more eastern, Seattle or Oakland, but. . .somebody is always further West.

And the guy who is further west usually wins, actually. The "more western" candidate has a won-lost record in MVP voting of 496-302, a .622 winning percentage. This appears to be an accident, although one never knows. In the first 8 contests, all eight were won by the player further to the West (Frankie Frisch, St. Louis, against Wally Berger, Boston; Chuck Klein, Philadelphia, against two guys from Brooklyn and New York; Carl Hubbell, New York, against Wally Berger, Boston (New York is to the West of Boston); Mickey Cochrane, Detroit, against Lou Gehrig, New York; Dizzy Dean, St. Louis, against Mel Ott, New York; Hank Greenberg, Detroit, against Wes Ferrell, Boston; Gabby Hartnett, Chicago, against Arky Vaughan, Pittsburgh; and Charlie Gehringer, Detroit, against Joe DiMaggio, New York). Then you pile on the New York-against-Ted Williams awards, and the West wins handily.

It's probably coincidence, but it COULD be a manifestation of the same anti-East paranoia that you credit to your father. If people in the midwest in the 1930s felt the same way about them durned East Coastians that your father did, they could have voted against the East Coast and created a bias going the other way.
1:50 PM Nov 25th

bjames
Responding to Fisher. . .there clearly is no bias in favor of East Coast Players against West Coast. There are relatively few MVP contests that pit an actual East COAST player against a West COAST player. More than half of those have been won by the West Coast Player; I refer you to Vida Blue against Bobby Murcer, 1971, Steve Garvey against Mike Schmidt, 1974, Don Baylor against Fred Lynn, 1979, and Vladimir Guerrero against Gary Sheffield, 2004. Actually, I think those may be the only four award contests that actually pit an East Coast Player against a West Coast player. Broadening the definition of the East Coast gradually to include everything East of Denver, we can find some awards taken by Eastern-bloc teams from West Coast teams. . .for example, Killebrew over Reggie, 1969, Joey Votto against Adrian Gonzalez, 2010, Juan Gonzalez against Alex Rodriguez, 1996, Barry Larkin against Barry Bonds, 1995, Mo Vaughn against Edgar Martinez, 1995, Rollie Fingers against Rickey Henderson, 1981, and Roberto Clemente against Willie Mays, 1966. However, even then it's just a wash; I have it at 104 points for the East Coast to 102 for the West Coast.
1:40 PM Nov 25th

bjames
Responding to hots. . .yeah, you and Stephen Colbert; you don't see race.
12:31 PM Nov 25th

hotstatrat
"White players have won 18 MVP awards in seasons when black players were better . . . Justin Morneau won over Derek Jeter."

I did a double take when I read this. Did I read it right? It took me about 10 seconds to remember Jeter has a parent who is "black", which by most people's standards I guess qualifies him. I'd love to come back to this earth in 100 years and see how people label each other.
10:13 PM Nov 24th

sfisher
How about a bias favoring east coast teams? We live in California and my father has always been upset about the "east coast bias".
5:47 PM Nov 24th

hotstatrat
Good point, though, rgregory. I, too, suspect there is a bias in those directions.
- Hoot
4:05 PM Nov 24th

golden82
Good article, one error tho: in 1998 Albert Belle played for the White Sox not the Indians.
9:35 AM Nov 24th

exegesis
Very interesting articles, thanks!

7:19 AM Nov 24th

bjames
Responding to rgregory. . .I don't see how I could test those with this model, since those are one-sided issues.
5:28 PM Nov 23rd

rgregory1956
Hey Bill, two other biases I have noticed are playing for a new team (like Maris in 1960 and Dawson in 1987) and playing for a team that came out of nowhere, a team that improves dramatically from the previous year (Groat in 1960 and Robinson in 1961). Terry Pendleton had both of these factors working for him in 1991.
5:06 PM Nov 23rd

bjames
Typo there. . ..Dick Allen won his MVP Award in 1972, not 1982. Although he would still have been in the majors in 1982 if he had taken better care of himself and not pissed off the establishment at a historic pace. Wish I knew how to fix these typos in the body of the text.
2:04 PM Nov 23rd
