By Bill James

June 18, 2013

**Quick Hit 1**

**Manny Machado**

Manny Machado has hit 31 doubles through the Orioles’ first 71 games. At that pace he would hit about 71 doubles this season, which would break the major league record of 67 doubles in a season (Earl Webb, 1931). What is the chance that he will break that record?

As best I can estimate it, it’s about 3%. Although Machado has been hitting doubles at an incredible pace, he is only 6 games ahead of the record pace at this writing. He would still have to hit 37 doubles in 91 games to break the record—a formidable assignment.

To estimate his chance of breaking the record, we need to estimate two things:

1) The number of at bats he will get the rest of this season, and

2) How many doubles he should hit per at bat.

Machado has 307 at bats through 71 games. At that pace he would have 700 at bats on the season. Only four players in history have had 700 at bats in a season, so that’s asking a lot.

Let’s assume there is an array of possible numbers of at bats for Machado in the rest of the season, like this:

390 more at bats 30%

380 more at bats 10%

365 more at bats 10%

350 more at bats 10%

335 more at bats 10%

320 more at bats 10%

305 more at bats 10%

290 more at bats 10%

Of course these eight possibilities are standing in for a more scattered array. . .there is a chance that he could get 384 more at bats, and a chance that he could get 383 more at bats, etc. We’re simplifying the problem by reducing the number of possibilities that we have to deal with.

How many doubles per at bat will he hit?

Well, let’s assume that his propensity to hit doubles is equal to that of any young player in baseball history. If we take all players in baseball history who had 800 career at bats by the age of 22, the only one who has averaged .075 doubles per at bat (as a young player) is Ted Williams. Stan Musial and Albert Pujols were at .074 doubles per at bat, and Joe Jackson, A-Rod and Lou Boudreau were at .072. Let’s assume for the sake of argument that Machado is at .075.

If Machado hit doubles with the frequency of a young Ted Williams, his chance of hitting 37 more doubles in 390 at bats would be 8.5%. If he had 380 more at bats, his chance of hitting 37 more doubles would be 6.4%; 365 at bats, 3.9%; 350 at bats, 2.3%; 335 at bats, 1.2%; 320 at bats, 0.6%; 305 at bats, 0.28%; and 290 at bats, about 0.1%.

Combining those into one number, then, the chance would be 4%. . .(.30 times .085, plus .10 times .064, etc.)
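For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes that each at bat is an independent trial with a .075 chance of a double (a plain binomial model, which may or may not be exactly the calculation behind the figures above). The function names are made up for illustration.

```python
from math import comb

def chance_of_at_least(doubles_needed, at_bats, rate):
    """Binomial chance of at least `doubles_needed` doubles in `at_bats` tries."""
    # Add up the chance of every outcome that falls short, then take the complement.
    short = sum(
        comb(at_bats, k) * rate**k * (1 - rate) ** (at_bats - k)
        for k in range(doubles_needed)
    )
    return 1 - short

# (remaining at bats, weight) pairs taken from the list in the article.
SCENARIOS = [(390, 0.30), (380, 0.10), (365, 0.10), (350, 0.10),
             (335, 0.10), (320, 0.10), (305, 0.10), (290, 0.10)]

def chance_of_record(doubles_needed=37, rate=0.075, scenarios=SCENARIOS):
    """Weight each scenario's chance by how likely the scenario is assumed to be."""
    return sum(weight * chance_of_at_least(doubles_needed, ab, rate)
               for ab, weight in scenarios)

if __name__ == "__main__":
    for ab, _ in SCENARIOS:
        print(ab, f"{chance_of_at_least(37, ab, 0.075):.2%}")
    print("combined:", f"{chance_of_record():.1%}")
```

Changing `rate` or the weights in `SCENARIOS` shows how sensitive the final number is to the two assumptions, which is the real point of the exercise.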

So that method says four percent, but that method is perhaps a little unrealistically optimistic. It assumes that his propensity to hit doubles is the equal of Ted Williams’s and greater than any other player’s (and greater than Williams’s if you measure it per plate appearance, rather than per at bat), and it assumes that he has a 30% chance of getting 697 at bats this season, which would be the 8th highest total of all time. So… 4% seems high; let’s say 3%.

**Quick Hit 2**

**Opposition Adjusted Winning Percentages**

I had an idea to "normalize" a pitcher’s won-lost record for the quality of the opposing team. It works like this: Suppose that a pitcher earns a win against the 1984 Detroit Tigers, who finished the season 104-58. Because he beat a team that won 104 games, we credit that as 104 wins, whereas if he were to lose to that team, we would charge that as only 58 losses, since most people lost to that Tigers team most of the time.

What this does, in essence, is to make every opponent a .500 team. The winning percentage against every team for the season, regardless of their won-lost record, will be .500. Take Terrell Wade, 1998. Terrell Wade in 1998 made two starts, getting one win and one loss—but the two starts were against the Yankees (114-48 that year) and the Red Sox (92-70). He beat the Red Sox, which gives him 92 points, but lost to the Yankees, which cost him only 48 points, so his won-lost record was 92-48, a .657 winning percentage. If you face teams of that quality and you break even, you’re doing OK. On the other end of the scale is Bob Anderson, with the 1962 Cubs. He also was 1-1 as a starting pitcher (in four starts), but he beat the 1962 Mets (40 points) and lost to the Cardinals. The Cardinals were 84-78, so that’s 78 points. His winning percentage was .339.
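The bookkeeping described above can be sketched in a few lines of Python. This is an illustration of the idea rather than the actual code behind the study; the function name and the `(result, opponent wins, opponent losses)` layout are assumptions made for the example.

```python
def opposition_adjusted_pct(decisions):
    """decisions: list of (result, opp_wins, opp_losses), with result "W" or "L"."""
    credited_wins = 0
    charged_losses = 0
    for result, opp_wins, opp_losses in decisions:
        if result == "W":
            # Beating a 104-58 team is credited as 104 wins.
            credited_wins += opp_wins
        elif result == "L":
            # Losing to a 104-58 team is charged as only 58 losses.
            charged_losses += opp_losses
        else:
            raise ValueError(f"unknown result: {result!r}")
    total = credited_wins + charged_losses
    if total == 0:
        raise ValueError("no decisions to evaluate")
    return credited_wins / total

# The two pitchers discussed above.
wade_1998 = [("W", 92, 70), ("L", 114, 48)]       # beat the Red Sox, lost to the Yankees
anderson_1962 = [("W", 40, 120), ("L", 84, 78)]   # beat the Mets, lost to the Cardinals

if __name__ == "__main__":
    print(f"Wade:     {opposition_adjusted_pct(wade_1998):.3f}")
    print(f"Anderson: {opposition_adjusted_pct(anderson_1962):.3f}")
```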

If a pitcher has only one decision, his opposition-adjusted winning percentage is always the same as his actual winning percentage (1.000 or .000). Among pitchers with two decisions or more, Terrell Wade is the one who gains the most (+.157) and Anderson is the one who loses the most (-.161).

What do we learn from doing this?

It doesn’t make any real difference. Everybody’s winning percentage winds up about the same as it was anyway, because "quality of opposition" is not a very significant variable among starting pitchers.

I’ve noticed this before. . .it *seems* like it could be a big deal in some cases, but it never actually is. Among all pitchers in the study, the biggest "gainer" in a season, measured in wins, was Roy Halladay in 2008. That is, Terrell Wade (1998) is the biggest gainer in terms of winning percentage, but Roy Halladay is the biggest gainer in terms of win impact. Halladay was 20-11 in 2008, a .645 winning percentage, but if you adjust for the quality of teams against which he pitched, his winning percentage jumps to .684. That’s a difference of 1.22 wins; Halladay was 20-11, but if you adjust for the quality of his opposition, he was really 21-10. Among all of the tens of thousands of pitchers in the study, that’s the biggest gain in a season. Joe Kennedy in 2002 was second (1.161), followed by John Montefusco in 1976 (1.132) and John O’Donoghue in 1964 (1.072). You will notice I kept the list going until I hit one of the Kansas City A’s of my childhood.

The biggest loss in opposition-adjusted winning percentage is for Tom Murphy in 1970; he went 16-13 (.552) with the California Angels, but his winning percentage drops to .512 if you adjust for the quality of the teams he pitched against. That’s a net loss of 1.15 wins, followed by Dave Burba in 2001 (1.14), Freddy Garcia in 2004 (1.11) and Ismael Valdez in 2002 (1.04). Nobody else loses a full game.
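The "win impact" figures above appear to be the change in winning percentage multiplied by the pitcher's number of decisions. Here is a small Python sketch under that assumption; the function name is illustrative, and because it starts from percentages rounded to three places, it lands within a couple of hundredths of the published numbers rather than matching them exactly.

```python
def win_impact(wins, losses, adjusted_pct):
    """Wins gained (or lost, if negative) when the adjusted percentage replaces the actual one."""
    decisions = wins + losses
    actual_pct = wins / decisions
    return (adjusted_pct - actual_pct) * decisions

if __name__ == "__main__":
    # Roy Halladay, 2008: 20-11, adjusted to .684
    print(f"Halladay: {win_impact(20, 11, 0.684):+.2f}")
    # Tom Murphy, 1970: 16-13, adjusted to .512
    print(f"Murphy:   {win_impact(16, 13, 0.512):+.2f}")
```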

With regard to career numbers, same thing; the changes are so small they are hardly worth mentioning. The biggest gainer, when you adjust for the quality of the teams he pitched against, is Phil Niekro. Niekro’s winning percentage as a starting pitcher was .535; adjusting for the quality of opposition, it increases to .543. That’s 3.97 wins. . .4 wins, over the course of his career. Second on the list is Gaylord Perry (3.52 wins), and third is Josh Beckett (+3.38 wins).

The note about Beckett is kind of interesting, because

a) Beckett’s career is much shorter than the other pitchers at the top and bottom of the list, and

b) We have observed. . .many of you have observed. . .that Beckett has pitched extremely well against good teams. His career winning percentage against teams with winning records is .607; against teams with losing records, .547.

That’s unusual, and this is a manifestation of it, but still. . .it’s 3.38 wins. It’s not a big deal.

The pitcher who *loses* the most in this analysis is Steve Carlton (-4.24), followed by Jeff Suppan Sandwiches (-3.90), Kyle Lohse (-3.78), David Cone (-3.65) and Blue Moon Odom (-2.97).

There are two related questions here, the quality of the opposition and the quality of the support. The quality of the support for a pitcher is a big thing, because there’s a consistent bias from start to start. The quality of the opposition. . .it pretty much evens out.

## COMMENTS (15 Comments, most recent shown first)

jayodum

And it's now July 29th, and he still has 39.

11:45 PM Jul 29th

CharlesSaeger

Funny, I was checking this one out again when I saw that Manny hit his 39th double of the year today.

7:12 PM Jul 6th

bjames

Manny Machado Update, July 6. Manny has now not hit a double in his last five games, and his chance of breaking the record has now dropped to 5%.

11:20 AM Jul 6th

bjames

The hitter has more impact on the outcome of an at bat than does the pitcher. We know this because the spread of occurrence among hitters is greater for almost every type of event there is than the spread of occurrence among different pitchers. But I don't understand at all how you think this is relevant to the Game Scores study.

7:10 AM Jun 21st

KevinBrazee

Thank you for the articles or Quick Hits.

Observation on Quick Hit #1 - does it make sense that the most likely number of AB is 390? Wouldn't it follow a normal curve or, with the potential for injury, be shifted to the left? I would think that if anything > 390 is basically zero, then 390 would not be the highest of all the possible expectations.

Observation on Quick Hit #2 - I have been having similar thoughts, mostly about who has the most impact in the batter/pitcher matchup. I would guess the batter, but that is just a guess. In this study, would it be useful to use the same logic but apply it to the opposing starting pitcher? Or, more complicated, maybe a weighted ratio of the opposing starting pitcher's W-L record and the opposing team's W-L record. You could weight by the starter's average number of innings per start. Maybe also, instead of weighting in the team's W-L, you could weight in the opposing team's bullpen record, but I am not sure.

7:05 AM Jun 21st

Trailbzr

@wwiyw: Bill posted three years ago that he thought the career record for doubles would be broken by an active player, because it would take 20 years of 40 apiece, and most seasons there are lots of players who hit that many. However, at the time Brian Roberts was the only player with over 205 doubles in five seasons, so 40 doubles doesn't seem to be something done consistently.

I suspect Machado will finish this season with more doubles than Bill estimates. He's obviously a good hitter, but has no HR power. Makes me think of Wade Boggs, who hit 40 doubles in eight of his eleven seasons in Boston.

6:43 AM Jun 21st

bjames

Given the approach I used, the doubles tendency of Camden Yards would only appear to be relevant if it was higher than any other park or almost any other park. It isn't. Camden Yards increases doubles by about 4%--as opposed to Coors Field (19%), Fenway (30%), Arizona (11%), Texas (11%) or Pittsburgh (8%).

9:44 PM Jun 20th

wwiyw

Isn't Nick Markakis an uncommonly proficient doubler? I seem to remember a piece you wrote about his chance of breaking Sisler's career record (before his career path stalled out). Is there something conducive to doubles about Camden Yards? And if so, should that cause you to raise Machado's chances--even just a bit?

7:12 PM Jun 20th

bjames

Gillette's post perfectly illustrates the Whitey Ford problem. If not facing the Dodgers VERY OFTEN lowered the quality of opponents faced by Warren Spahn. . .how often exactly did Ford face the Yankees?

10:35 PM Jun 19th

bjames

I kind of regret the way I stated this, particularly with regard to Halladay. I didn't effectively make the argument I was trying to make.

If a pitcher faces a .600 team (as opposed to a .500 team) in one game, that's a "challenge load" of .100. In the case of Halladay, who was +1.22 in a season, that means that he faced TWELVE more .600 teams than expected. This is, in a sense, quite remarkable. If you just look over the list of teams that he pitched against that season, it looks very remarkable. . .it seems that he is pitching the entire season against .600 teams.

However, if you actually calculate the impact that this has on his expected winning percentage, you find that it is still relatively small. In the case of Halladay it is about 40 points, which is not trivial, but that is the most extreme case in the data. And it is still not a very large effect. It LOOKS large, but it actually isn't, if you do the math.

Regarding the "Whitey Ford effect". . .the largest examples of that are not in the 1950s; they're actually NOW. The reason that is true is that, in the 1950s, a team only potentially faced 7 opponents, and faced each one the same number of times. It is not really possible, in that environment, for a pitcher on a great team, to face strong opposition, because there is only one .600 team in the league, and he's on it.

But in the modern environment, with Halladay pitching in a 14-team league with interleague play. . .a pitcher can be (and Halladay was) repeatedly challenged to pitch against other .600 teams. My point was not really that this doesn't happen; my point was that it actually doesn't have very much impact on a pitcher's won-lost record.

4:41 PM Jun 19th

hankgillette

How about Warren Spahn during those years that he almost never faced the Dodgers? That must have lowered the winning percentage of the average team he faced.

4:39 PM Jun 19th

bjames

Responding to the Whitey Ford thing, which I have researched before (although I broke off this particular study at 1960). . .Ford actually faced below-average competition. Stengel DID spot Ford against the best teams of the other seven teams in the league; however, since he was always ON the best team in the league, he never pitched against the best team in the league. By far the largest bias in quality of competition faced is the advantage or disadvantage of being on a good or bad team. In an extreme case (like Ford, or Carlton in '72), no other element of the problem is large enough to offset this advantage/disadvantage.

4:32 PM Jun 19th

DavidTodd

Two good articles. You answer questions I don't even think of.

11:08 AM Jun 19th

Trailbzr

I might have expected Casey Stengel's Yankees to have significant divergence, because I thought he tried to move his best pitchers to face the most important opponents. However, this effect could be masked because the average non-Yankee team of the era was only .480, so everyone on the staff was getting penalized a few points compared to .500.

8:11 AM Jun 19th

jalbright

I think if you look at guys back further, and Whitey Ford comes to mind, that some pitchers were "saved" by their managers to face the better teams in the league. It's not something we've seen much since the early sixties, I agree. In fact, you discussed it when writing about Stengel in your book on managers.

7:55 AM Jun 19th