Building Out the Files
In yesterday’s work I outlined a spreadsheet to explain what I am trying to do, tracking the total "Runs Prevented" by each of the 11 factors that we can measure. Today I am going to add three more columns to that, and speculate a little bit about where we seem to be headed. Today we are looking at Home Runs, Wild Pitches, and Hit Batsmen:
The team preventing the most runs ever by not allowing Home Runs was the 2015 Pittsburgh Pirates, who allowed only 110 Home Runs, 106 Park Adjusted, which helped them to finish 98-64. If they had been 4 standard deviations below the norm they would have allowed 200 home runs, so they beat that by 94 home runs, or about 132 runs.
The team preventing the fewest runs by not allowing Home Runs was the 1913 St. Louis Cardinals, 51-99. They allowed 57 home runs, park-adjusted 66, while the zero standard would have been 68. They were only about 3 runs away from having zero ability in this area. They led the league in Home Runs Allowed by 40% (57-40) while playing in a park with a Home Run Factor of 67. The Home Run factor may be misleading, because in that era many Home Runs were inside-the-park Home Runs.
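The arithmetic behind those two home run figures can be sketched in a few lines. This is only an illustration: the function name is made up, and the value of about 1.4 runs per home run is an assumption back-solved from the numbers quoted here (94 home runs working out to about 132 runs), not a stated constant of the method.

```python
# Assumed run value, inferred from the article's figures (132 runs / 94 HR).
# The value actually used in the study is not stated.
RUNS_PER_HOME_RUN = 1.4


def home_run_runs_prevented(zero_standard_hr, park_adjusted_hr,
                            runs_per_hr=RUNS_PER_HOME_RUN):
    """Runs prevented = (zero-competence HR - park-adjusted HR) * run value."""
    return (zero_standard_hr - park_adjusted_hr) * runs_per_hr


# 2015 Pirates: zero-competence standard 200, park-adjusted 106
pirates = home_run_runs_prevented(200, 106)
# 1913 Cardinals: zero-competence standard 68, park-adjusted 66
cardinals = home_run_runs_prevented(68, 66)

print(f"2015 Pirates: about {pirates:.0f} runs prevented")
print(f"1913 Cardinals: about {cardinals:.0f} runs prevented")
```

Under that assumed run value, the sketch lands on the same rounded figures as the text for both teams.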
The team preventing the most runs by Wild Pitch Avoidance was the 2018 New York Mets, who threw 26 Wild Pitches against a zero-competence standard of 112, saving themselves about 14 runs.
On the other end of that scale was the 1958 Dodgers, who threw 70 Wild Pitches, which is about 1 run WORSE than the zero-competence standard. They threw almost twice as many Wild Pitches as any other team in the league.
The team which prevented the most runs by Hit Batsmen Avoidance was the 1907 Chicago White Sox, who hit only 22 batters with pitches, while every other team in the majors hit at least 38 that season. The zero-competence standard would have been 105 Plunkers, so the White Sox saved themselves about 25 or 26 runs by not hitting people with pitches. The 2004 Atlanta Braves did almost as well. They hit only 27 batters with pitches, in a season in which the zero-competence standard would have been 109.
On the other end of the scale is Ty Cobb’s 1922 Tigers, who hit 84 batters with pitches, in a season in which no other team in the American League hit more than 46. They were actually worse than the zero-competence standard for Hit Batsmen, the only team that was.
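The run value of a single Wild Pitch or Hit Batsman is not stated above, but it can be backed out of the quoted totals. The sketch below does only that; the function name is hypothetical, and the per-event values it produces are rough implications of rounded figures ("about 14 runs", "about 25 or 26 runs"), not the values the study actually uses.

```python
def implied_run_value(zero_standard, actual, runs_saved):
    """Back-solve runs per event from a quoted runs-saved figure."""
    return runs_saved / (zero_standard - actual)


# 2018 Mets, wild pitches: standard 112, actual 26, about 14 runs saved
wild_pitch_value = implied_run_value(112, 26, 14)

# 1907 White Sox, hit batsmen: standard 105, actual 22, about 25 or 26 runs
hit_batsman_low = implied_run_value(105, 22, 25)
hit_batsman_high = implied_run_value(105, 22, 26)

print(f"Implied runs per wild pitch: {wild_pitch_value:.2f}")
print(f"Implied runs per hit batsman: "
      f"{hit_batsman_low:.2f} to {hit_batsman_high:.2f}")
```

On these numbers a hit batsman is being charged at roughly twice the cost of a wild pitch, which is worth keeping in mind when the values get adjusted later.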
OK, let’s do a little bit of assembly work. We’ll start with the 1951 New York Giants, since that’s a very famous team, and suits our other porpoises. They won the National League, of course, with a 98-59 record. This is our summary of their defensive performance so far:
Team: 1951 New York Giants (98-59)
| Runs Prevented By: | |
| Strikeouts | 76 |
| Control | 111 |
| Home Run Avoidance | 155 |
| Hit Batsmen Avoidance | 14 |
| Wild Pitch Avoidance | 6 |
| Balk Avoidance | |
| Fielding Range (DER) | |
| Fielding Consistency (F Pct) | |
| Double Plays | |
| Stolen Base Control | |
| Passed Ball Avoidance | |
| Sum of the Above | 362 |
| Actual Runs Prevented: | 761 |
| Error/Discrepancy: | |
And, to take a team of my childhood, the 1964 Kansas City A’s; they finished 57-105:
Team: 1964 Kansas City A's (57-105)
| Runs Prevented By: | |
| Strikeouts | 105 |
| Control | 66 |
| Home Run Avoidance | 88 |
| Hit Batsmen Avoidance | 7 |
| Wild Pitch Avoidance | 7 |
| Balk Avoidance | |
| Fielding Range (DER) | |
| Fielding Consistency (F Pct) | |
| Double Plays | |
| Stolen Base Control | |
| Passed Ball Avoidance | |
| Sum of the Above | 274 |
| Actual Runs Prevented: | 581 |
| Error/Discrepancy: | |
Overall, across all 2,550 teams in the study, there are supposed to be 1,783,676 Runs Prevented which need to be accounted for, and 832,149 which have so far been accounted for. That’s 47%. The 1964 Kansas City A’s were at 47%; the 1951 New York Giants were at 47.6%. That’s why I chose them for illustration, although there are several hundred teams around 47%.
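The "percent accounted for" figure is just the sum of the categories estimated so far divided by the actual Runs Prevented. A minimal sketch, using the "Sum of the Above" and "Actual Runs Prevented" lines from the tables and the study-wide totals quoted above; the function and variable names are illustrative only.

```python
def share_accounted(accounted, actual):
    """Fraction of a team's Runs Prevented explained by the categories so far."""
    return accounted / actual


# (runs accounted for so far, actual runs prevented)
figures = {
    "1951 New York Giants": (362, 761),
    "1964 Kansas City A's": (274, 581),
    "All 2,550 teams": (832_149, 1_783_676),
}

for name, (accounted, actual) in figures.items():
    print(f"{name}: {share_accounted(accounted, actual):.1%}")
```

All three come out within about a point of 47%, which is what makes these two teams typical rather than special.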
Let me point out to you a couple of things that we can do with this data, IFF we can make the system work. The 1951 New York Giants have a very high number of runs prevented by Home Runs, a fact which is hidden in the raw data since they gave up a pretty large number of home runs, but it was a Home Run park; the Giants always gave up homers because it was short down the lines at the Polo Grounds. The Giants’ manager was Leo Durocher, who had great success with his first two teams and significant success with the Cubs of the 1960s, getting them not into first place, but close.
Suppose that you line up all of Durocher’s teams, and look at where they were in each Run Prevention area before Durocher took over the team, and how they changed when Durocher took over the team. Compare Durocher to Casey Stengel, let’s say, or Joe McCarthy. In what areas of run prevention were one manager’s teams better than the other’s? In what areas were they worse? What tradeoffs were they making? It is possible that you might gain some understanding of how one manager was different than another. It is possible that, at some point in the distant future, 30 years from now, there might be a general understanding of a manager’s role in shaping his team which is different than the understanding that we now have, because we have a method that measures the success of the manager’s teams in a series of different areas.
The 1964 Kansas City A’s—this is from memory, so don’t quote me—but the 1964 Kansas City A’s gave up 220 home runs, which I believe was a major league record at that time, and remained a major league record for quite a while after that. Because of that, I expected the A’s to be near the zero-competence standard for home runs allowed.
But they’re actually not all that close to it. The A’s that spring had moved their fences way in. They had traded that winter for two sluggers, Rocky Colavito and Jim Gentile, and Charley Finley had visions of battering opponents into submission with the A’s new power. Their home runs allowed, park-adjusted, are not all that bad; we park-adjust the 220 Home Runs down to 172, and they’re still 63 homers away from the zero-competence standard, the misery line.
They’re not actually HORRIBLE in that area, but they’re not good, either. If we say they gave up 172 "actual" home runs, that’s 241 runs, and we see that they "prevented" 88 runs by Home Run prevention. That’s a record, for home run prevention, of 88-241. Their success rate, in that area, was 27% (88/329 = .267). Suppose, then, that you figured each team’s success rate in each area. Do you see where I am going with this? It seems like you would gain some real understanding of why certain teams succeeded, and why they failed.
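The success rate works like a won-lost percentage, with runs prevented as the wins and runs allowed as the losses. A small sketch of the A's home run line follows; the 1.4 runs per home run is again an assumption, implied by 172 home runs being counted as 241 runs, and the names are made up for illustration.

```python
# Assumed run value, implied by the text (241 runs / 172 park-adjusted HR).
RUNS_PER_HOME_RUN = 1.4


def success_rate(runs_prevented, runs_allowed):
    """Treat prevented/allowed runs like a won-lost record."""
    return runs_prevented / (runs_prevented + runs_allowed)


park_adjusted_hr = 172
runs_allowed = round(park_adjusted_hr * RUNS_PER_HOME_RUN)
runs_prevented = 88

rate = success_rate(runs_prevented, runs_allowed)
print(f"Record: {runs_prevented}-{runs_allowed}, success rate {rate:.3f}")
```

Running the same function over every category for every team would give the team-by-team profile described above, once the remaining categories have values.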
But the 1951 Jints and the 1964 A’s are "good" examples; those are examples of the system apparently working more or less the way it is supposed to work. I mean, 47% seems too low; I’ve already accounted for strikeouts, walks, and home runs allowed; I should probably be up around 70, 75%, but that’s not a real problem, since I’m just guessing at the values. I can straighten that out later in the process.
However, frankly, all of the news here is not good news; the good news is not actually winning the battle. I appear to have some real problems. I’ll give you the two worst exemplars: the 1968 Dodgers and 1900 Boston Braves:
Team: 1968 Los Angeles Dodgers (76-86)
| Runs Prevented By: | |
| Strikeouts | 127 |
| Control | 111 |
| Home Run Avoidance | 193 |
| Hit Batsmen Avoidance | 11 |
| Wild Pitch Avoidance | 8 |
| Balk Avoidance | |
| Fielding Range (DER) | |
| Fielding Consistency (F Pct) | |
| Double Plays | |
| Stolen Base Control | |
| Passed Ball Avoidance | |
| Sum of the Above | 450 |
| Actual Runs Prevented: | 461 |
| Error/Discrepancy: | |
Team: 1900 Boston Braves (66-72)
| Runs Prevented By: | |
| Strikeouts | 50 |
| Control | 43 |
| Home Run Avoidance | 35 |
| Hit Batsmen Avoidance | 15 |
| Wild Pitch Avoidance | 4 |
| Balk Avoidance | |
| Fielding Range (DER) | |
| Fielding Consistency (F Pct) | |
| Double Plays | |
| Stolen Base Control | |
| Passed Ball Avoidance | |
| Sum of the Above | 146 |
| Actual Runs Prevented: | 1150 |
| Error/Discrepancy: | |
The 1968 Dodgers are supposed to have 461 Runs Saved, and we’ve already credited 450 of them. We’ve already accounted for 98% of their Runs Prevented, and we still have six categories of information to be added to the study. And we’re probably going to have to increase the values in the categories we have studied so far. Even if the Dodgers were fantastically awful in all of the areas not yet incorporated into the system, they would still show as saving substantially more runs than we believe that they actually saved—probably 60% more. I’m going to have a huge error in that team’s estimate of Runs Prevented.
The 1900 Braves, on the other hand, have so far had only 13% of their runs saved accounted for. We’re not going to get there in their case.
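One way to see how far out of line these two teams are is to compare each team's accounted share with the study-wide share of about 47%. The sketch below does that for the four teams shown in the tables; measuring the gap in percentage points is my own illustrative choice, not a test the study defines, and the names are hypothetical.

```python
# Study-wide share accounted for so far (about 47%).
OVERALL_SHARE = 832_149 / 1_783_676

# (runs accounted for so far, actual runs prevented), from the tables
teams = {
    "1951 New York Giants": (362, 761),
    "1964 Kansas City A's": (274, 581),
    "1968 Los Angeles Dodgers": (450, 461),
    "1900 Boston Braves": (146, 1150),
}


def share_gap(accounted, actual, overall=OVERALL_SHARE):
    """Percentage points between a team's share and the overall share."""
    return (accounted / actual - overall) * 100


gaps = {name: share_gap(*values) for name, values in teams.items()}

for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name}: {gap:+.1f} points vs. overall")
```

The Giants and A's sit within about a point of the overall figure, while the Dodgers and Braves are dozens of points off in opposite directions.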
Those are SERIOUS problems for my study. Those readings indicate that I may have a fundamental error in my concept, quite possibly so fundamental that my effort may fail. I may have to go back and re-think my outline for the system, and I may not be able to find a way to make it work. But I’ll keep moving forward, and we’ll see what we’ve got.
There is SOME tolerance for error; it’s an estimate, after all. It would be wonderful if the standard error of the estimates, at the end of the process, was 2%. If the error was 3% or 4%, that clearly would be acceptable; if it was 5%, that would be disappointing. If it was 10%, that would be intolerable; that would indicate a complete failure of the effort, unless one of my assumptions is wrong somewhere. The standard deviation of Runs Scored is probably somewhere around 10% of Runs Scored, maybe less, so if you have a 10% error, then you don’t have anything useful; you’ve just got random data.
Some of these problems may be simple errors; I may have a data entry error somewhere, or a problem that can be solved by altering one assumption. I don’t really know how many teams are going to be out of range at the end of the day. But I’ll just keep plugging away at it. Thanks for reading.