87 Percent Less Fun than Wild Bitches

By Bill James

March 17, 2020

The 1977 Boston Red Sox, with a colorful starting rotation including Bill Lee, Luis Tiant, Bob Stanley and Ferguson Jenkins, threw only 15 Wild Pitches all season. 15 Wild Pitches is not a record low; the 1921 Boston Braves threw only 8, but for reasons beyond the need of immediate explanation the 1921 Braves calculate as 1.9 Standard Deviations better than the period norm, or 119, whereas the 1977 Red Sox are 2.7 standard deviations better than the norm, or 127. It’s a record, but you probably guessed that. These are the top 10 of all time:

YEAR	City	Team	Score
1977	Boston	Red Sox	127
1978	Boston	Red Sox	127
2002	New York	Mets	123
1992	San Diego	Padres	123
2008	Houston	Astros	123
2018	New York	Mets	123
1974	Cleveland	Indians	123
1994	Baltimore	Orioles	123
2000	Atlanta	Braves	122
1981	Kansas City	Royals	122

And these poor sufferers were the most Wild Pitch-afflicted, relative to their era:

YEAR	City	Team	Lg	Score
1958	Los Angeles	Dodgers	NL	52
1936	Philadelphia	A's	AL	54
1986	Texas	Rangers	AL	58
1920	Washington	Senators	AL	58
1989	Philadelphia	Phillies	NL	59
1970	Houston	Astros	NL	62
2000	Cincinnati	Reds	NL	64
1915	Philadelphia	A's	AL	64
1973	Cleveland	Indians	AL	66
2009	Kansas City	Royals	AL	68

The 1958 Dodgers threw 83 Wild Pitches, while several teams since then have managed to clear the century mark. These are the decade norms and standard deviations for Wild Pitches divided by batters faced:

From	To	Average	Standard Deviation
1900	1909	.00553	.00158
1910	1919	.00536	.00178
1920	1929	.00363	.00118
1930	1939	.00428	.00139
1940	1949	.00424	.00133
1950	1959	.00490	.00142
1960	1969	.00791	.00191
1970	1979	.00746	.00184
1980	1989	.00731	.00179
1990	1999	.00862	.00195
2000	2009	.00806	.00193
2010	2019	.00930	.00222

As you can see, Wild Pitch rates jumped sharply in the 1960s and, after easing a little in the 1970s and 1980s, have trended upward since. They were more than twice as high in the 1970s as in the 1920s, and the 2010s rate is the highest on the chart. This is, I would assume, because of the increased velocity of the fastball. More max-effort pitches.

Also, let me confess to a minor error in the previous report. In the chart published yesterday, it said that the average of Hit Batsmen in the 1910-1919 era was .00536. That was actually the average for Wild Pitches, today’s number. The average for Hit Batsmen was .00737.
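The scores quoted above appear to sit on a scale where 100 is the period norm and each standard deviation is worth 10 points (1.9 standard deviations better is 119, 2.7 better is 127). Here is a minimal sketch of that conversion. It assumes the decade figures in the chart are the norms being used, which the article does not state outright; the function name and the sample rate are illustrative only.

```python
# Decade norms for Wild Pitches per batter faced: (average, standard deviation),
# copied from the chart above.
DECADE_NORMS = {
    1950: (0.00490, 0.00142),
    1970: (0.00746, 0.00184),
}


def wild_pitch_score(rate, mean, sd):
    """Score on an assumed 100-centered scale, 10 points per standard deviation.

    Fewer wild pitches is better, so a rate below the norm scores above 100.
    """
    z = (mean - rate) / sd
    return 100 + 10 * z


mean, sd = DECADE_NORMS[1950]
# Hypothetical rate sitting 4.8 standard deviations worse than the 1950s norm.
rate = mean + 4.8 * sd
print(round(wild_pitch_score(rate, mean, sd)))
```

A rate 4.8 standard deviations worse than the norm comes out at 52 on this scale, which matches the score listed for the 1958 Dodgers.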

As has been true in other categories, the teams which threw fewer wild pitches did better in wins and losses, with the top 510 teams having an average won-lost record of 82-74:

Group	Won	Lost	Pct
Fewest Wild Pitches	82	74	.526
Second Fewest	80	77	.512
Average	79	78	.503
More Wild Pitches	76	80	.487
Wild Pitchers	74	83	.472

Also, Wild Pitches are fellow-travelers of both Walks and Hit Batsmen. Teams which had more Wild Pitches also had more Walks, in about the same proportion as teams which had more Hit Batsmen also had more Walks. And teams which had more Wild Pitches also had more Hit Batsmen, in a slightly smaller proportion.

Let me explain now what I am actually trying to do, why I am doing these studies. For at least 35 years it has been my opinion that the idea of evaluating fielders compared to an imaginary center line, what an average fielder would do, is a very poor idea. I would explain it this way. In 1876 Cap Anson hit .356—110 for 309—while the National League Batting Average was .265. A two-game outfielder named Live Oak Taylor hit .375 (3 for 8), while Oscar Bieleski hit .209 (29 for 139), and Bill Craver hit .224 (55 for 246).

It can be said, then, that Cap Anson was 28 hits better than an average hitter (since the distance between .356 and .265, in 309 at bats, is 28 hits.) Live Oak Taylor was 1 hit better than average (+1), Oscar Bieleski was 8 hits worse than average (-8), and Bill Craver was 10 hits worse than average, or -10.
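The arithmetic behind those plus and minus figures can be sketched in a few lines. The hits, at bats and league average are the ones given in the text; the function name is just an illustrative label.

```python
LEAGUE_AVG = 0.265  # 1876 National League batting average, as given in the text

# (hits, at bats) for each hitter, as given in the text
HITTERS = {
    "Cap Anson": (110, 309),
    "Live Oak Taylor": (3, 8),
    "Oscar Bieleski": (29, 139),
    "Bill Craver": (55, 246),
}


def hits_above_average(hits, at_bats, league_avg=LEAGUE_AVG):
    """Hits minus the hits a league-average hitter would get in the same at bats."""
    return hits - league_avg * at_bats


for name, (hits, at_bats) in HITTERS.items():
    print(f"{name}: {round(hits_above_average(hits, at_bats)):+d}")
```

Note what the function throws away: once only the rounded difference is kept, the hits and at bats that produced it can no longer be recovered.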

Suppose, then, that the league statistician in 1876, rather than reporting that Cap Anson had hit .356 in 309 at bats, had merely reported what he regarded as the essential fact: that Cap Anson was 28 hits better than the league. Cap Anson was +28, Live Oak Taylor was +1, Oscar Bieleski was -8, and Bill Craver was -10; I’m sorry folks, that’s all the information I have to give you.

Can you begin to see the problems that this would have created for future generations of baseball fans? Can you understand why that really would not have been the best way to go about it?

The essential problem is that the evaluation (+28 for Anson, -10 for Craver) creates no real basis for understanding. It lays no foundation for the growth and development of better methods. It’s an end point.

Pitchers? Well, the basic thing is Wins and Losses, and Al Spalding was 47-12, so we will record that as +35; that gets the essence of that, don’t you think? That’s all you need to know. Jim Devlin was 30-35, which is -5, and Dick McBride was 0-4, so we’ll record that for posterity as -4.

It is not the basic job of statistics to evaluate players. It is the basic job of statistics to describe the player’s performance. If you describe the player’s performance with accuracy and detail, then it becomes possible to evaluate his performance, to find the value of it. But if you start the process with the evaluation, then there’s no pathway to walk on toward greater understanding.

As you probably know, John Dewan has created many or most of the modern, sophisticated Defensive Statistics that we now have. Runs Saved, Runs Above Average, Outs Above Average. . . that’s all John’s work, or most of it is. John Dewan is my close friend and longtime colleague. We have been business partners for almost 40 years, have worked together on countless projects, and John is the co-owner of this site, Bill James Online. I have the highest possible regard for John, and for the work that he does.

But we have also had this argument for three-plus decades. I don’t think that the idea of rating players by how they compare to the average is bad; I think it is horribly, absolutely, fantastically bad. While there is no doubt that we have much better fielding numbers now than we had a couple of decades ago, I believe that John’s initial mistake—rating players first, rather than describing them first—has enormously limited the value of what he has done in this area. It has hobbled him. It has prevented his work from leading to the understanding of fielding performance that we SHOULD now have, given the tremendous amount of work that he has done in this area.

John and I have argued about this issue since the mid-1980s, and John has made dozens of efforts to address what he sees as my concerns. He has methods now that will tell us that Shortstop X is +7 in going to his left and +1 on balls hit toward him, but is -28 in going to his right, thus -20 overall. In John’s view, this solves the problem; it fills out the chart, describing the player’s performance. In my view, it doesn’t address the problem at all; it merely extends the bad methodology into new areas. It’s like saying that Cap Anson, in addition to being +28 hits, was +7 doubles and -2 triples. It doesn’t solve the problem.

There are facts, and there are opinions. If you say that there were 184 balls hit between the shortstop’s normal position and the second base bag, and Shortstop X made outs on 116 of them, that’s a fact. If you say that he reached 147 of them, made outs on 116, made errors on 6 of them, made no throw to first on 11 of them and threw too late to first on 14 of them, those would all be facts, assuming that that is what happened. If you say that he made 97 outs to first base on those and 19 outs at second, those would be additional facts. If you say that he started 47 double plays on those balls, that would be an additional fact; if you say that he started 9 of those double plays by touching second base (6-3) and started the other 38 by throwing to the second baseman (6-4-3), that would be another fact. If you can tell us how many double play situations there were and how many double plays were NOT turned, and how many double plays were not turned because the runner from first moved up and how many were not turned because the throw to first base was too late or was not made at all, those would be additional facts.

But if you say that he is +7 runs going to the second base side on ground balls, that is not a fact; that’s just an opinion, stated as if it were a fact. I’m glad to know your opinion; I value your opinion and I respect your opinion. But what I really WANT is a better set of facts. I would like to know what each of those things is at home and on the road, and I would like to know what it is with each pitcher on the mound, and I would like to know what it is against left-handed batters and right-handed batters. We could have and should have as many facts at our disposal to create an understanding of Nolan Arenado’s defense as we do to create an understanding of his batting. But what we have is, he’s +8 runs.

It may seem that I am asking for a vast amount of information here, but really, I am not. Suppose that we designate the area between second base and a halfway point between second and third as "Zone 4", and we publish a Zone 4 Ground Ball report, like this:

Player	Zone 4 GB	6-3 GO	6-4 GO	6-3 DP	6-3 Adv	643 DP	6-2 Out	1B	IH	Other
Nobody, Jack	154	71	21	5	18	18	2	37	3	1
Somebody, John	170	92	22	8	14	17	4	31	2	2

The category totals do not add up to the Zone 4 Ground Ball total because 6-3 Advance (a runner goes first to second on a ground ball to short) is a subset of 6-3 Ground Out. Also, now that I look at it, I see that I left "errors" out of there; there should have been a column for errors. But it’s not complicated, is it? You can detail the performance of all shortstops, on ground balls hit in that zone, in one chart.

I have been frustrated on this subject for nigh-on 40 years now. I still hope that, if baseball survives, we may eventually get defensive records that describe the outcomes and that make sense, so that we can decide for ourselves how we would evaluate the fielders. I do not have the coding skills to create the records that I would like to see, and, more particularly, being an older person, I am actually more interested in the defensive performance of Yogi Berra and Mark Belanger than I am in the defensive performance of Watch It Buster Posey and Angular Andrelton Simmons.

What I am trying to do here, with this series of studies, is to fill in a gap in our defensive records, not with John’s approach but with my own. This is the cornerstone assumption of this series of studies, what I am about to explain in the next few sentences. The Washington Nationals scored 873 runs last season, 2019. Based on the batting statistics we have and the methods which have been developed over the years, we can say with a fair degree of accuracy who created how many of those runs. Anthony Rendon created 130 of them, Juan Soto created 117, Matt Adams created 42, Adam Eaton created 88, Howie Kendrick created 58, etc. We don’t absolutely know, but we kind of know.

Runs created estimates are not EVALUATIONS; they are ESTIMATED FACTS. Two players created 70 runs each, let us say. One of them might be a near-MVP, the other might be a drain on the offense, due to outs made, parks, defensive contributions and other things. If we were offering this as an EVALUATION, then we would be saying that a player creating 75 runs is better than a player creating 70, but we’re not saying that, at all. We are merely saying that here is an estimated fact that you can throw into your evaluation. It is one fact among many which can be used to evaluate the player’s contribution.

What if there was a similar estimated fact for each player’s Runs Saved—not his Runs Saved against Average, but his Runs Saved, gross? Would not this be a contribution to our understanding of his role on the team?

That is what I am trying to get to here; I am trying to create a way to estimate Runs Saved by each fielder and each pitcher—not Runs Saved against average, which is an end point to the discussion, but rather, Gross Runs Saved or Runs Saved against zero, which is an element of understanding.

Well. . .but how many runs are Saved by the team? The 2019 Washington Nationals scored 873 runs, but how many runs did they Save, as a team?

It is not a perfect and unassailable truth that Offense and Defense are perfectly balanced, that Scoring Runs is half the game and preventing them is half the game. It is not a perfect and unassailable truth, but it is a general and usable truth which can be validated in various ways. If offense and defense are equal then, on a "league" basis—understanding that the league is no longer a completely self-contained entity—but on a league basis, runs prevented are equal to runs scored. If there were 11,449 Runs SCORED by National League teams in 2019, there must also have been 11,449 Runs PREVENTED by National League teams—not perfectly, because the league winning percentage was not exactly .500, but we can adjust for that. The question is, who prevented how many of those 11,449 Runs that were Prevented by Defensive Performance?

Can you understand why THAT question creates a better platform from which to evaluate defensive performance than the zero-based, +/- system? Does that make sense to you? Probably it does to some, doesn’t to others, but let’s move on.

How many runs were prevented by the Washington Nationals?

Well, that number we can get to pretty easily. The Nationals pitched 1,439.1 innings during the 2019 season. The league average for runs allowed/inning was .5268. At .5268 runs per inning, 1,439.1 innings would be 758 runs. The Nationals’ Park Factor was 110, which creates a Park Adjustment Factor of 1.046. Adjusting for that increases the Expected Runs Allowed to 793. If Runs Saved are equal to Runs Scored, then every run they did NOT allow below twice that number would be a Run Saved. Twice that number would be 1,586. The Nationals actually allowed 724 runs. That means that their Runs Saved were 862. The Nationals’ pitching and defense, combined, saved 862 runs on the season. The team scored 873 runs; they saved 862.
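The steps in that calculation can be written out as a short sketch. The inputs are the ones quoted above; the function and parameter names are illustrative, the Park Adjustment Factor of 1.046 is taken as given rather than derived from the Park Factor of 110, and 1,439.1 is read in the usual baseball notation as 1,439 and one-third innings.

```python
def gross_runs_saved(innings, league_runs_per_inning, park_adjustment, runs_allowed):
    """Runs Saved against zero: twice the expected runs allowed, minus actual."""
    expected_runs_allowed = innings * league_runs_per_inning * park_adjustment
    return 2 * expected_runs_allowed - runs_allowed


# 2019 Washington Nationals, figures as quoted in the text.
innings = 1439 + 1 / 3  # "1,439.1" means 1,439 and one-third innings
saved = gross_runs_saved(
    innings=innings,
    league_runs_per_inning=0.5268,
    park_adjustment=1.046,
    runs_allowed=724,
)
print(round(saved))
```

Carrying the unrounded intermediate values through gives the same 862 that the text reaches by rounding at each step.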

In the 1970s, I was able to figure out a way to say how many of the runs scored by a team were created by each hitter. What I am trying to do HERE is figure out a way to say how many of the runs SAVED by the team were saved by each pitcher and each fielder. It is a very comparable undertaking.

It will take me months to do this; hell, it has already taken me six weeks or so—and there is no guarantee, not even a reasonable expectation, that I will reach the finish line of the effort. I might get something that works, I might not. But that’s what I am trying to do.

The FIRST thing you have to do, in that effort, is to establish the zero points, the position of NO effective contribution. At what point can we say that this number of strikeouts, this number of walks, this number of Wild Pitches, etc., is not even a major league number; it is, rather, the floor against which a major league player may be measured?

It appears, based on the research that I have done so far, that the floor might be five standard deviations below the norm. If you’re five standard deviations below the norm, you’re just some guy on the field taking up space. The worst team in terms of strikeouts, the 2003 Detroit Tigers, was 2.8 Standard Deviations below the norm. The worst team in terms of walks, the 1915 Philadelphia Athletics, was 4.5 standard deviations below the norm. The worst team in terms of Hit Batsmen, the 1922 Detroit Tigers, was 4.6 standard deviations below the norm. The worst team in terms of Wild Pitches, the 1958 Los Angeles Dodgers, was 4.8 Standard Deviations below the norm. I also know, because of research that I have done but not yet reported to you, that the worst team in terms of Home Runs Allowed (Park Adjusted) was 4.0 Standard Deviations below the norm, that the worst team in terms of DER (Defensive Efficiency Record) was 3.9 Standard Deviations below the norm, and the worst team in terms of turning the double play was 3.0 Standard Deviations below the norm. It is beginning to look as if 5.0 Standard Deviations below the norm might be a floor, beyond which there are no major league teams.

But we don’t KNOW that yet; that is, in truth, merely an estimate of where we should begin to LOOK for the floor. In order for the system to work, the runs saved by the 2019 Washington Nationals by all of these things and several more things have to total up to somewhere pretty close to 862 runs, and the pathway toward that number has to be at every step reasonable and consistent with all of the other measurements. I have no idea, at this point, how I am going to get there; I merely believe that it can be done. The floor might be 50 (five standard deviations below the norm), or it might be 60 or 70 or 20; I really have no idea. It might be 50 for one category but 70 for another. But that’s what I am doing; I am taking measurements, looking for the floor. I am looking for the zero point.
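One way to picture the search for the floor is to test candidate floors against the worst-team marks quoted above. This is only a sketch of that check, not the method the series will end up using; the dictionary holds the figures from the text, and the function name is hypothetical.

```python
# Worst-team marks quoted in the text, in standard deviations below the norm.
WORST_SD_BELOW_NORM = {
    "Strikeouts": 2.8,
    "Walks": 4.5,
    "Hit Batsmen": 4.6,
    "Wild Pitches": 4.8,
    "Home Runs Allowed": 4.0,
    "DER": 3.9,
    "Double Plays": 3.0,
}


def categories_beyond_floor(worst_marks, floor_sd):
    """Categories whose worst team lies beyond a candidate floor."""
    return [cat for cat, sd in worst_marks.items() if sd > floor_sd]


for floor_sd in (5.0, 4.5, 4.0):
    print(floor_sd, categories_beyond_floor(WORST_SD_BELOW_NORM, floor_sd))
```

A floor of 5.0 leaves no actual team beyond it, while a floor of 4.5 would already put two real major league teams below the zero point.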

COMMENTS (50 Comments, most recent shown first)

DHM
I also really like chrisbodig's first comment, with the example of what Nettles ACTUALLY DID. The context of the entire game is useful in determining how much value was in his defensive play. Kind of like RBI opportunities on the other side of the field. Nettles wouldn't have been able to save a run if there wasn't anyone on base, but it still happened and that should be given credit somehow. Thanks everyone, I'm enjoying the opportunity to catch up on all these articles and comments!
10:06 AM Mar 29th

GuillermoMountain
Okay, since we're still going here, let me try to explain why runs saved must equal runs created, even though I know our host is tired of the back and forth.

One of the things that we love about the game is that it is a zero sum game: if a hitter gets a hit, the pitcher's record reflects that hit as well. Hitters' hits equal pitchers' hits. Likewise, when a team scores a run, the opposing team is credited with a run allowed. As I said, zero sum.

Now let's think about the Runs Created stat. The stat's premise, at least in theory, is that if it's calculated correctly, all of the players in baseball will have their cumulative Runs Created sum up to the number of runs actually scored in MLB.

Now let's say we want to come up with a stat, similar to RC, which will accurately measure the value of pitchers and fielders in absolute terms. Let's say we've found the formula to quantitate each player's defensive value precisely. Cool! It's a defensive runs stat analogous to Runs Created! Now we add all players' defensive numbers together. What should the total be? Well, if you believe, as most of us do, that the fielders and pitchers are equally important to the outcome of a game as are the hitters and baserunners, that final total MUST be equal to the total of all runs scored. Zero sum. Which is why runs saved must equal runs scored. If the runs saved total were less than runs scored we'd be undervaluing defense. If it were larger than runs scored we'd be overvaluing defense.
4:42 PM Mar 24th

raincheck
Sometimes I wonder if we've gotten to the right units yet. Defense creates outs. It ALLOWS runs. It is good when it creates outs at a greater rate - that results in fewer runs allowed.

We can measure both outs and runs. Runs saved is elusive, at least to me, unless it is a product of excellent measures of outs created and runs allowed.
1:22 PM Mar 24th

MarisFan61
KSC: It has nothing to do with my feelings being hurt.
In fact they aren't.

It's that I'm genuinely interested in Bill's work, and in these kind of sophisticated things in general (whether about baseball or otherwise) -- and I'm interested to know that thing as part of understanding this whole thing.

Plus, as a separate thing, believe it or not, I'm trying to HELP Bill.
I realize full well that he doesn't particularly want it, and certainly not from me; but to me, that's not a reason for me not to.

It HURTS Bill -- it can't not -- when he puts out stuff that isn't clear and which he doesn't explain adequately -- in this case, at all. It at least somewhat undermines the goodness of what he does, and it can only help him if he realizes that a thing he has put out is unclear, which this thing without a doubt is.

Of course if he doesn't care and doesn't think about it, it doesn't help him, but again, that's not a reason either for us not to try.
8:14 PM Mar 21st

sprox
I'm okay with Runs Scored = Runs Saved. You have to start somewhere and 50% offense / 50% defense is a place to start. A league scores 11K runs so a league therefore prevents 11K runs. Sure why not? Once we see where this takes us it will become readily apparent either way whether this was a good starting point or not. So let's have some patience.
6:17 PM Mar 21st

ksclacktc
@MarisFan61
Stop your damn whining. We're in the middle of a pandemic and you whine because he said he was moving on and your feelings are hurt. Get over yourself. If it bothers you leave, and stop dominating every thread on here with your rubbish.

PS he said some get it, and that is true.
1:09 PM Mar 21st

voxpoptart
Oops, that Runs Scored EQUAL Runs Saved, etc. Because we know, by observing hundreds of leagues, that offense and defense contributions are equally important to winning.
11:36 AM Mar 21st

voxpoptart
Some of us -- me, Studes -- get the idea, and we're trying to explain it. He says Runs Scored + Runs Saved because offense and defensive contributions have an equal effect on winning. It's because being the best or worst offensive team has the same effect on winning that being the best or worst defensive team has ... so the contributions have to be assessed as basically equal.
11:29 AM Mar 21st

jgf704
chrisbodig wrote:

What I think most baseball fans would love to see is not an evaluation of how good a player is based on a set of formulas but how many runs he ACTUALLY SAVED.
...
I want to know what a player DID and how his deeds impacted real baseball games.

FWIW, I think modifying a runs above average measure (like DRS or Statcast Outs Above Average) with leverage index will get at what you are asking for. Unfortunately, it doesn't exist yet, but I wouldn't be surprised if it were in the works (especially for Statcast OAA).
10:17 AM Mar 21st

nettles9
chrisbodig’s post— the first for this article— is the best post of all of these! :-)
6:57 PM Mar 20th

MarisFan61
(just did a follow up on this to "Hey Bill")
12:30 PM Mar 20th

MarisFan61
To Threedog and anyone else who doesn't get that thing:

Bill's reply on Hey Bill was Bill James at his worst.

We've got a thing that absolutely none of us get (at least from what's been said on here), and which may well not be true, or if it's true it isn't evident in a way that Bill thinks it is, which means:

......either way, he is mistaken.

His declining to try to explain it, and worse yet, his being dismissive in that kind of way, utterly sucks.

We all do, sometimes. But it helps when we know that we've sucked, and hopefully we can (1) come back to the current situation and undo it, to the extent possible, and (2) realize that we sucked, for whatever that may be worth.
12:26 PM Mar 20th

hotstatrat
Perhaps, the "floor" Bill is looking for is an all-time replacement level. It is how bad a team can be in each area if they are really a Major League team. Perhaps, it doesn't have to be 5 deviations below normal or the worse of any team ever. Perhaps, some major league teams in history have been worse than replacement level.

How would one go about deciding on replacement level? Can we judge some of those 20th century Philadelphia teams really Major League? St. Louis Browns? 1962 Mets? 2003 Tigers?

Perhaps, it is unfair to make these judgements. However, we are making these judgements about individual player performances. Is it crazy to say that about a team performance in some specific area? I don't know. I haven't had enough coffee yet this morning to think this through. Perhaps, one of you can tell me why I am way off.
10:08 AM Mar 20th

MarisFan61
Studes: What you're stating is what we all know.

What we DON'T know is how that translates to "runs prevented = runs scored."

Do you see how Bill takes that leap?
9:53 PM Mar 19th

studes
BTW, in case you haven't seen it, Tango and Bill had an extensive conversation about this on Twitter. Here's part of the conversation:

https://twitter.com/tangotiger/status/1240047629775572992
8:46 PM Mar 19th

studes
Well, I think I get what Bill means. We all know that runs scored = runs allowed and that the responsibility for outcomes is roughly 50/50 between offense and defense. So this is the math he uses to get to a 50/50 split.

In Win Shares, he used marginal runs scored and allowed (anything over or under 50% or 150%). Now he's kind of setting the margin at 100% to get to a zero-based system. That's my interpretation anyway.

Bottom line, he wants offense and defense to be weighted roughly equally and this is his language for getting there.
8:43 PM Mar 19th

MarisFan61
Interim impression: Nobody really gets what Bill meant, or why that would be so.
2:40 PM Mar 19th

ksclacktc
@steve161 Thanks that is exactly the point.
8:50 AM Mar 19th

steve161
Using data from the Handbook, here's a way of looking at it:

National League teams allowed 11449 runs last season, or (rounded) 763 per team.

The Dodgers allowed 613 runs, or 150 fewer than average. How many of those runs-not-allowed are credited to the pitchers, how many to the defense?

The new DRS system (Handbook pp 49-56) credits the Dodgers with 135 Defensive Runs Saved. Doesn't leave much for the pitchers, and at least offers a suggestion as to why DRS numbers are so non-intuitively high.
8:27 AM Mar 19th

MarisFan61
Cool -- that takes care of my "preamble," which I was pretty much dismissing myself.

Anything more about that thing Bill said, about "runs saved" (or prevented, or whatever) being equal to runs scored?
12:47 AM Mar 19th

studes
I think Runs Saved and Runs Prevented are the same thing.
10:34 PM Mar 18th

MarisFan61
Studes: Thanks for that answer.

Now I do get where you were coming from, but I don't see it as applying here.
Maybe the issue is similar to the issue (or one of them) in what I just said down there: Maybe it's that we're going on very different concepts of what "Runs Saved" means.

I see the thing in Win Shares as being just a model to use for attributing the relative amounts of defensive goodness to the respective teams.
I don't see the "Runs Saved" concept as relating to that, at least not how I see that phrase. (More upon request, by anybody, but don't want to take more space on it right here.)

It's feeling like much of the unclarity and disagreement here does relate to the "runs saved" vs. "runs prevented" thing that I talked about down there -- not necessarily that Bill meant different things by them, which I'm guessing he didn't, but that the seeming interchangeability of the terms is a manifestation that some here see "runs saved" as a broader, more general thing than some of the rest of us.
9:32 PM Mar 18th

MarisFan61
OK, re my saying I think I get it:

Actually I should have said I think I 'sort of' get it.
I think I might be close to getting it; I still can't exactly see it. I can sort of see it if we put a couple of tweaks in it.

Preamble:
Maybe part of the confusion is that in the middle of the article, Bill switches temporarily from saying "runs saved" to "runs prevented," and then back. Did he mean them synonymously, or did he digress for a moment into a slightly different thing?

(When he said the thing we're not getting, it was in the midst of his having switched to "prevented.")

I'm guessing they're synonymous (and no issue), but better to be aware of this, just in case.
End of preamble.

I can see that depending on how we see the meaning of "runs saved," one way of seeing it is that it can't be any greater than the typical amount of runs that the opposing teams would score.
.....because, "saved" compared to what?
Like, remember in that early annual Abstract where Bill was talking about the common utterance that Ozzie Smith or whoever saves 100 runs a year, and Bill wrote something like, "Compared to what? Compared to me playing shortstop? Compared to not having a shortstop at all?"
I can see that it's not meaningful to compare it to anything but an average competent major league regular SS.

Similarly, I can see that the only meaningful standard against which to measure a whole team's "runs saved" is to compare it to an average team's runs allowed.

Which of course is the same as an average team's runs scored -- which is how I maybe come close to getting what Bill meant.

But with a couple of "buts."

-- I don't see why Bill would have put it in terms of runs scored rather than allowed, because in how I'm seeing it, "allowed" is the most relevant thing, the direct thing.
(Which makes me think he means something different by everything he's saying than what I'm imagining.)

......and

-- By any concept of "runs saved" that I can see, a team's amount of runs saved wouldn't anyway be that amount; it would be less.
That amount -- i.e. an average team's number of runs allowed (or "scored" -- same number) would be the maximum theoretical number of runs that a team could "save," which would be only if they never allowed a run for the whole season.

So, as I said, I'm still not really seeing it. At most, I'm maybe getting close to sort of seeing it.

At least I'm trying. :-)
8:26 PM Mar 18th

77royals
Isn't all of this for Runs Saved (walks, hit batters, wild pitches, strikeouts) leading to the amount of runners on base and how many opportunities they have to score?

Vs outs made? Or actual runs scored?

Help?

6:49 PM Mar 18th

jgf704
FWIW, I agree with studes. Bill's WS thing of MDR = 1.5 * RAVG - R is the same thing he's doing now. Except now he's set the RAVG multiplier as 2 instead of 1.5.
5:20 PM Mar 18th

frisco
I always thought defense metrics should start with just straight range factor per inning. Then, you could take that number and make a series of adjustments based on K's, handedness of pitching staff and hitters faced, GB/Flyball ratio, and so on. But you would START with the raw, undisputable number. That's your base.

Obviously, today we have a lot of data but I think the numbers are based on interpretation and assumptions that present stats as seemingly objective but aren't. "New" fielding metrics tend to rely on evaluation of what happened right at the start.

There was a comment in the Bill James Handbook that they are only rating fielders now on what they did AFTER the ball was hit. That doesn't make any sense to me. A play that is made is made. Whether due to extraordinary physical effort or maybe ordinary physical effort and great positioning. Shouldn't be penalized for the latter.

My Best-Carey
4:19 PM Mar 18th

studes
Hey Marisfan, check out page 17 of Win Shares. Marginal defensive Runs = Anything less than 1.5 times average runs scored. IOW, runs saved is a factor of runs scored. Maybe I'm misinterpreting, but I think it's fundamentally the same thing.
4:19 PM Mar 18th

voxpoptart
To me, the logic behind "runs saved = runs scored" is that the variation among teams' runs-scored and the variation among teams' runs-allowed have always been about equal. In other words, there's the same amount of *achievement* to account for on both sides of the ball, because there's the same amount of gap between best and worst.

It's not an equality that's abstractly necessary; it's just the one that fits a century-plus of data best.
3:52 PM Mar 18th

MarisFan61
I think I get it. Details in a bit....
3:23 PM Mar 18th

ksclacktc
Chances are if you don't get what Bill is espousing you never will. And, as was mentioned before, you're likely a WAR guy. That is fine by me, Coke vs. Pepsi, different strokes. I just hope Bill doesn't spend too much time trying to explain to those who will never agree.
3:23 PM Mar 18th

MarisFan61
I wouldn't be so quick to say that Bill's premise is wrong.

But if it's right (or even just arguably reasonable, which I don't think it is either), it needs to be explained. It isn't an evident thing.
3:02 PM Mar 18th

jgf704
To myersb... I understand. I will say that I *thought* that raw advanced fielding data was available through baseballsavant. However, now that I've looked a bit, I'm not so sure. But I'm not intimately familiar with baseballsavant. Perhaps tangotiger can weigh in on this.
2:31 PM Mar 18th

myersb
To jgf704: a big table of data is exactly what Bill is aiming for, so that others might use it to build interesting ways to evaluate defensive performance. I agree with him that it's the right place to start, and I like his take on it.

In general:

However, the idea that runs scored = runs prevented is in no way correct. The game IS half run creation and half run prevention. But while it's true that the net result of that combined activity is an equal number of wins and losses, it is no more true that runs created and runs prevented are equal than it is that yards gained and yards prevented on a football field are equal. A run created or a yard gained is a run allowed or a yard surrendered: the prevention is invisible, i.e. it is something which did NOT happen, and so a great deal trickier to measure.

This is an error in logic, and I hope Bill will realize and acknowledge that. I like Bill's approach of starting with a compilation of data and figuring out how to evaluate it from there. It's cleaner and more comprehensive, and allows for much wider contributions from the crowd, which can use that data to reach evaluations in more creative ways.

But I also value the work John has done, and the information it contains. I realize that Bill values that as well. He just has his own unique take on how to approach the problem, which is exactly what makes him Bill James.
2:08 PM Mar 18th

MarisFan61
Studes: I don't see what "runs scored = runs saved" has to do with the Win Share system, no matter how broadly I try to look at it.
2:05 PM Mar 18th

studes
I'll just jump in and say that I agree that runs scored = runs saved isn't intuitive to me either. And if you're uncomfortable with that, then you probably should be uncomfortable with Win Shares, too, since it essentially uses the same approach (Bill, correct me if I'm wrong about that). It's one of the things that didn't sit right with me about Win Shares and eventually turned me toward the WAR approach instead.

David Kaiser, I endorse your comments about Wizardry, though it is a very hard read. Worth it, though.
1:00 PM Mar 18th

jgf704
And while I think having access to more defensive "facts" like he has in the Jack Nobody and John Somebody table would be useful, the table he has provided does not tell the whole story. In particular, having knowledge of how hard the ball was hit, how far the player had to move, etc., is necessary too in order to turn these facts into estimates of their impact (i.e. runs).

I suppose you can find all this info in Statcast, or in the raw data used to come up with DRS. But I don't think seeing a big table of numbers would be all that useful, except to provide the raw data for converting the big tables into helpful summaries. Which, IMO, is what DRS and Statcast Defense summary numbers (i.e. the runs above or below average) try to do.
12:18 PM Mar 18th

jgf704
I agree with most here in that I totally do not understand how Runs Prevented = Runs Scored.

I like dfan's comment. Along similar lines... If Runs Prevented = Runs Scored for a season, then it ought to be true for a game as well. But this leads to the conclusion that more runs are prevented in a 6-4 game than in a 1-0 game. Which makes no sense to me.

What I think Bill has done here is assume, in essence, that the baseline for Runs Prevented is 2 times the league average of Runs Scored (with adjustments for context). This is what he does with the Nationals example.

I also disagree with the premise that says it is wrong to use average performance as a baseline. As long as the baseline is known and well-defined, any baseline can work.

Heck, here Bill is basically saying that he thinks the baseline is 5 standard deviations below average. So if we compare a player to average and call him a -4, how is that inferior to saying that player is +1 compared to the -5 baseline?
12:05 PM Mar 18th
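
A small sketch of the baseline point in jgf704's comment above: moving the zero point from average to a baseline 5 units below average shifts every rating by the same amount and leaves the gaps between players unchanged. The player names and ratings are hypothetical, and the units are left abstract, as in the comment.

```python
def rebase(value_vs_average: float, baseline_vs_average: float) -> float:
    """Re-express a rating measured against average relative to another baseline."""
    return value_vs_average - baseline_vs_average


BASELINE = -5.0  # baseline sits 5 units below average, as in the comment

# Hypothetical ratings relative to average.
vs_average = {"Player A": -4.0, "Player B": 0.0, "Player C": 2.5}
vs_baseline = {name: rebase(v, BASELINE) for name, v in vs_average.items()}

# The gap between any two players is the same under either baseline.
gap_vs_average = vs_average["Player C"] - vs_average["Player A"]
gap_vs_baseline = vs_baseline["Player C"] - vs_baseline["Player A"]
print(vs_baseline, gap_vs_average, gap_vs_baseline)
```

The -4 player becomes +1 against the -5 baseline, which is the comparison the comment asks about.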

evanecurb
Total runs prevented is a concept that makes perfect sense to me. If we accept that runs created is equal to the sum of offensive events, with assigned values for each event, e.g. each single = 0.35 runs or something, then it follows that preventing these events can be calculated in the same manner.
9:27 AM Mar 18th

evanecurb
I really appreciate the explanation of what Bill is attempting to do here. I had been wondering. I found the discussion of the disagreement between Bill and John Dewan interesting. I had never thought about that aspect of comparing numbers to an average. I'm still not sure what I think about it, but now I'm able to think about it for the first time.
9:24 AM Mar 18th

dfan
One more person here who doesn't understand how Total Runs Prevented = Total Runs Scored. Runs aren't zero-sum like win-loss records. Given that we already know that Total Runs Scored = Total Runs Allowed, this is telling us that Total Runs Prevented = Total Runs Allowed, which is certainly nonintuitive.

One way that helps me understand analyses is to take small deltas. Take a season that already exists, like the 2020 NL. 11608 runs were scored; that's a fact. Now imagine one single change in that season; at some point, with two outs and the bases loaded, an outfielder muffed a catch and three runs scored instead of catching the ball and ending the inning (as occurred in reality).

In this alternate season, 11611 runs were scored, a tiny bit more than before. Is it also true that 11611 runs were prevented, more than before? Really? The result of someone making a brutal error is that we credit defenses with performing better in the aggregate?
9:09 AM Mar 18th
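
dfan's small-delta test above can be written out directly. This is only a sketch of the premise being questioned (league-wide runs prevented taken as equal to league-wide runs scored), not Bill's actual method; the function name is hypothetical and the run total is the one quoted in the comment.

```python
def total_runs_prevented(total_runs_scored: int) -> int:
    """Premise under test: league runs prevented equals league runs scored."""
    return total_runs_scored


actual_runs = 11_608              # season total quoted in the comment
alternate_runs = actual_runs + 3  # one muffed catch lets three extra runs score

change_in_prevented = (total_runs_prevented(alternate_runs)
                       - total_runs_prevented(actual_runs))
print(change_in_prevented)
```

Under the premise, the extra error raises the league's prevented total by three runs, which is the result the comment finds counterintuitive.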

ksclacktc
I think what Bill is doing is flipping the scoreboard. When you score 723 runs as a team, that is above ZERO (you can't have negative Runs Scored). Say this is done in a league of 15 teams that scored 11,250 runs (an average of 750). If defense and offense are roughly equal, you can then theoretically account for Runs Scored in equal terms in team defense.

Which brings you to the flip side of that. The scoreboard for team defense starts at double the league runs scored, since we've accounted for the offense but not the defense yet, and they need to be equal because we've theorized offense=defense.

Furthermore, he is acknowledging that each of these numbers has a Replacement Level separate from the Zero number.
7:32 AM Mar 18th
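
The "flipped scoreboard" in ksclacktc's comment above can be sketched as follows, assuming each team's defensive scoreboard starts at twice the league-average runs scored and counts down by runs allowed. The 15-team, 11,250-run league is the one in the comment; the individual team totals are hypothetical.

```python
LEAGUE_RUNS = 11_250   # league total from the comment
TEAMS = 15

league_avg = LEAGUE_RUNS / TEAMS   # 750 runs per team
defense_start = 2 * league_avg     # assumed starting point for team defense


def runs_prevented(runs_allowed: float) -> float:
    """Runs prevented, counted down from twice the league average (assumption)."""
    return defense_start - runs_allowed


# Hypothetical runs-allowed totals that add up to the league's runs scored.
offsets = [-70, -60, -50, -40, -30, -20, -10, 0, 10, 20, 30, 40, 50, 60, 70]
runs_allowed_by_team = [league_avg + d for d in offsets]

league_prevented = sum(runs_prevented(ra) for ra in runs_allowed_by_team)
print(league_avg, defense_start, league_prevented)
```

Because league runs allowed equal league runs scored, starting every defense at double the average makes the league's prevented total come out equal to its scored total by construction.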

KaiserD2
Not only could I write a book about this controversy, I did, in effect. [i]Baseball Greatness[i] is based on evaluating players and teams based on deviations from average by hitters, fielders, and pitchers. I tried to make clear in that book why I think that's the best way to do it and I won't try to summarize all that here. I will say this: especially if you are looking historically at earlier periods with only eight teams per league, the average will vary a lot based on who happens to be in the league. Example: Duke Snider, it turns out, was an average center fielder at best. (So for most of his career was Mickey Mantle, by the way.) But Duke had the misfortune to be playing in a league that included Willie Mays and Richie Ashburn, two of the greatest center fielders in history. (On the other hand, he was also competing against Gus Bell, the Derek Jeter of centerfielders.) So Duke may have been somewhat better than his stats look, although I still think the Dodgers would have been stronger with Carl Furillo in center field.

Michael Humphreys in [i]Wizardry[i] measured every fielder in history against the average and was the inspiration for my writing [i]Baseball Greatness.[i] Dick Cramer in his autobiography described [i]Wizardry[i] as the greatest single achievement in the history of sabermetrics. I'm bringing this up because the "Nettles problem" doesn't really exist in Humphreys's methodology. It's based on an estimate of how many balls were hit into Nettles's area, taking into account total balls in play, ground ball/fly ball tendencies, and left-right handedness of the Yankees' pitchers and thus the hitters that they faced. Doing those calculations for every team in the league allows Humphreys to estimate the number of outs an average Yankee third baseman would have created out of ground balls and compare that to Nettles's assists. (Incidentally, all Humphreys's data is available in spreadsheets at the Oxford University Press website. And seamheads has continued to use his method, Defensive Regression Analysis or DRA, for all MLB players in the years since [i]Wizardry[i] appeared.)

Count me as the third reader who doesn't understand where Bill is getting his runs saved figure.

DK
7:22 AM Mar 18th

steve161
The problem with chrisbodig's Runs Saved is that it doesn't necessarily reflect the fielder's value. Take the Nettles game. There seem to have been a lot of runs saved, because Nettles was given the opportunity to save them. If his pitchers and fellow fielders had been more efficient, and not allowed all those baserunners and hard-hit balls, the exact same plays would have saved far fewer runs, but Nettles' ability would be the same. So Chris' methodology is uninformative. Take the situational aspect away and you've got the Statcast-based Outs Above Average. Bill, does the same contextual objection you have to Defensive Runs Saved apply to Outs Above Average?

I don't completely follow the math of what Bill is trying to do here, but it seems to be based on the notion that there is a universe of possibility when it comes to runs scored and allowed by major league teams. Given that, the idea is to evaluate teams' offense and defense by their position in that universe, and to evaluate individual players' contributions to achieving that position.

Just as Win Shares starts out with wins and attempts to assign individual players responsibility for achieving those wins, so this approach starts with runs allowed and, I assume, will attempt to assign responsibility similarly. Just as Win Shares makes more sense to me, intuitively, so I allow myself to hope that this system for evaluating defense will make more sense than Dewan's, with its manifest scaling issues.
6:08 AM Mar 18th

MarisFan61
I'm with Threedog.

Maybe it's a thing that's intuitive to wiser folks, but what I can say for sure is this: It's not obvious why that would be true, nor that it is for sure true.

The article takes a leap there.

And let me say, at risk of having to eat my hat, which actually is no risk because it has already been thoroughly eaten :-)
.....I think it isn't so -- at least in terms of what "runs saved" means to most of us.

-------------------

Totally separate note: The material here about gross amounts vs. relation-to-average is exactly a very big part of why I so much prefer "Win Shares" to the other systems.

-------------------

Other totally separate note:
I sometimes don't get stuff :-) so I need a little help here.

Was this next thing sarcastic, sort of a parody on what Bill is criticizing?
Please tell me yes. Otherwise I don't get it.

"Pitchers? Well, the basic thing is Wins and Losses, and Al Spalding was 47-12, so we will record that as +35; that gets the essence of that, don’t you think? That’s all you need to know. Jim Devlin was 30-35, which is -5, and Dick McBride was 0-4, so we’ll record that for posterity as -4."
12:56 AM Mar 18th

threedog
runs, not rums.
8:52 PM Mar 17th

threedog
It is not clear to me how a run scored means that a run was saved. Why would runs scored equal rums saved?
8:51 PM Mar 17th

gregforman
I’m not seeing how you determine there were 11,449 runs saved. Why weren’t there an infinite number of runs saved?
8:21 PM Mar 17th

CharlesSaeger
Incidentally, why do you think Wild Pitches are so much higher nowadays, especially when Passed Balls have not risen? Both fell dramatically throughout the 19th century as catchers put on gloves, but while Passed Balls have been fairly constant since about 1920 (0.05 to 0.12 per game since then), Wild Pitches went up dramatically starting in the late 1950s, though have been basically stable since the 1980s (0.12 per game in 1928, the low, to a modern high of 0.38 in 2018).
5:18 PM Mar 17th

CharlesSaeger
As someone who is constantly fiddling with fielding numbers (and I have a thing for forces at second lately, after the Alomar comments over the winter), I absolutely applaud what you're trying to do.
5:09 PM Mar 17th

chrisbodig
This is great.
I too am fascinated and frustrated by defensive metrics. For years gone by, I care about them but don't trust them.
The "defensive runs saved" stat I am dying to see is approximately how many runs a fielder actually saved in the real world, one in which a robbed home run counts more with 2 outs and the bases loaded than it does with 2 outs and the bases empty.

I've been watching a lot of '70's (and now '80's) postseason games on YouTube recently. About a week ago I watched Game 3 of the 1978 World Series.
How many runs did Graig Nettles save in that 5-1 Yankees victory?
-- In the top of the 3rd he snared a hard hit grounder by Reggie Smith with Bill Russell on 1st (who might have or might not have scored if the ball had gone into the left field corner).
-- In the top of the 5th with the bases loaded and 2 outs, he caught a hard-hit grounder by Steve Garvey that took a tough hop, getting a force at 2nd base. That's 2 to 3 runs saved.
-- In the top of the 6th, the Dodgers loaded the bases again with 2 outs. Nettles made a brilliant stab on a missile from Davey Lopes, again getting the force at 2nd. Another 2 to 3 runs saved.
All told, Nettles saved anywhere from 4 to 7 runs. With today's Statcast metrics and Run Expectancy, people smarter than me could create a more precise number in between 4 and 7.

As Bill has said, current defensive metrics are only designed to point out "better or worse than average." What I think most baseball fans would love to see is not an evaluation of how good a player is based on a set of formulas but how many runs he ACTUALLY SAVED.

Can a player be overly rewarded or not rewarded by fortunate timing? Of course. But timing is everything. If Nettles had made all of those plays with the bases empty, it still would have been a great game. But that's not what happened. He made those plays in the most important situations and that's why it's arguably the greatest defensive postseason performance ever.

I want to know what a player DID and how his deeds impacted real baseball games.
4:43 PM Mar 17th
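
The 4-to-7 total in chrisbodig's comment above is the sum of a low and a high estimate for each of the three plays. A minimal sketch: the 2-to-3 ranges are stated in the comment, while the 0-to-1 range for the third-inning play is inferred from "might have or might not have scored" and from the stated total.

```python
# (description, low estimate, high estimate) of runs saved per play.
plays = [
    ("3rd inning, Reggie Smith grounder, runner on 1st", 0, 1),  # inferred range
    ("5th inning, Steve Garvey grounder, bases loaded", 2, 3),
    ("6th inning, Davey Lopes shot, bases loaded", 2, 3),
]

low_total = sum(low for _, low, _ in plays)
high_total = sum(high for _, _, high in plays)
print(f"Runs saved: {low_total} to {high_total}")
```

A run-expectancy table would replace each hand-estimated range with a single expected value per play, which is the more precise number the comment asks for.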

COMMENTS (50 Comments, most recent shown first)
