The Fielding Percentage Part

By Bill James

July 17, 2020

Formula 20: Tm-Err-Av (Team Errors Avoided)

We come now to the one category by which fielders have traditionally been measured, the one category which was designed in baseball’s primordial soup to evaluate fielders, and which served that function, however badly, for 100 years.

I’ve been avoiding writing essays here to explain the logic behind the system; I’m trying to reach the end of the long process of explaining how the process works, and essays about why it works that way just hold us up. But sometimes it is necessary to explain.

As you all know, fielding percentages have been going up steadily since organized baseball began. Eventually, fielding percentages went up so much that they became largely irrelevant, no longer central to the task of evaluating fielders.

Errors are awkward to deal with because, more than any other statistical category, there is an inconsistency in their consequences. An "error" can be a one-base error, a two-base error, a three-base error, even, in theory, a four-base error, although I have never seen a four-base error in the major leagues. (The last time it happened was ten years ago.) An error can put a runner on base, or it sometimes (often) does not result in an out/no out distinction, but merely advances a runner who is already on base. It might advance him one base, or two bases, or three bases. Some errors occur on foul balls, and have no base or out consequence; they merely extend the at bat.

No other statistical event has this range of outcomes. Also, fielding statistics including fielding percentages are extremely different from one position to another, much more so than batting statistics. These things can make it difficult to place a run value on errors, but we haven’t reached that problem yet. At this time, we’re just dealing with the question, "How many errors did this team NOT commit, that they might have committed?", not the question of "what was the run benefit of those errors not committed?"

The New York Giants of 1900, a last-place team, committed 439 errors, which seems like a lot, but you had to be there. 439 errors was .0721 of their fielding opportunities; they had a fielding percentage of .928, but I focused on the error percentage.

The error percentage of all teams in the data (1900-2019) is .0246 with a standard deviation of .0092, so five standard deviations worse than the norm would be an error percentage of .0708, or a fielding percentage of .929.

To this point in our process we have encountered only one team which is five standard deviations worse than the norm in any category, that being the 1915 Philadelphia Athletics, who were five standard deviations worse than the norm in Walks + Hit Batsmen. We have, however, not one but two teams which had fielding percentages more than five standard deviations worse than the norm—the 1900 Giants, and the 1901 Baltimore Orioles.

As I have explained before, we have to create SOME room to say that INDIVIDUALS on the team can be given credit for run prevention in this area, even if the TEAM deserves none. On the 1900 Giants, for example, Hall of Fame shortstop George Davis had a fielding percentage of .944, which was much better than the league average at the position, .922, and certainly better than five standard deviations below the norm. We have to do something to create space to acknowledge the few players on those teams who did make a contribution to run prevention in this area.

I will make a "carve out" rule that every team in baseball history must be credited with at least 100 errors not made, regardless of how many errors they DID make. That rule would affect 32 teams in the 1900-1911 era, which otherwise would be credited with fewer than 100 errors not made. All teams since 1911 would already be above that standard.

Second, I will award claim points, for those errors not made, based on the player being above the standard of three times the error percentage at his position, in his season. In other words, let’s say that the fielding percentage at Catcher in a given season is .950; then catchers in that season would be given Claim Points for the runs prevented by not making errors based on how far their fielding percentage was above .850. If the fielding percentage was .970, they’d get claim points for fielding above .910.

The lowest error percentage for a team was .0090 (fielding percentage .9910) by the 2013 Baltimore Orioles. The most Errors NOT MADE, however, was by the 2006 Boston Red Sox. That star-crossed team fielded .989 (error percentage .0107) but had several hundred more fielding chances than the 2013 Orioles. That team is credited with not making 389 errors.

Is it reasonable to say that a team’s fielders should be credited with not making 389 errors, in a season?

In my opinion, yes, it is very reasonable. I mean, we’ll see, down the line, whether the numbers work out so that that works within the structure I am trying to create, but "just doing your job", as a fielder, is really, really important in preventing runs. A baseball team prevents a lot of runs, over the course of a season, by people just doing their job. One of the basic questions we have to ask, in an analytical system designed to measure the defensive contribution of every player, is "how do we assign each player credit for just doing his job at a major league level?" I think this is how we do that. We’ll have to check the proportions later on.

Anyway, first we assign "errors not made" to the TEAM, and then we assign "claim points" to individual fielders based on their individual fielding records and fielding percentages. It’s a two-stage process, like flying from Kansas City to Los Angeles but you have to go through Atlanta or Denver. This is the first flight.

The Errors Avoided by a Team are figured as follows. Team Error Percentage is the complement of the fielding percentage, 1 minus the fielding percentage:

Tm-Err-Av = (PO + Assists + Errors) * .0708 – Team Errors

But never less than 100.

These are the errors "not made" by each of the 15 teams that we are following:

YEAR	City	Team	Lg	PO	A	E	F Pct	Errors Not Made
1960	Pittsburgh	Pirates	NL	4199	1774	128	.979	304
1964	New York	Mets	NL	4316	1914	167	.974	286
1968	Detroit	Tigers	AL	4463	1615	105	.983	333
1972	Texas	Rangers	AL	4116	1618	166	.972	252
1976	Cincinnati	Reds	NL	4413	1678	102	.984	336
1980	Seattle	Mariners	AL	4372	1930	149	.977	308
1984	Detroit	Tigers	AL	4392	1667	127	.979	311
1988	Baltimore	Orioles	AL	4248	1726	119	.980	312
1992	Toronto	Blue Jays	AL	4322	1591	93	.985	332
1996	Detroit	Tigers	AL	4298	1727	137	.978	299
2000	New York	Yankees	AL	4273	1487	109	.981	307
2004	Arizona	Diamondbacks	NL	4308	1706	139	.977	297
2008	Philadelphia	Phillies	NL	4349	1698	90	.985	344
2012	Houston	Astros	NL	4270	1729	118	.981	315
2016	Chicago	Cubs	NL	4379	1635	101	.983	332
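
As a quick check on Formula 20, here is a minimal Python sketch that recomputes the last column of the table above from putouts, assists and errors. The function and constant names are illustrative labels, not part of the system itself; the .0708 rate and the 100-error minimum come from the text. The last example uses a made-up team line (not a real club's totals) only to show the "never less than 100" rule taking effect.

```python
ERROR_RATE_STANDARD = 0.0708   # five standard deviations worse than the 1900-2019 norm
MIN_ERRORS_AVOIDED = 100       # the "carve out" minimum from the text


def team_errors_avoided(po, a, e):
    """Formula 20: errors the team did NOT make, never less than 100."""
    chances = po + a + e
    return max(chances * ERROR_RATE_STANDARD - e, MIN_ERRORS_AVOIDED)


# (year, team, PO, A, E) copied from three rows of the table above
teams = [
    (1960, "Pirates", 4199, 1774, 128),
    (1972, "Rangers", 4116, 1618, 166),
    (2008, "Phillies", 4349, 1698, 90),
]
for year, team, po, a, e in teams:
    print(year, team, round(team_errors_avoided(po, a, e)))

# Hypothetical dead-ball-era line (invented numbers): the raw figure is
# negative, so the minimum of 100 applies.
print(team_errors_avoided(4000, 1561, 439))
```

The three real rows come out to 304, 252 and 344, matching the table.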

Formula 21: RS-Tm-Err-Av (Runs Saved by Team Errors Avoided)

Each Error not made has a run-saving value of .293 runs.

RS-Tm-Err-Av = (Tm-Err-Av) * .293

These are the Runs Saved by Error Avoidance for the 15 teams:

YEAR	City	Team	Lg	Errors Not Made	Runs Saved
1960	Pittsburgh	Pirates	NL	304	89
1964	New York	Mets	NL	286	84
1968	Detroit	Tigers	AL	333	97
1972	Texas	Rangers	AL	252	74
1976	Cincinnati	Reds	NL	336	99
1980	Seattle	Mariners	AL	308	90
1984	Detroit	Tigers	AL	311	91
1988	Baltimore	Orioles	AL	312	92
1992	Toronto	Blue Jays	AL	332	97
1996	Detroit	Tigers	AL	299	88
2000	New York	Yankees	AL	307	90
2004	Arizona	Diamondbacks	NL	297	87
2008	Philadelphia	Phillies	NL	344	101
2012	Houston	Astros	NL	315	92
2016	Chicago	Cubs	NL	332	97
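
The Runs Saved column can be reproduced the same way. The sketch below is an illustration under one assumption: that the unrounded Errors Not Made figure is carried into the multiplication. That assumption is inferred from the table (rounding the 1968 Tigers to 333 first would give 98, not 97) and is not stated in the text. Function names are illustrative.

```python
ERROR_RATE_STANDARD = 0.0708     # error-rate standard from Formula 20
MIN_ERRORS_AVOIDED = 100         # minimum credit from Formula 20
RUNS_PER_ERROR_AVOIDED = 0.293   # run value of each error not made


def team_errors_avoided(po, a, e):
    """Formula 20, left unrounded."""
    return max((po + a + e) * ERROR_RATE_STANDARD - e, MIN_ERRORS_AVOIDED)


def team_runs_saved(po, a, e):
    """Formula 21: runs saved by the team's errors avoided."""
    return team_errors_avoided(po, a, e) * RUNS_PER_ERROR_AVOIDED


# PO, A, E for three teams from the tables above
print(round(team_runs_saved(4199, 1774, 128)))  # 1960 Pirates
print(round(team_runs_saved(4463, 1615, 105)))  # 1968 Tigers
print(round(team_runs_saved(4349, 1698, 90)))   # 2008 Phillies
```

These print 89, 97 and 101, in line with the table.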

Formula 22: Pos-BL (Positional Base Line for fielding percentage)

Credit for these Runs Prevented is transferred from the team to the individual player by the way of "Claim Points". The more claim points each player has—that is, the more errors he doesn’t make—the more Runs he will be credited with saving.

The positional baseline for each position is one minus three times the error percentage for that position in that league in that season.

Pos-BL = 1 – ((1- LgFPct) * 3)

These all being pitchers, the baseline is established by the league fielding percentage for pitchers. For the pitchers on the 15 teams we are following, these are the baselines:

YEAR	City	Team	Lg	FPct	Baseline
1960	Pittsburgh	Pirates	NL	.962	.886
1964	New York	Mets	NL	.953	.859
1968	Detroit	Tigers	AL	.957	.871
1972	Texas	Rangers	AL	.950	.850
1976	Cincinnati	Reds	NL	.950	.850
1980	Seattle	Mariners	AL	.957	.871
1984	Detroit	Tigers	AL	.958	.874
1988	Baltimore	Orioles	AL	.961	.883
1992	Toronto	Blue Jays	AL	.953	.859
1996	Detroit	Tigers	AL	.960	.880
2000	New York	Yankees	AL	.956	.868
2004	Arizona	Diamondbacks	NL	.965	.895
2008	Philadelphia	Phillies	NL	.959	.877
2012	Houston	Astros	NL	.950	.850
2016	Chicago	Cubs	NL	.948	.844
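
The baseline column above follows directly from Formula 22. A minimal sketch, with an illustrative function name and a `multiple` parameter that is only there to make the "three times" explicit:

```python
def positional_baseline(league_fpct, multiple=3):
    """Formula 22: one minus three times the league error percentage."""
    return 1 - (1 - league_fpct) * multiple


# League fielding percentages for pitchers, taken from the table above
for label, fpct in [("1960 NL", 0.962), ("1972 AL", 0.950), ("2016 NL", 0.948)]:
    print(label, round(positional_baseline(fpct), 3))
```

This gives .886, .850 and .844, the same baselines shown in the table.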

Formula 23: Claim-Ind-Err-Av (Individual Player Claim Points for Errors Avoided)

Each player’s Claim Points for Errors Avoided are established by subtracting his errors from the number of errors that would be made by a player with the same number of fielding chances, fielding at the level of the positional base line:

Claim-Err-Av = ((PO + Ast + Err) * (1 - Pos-BL)) - Errors
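
To make Formula 23 concrete, here is a small sketch using the one individual fielding line the article gives in full: Lucas Harrell of the 2012 Astros, who handled 51 chances without an error, in a league where pitchers fielded .950 (baseline .850). The function takes total chances rather than separate putouts and assists, which is a simplification for the example; the names are illustrative.

```python
def claim_points(chances, errors, baseline):
    """Formula 23: errors a baseline-level fielder would make, minus actual errors.

    `chances` stands for PO + Ast + Err.
    """
    return chances * (1 - baseline) - errors


# Lucas Harrell, 2012: 51 chances, 0 errors, positional baseline .850
print(round(claim_points(51, 0, 0.850), 2))
```

Under those inputs Harrell earns 7.65 claim points; every error he had made would have reduced that figure by one.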

Formula 24: Claim-Team-Err-Av (Team Claim Points for Errors Avoided)

The Claim Points for each team are the sum of the Claim Points for the individual fielders on the team. Not sure how to write that as a formula:

Claim-Team-Err-Av = Sum all (Claim-Err-Av)

Formula 25: RS-Err-Av-P8 (Runs Saved by Error Avoidance, 8th pitcher’s value)

The Team Runs Saved by Error Avoidance (RS-Tm-Err-Av, Formula 21) are distributed to the individual players in the same proportions as their Individual Claim Points (Claim-Err-Av, Formula 23):

RS-Err-Av-P8 = (Claim-Err-Av) / (Claim-Team-Err-Av) * RS-Tm-Err-Av
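
The team sum and the proportional split can be sketched together. The first two calls use the rounded Err Av, Team Total and Team RS figures from the pitcher table later in the article, so the results are approximate. The list of claim points at the end is invented purely to show that the shares add back up to the team's runs saved. All names are illustrative.

```python
import math


def team_claim_points(individual_claims):
    """Formula 24: team claim points are the sum of the individual claim points."""
    return sum(individual_claims)


def runs_saved_share(claim, team_claim, team_runs_saved):
    """Formula 25: a player's share of the team's runs saved by error avoidance."""
    return claim / team_claim * team_runs_saved


# Rounded inputs from the pitcher table: (Err Av, Team Total, Team RS)
print(round(runs_saved_share(9, 295, 97), 2))   # Denny McLain, 1968
print(round(runs_saved_share(7, 308, 89), 2))   # Bob Friend, 1960

# Hypothetical claim points for four fielders on one team (invented numbers)
claims = [9.0, 6.5, 4.0, 1.0]
total = team_claim_points(claims)
shares = [runs_saved_share(c, total, 97.0) for c in claims]
print(math.isclose(sum(shares), 97.0))          # shares add up to the team figure
```

McLain's share comes to about 2.96 runs and Friend's to about 2.02, which round to the 3 and 2 shown in the P8 column.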

The process for Runs Saved by Error Avoidance will be the same at the other 8 defensive positions as it is for pitchers. I just had to run through the entire process at pitcher to establish how it works for pitchers. It works the same at the other positions.

I need to explain why the system works this way. Why can’t we just figure the number of errors that each fielder avoided, compared to a fielder making twice the league norm for errors, and credit the fielder with .293 Runs Saved for each error that he did not commit?

In a relative runs saved analysis, in which each fielder was compared to the other fielders in his league and season, that is the way you would do that. But this is not a relative measurement; this is an absolute measurement. Think about two fielders, one from 1900, who fields .940 in a league in which the fielding percentage at the position is .915, and one from 2010, who fields .980 in a league in which the fielding percentage at the position is .975.

In a relative analysis, the fielder who fields .940 in a .915 league has had a better season. But in absolute terms, it is .980 against .940. In absolute terms, the higher number is better. The player making fewer errors has avoided more errors—not a novel concept when you think about it.

IF you can measure the runs saved by each fielder accurately, THEN you can very easily make a relative comparison based on that. But if you make the relative comparison first, then it is basically impossible to transition from that to an absolute number.

Because fielding percentages vary widely both over time and from position to position, any fielding percentage between .910 and .990 can be a completely normal fielding percentage, based on the era and the position. I created this two-tiered system to interpret error rates both globally—that is, on a constant scale from 1900 to 2019—and locally—that is, in a manner specific to the position and the year. I don’t know how else you could do it.

(Later note: This system has some issues. To be honest, this part of the process is not working sensationally well. What we are doing, really, is measuring two things which are a little bit different but mostly the same, and then treating them as if they were actually the same. The result of this is that the value of an error avoided... the cost of an error... can vary with the team in ways that may not be rational. But I don’t know how to do this and avoid that problem, with the great multiplicity of standards in fielding percentage.)

Anyway, this is the updated list of the 15 pitchers on these 15 teams who are saving the most runs. Bob Friend has moved ahead of Randy Johnson into second place on the list, and Dan Petry has now moved up to 5th on the list, ahead of Whole Camels. I’m glad that has happened, actually, because it shows that these "little" categories do matter; although the number of runs in each one is small compared to strikeouts and walks and homers, there is a cumulative effect. Dan Petry illustrates the point, as he was not even in the top 10 after strikeouts and walks, but has now moved up to 5th place by posting consistently strong scores in the small-ball categories. Andy Pettitte drops a slot because he made four errors in 2000:

Year	Player	P1	P2	P3	P4	P5	P6	P7	Err Av	Team Total	Team RS	P8	Total
1968	Denny McLain	45	27	21	7	1	1	6	9	295	97	3	110.95
1960	Bob Friend	29	27	23	7	0	0	4	7	308	89	2	92.22
2004	Randy Johnson	47	20	18	4	2	0	0	3	178	87	1	92.21
1960	Vern Law	19	26	18	7	1	1	7	7	308	89	2	81.78
1984	Dan Petry	23	16	17	5	4	1	2	8	264	91	3	71.38
2008	Cole Hamels	32	18	12	5	1	1	2	3	216	101	1	71.20
1968	Earl Wilson	27	15	16	4	2	1	4	4	295	97	1	69.22
1968	Mickey Lolich	32	13	14	5	3	0	1	1	295	97	0	67.28
2016	Jon Lester	32	13	12	3	2	0	3	5	245	97	2	67.06
1992	Jack Morris	21	13	20	4	2	0	4	6	244	97	2	66.43
1984	Jack Morris	24	13	19	4	1	1	2	5	264	91	2	65.65
1976	Gary Nolan	18	25	13	4	1	0	2	5	322	99	1	65.49
1964	Jack Fisher	19	18	16	5	1	1	4	5	330	84	1	65.04
2016	Kyle Hendricks	27	13	14	3	2	1	2	7	245	97	3	64.79
2000	Andy Pettitte	20	11	17	5	2	1	7	3	220	90	1	63.62

The new categories are Errors Avoided by the pitcher, Errors Avoided by the team, Runs Saved by Error Avoidance, and the pitcher’s runs saved, which are calculated based on the other three. Lucas Harrell of Houston (2012) was the leader in this category, as he did not commit an error in 51 fielding chances, the most of any pitcher in the study who fielded 1.000.

COMMENTS (6 Comments, most recent shown first)

CharlesSaeger
My reasoning for PO-SO+E as a team denominator is that it’s tough to make an error on a strikeout, and counting assists in the team denominator makes groundouts count twice, though since most reached-on-error plays happen on groundballs, the assists might be a wash. My reasoning for BFP-HR-SO as the denominator is that about 40% of errors happen on existing runners, which are typically for outfielders (batter advances an extra base on a hit because the outfielder overran the hit) or catchers (runner gets an extra base on a steal because the catcher overthrew). These are tough to handle on an individual level, but on a team level, you can adjust for this.
1:56 PM Jul 18th

CharlesSaeger
bjames:

No. I am commenting on this:

Tm-Err-Av = (PO + Assists + Errors) * .0708 – Team Errors

So, why PO+A+E as the base, other than tradition?
1:37 PM Jul 18th

bjames
Charles--

No disrespect, but your comment doesn't actually make any sense, and isn't relevant. You're commenting on parts of the system that you either don't understand, or haven't thought through. Sorry.
11:47 AM Jul 18th

CharlesSaeger
Why have the team error rate be PO+A+E? I realize that's the official base for fielding percentage, but for allocating credit/blame for a team, you could use PO-SO+E, or BFP-HR-SO, to reflect errors happening more often with runners on base (especially for outfielders and catchers).
10:23 AM Jul 18th

Rallymonkey5
I thought I was lucky enough to see a 4 base error in the very first game I ever attended, 39 years ago. Fernando Valenzuela fielded a bunt by Mario Soto and threw the ball into the RF corner. Soto and the 2 runners on all came around to score.

But checking the game logs, Soto was credited with a single, so just a 3 base error.
11:27 PM Jul 17th

TheRicemanCometh
Can I just say hat tip to Dan Petry, the forgotten Ace of the 1984 World Champion Tigers!

4:23 PM Jul 17th
