The Fielding Percentage Part

July 17, 2020
  

            Formula 20:  Tm-Err-Av (Team Errors Avoided)

            We come now to the one category by which fielders have traditionally been measured, the one category which was designed in baseball’s primordial soup to evaluate fielders, and which served that function, however badly, for 100 years. 

            I’ve been avoiding writing essays here to explain the logic behind the system; I’m trying to reach the end of the long process of explaining how the process works, and essays about why it works that way just hold us up.  But sometimes it is necessary to explain.

            As you all know, fielding percentages have been going up steadily since organized baseball began.   Eventually, fielding percentages went up so much that they became largely irrelevant, no longer central to the task of evaluating fielders. 

            Errors are awkward to deal with because, more than any other statistical category, there is an inconsistency in their consequences.  An "error" can be a one-base error, a two-base error, a three-base error, even, in theory, a four-base error, although I have never seen a four-base error in the major leagues.  (The last time it happened was ten years ago.)  An error can put a runner on base, or it sometimes (often) does not result in an out/no out distinction, but merely advances a runner who is already on base.  It might advance him one base, or two bases, or three bases.  Some errors occur on foul balls, and have no base or out consequence; they merely extend the at bat.

            No other statistical event has this range of outcomes.  Also, fielding statistics including fielding percentages are extremely different from one position to another, much more so than batting statistics.   These things can make it difficult to place a run value on errors, but we haven’t reached that problem yet.   At this time, we’re just dealing with the question, "How many errors did this team NOT commit, that they might have committed?", not the question of "what was the run benefit of those errors not committed?"

            The New York Giants of 1900, a last-place team, committed 439 errors, which seems like a lot, but you had to be there.   439 errors was .0721 of their fielding opportunities; they had a fielding percentage of .928, but I focused on the error percentage. 

            The error percentage of all teams in the data (1900-2019) is .0246 with a standard deviation of .0092, so five standard deviations worse than the norm would be an error percentage of .0708, or a fielding percentage of .929.
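To make the arithmetic behind that standard concrete, here is a minimal sketch in Python. The names (`MEAN_ERROR_PCT`, `STD_DEV_ERROR_PCT`, `error_pct_threshold`) are hypothetical labels for illustration. Using the rounded mean and standard deviation quoted above, the sum comes to about .0706; the .0708 figure presumably reflects unrounded inputs, which is an assumption on my part. Either way, the 1900 Giants' .0721 is beyond the line.

```python
# Rounded figures quoted in the text for all teams, 1900-2019.
MEAN_ERROR_PCT = 0.0246
STD_DEV_ERROR_PCT = 0.0092


def error_pct_threshold(n_std_devs: float) -> float:
    """Error percentage that is n standard deviations worse than the norm."""
    return MEAN_ERROR_PCT + n_std_devs * STD_DEV_ERROR_PCT


# About .0706 from the rounded inputs; the text uses .0708,
# presumably from unrounded values (assumption).
threshold = error_pct_threshold(5)

# The 1900 Giants' error percentage, as given in the text.
giants_1900_error_pct = 0.0721
giants_beyond_threshold = giants_1900_error_pct > threshold
```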

            To this point in our process we have encountered only one team which is five standard deviations worse than the norm in any category, that being the 1915 Philadelphia Athletics, who were five standard deviations worse than the norm in Walks + Hit Batsmen.   We have, however, not one but two teams which had fielding percentages more than five standard deviations worse than the norm—the 1900 Giants, and the 1901 Baltimore Orioles.

            As I have explained before, we have to create SOME room to say that INDIVIDUALS on the team can be given credit for run prevention in this area, even if the TEAM deserves none.   On the 1900 Giants, for example, Hall of Fame shortstop George Davis had a fielding percentage of .944, which was much better than the league average at the position, .922, and certainly better than five standard deviations below the norm.  We have to do something to create space to acknowledge the few players on those teams who did make a contribution to run prevention in this area. 

            I will make a "carve out" rule that every team in baseball history must be credited with at least 100 errors not made, regardless of how many errors they DID make.   That rule would affect 32 teams in the 1900-1911 era, which otherwise would be credited with fewer than 100 errors not made.   All teams since 1911 would already be above that standard.

            Second, I will award claim points, for those errors not made, based on the player being above the standard of three times the error percentage at his position, in his season.  In other words, let’s say that the fielding percentage at Catcher in a given season is .950; then catchers in that season would be given Claim Points for the runs prevented by not making errors based on how far their fielding percentage was above .850.  If the fielding percentage was .970, they’d get claim points for fielding above .910. 

            The lowest error percentage for a team was .0090 (fielding percentage .9910) by the 2013 Baltimore Orioles.   The most Errors NOT MADE, however, was by the 2006 Boston Red Sox.  That star-crossed team fielded .989 (error percentage .0107) but had several hundred more fielding chances than the 2013 Orioles.   That team is credited with not making 389 errors. 

            Is it reasonable to say that a team’s fielders should be credited with not making 389 errors, in a season?

            In my opinion, yes, it is very reasonable.  I mean, we’ll see, down the line, whether the numbers work out so that that works within the structure I am trying to create, but "just doing your job", as a fielder, is really, really important in preventing runs.   A baseball team prevents a lot of runs, over the course of a season, by people just doing their job.   One of the basic questions we have to ask, in an analytical system designed to measure the defensive contribution of every player, is "how do we assign each player credit for just doing his job at a major league level?"   I think this is how we do that.   We’ll have to check the proportions later on. 

            Anyway, first we assign "errors not made" to the TEAM, and then we assign "claim points" to individual fielders based on their individual fielding records and fielding percentages.   It’s a two-stage process, like flying from Kansas City to Los Angeles but you have to go through Atlanta or Denver.   This is the first flight. 

            The Errors Avoided by a Team are figured as follows.  Team Error Percentage is the complement of the fielding percentage, 1 minus the fielding percentage:

Tm-Err-Av = (PO + Assists + Errors) * .0708 – Team Errors

But never less than 100.  
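As a minimal sketch of Formula 20, including the 100-error floor, the calculation might look like this in Python. The function and constant names are hypothetical, and the second example uses invented totals purely to show the floor taking effect.

```python
# Error percentage five standard deviations worse than the norm (from the text).
STANDARD_ERROR_PCT = 0.0708
# Carve-out rule: every team is credited with at least 100 errors not made.
MIN_ERRORS_AVOIDED = 100.0


def team_errors_avoided(putouts: int, assists: int, errors: int) -> float:
    """Tm-Err-Av = (PO + Assists + Errors) * .0708 - Team Errors, never below 100."""
    chances = putouts + assists + errors
    avoided = chances * STANDARD_ERROR_PCT - errors
    return max(avoided, MIN_ERRORS_AVOIDED)


# 1960 Pirates, using the PO, A and E totals listed for them in this article.
pirates_1960 = team_errors_avoided(4199, 1774, 128)  # about 303.95, shown as 304

# Hypothetical team with invented totals: the raw figure would be about 53,
# so the carve-out rule lifts it to 100.
floor_case = team_errors_avoided(4000, 2000, 400)
```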

These are the errors "not made" by each of the 15 teams that we are following:

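| Year | City | Team | Lg | PO | A | E | F Pct | Errors Not Made |
|------|------|------|----|------|------|-----|-------|-----------------|
| 1960 | Pittsburgh | Pirates | NL | 4199 | 1774 | 128 | .979 | 304 |
| 1964 | New York | Mets | NL | 4316 | 1914 | 167 | .974 | 286 |
| 1968 | Detroit | Tigers | AL | 4463 | 1615 | 105 | .983 | 333 |
| 1972 | Texas | Rangers | AL | 4116 | 1618 | 166 | .972 | 252 |
| 1976 | Cincinnati | Reds | NL | 4413 | 1678 | 102 | .984 | 336 |
| 1980 | Seattle | Mariners | AL | 4372 | 1930 | 149 | .977 | 308 |
| 1984 | Detroit | Tigers | AL | 4392 | 1667 | 127 | .979 | 311 |
| 1988 | Baltimore | Orioles | AL | 4248 | 1726 | 119 | .980 | 312 |
| 1992 | Toronto | Blue Jays | AL | 4322 | 1591 | 93 | .985 | 332 |
| 1996 | Detroit | Tigers | AL | 4298 | 1727 | 137 | .978 | 299 |
| 2000 | New York | Yankees | AL | 4273 | 1487 | 109 | .981 | 307 |
| 2004 | Arizona | Diamondbacks | NL | 4308 | 1706 | 139 | .977 | 297 |
| 2008 | Philadelphia | Phillies | NL | 4349 | 1698 | 90 | .985 | 344 |
| 2012 | Houston | Astros | NL | 4270 | 1729 | 118 | .981 | 315 |
| 2016 | Chicago | Cubs | NL | 4379 | 1635 | 101 | .983 | 332 |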

Formula 21:  RS-Tm-Err-Av (Runs Saved by Team Errors Avoided)

Each Error not made has a run-saving value of .293 runs. 

            RS-Tm-Err-Av = (Tm-Err-Av) * .293.
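A short sketch of Formula 21 in Python, with hypothetical names throughout. One observation, offered as an inference rather than a stated rule: the published Runs Saved figures look consistent with multiplying the unrounded Errors Not Made by .293. For the 1968 Tigers the unrounded value gives about 97.5, which displays as 97, whereas starting from the rounded 333 would give 98.

```python
STANDARD_ERROR_PCT = 0.0708      # five-standard-deviation error percentage
MIN_ERRORS_AVOIDED = 100.0       # carve-out floor from Formula 20
RUNS_PER_ERROR_AVOIDED = 0.293   # run value of each error not made


def team_errors_avoided(putouts: int, assists: int, errors: int) -> float:
    """Formula 20: errors not made, never less than 100."""
    chances = putouts + assists + errors
    return max(chances * STANDARD_ERROR_PCT - errors, MIN_ERRORS_AVOIDED)


def runs_saved_by_errors_avoided(errors_avoided: float) -> float:
    """Formula 21: RS-Tm-Err-Av = Tm-Err-Av * .293."""
    return errors_avoided * RUNS_PER_ERROR_AVOIDED


# 1968 Tigers, from the PO, A and E totals listed in this article.
tigers_avoided = team_errors_avoided(4463, 1615, 105)          # about 332.76
tigers_runs = runs_saved_by_errors_avoided(tigers_avoided)      # about 97.5

# Starting from the rounded 333 instead gives a slightly higher figure.
tigers_runs_from_rounded = runs_saved_by_errors_avoided(333)    # about 97.57
```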

            These are the Runs Saved by Error Avoidance for the 15 teams:

 

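| Year | City | Team | Lg | Errors Not Made | Runs Saved |
|------|------|------|----|-----------------|------------|
| 1960 | Pittsburgh | Pirates | NL | 304 | 89 |
| 1964 | New York | Mets | NL | 286 | 84 |
| 1968 | Detroit | Tigers | AL | 333 | 97 |
| 1972 | Texas | Rangers | AL | 252 | 74 |
| 1976 | Cincinnati | Reds | NL | 336 | 99 |
| 1980 | Seattle | Mariners | AL | 308 | 90 |
| 1984 | Detroit | Tigers | AL | 311 | 91 |
| 1988 | Baltimore | Orioles | AL | 312 | 92 |
| 1992 | Toronto | Blue Jays | AL | 332 | 97 |
| 1996 | Detroit | Tigers | AL | 299 | 88 |
| 2000 | New York | Yankees | AL | 307 | 90 |
| 2004 | Arizona | Diamondbacks | NL | 297 | 87 |
| 2008 | Philadelphia | Phillies | NL | 344 | 101 |
| 2012 | Houston | Astros | NL | 315 | 92 |
| 2016 | Chicago | Cubs | NL | 332 | 97 |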

            Formula 22:  Pos-BL  (Positional Base Line for fielding percentage)

            Credit for these Runs Prevented is transferred from the team to the individual player by the way of "Claim Points".   The more claim points each player has—that is, the more errors he doesn’t make—the more Runs he will be credited with saving. 

            The positional baseline for each position is one minus three times the error percentage for that position in that league in that season. 

            Pos-BL = 1 – ((1- LgFPct) * 3)
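A minimal Python sketch of this baseline, with `positional_baseline` as a hypothetical name. It reproduces the two catcher examples given earlier (.950 becomes .850, .970 becomes .910).

```python
def positional_baseline(league_fielding_pct: float) -> float:
    """Pos-BL = 1 - ((1 - LgFPct) * 3).

    The baseline is the fielding percentage of a player who makes errors
    at three times the league rate for the position in that season.
    """
    league_error_pct = 1.0 - league_fielding_pct
    return 1.0 - league_error_pct * 3.0


baseline_950 = positional_baseline(0.950)  # about .850
baseline_970 = positional_baseline(0.970)  # about .910
```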

 

            These all being pitchers, the baseline is established by the league fielding percentage for pitchers.   For the pitchers on the 15 teams we are following, these are the baselines:

| Year | City | Team | Lg | FPct | Baseline |
|------|------|------|----|------|----------|
| 1960 | Pittsburgh | Pirates | NL | .962 | .886 |
| 1964 | New York | Mets | NL | .953 | .859 |
| 1968 | Detroit | Tigers | AL | .957 | .871 |
| 1972 | Texas | Rangers | AL | .950 | .850 |
| 1976 | Cincinnati | Reds | NL | .950 | .850 |
| 1980 | Seattle | Mariners | AL | .957 | .871 |
| 1984 | Detroit | Tigers | AL | .958 | .874 |
| 1988 | Baltimore | Orioles | AL | .961 | .883 |
| 1992 | Toronto | Blue Jays | AL | .953 | .859 |
| 1996 | Detroit | Tigers | AL | .960 | .880 |
| 2000 | New York | Yankees | AL | .956 | .868 |
| 2004 | Arizona | Diamondbacks | NL | .965 | .895 |
| 2008 | Philadelphia | Phillies | NL | .959 | .877 |
| 2012 | Houston | Astros | NL | .950 | .850 |
| 2016 | Chicago | Cubs | NL | .948 | .844 |

            Formula 23:   Claim-Ind-Err-Av   (Individual Player Claim Points for Errors Avoided)

            Each player’s Claim Points for Errors Avoided are established by subtracting his errors from the number of errors that would be made by a player with the same number of fielding chances, fielding at the level of the positional base line:

Claim-Err-Av  = ((PO + Ast + Err) * (1 - Pos-BL)) - Errors
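Here is a minimal Python sketch of Formula 23, with hypothetical names. For simplicity it takes total fielding chances (PO + Ast + Err) as a single input. The first example combines two figures from this article, Lucas Harrell's 51 errorless chances in 2012 and the .850 pitcher baseline for that league and season; pairing them this way is my own illustration, and the second pitcher is invented.

```python
def individual_claim_points(chances: int, errors: int, baseline: float) -> float:
    """Claim-Err-Av = ((PO + Ast + Err) * (1 - Pos-BL)) - Errors.

    `chances` is PO + Ast + Err; `baseline` is the positional base line
    (Pos-BL) for the player's position, league and season.
    """
    baseline_errors = chances * (1.0 - baseline)
    return baseline_errors - errors


# Lucas Harrell, Houston 2012: 51 chances, no errors, pitcher baseline .850.
harrell_claim = individual_claim_points(51, 0, 0.850)  # about 7.65

# Hypothetical pitcher (invented numbers): 40 chances, 4 errors, same baseline.
hypothetical_claim = individual_claim_points(40, 4, 0.850)  # about 2.0
```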

           

            Formula 24:  Claim-Team-Err-Av  (Team Claim Points for Errors Avoided)

            The Claim Points for each team are the sum of the Claim Points for the individual fielders on the team.  Not sure how to write that as a formula:

 

            Claim-Team-Err-Av  =  Sum all (Claim-Err-Av)

 

            Formula 25:   RS-Err-Av-P8  (Runs Saved by Error Avoidance, 8th pitcher’s value) 

            The Team Runs Saved by Error Avoidance (RS-Tm-Err-Av, Formula 21) are distributed to the individual players in the same proportions as their Individual Claim Points (Claim-Err-Av, Formula 23):

            RS-Err-Av-P8 = (Claim-Err-Av) / (Claim-Team-Err-Av) * RS-Tm-Err-Av
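The last two formulas can be sketched together in Python as below; the function names are hypothetical. The McLain example uses the rounded figures published for him in this article (9 claim points, a team total of 295, 97 team runs saved), so it only approximates whatever unrounded inputs lie behind the published value. The three-man "staff" is invented, to show that the shares add back up to the team total.

```python
def team_claim_points(individual_claims: list[float]) -> float:
    """Formula 24: Claim-Team-Err-Av is the sum of the individual claim points."""
    return sum(individual_claims)


def runs_saved_share(player_claim: float, team_claim: float,
                     team_runs_saved: float) -> float:
    """Formula 25: RS-Err-Av-P8 = Claim-Err-Av / Claim-Team-Err-Av * RS-Tm-Err-Av."""
    if team_claim <= 0:
        raise ValueError("team claim points must be positive")
    return player_claim / team_claim * team_runs_saved


# Denny McLain, 1968, from the rounded published figures: about 2.96 runs.
mclain_p8 = runs_saved_share(9, 295, 97)

# Hypothetical three-fielder team (invented claim points) sharing 10 runs.
claims = [6.0, 3.0, 1.0]
total_claim = team_claim_points(claims)
shares = [runs_saved_share(c, total_claim, 10.0) for c in claims]
```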

 

            The process for Runs Saved by Error Avoidance will be the same at the other 8 defensive positions as it is for pitchers.  I just had to run through the entire process at pitcher to establish how it works for pitchers.  It works the same at the other positions. 

            I need to explain why the system works this way.  Why can’t we just figure the number of errors that each fielder avoided, compared to a fielder making three times the league norm for errors, and credit the fielder with .293 Runs Saved for each error that he did not commit?

            In a relative runs saved analysis, in which each fielder was compared to the other fielders in his league and season, that is the way you would do that.  But this is not a relative measurement; this is an absolute measurement.   Think about two fielders, one from 1900, who fields .940  in a league in which the fielding percentage at the position is .915, and one from 2010, who fields .980 in a league in which the fielding percentage at the position is .975.  

            In a relative analysis, the fielder who fields .940 in a .915 league has had a better season.  But in absolute terms, it is .980 against .940.  In absolute terms, the higher number is better. The player making fewer errors has avoided more errors—not a novel concept when you think about it. 

            IF you can measure the runs saved by each fielder accurately, THEN you can very easily make a relative comparison based on that.   But if you make the relative comparison first, then it is basically impossible to transition from that to an absolute number.  

            Because fielding percentages vary widely both over time and from position to position, any fielding percentage between .910 and .990 can be a completely normal fielding percentage, based on the era and the position.  I created this two-tiered system to interpret error rates both globally—that is, on a constant scale from 1900 to 2019—and locally—that is, in a manner specific to the position and the year.   I don’t know how else you could do it. 

            (Later note: This system has some issues.  To be honest, this part of the process is not working sensationally well.  What we are doing, really, is measuring two things which are a little bit different but mostly the same, and then treating them as if they were actually the same.   The result of this is that the value of an error avoided . . . the cost of an error . . . can vary with the team in ways that may not be rational.  But I don’t know how to do this and avoid that problem, with the great multiplicity of standards in fielding percentage.)

            Anyway, this is the updated list of the 15 pitchers on these 15 teams who are saving the most runs.  Bob Friend has moved ahead of Randy Johnson into second place on the list, and Dan Petry has now moved up to 5th on the list, ahead of Whole Camels.   I’m glad that has happened, actually, because it shows that these "little" categories do matter; the number of runs in each one is small compared to strikeouts and walks and homers, but there is a cumulative effect.  Dan Petry illustrates the point, as he was not even in the top 10 after strikeouts and walks, but has now moved up to 5th place by posting consistently strong scores in the small-ball categories.  Andy Pettitte drops a slot because he made four errors in 2000:

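| Year | Player | P1 | P2 | P3 | P4 | P5 | P6 | P7 | Err Av | Team Total | Team RS | P8 | Total |
|------|--------|----|----|----|----|----|----|----|--------|------------|---------|----|--------|
| 1968 | Denny McLain | 45 | 27 | 21 | 7 | 1 | 1 | 6 | 9 | 295 | 97 | 3 | 110.95 |
| 1960 | Bob Friend | 29 | 27 | 23 | 7 | 0 | 0 | 4 | 7 | 308 | 89 | 2 | 92.22 |
| 2004 | Randy Johnson | 47 | 20 | 18 | 4 | 2 | 0 | 0 | 3 | 178 | 87 | 1 | 92.21 |
| 1960 | Vern Law | 19 | 26 | 18 | 7 | 1 | 1 | 7 | 7 | 308 | 89 | 2 | 81.78 |
| 1984 | Dan Petry | 23 | 16 | 17 | 5 | 4 | 1 | 2 | 8 | 264 | 91 | 3 | 71.38 |
| 2008 | Cole Hamels | 32 | 18 | 12 | 5 | 1 | 1 | 2 | 3 | 216 | 101 | 1 | 71.20 |
| 1968 | Earl Wilson | 27 | 15 | 16 | 4 | 2 | 1 | 4 | 4 | 295 | 97 | 1 | 69.22 |
| 1968 | Mickey Lolich | 32 | 13 | 14 | 5 | 3 | 0 | 1 | 1 | 295 | 97 | 0 | 67.28 |
| 2016 | Jon Lester | 32 | 13 | 12 | 3 | 2 | 0 | 3 | 5 | 245 | 97 | 2 | 67.06 |
| 1992 | Jack Morris | 21 | 13 | 20 | 4 | 2 | 0 | 4 | 6 | 244 | 97 | 2 | 66.43 |
| 1984 | Jack Morris | 24 | 13 | 19 | 4 | 1 | 1 | 2 | 5 | 264 | 91 | 2 | 65.65 |
| 1976 | Gary Nolan | 18 | 25 | 13 | 4 | 1 | 0 | 2 | 5 | 322 | 99 | 1 | 65.49 |
| 1964 | Jack Fisher | 19 | 18 | 16 | 5 | 1 | 1 | 4 | 5 | 330 | 84 | 1 | 65.04 |
| 2016 | Kyle Hendricks | 27 | 13 | 14 | 3 | 2 | 1 | 2 | 7 | 245 | 97 | 3 | 64.79 |
| 2000 | Andy Pettitte | 20 | 11 | 17 | 5 | 2 | 1 | 7 | 3 | 220 | 90 | 1 | 63.62 |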

 

            The new categories are Errors Avoided by the pitcher, Errors Avoided by the team, Runs Saved by Error Avoidance, and the pitcher’s runs saved, which are calculated based on the other three.  Lucas Harrell of Houston (2012) was the leader in this category, as he did not commit an error in 51 fielding chances, the most of any pitcher in the study who fielded 1.000. 

 

 
 

COMMENTS (6 Comments, most recent shown first)

CharlesSaeger
My reasoning for PO-SO+E as a team denominator is that it’s tough to make an error on a strikeout and counting assists in the team denominator makes groundouts count twice, though since most reached on error happen on a groundball that assists might be a wash. My reasoning for BFP-HR-SO as the denominator is that about 40% of errors happen on existing runners, which are typically for outfielders (batter advances an extra base on a hit because the outfielder overran the hit) or catchers (runner gets an extra base on a steal because the catcher overthrew). These are tough to handle on an individual level, but on a team level, you can adjust for this.
1:56 PM Jul 18th
 
CharlesSaeger
bjames:

No. I am commenting on this:

Tm-Err-Av = (PO + Assists + Errors) * .0708 – Team Errors

So, why PO+A+E as the base, other than tradition?
1:37 PM Jul 18th
 
bjames
Charles--

No disrespect, but your comment doesn't actually make any sense, and isn't relevant. You're commenting on parts of the system that you either don't understand, or haven't thought through. Sorry.
11:47 AM Jul 18th
 
CharlesSaeger
Why have the team error rate be PO+A+E? I realize that's the official base for fielding percentage, but for allocating credit/blame for a team, you could use PO-SO+E, or BFP-HR-SO, to reflect errors happening more often with runners on base (especially for outfielders and catchers).
10:23 AM Jul 18th
 
Rallymonkey5
I thought I was lucky enough to see a 4 base error in the very first game I ever attended, 39 years ago. Fernando Valenzuela fielded a bunt by Mario Soto and threw the ball into the RF corner. Soto and the 2 runners on all came around to score.

But checking the game logs, Soto was credited with a single, so just a 3 base error.
11:27 PM Jul 17th
 
TheRicemanCometh
Can I just say hat tip to Dan Petry, the forgotten Ace of the 1984 World Champion Tigers!


4:23 PM Jul 17th
 
 
©2020 Be Jolly, Inc. All Rights Reserved.