How Well Do Advanced Defensive Statistics Correlate?

February 27, 2012

We've put a lot of effort into improving defensive metrics in recent years, but how much progress have we really made? In the introduction to The Fielding Bible—Volume III, I said:

"For hitters, we might be at the 85-90 percent mark of being able to measure offense. We have a lot of good tools like OPS (on-base plus slugging), Runs Created, Wins Above Replacement. For pitchers, we are not quite as far along. Maybe we’re at the 75 percent level of understanding pitcher effectiveness with our numerical tools like ERA, Batting Average on Balls in Play, and Opponent OPS. For defense, ten years ago we were probably around the 10th percentile. Now with three volumes of The Fielding Bible under our belts, plus the work of many other excellent sabermetricians, we are probably in the 60-70 percent range."

In our book, The Fielding Bible—Volume III, we put our newest defensive analytics to the test. If our statistics are measuring something meaningful, we would expect them to correlate well from year to year. In other words, since Evan Longoria topped all third basemen with 20 Defensive Runs Saved in 2010, we would expect him to remain one of the league's top defenders at the position in subsequent seasons. (Longoria saved an estimated 22 runs in the field in 2011, also a league-leading total.)

To measure the consistency of our Defensive Runs Saved numbers, we calculated what we'll call Even/Odd Year Correlations. We added each fielder's Runs Saved totals from 2006, 2008, and 2010 and compared that subtotal to the one from 2007, 2009, and 2011, with the requirement that the fielder have amassed at least 667 innings in each subset. We would expect the players with high totals in even years to also have high totals in odd years, while players with low totals in even years should also tend to have low totals in odd years.
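As a rough illustration of that bookkeeping, here is a minimal Python sketch. The function name, the tuple layout, and the sample fielders are all hypothetical, and it assumes the 667-inning minimum applies to the combined innings in each subset; it is not the actual code behind The Fielding Bible.

```python
from collections import defaultdict

MIN_INNINGS = 667  # minimum stated in the article; assumed to apply to each subset's total


def even_odd_totals(seasons, min_innings=MIN_INNINGS):
    """Return {player: (even_year_runs_saved, odd_year_runs_saved)}.

    `seasons` is an iterable of (player, year, innings, runs_saved) tuples
    (a hypothetical layout chosen for this sketch).
    """
    # totals[player][parity] = [innings, runs_saved]; parity 0 = even year, 1 = odd year
    totals = defaultdict(lambda: {0: [0.0, 0.0], 1: [0.0, 0.0]})
    for player, year, innings, runs_saved in seasons:
        bucket = totals[player][year % 2]
        bucket[0] += innings
        bucket[1] += runs_saved
    # Keep only fielders who reach the innings minimum in BOTH subsets.
    return {
        player: (halves[0][1], halves[1][1])
        for player, halves in totals.items()
        if halves[0][0] >= min_innings and halves[1][0] >= min_innings
    }


# Made-up sample data, not real player totals.
sample = [
    ("Fielder A", 2006, 700, 5),
    ("Fielder A", 2007, 650, 4),
    ("Fielder A", 2009, 300, 2),
    ("Fielder B", 2008, 400, -3),  # only 400 even-year innings, so B is dropped
    ("Fielder B", 2009, 900, -6),
]
pairs = even_odd_totals(sample)
print(pairs)
```

Each surviving fielder ends up with one even-year number and one odd-year number, which are the two columns the correlation is computed on.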

By calculating the correlation coefficient of the even and odd year totals, we can measure just how consistent our statistics are. Correlation coefficients range from -1.0 to 1.0 and show the strength and direction of the relationship between two sets of numbers. A correlation coefficient of 1.0 represents a perfectly predictable relationship. For instance, if every fielder had the same number of Runs Saved in both even and odd seasons, that would produce a correlation of 1.0. On the other hand, a correlation coefficient of zero means that there is no measurable linear relationship, while a correlation coefficient of -1.0 signifies a perfectly inverse relationship between the sets of numbers.
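For readers who want to see the arithmetic, the sketch below computes a correlation coefficient from scratch. It assumes the standard Pearson coefficient, which the article does not explicitly name, and the function name and sample numbers are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Made-up even-year totals for four fielders.
even = [10, -5, 3, 0]

same = pearson_r(even, even)                      # identical odd-year totals: about 1.0
flipped = pearson_r(even, [-x for x in even])     # exactly reversed totals: about -1.0
unrelated = pearson_r([1, 2, 3, 4], [1, -1, -1, 1])  # no linear relationship: about 0.0
print(same, flipped, unrelated)
```

Note that identical even and odd totals give 1.0 only when the fielders differ from one another; if every fielder had the same total, the coefficient would be undefined.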

Defensive Runs Saved produced an Even/Odd Year Correlation of .59. This high, positive correlation value indicates a strong relationship between even and odd season totals and good consistency in measuring fielders' value. But how does this compare to traditional hitting and pitching statistics?

Even/Odd Year Correlation Coefficients for Commonly Cited Statistics
Statistic Correlation
Batting Average .56
ERA .51
Defensive Runs Saved .59

As you can see, both batting average and ERA also produce high positive Even/Odd Year correlations, though Defensive Runs Saved correlates better than both. (We used a minimum of 150 innings or 500 at bats in both subtotals for pitching and hitting statistics, respectively, although the correlations didn't change much when we adjusted the minimum cutoffs in either direction.)

Batting average and ERA were the staples of baseball analysis for the game's first 100 years. By this test, our Defensive Runs Saved system is a more consistent measure of defense than batting average is of offense or ERA is of pitching.

Of course, we now have more advanced measures of hitting and pitching performance. Let's see how well a few other statistics correlate between even and odd seasons.

Even/Odd Year Correlation Coefficients for Additional Statistics
Statistic Correlation
Home Runs .83
OPS .69
Pitcher Strikeouts per 9 Innings .88
Pitcher Walks per 9 Innings .79
Opponent OPS .61


Home runs correlate at .83, indicating a very strong correlation between even and odd seasons. OPS correlates at .69, and Opponent OPS, which for me is the most important pitching statistic, correlates at .61.

We are at the point where our defensive analytics are nearly as reliable as offensive and pitching analytics. Looking at just the single best statistic in each area: OPS correlates at .69, Opponent OPS at .61, and Defensive Runs Saved at .59. We’ve come a long way.


COMMENTS (3 Comments, most recent shown first)

One issue is whether there is a systematic bias in the visually collected data: for example, if balls that are fielded are marked as closer to one spot, while balls that are not fielded are marked as farther from that very same spot.

There is evidence to suggest that such a "clumping" effect does occur.

In my view, there's no reason to go with 100/0 with Dewan/MGL on the one side and TotalZone and the other systems on the other. Just do a 2/1 weighting, and don't get too hung up otherwise.

Fangraphs has WAR as well, and they use MGL's UZR, which is analogous to Dewan. If you really are bothered by it, then use Fangraphs.

9:08 AM Feb 29th
I see that baseball-reference is listing Baseball Info Solutions' fielding runs saved in their fielding tables, alongside their own version of runs saved above average. In comparing them, there seem to be many large differences in runs saved, both for players and for teams as a whole.

For instance, I count 63 AL players whose runs saved differ by at least 5 runs between the two systems. 13 players differ by over 10 runs, and this can change a player from being a so-so defender into a great one or a poor one, or vice versa. Examples: Austin Jackson (22 by your method, 7 by theirs), Granderson (-15 your method, -1 theirs), Zobrist (23 your method, -4 theirs).

It's causing me to doubt the reliability of the bb-ref defensive ratings, and thus their defensive wins above replacement, which is a component of the total wins above replacement we often use on a daily basis. Given a choice between a system based on visual analysis of each play, and one based on extrapolation from statistics, I would choose yours in a heartbeat.

Any thoughts on this, and how useful the bb-ref defensive ratings for players may or may not be, in years prior to when your data started being collected?
2:51 PM Feb 28th
Best analysis of baseball I have ever seen. Can it be summed and expanded to teams? Would still leave enough variance to make us watch the games. Nice work.
7:51 AM Feb 28th