
Having Fun With The Components Of WAR

June 30, 2022
Acknowledgements
 
Before I get into the article, I just wanted to give a quick shout out to Tom Tango and to Bill James Online member jgf704 for their help in steering me to the WAR master data file on baseball-reference.com. The raw data I was able to access is at the heart of this article, and I would not have been able to efficiently pull and organize the data the way I wanted without having that file available.
 
Introduction
 
What if you were trying to identify players who were good "across the board"? There are lots of different ways to approach that, starting with defining what "across the board" means. Does it mean a player exhibits excellence across a set of "tools" (speed, power, defense, etc.)? Does it mean doing well across different statistical categories? What exactly are you trying to capture?
 
I know Bill has alluded to this type of thing before. For example, in the New Bill James Historical Abstract (from 2001), Bill referred to Barry Larkin (his #6 ranked shortstop) as one of the most well-rounded stars in baseball history because he could hit .300, had good power, good speed, excellent defense, and was a good percentage player, saying he was right up there with DiMaggio and Mays in that regard.
 
In his comment on George Grantham (#62 at second base) in the same book, he referenced a couple of studies he had done 20 years apart where he looked for players who were above average in several offensive categories and also played a key defensive position. In the first study, he came up with 2 players: Willie Mays and George Grantham.   In the second one, it was again just 2 players, this time being Jackie Robinson and George Grantham.
 
Those types of things got me thinking about a similar type of search. What if we looked for players who were "positive" across all of the components of WAR? First, let’s do a quick primer on WAR and its 6 "components".
 
The Components of WAR – Basic Concepts
 
I suspect most of you are familiar with the baseball-reference.com version of WAR and have a decent understanding of how it gets used in a variety of contexts. I’m also sure that many of you are familiar with what are known as the "components of WAR", but for the benefit of readers who may not be used to looking at that level, I wanted to take a few minutes to review the components, what they represent, and some of the results they generate.
 
The following is straight from Baseball-reference.com:
 
======================
WAR  for position players has six components:
·         Batting Runs
·         Baserunning Runs
·         Runs added or lost due to Grounding into Double Plays in DP situations
·         Fielding Runs
·         Positional Adjustment Runs
·         Replacement level Runs (based on playing time)
 
The first five measurements are all compared against league average, so a value of zero will equate to a league average player. Less than zero means worse than average, and greater than zero means better than average.
 
These five correspond to the first half of our equation above (Player_runs - AvgPlayer_runs).
 
The sixth factor is the second half of the equation (AvgPlayer_runs - ReplPlayer_runs).
======================
 
Back to me now….
 
I’m not going to go into depth on the calculations behind each of those components. You can read up on that if you like. I merely wanted to establish that I’m not just using WAR in the aggregate, but am more interested in each of the building blocks.
 
Here’s a quick summary of the 6 components, including the abbreviations I’ll be using to reference them from here on (these are the same abbreviations you would find in a player’s "Player Value-Batting" section on his baseball-reference.com player page):
 
Component Name            Abbreviation  Measures                                    Compares to
Batting Runs              Rbat          Runs attributable to hitting                Average
Baserunning Runs          Rbaser        Runs attributable to baserunning events     Average
Double Play Runs          Rdp           Runs attributable to avoiding double plays  Average
Fielding Runs             Rfield        Runs attributable to fielding               Average
Position Adjustment Runs  Rpos          Bonus/penalty based on positions played     Average
Replacement Level Runs    Rrep          Runs above replacement level                Replacement Level
 
As the definition stated, the first 5 estimate how many runs above average a player was within each measure, and the last component estimates how far above replacement level he was. Adding it all up gives you total runs above replacement (which is referred to as RAR). Then, the final step to translate RAR into the more recognizable WAR figure is to apply a runs-to-wins conversion factor. To quote Baseball-reference.com: "If you had to pick one number over the history of baseball to convert runs into wins, it would be 10."
 
The exact conversion factor for the ratio of runs-to-wins varies (for example, the league run context has an impact), but generally if you simply take RAR divided by 10, it’ll get you pretty close to a player’s actual final WAR figure.   As I look over the yearly data for players, that factor generally stays between 9 and 11 for the vast majority of them. If you take the top 500 position players in terms of WAR, the range goes from a low of 9.0 to a high of 11.6, and the average is 10.1. The main point is that the runs-to-wins conversion is applied to turn RAR into WAR, and over time that amounts to roughly 1 win per 10 runs.
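As a rough illustration of the arithmetic just described, here is a minimal Python sketch. The function names are my own, made up for illustration (they are not anything baseball-reference.com publishes), and the flat 10 runs per win is the rule-of-thumb figure quoted above, not the season-specific factor the site actually applies. The sample inputs are the Eddie Mathews career figures that appear in the results tables later in this article.

```python
# Illustrative sketch only: sums the six components into RAR and applies the
# rule-of-thumb 10 runs per win (an approximation, not the site's exact factor).
COMPONENTS = ("Rbat", "Rbaser", "Rdp", "Rfield", "Rpos", "Rrep")


def runs_above_replacement(player):
    """Add up the six component run values for one player."""
    return sum(player[name] for name in COMPONENTS)


def approx_war(player, runs_per_win=10.0):
    """Shortcut WAR estimate: RAR divided by an assumed runs-per-win factor."""
    return runs_above_replacement(player) / runs_per_win


def implied_runs_per_win(total_rar, career_war):
    """Back out the effective conversion factor from published RAR and WAR."""
    return total_rar / career_war


# Eddie Mathews, career figures as listed in the results tables of this article.
mathews = {"Rbat": 501.6, "Rbaser": 0.7, "Rdp": 17.7,
           "Rfield": 33.2, "Rpos": 22.3, "Rrep": 371.4}

print(round(runs_above_replacement(mathews), 1))
print(round(approx_war(mathews), 1))
print(round(implied_runs_per_win(946.7, 96.1), 2))
```

The component sum lands within rounding of the published 946.7 runs, the divide-by-10 shortcut comes out a little under the listed 96.1 WAR, and the implied factor for Mathews falls just under 10, inside the 9.0 to 11.6 range mentioned above.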
 
Component Characteristics
 
Next, I wanted to investigate some basic statistical characteristics of these 6 components before applying them to the final study.
 
First, it’s worth noting that the Double Play Runs component (Rdp, which is a hitting component reflecting the ability to avoid hitting into double plays, not a fielding component reflecting the ability to turn them) is zero for every player through the 1930 season. I didn’t see that directly addressed in the explanation of the WAR components, but I suspect it’s simply due to a lack of the necessary play-by-play data.
 
So, since I ultimately wanted to investigate players who were good across all component categories, I decided to limit the scope to 1931 or later. In addition, it looks like a similar effect holds true for the Negro Leagues (Rdp is zero for all Negro League players), so I will also eliminate Negro League stats from this study. At the end, I’ll do a separate quick summary for players in those last 2 groups.
 
So, the scope of my data will be based on:
·         National League and American League only
·         Using seasons from 1931 to 2021 only
·         A minimum of 1,000 career games played during that time frame
 
Note that if a player had career stats both prior to and after 1931, I only included the years for 1931 and beyond, so someone like Babe Ruth, who only had 568 games played in 1931 and beyond, is not part of the final data set result.
 
This yielded a data set of 1,259 players after I removed pitchers Hoyt Wilhelm, Trevor Hoffman, Kent Tekulve, and John Franco, all of whom were in the original data set and played more than 1,000 games, since pitchers aren’t relevant to a study of position-player components.
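For readers who want to try a similar pull, the scoping rules above could be sketched roughly as follows. This is only an illustration: the row layout and the field names ("player", "year", "league", "G", and the six component keys) are assumptions I made up for the example, not the actual column names in the baseball-reference.com WAR file.

```python
from collections import defaultdict

COMPONENTS = ("Rbat", "Rbaser", "Rdp", "Rfield", "Rpos", "Rrep")

# The four pitchers removed by hand in the article.
EXCLUDED = {"Hoyt Wilhelm", "Trevor Hoffman", "Kent Tekulve", "John Franco"}


def career_totals(season_rows, first_year=1931, last_year=2021,
                  leagues=("NL", "AL"), min_games=1000, excluded=EXCLUDED):
    """Sum in-scope seasons per player, then keep players with enough games.

    season_rows is an iterable of dicts, one per player-season, using the
    hypothetical keys described above.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for row in season_rows:
        if row["league"] not in leagues:
            continue  # drops Negro League (and any other league) seasons
        if not (first_year <= row["year"] <= last_year):
            continue  # drops seasons before 1931 or after 2021
        if row["player"] in excluded:
            continue
        career = totals[row["player"]]
        career["G"] += row["G"]
        for name in COMPONENTS:
            career[name] += row[name]
    # The games threshold is applied to the in-scope seasons only.
    return {player: dict(career) for player, career in totals.items()
            if career["G"] >= min_games}
```

Because the games threshold is checked after the out-of-scope seasons are dropped, a player like Babe Ruth (568 games from 1931 on) falls out, which matches the rule described above.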
 
Here are some summary stats for the players in the data. The "range" is the difference between the high (max) and the low (min):
 
Measure  G          PA         Rbat      Rbaser   Rdp      Rfield    Rpos       Rrep       Total Runs Above Replacement  Career WAR
Totals   1,955,968  7,549,552  76,334.1  3,727.3  (621.6)  8,985.0   (5,358.7)  255,113.4  338,233.2                     33,180.9
Min      1,000      2,101      (304.9)   (39.0)   (46.6)   (253.3)   (203.8)    66.1       (48.7)                        (6.9)
Max      3,562      15,890     1,128.4   143.8    56.2     293.6     161.7      506.6      1,646.3                       162.8
Range    2,562      13,789     1,433.4   182.8    102.8    546.9     365.6      440.5      1,695.0                       169.7
Average  1,554      5,996      60.6      3.0      (0.5)    7.1       (4.3)      202.6      268.7                         26.4
 
A few observations about the basic data resulting from these 1,259 players:
 
·         An important characteristic of WAR data is that negative values are present. Most traditional baseball stats are either counting stats (beginning at zero and then increasing) or a calculation/ratio/percentage of something (like batting average, on-base percentage, winning percentage, etc.). But because WAR components are values relative to something else (relative to an average, or relative to a replacement level), they can be positive or negative.

·         It’s an oversimplification, and it’s a little tricky given the negative figures, but about 75% of the total runs above replacement (which is the sum of all 6 components) is represented by Rrep, the runs above replacement, and the other 25% is represented by the other 5 (the runs above average). Of that "above average" portion, Batting Runs (Rbat) is the biggest driver (it’s more than 90% of the above average figure).

·         The largest range, by far, is in Batting Runs (Rbat). Rbat has the highest individual value (1,128.4 runs, Barry Bonds) and the lowest individual value (negative 304.9 runs, Larry Bowa), a range of 1,433.4 runs, or roughly 143 "wins".

·         The smallest range is in Rdp, which ranges from a high of 56.2 runs (Ichiro Suzuki) to a low of negative 46.6 runs (Ernie Lombardi).
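The 75%/25% split and the "more than 90%" figure in the observations above can be checked directly from the Totals row of the summary table. A quick sketch (the variable names are mine, and the only inputs are the totals already shown):

```python
# Totals row from the summary table (negative values shown there in parentheses).
totals = {"Rbat": 76334.1, "Rbaser": 3727.3, "Rdp": -621.6,
          "Rfield": 8985.0, "Rpos": -5358.7, "Rrep": 255113.4}
total_rar = 338233.2  # "Total Runs Above Replacement" from the same row

# The five runs-above-average components, i.e. everything except Rrep.
above_average = sum(value for name, value in totals.items() if name != "Rrep")

rrep_share = totals["Rrep"] / total_rar      # share of RAR coming from Rrep
rbat_share = totals["Rbat"] / above_average  # Rbat's share of the above-average runs

print(f"Rrep share of total RAR: {rrep_share:.1%}")
print(f"Rbat share of above-average runs: {rbat_share:.1%}")
```

Because some of the component totals are negative, these shares are only a rough way of describing the mix, which is why the split is best read as approximate.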

Here are some of the individual highs and lows within my data set for each of the components:
 
Component  Best              Best Figure  Worst           Worst Figure
Rbat       Barry Bonds       1,128.4      Larry Bowa      (304.9)
Rbaser     Rickey Henderson  143.8        David Ortiz     (39.0)
Rdp        Ichiro Suzuki     56.2         Ernie Lombardi  (46.6)
Rfield     Brooks Robinson   293.6        Derek Jeter     (253.3)
Rpos       Ozzie Smith       161.7        David Ortiz     (203.8)
Rrep       Pete Rose         506.6        Dave Hansen     66.1
 
Notable, of course, is that David Ortiz has the lowest figures in 2 different categories: baserunning runs and the position adjustment. That’s probably not a surprise to any of you, considering how slowly he ran and how much of his career was spent at designated hitter. Ortiz also has negative figures in 2 other components (Rdp and Rfield), but he was such a valuable hitter that he was able to overcome all those negatives.
 
Here’s another summary table, showing how many players had positive (greater than zero) figures in each component. As you can see, every single player had a positive runs figure in the 6th component (Replacement Runs), which really just reinforces the notion that any player with 1,000 or more Major League games is undoubtedly someone who is above replacement level. Most of the other 5 components, except for Batting Runs, come out close to 50/50 splits, give or take a few percentage points:
 
Component  Total Players  Positive Runs  Negative or Zero Runs  % with Positive
Rbat       1,259          769            490                    61%
Rbaser     1,259          610            649                    48%
Rdp        1,259          595            664                    47%
Rfield     1,259          677            582                    54%
Rpos       1,259          609            650                    48%
Rrep       1,259          1,259          0                      100%
 
 
I think the following is an important point as well. The table below is from baseball-reference.com and displays the position adjustments that have been used in coming up with a player’s Rpos (the position adjustment) figures over time. I would say it is somewhat similar to (though not in perfect harmony with) the concept of the "Defensive Spectrum" that Bill introduced decades ago, but that it tries to capture the relative value of playing different positions in a more quantitative manner.
 
The original table on baseball-reference.com shows each season, but for brevity I’m only showing one season per decade to show how the values have changed over time, starting with 1871 and then moving forward 10 years at a time, culminating with the last year shown in the table (2017).
 
In general, the sum of the position run adjustments tends to net out at zero or close to zero each year, with relative adjustments changing over time as individual positions become more or less "important" defensively. The list is color-coded with a green/yellow/red model, with greens being high, reds being low, and yellows falling in the middle. The more intense the color, the stronger the effect is in that direction. 
 
Shortstops and catchers have been consistently high over history, starting as high as +10 runs in the early days. Catchers got down as low as +5 in the 30’s, 40’s, and 50’s, but have moved back up since. Shortstops started at +10, and are still relatively high at +7. Left field and right field have been consistently low, starting as low as -10, and improving slightly to the modern -7. 
 
Positions like first base and center field are the most interesting, and have evolved in different directions. First base was more of a neutral position in the early days and then became increasingly less important over time, to the point where it now represents the biggest negative adjustment aside from DH. On the other hand, center fielders started off as one of the more negative adjustments, but evolved over time to more neutral territory and, now, represent a positive adjustment.
 
Note: "DH" is constant at -15 runs each year, and it is displayed even for years in which there was no DH.
 
Year  runs_c   runs_1b  runs_2b  runs_3b  runs_ss  runs_lf  runs_cf  runs_rf  runs_dh
1871  10       0        3        5        10       -10      -8       -10      -15
1881  10       0        3        5        10       -9.5     -8       -9.5     -15
1891  10       0        3        5        10       -9.5     -8       -9       -15
1901  10       -3.5     1        5        10       -8       -5       -8.5     -15
1911  10       -5       0        5        10       -8       -4       -8       -15
1921  6.5      -6.5     3.5      5        10       -7       -4       -7.5     -15
1931  5        -7       5        3.5      10       -7       -2.5     -7       -15
1941  5        -7       6.5      1        10       -7       -1.5     -7       -15
1951  5        -7       6.5      0        9.5      -7       -1       -7       -15
1961  8.5      -9       4        3        9        -8       -1       -7       -15
1971  9        -9       4        3        9        -8       -1       -7       -15
1981  9        -9.5     4        2        8.5      -7       -0.5     -7       -15
1991  8.5      -9.5     3        1        8.5      -7       1.5      -7       -15
2001  8.5      -9.5     3        2        7.5      -7       2.5      -7       -15
2011  9        -9.5     3        2        7        -7       2.5      -7       -15
2017  9        -9.5     3        2        7        -7       2.5      -7       -15
 
Anyway, I wanted to present that table to help explain why some players (for example, Willie Mays or Tris Speaker) end up with a negative position adjustment despite spending most of their careers at what we would normally consider to be a "key" position. It has to do with the table above, and what position adjustments were in effect for the seasons they were active.
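To make the era effect concrete, here is a small sketch using only the center-field column from the decade snapshots above. It is a simplification built on two assumptions of mine: each season uses the most recent snapshot at or before it (the real table has a value for every season), and the player is credited with the full-season adjustment every year (the real calculation depends on actual playing time at each position). The function and variable names are hypothetical.

```python
from bisect import bisect_right

# Center-field values copied from the decade snapshots in the table above.
CF_SNAPSHOTS = {1951: -1.0, 1961: -1.0, 1971: -1.0, 1981: -0.5,
                1991: 1.5, 2001: 2.5, 2011: 2.5}


def snapshot_adjustment(season, snapshots=CF_SNAPSHOTS):
    """Return the most recent snapshot value at or before the season."""
    years = sorted(snapshots)
    index = bisect_right(years, season) - 1
    if index < 0:
        raise ValueError(f"no snapshot at or before {season}")
    return snapshots[years[index]]


def full_time_total(seasons, share_of_season=1.0):
    """Sum the yearly adjustment, scaled by an assumed share of a full season."""
    return sum(snapshot_adjustment(year) * share_of_season for year in seasons)


# Two hypothetical full-time center fielders in different eras.
print(full_time_total(range(1951, 1974)))  # seasons 1951 through 1973
print(full_time_total(range(1989, 2011)))  # seasons 1989 through 2010
```

The first career comes out negative and the second positive, purely because of which adjustments were in effect. The magnitudes are not meant to match any real player's Rpos, only the direction.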
 
OK, so now you have a sense of each component. Next, I wanted to look at the results and who had positives across the board, and other related observations.
 
The Results
 
So, if you had to guess, how many of the 1,259 players in the data set have positives in all 6 components?  As a clue, keep in mind the following:
 
1)      Every player in the dataset has a positive value for Runs above Replacement.

2)      About half the players get eliminated off the bat because they primarily played position(s) that result in a negative adjustment. In general, over this time frame, the players who end up with positive position adjustments have tended to be catchers, shortstops, second basemen, third basemen, and (sometimes) center fielders.
 
Those of you who are especially sharp might have wondered what would happen if you just simply multiplied each component’s "% of players with positive runs" (the percentages provided in the table above) by the total number of players. In other words:
 
1,259 players x .61 x .48 x .47 x .54 x .48 x 1.00  = 45 players.
 
Well, as it turns out, that would be a really good estimate, as there are actually 43 players (about 3.4%) who were able to pull off positives in all 6 components.
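The back-of-the-envelope estimate above treats the components as if they were independent of one another, which is an assumption rather than something the data guarantees. A short sketch of that calculation, using both the rounded percentages and the exact counts from the table of positives (the names here are my own):

```python
from math import prod

N_PLAYERS = 1259

# Players with a positive figure in each component, from the table above.
POSITIVE_COUNTS = {"Rbat": 769, "Rbaser": 610, "Rdp": 595,
                   "Rfield": 677, "Rpos": 609, "Rrep": 1259}


def expected_all_positive(counts, n_players):
    """Expected number of all-positive players if components were independent."""
    return n_players * prod(count / n_players for count in counts.values())


# Same estimate using the rounded percentages quoted in the text.
rounded_estimate = N_PLAYERS * 0.61 * 0.48 * 0.47 * 0.54 * 0.48 * 1.00
exact_estimate = expected_all_positive(POSITIVE_COUNTS, N_PLAYERS)

print(round(rounded_estimate, 1))
print(round(exact_estimate, 1))
```

Both versions land in the mid-40s, close to the actual count of 43, which suggests that treating the components as roughly independent is not a bad first approximation for this data set.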
 
Let’s work our way down the WAR leaderboard and take a look at who doesn’t make the grade. Here are the top 13 by career WAR (and remember, this is only data for 1931 and later, so there are a few big names that won’t be present):
 
Name               G      Rbat     Rbaser  Rdp     Rfield   Rpos     Rrep   Total Runs Above Replacement  Career WAR
Barry Bonds        2,986  1,128.4  43.6    5.6     175.0    (101.2)  394.8  1646.3                        162.8
Willie Mays        2,992  805.9    78.0    (8.4)   184.7    (18.9)   453.4  1494.6                        156.1
Henry Aaron        3,298  877.2    44.1    (11.6)  97.6     (140.9)  496.1  1362.9                        143.0
Stan Musial        3,026  868.7    9.9     17.4    49.8     (130.4)  458.3  1273.7                        128.6
Ted Williams       2,292  1,050.5  5.3     7.5     (32.3)   (95.4)   294.1  1229.8                        122.0
Alex Rodriguez     2,784  639.5    56.3    (5.1)   22.9     71.6     440.4  1225.5                        117.6
Rickey Henderson   3,081  555.4    143.8   3.4     64.4     (97.3)   449.0  1118.9                        111.1
Mickey Mantle      2,401  801.6    50.5    12.1    (36.9)   (34.4)   304.9  1097.8                        110.2
Frank Robinson     2,808  728.7    34.5    (25.4)  21.9     (146.3)  418.0  1031.5                        107.2
Mike Schmidt       2,404  526.8    (1.0)   (6.6)   127.2    42.0     332.5  1021.0                        106.8
Joe Morgan         2,649  449.8    79.5    25.0    (48.1)   78.2     373.6  958.2                         100.4
Albert Pujols      2,971  664.8    7.2     (43.4)  137.8    (170.8)  414.3  1010.1                        99.6
Carl Yastrzemski   3,308  449.5    (2.1)   (2.5)   184.0    (171.3)  465.9  923.8                         96.5
 
As you can see, none of these quite made the grade in terms of having 6 positives, although some are close. 10 of the 13 had negative position adjustments that took them out of the running, even Willie Mays, which might surprise some of you (it surprised me). Although Mays was primarily a center fielder, for most of his career the position adjustment for center fielders was a small annual negative (typically around minus-1 runs per year). Given a center fielder’s general perception as being an important "up the middle" defensive position, that may be hard for some to accept, but that’s the way the adjustment was applied. As we saw in the position adjustment table earlier, center fielders in more recent years do now receive a positive position adjustment. In any case, Mays also had a second negative runs result, that being in the Double Play Runs component, so that would have taken him out of the running as well.
 
So, as you can see, none of these top 13 were positive across the board. Barry Bonds, Stan Musial, and Rickey Henderson were positive in all components except for the position adjustment, as Bonds and Henderson were mostly left fielders with Musial splitting time among first base and all 3 outfield positions. Joe Morgan had solid positives in everything except Fielding Runs, which some might question because he had a good defensive reputation and took home a few Gold Glove awards.   Alex Rodriguez was oh so close, but had a small negative figure in Double Play Runs. Hank Aaron, Ted Williams, Frank Robinson, Mike Schmidt, Mickey Mantle, and Albert Pujols all had 2 negatives, and Carl Yastrzemski had 3.
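The screen itself is simple: a player qualifies only if every one of the 6 components is strictly greater than zero. Here is a minimal sketch that reports which components knock a player out, using four rows copied from the table above. The helper name is hypothetical.

```python
COMPONENTS = ("Rbat", "Rbaser", "Rdp", "Rfield", "Rpos", "Rrep")

# Component values copied from the top-13 table above.
PLAYERS = {
    "Barry Bonds":      (1128.4, 43.6, 5.6, 175.0, -101.2, 394.8),
    "Willie Mays":      (805.9, 78.0, -8.4, 184.7, -18.9, 453.4),
    "Alex Rodriguez":   (639.5, 56.3, -5.1, 22.9, 71.6, 440.4),
    "Carl Yastrzemski": (449.5, -2.1, -2.5, 184.0, -171.3, 465.9),
}


def non_positive_components(values):
    """Return the names of components that are zero or negative."""
    return [name for name, value in zip(COMPONENTS, values) if value <= 0]


for player, values in PLAYERS.items():
    misses = non_positive_components(values)
    status = "all positive" if not misses else "misses on " + ", ".join(misses)
    print(f"{player}: {status}")
```

A value of exactly zero counts as a miss here, consistent with the "Negative or Zero Runs" column in the earlier table.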
 
As you might have guessed, I stopped at 13 because the next player down is our first across-the-board positive:
 
Name               G      Rbat     Rbaser  Rdp     Rfield   Rpos     Rrep   Total Runs Above Replacement  Career WAR
Eddie Mathews      2,391  501.6    0.7     17.7    33.2     22.3     371.4  946.7                         96.1
 
I was kind of surprised to see that Mathews ended up with positives in all components, because I don’t think he’s generally thought of as a great all-around player. He does just barely qualify in the Baserunning Runs component – that’s his closest call. But, anyway, he is our highest career WAR qualifier with 6 positive results.
 
The next 8 in career WAR after Mathews are Cal Ripken Jr., Roberto Clemente, Adrian Beltre, Mel Ott, Al Kaline, Wade Boggs, George Brett, and Chipper Jones. Legends all, but each one has at least one negative WAR component. The next member of the 6-positive club is one that probably wouldn’t surprise you:
 
Name               G      Rbat     Rbaser  Rdp     Rfield   Rpos     Rrep   Total Runs Above Replacement  Career WAR
Ken Griffey Jr.    2,671  440.2    15.7    9.2     3.4      20.7     391.3  880.4                         83.8
 
Now, Griffey Jr., like Mays, was primarily a center fielder, but by the time Griffey Jr. came along, the position adjustment for center fielders had turned from a slight negative to a slight positive, which worked in Griffey Jr.’s favor. Griffey’s pure Fielding Runs figure is way behind Mays’, but he did at least come out on the positive side of the ledger.
 
For the sake of brevity, here is the full list of the 43 "across the board" positives. The list is sorted by Career WAR, with Hall of Famers highlighted in yellow.  
 
Name               G      Rbat     Rbaser  Rdp     Rfield   Rpos     Rrep   Total Runs Above Replacement  Career WAR
Eddie Mathews      2,391  501.6    0.7     17.7    33.2     22.3     371.4  946.7                         96.1
Ken Griffey Jr.    2,671  440.2    15.7    9.2     3.4      20.7     391.3  880.4                         83.8
Arky Vaughan       1,817  356.9    12.1    34.0    21.0     97.1     244.4  765.6                         78.0
Luke Appling       2,416  233.1    18.3    9.5     42.0     149.2    341.7  793.7                         77.6
Lou Whitaker       2,390  209.4    32.1    16.0    76.7     71.3     344.2  749.7                         75.1
Alan Trammell      2,293  132.3    25.0    14.4    76.9     133.3    323.2  705.1                         70.7
Barry Larkin       2,180  200.2    80.7    3.7     17.6     124.4    282.9  709.5                         70.5
Pee Wee Reese      2,166  30.3     55.7    8.2     117.0    132.2    339.7  683.2                         68.4
Kenny Lofton       2,103  139.9    78.9    23.2    107.9    43.4     318.0  711.2                         68.4
Ryne Sandberg      2,164  191.9    33.6    11.1    60.0     66.3     294.8  657.6                         68.0
Chase Utley        1,937  172.5    44.8    23.9    131.0    43.6     243.5  659.4                         64.5
Ken Boyer          2,034  183.7    13.5    6.1     73.0     29.3     305.8  611.4                         62.8
Charlie Gehringer  1,591  289.9    33.9    15.5    42.0     53.7     246.7  681.7                         62.8
Jackie Robinson    1,382  259.4    31.9    9.1     80.8     19.7     218.1  619.1                         61.7
Sal Bando          2,019  205.9    11.4    3.5     36.3     37.7     289.4  584.2                         61.5
Yogi Berra         2,120  225.7    11.6    11.0    33.7     58.6     254.3  595.2                         59.4
Billy Herman       1,922  156.6    8.2     2.4     55.0     68.0     267.2  557.7                         57.3
Jim Fregosi        1,902  138.5    9.4     9.5     2.8      74.7     240.3  475.3                         48.7
Mike Cameron       1,955  70.5     40.9    5.0     71.2     33.5     268.4  489.7                         46.7
Chuck Knoblauch    1,632  104.5    42.8    1.9     26.1     27.5     262.9  465.8                         44.6
Lonny Frey         1,535  50.7     23.0    23.0    57.0     67.2     204.0  425.0                         44.5
Ben Zobrist        1,651  122.9    10.0    11.4    51.0     5.3      239.6  440.0                         44.5
Nomar Garciaparra  1,434  189.0    0.7     3.4     15.0     49.4     212.6  470.2                         44.3
Lenny Dykstra      1,278  132.8    45.9    5.8     45.7     20.6     162.2  413.0                         42.4
Harlond Clift      1,582  167.8    22.8    1.3     3.0      17.9     232.0  444.9                         42.1
Eddie Stanky       1,259  127.2    10.7    2.0     24.8     57.4     188.0  409.9                         41.4
Andy Van Slyke     1,658  122.1    30.7    7.3     26.1     1.5      211.1  398.6                         41.3
Gil McDougald      1,336  93.7     6.7     1.9     89.7     43.3     164.7  400.1                         40.6
Eric Chavez        1,615  94.3     9.1     2.6     40.7     17.9     229.8  394.6                         38.3
Ray Lankford       1,701  172.5    4.7     3.3     2.7      9.9      205.6  398.5                         38.2
Garry Maddox       1,749  7.3      15.0    13.1    99.9     3.2      222.8  361.5                         36.8
Don Money          1,720  70.3     3.0     0.6     22.8     16.1     242.1  355.0                         36.5
Johnny Pesky       1,270  86.8     15.5    16.6    17.6     45.3     154.7  336.4                         34.3
Robby Thompson     1,304  67.3     13.7    5.7     36.1     37.8     168.4  329.0                         33.8
Shane Victorino    1,299  28.0     37.9    15.3    72.0     1.5      161.6  316.5                         31.5
Jacoby Ellsbury    1,235  20.4     36.1    7.9     30.0     20.0     188.8  303.2                         31.2
Brian Roberts      1,418  17.5     13.1    4.2     8.0      34.5     220.9  298.3                         29.5
Red Rolfe          1,175  21.9     24.8    27.1    40.0     18.9     180.1  312.8                         29.1
Edgardo Alfonzo    1,506  65.7     8.6     1.0     16.6     32.9     188.8  313.6                         28.7
Denard Span        1,359  39.9     16.7    6.1     8.0      12.3     197.5  280.5                         27.9
Lloyd Moseby       1,588  17.5     22.8    7.3     8.4      3.0      224.2  283.1                         27.6
Billy Goodman      1,623  28.6     11.8    10.4    20.2     5.5      193.2  269.7                         27.2
Jean Segura        1,230  2.8      20.7    2.2     18.0     57.3     169.9  271.0                         26.2
 
Because Position Adjustment Runs is a key component, this list is heavily dominated by up-the-middle players and third basemen, although Berra was the only catcher. If you classify by primary position, here’s the distribution:
 
Position  Count
2b        14
cf        11
ss        9
3b        8
c         1
 
At this point, I thought I’d get a little more selective, because even though the result was only about 3% of the data set, a lot of these players qualify as "positive" in certain categories by an oh-so-slim margin. 
 
So, rather than looking for players who were simply "positive" across all 6, how about those who achieved a higher level in all 6? Below are the players from the previous result set who were in the 75th percentile or above (within the dataset) in all 6 categories:
 
Name       Rbat   Rbaser  Rdp   Rfield  Rpos   Rrep   Career WAR  Rbat Pctile  Rbaser Pctile  Rdp Pctile  Rfield Pctile  Rpos Pctile  Rrep Pctile
Gehringer  289.9  33.9    15.5  42.0    53.7   246.7  62.8        92.5         93.7           93.0        79.8           82.1         75.9
Appling    233.1  18.3    9.5   42.0    149.2  341.7  77.6        88.6         86.6           83.3        79.8           99.6         95.0
Whitaker   209.4  32.1    16.0  76.7    71.3   344.2  75.1        85.5         93.4           93.3        91.7           89.8         95.1
Sandberg   191.9  33.6    11.1  60.0    66.3   294.8  68.0        83.3         93.6           87.9        86.2           87.6         88.9
Utley      172.5  44.8    23.9  131.0   43.6   243.5  64.5        81.1         96.5           97.9        96.9           76.5         75.0
Lofton     139.9  78.9    23.2  107.9   43.4   318.0  68.4        76.4         99.3           97.6        95.5           76.3         92.1
 
So, now we’re down to just 6 players: 4 second basemen, a shortstop, and a center fielder. And, if we got even more restrictive (say, 80th percentile or above)? That would get us down to two:
 
Name       Rbat   Rbaser  Rdp   Rfield  Rpos   Rrep   Career WAR  Rbat Pctile  Rbaser Pctile  Rdp Pctile  Rfield Pctile  Rpos Pctile  Rrep Pctile
Whitaker   209.4  32.1    16.0  76.7    71.3   344.2  75.1        85.5         93.4           93.3        91.7           89.8         95.1
Sandberg   191.9  33.6    11.1  60.0    66.3   294.8  68.0        83.3         93.6           87.9        86.2           87.6         88.9
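The percentile screen behind the two tables above can be sketched as follows. The article does not spell out exactly how the percentiles were computed, so the `percentile_rank` definition here (share of the data set at or below a value) is just one common convention and an assumption on my part. The threshold filter, on the other hand, uses the percentile figures exactly as listed above.

```python
def percentile_rank(value, population):
    """Percent of the population at or below value (one common definition)."""
    at_or_below = sum(1 for x in population if x <= value)
    return 100.0 * at_or_below / len(population)


# Percentiles as listed above, in the order
# Rbat, Rbaser, Rdp, Rfield, Rpos, Rrep.
PCTILES = {
    "Gehringer": (92.5, 93.7, 93.0, 79.8, 82.1, 75.9),
    "Appling":   (88.6, 86.6, 83.3, 79.8, 99.6, 95.0),
    "Whitaker":  (85.5, 93.4, 93.3, 91.7, 89.8, 95.1),
    "Sandberg":  (83.3, 93.6, 87.9, 86.2, 87.6, 88.9),
    "Utley":     (81.1, 96.5, 97.9, 96.9, 76.5, 75.0),
    "Lofton":    (76.4, 99.3, 97.6, 95.5, 76.3, 92.1),
}


def survivors(threshold, pctiles=PCTILES):
    """Players whose weakest component is still at or above the threshold."""
    return [name for name, values in pctiles.items()
            if min(values) >= threshold]


print(survivors(75))
print(survivors(80))
```

Framing the test as "the weakest component must clear the bar" makes it easy to see why Gehringer and Appling drop out at 80: each sits at 79.8 in Rfield.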
 
How about that? After all the whittling down, the only 2 players who were at the 80th percentile or above across all 6 components are the players who were generally considered the best AL and the best NL second basemen of the 1980’s (although some might opt for Willie Randolph over Whitaker in the AL). But, I think these 2 would generally be the consensus picks.
 
And, of course, those who know me would probably take this opportunity to remind me that I am generally not an advocate for Whitaker’s selection to the Hall of Fame (which is true). But, it does point out one thing that I do believe holds true for Whitaker, and that is that he was a very good all-around player who didn’t really have any big weaknesses. He had a long career and had an extremely high number of good seasons (3 and 4 WAR type seasons), but he didn’t have many high ones (5.0 or over) and he also didn’t have any really poor ones. He stayed in the middle lanes for virtually his entire career. This is kind of a similar dynamic: Whitaker was good across the board in all the components of WAR that help result in a positive figure, and didn’t have any bad categories. That’s basically the type of player he was.
 
Pre-1931 and the Negro Leagues
 
One last analysis before calling it a day. As mentioned earlier, data from pre-1931 and data from the Negro Leagues for Double Play Runs (Rdp) is just a bunch of zeros, so I excluded all pre-1931 seasons as well as excluding all Negro League seasons. 
 
But what if we did a partial query? That is, what if we looked for players from those 2 segments who registered positives in the 5 categories other than Rdp?
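In code terms, the partial query is the same all-positive test with Rdp left out of the check. A minimal sketch, using Honus Wagner's figures from the table that follows (with Rdp set to zero, as it is for every pre-1931 player); the function name and the `skip` parameter are my own inventions for illustration:

```python
COMPONENTS = ("Rbat", "Rbaser", "Rdp", "Rfield", "Rpos", "Rrep")


def all_positive(player, skip=()):
    """True if every component not listed in skip is strictly positive."""
    return all(player[name] > 0 for name in COMPONENTS if name not in skip)


# Honus Wagner: Rdp is 0.0 because no double play data exists for his era.
wagner = {"Rbat": 637.7, "Rbaser": 33.7, "Rdp": 0.0,
          "Rfield": 85.0, "Rpos": 107.2, "Rrep": 373.6}

print(all_positive(wagner))                 # strict 6-component test
print(all_positive(wagner, skip=("Rdp",)))  # 5-component test, ignoring Rdp
```

Under the strict 6-component test no pre-1931 or Negro League player could ever qualify, since a zero is not a positive, which is why these two groups need the relaxed version.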
 
Using the same criteria (1,000+ games) for NL/AL players, there are 290 players in that dataset. 15 of those players had positives in the 5 non-Rdp categories (about 5%). The list is sorted by Career WAR, with Hall of Famers highlighted in yellow.  
 
Name             G      PA      Rbat   Rbaser  Rfield  Rpos   Rrep   Total Runs Above Replacement  Career WAR
Honus Wagner     2,794  11,759  637.7  33.7    85.0    107.2  373.6  1237.4                        130.8
Eddie Collins    2,826  12,082  627.8  39.2    35.0    36.5   437.9  1176.3                        124.4
George Davis     2,372  10,178  278.3  19.6    146.0   91.0   349.5  884.7                         84.9
Bill Dahlen      2,444  10,435  135.4  11.5    139.0   144.9  344.2  775.3                         75.2
Frankie Frisch   1,565  6,981   168.3  28.7    141.0   53.7   207.2  598.7                         59.9
Jack Glasscock   1,699  7,372   132.9  15.7    146.0   106.5  248.0  649.0                         59.5
Wally Schang     1,812  6,347   159.3  1.8     4.0     55.7   238.9  459.9                         48.1
Johnny Evers     1,784  7,226   80.8   2.3     127.0   3.4    226.4  439.9                         47.7
John McGraw      1,067  4,815   304.2  12.1    9.0     33.9   157.5  516.8                         45.7
Buck Ewing       1,232  5,380   176.3  21.8    70.0    29.4   180.2  477.8                         44.2
Hughie Jennings  1,196  5,266   168.2  12.6    68.0    48.2   174.8  471.8                         41.5
Art Devlin       1,313  5,245   68.3   7.4     46.0    40.2   166.1  328.0                         36.1
Bid McPhee       1,227  5,448   74.3   1.8     87.0    20.9   182.2  366.3                         33.0
Hank Gowdy       1,050  3,145   27.1   4.4     55.0    38.1   97.0   221.5                         23.8
Bob O'Farrell    1,222  4,065   24.3   7.6     23.0    37.2   125.3  217.5                         21.9
 
That breaks down to 5 shortstops, 4 second basemen, 4 catchers, and 2 third basemen.
 
The top players from that era who had one or more negative components include Ty Cobb, Tris Speaker, Babe Ruth, Rogers Hornsby, and Nap Lajoie. Cobb, Speaker, and Ruth had negative figures in the position adjustment component, and Ruth, Hornsby, and Lajoie had negative baserunning run figures.
 
How about Negro Leaguers? I reduced the games played threshold to 300 to account for the fact that we don’t have complete career data for Negro League players, and as a result the "games played" stats that have been captured tend to be significantly lower than what we tend to see for NL & AL players.
 
For this pull, there are 146 players in the dataset, with 7 (about 5%) showing 5 positives:

Name          G      PA     Rbat   Rbaser  Rfield  Rpos  Rrep   Total Runs Above Replacement  Career WAR
Willie Wells  1,039  4,538  309.8  5.9     36.7    64.7  147.3  564.6                         51.0
Dobie Moore   453    2,008  118.5  0.1     52.3    29.8  66.6   267.3                         24.5
Newt Allen    945    4,080  2.3    3.0     47.3    32.7  132.3  217.6                         20.5
Bill Riggins  677    2,945  28.9   1.5     29.2    40.1  98.4   198.1                         18.3
Sam Bankhead  629    2,710  1.1    4.3     19.1    21.4  86.0   133.8                         12.1
Pythias Russ  311    1,260  35.2   1.8     9.7     14.5  42.9   104.0                         10.0
Tom Young     345    1,190  18.1   1.5     2.6     9.6   40.7   72.5                          6.9
 
These names probably aren’t as recognizable to most of you with the exception of Hall of Famer Willie Wells, although I’m sure some of you are familiar with at least some of the others like Moore and Allen. All of the other well-known Negro League star hitters such as Turkey Stearnes, Oscar Charleston, Josh Gibson, Mule Suttles, Jud Wilson, Cool Papa Bell, Buck Leonard, Biz Mackey, Cristóbal Torriente, Bullet Rogan, and John Henry Lloyd had at least one negative category, typically either Baserunning Runs (Gibson, Suttles, Leonard, Mackey, Lloyd) or the Position Adjustment category (Stearnes, Charleston, Suttles, Wilson, Bell, Leonard, Torriente, Rogan).
 
Wrapping it Up
 
In conclusion, regardless of the league or the era, there are very few players who manage to post positive, across-the-board figures in all WAR-component categories. In the different pulls I did (which had minimum game thresholds), about 4% of the players were able to achieve that status. If we took all players without any regard to number of games, and even if we included those with a 0.0 Rdp figure, it only happens about 0.5% of the time.
 
Thank you for reading,
 
Dan
 
 
 
 
 
 

COMMENTS (16 Comments, most recent shown first)

BJHanke
Dan - I understand. The positional adjustments are certainly a component of WAR. I put all that info up because 1) I spent a lot of time dealing with the issue, so I'm pretty sure I know what I'm talking about, and 2) a lot of people take those positional adjustments seriously. But, for YOUR purposes, well, they ARE one component of WAR. I can't bitch because you used a component.
12:25 PM Jul 14th
 
DMBBHF
Hi Chris,

Always nice to hear from our resident Hall of Fame expert....... :)

Yeah, I'd have to say that this little exercise didn't really sway me on Whitaker. I think it just reinforced what he's always been to me.....a player who didn't really have weaknesses, and was good in many different ways, but he just doesn't scream "Hall of Famer" to me. I suppose I'm just stubborn. But, as I've said, I certainly won't be upset if (when?) he does make. I think it's inevitable that he will.

BJHanke,

Thank you for the comment, although I'm not sure how to respond. You may very well be right about all of that, but I was simply curious about which players achieved their total WAR figures through all-positive building blocks. That was really all I was trying to find out. I'll let others debate the merits of the individual pieces.

Thanks,
Dan
9:30 PM Jul 11th
 
BJHanke
Hi, Dan. Trying to keep this short, I was asked to compare Lou Brock to Bobby Abreu as HoF candidates, which led me to check out the BB-Ref positional adjustment methodology. The BB-Ref positional adjustments are a horror show. They are subjective in nature, don't work whether subjective or objective, and the concept is nonsense. All of their work seems to have been MISapplied from Bill James' work in the book Win Shares, 20 years ago.

The approach is taken from Bill's essay "Why Did The Defensive Spectrum Jump?" in Win Shares, an essay that documents that the defensive values of 2B and 3B changed over time. The idea is to use an offensive stat to develop a "quantified offensive spectrum (QOS)" (my terminology) which they then flip to get a quantified defensive spectrum. The biggest problem is that the offensive stat they use - tOPS+ - is complete garbage, and generates a QOS that is impossible (you can find all this on the same BB-Ref web page as the chart of values). The reasonable approach to this would be to say, "we need a better offensive stat." Instead, they throw unspecified "adjustments" at the thing until it gives them something that looks respectable. The problem is that the QOS is nowhere near close enough to justify adjustments. The question stops being, "Why did you make that adjustment?" and becomes, "Why did you stop after those adjustments instead of making more?" And the answer to that question is, "We stopped because we finally got what we wanted." Which is subjective.

And they don't work. Bill took careful note, in his essay, of whether the two big 2B bats of Lajoie and Hornsby biased the results. He found that they did not, over the course of three decades. But I was looking at Lou Brock. Check out your chart above for LF, decades of the 1960s and 1970s. They are higher than the adjustments for the surrounding decades. Why? Because Yaz, Stargell, Billy Williams and Lou Brock were all playing LF in VERY close careers (those two decades and very little else). And those four guys, synched up, produced enough offense to queer the Offensive Spectrum.

And even if they did work, the approach is indefensible, since they apply the positional adjustments to OWAR, which is offense. Bill blew up that idea 20 years ago in an untitled essay that leads off the Whys and Wherefores section of Win Shares, pp. 102-105. The key passage is on p. 104. Bill posits a LF and a 2B; the LF hits worse than average for a LF, and the 2B hits better than average for a 2B, but the LF hits more than the 2B. He then demonstrates that the effect of using a positional adjustment approach is to claim that the 2B is a BETTER hitter than the LF. To quote Bill, "(The method requires the claim that the 2B) is a better hitter because he plays second base.... Well, of course he is not a better hitter."
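
To put rough numbers on that LF vs. 2B example, here is a minimal sketch in Python. Every figure in it is invented for illustration (they are not from Win Shares or from BB-Ref), and the names `raw_batting` and `position_avg` are hypothetical. It only shows the shape of the argument, not how any WAR system actually computes its adjustment.

```python
# Hypothetical batting runs above the league-average hitter (invented numbers).
raw_batting = {"LF": 10, "2B": 5}

# Hypothetical batting runs of the average player at each position,
# also measured against the league-average hitter (invented numbers).
position_avg = {"LF": 15, "2B": -5}

# Position-relative batting: how each hitter compares to his positional peers.
relative = {pos: raw_batting[pos] - position_avg[pos] for pos in raw_batting}

# Who is the better hitter under each view?
better_raw = max(raw_batting, key=raw_batting.get)
better_relative = max(relative, key=relative.get)

print("Raw batting runs:      ", raw_batting)
print("Relative to position:  ", relative)
print("Better hitter (raw):     ", better_raw)
print("Better hitter (relative):", better_relative)
```

With these made-up inputs, the LF is the better hitter in raw terms, but the position-relative view ranks the 2B ahead of him, which is the claim Bill objected to.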

I note that Tangotiger seems to understand this; he wants to point the positional adjustments at the DEFENSIVE numbers. I'm not sure that this will work, either, but at least he's not pointing the PosAdj at a completely inappropriate target.

All in all, I'd like to suggest that you just drop the PosAdj from your criteria as inappropriate nonsense. Sorry, but when someone challenges you to compare Brock to Abreu, you obsess over things like positional adjustments.
8:44 AM Jul 11th
 
chrisbodig
Really enjoyed the piece, Dan. Bookmarking it for future reference.

Personally, I found it interesting that it came down to Whitaker and Sandberg as the two players who wound up in the 80th percentile for all categories. They're the #1 comp for each other on similarity scores but Ryno has a 158-to-93 "Hall of Fame monitor" score because of the accolades advantage (MVP, 10x ASG, 9x GG) compared to Sweet Lou (ROY, 5x ASG, 3x GG).

I'm surprised that your analysis didn't make you a Whitaker supporter for the Hall!

Great stuff.
2:01 PM Jul 10th
 
DMBBHF
Hi Jgf,

No problem on the Dan vs. Dave thing.....it can get confusing, especially with Dave Fleming on board. At least you didn't call me "Mark", which I get a lot of due to my last name :)

Glad you enjoyed the article, and I do appreciate that you and Tom steered me to the WAR raw data source. That made the whole thing relatively easy to download and analyze.

I do understand the feedback regarding the possibility of combining Rpos and Rfield, and you're like the third person to mention it, so I am reflecting on it. However, I should say that what got me interested in the first place was the concept that there are 6 different components that feed into the final WAR total, and I was interested in finding players who realized positives in all 6, kind of treating each component as a binary hurdle/checkbox. As DrDoom pointed out, of course, one of those (Rrep) was kind of a given as a positive, but the other 5 could go either way. I wanted to see how common/rare it was for someone to have positives across the board.

So, I didn't really see it as being restrictive or, as Tom referred to it, as "artificial". I wanted to see which players had each of the 6 components pushing them upward. That was my original curiosity.

Thanks,
Dan
11:04 AM Jul 10th
 
jgf704
Crap, sorry. You're Dan, not Dave.
9:03 AM Jul 10th
 
jgf704
Hey Dave! Sorry I missed the article until now; I enjoyed it (and thanks for the shout-out).

FWIW, I also would have preferred to see Rpos and Rfield combined. By not doing that, you are doing what you describe in your response to Tom, which is that you restrict yourself to players who play a key defensive position, but you simply let Rpos (greater than 0) define that for you. There's nothing wrong with that, of course...

OTOH, I think combining Rpos and Rfield, and then looking for players with the best balance between Rbat, Rbaser, Rdp, and Rfield+Rpos more directly accomplishes the goal of finding players who were good across the board. It seems like it could be a fun article to research and write, and equally fun to read.
10:53 PM Jul 8th
 
tangotiger
I think that "key defensive position" is an artificial limiting agent.

And being "above average for your position" really sets the bar much much higher for the SS and Catcher than it does for the 2B+3B.

Ichiro, Clemente, Beltran are all excellent examples of the "complete" player.
5:00 PM Jul 3rd
 
DMBBHF
Tom,

I disagree, but I also think I could have done a better job of describing what I was trying to do. One of the criteria was that, to qualify for the final list, a player had to play a "key defensive position", much like Bill had alluded to in one of his earlier studies. I assume he was using "up the middle" to identify key defensive position in his, but I decided to use a positive Rpos to determine that, so I had to keep Rpos separate.

If I combined Rfield and Rpos into a single number, the number of "all positive" players would have more than doubled from 43 to 89. The original 43 would all still qualify, of course, but it would have also added the following:

Barry Bonds
Chipper Jones
Mike Trout
Joe Morgan
George Brett
Larry Walker
Roberto Clemente
Larry Doby
Carlos Beltran
Roberto Alomar
Andre Dawson
Robin Yount
J.D. Drew
Richie Ashburn
Jose Cruz
Kevin Youkilis
Reggie Sanders
Curtis Granderson
Starling Marte
Don Buford
Ron Hunt
Ichiro Suzuki
Carlos Guillen
Marcell Ozuna
Dick McAuliffe
Bill Doran
Sam West
Bake McBride
Carl Crawford
Bill North
Willie Davis
Jose Reyes
Jason Heyward
Brian Jordan
Steve Finley
Ruppert Jones
Cesar Tovar
Josh Reddick
Curt Flood
Angel Pagan
Irv Noren
Jim Landis
Brett Gardner
Alex Gordon
Randy Winn
Terry Moore

Now, some of those might have been fine additions, but too many of those would have brought in players from non-key defensive positions. For example, Bonds is great, but as a player who was mostly a LF, he's not at one of the more important defensive positions. Same with Walker and Clemente as right fielders. Great defensive players, but not at one of the more important defensive positions.

Some like Chipper, Morgan, Trout, and Alomar had positive Rpos figures from playing "key" defensive positions, but they had negative fielding runs, so I didn't want to include them either.

In short, what I was going for was that players had to hurdle BOTH thresholds - a positive defensive runs figure AND playing a "key" defensive position. I wasn't just going for a net positive defensive result - they had to check both boxes.
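
For anyone who wants to see the difference between the two approaches spelled out, here is a minimal sketch in Python. The four players and all of their run totals are invented for illustration, and the function names are hypothetical; this is not the actual script or data I used, just the logic of the two filters side by side.

```python
# The six WAR components discussed in the article.
COMPONENTS = ("Rbat", "Rbaser", "Rdp", "Rfield", "Rpos", "Rrep")

# Hypothetical career totals in runs (invented numbers, not real players).
players = {
    "Player A": {"Rbat": 120, "Rbaser": 15, "Rdp": 4, "Rfield": 30, "Rpos": 25, "Rrep": 250},
    "Player B": {"Rbat": 300, "Rbaser": 20, "Rdp": 6, "Rfield": 90, "Rpos": -70, "Rrep": 300},
    "Player C": {"Rbat": 200, "Rbaser": 10, "Rdp": 3, "Rfield": -20, "Rpos": 45, "Rrep": 280},
    "Player D": {"Rbat": 150, "Rbaser": 5, "Rdp": 2, "Rfield": -10, "Rpos": -50, "Rrep": 260},
}


def all_positive_strict(p):
    """Every one of the six components must be positive on its own."""
    return all(p[c] > 0 for c in COMPONENTS)


def all_positive_combined(p):
    """Rfield and Rpos are netted into one defensive number before checking."""
    others = ("Rbat", "Rbaser", "Rdp", "Rrep")
    return all(p[c] > 0 for c in others) and (p["Rfield"] + p["Rpos"]) > 0


strict = [name for name, p in players.items() if all_positive_strict(p)]
combined = [name for name, p in players.items() if all_positive_combined(p)]

print("Strict (both boxes checked):", strict)
print("Combined (net defense > 0): ", combined)
```

In this toy data, Player B is the Bonds/Walker/Clemente type (positive fielding runs, negative Rpos) and Player C is the Chipper/Morgan type (positive Rpos, negative fielding runs). Both clear the combined test but miss the strict one, and anyone who passes the strict test automatically passes the combined one.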

Thanks,
Dan

10:12 AM Jul 3rd
 
tangotiger
You 100% have to combine the fielding and positional adjustment, given what you are trying to do.
1:55 PM Jul 2nd
 
DMBBHF
Thank you for the comments.

DrDoom - that's a fair point, I probably should have focused on the other 5 categories, although I will say that when I first downloaded and organized the raw data, there were over 9,000 players with Rrep of zero or even slightly negative (probably small rounding accumulations of the seasonal figures), but they are mostly pitchers or other non-significant players. When I put the minimum games filter into place, they all go away. So, yes, I probably could have handled that a little better and simplified the analysis.

Thanks,
Dan
7:00 PM Jul 1st
 
OwenH
Great article Dan, really enjoyed it.

Yaz had 184 fielding runs and Schmidt 127? Wow.
5:25 PM Jul 1st
 
77royals
Just goes to show that the positional adjustments overrate middle infielders and center fielders, and underrate first basemen and corner outfielders.

As well as catchers, because Berra was such an exceptional player.

Looks like they got it right for third basemen.

1 position out of 9.


Still not understanding how you use positional adjustments for batters, as they are batting, not fielding.
3:38 AM Jul 1st
 
DrDoom
Thanks for this, always a fun exercise.

Adam Darowski at the Hall of Stats website does similarity scores by this method, which are tremendous fun to look at, and gets at players I believe to be even more similar than Bill's original versions.

But I would also say this much: I would consider combining Rfield and Rpos: that way, your Keith Hernandezes rank as above-average fielders, your Mickey Mantles rank around average, and your Yuni Betancourts STILL can't be saved by a positional adjustment. There's also an argument for combining Rbaser and Rdp, but I can understand if you really believe those to be separate skills. Combining those things makes three "true" categories: hitting, fielding, and baserunning. (There are some WAR systems out there that separate arm and glove; those would be really cool to use if you wanted to do a "five-tool" analysis.)

I'm also really unsure why you called these SIX categories, when there are only five - since, definitionally, the Rrep can't be negative. So I think you could've probably saved yourself some words and columns. But still, a fun article I enjoyed very much!
11:02 PM Jun 30th
 
©2024 Be Jolly, Inc. All Rights Reserved.