Three Letter Codes

April 5, 2021

            I have a million ideas like this one; usually I work on them an hour, two hours, a couple of days, a week, and very often I wind up throwing them away without publishing anything; they just interrupt the bigger things I am trying to do, without actually producing anything.  This is a normal part of research; I’ve had researchers in other fields tell me the same happens to them; they just waste time trying stuff, and once in a while something clicks.

            So, this seemed to work a little bit, so I’ll throw it against the wall and see if anything sticks.  Suppose that you narrow a player’s range of skills down to five things:



            Power,

            Speed,

            The ability to get on base,

            The ability to play a key defensive position, and

            Defensive excellence at the position.


            There’s an ancient framing of baseball skills that a "five-tool player" is one who can run, hit, throw, field, and hit for power.    I think. . .this could be completely wrong. . .but I think that originated in talk about Willie Mays; Willie was really the only guy who could do all five of those at a high level, but when I was young people would use that to define a "superstar"; a superstar was a player who could run, hit, throw, field and hit for power.  Gradually it disconnected from a superstar, and became a "five tool player", a distinct concept, which had to happen, because very few superstars actually do all five of those things.  Eventually fans realized that 90% of the players whom scouts described as "five tool players" were guys like Wily Mo Pena and Corey Patterson who had all the tools you could want, but couldn’t actually play baseball.

            This is kind of a related concept, but not exactly.  I have tried many times to form "families" of baseball players, and all of those methods sort of work but don’t exactly work, so I am always looking for a better way to do it.  So I had this idea:  Suppose that you "coded" all players by their three major skills?

            I took the career records of all major league players in history who played 1,000 or more games; I have 1,486 of them in my file, and there are a few others that I eliminated for one reason or another.  Then I ranked those 1,486 players 1 through 1,486 in order of:

1)     Speed,

2)     Power,

3)     The ability to get on base,

4)     Their defensive position, and

5)     Their defensive excellence. 


Yes, I know; I already said that.  I ranked them 1 through 1,486 in each of those skills.   P S O D F—Power, Speed, On Base, Defensive Position, and Fielding Excellence at the position.  Speed is easy; I have speed scores for everybody.   Power is easy; that’s just Isolated Power.  On Base is just on base percentages.   

The fielding stuff is more complicated; fielding is always complicated.  The problem with "defensive position" is that, even if we all agreed that "catcher" is the most important defensive position and "DH" is the least important position, there are many catchers and many DHs, so you have a lot of ties.  I broke the ties by (1) Games Played at the position, and (2) Defensive excellence.  
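The tie-breaking order described above can be sketched as a sort with a three-part key. This is only an illustration: the ordering of the positions between catcher and DH, the field names, and the sample players are all assumptions made up for the example, not the rankings used in the study.

```python
# Hypothetical ordering of positions from most to least demanding.
# Only "catcher first, DH last" comes from the text; the rest is assumed.
POSITION_ORDER = ["C", "SS", "2B", "CF", "3B", "RF", "LF", "1B", "DH"]


def defensive_position_ranking(players):
    """Order players from rank 1 downward on 'defensive position'.

    Players at the same position are separated by (1) more games played
    at the position, then (2) a higher defensive-excellence score.
    """
    return sorted(
        players,
        key=lambda p: (
            POSITION_ORDER.index(p["position"]),  # key positions first
            -p["games_at_position"],              # more games ranks higher
            -p["fielding_score"],                 # better fielding ranks higher
        ),
    )


# Made-up sample records, purely for illustration.
sample = [
    {"name": "Catcher A", "position": "C", "games_at_position": 1200, "fielding_score": 55},
    {"name": "Catcher B", "position": "C", "games_at_position": 1500, "fielding_score": 40},
    {"name": "Catcher C", "position": "C", "games_at_position": 1500, "fielding_score": 70},
    {"name": "DH A", "position": "DH", "games_at_position": 1100, "fielding_score": 10},
    {"name": "Shortstop A", "position": "SS", "games_at_position": 1800, "fielding_score": 80},
]

ranked_names = [p["name"] for p in defensive_position_ranking(sample)]
print(ranked_names)
```

The three catchers tie on position, so games played separates Catcher A from the other two, and the fielding score separates Catcher C from Catcher B.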

Defensive excellence relative to position is of course hard.  I invented a system to rank players by defensive excellence, which I won’t share with you because it’s not ready for Prime Time, but it works pretty well in most cases, is problematic in some. 

Anyway, having ranked all players in all five categories, we then ask, about each player, "in what area does he rank the highest?"  Obviously, Kenny Lofton and Willie Davis and Brett Butler rank highest in "speed", but so do, for example, Nick Punto, Eddie Miksis, Endy Chavez and Jamey Carroll.   Willie Mays was no doubt faster than Nick Punto, but for Mays, speed was his fourth-ranking skill—thus, not part of his code. 

Then we move on to "What is this player’s second highest-rated skill?" and "What is this player’s third-highest rated skill?"  With 1,486 players eligible for the study there are about 300 players for whom "Speed" would be their highest-ranking skill.   There are 60 possible codes, so there are, on average, about 25 players in each "code group."   25 average; the numbers range from 74 for the code "DFS" down to 6 for the codes "PFD" and "DSP".  DFS is 1) playing a key defensive position, 2) playing it well, and 3) having a little bit of speed.   "Running well"; not fast, but runs well. Lots of those guys, mostly guys like Tim Foli, Scott Fletcher, Walt Weiss, Mike Bordick.  There are some good players in that group—Marty Marion and Jim Fregosi, for example—and there are some catchers in the group, catchers who ran well, like Manny Sanguillen.   Remember, we’re not saying that speed is a central skill here; speed is their number 3 skill out of 5.   Mostly these are not great players.   No one in the group is in the Hall of Fame, only 7 of the 74 players had 2,000 career hits, and the only one to win an MVP Award was Marion.
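The coding step can be sketched in a few lines of Python. This is a minimal sketch, not the actual code behind the study: the function names are invented, the five players "a" through "e" and their scores are made-up toy data, and the defensive position and fielding numbers are placeholder scores where higher simply means better.

```python
from itertools import permutations

SKILLS = "PSODF"  # Power, Speed, On base, Defensive position, Fielding


def rank_within_skill(scores):
    """Map player -> rank (1 = best) for one skill, given player -> score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}


def three_letter_codes(table):
    """table maps skill letter -> {player: score}; higher score is better."""
    ranks = {skill: rank_within_skill(table[skill]) for skill in SKILLS}
    codes = {}
    for name in table[SKILLS[0]]:
        # Line up the five skills by this player's rank in each, best first,
        # and keep the first three letters.
        best_first = sorted(SKILLS, key=lambda s: ranks[s][name])
        codes[name] = "".join(best_first[:3])
    return codes


# Toy data (hypothetical players and scores).
toy = {
    "P": {"a": 0.250, "b": 0.200, "c": 0.150, "d": 0.100, "e": 0.050},
    "S": {"a": 7.0, "b": 6.0, "c": 5.0, "d": 4.0, "e": 8.0},
    "O": {"a": 0.360, "b": 0.340, "c": 0.320, "d": 0.400, "e": 0.380},
    "D": {"a": 2, "b": 1, "c": 5, "d": 4, "e": 3},
    "F": {"a": 50, "b": 90, "c": 80, "d": 70, "e": 60},
}

codes = three_letter_codes(toy)
print(codes)

# Ordered choices of three skills out of five: 5 * 4 * 3 = 60 possible codes.
n_codes = len(list(permutations(SKILLS, 3)))
print(n_codes, round(1486 / n_codes, 1))
```

With 1,486 players spread over 60 codes, the average group comes out a little under 25, which matches the figure above.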

There are only six members of the group DSP (Defensive Position, Speed, some power.)  Those six are Travis Jackson, Sam Wise, Mike Lansing, Billy Martin, Eric McNair and Tom Foley.  If you look at the records of Martin, Lansing and McNair, you can see that they’re all kind of similar players.  Foley may be mis-coded because the system didn’t pick up his defensive skill, and it may be that he would be better coded as DFS, rather than DSP.   The other code that has only six adherents is PFD—players who have some power, and are good fielders at a somewhat vital defensive position (i.e., not DH, first base or left field.)  The six members of that group are George Hendrick, Larry Parrish, Jose Guillen, Matt Williams, Juan Uribe and Tom Brunansky. 

So I tried that, and it sort of worked.  Whether it "works" or not is really two different questions:  1) whether this is a valid approach to forming "identity groups" of players, and 2) whether my methodology accomplishes that.   On the second issue, of course there are problems with the methodology; it’s complicated, and this was the first time I had tried to make it work.  But I don’t have any reason to believe, based on that, that this would not be a useful approach for forming groups of players.

One thing I didn’t really expect, before doing the work, is that the "groups" would distinguish themselves by quality of performance.  My assumption going in was that there would be good players and bad players in every group, which there are, mostly, but there are some codes that are mostly stars and many codes which are mostly place keepers. The three LEAST successful groups of players are SPD, SPF and SDP.   SPD is Speed, Power, and playing a good defensive position.  SDP is essentially the same thing, and SPF is speed, power, and quality of defense relative to the position.  My system put a total of 35 players into those three groups, of whom not a single one played in 2,000 career games, had 2,000 hits, made the Hall of Fame or ever won an MVP Award or started an All Star game.  The 35 players in those three classes, alphabetically, are: Marlon Anderson, Kevin Bass, Daryl Boston, Tom Brookens, Jackie Brandt, Johnny Callison, Ed Charles, Billy Cox, Jerry Denny, Mike Devereaux, Juan Encarnacion, Gene Freese, Larry Herndon, Ruppert Jones, Gabe Kapler, Ken Landreaux, Felipe Lopez, Gary Matthews Jr., Dave May, Lee Maye, Nate McLouth, Catfish Metkovich, Dan Meyer, Raul Mondesi, Corey Patterson, John Reilly, Juan Rivera, Luis Salazar, Juan Samuel, John Shelby, Grady Sizemore, Bobby Tolan, Melvin Upton Jr., Mitch Webster, and Gerald Williams.

The three best players in those 35 were Johnny Callison, Grady Sizemore and Bobby Tolan.  All three of them were SPF, as opposed to SPD or SDP.  All three of them appeared early in their careers to be headed to Cooperstown, but all three of their careers jumped the tracks about age 25-26, because of injury or something else.  Callison hit a 3-run home run with two out in the 9th to win the 1964 All Star game, although he did not start it.  But those are the BEST players in these speed/power/defense packages, the best of them.  Most of these players, frankly, had disappointing careers.  If your central skills in baseball are speed and power, you are probably not a very good player. 

If I were to call up one of my friends in baseball now and say, "Hey, I’ve learned something really interesting.  If your scouts are raving about this college guy who is a speed/power combination, you need to be wary about that.  My studies show that the payoff rate on those guys is very poor."

As I say, if I were to do that NOW, that would be reckless and improper.  I don’t know that to be true.  It’s just something that accidentally dropped out of a study.  Many times things drop out of a study like that that don’t turn out to mean anything.  But it’s worth further study from different angles. 

Also, at this moment a few of you are chomping at the bit, anxious to tell me "Tom Brookens?  Tom Brookens wasn’t fast.  Why is he coded as a Speed Player?"  Tom Brookens wasn’t really GOOD at anything, frankly, although he was a useful player. He stole 10-15 bases a year, had an above-average triples rate and a below-average GIDP rate.  In our group, he ranks at the 14th percentile in on base percentage, the 49th percentile in fielding, the 56th percentile in power, the 64th percentile in defensive position, and the 68th percentile in speed.  So. . .speed it is. 
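Brookens makes a handy worked example, since all five of his percentiles are given above. A minimal sketch, assuming (as the text implies) that a higher percentile means closer to the front of the line; the function name is made up for the illustration:

```python
# Tom Brookens's percentile in each skill, as quoted in the text.
brookens = {"O": 14, "F": 49, "P": 56, "D": 64, "S": 68}


def code_from_percentiles(percentiles):
    """Return the three skill letters with the highest percentiles, best first."""
    top_three = sorted(percentiles, key=percentiles.get, reverse=True)[:3]
    return "".join(top_three)


brookens_code = code_from_percentiles(brookens)
print(brookens_code)  # SDP
```

Speed (68) edges out defensive position (64) and power (56), which lands him in SDP, one of the three speed-and-power groups listed earlier.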

Not necessarily saying it is RIGHT; I’m just saying that is the classification that I got by the process that I used.   Another one like that is Gene Freese, a few years before Brookens.  Freese wound up as the regular third baseman for two pennant-winning teams, the 1959 White Sox and the 1961 Reds, although he was a notoriously awful defensive third baseman, and no one remembers him for his speed.  We remember him for (a) his power, and (b) being on those two championship teams.  He hit 23, 17 and 26 homers and had OK batting averages, making him a 5th/6th type hitter.  But this is the result I got.  If I stop to argue through all of those issues right now, I’ll never finish the study. 

A kind of similar group is SPO—Speed, Some Power, some ability to get on base.  The two best players in that group are Rick Monday and Kirk Gibson, who are, when you think about it, really similar players.  Both were left-handed hitting, left-handed throwing center fielders, although Gibson moved to a corner outfield position early in his career.  Both came up with huge hype; Monday was the first player taken in the first draft, while Gibson got his manager in trouble when the manager (Sparky) compared him to a young Mickey Mantle before he was ready to play at a high level.   Both began with American League teams, and both won World Series rings with the Dodgers, Gibson also winning one with the Tigers, and Monday getting right to the brink of the World Series with the A’s, but not there.  Both have quite similar career totals—19 years for Monday, 17 for Gibson, similar numbers of home runs, similar batting averages and on base percentages.  I had never thought to link the two together, but the system DID put them in the same group, which is a point in favor of the method. 

But then we get back to the Brookens point.  As was true of Gene Freese, we think of Monday and Gibson more as power players than as speed players.  You may well be right to do this—but this system is also correct, given its own premise.  It may well be that Monday and Gibson’s power was more VALUABLE than their speed, but that is not what the system is asking.  What the system is asking is, "When you line up all players by power and line up all players by speed, are they closer to the front of the line in power, or in speed?"  The answer is, speed.  There are lots of guys who hit with more power than Gibson and Monday, but most of them are first basemen or big, slow outfielders like Adam Dunn or Jorge Soler.

The Monday/Gibson code group (SPO) is also a relatively weak group.  There are 20 players in that group.  Gibson is the only one to win an MVP Award.  No one in the group played 2,000 games or had 2,000 hits, and the 20 players started a total of 5 All Star games.   

The strongest group in terms of totals would be POF.  The group POF (Power, On Base, Good Fielding). . .that group has won 20 MVP Awards, and its members have started 122 All Star Games.  This is true in part because there are 51 members of that group, but that makes it the fourth-largest group.  (The largest group, DFS, has 74 members but has won only 1 MVP Award, and its members have started only 22 All Star games.  Only 3 of those 74 have 2000 total Runs + RBI—whereas of the 51 members of the POF group, 26 of the 51 have 2000 total Runs + RBI.)

So who are those guys?  The players in the POF group who HAVE won MVP Awards are Harmon Killebrew, Jim Rice, Boog Powell, George Foster, Jeff Burroughs, Willie Stargell, Cecil Fielder, Barry Bonds, Frank Robinson, Willie McCovey, Orlando Cepeda, Justin Morneau and Roger Maris.   Those in the group who did NOT win an MVP Award—hardly any less impressive—include Eddie Mathews, Jack Clark, Vic Wertz, Greg Luzinski, Rocky Colavito, Joe Adcock, Frank Howard, Mark McGwire, Reggie Smith, Fred McGriff, Adam Dunn, Wally Berger, Jim Thome, Norm Cash, Del Ennis, David Ortiz, Roy Sievers, Kent Hrbek, Carlos Delgado, and Prince Fielder.   And then, within the 51, is a group of lower-ranked stars (Andre Thornton, John Mayberry, Bill Skowron, Eddie Robinson, Jay Buhner, Pat Burrell, Ellis Burks, Lyle Overbay, Tino Martinez), and then there are some guys who just wind up here but aren’t at anywhere near the same level—Walt Dropo, John Mabry, Hector Lopez, Babe Dahlgren.

But, you are entitled to ask, how does David Ortiz, a Designated Hitter, wind up in the POF group, where "F" stands for good fielding?  David had a LOT of power (no kidding) and he had very good on base percentages.   That leaves, for choice of his third-leading skill, Speed, Defensive Position, and Fielding Quality.  As to speed. . .David did not have any speed.  As to "Defensive Position", he was a DH, so he certainly did not play a key defensive position.  He wasn’t a BAD first baseman; we moved him to DH not because he was a bad first baseman, but because he tended to get hurt playing first.   So our options for his "third skill" are zero, near zero, and near zero.  The system chose one of the near-zeroes.  The same for Greg Luzinski, Joe Adcock and maybe some others.

But you have to admit, don’t you, that David Ortiz IS grouped with the people he should be with, for the most part?  Ortiz is classified with McCovey, Stargell, Cepeda, Eddie Mathews, Jim Thome, Frank Howard, Boog Powell, Fred McGriff, Harmon Killebrew, Jim Rice, Carlos Delgado and Mark McGwire.  Isn’t that  his actual crew?   Isn’t that the right place for him?

Now, you CAN say, logically enough, that Jim Thome, Fred McGriff, Willie McCovey, Willie Stargell and Carlos Delgado SHOULD be in his group, but Harmon Killebrew, Eddie Mathews, Roger Maris and some other guys should be in some other group, because they had more defensive value.  But there’s a problem with that, too, which I’ll explain at the end of the article, in the section entitled "The end of the article." 

The POF code is tremendously strong, but it actually is not the strongest code on a per-capita basis.   The strongest codes for the number of players included are all codes that start with "F", meaning "good fielder for the position", and then all include either "O"—on base skills—or "P"—power.

The code FOP has only 18 members, but that includes three Hall of Famers—George Brett, Al Kaline and Carl Yastrzemski—plus any number of other quality players such as Robin Ventura, Don Mattingly, Mark Grace, Bernie Williams, Bobby Murcer and Chili Davis. 

The code FPO has 27 players—about an average number--but those 27 include Willie Mays, Ken Griffey Jr., Henry Aaron and five lesser Hall of Fame players—Eddie Murray, Dale Murphy, Ron Santo, Dave Winfield and Tony Perez.  It’s also got Freddie Lynn, Jim Edmonds, Cecil Cooper, Dwight Evans, George Scott, Sal Bando, Chet Lemon, Scott Rolen.  The only player in the group who couldn’t be considered a star is Willie Montanez. 

The code FOD (good fielder, fairly high on base percentage, not a first baseman/DH type). . .that code has only 10 members, but all 10 of them were very good players.  Lou Whitaker, Willie Randolph, Tony Phillips, Toby Harrah, Billy Herman, Bobby Grich, Al Oliver.  Three other old-time guys who were similar, but not many of you would recognize the names.

The code FOS has 26 players including eight Hall of Famers—Roberto Alomar, Joe Morgan, Tony Gwynn, Roberto Clemente, Kirby Puckett, Sam Crawford, Jim O’Rourke and Jake Beckley.  It also has Pete Rose, Keith Hernandez, Mickey Vernon.  No genuinely weak players.  Not as strong as FPO, but awfully strong. 

The code FPD has only 11 players, no Hall of Famers, but they are all quality players, mostly third basemen—Eric Chavez, Tim Wallach, Ken Caminiti, Graig Nettles, Gary Gaetti.  For some reason Torii Hunter and Bret Boone are in this group.

The code FPS (Fielding, Power, Speed) has only 13 members but four MVP awards—Dave Parker, Don Baylor, Andre Dawson, Ken Boyer.  Mike Cameron, Vernon Wells, Andruw Jones. . . .good players.

Others of note. . . Ted Williams is in group OPD, a group of 21 players which also includes Babe Ruth, Chipper Jones, Hack Wilson, Manny Ramirez and Vladimir Guerrero, but also includes some lesser lights like Rico Carty, Roy Cullenbine and Bernie Carbo.

Stan Musial is in group OPF, a group of 34 which also includes Mel Ott, Jeff Bagwell, Edgar Martinez and Frank Thomas (the Big Hurt), but also includes more just-pretty-good players (Hal McRae, Wally Moon, Dan Driessen.)  Tony Oliva is in there, which feels right; Oliva should be in the Musial club. 

Mickey Mantle is in group OPS, which is a large group (53 players) but which also includes Joe DiMaggio, Jimmie Foxx and Lou Gehrig, as well as a dozen or more lesser Hall of Famers.  I think this group contains more Hall of Famers than any other.

Honus Wagner is in group ODS, a group of 17 players which also includes Tony Lazzeri and Charlie Gehringer.

Ty Cobb is in group OSF, a group of 39 players mostly made up of Deadball Era players—Eddie Collins, Shoeless Joe Jackson, the original Billy Hamilton, Dom DiMaggio, George Sisler, Enos Slaughter.  Fairly recent players from this group include Don Buford, Shannon Stewart, Greg Gross and Steve Braun.  Tris Speaker is in the slightly different group OFS, which also includes Richie Ashburn, Wee Willie Keeler, and Paul Waner.  Also includes, for some reason, Bobby Abreu and Minnie Minoso. 

Yogi Berra is in group DPO, a group of 29 mostly made up of catchers; the only two Hall of Famers are Yogi and Piazza.  Darrell Porter, Darren Daulton, Gus Triandos, Ed Bailey, Ernie Whitt.  The only notable non-catchers are Vern Stephens and Nomar Garciaparra.

Johnny Bench is in group DFP, which also includes Gary Carter, Carlton Fisk, and Ivan Rodriguez.  It’s a group of 73 with no other Hall of Famers (other than those three), a lot of other notable catchers (Bill Freehan, Elston Howard, Lance Parrish, Tony Pena), but this group also for some reason includes a lot of guys who were just taking up space, like Bob Aspromonte, Don Wert, Ray Knight and Jerry Adair.

Roy Campanella is in group FDP, similar but with fielding moved up to the top level; that group also includes Brooks Robinson, Bill Mazeroski and Cal Ripken.  Mostly glove men like Clete Boyer, Bobby Knoop and Leo Cardenas.

Mike Schmidt is in group PFO, which also includes Reggie Jackson and Billy Williams.  Only 3 Hall of Famers among 22, not that 3 out of 22 isn’t pretty good.

Ozzie Smith is in group FDS, 60 players, which also includes Mark Belanger, Omar Vizquel, Larry Bowa, Roy McMillan, Rabbit Maranville, Alan Trammell, Don Kessinger and Dal Maxvill, but also includes some other type guys like Robin Yount and Barry Larkin. 

Alex Rodriguez is in group POD, a group of 25 players which includes no Hall of Famers and no other really big stars. 

Derek (Dirty Rotten) Jeter is in group FDO, 13 players including three other Hall of Famers—Nellie Fox, Johnny Evers and Dave Bancroft.  Also includes Buddy Bell, for some reason.

Albert Pujols, Joey Votto, Mike Trout and other active players were not included in the study. 


OK, so what exactly do we learn from doing this, and what is the potential value of it?   We learn that this does appear to be a viable approach to placing players in meaningful groups. We learn that, at the present time, it clearly is not EXACTLY what we need. 

It seems to me that probably the next iteration of a method like this would strip down to TWO "focus skills", instead of three, and then replace the third element with something else.  I learn from doing this that there are a lot of players, and many very good players, who actually have only two skills, and then once you get past those two skills, you’re just throwing a dart at near-zero numbers. 

I discussed this earlier, in regard to David Ortiz, but another form of the same problem is that Joe DeMaestri winds up in the same group of players as Johnny Bench.  Why?  Because DeMaestri, a 1950s shortstop, has only two "markers".   Both Bench and DeMaestri were very good defensive players at a key defensive position—catcher (Bench) and shortstop.  But after you get past defensive position—it is valuable to be able to play shortstop—and defensive quality, then DeMaestri has NOTHING to sell.  He is not fast, has no power and didn’t get on base.  The system chooses "Power" because he would hit 6 to 9 homers every year, and that was better relative to the game than anything else he did.  But we both know that DeMaestri belongs in a group with Belanger, not in a group with Johnny Bench.  So. . .get rid of the third descriptive category, and put something else there.

Early in my career, I had an impact on baseball not really because of the work that I did, but because of the work that others did in following through on my ideas.  This includes my "work" with the Red Sox.  I didn’t really do that much work with the Red Sox, but the Red Sox succeeded in some very small measure because they were following through on ideas that I had developed.

I hope that people will see this for what it is, and not misrepresent it as a complaint.  I’m just stating the facts here.  I have had a wonderful career, immense good fortune, and I have nothing I can complain about.  But getting back to the start of this, I had an impact on baseball not really because of my work, but because of the work that others did in following through on my ideas. 

I still do that.  I still spin off ideas at the same rate I always did that others COULD develop, and those ideas have as much potential impact on the game as the ideas that I had 40 years ago.  But my audience then was young people, and young people had the time and energy to pick up things that I had done and run with them.  My audience now is older people.  That’s just the way it is; it’s a normal thing. 

But this idea, if someone were to follow through on it, has the potential to have an impact on the game.  Let us say that we were to refine this concept in the ways that it needs to be refined, and we were to have groups of players based on their two central skills and the player’s level of success.   Then let us suppose that some scouting director were to assign his scouts to answer a couple of simple questions about each prospect:



If this player becomes a successful major league player, what are the TWO things—two and only two—which will make him a valuable player:

1)    His Power,

2)    His Speed,

3)    His ability to get on base,

4)    His ability to play a demanding defensive position, or

5)    The fact that he plays his position—whatever it is—well.


Put checkmarks beside two of those five things, and then answer this question: What is the probability that this player will have a successful major league career:






Then the Scouting Director reduces that to a code.  Let us say that the code is PS40—Power, Speed, 40%.  Then the Scouting Director checks the value of players drafted within that group.   Let’s see, PS40; that’s Corey Patterson, Brandon Moss, Jacque Jones, David Dellucci, and there are some good players in there, too, but the average value of that code is 27 Win Shares.

But suppose the code is FP80—good fielder, power, 80%.  That’s the Willie Mays-Scott Rolen-Carlos Beltran code.  The average value of THAT code is 184 Win Shares.  We need all of those guys that we can get.

Of course the coding does not dictate the thinking of the scouting director.  It helps to organize the thinking of the scouting director; that’s all.  But there is nothing impractical about this.  It can very well be done.  There is no evidence, at this point of my career, that anyone is interested in following through on my ideas, but. . .these things can still be done. 
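The bookkeeping a scouting director would need for this is small. Here is a rough sketch, assuming the scout's two checkmarks arrive as skill letters and the probability as a whole-number percentage; the function names are invented, and the lookup table holds only the two "let us say" averages used above, not real draft results.

```python
def scouting_code(skill_a, skill_b, success_percent):
    """Build a code such as 'PS40' from two skill letters and a percentage."""
    return f"{skill_a}{skill_b}{success_percent}"


# Average career Win Shares per code. These are the two illustrative
# figures from the text, not measured values.
AVERAGE_WIN_SHARES = {"PS40": 27, "FP80": 184}


def expected_value(code):
    """Look up the average for a code, or None if it has not been tabulated."""
    return AVERAGE_WIN_SHARES.get(code)


prospect_code = scouting_code("P", "S", 40)
print(prospect_code, expected_value(prospect_code))
print(expected_value(scouting_code("F", "P", 80)))
```

A real table would be filled in from the careers of players previously drafted under each code, which is the research that has not been done yet.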

Or, another way to follow through on it. . .suppose that you refine this system and make it work a little better than it does now, and then you set out to determine the value of each individual code.  You might find, for example that the average value of a player within this study was 125 career Win Shares; I would guess that is about what it is.  We have three letter codes.  5 * 5 * 5 is 125. 

Suppose that the value of each individual letter in the code is 3, 4, 5, 6, or 7.   Fielding skill, let us say, is the 7-point skill, the ability to get on base the 6-point skill, power the five-point skill, playing a key defensive position the 4 point skill, and speed, the 3-point skill.   Then you have two prospects, one coded FP-H (Fielding, Power—High Probability of Success); his value would be 7 * 5 * 7, or 245.  Another would be SD-L (Speed, Defensive Position, Low).  His value would be 3 * 4 * 3, or 36.
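That multiplication is easy to sketch. The skill points below are the ones proposed in the paragraph above; treating "High" as 7 points and "Low" as 3 is an assumption read off the arithmetic in the two examples, and the function name is made up. With fielding at 7 and power at 5, FP-H multiplies out to 7 * 5 * 7 = 245; a product of 7 * 6 * 7 = 294 would belong to a fielding and on-base prospect (FO-H).

```python
# Points per skill, as proposed in the text.
SKILL_POINTS = {"F": 7, "O": 6, "P": 5, "D": 4, "S": 3}

# Assumed from the worked examples: High probability = 7, Low = 3.
PROBABILITY_POINTS = {"H": 7, "L": 3}


def code_value(code):
    """Value of a code like 'FP-H': skill points times probability points."""
    skills, level = code.split("-")
    value = PROBABILITY_POINTS[level]
    for letter in skills:
        value *= SKILL_POINTS[letter]
    return value


print(code_value("FP-H"))  # 7 * 5 * 7 = 245
print(code_value("FO-H"))  # 7 * 6 * 7 = 294
print(code_value("SD-L"))  # 3 * 4 * 3 = 36
```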

It’s very practical to build systems like this; it’s just research.  But at the moment, nobody but me seems interested in this line of research.  Everybody’s into Spin Rates and Exit Velocities, which is fine, but there is more potential value in this line of research than there is in that.  It’s just that I abandoned my audience for that kind of stuff when I went to work for the Red Sox, and so far at least I have failed to re-build it. 




The End of the Article

I know that what many of you are thinking, probably what most of you are thinking, is that what this system needs is more codes, rather than just three.  If you take that "sluggers group", the POF group that includes David Ortiz, Willie McCovey, Jim Thome, Willie Stargell, Eddie Mathews, Harmon Killebrew, Greg Luzinski and Andre Thornton.  If you take that group, and you break them down by, let us say:

1)     Whether they were left-handed batters or right-handed or switchy guys,

2)     Their specific defensive position,

3)     Their era, let us say three different eras, 1876-1919, 1920-1970, and 1971 to the present,

4)     The length of their career, two groups, 1000-1600 games and more than 1600,

5)     The QUALITY of their play, so that Andre Thornton is not in a group with Willie Stargell.  

You can do that, certainly, and all of those distinctions are very valid.  But in the system we have, there are 60 "code groups".  If you just break them down into left-handed, right-handed and both ways batters, such as R-POF, L-POF and B-POF. . .if you just do that, that increases the number of code groups from 60 to 180, which decreases the average number of players in a group from 25 to 8.  If you break them down into three levels of quality, then you have 540 codes, which means you have 2 or 3 players in each group.  If you do ALL of the things listed above, you would have 29,160 "groups" of players, which would mean that the average number of players in a group would be .05.  You might, once in a blue moon, get a group of 3 players with identical codes, but that would be about the limit.
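The arithmetic behind those group counts, as a quick check. The one assumption is that "specific defensive position" means nine positions, which is the number that makes the total come out to 29,160; the variable names are made up for the sketch.

```python
PLAYERS = 1486
BASE_CODES = 5 * 4 * 3  # ordered choices of 3 skills out of 5 = 60


def average_group_size(n_groups):
    """Average number of players per group if the players spread evenly."""
    return PLAYERS / n_groups


by_hand = BASE_CODES * 3             # left, right, both: 180
by_hand_and_quality = by_hand * 3    # three quality levels: 540

# All five splits: handedness (3), position (9, assumed), era (3),
# career length (2), quality (3).
all_splits = BASE_CODES * 3 * 9 * 3 * 2 * 3

for label, n in [
    ("three-letter codes", BASE_CODES),
    ("plus batting hand", by_hand),
    ("plus quality level", by_hand_and_quality),
    ("all five splits", all_splits),
]:
    print(f"{label}: {n} groups, {average_group_size(n):.2f} players per group")
```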

So that doesn’t work; that’s useless.  Everybody has their own code, their own fingerprint, their own DNA, which is neat but doesn’t have anything to do with putting them in groups. 

What you have to do, to actually make it work, is to get as much information as possible into as short a code as possible.  That was the germ of this idea; that is what I was trying to do here.  It didn’t exactly work, but it didn’t exactly fail, so if I live long enough, I’ll circle back to the idea in another few years.   Thank you for reading.












COMMENTS (19 Comments, most recent shown first)

IS there a Piece Of Stuff group? I might have missed it scanning through the article, but I would imagine that having Power, OBP, and Speed would be a good group. Who's in it?
4:18 PM Apr 9th
I guess you really wouldn't want to be a Power On-Base Speed guy.
3:29 PM Apr 9th
Wait, Derek Jeter is in the FDO group? We're talking Derek Jeter, almost assuredly the worst fielder to play 10k innings at shortstop, right?
9:25 AM Apr 9th
I loved this!

1. I wonder if the data in these codes aren't recalling an earlier Bill James speculation: that young players with old-player skills seldom age well. I think you originally made this observation about Tom Brunansky, who is (fascinatingly) one of the rare PFDs.

2. It might be useful to have a cutoff to qualify for the third letter. Your comments on David Ortiz make sense: he had two skills, really. So, he’s a member of PO. I bet there are other POs. Maybe there are Ps (Dave Kingman?).

3. DPO is made up mostly of catchers. I'm a little surprised that it doesn't have some modern 2B (Sandberg? Whitaker?) or SS. Is this because, if you're a regular shortstop today, your F is going to push either P or O out of the picture?

4. Cal Ripken was FDP. I'd feel happier with the system if I could point to what Ripken could have done, or not done, to be a DFP, or FDO. Like, is a FDO “Cal Ripken with 100 fewer HR?” (This is not a problem with the system, but my understanding of it.)

5. Similarly, Alex Rodriguez is POD, but it's not entirely clear to me what it would have taken for him to be a DPO. Would he have been a DPO if he'd pushed Jeter to 3B? DFP?

6. OSF is weird. Richie Ashburn, Bobby Abreu, and Minnie Minoso? That’s a weird grouping. I suspect that this anomaly is about speed, which is not useless but which is....different. Power, the ability to get on base, position: lots of players keep these, more or less, throughout their career. But no one keeps speed, not in the modern era. I suspect that’s a confounding factor here.

7. Might it make sense to either (a) split career codes, or (b) base codes on an arbitrary age cutoff, like "through age 27 season" or "first five years in the majors?" That would resolve players who had distinct phases, like Ernie Banks or Hanley Ramirez.
10:57 PM Apr 6th
What's Rickey Henderson/Tim Raines?


10:47 PM Apr 6th
I can't begin to explain how much I enjoyed this article. I love the idea of developing a new way to group players, then looking for similarities and patterns within the groups.
It makes intuitive sense to me that Fielding-centred ("skill") players would be more effective ballplayers than Speed- or Power-centred ("athletic ability") players. Adam Carolla often says that in boxing and MMA "slick" beats athletic ability - this would be similar.
2:14 PM Apr 6th
Interesting concept. I can see where determining 3rd best skill is a crap-shoot. Did you try just using one code and see how the players group? Also, are the five tools really of equal value in generating wins?
1:42 PM Apr 6th
Hmm, what if you simply made a letter code for those players with no third discernible skill, like N (for No third skill). Then the Shortstops that couldn't do anything but field or the DHs that could get on base and hit for power would have the codes DFN and PON respectively and be taken out of the groups that they don't really belong in
12:15 PM Apr 6th
Would you say this study is possibly evidence that the ability to field a position well and ability to get on base are overlooked by scouts compared to speed?

I am wondering if generally the players above the 25th percentile had power potential and the further the players are above that level the better they fully developed that potential. What I'm getting at is - are some of these skills far more inherent in a player (speed obviously) while others have more room to be developed (power and fielding?)?

Thus from a scouting perspective, perhaps, the dedication to develop skills and stay healthy are as important as having these skills in the first place?
11:59 AM Apr 6th
My first thought was that the reason some players seem misplaced is because you measured their skills by rank rather than a rate. I would guess some skills such as speed and power have a steeper slope of ability levels than others. Thus, it would eliminate the problem of having a Joe DeMaestri in the same group as Johnny Bench.

However, I see that would create another problem of keeping the study to a reasonable and well defined set of groups.

Basing skill level on ranking then, was having a strong third skill a large factor in separating the really good players from the merely decent or was it far more about having very strong top two skills? Or...?
11:09 AM Apr 6th
This is a good idea. My question would be, how do you determine which skill set is the strongest for each player? What data are you using to determine player X’s strongest attribute or player Y’s strongest attribute?
8:39 AM Apr 6th
Steven Goldleaf
Huh. Don't know how I missed that--it was right there in your lede. Maybe because it's a long article, and I'm old?
6:13 AM Apr 6th
Is Derek Jeter's best skill really fielding excellence? There has to be something wrong there.

It was very wrong of you to mention it, yes.
12:28 AM Apr 6th
Steven Goldleaf
Pretty sure Mays' original 5-tool rating included "throwing arm" as one of the five,

Yes, Steven. As is explicitly stated in this article.
12:28 AM Apr 6th
Steven Goldleaf
Pretty sure Mays' original 5-tool rating included "throwing arm" as one of the five,
10:10 PM Apr 5th
Is Derek Jeter's best skill really fielding excellence? There has to be something wrong there.
9:09 PM Apr 5th
You caught me! I'm a Michigander who was thinking "was speed really Tom Brookens' best skill? More so than the fact that he was a third baseman?"
8:37 PM Apr 5th
Thanks Bill for sharing this research and the look at your process.
5:46 PM Apr 5th
It seems like these rankings could be used to make similarity scores instead of groupings...
4:33 PM Apr 5th