Occasionally you may set out to go to Minneapolis, find yourself diverted to Milwaukee, and then discover that M’waukee is quite a fine city in which to spend a weekend. So it was with this study. It did not go where it was intended to go, but I thought it was interesting nonetheless.
I began here with the observation, two or three weeks ago, that we can learn something from the effectiveness of closers. We tend to assume that the players playing each position are making equal contributions to victory—that first basemen are better hitters than shortstops, that shortstops are better fielders than first basemen, but that overall, they balance. But obviously, one cannot argue that set-up men or lefty specialists are as good, overall, as closers. Thus, what we assume to be true about defensive positions is clearly not true about bullpen positions, and this caused me to question whether it was necessarily true about defensive positions, either.
I thought perhaps I could work this out by some sort of card game, in which one must allocate resources to maximize offensive contribution while minimizing defensive exposure. I haven’t been able to get that to work, but then I thought maybe I could work out the problem with a statistical model, and so I constructed a model on a spreadsheet.
The model is this. Suppose that we start out with a very large number of “players” playing little league baseball, each of whom has a known offensive and defensive ability. As these players move up to higher levels the numbers of them are constantly winnowed down, with the weakest players being systematically eliminated, and the others finding positions. At the end of that sorting-out process, would the major league players playing shortstop be of the same quality as the players playing first base, of better quality, or of lesser quality?
That effort failed because I was unable to find a way to simulate, in a spreadsheet, the process of sorting out defensive positions on a team. Nonetheless, the modeling process was interesting, and I think it may provide some insight into other issues. Let me first describe the model that I created, and then we’ll discuss the other issues.
I want to warn you that this is not gripping stuff. I am trying to create a model to think through a set of problems; this is work that has to be done. It’s not exciting, and there will be no conclusions forthcoming for some time. In my current model there are 9 levels of competition:
Little League 1 (Little League Baseball for players aged up to 12)
Little League 2 (Little League Baseball for players aged 13-15)
High School Baseball
Legion Baseball
College Baseball
A Ball
AA Ball
AAA Ball
Major League Baseball
In my part of the country there are American Legion teams or Ban Johnson teams that are kind of “super high school” teams, good high school players and players who have been out of high school a year or two who form summer teams and play 60 to 80 games during the summer. These teams play a huge role in who gets the opportunity to play college baseball.
In my model we start with 61,440 beginning little league players on 7,680 little league teams. At each level, half of these players are forced out of the system, and the other half advance.
7,680 beginning little league teams, 61,440 beginning little league players, which become
3,840 older little league teams, 30,720 older little league players, which become
1,920 high school teams, 15,360 high school players, which become
960 legion teams, 7,680 legion players, which become
480 college teams, 3,840 college players, which become
240 A ball teams, 1,920 A ball players, which become
120 AA teams, 960 AA players, which become
60 AAA teams, 480 AAA players, which become
30 major league teams, 240 major league players
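The funnel above is simple halving, and it can be checked in a few lines. This is only a sketch in Python rather than the spreadsheet the model was built in; the names `LEVELS`, `PLAYERS_PER_TEAM` and `funnel` are made up for illustration.

```python
# Level names follow the nine levels listed in the article.
LEVELS = [
    "Little League 1", "Little League 2", "High School", "Legion",
    "College", "A", "AA", "AAA", "Majors",
]
PLAYERS_PER_TEAM = 8          # eight position players per team
STARTING_PLAYERS = 61_440     # beginning little league players


def funnel(starting_players=STARTING_PLAYERS, levels=LEVELS):
    """Return (level, teams, players) rows, halving the pool at each step."""
    rows = []
    players = starting_players
    for level in levels:
        rows.append((level, players // PLAYERS_PER_TEAM, players))
        players //= 2  # half the players are forced out before the next level
    return rows


for level, teams, players in funnel():
    print(f"{level:16s} {teams:6,d} teams {players:7,d} players")
```

Eight halvings take 61,440 players down to 240, which is why the starting number was chosen as it was.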
When I first attempted the model, I assigned each player a “skill level” based on two random numbers, one representing his offensive skill, and one representing his defensive skill. I then augmented these basic skill levels by “development levels” as the players moved upward.
There was an obvious problem with my first effort, however, which was that many of the most talented players were forced out of the system even before they reached the level of high school baseball. That doesn’t seem realistic.
I then repaired the system so that:
Each player was assigned a “Talent Base” which was the average of:
A random number multiplied by five, representing his offensive ability,
and
A random number multiplied by five, representing his defensive ability.
At each level of competition the player has a certain amount of growth as an offensive player—which is a random number, 0 to 1—and a certain amount of growth as a defensive player, which is also a random number, 0 to 1.
Let us take for example the case of player 35683, who we will call “Jack Raymond”. Jack Raymond was initially assigned a talent base of 4.43:
The random number .9949 represents his offensive talent, and the random number .7782 represents his defensive talent. .9949 times five is 4.9746, and .7782 times five is 3.8911 (the random numbers are shown here rounded to four places). Adding those together gives 8.8657; dividing by two gives a basic talent level of 4.43.
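Here is the same arithmetic as a small Python sketch. It assumes, as the worked example does, that the two scaled numbers are averaged; `talent_base` is a hypothetical helper name, not something from the spreadsheet.

```python
def talent_base(offense_draw, defense_draw, scale=5.0):
    """Scale two 0-to-1 draws by five and average them into a talent base."""
    offense = offense_draw * scale
    defense = defense_draw * scale
    return offense, defense, (offense + defense) / 2


# Jack Raymond's two draws, as quoted (rounded) in the text.
offense, defense, base = talent_base(0.9949, 0.7782)
print(f"talent base: {base:.2f}")
```

Because both draws run from 0 to 1, the talent base runs from 0 to 5 and averages about 2.5 across the whole starting population.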
Among the 61,440 “players” at the beginning of the model, Jack Raymond ranked 1,610th in native talent. He was not one of the 100 most talented players in the group; he was not close to that. He was, however, in the top 3% of all players in terms of raw talent.
Jack Raymond was assigned to beginner team 4,461. As a beginning little leaguer, Jack Raymond made average progress as a hitter (.5462), but very little progress as a fielder (.2044). His skill level exiting the younger little leagues, then, is 4.93:
4.97 + .55 = 5.52 as a hitter
3.89 + .20 = 4.10 as a fielder (rounding discrepancy)
We combine these by the formula (10 * offense + 7 * defense) / 17, which yields 4.93.
One of the things I built into the system was an assumption that, as players move up, offense becomes more important. I think there is no doubt that this is true. At the lowest levels of competition, fielding is terribly important, because outs are so hard to come by. As you move up it becomes ever more important that you hit.
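The ratios quoted in this article run 10-7 at the beginner level, then 12-7, 14-7 and so on up to 26-7 in the majors. Reading that as "offense weight starts at 10 and rises by 2 per level, defense weight stays at 7" is an inference from those quoted ratios, and the function names below are invented for this sketch.

```python
DEFENSE_WEIGHT = 7


def offense_weight(level_index):
    """10 at the beginner level (index 0), rising by 2 per level to 26 (index 8)."""
    return 10 + 2 * level_index


def overall(hitting, fielding, level_index):
    """Weighted average of hitting and fielding levels at a given level."""
    w = offense_weight(level_index)
    return (w * hitting + DEFENSE_WEIGHT * fielding) / (w + DEFENSE_WEIGHT)


# Jack Raymond leaving the beginner leagues: 5.52 as a hitter, 4.10 as a fielder.
print(f"{overall(5.5208, 4.0955, 0):.2f}")
```

Under this reading, fielding counts for 7/17 (about 41%) of a beginner's rating but only 7/33 (about 21%) of a major leaguer's.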
Anyway, Jack Raymond did not make great progress as a young little leaguer, but given his native talent, he had no difficulty in moving on to the next level of competition. When beginning team 4461 was combined with beginner team 4462 there were 16 players competing for 8 spots on the “older little league” (LL-2) team 2231. Jack Raymond was the second-best player in the group, behind player 35691, whom we will call Donny Donovan. Raymond thus easily made the team. Stated another way, Raymond was 64% better than an average beginning little leaguer, who had an average score of 3.005, or plus 1.92 (4.93 minus 3.01).
As an older little league player Raymond made good progress as a hitter (.6633) and excellent progress as a fielder (.9442). This brought Raymond’s overall value as an older little leaguer to
6.18 as a hitter (5.52 + .66), and
5.04 as a fielder (4.10 + .94).
These we combine in a 12-7 ratio (12 parts as a hitter, 7 as a fielder). This makes 5.76. Thus, at the higher little league level, Jack Raymond was 33% better than an average player (the average being 4.34), or +1.42.
In the next stage of our model, little league teams 2231 and 2232 were combined into high school team 1116. Jack Raymond was again the second-best player among the 16 position players trying out for the high school team, once more behind Donny Donovan. He thus made his high school team.
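The promotion step can be sketched as two small functions. The numbering rule (teams 2k-1 and 2k merge into team k) is inferred from the team numbers quoted in the text, such as 4461 and 4462 becoming 2231; `next_team` and `select_roster` are hypothetical names, and the ratings in the usage example are invented.

```python
def next_team(team_number):
    """Teams 2k-1 and 2k merge into team k at the next level."""
    return (team_number + 1) // 2


def select_roster(candidates, spots=8):
    """Keep the `spots` highest-rated candidates from a {name: rating} dict."""
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:spots]]


# Follow Jack Raymond's team number from the beginner leagues upward.
path = [4461]
for _ in range(8):
    path.append(next_team(path[-1]))
print(path)

# Sixteen made-up candidates competing for eight spots.
tryout = {f"player {i}": 4.0 + 0.1 * i for i in range(16)}
print(select_roster(tryout))
```

The selection is purely by overall rating, with no regard to position, which is the simplification that kept the model from answering the original question about defensive positions.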
As a high school player Raymond made limited progress as a hitter (.4191) and just average progress as a fielder (.5057), giving him a new skill level of 6.25:
6.60 as a hitter (6.18 + .42), and
5.55 as a fielder (5.04 + .51).
These are combined in a ratio of 14-7 (or 2-1), 14 parts hitter, 7 parts fielder. It works out to 6.25.
The average high school player has a skill level of 5.39, so Raymond is still 16% better than average, or +0.86. Among the 15,360 high school players in the country, Raymond ranks 956th. He is in the top 7%.
High school teams 1115 and 1116 are combined into legion team 558. Again, there are 16 players competing for 8 positions. Jack Raymond is now the third-best player on the team, behind Donny Donovan at 6.29 and player 35702, who we will call Big Randy Rogers, who ranks at 6.37.
As a legion player, however, something kicks in for Jack Raymond. He comes up with big numbers both for his progress as a hitter (.8138) and for his progress as a fielder (.9868). These progress numbers put Raymond far ahead of both Donny Donovan and Big Randy Rogers at the end of his legion ball career. His new value computes to 7.15:
7.42 as a hitter (6.60 + .81), and
6.54 as a fielder (5.55 + .99)
And these are combined in a 16-7 ratio, 16 parts hitter. The average legion player is at 6.31, so Raymond is 13% better than the average legion player, or +.85. Raymond is now in 299th place among 7,680 players who are still in the system.
Legion teams 557 and 558, we combine into college team 279. When these two teams are combined, Jack Raymond is by far the best player on his college team, at the start of his college career.
Raymond makes little progress as a college hitter (.2765), but outstanding progress as a college fielder (.9242). As long as the average of the two is greater than .50, he’s still doing fine, and here the average is about .60. This gives him new skill levels of:
7.69 as a hitter (7.41 + .28), and
7.46 as a fielder (6.54 + .92).
Raymond’s fielding, over the years, has nearly caught up with his hitting. We combine these two figures in an 18-7 ratio, and this gives Jack Raymond a new skill level of 7.63. The average skill level for a college player is 7.16, so Raymond is 7% better than an average college player, and is +.47.
In our model—obviously this is imprecise—half the college players are able to move on to pro baseball, starting at the A level. In real life baseball is more complicated. High school players sometimes go directly into the pros, sometimes go to junior colleges and then to the pros, and sometimes come out of college and start in the Florida State League—and the progression is not linear. Instead, one player from Texas A & M is put on an A ball team with one player from Maryland, one player from USC, one player from Florida International, one player from Boston College, and four guys from Venezuela, while other players from Texas A & M, Maryland, USC and Florida International move on to different organizations. It would be extremely difficult to model all of this complexity in a spreadsheet. In our model, half of college players move on to A ball, and half drop by the way. College teams 279 and 280 are combined into A ball team 140.
Arriving in A ball, Jack Raymond is the third-best player on the team, behind two players we have not discussed before and will not bother to assign names.
Representing his progress as a hitter at A ball, Raymond comes up with a big, big number (.9829) but a poor number as a fielder (.2057). But again, the average of the two “improvement numbers” is still greater than .500, so he’s still doing OK. If he doesn’t continue to make progress he’s going to fail to move up, but as long as his improvement numbers are better than .500 on average, he’s going to be OK. He now comes out at 8.41—
8.68 as a hitter (7.69 +.98), and
7.66 as a fielder (7.46 + .21).
Combined in a 20-7 ratio, making 8.41. The average A ball player is at 7.98, so Raymond is +.43, or 5% above average.
A ball teams 139 and 140 are combined into AA team 70. When these two teams are combined, Jack Raymond is the fourth-best player on his AA team. Now he’s got to make some progress. If he comes up with bad progress numbers here, he’s not going to make Triple-A.
But he doesn’t. He doesn’t come up with bad progress numbers. He comes up with .7861 for offense, and .7515 for fielding. This makes his new value 9.21:
9.46 as a hitter (8.68 + .79), and
8.41 as a fielder (7.66 + .75).
Combined in a 22-7 ratio. The average player at Double-A is at 8.77, so Raymond is still 5% above average, and is now +.44.
Double-A teams 69 and 70 are combined into Triple-A team 35. There are 16 players competing to move up. Jack Raymond is now the third-best of those 16 players.
Of course, in real life there are as many Triple-A as Double-A teams. Still, only about half of Double-A players DO move up to Triple-A, and the reason for this is the time frame. Players spend a long time in Triple-A. You spend one year in Double-A if you’re lucky, two years if you need them, three years if you’re struggling but people still believe in you. But you can hang out in Triple-A for years. Vacancies don’t open up at AAA as rapidly as younger kids come up from A ball, so it does remain a funnel. A lot of careers dead-end at Double-A.
Anyway, Jack Raymond has now made it to Triple-A. At Triple-A, he must continue to make progress. Again, he makes little progress as a Triple-A hitter (.2724), but makes wonderful progress as a fielder (.9972). Again, the average of the two is over .500. This is the fifth consecutive level at which Raymond has made above-average progress, thus keeping him in the top half of players at his level, even though the quality of competition keeps improving. The last level at which he did not make above-average progress was high school. Raymond’s performance level is now 9.66:
9.73 as a hitter (9.46 + .27), and
9.41 as a fielder (8.41 + 1.00).
Combined in a 24-7 ratio. The average AAA player is 9.56. Raymond, at 9.66, is 1% above average.
Triple-A teams 35 and 36 are combined in our model into major league team 18. There are 16 candidates for major league positions. There are 8 jobs. Jack Raymond is the 7th best player. He makes the majors. Barely, but he makes it.
In the majors, also, Jack Raymond has “progress” numbers, and once more he comes up with very good ones: .6519 for his progress as a hitter, .7371 for his progress as a fielder. It pushes him to 10.34:
10.39 as a hitter (9.73 + .65), and
10.15 as a fielder (9.41 + .74).
Combined in a 26-7 ratio, that makes 10.336. Raymond winds up as the most-average major league player, among the 240; the major league average is 10.334. That’s how I decided to choose him to illustrate how the system works.
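Jack Raymond's whole career can be replayed from the numbers quoted above. The sketch below is Python, not the original spreadsheet; it assumes the offense weight rises by 2 per level (as the quoted ratios suggest) and uses the rounded random numbers from the text, so the results agree with the article only to about two decimal places. `career`, `PROGRESS` and `LEVEL_AVERAGES` are names invented here.

```python
DEFENSE_WEIGHT = 7
LEVELS = ["LL-1", "LL-2", "High School", "Legion", "College",
          "A", "AA", "AAA", "Majors"]
# (hitting progress, fielding progress) at each level, as quoted in the text.
PROGRESS = [
    (0.5462, 0.2044), (0.6633, 0.9442), (0.4191, 0.5057),
    (0.8138, 0.9868), (0.2765, 0.9242), (0.9829, 0.2057),
    (0.7861, 0.7515), (0.2724, 0.9972), (0.6519, 0.7371),
]
# Average skill level at each level, as quoted in the text.
LEVEL_AVERAGES = [3.005, 4.34, 5.39, 6.31, 7.16, 7.98, 8.77, 9.56, 10.334]


def career(offense_draw, defense_draw, progress):
    """Return (hitting, fielding, overall) after each level of competition."""
    hit, fld = offense_draw * 5, defense_draw * 5
    rows = []
    for level, (hit_gain, fld_gain) in enumerate(progress):
        hit += hit_gain
        fld += fld_gain
        w = 10 + 2 * level  # assumed offense weight: 10, 12, ..., 26
        rows.append((hit, fld, (w * hit + DEFENSE_WEIGHT * fld) / (w + DEFENSE_WEIGHT)))
    return rows


raymond = career(0.9949, 0.7782, PROGRESS)
for name, avg, (hit, fld, overall) in zip(LEVELS, LEVEL_AVERAGES, raymond):
    print(f"{name:12s} hit {hit:5.2f}  field {fld:5.2f}  overall {overall:5.2f}  "
          f"vs average {overall - avg:+.2f} ({overall / avg - 1:+.0%})")
```

Laid out this way, the shrinking margin is easy to see: Raymond gains nearly the same amount at every level, but the average he is measured against climbs faster than he does.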
Let’s quickly summarize the progress of a couple of other players. Carl Thornton:
Base Talent, Offense (.979 X 5) | 4.895
Base Talent, Fielding (.621 X 5) | 3.105
Total | 8.000
Natural Talent Level | 4.000
LL-1 Hitting Progress | 0.070
LL-1 Fielding Progress | 0.808
LL-1 Hitting Level | 4.965
LL-1 Fielding Level | 3.914
Overall Level exiting LL-1 (10-7) | 4.532
Average | 3.005
Rank on Trying out for LL-2 | 2 of 16 (makes team)
LL-2 Hitting Progress | 0.822
LL-2 Fielding Progress | 0.597
LL-2 Hitting Level | 5.787
LL-2 Fielding Level | 4.511
Overall Level exiting LL-2 (12-7) | 5.317
Average | 4.345
Rank on Trying out for High School | 2 of 16 (makes team)
High School Hitting Progress | 0.560
High School Fielding Progress | 0.475
High School Hitting Level | 6.347
High School Fielding Level | 4.985
Overall Level exiting High School (14-7) | 5.894
Average | 5.394
Rank on Trying out for Legion Team | 4 of 16 (makes team)
Legion Hitting Progress | 0.626
Legion Fielding Progress | 0.657
Legion Hitting Level | 6.974
Legion Fielding Level | 5.642
Overall Level exiting legion ball (16-7) | 6.568
Average | 6.311
Rank on entering college | 5 of 16 (makes team)
College Hitting Progress | 0.471
College Fielding Progress | 0.368
College Hitting Level | 7.445
College Fielding Level | 6.010
Overall Level exiting college (18-7) | 7.043
Average | 7.158
Rank on entering A Ball | 9 of 16 (fails to make team)
Carl Thornton makes inadequate progress as a college player, and his career ends in college. He never enters pro ball.
Let’s call this player Bob Wagner:
Base Talent, Offense (.962 X 5) | 4.812
Base Talent, Fielding (.537 X 5) | 2.688
Total | 7.500
Natural Talent Level | 3.750
LL-1 Hitting Progress | 0.120
LL-1 Fielding Progress | 0.628
LL-1 Hitting Level | 4.932
LL-1 Fielding Level | 3.315
Overall Level exiting LL-1 (10-7) | 4.266
Average | 3.005
Rank on Trying out for LL-2 | 3 of 16 (makes team)
LL-2 Hitting Progress | 0.270
LL-2 Fielding Progress | 0.131
LL-2 Hitting Level | 5.203
LL-2 Fielding Level | 3.446
Overall Level exiting LL-2 (12-7) | 4.555
Average | 4.345
Rank on Trying out for High School | 7 of 16 (makes team)
High School Hitting Progress | 0.918
High School Fielding Progress | 0.407
High School Hitting Level | 6.121
High School Fielding Level | 3.853
Overall Level exiting High School (14-7) | 5.365
Average | 5.394
Rank on Trying out for Legion Team | 6 of 16 (makes team)
Legion Hitting Progress | 0.961
Legion Fielding Progress | 0.946
Legion Hitting Level | 7.082
Legion Fielding Level | 4.799
Overall Level exiting legion ball (16-7) | 6.387
Average | 6.311
Rank on entering college | 4 of 16 (makes team)
College Hitting Progress | 0.288
College Fielding Progress | 0.138
College Hitting Level | 7.370
College Fielding Level | 4.937
Overall Level exiting college (18-7) | 6.688
Average | 7.158
Rank on entering A Ball | 11 of 16 (fails to make team)
Bob Wagner, again, fails to make progress at college, and does not enter pro ball. The average skill level, in my system, is as follows:
Majors | 10.34
AAA | 9.56
AA | 8.77
A Ball | 7.98
College | 7.16
Legion | 6.31
High School | 5.39
Older Little Leagues | 4.34
Beginner Leagues | 3.01
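One rough way to read the averages above is to look at the gain from each level to the next and set aside the half point that an average player is expected to gain from development (the mean of a 0-to-1 progress number). The sketch below does that subtraction. Attributing the remainder to elimination of weaker players is only an approximation, since the changing offense weights also move the averages a little; `step_gains` and `EXPECTED_GROWTH` are names invented here.

```python
# Average skill levels, lowest level first, as quoted in the article.
AVERAGES = [
    ("Beginner Leagues", 3.01), ("Older Little Leagues", 4.34),
    ("High School", 5.39), ("Legion", 6.31), ("College", 7.16),
    ("A Ball", 7.98), ("AA", 8.77), ("AAA", 9.56), ("Majors", 10.34),
]
EXPECTED_GROWTH = 0.5  # mean of a uniform 0-to-1 progress draw


def step_gains(averages):
    """Return (level, total gain, gain beyond expected growth) for each step up."""
    rows = []
    for (_, lower), (name, upper) in zip(averages, averages[1:]):
        gain = upper - lower
        rows.append((name, gain, gain - EXPECTED_GROWTH))
    return rows


for name, gain, remainder in step_gains(AVERAGES):
    print(f"{name:22s} gain {gain:.2f}  beyond average growth {remainder:.2f}")
```

On this reading the remainder shrinks as players move up, which fits the idea that the early cuts remove far weaker players than the later ones do.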
Some of this improvement comes from players getting better, and some of it comes from the weaker players being systematically eliminated. Let’s compare the “league levels” above to the 240 top players at each level:
Level | Average | Top 240
Majors | 10.34 | 10.34
AAA | 9.56 | 9.85
AA | 8.77 | 9.25
A Ball | 7.98 | 8.61
College | 7.16 | 7.99
Legion | 6.31 | 7.39
High School | 5.39 | 6.75
Older Little Leagues | 4.34 | 6.13
Beginner Leagues | 3.01 | 5.51
Let’s add to that the average “raw talent level” of the players left in the system at each level. (The beginning performance level is essentially the average talent level plus one-half, the average amount of first-year progress):
Level | Average | Top 240 | Avg Talent
Majors | 10.34 | 10.34 | 4.31
AAA | 9.56 | 9.85 | 4.30
AA | 8.77 | 9.25 | 4.28
A Ball | 7.98 | 8.61 | 4.22
College | 7.16 | 7.99 | 4.12
Legion | 6.31 | 7.39 | 3.97
High School | 5.39 | 6.75 | 3.70
Older Little Leagues | 4.34 | 6.13 | 3.27
Beginner Leagues | 3.01 | 5.51 | 2.50
The lower-level eliminations are based almost entirely on talent. Once you get to professional baseball, everybody has talent, and the eliminations are based almost entirely on the failure to make progress.
I marked in this model the 100 most talented players in the system, the most gifted. Of the 100 most gifted players in the model:
All 100 were able to play in the older Little Leagues
All 100 made their High School teams
99 were able to make it into Legion ball
94 played college ball
73 entered pro baseball
51 made it to Double-A
30 made it to Triple-A
11 made it to the majors
And, of the 240 players who made it to the majors, all were in the top one-third in terms of basic talent. A player whose native talent was at the 40th percentile could, in theory, make it to the majors by making strong progress at every level, but it did not happen within the model.
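For readers who want to try the model, here is a compact Python sketch of the whole funnel as described above. It is not the original spreadsheet: it assumes teams are formed from consecutive blocks of eight players, that the offense weight rises by 2 per level, and that all random numbers are uniform from 0 to 1. Its random draws differ from the article's, so the counts it prints will not match the figures quoted here.

```python
import random

PLAYERS_PER_TEAM = 8
DEFENSE_WEIGHT = 7
LEVELS = ["LL-1", "LL-2", "High School", "Legion", "College",
          "A", "AA", "AAA", "Majors"]


def simulate(n_players=61_440, seed=1):
    """Run the funnel once; return (players, per-level summaries)."""
    rng = random.Random(seed)
    players = []
    for pid in range(n_players):
        hit, fld = rng.random() * 5, rng.random() * 5
        players.append({"id": pid, "hit": hit, "fld": fld,
                        "talent": (hit + fld) / 2})
    # Assumption: teams are consecutive blocks of eight players.
    teams = [players[i:i + PLAYERS_PER_TEAM]
             for i in range(0, n_players, PLAYERS_PER_TEAM)]
    summaries = []
    for level, name in enumerate(LEVELS):
        w = 10 + 2 * level  # assumed offense weight: 10, 12, ..., 26
        roster = [p for team in teams for p in team]
        for p in roster:
            p["hit"] += rng.random()  # growth as a hitter, 0 to 1
            p["fld"] += rng.random()  # growth as a fielder, 0 to 1
            p["overall"] = (w * p["hit"] + DEFENSE_WEIGHT * p["fld"]) / (w + DEFENSE_WEIGHT)
        summaries.append({
            "level": name,
            "players": len(roster),
            "avg_overall": sum(p["overall"] for p in roster) / len(roster),
            "avg_talent": sum(p["talent"] for p in roster) / len(roster),
            "ids": {p["id"] for p in roster},
        })
        if level < len(LEVELS) - 1:
            # Merge adjacent teams and keep the best 8 of the 16.
            teams = [
                sorted(teams[i] + teams[i + 1],
                       key=lambda p: p["overall"], reverse=True)[:PLAYERS_PER_TEAM]
                for i in range(0, len(teams), 2)
            ]
    return players, summaries


def top_talent_survivors(players, summaries, n=100):
    """Count how many of the n most talented players appear at each level."""
    ranked = sorted(players, key=lambda p: p["talent"], reverse=True)
    top = {p["id"] for p in ranked[:n]}
    return [(s["level"], len(top & s["ids"])) for s in summaries]


players, summaries = simulate()
for s, (_, left) in zip(summaries, top_talent_survivors(players, summaries)):
    print(f'{s["level"]:12s} {s["players"]:6d} players  avg {s["avg_overall"]:5.2f}  '
          f'talent {s["avg_talent"]:4.2f}  top-100 left: {left}')
```

The starting population has to be a multiple of 2,048 (eight players times two to the eighth power) so that teams pair off evenly at every level; 61,440 is 30 times that.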
And All This Means?
Well, here’s one thing it means. In the recent book Outliers, Malcolm Gladwell argues that we artificially limit the supply of talent in many areas by eliminating young people from the training regimen before they have the opportunity to improve.
My model suggests that this argument is probably true. Let’s take the 240 players who wind up with the major league jobs at the end of the day, and look at the “average progress numbers” at each level:
Level | Hitting | Fielding
Beginner League | .735 | .558
Little League | .704 | .557
High School | .705 | .564
Legion | .698 | .579
College | .714 | .589
A Ball | .682 | .545
AA | .684 | .549
AAA | .660 | .552
The progress numbers are higher for hitting than for fielding because hitting is more important toward staying in the game than is fielding. But the “progress numbers” are higher at the lower levels than they are at the higher levels. Why is that?
In order to reach the major leagues, you need to start with a high talent base and also make steady progress. The average player has a progress number, in any season, of .500. The players who make the majors are those who consistently exceed that number.
But suppose that one player has progress numbers, beginning at the lowest levels, that go .400, .500, .600, .700, .800, .900, average .650, while another player has progress numbers that go .900, .800, .700, .600, .500, .400.
The player who has low progress rates at the lowest levels is likely to be eliminated before the “good numbers” ever come up. The player who is supposed to go 400-500-600-700-800-900? In reality, he’s going to go 400-500, oops, you didn’t make the team; your career is over. The reason that the progress numbers are highest at the lowest levels for those players who make the majors is that the players who would make progress later, rather than earlier, are eliminated from the system before their big steps forward occur. Which was part of Gladwell’s thesis.
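The two hypothetical players can be compared by adding up their progress level by level. This short Python sketch only does that running sum; it says nothing about where the cut line falls at any level, which depends on the other players in the pool.

```python
from itertools import accumulate

late_bloomer = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
early_bloomer = list(reversed(late_bloomer))

# Running total of progress after each level.
late_total = list(accumulate(late_bloomer))
early_total = list(accumulate(early_bloomer))

for level, (late, early) in enumerate(zip(late_total, early_total), start=1):
    print(f"after level {level}: late {late:.1f}  early {early:.1f}  "
          f"gap {early - late:.1f}")
```

Both players average .650 and finish with the same total, but the early bloomer is ahead at every cut along the way, by as much as nine-tenths of a point in the middle levels.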
This relates to expansion. Many people believe—in my view mistakenly—that expansion permanently weakens the level of talent in the major leagues. There were 16 teams in 1960; now there are 30. Assuming that there are 25 players on each team, that means there were 400 major league players in 1960; now there are 750. Obviously players 401 to 750 are not as good as players 1 to 400, so obviously, the quality of play today cannot be as good as it was in 1960.
That conclusion is wrong. The quality of play in the major leagues today, and the quality of talent, is far, far better than it was in 1960. When baseball expands, this does weaken the quality of play in the majors. The quality of play in 1962 was lower than it was in 1960. However, this effect washes out very quickly. Most of it disappears within 3 or 4 years. Within six years after expansion, the quality of play is as good as it has ever been, or better.
I am certain this is true, and there are several ways to demonstrate that it is true, but those also are long articles, and we’ll get to those another time. Among the fallacies inherent in the opposite belief are:
1) It ignores the growth in population. The population of the United States in 1960 was 179 million. Now it is 307 million. The ratio of players to population, in reality, is almost the same.
2) It ignores the ever-increasing outreach to other countries. In 1960 there were only about 30 major league players born outside the United States. Now there are hundreds.
3) It ignores changes in the rate at which persons are developed into baseball players. The population of India, after all, is 140 times the population of the Dominican Republic—but the 100 best baseball players in India would not be comparable to the 100 best baseball players in the Dominican Republic.
4) It ignores the improvements that are always occurring in the game, which constantly force players to play at an ever-higher level.
5) (And this is Gladwell’s thesis, I think.) It greatly overstates the role of “talent” in making a baseball player.
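The first point in the list above is simple arithmetic, sketched here in Python with the figures quoted in the text: 25 players per team, and U.S. population only. It ignores players born elsewhere, which the second point addresses. `players_per_million` is a name made up for this example.

```python
ROSTER_SIZE = 25


def players_per_million(teams, population_millions, roster_size=ROSTER_SIZE):
    """Major league roster spots per million U.S. residents."""
    return teams * roster_size / population_millions


rate_1960 = players_per_million(16, 179)   # 400 players, 179 million people
rate_now = players_per_million(30, 307)    # 750 players, 307 million people
print(f"1960: {rate_1960:.2f} per million; now: {rate_now:.2f} per million")
```

The two rates differ by less than ten percent, even before counting the much larger pool of players from outside the United States.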
Gladwell’s thesis is that, in the creation of “outliers” such as geniuses, inventors and superior athletes, there are four parts:
1) Opportunity,
2) Training,
3) Development time, and
4) Talent.
Neither Gladwell nor I would argue that talent counts for nothing. In my model (above), after all, all of the 240 players who wound up as major league players were in the top one-third in talent, and almost all of them in the top one-tenth. However, talent is the least important of the four elements. The things that really identify the outliers are opportunity and development time. Lots of people have talent. Few of them get the opportunity or put in the time to develop that talent to the highest level.
This is what my model shows: that, while talent is certainly very important, there are many, many more players who have the talent to play major league baseball than will ever get the chance to do so. Therefore, the amount of “raw talent” available is essentially irrelevant to the quality of play. The amount of raw talent is not a variable that has anything to do with how many players develop their skills to a major league level. If it were, India would have to have a better baseball team than the Dominican Republic, since there is 140 times as much talent there. And therefore, if major league baseball expanded to 40 teams, 50, 60, or 200 or 500, the quality of talent in the major leagues would not change AT ALL, so long as the expansion occurred in an orderly way, and the system had a few years to recover between each step.
Another note about the model. People tend to assume (and often assert) that everybody who plays major league baseball was by far the best player on his high school team or his college team. This is not true. Many players are in the major leagues today who were NOT the best players on their high school or college teams, as you can confirm if you talk to players.
In my model, 60% of the players were the best players on their High School team at the time those teams were formed, and 72% of them were the best players on their high school teams at the end of their high school experience. Those numbers are certainly lower than in real life. In my model there were 15,360 high school players, 240 major league players, a 64-1 compression ratio. The real compression ratio is probably closer to a thousand to one.
If we were able to make the model large enough and complex enough to represent reality on a higher level, it might be useful in answering questions like “where is talent being missed?” It is my belief, for example, that talent is being overlooked at the smaller-college level. I think there are a significant number of players coming out of high school who, for one reason or another, don’t make it into the big-time college programs, but who catch up during their college years to the level of those who do. If the model were large enough and sophisticated enough, it might be useful in studying those kinds of issues.