The Better League, 5

By Bill James

June 11, 2022

Population Versus Expansion

In 1960, before the first expansion, each major league team represented or was drawn from a population of about 12 million people.

This article does not address the relative strength of the American League versus the National, but rather the relative strength of the major leagues as a whole over time. After the first expansion, in 1961-1962, it was commonly said that this had "diluted" the quality of play. In 1962 (yes, I am old enough to remember this) reporters and other observers would say that the major leagues now consisted of 16 teams' worth of major league players mixed with 4 teams' worth of minor league players. Actually, they would continue to say this for many years after that: that expansion had diluted the quality of the product, so major league teams were not as strong as they were years ago, before all them minor league players were added to the majors.

To assume that the number of fully qualified major league players does not vary over time is of course silly, and to assume that the number of fully qualified major leaguers varies in proportion to the population is fraught with problems. There are, after all, nations of many millions of people which don’t produce any major league players at all, because people there just don’t play baseball. You can’t assume that the number of players within a population is even remotely the same in all countries, nor can you safely assume that it is remotely the same in different decades.

But the argument that expansion diluted the quality of play in the major leagues must be true to some extent, so we need to try to model the problem and estimate what that effect might be, as best we can.

In 1876 the population of the United States was about 45,500,000, more or less. There were eight teams that we now recognize as major league teams, rightly or wrongly, so that is a ratio of about 5.7 million people for each team. In 2021 the population was about 333,000,000 and there were 30 teams, so that is a ratio of about 11 million people for each team.
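As a quick check on the raw arithmetic, here is a minimal Python sketch. The function name `people_per_team` is only an illustrative label, and the inputs are the rounded population figures quoted above.

```python
def people_per_team(population, teams):
    """Raw (unadjusted) number of people behind each major league team."""
    return population / teams


# Rounded figures quoted in the text above.
ratio_1876 = people_per_team(45_500_000, 8)
ratio_2021 = people_per_team(333_000_000, 30)

print(f"1876: {ratio_1876 / 1e6:.1f} million per team")
print(f"2021: {ratio_2021 / 1e6:.1f} million per team")
```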

These numbers have to be modified, however, for at least three things:

1) Segregation, which limited the number of potential athletes from which each team was drawn,

2) World War II, which severely limited the number of available athletes for a period of time, and

3) International Scouting, which has drawn into the major leagues now a very large number of players from Cuba, the Dominican Republic, Venezuela, Canada, Mexico, Japan, Australia, South Korea, Thailand and Zimbabwe. None from Zimbabwe? We’ll get there.

So this is what I did, to modify the raw population numbers to create a more realistic estimate of the relevant population.

1) Segregation. For each season prior to 1947, I multiplied the actual population by .70, assuming that 30% of the eligible population of the best athletes was banned from what was then regarded as Major League Baseball. That’s a conservative assumption. Beginning in 1947, I increased that number (.70) by .01 each year through 1949, by .02 in each year from 1950 through 1959, and by .01 each year from 1960 on, until the number was 1.00. This creates a pattern of integration essentially consistent with the research reported here in the article BL3. The number reaches 1.00 (100%) in 1966.

2) World War II. World War II, of course, took away most of the eligible population of baseball players, leaving the leagues to be staffed by whoever was left. I estimated the percentage that was left as:

90% in 1942

70% in 1943

50% in 1944

30% in 1945

50% in 1946

70% in 1947

90% in 1948 and

100% in 1949

The population of available athletes did not immediately snap back to 100% after World War II because, of course, a great many young men who would otherwise have become major league athletes had been killed or disabled in the war, or had had their development years, ages 18 to 22, completely taken away from them.

3) International Scouting. I marked the beginning of international scouting at 1950, which is about when we began to have significant numbers of international players in the majors, although of course there had been some earlier.

To represent this in the chart, I multiplied the available population by 1.0032 in 1950, and then multiplied that number by 1.0032 again in each year after that. That is saying that the influence of international scouting has been growing at a rate of about one-third of one percent per season, so that by 2021 international scouting had increased the pool of potential players by 26%. That’s a really conservative number, but again. . .better to make too small an adjustment than too large.
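The three adjustments can be sketched in Python roughly as follows. The function and variable names are illustrative labels, not anything from the original work. The sketch assumes that the segregation increments start with the 1947 season and that the international multiplier is first applied in 1950, which is the reading that reproduces the effective population figures in the charts.

```python
# Share of the player pool estimated to be available in each war-affected year.
WWII_AVAILABLE = {1942: 0.90, 1943: 0.70, 1944: 0.50, 1945: 0.30,
                  1946: 0.50, 1947: 0.70, 1948: 0.90}


def segregation_factor(year):
    """Share of the population eligible to play: .70 before 1947, 1.00 by 1966."""
    if year < 1947:
        return 0.70
    if year <= 1949:                        # +.01 per year, 1947-1949
        return 0.70 + 0.01 * (year - 1946)
    if year <= 1959:                        # +.02 per year, 1950-1959
        return 0.73 + 0.02 * (year - 1949)
    return min(1.00, 0.93 + 0.01 * (year - 1959))  # +.01 per year from 1960


def wartime_factor(year):
    """World War II availability; 1.0 outside 1942-1948."""
    return WWII_AVAILABLE.get(year, 1.0)


def international_factor(year, rate=0.0032, first_year=1950):
    """International scouting, compounding at `rate` per season from 1950."""
    if year < first_year:
        return 1.0
    return (1 + rate) ** (year - first_year + 1)


def effective_population(year, population):
    """Population (in thousands, as in the charts) after all three adjustments."""
    return (population * segregation_factor(year)
            * wartime_factor(year) * international_factor(year))


print(round(effective_population(1945, 150484)))
print(round(effective_population(2021, 333230)))
print(f"{international_factor(2021) - 1:.0%}")
```

Keeping each adjustment in its own function makes it easy to swap in a different assumption, such as a starting share other than .70, and see how far the result moves.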

Using these assumptions, the effects of the expansions beginning in 1961 were not fully overcome until 2007. There were repeated expansions—1969, 1977, 1993 and 1998. Each expansion diluted the talent a little bit more, thus setting back the time when the effects of expansion would be fully overcome.

In 1876, then, the population of the US was about 45,547,000 (you get different figures from different sources). However, since black players were not included in the game (with a very few exceptions), I count this as an effective population of 31,883,000. There were 8 teams in the NL, which makes a ratio of 3,985,000 potential athletes for each team. Of course, we could reduce this number by eliminating the women folk and the children and aged, but since these adjustments would be essentially the same for each season, they would have little impact on the conclusions. In the charts, the population, effective population and ratio are all given in thousands.

Year	Population	Effective Population	Teams	Ratio
1876	45547	31883	8	3985
1877	46708	32695	6	5449
1878	47868	33508	6	5585
1879	49029	34320	8	4290

How do we add this to our chart, which is based around .500?

I divided the "ratio" in the chart above by the sum of that ratio and 6678; in other words, the effective winning percentage for the National League in 1876, based on the size of the population from which the teams were drawn, would be .374, since (3985 / (3985 + 6678)) = .374. 6678 was the number I used because that number makes the chart center at .500. Adding that to the chart:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1876	45547	31883	8	3985	.374
1877	46708	32695	6	5449	.449
1878	47868	33508	6	5585	.455
1879	49029	34320	8	4290	.391
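The conversion from ratio to equivalent winning percentage is a one-line formula. A minimal Python sketch follows; the names `CENTERING_CONSTANT` and `equivalent_winning_pct` are illustrative, and the sample ratios are copied from the charts in this article.

```python
CENTERING_CONSTANT = 6678  # the value used in the text to center the chart at .500


def equivalent_winning_pct(ratio, constant=CENTERING_CONSTANT):
    """Convert a population-per-team ratio (in thousands) to a winning percentage."""
    return ratio / (ratio + constant)


# (year, ratio) pairs taken from the charts.
for year, ratio in [(1876, 3985), (1884, 1382), (1900, 6669), (2021, 13981)]:
    print(year, f"{equivalent_winning_pct(ratio):.3f}")
```

A ratio equal to the constant itself comes out at exactly .500, which is why the 1900 season, at 6669, lands almost exactly there.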

New leagues forming in the 1880s drove this equivalent winning percentage way down, as low as .171 in 1884, the lowest it has ever been. In 1884 each team was drawn from an effective population of not much more than a million people.

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1880	50189	35132	8	4392	.397
1881	51459	36021	8	4503	.403
1882	52729	36910	14	2636	.283
1883	53999	37800	16	2362	.261
1884	55269	38689	28	1382	.171
1885	56540	39578	16	2474	.270
1886	57810	40467	16	2529	.275
1887	59080	41356	16	2585	.279
1888	60350	42245	16	2640	.283
1889	61620	43134	16	2696	.288

The Players’ Revolt in 1890, leading to the formation of a third league, gave us the second-lowest number ever, but the consolidation into one 12-team league in 1892 made the ratio of population to teams much higher, thus presumably making the league much stronger. Not just "presumably"; there is no doubt that the quality of play in the majors was improving, although the population ratio alone would not say that it was much better in 1899 than in 1879:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1890	62890	44023	24	1834	.215
1891	64222	44956	16	2810	.296
1892	65554	45888	12	3824	.364
1893	66887	46821	12	3902	.369
1894	68219	47753	12	3979	.373
1895	69551	48686	12	4057	.378
1896	70883	49618	12	4135	.382
1897	72215	50551	12	4213	.387
1898	73548	51483	12	4290	.391
1899	74880	52416	12	4368	.395

The National League shucked off four teams after the 1899 season, creating the strongest league ever to that time for the 1900 season. In 1901 the American League formed, dividing the talent between the two leagues, and dropping the presumptive skill level back to the .330s:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1900	76212	53348	8	6669	.500
1901	77814	54470	16	3404	.338
1902	79415	55591	16	3474	.342
1903	81017	56712	16	3544	.347
1904	82618	57833	16	3615	.351
1905	84220	58954	16	3685	.356
1906	85822	60075	16	3755	.360
1907	87423	61196	16	3825	.364
1908	89025	62317	16	3895	.368
1909	90626	63438	16	3965	.373

From 1901 to 1960 the ratio of population to teams grew steadily, as the population of the US more than doubled in those years, while the number of teams stayed at 16. There were two exceptions to that, the first of which was the Federal League, 1914-1915:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1910	92228	64560	16	4035	.377
1911	93607	65525	16	4095	.380
1912	94987	66491	16	4156	.384
1913	96366	67456	16	4216	.387
1914	97745	68422	24	2851	.299
1915	99125	69387	24	2891	.302
1916	100504	70353	16	4397	.397
1917	101883	71318	16	4457	.400
1918	103262	72284	16	4518	.404
1919	104642	73249	16	4578	.407

By 1929 the presumptive winning percentage based on population was up to .443:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1920	106021	74215	16	4638	.410
1921	107721	75405	16	4713	.414
1922	109421	76595	16	4787	.418
1923	111122	77785	16	4862	.421
1924	112822	78975	16	4936	.425
1925	114522	80165	16	5010	.429
1926	116222	81356	16	5085	.432
1927	117922	82546	16	5159	.436
1928	119623	83736	16	5233	.439
1929	121323	84926	16	5308	.443

And by 1939 it was up to .479, the highest it had ever been except for the 1900 season, when there were only eight teams:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1930	123023	86116	16	5382	.446
1931	124937	87456	16	5466	.450
1932	126851	88796	16	5550	.454
1933	128765	90136	16	5633	.458
1934	130679	91476	16	5717	.461
1935	132594	92815	16	5801	.465
1936	134508	94155	16	5885	.468
1937	136422	95495	16	5968	.472
1938	138336	96835	16	6052	.475
1939	140250	98175	16	6136	.479

And then the 1940s were wild. By 1945 the presumptive quality of baseball (based on the population ratio) had dropped to its lowest point since 1890. By 1949, aided a little bit by early integration, the presumptive quality had climbed over .500 for the first time ever. It has never dropped under .500 since then:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1940	142164	99515	16	6220	.482
1941	143828	100680	16	6292	.485
1942	145492	91660	16	5729	.462
1943	147156	72106	16	4507	.403
1944	148820	52087	16	3255	.328
1945	150484	31602	16	1975	.228
1946	152148	53252	16	3328	.333
1947	153812	76445	16	4778	.417
1948	155476	100748	16	6297	.485
1949	157140	114712	16	7170	.518

By 1959, with large-scale, normalized integration, the quality of the leagues had reached a dizzying level:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1950	158804	119484	16	7468	.528
1951	162721	126099	16	7881	.541
1952	166639	132912	16	8307	.554
1953	170556	139927	16	8745	.567
1954	174473	147145	16	9197	.579
1955	178391	154567	16	9660	.591
1956	182308	162195	16	10137	.603
1957	186225	170031	16	10627	.614
1958	190142	178077	16	11130	.625
1959	194060	186335	16	11646	.636

What we are saying here is that, based simply on the size of the American population and the introduction of black players and international players into the game, we would conclude that a major league team from 1959 would beat the bejeebers out of a team from 1939.

By 1969, however, the major leagues had expanded by 50% in 9 years. This diluted the product, and set backward significantly the overall quality of play:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1960	197977	192755	16	12047	.643
1961	198578	196022	18	10890	.620
1962	199178	199320	20	9966	.599
1963	199779	202650	20	10133	.603
1964	200380	206012	20	10301	.607
1965	200981	209406	20	10470	.611
1966	201581	212832	20	10642	.614
1967	202182	214150	20	10707	.616
1968	202783	215473	20	10774	.617
1969	203383	216803	24	9033	.575

Because of international scouting, the effective population became larger than the American population in 1962. Remember Camilo Pascual, Roberto Clemente, Orlando Cepeda, Julian Javier, Juan Marichal, Luis Aparicio and the Alou brothers? Because there was another expansion in the 1970s, not a lot of progress was made toward recovery:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1970	203984	218139	24	9089	.576
1971	206308	221331	24	9222	.580
1972	208632	224540	24	9356	.584
1973	210956	227768	24	9490	.587
1974	213280	231014	24	9626	.590
1975	215605	234279	24	9762	.594
1976	217929	237562	24	9898	.597
1977	220253	240864	26	9264	.581
1978	222577	244184	26	9392	.584
1979	224901	247524	26	9520	.588

Much more progress was made in the 1980s, pushing the product back close to the pre-expansion standard:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1980	227225	250882	26	9649	.591
1981	229466	254167	26	9776	.594
1982	231664	257423	26	9901	.597
1983	233792	260619	26	10024	.600
1984	234825	262608	26	10100	.602
1985	237968	266974	26	10268	.606
1986	240116	270246	26	10394	.609
1987	242265	273537	26	10521	.612
1988	244413	276846	26	10648	.615
1989	246562	280173	26	10776	.617

But the additions of four more expansion teams in the 1990s left the quality of the league about the same in 1999 as it had been in 1990:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
1990	248710	283519	26	10905	.620
1991	251981	288167	26	11083	.624
1992	255252	292842	26	11263	.628
1993	258524	297544	28	10627	.614
1994	261795	302273	28	10795	.618
1995	265066	307029	28	10965	.622
1996	268337	311813	28	11136	.625
1997	271608	316624	28	11308	.629
1998	274880	321463	30	10715	.616
1999	278151	326330	30	10878	.620

With no expansions since 1998, the effects of expansion were finally wiped out by 2007, leaving the quality of play stronger than ever:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
2000	281422	331224	30	11041	.623
2001	284154	335510	30	11184	.626
2002	286887	339820	30	11327	.629
2003	289619	344155	30	11472	.632
2004	292352	348513	30	11617	.635
2005	295084	352896	30	11763	.638
2006	297816	357304	30	11910	.641
2007	300549	361736	30	12058	.644
2008	303281	366192	30	12206	.646
2009	306014	370674	30	12356	.649
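The recovery year can be checked directly from the ratio column. The sketch below is a minimal Python check using a selection of rows copied from the charts (the last pre-expansion year, each expansion year, and the mid-2000s); the dictionary and function names are illustrative.

```python
# Ratio (thousands of people per team) for selected years, from the charts.
RATIO_BY_YEAR = {
    1960: 12047,  # last season before expansion
    1961: 10890, 1962: 9966, 1969: 9033, 1977: 9264,
    1993: 10627, 1998: 10715,
    2005: 11763, 2006: 11910, 2007: 12058, 2008: 12206,
}


def first_year_back(ratios, baseline_year=1960):
    """First later year whose ratio is at least the baseline year's ratio."""
    baseline = ratios[baseline_year]
    for year in sorted(y for y in ratios if y > baseline_year):
        if ratios[year] >= baseline:
            return year
    return None  # never recovered within the data supplied


print(first_year_back(RATIO_BY_YEAR))
```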

And it has continued to grow since 2009:

Year	Population	Effective Population	Teams	Ratio	Winning Pct
2010	308746	375181	30	12506	.652
2011	311016	379149	30	12638	.654
2012	313287	383139	30	12771	.657
2013	315557	387150	30	12905	.659
2014	317827	391183	30	13039	.661
2015	320098	395238	30	13175	.664
2016	322368	399315	30	13311	.666
2017	324638	403414	30	13447	.668
2018	326908	407535	30	13585	.670
2019	329179	411679	30	13723	.673
2020	331449	415844	30	13861	.675
2021	333230	419417	30	13981	.677

Each team now represents an effective population of about 14 million people, including international players. This is the highest ratio of all time.

Just by way of my opinion, I will say that I don’t really believe that it takes as long for the quality of play to recover from expansion as this chart makes it appear. Major league players do not simply exist; they are created--like lawyers, writers, educators and criminals. They are created out of talent, which is limited by the population size, but they are also created by training, development and opportunity. Those things are not limited by the population size. My opinion. . .the effects of the first expansion (1961-1962) were probably mostly gone by 1966, 1967. Mostly gone, not entirely.

But. . .this is just intuitive, maybe, but it seems to me it took baseball longer to recover from the second expansion (1969) than from the first one. You have two operations in a fairly short period of time, it’s going to take you longer to recover from the second one than it did the first one.

And my second opinion: most people tend to dramatically overestimate the improvement in the quality of play over recent decades. I know that a lot of people think that a team from 2022 would easily dominate a team from 1982, if a game could be arranged. I really don’t see convincing evidence for that.

But there are a lot of things left that we could or will measure, and all of those things will feed into the estimate we make of the slope of history. I am trying, as best I can, to take this issue out of the realm of complete speculation, and move it into the realm of educated speculation, careful speculation, organized speculation. It’s a hard problem; take my work for what you think it is worth.

COMMENTS (11 Comments, most recent shown first)

Brock Hanke
hotstatrat et al - I'm not at all sure that the number of 100-win (or winning percentage) seasons is all that relevant to the quality of the league, much less the game. What 100-win seasons measure is the DIFFERENCE between the best teams and the rest of the league. Teams after expansion won 100 games because they had real bad teams to beat on - the expansion teams. It does make conceptual sense that the presence of really bad teams indicates that the league is admitting too many bad players because of expansion, but there are times in history which were nowhere near any expansion, but a couple of teams were winning 100 games simply because those couple of teams just got ahead of the rest of the game. At the very least, that has to dilute the value of the expansion argument, by introducing a different factor that does the same thing as the 100-win seasons. And no, I have no idea how to compensate for that.
12:16 AM Jun 24th

stublues
I'd like to piggy back off FrankD's observation about the pool of baseball players being impacted by career options in basketball and football.
There are so many other options too. US kids playing elite youth hockey; expansion of collegiate and developmental golf, with the Olympics admitting pros, that's an athletic career path. There are probably young men currently active in skateboard, motocross, skiing, who would have been baseball players in an earlier era.
8:54 AM Jun 20th

Manushfan
I agree with Our Bill here, I don't think the 1982 Brewers for example go 60-102 now. To me that's goofy.
5:59 PM Jun 13th

Marc Schneider
Of course, there is a difference between the quality of play and the entertainment value at any given time. We can agree that today's quality of play is at a very high level historically, but the entertainment value is much lower. And, for a fan, that seems to me far more important than some abstract quality of play. In fact, I wonder if there is some sort of inverse correlation between the quality of play and the entertainment value.
1:39 PM Jun 13th

FrankD
Interesting series. I too would like to know if there should be a correction for the increase of NFL and NBA players and status/salaries. Surely some of today's NBA and NFL players would have played or emphasized professional baseball. Also, and maybe this is too small of a factor, age demographics of a population may have had an effect. For example, did the baby boom have an effect on available talent for pro baseball?
12:22 PM Jun 12th

willibphx
Bill,

Can you provide the source for your 30% estimate? Most of the data I have seen show the African American population between 10% and 15% of the US throughout the last 150 years or so.

Second, how do you think the dramatic growth in the competition for talent from football and basketball has affected the talent pool for baseball?

Thanks as always for the analysis and insights.
9:43 AM Jun 12th

cderosa
Hey Bill,
This is an exciting series of articles; I'm really enjoying them.

Is there a case for marking down any of the later Korean War years to the level of maybe 1942 (90%)? It seems like there were a lot of important players in the service for a year or two during the 1951-1953 period, including Whitey Ford, Art Houtteman, Curt Simmons, Ted Williams, Willie Mays, Don Newcombe, Vern Law, Dick Groat.
It seems like a bigger deal to me than the losses in the rest of the Cold War draft.

Chris DeRosa
8:43 AM Jun 12th

DefenseHawk
hotstatrat,

You mentioned looking at the 100 win seasons after expansion. I had made a similar point in a "Hey, Bill" comment re: Grich, Parker and the 1977 A.L. expansion. That point kind of got lost in the larger question that I was asking Bill about Win Shares.

But, absolutely, it's something to look at. (I would look at winning percentages instead, however, because the first expansions in 1961 and 1962 bumped the schedule up to 162 games.)

The average A.L. "established" club played .522 ball in 1977, gaining an average of 3.5 wins over 162 games compared to 1976. The big four (Yankees, Royals, Red Sox and Orioles) even more so, gaining 9+ wins on average.

The already "established" teams benefit with expansion. And so do their players. They're playing against weaker competition.

Does Rod Carew hit .388 in 1977 if not for expansion? Unlikely. Sure, he hit only .328 against the Mariners (.432 vs. Toronto). But the pitching is going to be slightly better on the established clubs if there's no expansion. Fielding will also probably be slightly better.

Carew went only 1 for 8 against Seattle's Glenn Abbott. If not for expansion, the A's Rick Langford might have been spending another year in the minors, with Abbott retaining his place in Oakland's rotation instead of being picked by the Mariners in the expansion draft. Carew had gone 4 for 9 against Langford.

However, the decline of the quality of play in the years immediately after expansion is not all due to the lack of available baseball talent. Some of the decline is due to the lack of talent made available to the expansion clubs. Not all of the best AAA talent was available in the draft pool. Nor was even all the best available "bench" talent from the majors made available. Just look at the draft rules for the expansions of 1961, 1962 and 1977 if you're not already familiar with them.

Other rules hurt the expansion clubs, too, such as being put at the end of the draft selections. Or Seattle and Toronto not being allowed to participate in baseball's first re-entry draft (the free agent class of 1976).

It took, on average, the first class of expansion teams 20.5 years before winning a division title. The Class of '69 took an average of 12.75 years.
8:43 PM Jun 11th

77royals
The quality of the leagues in the 1970's might not have been as high as previously, but man, it was a lot of fun to watch.
4:33 PM Jun 11th

hotstatrat
I am extremely grateful for this, thank you. I even wrote a long article about something that relied on my very rough estimates of exactly what you are measuring. That article was abandoned because my estimates were unsatisfactorily rough. I'm sure your conclusions will inspire me to try again.

I was surprised to read that you believe it took longer to overcome the 1969 expansion than the ones early in that decade. One thing I hope you will attempt to measure is that baseball teams over time have become ever more sophisticated and effective at training players. For that reason, if the Majors recovered from the 1961-1962 expansion by 1966 or '67, I would guess they recovered from the 1969 expansion by 1973.

One indicator of league strength that is admittedly very wonky because it relies on small sample size, but is evidently real: 100 win seasons. There have been a vastly disproportionate number of 100 win seasons in expansion years or shortly after expansion.

There has also been a large number of 100 win seasons in recent years. I'm wondering if that is a reflection of a feast or famine style of team building lately or other modern baseball economics or that leagues are getting weaker due to so many American athletes being more interested in football or other endeavors. You've mentioned how scary baseball demographics are right now.
1:38 PM Jun 11th
