
Rezearch?

August 3, 2010

I.

 

I have a couple of research questions for you, if anybody knows how to do these.

First, has anybody studied the willingness of managers to break the “rule” against intentionally putting the potential winning run on base?   It seems to me. . . .just my impression. . .it seems to me that managers now are much more willing to issue an intentional walk putting the potential winning run on base than they have ever been before.   Since intentional walks are pretty easy to spot in the accounts of the game, it seems to me that this should be something we could study.

Among the things I would like to know are:

1)  How frequent is this occurrence now, as opposed to 40 years ago?  (I will guess that it is 200% more common now than it was in 1970.)

2)  Do we have records of how many times each manager has done this?

3)  How often does it backfire, “backfiring” defined as “runner who was put on base scores the go-ahead run.”

 

II.

The All-Important “Loss” Column

 

This is the time of year at which teams which are not in first place start hearing about the all-important loss column.    The Red Sox are 6½ games back, we hear, but 7 back in the all-important. 

Well. . .what makes the Loss Column more important than the Win Column?   Wouldn’t it make as much sense to talk about the all-important Win Column?

There is a difference, which is this.   Teams which are in a pennant race win more games than they lose.   Let us suppose that, with a week to go, the standings are these:

 

Team            W-L      Pct.    GB
Indianapolis    93-65    .589
Portland        90-67    .573    2 ½
New Orleans     91-68    .572    2 ½

 

OK, in that situation, I grant you that the team which is 90-67 is in much better shape than the team which is 91-68.   In this scenario, in fact, if we assume

a) that each team has a 60% chance to win each remaining game,

b) that they don’t play one another, and

c) that in the event of a tie, all the teams are equal

Then Indianapolis’ chance of winning this race is 92.4%, Portland’s is 5.5%, and New Orleans’ is 2.1%.  Portland has a very meaningful advantage from having two more games left on the schedule.
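The percentages above can be checked by brute force. The sketch below is only an illustration, not the method actually used for the article: it assumes a 162-game schedule (which the records imply but the text does not state), treats every remaining game as an independent 60% proposition, and splits the pennant evenly among teams that finish tied. The names `SEASON_GAMES`, `final_win_distribution`, and `pennant_odds` are made up for this example.

```python
from itertools import product
from math import comb

SEASON_GAMES = 162   # assumption: 162-game schedule, not stated in the text
P_WIN = 0.60         # assumption (a): 60% chance to win each remaining game

standings = {
    "Indianapolis": (93, 65),
    "Portland": (90, 67),
    "New Orleans": (91, 68),
}

def final_win_distribution(wins, losses):
    """Return {final_win_total: probability} over the games still unplayed."""
    remaining = SEASON_GAMES - wins - losses
    return {
        wins + k: comb(remaining, k) * P_WIN**k * (1 - P_WIN) ** (remaining - k)
        for k in range(remaining + 1)
    }

def pennant_odds(standings):
    """Enumerate every combination of final win totals and credit the leader(s)."""
    teams = list(standings)
    dists = [final_win_distribution(*standings[t]) for t in teams]
    odds = dict.fromkeys(teams, 0.0)
    # Assumption (b): the teams do not play one another, so results are independent.
    for outcome in product(*(d.items() for d in dists)):
        totals = [w for w, _ in outcome]
        prob = 1.0
        for _, p in outcome:
            prob *= p
        best = max(totals)
        leaders = [t for t, w in zip(teams, totals) if w == best]
        for t in leaders:
            odds[t] += prob / len(leaders)   # assumption (c): ties split evenly
    return odds

odds = pennant_odds(standings)
for team, p in odds.items():
    print(f"{team:13s} {p:6.1%}")
```

Under those assumptions the enumeration comes out at roughly 92.4%, 5.5%, and 2.1%, in line with the figures quoted above. Portland's edge over New Orleans comes entirely from having five games left rather than three.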

 

Suppose, however, that the standings on August 1 are these:

 

Team            W-L      Pct.    GB
Indianapolis    63-39    .618
Portland        60-41    .594    2 ½
New Orleans     62-43    .590    2 ½

 

New Orleans is four games behind in the loss column; Portland, only two.   But in that situation, wouldn’t you rather be New Orleans than Portland?    Portland has to play four more games, with the same number of days left on the schedule.   That means that New Orleans has four more days off in August and September.   Given that you’re in roughly the same position, wouldn’t you rather have the extra days off than the possibility of winning an extra game because you haven’t played it yet?
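For readers who want the arithmetic spelled out, here is a small sketch of how the "games behind" figure, the loss-column deficit, and the games-played gap relate to one another for the August 1 standings. It uses the usual games-behind formula (the average of the win gap and the loss gap); the function name `games_behind` and the `summary` dictionary are just illustrative.

```python
def games_behind(leader, team):
    """Usual games-behind figure: average of the win gap and the loss gap."""
    (leader_w, leader_l), (team_w, team_l) = leader, team
    return ((leader_w - team_w) + (team_l - leader_l)) / 2

# August 1 standings from the text, as (wins, losses).
standings = {
    "Indianapolis": (63, 39),
    "Portland": (60, 41),
    "New Orleans": (62, 43),
}

leader = standings["Indianapolis"]
summary = {}
for name, (wins, losses) in standings.items():
    summary[name] = {
        "games_behind": games_behind(leader, (wins, losses)),
        "loss_column": losses - leader[1],   # deficit in losses only
        "games_played": wins + losses,
    }
    print(name, summary[name])
```

Both chasers are the same 2 ½ games back. The loss column differs only because New Orleans has already played four more games than Portland, which is the same fact as Portland having four fewer days off the rest of the way.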

Is there any research on this?   Has anybody ever studied the effect of days off during a pennant race?   How exactly would you do that?  

 

Your research is appreciated.

 

 

Part III—Ozzie’s Latest Rant

 

Our beloved half-crazy manager, Ozzie Guillen, has burst forth with some strongly-worded thoughts about the disparate treatment of Asian and Latin baseball players, and, as I write this, MLB is busy defending itself from charges of unequal treatment of these groups.

            Ozzie’s only problem is that he tells the truth.   It’s a bad habit, gets you in a lot of trouble, but. . .he can’t seem to break it.

Look, we all want to believe that we are doing our best to accommodate all of our players, and to deal fairly with all of them.   At the same time, just because we think we are trying to do that doesn’t mean that we are succeeding.

When you see something that you think is wrong, you think is an injustice, what should you do about that?   Should you keep your mouth shut, or should you speak out against it?

I’m not saying that Ozzie is right, and here’s one element of the problem that I haven’t heard anyone comment on.    The difference between English and Spanish is not at all like the difference between English and Japanese.   A person who speaks Spanish as a native language can learn to speak English with a reasonable effort, because many of the words are nearly the same and the structure of the language is similar.   This is not true of English and Japanese.  

            But Ozzie saw something that he thought was wrong, and he spoke out against it.   Good for him.   We shouldn’t criticize him for that; we should listen to him, and hear what he has to say.

 

 

Part IV—Age Deviation as an Indicator of Quality

 

            The sum total of successes and failures, as measured in the statistics, is the same in all leagues.  99.9% of sabermetrics is based on circular measurements, assuming that whatever is a success for one team is a failure for the other.  Sabermetrics—except in the recent era, when we have inter-league play with which to calculate adjustments—sabermetrics implicitly assumes that all leagues are created equal, that a player from 1957 is facing the same level of competition as a player from 2007.    The only problem is, it isn’t true.   It’s what we might call a necessary assumption, and it may be a workable assumption the majority of the time, but we know that it’s untrue on some level—we know, indeed, that it is always untrue on some level, since the quality of competition in two different leagues can never be precisely the same.

            It bothers me to operate on what I know to be a false assumption, and so at some point I began worrying about this in the back of my head.   In 2002, while I was coaching third base for a team of eight-year-old boys with the one requisite eight-year-old girl who plays on all such teams, I had a conceptual breakthrough on this issue.   “It may be true that the sum total of successes and failures is the same in the statistics of every league,” I realized, “but that nonetheless the statistics of high-quality leagues could be systematically different than the statistics of lower-quality leagues.  Thus, it may well be true that the statistics of a league do indicate or even measure the quality of the competition within the league.”

            Once I had made that breakthrough, I began to ask the question “In what ways are the statistics of high-quality leagues different than the statistics of lower-quality leagues?”   In what ways, in other words, do the statistics of a league indicate the quality of the league?   And once I had begun to ask that question, I found more than 40 ways that the statistical summary of a good league is systematically different from the statistical summary of a weaker league.  

            It was a big project for me; I worked on it for several weeks on end.   Just as I was about ready to write up what I had found, however, the Boston Red Sox called.    I put what I had found on the back burner, and went to work on Red Sox stuff, and I never did get around to writing up what I had learned.

            OK, here’s one way in which the statistics of good leagues are systematically different from the statistics of weaker leagues.

Age.

The more players in the league who are ages 25 to 30, generally speaking, the higher the quality of the competition within the league.  The further away from age 27 the players in the league are, the lower the quality of competition.

Think about it.   During World War II, the quality of competition obviously declined.    What happened to the age spectrum?   Baseball in World War II was often described as Old Men, Teenagers, and the physically unfit.

If you compare college ball to high school baseball, which has players who are nearer in age to 27?  Which has better quality competition?

If you compare college baseball to Triple-A baseball, which has more players who are nearer in age to 27?

            If you compare major league baseball to minor league baseball, which has more?

            If you compare high school baseball to little league baseball, which has more?

            If you compare a first-place team to a last-place team, which is more likely to have a 19-year-old in center field and a 38-year-old at first base?

            In all of these cases, the team which competes at a higher level is likely to have more players who are nearer in age to 27.

 

            We can use this fact, then, as one element among many to make a measurement of the level of competition within a league.   The first thing we need to do is to establish a method.    Here’s the method I came up with.

            1)  If a player is less than 27 years old, subtract his age from 27.

            2)  Square that.

            3)  Multiply that by his plate appearances.   The result is called the “Age Discrepancy Contribution”, or the ADC for short.

            4)  If a player is more than 27 years old, subtract 27 from his age.

            5)  Divide by two.

            6)  Square that.

            7)  Multiply by his plate appearances to get the ADC.

            8)  Find the team or league total of the ADC.

            9)  Divide by the team or league total plate appearances.

            10)  Take the square root of that.

            11)  Repeat steps 1-10 for pitchers, but replacing “plate appearances” with “innings pitched”.

            12)  Multiply the “batters result” by two and the “pitchers result” by one, and divide the total by three.

 

            In other words, age 26 is equivalent to 29, 25 is equivalent to 31, 24 is equivalent to 33, 20 is equivalent to 41, 18 is equivalent to 45, and 15 would be equivalent to 51.   

The result is called the “Age Deviation Score”, and the Age Deviation Score is an inverse indicator of the quality of play within a league.
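The per-player part of the method (steps 1 through 7) can be sketched in a few lines. This is only one reading of the steps as written; the function names `age_gap` and `age_discrepancy_contribution` are invented for the example.

```python
def age_gap(age):
    """Effective distance from age 27 (steps 1, 4 and 5)."""
    if age < 27:
        return 27 - age
    return (age - 27) / 2   # each year past 27 counts half as much

def age_discrepancy_contribution(age, playing_time):
    """Steps 2-3 and 6-7: square the gap, then weight by PA (or IP for pitchers)."""
    return age_gap(age) ** 2 * playing_time

# The age pairs the text describes as equivalent.
pairs = [(26, 29), (25, 31), (24, 33), (20, 41), (18, 45), (15, 51)]
for young, old in pairs:
    print(young, old, age_gap(young), age_gap(old))
```

Each pair prints the same gap on both sides, which is all that "age 26 is equivalent to 29" means here: a year short of 27 is penalized as much as two years past it.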

 

            Let’s walk through the data for the 1976 Cincinnati Reds, a rather good team, and the 2003 Detroit Tigers, who lost 119 games.   These are the Plate Appearances and Ages for the members of each team:

 

1976 Cincinnati Reds                 2003 Detroit Tigers
Player              PA    AGE        Player              PA    AGE
Pete Rose           759   35         Dmitri Young        635   29
Dave Concepcion     636   28         Bobby Higginson     538   32
Ken Griffey Sr.     628   26         Carlos Pena         516   25
George Foster       627   27         Ramon Santiago      507   23
Joe Morgan          599   32         Craig Monroe        458   26
Tony Perez          586   34         Alex Sanchez        423   26
Cesar Geronimo      555   28         Shane Halter        393   33
Johnny Bench        552   28         Warren Morris       377   29
Dan Driessen        268   24         Brandon Inge        366   26
Doug Flynn          235   25         Eric Munson         357   25
Bill Plummer        168   29         Kevin Witt          289   27
Mike Lum            164   30         Omar Infante        244   21
Bob Bailey          141   33         Andres Torres       185   25
Ed Armbrister        90   27         Matt Walbeck        144   33
Gary Nolan           88   28         Gene Kingsale       140   26
Pat Zachry           77   24         Ben Petrick         129   26
Jack Billingham      70   33         Dean Palmer          98   34
Joel Youngblood      60   24         A.J. Hinch           82   29
Fred Norman          59   33         Danny Klassen        78   27
Don Gullett          51   25         Craig Paquette       33   34
Santo Alcala         50   23         Cody Ross            22   22
Pedro Borbon         20   29         Hiram Bocachica      22   27
Rawly Eastwick       19   25         Ernie Young          15   33
Pat Darcy            14   26         Nate Cornejo          5   23
Will McEnaney         9   24         Mike Maroth           4   25
Manny Sarmiento       7   20         Adam Bernero          4   26
Don Werner            5   23         Matt Roney            2   23
Rich Hinton           1   29         Jeremy Bonderman      2   20
                                     Steve Avery           1   33
                                     Gary Knotts           1   26

 

 

            If you follow the process above, it works out to an Age Deviation Score of 2.26 for Cincinnati, 2.10 for Detroit.  

            Fooled you, didn’t I?   But this is the data for the pitchers on the two teams:

 

1976 Cincinnati Reds                 2003 Detroit Tigers
Pitcher             IP      AGE      Pitcher             IP      AGE
Fred Norman         180.3   33       Steve Sparks         89.7   37
Jack Billingham     177.0   33       Steve Avery          16.0   33
Pedro Borbon        121.0   29       Danny Patterson      17.7   32
Rich Hinton          18.0   29       Jamie Walker         65.0   31
Joe Henderson        11.0   29       Brian Schmack        13.0   29
Gary Nolan          239.0   28       Adam Bernero        100.7   26
Pat Darcy            39.0   26       Gary Knotts          95.3   26
Don Gullett         126.0   25       Chris Spurling       77.0   26
Rawly Eastwick      107.7   25       Fernando Rodney      29.7   26
Pat Zachry          204.0   24       Matt Anderson        23.3   26
Will McEnaney        72.0   24       Eric Eckenstahler    15.7   26
Santo Alcala        132.0   23       Mike Maroth         193.3   25
Manny Sarmiento      43.7   20       Nate Robertson       44.7   25
                                     Chris Mears          41.3   25
                                     Nate Cornejo        194.7   23
                                     Matt Roney          100.7   23
                                     Franklyn German      44.7   23
                                     Shane Loux           30.3   23
                                     Wil Ledezma          84.0   22
                                     Jeremy Bonderman    162.0   20

 

           

            When you do the same for the pitchers, you get 2.75 for the Reds, and 3.75 for the Tigers.   Combining the two scores in a 2-to-1 ratio, then, we get an overall Age Deviation Score of 2.42 for the Big Red Machine, and 2.65 for the 2003 Tigers.
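As a check on the arithmetic, here is a sketch that runs steps 1 through 10 on the 1976 Reds pitchers and then applies the 2-to-1 weighting from step 12. It is one reading of the method, not the author's own code: innings are taken as printed (180.3 rather than 180 1/3), and the function names `age_deviation` and `combined_score` are invented for the example.

```python
from math import sqrt

def age_deviation(rows):
    """rows: (playing_time, age) pairs. Returns the weighted root-mean-square age gap."""
    total_adc = 0.0
    total_time = 0.0
    for playing_time, age in rows:
        # Years short of 27 count in full; years past 27 count half.
        gap = 27 - age if age < 27 else (age - 27) / 2
        total_adc += gap ** 2 * playing_time
        total_time += playing_time
    return sqrt(total_adc / total_time)

def combined_score(batters, pitchers):
    """Step 12: batters count double, pitchers once."""
    return (2 * batters + pitchers) / 3

# 1976 Reds pitchers as (IP, age), copied from the pitching data above.
reds_pitchers = [
    (180.3, 33), (177.0, 33), (121.0, 29), (18.0, 29), (11.0, 29),
    (239.0, 28), (39.0, 26), (126.0, 25), (107.7, 25), (204.0, 24),
    (72.0, 24), (132.0, 23), (43.7, 20),
]

reds_pitching = age_deviation(reds_pitchers)
reds_overall = combined_score(2.26, 2.75)     # batter and pitcher scores quoted in the text
tigers_overall = combined_score(2.10, 3.75)

print(round(reds_pitching, 2))    # 2.75
print(round(reds_overall, 2))     # 2.42
print(round(tigers_overall, 2))   # 2.65
```

Most of the Reds' pitching score comes from a few pitchers far from 27: Sarmiento at 20 and Alcala at 23 on the young side, Norman and Billingham at 33 on the old side.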

            The score for a good team will usually be lower than the score for a bad team; not always, usually.    Sometimes things which are usually true can be added together to reach conclusions which are clearly true, although sometimes they cannot.  Let’s look now at the scores over time.

 

            When major league baseball “began” in 1876 (it had begun earlier than that, and it did not truly become major league baseball until later than that, but we have to start somewhere). . .when major league baseball began in 1876, the Age Deviation Scores for MLB were in the range of 3.5 and higher:

 

 

1876    3.57
1877    3.42
1878    4.31
1879    3.85

 

 

            You may wonder at the sudden surge upward in 1878.   “Major League Baseball” in 1878 consisted of only six teams, therefore the data is volatile.   A handful of personnel decisions impact the data for the league.    Monte Ward pitched 334 innings that year; he was 18.   The Only Nolan pitched 342 innings; he was 20.   These things move the league average upward in that environment.  

            Anyway, over the next few years, as the league stabilized, we would expect the Age Deviation Score to drop, suggesting an improvement in the quality of play, and it did:

1880    4.12
1881    3.43
1882    3.31
1883    3.13

 

            It dropped sharply.   The Age Deviation Score dropped from 4.12 in 1880 to 3.13 just three years later.   The 4.31 figure for the National League in 1878 remains the highest ADS on record for a league represented in the history books as major league.

            In 1884, however, baseball expanded very, very rapidly, causing the Age Deviation Score to spike upward:

 

1880    4.12
1881    3.43
1882    3.31
1883    3.13
1884    3.79

 

            In 1884 there were three leagues.   The Age Deviation Score for the National League was 3.57, for the American Association 3.70, and for the Union Association 4.16—consistent with the general perception that the National League was the strongest of these leagues, and the Union Association the weakest.

            The UA folded after the 1884 season, and from 1885 to 1889 the Age Deviation Scores drifted lower, consistent with a fairly rapid improvement in the quality of play:

 

1884    3.79
1885    3.46
1886    3.39
1887    3.48
1888    3.45
1889    3.17

 

            Throughout all of this era, the Age Deviation Scores were lower in the National League than in the American Association, as we would expect that they would be.

            In 1890 there was a baseball war.   The players formed a union and started their own league, the Players’ League.   Most of the best players went into the Players’ League, but the other leagues signed replacements and carried on.   As we would expect, this sent the Age Deviation Score somewhat higher:

 

1884    3.79
1885    3.46
1886    3.39
1887    3.48
1888    3.45
1889    3.17
1890    3.43

 

            The scores for the three leagues were 3.62 for the American Association, 3.79 for the National League, and 3.10 for the Players’ League, suggesting that, for that one season, the National League may actually have been the weakest of the three leagues.   Again, this is not a surprise to people who are knowledgeable about that season.

            After the 1890 season the Players’ League folded, and after the 1891 season the other two leagues consolidated.   Thus, baseball dropped from 24 teams to 12 teams in two years.   As we would expect, this led to a sharp decrease in the Age Deviation Score:

 

1890    3.43
1891    3.27
1892    3.03

 

            The 3.03 figure for 1892 was a new low at that time, but the figure continued to drop lower throughout the decade, indicating that the game was continuing to mature, continuing to weed out the teenagers and the old men who had been common in the game through the 1880s.

 

1890    3.43
1891    3.27
1892    3.03
1893    2.99
1894    2.84
1895    2.87
1896    2.87
1897    2.87
1898    2.76
1899    2.72

 

            In 1900 the National League eliminated four teams, dropping major league baseball to one highly competitive 8-team league.   The Age Deviation Score dropped to 2.62.

            In 1901, however, the American League started up, disrupting the established order again, and driving the average higher:

 

1899    2.72
1900    2.62
1901    2.79

 

            To this point the data has consistently behaved as we would have expected it to behave, with the score increasing whenever new leagues are added, and decreasing in almost all other years.

            After 1901, however, this is not always true.   Baseball in the first decade of the 20th century was much more successful financially than it had been in the 1890s.   The players made more money; the game continued to get organized and continued to thrive.   We might expect, in these circumstances, that the Age Deviation Score would continue to go down, but in fact it did not.   After dropping to a record-low 2.56 in 1903, the deviation score then began to go up:

 

1900    2.62
1901    2.79
1902    2.64
1903    2.56
1904    2.70
1905    2.69
1906    2.87
1907    2.94
1908    2.90
1909    3.05
1910    2.96

 

            The deviation score reached around 3.00 in 1909, and flattened out at that level for several years after that.   This is one of three respects in which the score in that era does not behave as we might expect.   The other two are:

            1)  That although the American League totally dominated the World Series of the 1910-1918 era, and the American League clearly had more stars, the deviation scores in the American League were consistently higher than in the National League,

            2)  That one would expect the addition of the Federal League to drive the scores up, but this did not happen.   Instead, the scores remained level or dropped slightly, and the scores for the Federal League itself were actually lower than the scores for the established leagues:

 

1910    2.96
1911    2.89
1912    3.02
1913    3.03
1914    2.94
1915    2.87
1916    2.88

 

            And these are the scores of the three leagues in those years:

 

 

Year    American    National    Federal
1910    3.08        2.85
1911    3.09        2.68
1912    3.24        2.82
1913    3.29        2.79
1914    3.23        2.91        2.68
1915    3.02        2.88        2.73
1916    2.94        2.82

 

 

 

 

            It was not until 1918 that the American League scores caught up to the National League.  

            I am not arguing that the Federal League was equal in quality to the National League or the American League.   I think it’s pretty clear that it was not.   However, I will alert you that, if we get the chance to look at other internal indicators of league quality, we will see this again; the Federal League will again, at other times, appear to be on an equal footing with the other two leagues. 

            Anyway, after 1916—after the Federal League folded—the age deviation scores contracted rapidly:

1916    2.88
1917    2.62
1918    2.66
1919    2.47
1920    2.49
1921    2.51

 

            And then swung back upward in the Babe Ruth era:

 

1920    2.49
1921    2.51
1922    2.67
1923    2.70
1924    2.73
1925    2.74
1926    2.72
1927    2.89
1928    2.98
1929    2.89
1930    2.78

 

            This is not entirely unexpected.    Baseball salaries increased rapidly in the 1920s.   When salaries increase, one of the things that happens is that older players stay in the game longer.   If a player has been making $3,000 a year to play baseball, and the average salary goes to $7,000, that player is strongly motivated to hang on for another year.   This causes the age deviation score to increase, which is generally—but not universally—indicative of a decline in the quality of play.

            In any case these numbers worked their way back downward through the 1930s:

 

1930    2.78
1931    2.78
1932    2.79
1933    2.69
1934    2.77
1935    2.71
1936    2.67
1937    2.66
1938    2.73

 

            Through this era, as you would probably expect, the numbers in the American League tended to be a little bit lower than the numbers in the National League.   The American League was probably the stronger league.    The numbers went up briefly over the next few years:

 

1939    2.86
1940    2.99
1941    2.93
1942    2.89
1943    2.71

 

            We come then to World War II.   Since we all know that the quality of play went backward during World War II, we would expect the Age Deviation Score to have increased substantially—and in fact it did:

 

1940    2.99
1941    2.93
1942    2.89
1943    2.71
1944    2.94
1945    3.15

 

            The 3.15 figure in 1945 was the highest of any season since 1891, indicating a substantial move backward.   However, it should also be noted that the 1943-1944 figures are not remarkable; it’s really only 1945 that has an out-of-line Age Deviation Score.   The 1943-1944 figures are consistent with the rest of baseball history in that era.    In other words, there really were no more teenagers and old men in baseball in 1943 than in 1933 or 1923.     There were more in 1945, yes, but only in 1945.

            Anyway, when the “real” players returned in 1946 the Age Deviation Score dropped sharply:

 

1945    3.15
1946    2.65
1947    2.68

 

            But then went back up in 1948-1949, when the young players who entered baseball after the War began to reach the majors:

 

1945    3.15
1946    2.65
1947    2.68
1948    2.82
1949    2.80
1950    2.76

 

            The numbers were stable throughout the 1950s:

 

1950    2.76
1951    2.69
1952    2.85
1953    2.74
1954    2.71
1955    2.75
1956    2.81
1957    2.79
1958    2.84
1959    2.80
1960    2.84

 

            We come, then, to the expansions of the 1960s.   For any method designed to measure the quality of play in the major leagues, this is a critical juncture, as people disagree strongly about the effects of expansion.    Some people—like me—believe that the effects of expansion on the quality of play were transitory; other people believe they were more or less permanent.   This is not a system to evaluate the overall quality of play, but it could be an element of such a system.   We thus need to look carefully at what is happening.

            The American League expanded from 8 teams to 10 in 1961; the National League did the same in 1962.   As it happens, neither league saw an immediate movement in the Age Deviation Score.    The American League ADS went from 2.73 in 1960 to 2.80 in 1961; the NL went from 2.99 in 1961 to 2.93 in 1962.

            However, there was an expansion effect.   It was just delayed.   What happens in an expansion is this.   The immediate effect of the expansion is to let into the major leagues a number of career minor leaguers, players who have been waiting for a chance but who aren’t quite good enough to force their way in, and these players tend to be closely bunched around 27 years of age.    What happens a year or two later, though, is that these first expansion players fail, for the most part, and many of them are replaced by very young players or by older players who have been released by other teams.    The Houston Colt .45s in 1962, their first season, had no teenagers on their roster.   In 1963, however, they had eight teenagers on their roster at one point or another—one of whom (Rusty Staub) led the team in games played.

            In 1965, then, the Age Deviation Score reached up to 2.98—the highest figure since 1945, and one of the highest figures of the 20th century.    This suggests a substantial “backstep” in the quality of play due to expansion:

 

1960    2.84
1961    2.87
1962    2.80
1963    2.94
1964    2.90
1965    2.98

 

            After 1965 these figures went down rapidly—until the next expansion in 1969.   After the second expansion, they again worked their way upward:

 

1965    2.98
1966    2.85
1967    2.70
1968    2.59
1969    2.70
1970    2.70
1971    2.80

 

            The 2.59 figure in 1968 was the lowest since 1921.  Another observation here.  It is generally believed that the National League in this era was stronger than the American League—but this indicator does not reflect that.   In fact, the American League numbers in this era were consistently lower than the National League numbers—lower every year from 1957 through 1965.    It wasn’t until 1966 that the National League scores caught up to the American League—and even then they were almost the same.

            In the 1970s the Age Deviation Scores declined slowly:

 

1971    2.80
1972    2.79
1973    2.77
1974    2.77
1975    2.84
1976    2.71
1977    2.75
1978    2.74
1979    2.69
1980    2.63
1981    2.65

 

            In the late 1970s, after the beginning of the free agent era, salaries escalated very rapidly.  This kept older players in the game, and this caused the Age Deviation Score to go up again.   By the mid-1990s, however, the indicator had worked its way down to the all-time low:

 

1980    2.63
1981    2.65
1982    2.77
1983    2.81
1984    2.84
1985    2.83
1986    2.92
1987    2.81
1988    2.63
1989    2.62
1990    2.60
1991    2.68
1992    2.61
1993    2.53
1994    2.47

 

            2.47 remains the lowest figure ever, in 1919 and in 1994.   I’ll discuss this a little later.

            After 1994, of course, baseball was in the heart of the steroid era.  Steroids are a youth drug; they enabled players to do things at ages 35 and above that had never been done before at any age.  This caused the Age Deviation Score to move consistently higher for more than ten years:

 

1994    2.47
1995    2.48
1996    2.54
1997    2.62
1998    2.73
1999    2.74
2000    2.68
2001    2.74
2002    2.71
2003    2.75
2004    2.76
2005    2.81
2006    2.91

 

            It moved higher, yes, but not all that much higher.   The ADS moved back to the levels of the 1960s.

            Since 2006, with the gradual elimination of steroids and their lingering after-effects, the number has dropped significantly:

 

2006    2.91
2007    2.88
2008    2.78
2009    2.67

 

            Providing further evidence, if more is needed, that we are moving beyond the steroid era.  

The Age Deviation Score for the National League has been higher than the score for the American League in every season since 1999, often much higher—consistent with many other types of evidence showing that the quality of play in the American League has moved ahead of the quality of play in the National.

 

            It is my belief that the Age Deviation Score is in general an indicator of the quality of play within the league, but we don’t want to be overly confident of this.    The rising scores in the steroid era clearly do not indicate a decline in the quality of play, however much we might prefer to believe that that was true.   The steroids enabled players to increase their athletic abilities at ages when these would normally have been in decline.   This may have been morally wrong, but it is not evidence of a decline in the quality of play in the game, and it should not be interpreted that way.

            It is my belief that the quality of play in the major leagues has improved steadily over time—but the Age Deviation Score does not exactly support this belief; other evidence does, but the Age Deviation Score does not.    The Age Deviation Score shows a very rapid improvement in the quality of play from 1876 to about 1900, but relatively little improvement since 1900.  The lowest score ever was posted in 1919, and matched in 1994.

            But that argument, too, should not be overstated.    The score for 1919 was the lowest ever, true, but the average score for the years 1910 to 1919 was 2.86.   The average score for the years 1990 to 1999 was 2.60.      The data for 1919-1921 is just kind of a fluke, an anomaly.   Not too much can be read into it.
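The decade comparison is easy to reproduce from the yearly figures listed earlier. The sketch below takes a simple unweighted mean of the ten yearly scores; whether the quoted averages were computed that way, or weighted by playing time, is not stated, so treat this as an approximation. The helper name `decade_mean` is invented for the example.

```python
from statistics import mean

# Yearly Age Deviation Scores as listed earlier in the article.
ads = {
    1910: 2.96, 1911: 2.89, 1912: 3.02, 1913: 3.03, 1914: 2.94,
    1915: 2.87, 1916: 2.88, 1917: 2.62, 1918: 2.66, 1919: 2.47,
    1990: 2.60, 1991: 2.68, 1992: 2.61, 1993: 2.53, 1994: 2.47,
    1995: 2.48, 1996: 2.54, 1997: 2.62, 1998: 2.73, 1999: 2.74,
}

def decade_mean(start_year):
    """Unweighted mean of the ten yearly scores beginning at start_year."""
    return mean(ads[year] for year in range(start_year, start_year + 10))

mean_1910s = decade_mean(1910)
mean_1990s = decade_mean(1990)
print(round(mean_1910s, 2), round(mean_1990s, 2))
```

On this simple reading the 1990s come out at 2.60, as quoted. The 1910s come out nearer 2.83 than 2.86, which may mean the quoted figure was weighted differently. Either way the gap between the decades is about a quarter of a point, so the point of the paragraph stands.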

 
 

COMMENTS (15 Comments, most recent shown first)

donmalcolm
On Age Deviation Scores--I see little in this data set that supports your assertion, Bill. One problem that leaps out immediately is how you've moved away from the "age center." There are far more players with 200+ PAs at age 20 than there are at 41, for example. I realize that the game is front-loaded age-wise, but it strikes me that you need something less crude as a way of modeling age distribution than what you've set up here. The biggest problem, however, is that you're writing an essay with references to data that you don't present (all the NL and AL breakouts, or the ADS values for hitters and pitchers). Also: why are batters double-weighted? At-bats and BFP are the same number; it seems that there should be equal weighting. Finally: is this about league quality, or is it getting conflated with notions of team quality? You suggest that winning teams will have less age deviation--do the team numbers support that?
8:18 PM Aug 29th
 
CharlesSaeger
Martin: No, I'm not making your point for you. A largely forgotten education in Spanish is, for conversation purposes, effectively no better than no education at all. At best, you can get to the bathroom doing that, and holding your crotch and dancing is almost as good as saying "¿Dónde está el baño?"

I'm not trying to pick a fight either, but saying your point is "common sense," without any backup is bandwagoning. And saying, "You're making my point for me," and then turning around the context of what I said isn't nice.
2:34 PM Aug 19th
 
Cooper
Total guess: many of the players from Japan are coming to the US after an American team has made a pretty big investment in that player...they are coming to the US to make it work NOW. Does the team have more pressure to make it work, thus the interpreter?
12:44 PM Aug 16th
 
wovenstrap
Well, you're making my point for me. You may not think highly of the Spanish fluency of Anglos who take Spanish classes in high school, but even a poor education half-forgotten is miles ahead of the vast number of people, nearly the entire population, who cannot speak even a single word of Japanese. And there will still be a small number of Anglos who speak decent Spanish. I don't really see any comparison here.

Anyway, I didn't want to fight -- it's common sense that Spanish is much more prevalent in our culture than Japanese, and any Spanish speaker lacking English would notice that, just as a Japanese speaker lacking English would notice the utter impossibility of finding any trace of Japanese almost anywhere in the US.
9:22 PM Aug 5th
 
evanecurb
Quality of play over time is an interesting issue to a lot of baseball fans and you are the first I have seen try to tackle it quantitatively. If I remember the commentary in your historical abstract correctly, some of the other variables to be studied as possibly correlating with quality of play included the number of double plays per DP opportunity, the number of wild pitches and passed balls, the clustering of teams in the standings and the clustering of hitters and pitchers in the individual standings. If I think about the differences between pro ball and amateur ball as an example, I think you see all of these factors improving in pro ball. I also believe that you see more walks and errors in amateur ball, but I am having trouble visualizing what some of the other variables might be. Is roster size a possible indicator?
11:32 AM Aug 5th
 
CharlesSaeger
Martin: No, I wasn't forgetting it at all; see my remark about the barrio. And as for high school Spanish, other tongues are easy to forget after years of disuse if you didn't learn them fluently. If all you can say is, "Yo quiero Dos Equis," which is how well many (likely most; I know that something like 95% of all high school French students never use French at all after that) high school Spanish students speak Spanish after high school, that is about as good as no one speaking Spanish.

Look, I do some work as a Spanish interpreter, albeit not a truly good one. I speak to folks who have been in America for years and still do not speak English well or even at all. While many don't spend much time outside of spots with many Spanish speakers, they all do spend some time, and English still baffles them. It isn't that easy for most folks. Those folks have lives, work and children about which to worry; those come first. For a baseball player, hitting a little white ball and running somewhere comes first. He might have a little more time to learn English, but again, it isn't that easy.
10:10 AM Aug 5th
 
elricsi
We should note that ADS dropped below 3 in 1893, the same year the current pitching distance was set, and a year after the final death of the AA. Which lends more credence to 1893 as the beginning of MLB.
6:17 PM Aug 4th
 
wovenstrap
Charles Saeger is forgetting that the number of Spanish-speakers in Memphis will vastly outnumber the number of Japanese-speakers. Yes, it was just a hypothetical to illustrate the .... equivalency of not speaking English, but it's sobering to realize that even if you remove the native Spanish-speakers from the pool, there are going to be literally thousands of people in Memphis who were exposed to Spanish in high school or college.

And plus Japanese is harder for an English-speaker to learn, but that was already stated. I'm not sure how this all affects Ozzie's case. On the one hand, if there are a lot of Spanish speakers in baseball, then that suggests baseball should account for this. But I keep thinking that the existential bafflement of being a Japanese player on an American team must be far greater than being a Spanish speaker.

Plus a lot of the bafflement occurs outside of the stadium -- maybe there the Spanish speakers and the Japanese speakers are on more equal footing. Hailing a taxi, dealing with hotel concierges, ordering a meal in a nice restaurant.... it's possible Spanish speakers might feel pretty at sea in our nitpicky, time-obsessed, etc. Anglo ways. But I don't know. Plus Los Angeles is a lot different from Denver in these matters.
5:46 PM Aug 4th
 
stevensoderbergh
Any interest/value in applying this to managers?
4:42 PM Aug 4th
 
enamee
Regarding the Age Deviation Scores: If you take the highest-quality NCAA conference and, say, a mediocre Division III conference, it seems likely that they would have similar scores. The Little League World Series field of teams will have the same ADS as a local YMCA league, since those leagues are segregated by age. The obvious fact about baseball in 1919 is that many, many excellent players were not in the AL or NL, but in various minor leagues. Today, if you're a really good player, you're almost certainly going to be in the major leagues.

Of course, I realize that ADS is just one component of a larger analysis of league quality. It seems to me that it should be given low weight relative to other indicators of quality, though.
4:22 PM Aug 4th
 
jwt0001
Re: Ozzie
My question would be: has he said this to anyone within the organization? This is not an original question; I think I heard either Dan Patrick or Tony Kornheiser ask it first. To me, it's a bigger issue if he's tried to improve this in his own organization and been rebuffed.
2:02 PM Aug 4th
 
Robinsong
Part III: On Ozzie - I agree that Asians are treated differently. A big factor is numbers. Every team has multiple players and often coaches who understand Spanish. Every major league city and most minor league towns have a significant Spanish-speaking population. Many Americans study Spanish or grew up speaking Spanish. The degree of isolation for a Spanish-speaking ballplayer is small. None of these statements is as true for Asians. In addition, special language accommodations for the tiny number of Asian players might be affordable; to do the same for the hundreds of Spanish-speaking players would be unaffordable. That said, baseball probably takes the cultural adjustment for Latin American players far too much for granted and would benefit from thinking and planning and identifying solutions for potential problems.
IV: I think that Age Deviation, while interesting, is an unreliable indicator of quality across generations in mature MLB. Salary increases, baseball schools in Latin countries, the fact that the best players arrive young and stay long, PEDs, and strength training are all factors that muddy the analysis. Players can to some extent control the durability of their skills after 27. Because salaries generally rise with age, an old player can lose his job to an inferior, cheaper, younger player.
12:14 PM Aug 4th
 
MattGoodrich
I can see where salary escalation might be an incentive for an older player to stick around longer. But wouldn't it also be an incentive for an owner to get rid of older, more expensive players? Are they invariably replaced by someone an equal distance from age 27?
10:55 AM Aug 4th
 
CharlesSaeger
English and Spanish have a common origin (Proto-Indo-European), and share some vocabulary (mostly due to borrowings, older ones from Latin and Norman French into English, newer ones direct from English to Spanish), but they're not mutually intelligible. Let me put it this way: if you drop a Bolivian into Memphis, he'll be only marginally less lost than a Japanese person dropped into Memphis, not counting if the Bolivian makes it into the inevitable barrio. The Japanese person might well have studied English at some point, as the Japanese school system is excellent and English is obviously a useful foreign language for him to study.
10:43 AM Aug 4th
 
Trailbzr
1. This looks like a straight tally from Retrosheet.
2. Since bad teams play AAA lineups in September, I'd want to pack in late season games.
3. Doesn't every team already have bilingual players to translate Spanish?
4. There might be a difference between low-side and high-side age variation. Baby boom population demographics could have caused the drop that began in the late 60s, while salary escalation might have caused the late 80s rise.
8:16 AM Aug 4th
 
 
©2024 Be Jolly, Inc. All Rights Reserved.