The Revenge of Matty Alou
I have been working ahead of you all in the Runs Saved series, meaning that the stuff I am publishing is several days behind the stuff I am working on at the moment. I am working ahead of you, but publishing stuff faster than I can create it, so you are catching up with me—a situation familiar to any teacher who has ever been required to teach a subject that he or she doesn’t know well. At one point I was several weeks ahead of you; now, I’m maybe three days ahead.
Anyway, a question came in to "Hey, Bill" on Wednesday morning that I didn’t know the answer to but knew how to get the answer to, so I thought I would take a day off from the Runs Saved series to address it. The question had to do with Matty Alou hitting .330 every year in a league in which batting averages were very low. The essence of the question was, "Matty Alou hit .332 in a league in which Bob Gibson had a 1.12 ERA, so what would he hit in a normal league?"
The answer, by the way, is .356. But give us a minute.
Dallas Adams invented a process to answer that question in 1977 or 1978; actually Dallas took a formula that I had invented—the Log5 method—and figured out how to generalize this formula so that it applied to a wider range of issues, thus giving it much more value. Also, some guy on the SABR Statistical Analysis Boards says that this process is the same thing as the Bradley-Terry process which was first published some time in the 1950s; I can’t figure out whether he is right or wrong about that, but I will note it in any case.
THIS METHOD HAS TREMENDOUS UNUSED POTENTIAL IN ANALYZING SPORTS. I cannot stress this enough. A few examples:
If you are creating a simulation, such as a Table Game or other simulation, you need to know "If a .328 hitter is facing a pitcher who gives up a .206 batting average in a league in which the average is .262, what is the resulting batting average? The resulting On Base percentage? What are the resulting frequencies of walks, doubles, triples, homers, or strikeouts?"
This method is how you can calculate that, and get the answer right. It’s really the ONLY method by which you can calculate that, and get the answer right.
What if they are playing in different leagues: the batter hits .328 in a league in which the batting average is .270, and the pitcher limits batters to a .249 average in a league in which the average is .256? What will be the outcomes of the confrontation?
This method tells you the correct answers.
Adjusting parks. Let us suppose that the batting average for the season is .252 in Dodger Stadium and .282 in Colorado, and let us suppose that a player hits .321 in Colorado. What will he hit in Dodger Stadium?
This is the method that you use to answer that question, and all questions like that. But it goes well beyond that. Let’s suppose that you are dealing with two candidates to make the NCAA basketball tournament, one of which is 25-7 and the other 19-13, but against a much tougher schedule. Which is actually the better won-lost record?
This is the method you would use to study that question.
Suppose that a player hits .285 in Double-A, but in conditions in which the environmental batting average is .261. Suppose that he moves to Triple-A, but in Triple-A he plays in conditions under which the environmental batting average is .284. What will he hit?
This is the method that you would use to study that question.
It is a method that SHOULD BE used regularly by sports analysts, but, because it is a little bit confusing, a little bit awkward, it isn’t. That’s why I am writing this article, in response to the Hey, Bill question about Matty Alou. It relates to an extremely significant issue.
OK, we start with the Log5 method. A Log5 is a number which represents the relative strength of any other number, when compared to .500. The Log5 of any number less than .500 is less than .500 and less than the number itself, and the Log5 of any number greater than .500 is greater than .500 and greater than the number itself. It is not actually a complicated system.
The issue here is that to solve many of these problems that I have talked about, you have to figure the Log5 of a Log5. That gets hairy. But I’ll walk you through it. Basically, we are only using two formulas here. We use them over and over, but we’re just using two formulas.
If our starting point number is X, then the first formula, for the Log5 of X, is
  Log5 of X = X / (2 * (1-X))
If X is .600, then the Log5 of X is .750. We call it 750, rather than .750, because it is easier and doesn’t make any difference. And the other formula is just A / (A + B). Just two really simple formulas; that’s all that it is. It’s just using these two simple formulas again and again that makes it seem complicated.
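The two formulas can be sketched in a few lines of Python. This is only an illustration; the function names `log5` and `compare` are made up here and are not part of the original method.

```python
def log5(x: float) -> float:
    """Log5 of x, per the first formula: x / (2 * (1 - x))."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must be at least 0 and less than 1")
    return x / (2.0 * (1.0 - x))


def compare(a: float, b: float) -> float:
    """Relative strength of a against b, per the second formula: a / (a + b)."""
    return a / (a + b)


print(round(log5(0.600), 3))                        # 0.75
print(round(compare(log5(0.280), log5(0.250)), 3))  # 0.538
```

The second call compares the Log5 of a .280 average to the Log5 of a .250 average, which is the kind of comparison the process repeats several times.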
This process takes 15 lines. For the purpose of simple illustration, I’ll start with a simpler question: If a player hits .280 in a league in which the batting average is .250, what would be the equivalent batting average if the league batting average were .260? Lines 1 and 2 are just the player’s batting average, and the league batting average:
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Lines 3 and 4 are the Log5s of these two numbers:
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Line 3 | Log5 Batting Average | .194
Line 4 | Log5 League Batting Average | .167
And Line 5 is a simple arithmetic comparison of Line 3 to Line 4:
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Line 3 | Log5 Batting Average | .194
Line 4 | Log5 League Batting Average | .167
Line 5 | Comparison of Line 3 to Line 4 | .538
Because .280 is higher than .250, this results in a figure larger than .500, but it can never result in a number lower than zero or higher than 1.000. On Line 6 we introduce the alternative batting average that we want to see—in this case, .260:
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Line 3 | Log5 Batting Average | .194
Line 4 | Log5 League Batting Average | .167
Line 5 | Comparison of Line 3 to Line 4 | .538
Line 6 | Alternate League Batting Average | .260
And on Line 7 we figure the Log5 of that:
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Line 3 | Log5 Batting Average | .194
Line 4 | Log5 League Batting Average | .167
Line 5 | Comparison of Line 3 to Line 4 | .538
Line 6 | Alternate League Batting Average | .260
Line 7 | Log5 Alternate League Batting Average | .176
Then we compare Line 4 in this process to Line 7 by the other formula, a simple comparison:
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Line 3 | Log5 Batting Average | .194
Line 4 | Log5 League Batting Average | .167
Line 5 | Comparison of Line 3 to Line 4 | .538
Line 6 | Alternate League Batting Average | .260
Line 7 | Log5 Alternate League Batting Average | .176
Line 8 | Comparison of Line 4 and Line 7 | .487
Line 8 is less than .500 because .250 is less than .260. Because .250 is less than .260, the Log5 of .250 (which appears on Line 4) is less than the Log5 of .260 (which appears on Line 7); therefore, Line 8 is less than .500. It’s basically saying that .260 is a stronger number than .250. ".487" represents the relative strength of .250 to .260.
We now have a number which represents the relative strength of .280 to .250 (.538), and a number which represents the relative strength of .250 to .260 (.487). What we need to do now is triangulate these two numbers so that they give us the relative strength of .280 to .260. We do that on Lines 9, 10 and 11:
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Line 3 | Log5 Batting Average | .194
Line 4 | Log5 League Batting Average | .167
Line 5 | Comparison of Line 3 to Line 4 | .538
Line 6 | Alternate League Batting Average | .260
Line 7 | Log5 Alternate League Batting Average | .176
Line 8 | Comparison of Line 4 and Line 7 | .487
Line 9 | Log5 of Line 5 | .583
Line 10 | Log5 of Line 8 | .474
Line 11 | Comparison of Line 9 to Line 10 | .552
What that .552 really means is "When a .280 batting average is compared to a .250 batting average and a .260 league is compared to a .250 league, the .280 batting average is stronger than the .260 league average, but not by as much as it is stronger than the .250 league average."
If the league batting average is .250, the hitters succeed (in that respect) 25% of the time. That means that the PITCHERS succeed 75% of the time. We put that number on Line 12, to represent the league’s typical pitcher:
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Line 3 | Log5 Batting Average | .194
Line 4 | Log5 League Batting Average | .167
Line 5 | Comparison of Line 3 to Line 4 | .538
Line 6 | Alternate League Batting Average | .260
Line 7 | Log5 Alternate League Batting Average | .176
Line 8 | Comparison of Line 4 and Line 7 | .487
Line 9 | Log5 of Line 5 | .583
Line 10 | Log5 of Line 8 | .474
Line 11 | Comparison of Line 9 to Line 10 | .552
Line 12 | One minus League Batting Average | .750
Then, in Lines 13 and 14, we take the Log5 of the derivative figure from Line 11 (.552) and the Log5 of the pitchers’ success percentage from Line 12 (.750):
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Line 3 | Log5 Batting Average | .194
Line 4 | Log5 League Batting Average | .167
Line 5 | Comparison of Line 3 to Line 4 | .538
Line 6 | Alternate League Batting Average | .260
Line 7 | Log5 Alternate League Batting Average | .176
Line 8 | Comparison of Line 4 and Line 7 | .487
Line 9 | Log5 of Line 5 | .583
Line 10 | Log5 of Line 8 | .474
Line 11 | Comparison of Line 9 to Line 10 | .552
Line 12 | One minus League Batting Average | .750
Line 13 | Log5 of Line 11 | .615
Line 14 | Log5 of Line 12 | 1.500
I’m not really sure why we do that; Dallas Adams figured all of this out, and I just stumble around in the dark until I can remember what he did. I don’t always remember why it is done that way. Anyway, we’re almost done. Now we make a simple comparison of Line 13 to Line 14:
Line 1 | Batting Average | .280
Line 2 | League Batting Average | .250
Line 3 | Log5 Batting Average | .194
Line 4 | Log5 League Batting Average | .167
Line 5 | Comparison of Line 3 to Line 4 | .538
Line 6 | Alternate League Batting Average | .260
Line 7 | Log5 Alternate League Batting Average | .176
Line 8 | Comparison of Line 4 and Line 7 | .487
Line 9 | Log5 of Line 5 | .583
Line 10 | Log5 of Line 8 | .474
Line 11 | Comparison of Line 9 to Line 10 | .552
Line 12 | One minus League Batting Average | .750
Line 13 | Log5 of Line 11 | .615
Line 14 | Log5 of Line 12 | 1.500
Line 15 | Comparison of Line 13 to Line 14 | .291
And that’s our answer: If a player hits .280 in a .250 league, he would hit .291 in a .260 league of the same quality. Which makes sense, when you think about it; .280 is 12% higher than .250, and .291 is 12% higher than .260. What we’re really doing is just increasing the player’s batting average in proportion to the change in the league batting average, EXCEPT that if we did that without limitation then a .900 hitter would hit more than 1.000, which isn’t possible. The process is just bending the lines very gradually so that we keep the averages in the realm of the possible.
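The 15 lines can be run end to end in a short Python sketch. This is an illustration under stated assumptions rather than a definitive implementation: the names `log5`, `compare`, `translate` and `odds_shortcut` are made up here, and the shortcut at the end is an algebraic rearrangement offered for comparison, not part of the 15-line process itself.

```python
def log5(x: float) -> float:
    """First formula: x / (2 * (1 - x))."""
    return x / (2.0 * (1.0 - x))


def compare(a: float, b: float) -> float:
    """Second formula: a / (a + b)."""
    return a / (a + b)


def translate(avg: float, league: float, alt_league: float) -> dict:
    """Return all 15 lines of the process, keyed by line number."""
    line = {}
    line[1] = avg                           # player's batting average
    line[2] = league                        # league batting average
    line[3] = log5(line[1])
    line[4] = log5(line[2])
    line[5] = compare(line[3], line[4])
    line[6] = alt_league                    # alternate league batting average
    line[7] = log5(line[6])
    line[8] = compare(line[4], line[7])
    line[9] = log5(line[5])
    line[10] = log5(line[8])
    line[11] = compare(line[9], line[10])
    line[12] = 1.0 - line[2]                # pitchers' success rate
    line[13] = log5(line[11])
    line[14] = log5(line[12])
    line[15] = compare(line[13], line[14])  # translated batting average
    return line


def odds_shortcut(avg: float, league: float, alt_league: float) -> float:
    """Assumed equivalent: (player odds / league odds) * alternate-league odds."""
    def odds(p: float) -> float:
        return p / (1.0 - p)
    o = odds(avg) / odds(league) * odds(alt_league)
    return o / (1.0 + o)


steps = translate(0.280, 0.250, 0.260)
for n in range(1, 16):
    print(f"Line {n:2d}: {steps[n]:.3f}")
print(f"Shortcut: {odds_shortcut(0.280, 0.250, 0.260):.3f}")  # 0.291, same as Line 15
```

Writing each Log5 as half of the odds X / (1 - X) suggests one way to read Lines 12 through 14: Line 11 carries the league's odds in its denominator twice, and comparing it against the Log5 of one minus the league average cancels one of them, leaving the player's odds divided by the league's odds, times the alternate league's odds. That reading is an observation about the arithmetic, not a statement of how Dallas Adams reasoned about it.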
OK, let’s do Matty Alou, 1968. In 1968 the league batting average was .243, so these are the first five lines of the process:
Line 1 | Batting Average | .332
Line 2 | League Batting Average | .243
Line 3 | Log5 Batting Average | .249
Line 4 | Log5 League Batting Average | .161
Line 5 | Comparison of Line 3 to Line 4 | .608
.332 compared to .243 is much higher than .280 compared to .250:
Line 1 | Batting Average | .332 | .280
Line 2 | League Batting Average | .243 | .250
Line 3 | Log5 Batting Average | .249 | .194
Line 4 | Log5 League Batting Average | .161 | .167
Line 5 | Comparison of Line 3 to Line 4 | .608 | .538
Now we need to decide what is a "typical" league batting average. Let us say it is .263. I’m not sure what the major league batting average since 1900 is, but it’s within a point or two of .263, so we enter that:
Line 1 | Batting Average | .332
Line 2 | League Batting Average | .243
Line 3 | Log5 Batting Average | .249
Line 4 | Log5 League Batting Average | .161
Line 5 | Comparison of Line 3 to Line 4 | .608
Line 6 | Alternate League Batting Average | .263
And then we just let the process run:
Line 1 | Batting Average | .332
Line 2 | League Batting Average | .243
Line 3 | Log5 Batting Average | .249
Line 4 | Log5 League Batting Average | .161
Line 5 | Comparison of Line 3 to Line 4 | .608
Line 6 | Alternate League Batting Average | .263
Line 7 | Log5 Alternate League Batting Average | .178
Line 8 | Comparison of Line 4 and Line 7 | .474
Line 9 | Log5 of Line 5 | .774
Line 10 | Log5 of Line 8 | .450
Line 11 | Comparison of Line 9 to Line 10 | .633
Line 12 | One minus League Batting Average | .757
Line 13 | Log5 of Line 11 | .861
Line 14 | Log5 of Line 12 | 1.558
Line 15 | Comparison of Line 13 to Line 14 | .356
Matty Alou’s .332 batting average in 1968 is equivalent to a .356 batting average in a historically normal season. I started to figure the Normalized batting average for every batting champion in history:
Player | Season | Avg | Lg Average | Adjusted Average
Matty Alou | 1968 | .332 | .243 | .356
Honus Wagner | 1900 NL | .381 | .279 | .362
Nap Lajoie | 1901 AL | .426 | .277 | .409
Jesse Burkett | 1901 NL | .376 | .267 | .371
Ed Delahanty | 1902 AL | .376 | .275 | .362
Ginger Beaumont | 1902 NL | .357 | .259 | .362
Nap Lajoie | 1903 AL | .344 | .255 | .353
Honus Wagner | 1903 NL | .355 | .269 | .348
But that was too much work, so I backed off to figuring the adjusted batting average for everybody who beat the league batting average by 100 points or more:
Player | Season | Avg | Lg Average | Adjusted Average
Cy Seymour | 1905 NL | .377 | .255 | .387
George Stone | 1906 AL | .358 | .249 | .375
Ty Cobb | 1907 AL | .350 | .247 | .369
Honus Wagner | 1907 NL | .350 | .243 | .374
Honus Wagner | 1908 NL | .354 | .230 | .396
Ty Cobb | 1909 AL | .377 | .244 | .401
Ty Cobb | 1910 AL | .383 | .243 | .408
Nap Lajoie | 1910 AL | .384 | .243 | .409
So we see here that, normalized to the league batting average, Nap Lajoie’s .384 average in 1910 is actually more impressive than his .426 average in 1901. Not that there is any consensus about what Lajoie’s batting average was in either 1901 or 1910. Ty Cobb’s batting average normalizes to .408 in 1910, 1911 and 1912:
Player | Season | Avg | Lg Average | Adjusted Average
Ty Cobb | 1911 AL | .420 | .273 | .408
Joe Jackson | 1911 AL | .408 | .273 | .396
Sam Crawford | 1911 AL | .378 | .273 | .366
Ty Cobb | 1912 AL | .410 | .265 | .408
Joe Jackson | 1912 AL | .395 | .265 | .393
Tris Speaker | 1912 AL | .383 | .265 | .381
Nap Lajoie | 1912 AL | .368 | .265 | .366
Heine Zimmerman | 1913 NL | .372 | .272 | .361
Ty Cobb | 1913 AL | .390 | .256 | .399
Joe Jackson | 1913 AL | .373 | .256 | .382
Tris Speaker | 1913 AL | .363 | .256 | .371
Ty Cobb | 1914 AL | .368 | .248 | .387
Benny Kauff | 1914 FL | .370 | .263 | .370
Ty Cobb | 1915 AL | .369 | .248 | .388
Tris Speaker | 1916 AL | .386 | .248 | .405
Ty Cobb | 1916 AL | .371 | .248 | .390
Ty Cobb | 1917 AL | .383 | .248 | .402
George Sisler | 1917 AL | .352 | .248 | .370
Tris Speaker | 1917 AL | .332 | .248 | .350
Ty Cobb | 1918 AL | .382 | .254 | .393
Ty Cobb | 1919 AL | .384 | .268 | .378
George Sisler | 1920 AL | .407 | .284 | .382
Tris Speaker | 1920 AL | .332 | .284 | .309
Rogers Hornsby | 1920 NL | .370 | .270 | .362
After Ty Cobb, .402 in 1917, the next normalized .400 hitter is Rogers Hornsby in 1924:
Player | Season | Avg | Lg Average | Adjusted Average
Harry Heilmann | 1921 AL | .394 | .292 | .360
Rogers Hornsby | 1921 NL | .397 | .389 | .270
George Sisler | 1922 AL | .420 | .285 | .393
Ty Cobb | 1922 AL | .401 | .285 | .375
Rogers Hornsby | 1922 NL | .401 | .292 | .367
Harry Heilmann | 1923 AL | .403 | .282 | .380
Babe Ruth | 1923 AL | .393 | .282 | .370
Rogers Hornsby | 1924 NL | .424 | .283 | .400
I’ll include Bill Terry in 1930 because he hit .400, even though he actually was not 100 points better-than-league:
Player | Season | Avg | Lg Average | Adjusted Average
Harry Heilmann | 1925 AL | .393 | .292 | .359
Rogers Hornsby | 1925 NL | .403 | .292 | .369
Harry Heilmann | 1927 AL | .398 | .285 | .372
Al Simmons | 1927 AL | .392 | .285 | .366
Rogers Hornsby | 1928 NL | .387 | .281 | .366
Lefty O'Doul | 1929 NL | .398 | .294 | .362
Bill Terry | 1930 NL | .401 | .303 | .355
Al Simmons | 1931 AL | .390 | .278 | .372
After Rogers Hornsby in 1924 the next normalized .400 hitter—and the last normalized .400 hitter—is Ted Williams in 1941:
Player | Season | Avg | Lg Average | Adjusted Average
Chuck Klein | 1933 NL | .368 | .266 | .364
Arky Vaughan | 1935 NL | .385 | .277 | .368
Joe Medwick | 1937 NL | .374 | .272 | .363
Joe DiMaggio | 1939 AL | .381 | .279 | .362
Ted Williams | 1941 AL | .406 | .266 | .402
Stan Musial | 1946 NL | .365 | .256 | .373
Ted Williams | 1948 AL | .369 | .266 | .365
Stan Musial | 1948 NL | .376 | .261 | .378
Normalized to a consistent league batting average, though, Ted Williams in 1957 is only 4 points behind Ted Williams in 1941:
Player | Season | Avg | Lg Average | Adjusted Average
Ted Williams | 1957 AL | .388 | .255 | .398
Mickey Mantle | 1957 AL | .365 | .255 | .375
Harvey Kuenn | 1959 AL | .353 | .253 | .365
Norm Cash | 1961 AL | .365 | .256 | .373
Roberto Clemente | 1967 NL | .357 | .249 | .374
Rico Carty | 1970 NL | .366 | .258 | .372
Joe Torre | 1971 NL | .363 | .252 | .376
Rod Carew | 1974 AL | .364 | .258 | .370
Rod Carew | 1975 AL | .359 | .258 | .365
Rod Carew | 1977 AL | .388 | .266 | .384
George Brett | 1980 AL | .390 | .269 | .383
Wade Boggs | 1985 AL | .368 | .261 | .370
Willie McGee | 1985 NL | .353 | .252 | .366
Tony Gwynn | 1987 NL | .370 | .261 | .372
Wade Boggs | 1988 AL | .366 | .259 | .371
Andres Galarraga | 1993 NL | .370 | .264 | .369
Tony Gwynn | 1994 NL | .394 | .267 | .389
Jeff Bagwell | 1994 NL | .368 | .267 | .363
Tony Gwynn | 1995 NL | .368 | .263 | .368
Tony Gwynn | 1997 NL | .372 | .263 | .372
Larry Walker | 1998 NL | .364 | .262 | .365
Larry Walker | 1999 NL | .379 | .268 | .373
Todd Helton | 2000 NL | .372 | .266 | .368
Barry Bonds | 2002 NL | .370 | .259 | .375
Nobody has done this since Chipper Jones in 2008, but Mookie just missed it by three points, so I’ll include him:
Player | Season | Avg | Lg Average | Adjusted Average
Ichiro Suzuki | 2004 AL | .372 | .270 | .364
Chipper Jones | 2008 NL | .364 | .260 | .368
Mookie Betts | 2018 | .346 | .249 | .363
The .356 adjusted batting average in 1968 was the highest of Matty Alou’s career. Matty never played in a league with a batting average as high as .263; .262, yes, but not .263. His batting average would adjust upward for every season and partial season of his career. His career batting average would adjust to .319:
Lg | YEAR | AB | H | Avg | Lg Average | Adjusted Average | Adjusted Hits
NL | 1960 | 3 | 1 | .333 | .255 | .342 | 1.0
NL | 1961 | 200 | 62 | .310 | .262 | .311 | 62.2
NL | 1962 | 195 | 57 | .292 | .261 | .294 | 57.3
NL | 1963 | 76 | 11 | .145 | .245 | .157 | 11.9
NL | 1964 | 250 | 66 | .264 | .254 | .273 | 68.3
NL | 1965 | 324 | 75 | .231 | .249 | .244 | 79.1
NL | 1966 | 535 | 183 | .342 | .256 | .350 | 187.3
NL | 1967 | 550 | 186 | .338 | .249 | .354 | 194.7
NL | 1968 | 558 | 185 | .332 | .243 | .356 | 198.6
NL | 1969 | 698 | 231 | .331 | .250 | .346 | 241.5
NL | 1970 | 677 | 201 | .297 | .258 | .302 | 204.5
NL | 1971 | 609 | 192 | .315 | .252 | .328 | 199.8
NL | 1972 | 404 | 127 | .314 | .248 | .331 | 133.7
AL | 1972 | 121 | 34 | .281 | .239 | .308 | 37.3
AL | 1973 | 497 | 147 | .296 | .259 | .300 | 149.1
NL | 1973 | 11 | 3 | .273 | .254 | .282 | 3.1
NL | 1974 | 81 | 16 | .198 | .258 | .202 | 16.4
Total | | 5789 | 1777 | .307 | | .319 | 1845.7
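The career line can be rebuilt from the season rows in the table above. The sketch below is an illustration, not the worksheet behind the table. It assumes the odds form of the adjustment (player odds divided by league odds, times the odds of the .263 target), which works out to the same answer as the 15-line process. It also assumes the rounded averages exactly as printed. `SEASONS`, `TARGET` and `adjust` are names made up for the example, and season-level adjusted hits can differ from the table by a few tenths because of rounding.

```python
TARGET = 0.263  # the "typical" league batting average chosen in the article

# (league, year, at-bats, batting average, league batting average), as printed
SEASONS = [
    ("NL", 1960, 3, 0.333, 0.255),
    ("NL", 1961, 200, 0.310, 0.262),
    ("NL", 1962, 195, 0.292, 0.261),
    ("NL", 1963, 76, 0.145, 0.245),
    ("NL", 1964, 250, 0.264, 0.254),
    ("NL", 1965, 324, 0.231, 0.249),
    ("NL", 1966, 535, 0.342, 0.256),
    ("NL", 1967, 550, 0.338, 0.249),
    ("NL", 1968, 558, 0.332, 0.243),
    ("NL", 1969, 698, 0.331, 0.250),
    ("NL", 1970, 677, 0.297, 0.258),
    ("NL", 1971, 609, 0.315, 0.252),
    ("NL", 1972, 404, 0.314, 0.248),
    ("AL", 1972, 121, 0.281, 0.239),
    ("AL", 1973, 497, 0.296, 0.259),
    ("NL", 1973, 11, 0.273, 0.254),
    ("NL", 1974, 81, 0.198, 0.258),
]


def adjust(avg: float, league: float, target: float = TARGET) -> float:
    """Translate avg from a league hitting `league` to one hitting `target`."""
    def odds(p: float) -> float:
        return p / (1.0 - p)
    o = odds(avg) / odds(league) * odds(target)
    return o / (1.0 + o)


total_ab = sum(ab for _, _, ab, _, _ in SEASONS)
adjusted_hits = sum(ab * adjust(avg, lg) for _, _, ab, avg, lg in SEASONS)
career = adjusted_hits / total_ab  # career adjusted average, about .319

print(total_ab)          # 5789
print(f"{career:.3f}")   # 0.319
```

Weighting each season's adjusted average by its at-bats is what turns the season rows into adjusted hits, so the career figure is total adjusted hits over total at-bats rather than a simple average of the seasons.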
Thanks for reading.
Bill