
Dating Pitchers (No, Not that Kind of Dating)

May 1, 2007
In introducing a new method, it is my habit to commence by explaining why the problem addressed is of significance, and how satisfied I am to have overcome the barriers to the solution thereof. In this particular case, modesty requires me to confess frankly that that which we are studying here is of such stupefying obscurity, bulwarked by such vast potential uselessness, that I am at a loss to explain why I am interested in it myself or to suggest why anyone else would be. Also, my method for addressing this useless and obscure problem actually doesn’t work worth a crap. Nonetheless, I have spent a solid week in January, 2007, studying this problem, and I’m going to put what I found on record. You’ve been warned.

Suppose that you look at a pitcher’s record... let us pick this record to start with:

G W L IP R ER SO BB WP H HR GS CG ERA
64 45 19 573 229 78 103 38 34 470 3 64 63 1.23


Working 573 innings, the pitcher has gone 45-19 with a 1.23 ERA. Even assuming that you don’t happen to recognize this particular record, it is fairly obvious when the record must have been posted. The pitcher in question pitched 573 innings, a usage pattern that hasn’t been common since Billy Martin died. No one has pitched 500 innings in a season since 1892, so we know immediately that this record must have been posted before 1892. Major league baseball or sort-of major league baseball began in 1876, so we have a seventeen-year window for this record, including the end points.

However, several things about this record make it fairly obvious that this season must have occurred early in that 1876-1892 period. The pitcher allowed 229 runs, 151 of them (66%) un-earned. Those kinds of un-earned run percentages occurred only in the early part of that 1876-1892 era. Also, the pitcher struck out only 103 batters in 573 innings, walking only 38. Those extraordinarily low strikeout and walk ratios, again, occurred only in the very early part of the 1876-1892 era when pitchers would pitch 500 innings in a season. We can easily see, then, that this season must have occurred between 1876 and 1879. It is George Bradley’s record in 1876.

Suppose, however, that we take a more difficult problem, like this one:

G W L IP R ER SO BB H HR GS CG ERA
40 19 10 282 107 97 165 76 247 29 38 17 3.10


The pitcher has gone 19-10 with a 3.10 ERA. This is obviously a more modern record. This pitcher made 38 starts, which puts him in a four-man rotation. The regular four-man rotations became common around World War II and remained the dominant pattern until the late 1970s, so we can date this season as probably occurring sometime between 1946 and 1978. It can’t be the 1990s, because no pitcher has worked 280 innings in a season since 1987. The pitcher threw 17 complete games. No one has thrown that many complete games, again, since 1987.

On the other end, the pitcher has allowed 29 home runs. Larry Corcoran allowed 35 home runs in 1884, but that was a one-year fluke by a pitcher pitching an obscene number of innings; no pitcher allowed as many as 29 home runs in less than 500 innings until 1930, and only three pitchers had done so before 1948. A home run total that high suggests a post-1950 season, but does not prove that the season couldn’t have occurred in the 30s or 40s.

The pitcher won 19 games but struck out only 165 batters. This suggests—but again, does not prove—that the season occurred in the 1950s, rather than the 1960s. Dick Donovan won 20 games in 1962 with only 94 strikeouts, but that was before the pitchers took over in ’63. From ’63 to ’68 the great majority of the pitchers who won 18 games or more had more than 165 strikeouts and also better ERAs than 3.10, although there were exceptions to both rules. It could be early 70s.

I’ll give you one more fact: the pitcher had six intentional walks. That dates it as post-1955, of course, because IBB don’t begin until 1955.

I am hopelessly addicted to puzzles. This is a stupid puzzle, I will grant you, because we actually know what the answer is: this was Johnny Antonelli in 1959. The challenge is not in the answer to this particular question, it is in the method. To what extent is it possible to look at a pitcher’s record and say "that is a season from 1952" or "that is a season from the 1920s"? Can this be done?

Markers, samples, residue... are there identifying markers in a pitcher’s record (any pitcher’s record) which will enable us to say with confidence when that season was posted? It is a kind of make-believe detective exercise, related to a legitimate interest. The legitimate question is, "Why and how has baseball changed over time?" We spend a lot of time with that one. A geologist can study a rock and say that the rock was formed 45 to 48 million years ago, that there was a vast lake here at the time and a dinosaur with feathers and a serious case of colitis. An environmentalist can drill ice core samples from Antarctica and tell us how much carbon dioxide was in the air in the time of Julius Caesar and what penguins eat for breakfast and how many people were driving Pontiacs in 1952. A forensic scientist can take a bloody footprint and tell you that the crime was committed by a right-handed Norwegian between 5’9" and 5’11" who keeps a gray cat named Emilio. We are not going to learn anything really cool and useful like that in this article, but I am just trying to mimic the process and see where it leads us. I didn’t find anything useful, but I will put my failures on record, and maybe the next guy will get something out of it.

I started by drawing up what I called "back border" and "front border" rules... I thought I had 159 of these rules, but I see now that I had two rules numbered 85, so I guess it’s actually 160. These are the 160 rules, numbered 1 through 159... if you value your time you might want to scan these, rather than actually reading them:
  1. IBB begin in 1955.
  2. A pitcher with more than 500 IP has to be before 1892 since all pitchers with 500 innings are between 1876 and 1892.
  3. A pitcher with more than 400 IP has to be before 1908 since all pitchers with 400 innings are before 1908.
  4. All pitchers with 390 IP are before 1915 (For purposes of this study, "before 1915" or "pre-1915" includes 1915, and "after 1890" or "post-1890" includes 1890.)
  5. All pitchers with 380 IP are before 1917.
  6. All pitchers with 360 IP are 1972 or before.
  7. All pitchers with 360 IP are either before 1920, 1946 or 1971-72.
  8. All pitchers with 350 IP are pre-1973.
  9. All pitchers with 310 IP are pre-1979.
  10. All pitchers with 300 IP are pre-1980.
  11. All pitchers with 290 IP are pre-1985.
  12. All pitchers with 280 IP are pre-1987.
  13. All pitchers with 272 IP are pre-1991.
  14. All pitchers with 70 Complete Games are 1879 to 1884.
  15. All pitchers with 60 CG are pre-1892.
  16. All pitchers with 50 CG are pre-1893.
  17. All pitchers with 45 CG are pre-1904.
  18. All pitchers with 40 CG are pre-1908.
  19. All pitchers with 35 CG are pre-1946.
  20. All pitchers with 30 CG are pre-1975.
  21. All pitchers with 25 CG are pre-1980.
  22. All pitchers with 20 CG are pre-1986.
  23. All pitchers with 15 CG are pre-1998.
  24. All pitchers with 35 Incomplete Games are post-1965. (Incomplete Games being Games Started, minus complete games)
  25. All pitchers with 30 IG are post-1956.
  26. All pitchers with 25 IG and a winning record are post-1949.
  27. All pitchers with 20 IG are post-1910.
  28. All pitchers with 15 IG and less than 440 IP are post-1906.
  29. All pitchers with 11 or more IG are post-1891.
  30. All pitchers with 50 or more saves are post-1990.
  31. All pitchers with 40 saves are post-1983.
  32. All pitchers with 30 saves are post-1965.
  33. All pitchers with 30 saves AND 100 IP are pre-1992.
  34. All pitchers with 23 or more saves are post-1949.
  35. All pitchers with 20 saves and 120 or more IP are pre-1986.
  36. All pitchers with 15 or more saves are post-1924.
  37. All pitchers with 10 or more saves are post-1911.
  38. All pitchers with 6 or more saves are post-1905.
  39. All pitchers with 47 or more HR allowed are post-1986.
  40. All pitchers with 40 HR allowed are post-1955.
  41. All pitchers with 40 HR allowed and less than 12 losses are post-1986.
  42. All pitchers with 35 HR allowed and less than 35 wins are post-1948.
  43. All pitchers with 35 HR allowed and 0 complete games are post-1997.
  44. All pitchers with 30 HR allowed and less than 35 wins are post-1934.
  45. All pitchers whose WINS times HR allowed are 1000 or more are 1884-1891.
  46. Multiply the pitcher's HR allowed times Complete Games. If total is 1500 or more the year is 1884.
  47. If the pitcher has 50 or more Wild Pitches the year is 1885 or 1886.
  48. If the pitcher has 40 or more WP the year is 1878 to 1890.
  49. If the pitcher has 31 or more WP the year is pre-1892.
  50. If the pitcher has 28 or more WP the year is pre-1914.
  51. If the pitcher allowed more than 400 runs the year is 1879 to 1883.
  52. If the pitcher allowed 350 runs the year is pre-1891.
  53. If the pitcher allowed 275 runs the year is pre-1897.
  54. If the pitcher allowed 275 runs and had 175 strikeouts OR had more strikeouts than walks, the year is pre-1892.
  55. If the pitcher allowed 275 runs and had less than 90 walks the year is pre-1886.
  56. All pitchers with 250 runs allowed are pre-1899.
  57. All pitchers with 206 runs allowed are pre-1903.
  58. All pitchers with 167 or more runs allowed are pre-1938.
  59. All pitchers with 161 or more runs allowed are pre-1979.
  60. All pitchers with 21 Games Started or more and 100% Complete Games are pre-1918.
  61. All pitchers with 21 GS and 90% CG are pre-1932.
  62. All pitchers with 20 GS and 80% CG are pre-1980.
  63. All pitchers with 20 GS and 70% CG are pre-1981.
  64. All pitchers with 20 GS and 60% CG are pre-1985.
  65. All pitchers with 20 GS and 50% CG are pre-1988.
  66. All pitchers with 20 GS and 40% CG are pre-1998.
  67. All pitchers with 10 or more CG are pre-1999.
  68. All pitchers with 20 GS or more and no CG are post-1964.
  69. All pitchers with 20 GS and 1 CG are post-1951.
  70. All pitchers with 20 GS and 2 CG are post-1936.
  71. All pitchers with 20 GS and 3 CG are post-1931.
  72. All pitchers with 20 GS and 4 CG are post-1928.
  73. All pitchers with 20 GS and 5 or 6 CG are post-1911.
  74. All pitchers with 20 GS and 7 or 8 CG are post-1908.
  75. All pitchers with 20 GS and 9 to 11 CG are post-1905.
  76. All pitchers with 20 GS and 12 CG are post-1893.
  77. All pitchers with 171 or more UR are pre-1883 (UR being Un-earned Runs).
  78. All pitchers with 150 or more UR are pre-1890.
  79. All pitchers with 140 or more UR are pre-1891.
  80. All pitchers with 100 or more UR are pre-1895.
  81. All pitchers with 91 or more UR are pre-1897.
  82. All pitchers with 72 or more UR are pre-1901.
  83. All pitchers with 64 or more UR are pre-1904.
  84. All pitchers with 54 or more UR are pre-1910.
  85. All pitchers with 42 or more UR are pre-1921.
  86. All pitchers with 40 or more UR are pre-1932.
  87. If Wins Minus Incomplete Games >50 then season is 1884 or 1885.
  88. If W Minus IG >40 then season is pre-1891.
  89. If W Minus IG >30 then season is pre-1912.
  90. If W Minus IG >25 then season is pre-1931.
  91. If W Minus IG >20 then season is pre-1952.
  92. If W Minus IG >15 then season is pre-1977.
  93. If W Minus IG < Negative 25 then season is post-1969.
  94. If pitcher has both 18 CG and 18 GF then season is between 1911 and 1923. (GF is Games Finished.)
  95. If pitcher has both 15 CG and 15 GF then season is between 1908 and 1936.
  96. If pitcher has both 13 CG and 13 GF then season is pre-1951.
  97. If pitcher has both 11 CG and 11 GF then season is pre-1972.
  98. Divide the pitcher's innings pitched by 2 and subtract strikeouts.
    If the total is 165 or more, the season is 1876.
  99. Divide the pitcher's innings pitched by 2 and subtract strikeouts.
    If the total is 150 or more, the season is pre-1880.
  100. Divide the pitcher's innings pitched by 2 and subtract strikeouts.
    If the total is 110 or more, the season is pre-1901.
  101. Divide the pitcher's innings pitched by 2 and subtract strikeouts.
    If the total is 90 or more, the season is pre-1926.
  102. Divide the pitcher's innings pitched by 2 and subtract strikeouts.
    If the total is 80 or more, the season is pre-1942.
  103. Divide the pitcher's innings pitched by 2 and subtract strikeouts.
    If the total is 65 or more, the season is pre-1945.
  104. Divide the pitcher's innings pitched by 2 and subtract strikeouts.
    If the total is 52 or more, the season is pre-1983.
  105. Divide the pitcher's innings pitched by 2 and subtract strikeouts.
    If the total is less than -200 and the pitcher pitched less than 500 innings, the season is 1965 or later.
  106. Divide the pitcher's innings pitched by 2 and subtract strikeouts.
    If the total is less than -141 and the pitcher pitched less than 380 innings, the season is 1946 or later.
  107. If the pitcher worked 90 or more games in relief, the season is 1969 or later.
  108. If the pitcher worked 75 or more games in relief, the season is 1964 or later.
  109. If the pitcher worked 70 or more games in relief, the season is 1950 or later.
  110. If the pitcher worked 65 or more games in relief, the season is 1943 or later.
  111. If the pitcher worked 60 or more games in relief, the season is 1939 or later.
  112. If the pitcher worked 50 or more games in relief, the season is 1925 or later.
  113. If the pitcher worked 45 or more games in relief, the season is 1923 or later.
  114. If the pitcher worked 40 or more games in relief, the season is 1917 or later.
  115. If the pitcher worked 35 or more games in relief, the season is 1915 or later.
  116. If the pitcher worked 30 or more games in relief, the season is 1913 or later.
  117. If the pitcher worked 25 or more games in relief, the season is 1911 or later.
  118. If the pitcher worked 20 or more games in relief, the season is 1906 or later.
  119. If the pitcher worked 15 or more games in relief, the season is 1897 or later.
  120. If the pitcher has averaged 9.10 innings per game or more, the season is pre-1890.
  121. If the pitcher has averaged 9.00 innings per game or more, the season is pre-1904.
  122. If the pitcher has averaged 8.50 innings per game or more, the season is pre-1980.
  123. If the pitcher has averaged 8.10 innings per game or more, the season is pre-1981.
  124. If the pitcher has appeared in 30 or more games and averaged 7.80 innings per game or more, the season is pre-1972.
  125. If the pitcher has averaged 0.50 innings per game or less, the season is post-1992.
  126. If the pitcher has averaged 0.60 innings per game or less, the season is post-1991.
  127. If the pitcher has appeared in 36 or more games and averaged 0.70 innings per game or less, the season is post-1991.
  128. If the pitcher has averaged 0.80 innings per game or less, the season is post-1966.
  129. If the pitcher has averaged 0.90 innings per game or less, the season is post-1963.
  130. If the pitcher has averaged 1.00 innings per game or less, the season is post-1962.
  131. If the pitcher has averaged 1.10 innings per game or less, the season is post-1956.
  132. If the pitcher has averaged 1.25 innings per game or less, the season is post-1944.
  133. If the pitcher has averaged 1.50 innings per game or less, the season is post-1940.
  134. If the pitcher has averaged 1.65 innings per game or less, the season is post-1929.
  135. If the pitcher has averaged 1.95 innings per game or less, the season is post-1925.
  136. If the pitcher has averaged 2.05 innings per game or less, the season is post-1924.
  137. If the pitcher has averaged 2.10 innings per game or less, the season is post-1922.
  138. If the pitcher has averaged 2.50 innings per game or less, the season is post-1921.
  139. If the pitcher has averaged 2.60 innings per game or less, the season is post-1915.
  140. If the pitcher has averaged 3.85 innings per game or less, the season is post-1913.
  141. If the pitcher has averaged 4.00 innings per game or less, the season is post-1910.
  142. If the pitcher has appeared in 30 or more games and averaged 4.50 innings per game or less, the season is post-1909.
  143. If the pitcher has appeared in 30 or more games and averaged 5.00 innings per game or less, the season is post-1906.
  144. If the pitcher has appeared in 30 or more games and averaged 5.50 innings per game or less, the season is post-1897.
  145. Multiply the pitcher's KP9 times his ERA times his Incomplete Games.
    If product is 1020 or higher, the season is post-1985. (KP9 being strikeouts per nine innings, of course.)
  146. Multiply the pitcher's KP9 times his ERA times his Incomplete Games.
    If product is 700 or higher, the season is post-1956.
  147. Multiply the pitcher's KP9 times his ERA times his Incomplete Games.
    If product is 600 or higher, the season is post-1932.
  148. Multiply the pitcher's KP9 times his ERA times his Incomplete Games.
    If product is 500 or higher, the season is post-1928.
  149. Multiply the pitcher's KP9 times his ERA times his Incomplete Games.
    If product is 300 or higher, the season is post-1910.
  150. Multiply the pitcher's KP9 times his ERA times his Incomplete Games.
    If product is 200 or higher, the season is post-1890.
  151. Multiply the pitcher's KP9 times his ERA times his Incomplete Games.
    If product is 100 or higher, the season is post-1883.
  152. Multiply the pitcher's KP9 times his ERA times his Incomplete Games.
    If product is 50 or higher, the season is post-1879.
  153. If the pitcher has 5 or less starts and 15 or more wins, the season is pre-1978.
  154. If the pitcher has posted 6.15 saves or more per 9 innings pitched, the season is post-1993.
  155. If the pitcher has posted 5.6 saves or more per 9 innings pitched, the season is post-1990.
  156. If the pitcher has posted 5.4 saves or more per 9 innings pitched, the season is post-1988.
  157. If the pitcher has posted 4.55 saves or more per 9 innings pitched, the season is post-1986.
  158. If the pitcher has posted 3.9 saves or more per 9 innings pitched, the season is post-1983.
  159. If the pitcher has posted 2.85 saves or more per 9 innings pitched, the season is post-1968.
I took a spreadsheet which had the season’s records of every pitcher from 1876 through 2006 who appeared in 30 or more games or pitched 100 or more innings—a total of 18,297 pitchers. I wrote these 160 rules—or actually, 159 of them—into the spreadsheet, to apply all 159 rules to each of the 18,297 pitchers. This was a deadly dull endeavor, even by the standards of the things I do for a living, which rarely involve movie stars, brassieres or gun battles. Two of the most boring days of my career, but I waded through it.
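
Many of the rules lean on statistics manufactured from the basic line (Incomplete Games, Un-earned Runs, innings per game and so on). As a rough sketch of how those could be computed outside a spreadsheet, here is one way to do it in Python. The function name and dictionary keys are made up for illustration, and "games in relief" is assumed to mean G minus GS, which the article does not actually define.

```python
def derived_stats(g, w, ip, r, er, so, gs, cg, era, sv=0):
    """Manufactured statistics used by the border rules (names are illustrative)."""
    ig = gs - cg  # Incomplete Games: Games Started minus Complete Games
    return {
        "IG": ig,
        "UR": r - er,                       # Un-earned Runs
        "relief_games": g - gs,             # assumption: every non-start is a relief game
        "ip_per_game": ip / g,              # rules 120-144
        "half_ip_minus_so": ip / 2 - so,    # rules 98-106
        "w_minus_ig": w - ig,               # rules 87-93
        "kp9_era_ig": (9 * so / ip) * era * ig,  # rules 145-152
        "saves_per_9": 9 * sv / ip,         # rules 154-159
    }

# George Bradley, 1876, from the first record shown in the article.
bradley = derived_stats(g=64, w=45, ip=573, r=229, er=78, so=103, gs=64, cg=63, era=1.23)
print(bradley["UR"], bradley["half_ip_minus_so"])  # 151 183.5
```

Half of Bradley's innings minus his strikeouts comes to 183.5, which clears the 165 threshold of Rule 98, so that rule alone places the season in 1876.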

A few of the rules above would not apply to pitchers not included in the study... for example, there may be a pitcher who averaged more than 9.1 innings per game post-1890, but pitched less than 100 innings. Most of these rules apply universally to all pitchers, but a few of them are limited to pitchers pitching 30 games or 100 innings.

To illustrate how these rules work at their best, let us take the season of Cliff Curtis, with the Boston Braves in 1910. Curtis allowed 154 runs that year, 55 of them un-earned. We know, by way of Rule 84, that no pitcher since 1910 has allowed that many un-earned runs in a season, although it was common before 1910, and thus that this season cannot be more recent than 1910. But Curtis also made 37 starts that year, and completed only 12 of them—25 incomplete games. We also know, by way of Rule 27, that no pitcher BEFORE 1910 had more than 20 incomplete starts. Curtis’ record cannot have occurred before 1910; it cannot have occurred since 1910. This record can be positively identified, by way of rules 84 and 27, as having occurred in 1910.
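
The Curtis example can be sketched in a few lines of Python. This is only an illustration of the back-border and front-border idea, not the spreadsheet itself: it carries four of the rules, the tuple layout and names are invented, and "with 20 IG" is read as "20 or more". Borders are inclusive, following the convention stated in Rule 4. Curtis's earned runs (99) are simply his 154 runs minus the 55 un-earned.

```python
FIRST_YEAR, LAST_YEAR = 1876, 2006  # span of the study

# Each rule: (label, test on the stat line, back border, front border).
RULES = [
    ("Rule 27", lambda s: s["GS"] - s["CG"] >= 20, 1910, LAST_YEAR),
    ("Rule 29", lambda s: s["GS"] - s["CG"] >= 11, 1891, LAST_YEAR),
    ("Rule 58", lambda s: s["R"] >= 167, FIRST_YEAR, 1938),
    ("Rule 84", lambda s: s["R"] - s["ER"] >= 54, FIRST_YEAR, 1910),
]

def date_window(season):
    """Return (back border, front border, labels of the rules that fired)."""
    back, front = FIRST_YEAR, LAST_YEAR
    fired = []
    for label, applies, earliest, latest in RULES:
        if applies(season):
            back = max(back, earliest)   # latest of the "post-" borders
            front = min(front, latest)   # earliest of the "pre-" borders
            fired.append(label)
    return back, front, fired

curtis_1910 = {"GS": 37, "CG": 12, "R": 154, "ER": 99}
back, front, fired = date_window(curtis_1910)
print(back, front, fired)  # 1910 1910 ['Rule 27', 'Rule 29', 'Rule 84']
```

The window is just the intersection of every rule that fires, so adding more rules can only narrow it.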

Almost every season can be dated to some extent. Only 102 of the 18,297 seasons in the study (about one-half of one percent) are not subject to any of these 159 rules. But having applied 159 rules to 18,297 seasons, I am sorry to report that there are remarkably few cases in which the rules actually work the way they are supposed to. I wasn’t expecting to be able to nail down seasons precisely, but I thought I would get a lot of seasons narrowed down to a 3-to-10 year stretch of time. In fact, I wound up with only 77 seasons pinned down to a ten-year window. I regard this, frankly, as an abject failure. I would have guessed that the funny-looking numbers of the 1880s, by themselves, would clearly identify more than 77 seasons. When you add in the 70-game, 45-inning seasons which only occur beginning in the 1990s, the 60-inning, 30-save seasons of modern relievers... I would have thought, honestly, that we would have been able to pretty well lock in on thousands of seasons.

I didn’t implement Rule 7, from the list above, which carves out multiple time frames for exceptional innings pitched totals. I couldn’t figure out how to implement that in a spreadsheet within a reasonable amount of space, and the 1946 exception only applies to Bob Feller, who pitched 371 innings that year. It’s not really kosher to make rules targeted at an individual player, or even a very small group of players. Obviously I could make a rule saying that if a pitcher has 348 strikeouts and 153 walks it has to be 1946, and in this way "identify" every season. There are a couple of the other rules that are iffy in this way as well.
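
For anyone picking this up outside a spreadsheet, a rule with several separate time frames is less awkward if each rule returns a set of permitted years and the sets are intersected. The sketch below is one possible representation, not something done in the study; the function names are invented, "with 360 IP" is read as "360 or more", and "before 1920" is taken to include 1920 per the convention in Rule 4.

```python
ALL_YEARS = set(range(1876, 2007))  # 1876 through 2006

def rule_7(ip):
    """360 IP: before 1920, or 1946, or 1971-72."""
    if ip >= 360:
        return set(range(1876, 1921)) | {1946, 1971, 1972}
    return ALL_YEARS

def rule_106(ip, so):
    """IP/2 minus SO below -141 with under 380 IP: 1946 or later."""
    if ip / 2 - so < -141 and ip < 380:
        return {year for year in ALL_YEARS if year >= 1946}
    return ALL_YEARS

# Bob Feller, 1946: 371 innings and 348 strikeouts, as cited in the text.
feller_years = rule_7(371) & rule_106(371, 348)
print(sorted(feller_years))  # [1946, 1971, 1972]
```

With sets, the gap between 1920 and 1946 costs nothing to express, although the result is no longer a single window with one midpoint.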

It may be OK to make rules that apply to only one player, if it’s part of a legitimate pattern. Rule 116, for example (the rule saying that if a pitcher pitched 30 games in relief the season is post-1913) probably distinguishes only one or two pitchers from Rule 115, which is similar, but that’s OK. The relief pitcher was coming into existence, little by little. A series of rules that track that emergence of a new phenomenon are very different from a rule that is intended to uniquely identify a season.

Some of the rules, as you can see, are contrived. Again, I don’t think there’s a problem with that. There is no prohibition in science against measuring things by complicated formulas with somewhat arbitrary combinations of elements. Sometimes I attempted to combine the "sorting power" of two trends into one rule by manufacturing a statistic. That’s legitimate. The only problem is, I just didn’t make it work very well.

I formed "year estimates" for all seasons by taking the midpoint of the back border and the front border for each season. The average error of these estimates was an astonishing 14 years.
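
In code, the estimate and its error might look like the sketch below. It assumes "average error" means the mean absolute difference between the estimate and the true year, which the article does not spell out. The second window in the example is hypothetical, chosen only to show the arithmetic.

```python
def midpoint_estimate(back, front):
    """Year estimate: midpoint of the back border and the front border."""
    return (back + front) / 2

def average_error(windows, actual_years):
    """Mean absolute error of the midpoint estimates (assumed definition)."""
    errors = [abs(midpoint_estimate(back, front) - year)
              for (back, front), year in zip(windows, actual_years)]
    return sum(errors) / len(errors)

windows = [
    (1910, 1910),  # Cliff Curtis, pinned exactly by rules 84 and 27
    (1955, 2006),  # hypothetical: a 1976 season bounded only by the IBB rule
]
print(average_error(windows, [1910, 1976]))  # 2.25
```

A wide window hurts most when the true year sits near one of its edges, since the midpoint is then far from the answer.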

This effort attempted to place a season in time by absolute rules; this season, because of this stat or this combination of stats, HAS to be within a specific time frame. That failing, I turned my attention from "clear proof" to "indications". Let’s take this season:
G W L IP R ER SO BB IBB WP HBP BK H HR GS CG SHO ERA
36 14 17 240 112 112 142 90 8 4 8 5 255 14 36 8 4 4.19

There are intentional walks in the record, which means that it’s post-1955, but let’s ignore that for the moment, and pretend that there is no clear proof as to when the season occurred. There are, however, many indications. The near-.500 record with an ERA over four strongly indicates that it isn’t 1960s or dead ball era. Thirty-one decisions and four shutouts strongly suggest that it isn’t 1990s or post-2000. Thirty-six games, all of them starts but only eight complete games, strongly indicate that the record is post-1960. The 112 runs, all of them earned, would also strongly suggest that the season was modern (it was, in fact, the first season ever in which a pitcher allowed 100+ runs, all of them earned.) The season is Dick Ruthven, 1976, but how can we sum up the "indications" as to a season’s time frame to make an estimate of when the season occurred?

I made up a second system to place a record in time, and I will describe that system in general terms, but not in specifics, which I am allowed to skip because I confessed earlier that the method doesn’t work very well anyway... any of you could take the same concept and make a better method if you took the time. I wrote up an explanation of my system, but it’s four pages of technical gibberish, and, since the method doesn’t work well, it’s not worth it. But in general terms, this is what I did.

There are many things in a record which can be taken to indicate "early" or "late". Complete games have gone down steadily over time, thus complete games suggest "early". Un-earned runs suggest "early". Incomplete games, relief appearances and home runs allowed suggest "late". Saves suggest "late".

The first thing I did to make a "second time spot estimate", then, was to simply add up the things in the pitcher’s record that suggest "early", which I called Early, and the things that suggest "late", which I called Late.

One of the most reliable truths about baseball history is that innings pitched per game have gone down over time, both by starters and by relievers. Starters in the 1880s pitched 9 innings a game, now they pitch 5 or 6. Early relievers pitched 3 innings a game; now they pitch 1 or less.

You can estimate the innings a pitcher will pitch in a season by multiplying his starts by some number—let’s say 7—and his relief appearances by some lower number—let’s say 1.25. Early in baseball history virtually all pitchers exceeded this expected number of innings pitched; now, virtually all pitchers fall short of it. You can make this rule even more reliably true by modifying the multiplier for starts so that pitchers with low ERAs are expected to pitch more innings per game than pitchers with high ERAs.
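
As a small sketch of that estimate, using the "let’s say" multipliers of 7 and 1.25 from the paragraph above: the function names are invented, relief appearances are assumed to be G minus GS, and the ERA adjustment is left out because its form is not given.

```python
def expected_innings(games, starts, per_start=7.0, per_relief=1.25):
    """Expected IP from starts and relief appearances (multipliers are the 'let's say' values)."""
    relief = games - starts  # assumption: every non-start is a relief appearance
    return per_start * starts + per_relief * relief

def workload_ratio(ip, games, starts):
    """Above 1 means more innings than expected, below 1 means fewer."""
    return ip / expected_innings(games, starts)

print(round(workload_ratio(573, 64, 64), 3))  # George Bradley, 1876: 1.279
print(round(workload_ratio(282, 40, 38), 3))  # Johnny Antonelli, 1959: 1.05
print(round(workload_ratio(240, 36, 36), 3))  # Dick Ruthven, 1976: 0.952
```

The three seasons quoted in this article line up in chronological order on this ratio, which is the pattern the paragraph describes.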

We can make another estimate of where the player falls on the time line, then, by looking at whether he is over or under his expected innings pitched. If he pitches more innings than expected, that’s "Early"; if he pitches fewer, that’s "Late". To adjust for this, I increased the "Early" indicators by multiplying them by Innings Pitched, and dividing by Expected Innings Pitched. Then the "Early to Late" ratio suggests where the pitcher falls along the 130-year timeline from 1876 to 2006.
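
Since the specifics are being skipped, the following is only a guess at the general shape, not the actual system. It adds the named indicators with equal weight (the real weights are not given), uses the 7 and 1.25 multipliers for expected innings, assumes relief appearances are G minus GS, and stops at the ratio because the step that turns the ratio into a year is not described.

```python
def expected_innings(games, starts, per_start=7.0, per_relief=1.25):
    return per_start * starts + per_relief * (games - starts)

def early_late_ratio(g, gs, cg, ip, r, er, hr, sv):
    """Unweighted Early/Late ratio; the equal weighting is an assumption."""
    early = cg + (r - er)                  # complete games + un-earned runs
    late = (gs - cg) + (g - gs) + hr + sv  # incomplete games, relief games, HR, saves
    early_adjusted = early * ip / expected_innings(g, gs)
    return early_adjusted / late if late else float("inf")

# Both pitchers started every game they appeared in, so saves are necessarily 0.
bradley = early_late_ratio(g=64, gs=64, cg=63, ip=573, r=229, er=78, hr=3, sv=0)
ruthven = early_late_ratio(g=36, gs=36, cg=8, ip=240, r=112, er=112, hr=14, sv=0)
print(round(bradley, 1), round(ruthven, 2))  # 68.4 0.18
```

Bradley's 1876 line and Ruthven's 1976 line land at opposite ends of the scale, but two seasons a century apart say nothing about how well the ratio separates neighboring decades.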

Not very reliably, it turns out. That system, like the other one, has an average error of 14 years. When I put the two systems together by adding them together and dividing by two, it reduced the average error to 11.83 years.

The best thing about this study is that I didn’t get any government funding to do it. If I had, I’m afraid I’d wind up in jail. If you take an interest in the subject, feel free to pick up what I’ve done and move the ball.

Bill James
Ft. Myers, Florida
March 25, 2007
 
 

COMMENTS (3 Comments, most recent shown first)

mvandermast
The previous post reminds me of the 1982 Baseball Abstract, and the problem of the white space...
4:18 PM Jan 4th
 
mvandermast
The joy of re-reading past articles...

Maybe you (or another interested party) could use Similarity Scores (or possibly another similarity metric) to get better estimates? For example, you could see how similar each season is to the typical pitcher-season in each decade. Or something along those lines.
4:16 PM Jan 4th
 
jriddlesperger
I would say that I think that your goal is probably a little too ambitious. What I think you probably want is to be able to get something like 99%--or 95%--of seasons to a window roughly 10% the size of baseball history. So: what percent of pitchers' seasons come back within a 13-season window? (Oh, and I realize that because of the number of teams, the window should actually be smaller, probably significantly smaller).
6:51 PM Jun 12th
 
 