How Reliable are Won-Lost Records?
Part III
In one start, it is nearly impossible for a pitcher’s True Winning Percentage to match his record-book Winning Percentage. After one start, a starting pitcher’s winning percentage is either .000 or 1.000 (following the tradition of listing a pitcher who is 0-0 as having a .000 winning percentage). A pitcher’s one-game True Winning Percentage cannot exceed .950 since, no matter how well the starting pitcher pitches, the other team may also pitch a shutout until the game wanders beyond the control of even the strongest starting pitcher. If a pitcher pitches well, then, there is at least a 50-point (.050) margin between his winning percentage and his True Winning Percentage. It IS possible for a starting pitcher to pitch so badly that his team’s chance of winning is near zero; possible, but not common.
There are 4,877 pitchers in my data who made at least one major league start, of whom only 3 pitched so badly that their True Winning Percentage after one start was within 10 points (.010) of zero. That is about one in 1,600, which will show in the charts later as 0%.
After two starts, a pitcher’s record-book winning percentage has three possible landing spots: 1.000, .500, and .000. After three starts, there are five possible landing spots: 1.000, .667, .500, .333 and .000. (.500 is possible, and common, because most pitchers do not have three decisions after three career starts.)
As the number of possible winning percentages increases, the number of pitchers who have about the same True Winning Percentage as record-book Winning Percentage increases. After one start, the average gap between Winning Percentage and True Winning Percentage is 369 points (.369). After two starts it drops to 327 points; after three starts, to 272 points. After four starts, the average gap between a pitcher’s Winning Percentage and his True Winning Percentage is down to .229; after five starts, down to .198.
After one start, zero percent of starting pitchers have winning percentages within 10 points of their True Winning Percentage. After two starts, it is 1%; after three starts, 2%; after four starts, 3%; and after five starts, a slightly higher 3%.
Winning percentages within 100 points of True Winning Percentage are, of course, more common. After one start, 5% of pitchers have winning percentages within 100 points of their True Winning Percentage; after two starts, 13% do; after three starts, 23%; after four starts, 29%; and after five starts, 33%. After 438 starts, 100% of pitchers have winning percentages within 100 points of their True Winning Percentage.
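The count of landing spots can be checked by brute force. The following is only a minimal sketch, not anything taken from the study: it assumes a pitcher can earn at most one decision per start, it follows the convention of listing a pitcher with no decisions at .000, and the function names are made up for illustration.

```python
from fractions import Fraction


def landing_spots(starts):
    """Return the sorted distinct winning percentages possible after `starts` starts.

    Assumption: at most one decision (win or loss) per start.
    """
    spots = {Fraction(0)}  # a pitcher with no decisions (0-0) is listed at .000
    for decisions in range(1, starts + 1):
        for wins in range(decisions + 1):
            # Fraction reduces automatically, so 1/2 and 2/4 count as one spot
            spots.add(Fraction(wins, decisions))
    return sorted(spots)


def as_record_book(pct):
    """Format a Fraction the way a record book would, e.g. .667 or 1.000."""
    text = f"{float(pct):.3f}"
    return text if pct == 1 else text[1:]


for n in (1, 2, 3):
    print(n, [as_record_book(p) for p in landing_spots(n)])
```

Under those assumptions the sketch gives two landing spots after one start, three after two, and five after three, matching the counts in the text.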
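For readers who want to see how figures of this kind could be produced, here is a rough Python sketch. It is not the code behind the study. The sample records are invented purely for illustration, the function name is hypothetical, and treating "within N points" as "a gap of N points or less" is an assumption, since the text does not say how ties at the boundary are handled.

```python
def gap_summary(records, thresholds=(10, 100)):
    """Summarize the gap between Winning Percentage and True Winning Percentage.

    records: list of (winning_pct, true_winning_pct) pairs, each between 0 and 1.
    Returns the average absolute gap and, for each threshold in points
    (thousandths), the share of pitchers whose gap is that size or smaller.
    """
    gaps = [abs(w - t) for w, t in records]
    average_gap = sum(gaps) / len(gaps)
    shares = {}
    for points in thresholds:
        limit = points / 1000
        # tiny tolerance so floating-point noise does not exclude a boundary case
        within = sum(1 for g in gaps if g <= limit + 1e-9)
        shares[points] = within / len(gaps)
    return average_gap, shares


# Invented one-start records: (record-book pct, True Winning Percentage)
sample = [(1.000, 0.700), (0.000, 0.450), (0.000, 0.080), (1.000, 0.930)]
avg, shares = gap_summary(sample)
print(f"average gap: {avg:.3f}")
for points, share in shares.items():
    print(f"within {points} points: {share:.0%}")
```

Running the same summary on every pitcher's record after one start, two starts, and so on would give one row of the chart per number of starts.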
Does True Winning Percentage predict a pitcher’s future Winning Percentage better than Wins and Losses predict themselves? I would be shocked if the answer is "No", but I did not actually study that question. It would seem to me intuitively that the convergence of True Winning Percentage and record book winning percentage as starts increase is nearly proof that True Winning Percentage is the dominant hand, but it just didn’t occur to me to actually study that while I was doing the work.
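Anyway, this chart states the facts I have already given you (and more). The numbers I have already given you are the ones in the Discrepancy Average, 10 and 100 columns for one through five starts; I am just trying to teach you to read the chart. Each numbered column shows the percentage of pitchers whose Winning Percentage is within that many points of their True Winning Percentage: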
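| Starts | Discrepancy Average | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | .369 | 0% | 0% | 0% | 0% | 1% | 1% | 2% | 3% | 4% | 5% |
| 2 | .327 | 1% | 3% | 4% | 5% | 7% | 8% | 9% | 10% | 12% | 13% |
| 3 | .272 | 2% | 5% | 7% | 10% | 12% | 14% | 16% | 19% | 21% | 23% |
| 4 | .229 | 3% | 6% | 9% | 12% | 15% | 17% | 20% | 23% | 26% | 29% |
| 5 | .198 | 3% | 7% | 11% | 14% | 17% | 20% | 23% | 26% | 29% | 33% |
| 6 | .180 | 4% | 8% | 11% | 15% | 18% | 22% | 25% | 29% | 33% | 35% |
| 7 | .162 | 4% | 8% | 12% | 16% | 20% | 23% | 28% | 31% | 35% | 38% |
| 8 | .149 | 4% | 8% | 12% | 17% | 21% | 25% | 29% | 33% | 37% | 41% |
| 9 | .141 | 5% | 9% | 14% | 18% | 22% | 26% | 30% | 34% | 39% | 42% |
| 10 | .133 | 5% | 9% | 14% | 20% | 24% | 29% | 33% | 37% | 41% | 44% |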
After ten starts the average margin between Winning Percentage and True Winning Percentage is 133 points, and 44% of pitchers are within 100 points of their True Winning Percentage.
The chart below extends the one above out to 450 starts. There are 57 pitchers in my data who made 450 or more major league starts; above 450 starts, we don’t have enough data for the percentages to be meaningful. There were 237 pitchers in the data who made 300 or more starts, and the data holds together fairly well up to that point, but above 300 starts the sample of pitchers gets smaller and the numbers become less consistent.
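| Starts | Discrepancy Average | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | .369 | 0% | 0% | 0% | 0% | 1% | 1% | 2% | 3% | 4% | 5% |
| 10 | .133 | 5% | 9% | 14% | 20% | 24% | 29% | 33% | 37% | 41% | 44% |
| 20 | .095 | 7% | 15% | 21% | 28% | 34% | 39% | 45% | 49% | 55% | 60% |
| 30 | .080 | 9% | 17% | 24% | 31% | 39% | 45% | 52% | 57% | 62% | 68% |
| 40 | .073 | 9% | 17% | 25% | 34% | 41% | 49% | 56% | 62% | 68% | 73% |
| 50 | .064 | 11% | 22% | 30% | 39% | 47% | 55% | 62% | 68% | 74% | 79% |
| 75 | .056 | 13% | 23% | 33% | 43% | 52% | 61% | 68% | 75% | 81% | 86% |
| 100 | .048 | 14% | 27% | 41% | 51% | 60% | 68% | 75% | 82% | 87% | 90% |
| 150 | .041 | 18% | 32% | 46% | 58% | 68% | 77% | 82% | 87% | 92% | 94% |
| 200 | .037 | 17% | 37% | 50% | 64% | 73% | 80% | 86% | 89% | 93% | 97% |
| 250 | .035 | 19% | 37% | 52% | 67% | 74% | 83% | 88% | 92% | 95% | 98% |
| 300 | .032 | 22% | 38% | 57% | 70% | 80% | 87% | 92% | 96% | 97% | 99% |
| 350 | .032 | 21% | 42% | 55% | 68% | 79% | 86% | 93% | 97% | 99% | 99% |
| 400 | .031 | 24% | 38% | 56% | 74% | 82% | 89% | 90% | 95% | 98% | 99% |
| 450 | .034 | 18% | 30% | 54% | 63% | 82% | 86% | 89% | 96% | 100% | 100% |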
And this chart extends the chart above sideways to 250 points (meaning a discrepancy of .250 or less between Winning Percentage and True Winning Percentage).
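| Starts | 110 | 120 | 130 | 140 | 150 | 200 | 250 |
|---|---|---|---|---|---|---|---|
| 1 | 6% | 8% | 9% | 10% | 12% | 21% | 28% |
| 10 | 48% | 52% | 55% | 59% | 63% | 77% | 85% |
| 20 | 64% | 69% | 72% | 76% | 79% | 91% | 95% |
| 30 | 72% | 76% | 81% | 84% | 87% | 96% | 98% |
| 40 | 77% | 81% | 85% | 88% | 91% | 98% | 99% |
| 50 | 83% | 86% | 89% | 91% | 94% | 99% | 100% |
| 75 | 89% | 92% | 94% | 96% | 97% | 100% | 100% |
| 100 | 93% | 95% | 97% | 97% | 99% | 100% | 100% |
| 150 | 96% | 97% | 99% | 99% | 100% | 100% | 100% |
| 200 | 98% | 99% | 99% | 99% | 100% | 100% | 100% |
| 250 | 99% | 99% | 99% | 100% | 100% | 100% | 100% |
| 300 | 99% | 99% | 99% | 100% | 100% | 100% | 100% |
| 350 | 99% | 99% | 99% | 99% | 100% | 100% | 100% |
| 400 | 99% | 99% | 99% | 99% | 100% | 100% | 100% |
| 450 | 100% | 100% | 100% | 100% | 100% | 100% | 100% |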
By 50 starts, all pitchers have winning percentages within 250 points of their True Winning Percentage, but that is a very wide net. That just means that if a pitcher has a Winning Percentage of .600 after 50 starts, his True Winning Percentage has been somewhere between .350 and .850.
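The arithmetic behind that range is simple enough to spell out. This is a small illustrative sketch with hypothetical function names; it works in points (thousandths) to avoid floating-point rounding and clamps the result to the range .000 to 1.000, which is an added assumption rather than something stated in the text.

```python
def true_pct_range(winning_pct_points, margin_points):
    """Range of True Winning Percentages consistent with a given margin.

    Both arguments are in points (thousandths), e.g. 600 for .600.
    The result is clamped to the possible range of 0 to 1000 points.
    """
    low = max(0, winning_pct_points - margin_points)
    high = min(1000, winning_pct_points + margin_points)
    return low, high


def fmt(points):
    """Show points in record-book style, e.g. 350 -> .350, 1000 -> 1.000."""
    return f"{points / 1000:.3f}".lstrip("0")


low, high = true_pct_range(600, 250)
print(f"{fmt(low)} to {fmt(high)}")  # .350 to .850
```

A band 500 points wide covers everything from a poor pitcher to a great one, which is why the 250-point column says so little.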
So how do we generalize about this, in the English language? It’s arbitrary, but let me suggest this. If the average gap between Winning Percentage and True Winning Percentage were .000, then won-lost records could be said to be 100% reliable. For each .001 of separation between them, we could say that there is a 1% drop in reliability, so that an average gap of .100 or more means 0% reliability.
If you agree to that definition, then won-lost records are completely uninformative (100% unreliable) through 17 starts.
Through 35 starts, which we could say is one season, won-lost records are 24% reliable, 76% distorted by events beyond the pitcher’s control.
Through 100 career starts, we could say that won-lost records are 52% reliable, 48% distorted. The point at which the 50% mark is crossed is 93 starts.
Through 200 career starts, won-lost records could be said to be 63% reliable.
Through 300 career starts, which we could say is a full career, won-lost records could be said to be 68% reliable.
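The reliability figures above follow directly from the Discrepancy Average column of the chart. As a minimal sketch of that definition (the function name is hypothetical; the gaps are the ones listed in the chart for 10, 20, 100, 200 and 300 starts):

```python
def reliability_pct(average_gap):
    """Reliability under the suggested definition.

    100% at an average gap of .000, minus 1% for each .001 of gap,
    with a floor of 0% once the gap reaches .100 or more.
    """
    gap_points = round(average_gap * 1000)  # convert .048 to 48 points
    return max(0, 100 - gap_points)


# Average gaps taken from the Discrepancy Average column of the chart
average_gap_by_starts = {10: 0.133, 20: 0.095, 100: 0.048, 200: 0.037, 300: 0.032}

for starts, gap in average_gap_by_starts.items():
    pct = reliability_pct(gap)
    print(f"{starts:>3} starts: {pct}% reliable, {100 - pct}% distorted")
```

The rows for 100, 200 and 300 starts come out at 52%, 63% and 68%, the same figures given in the text, and the 10-start row bottoms out at 0%.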
Although the reliability would continue to increase above 300 starts, the data samples are too small to draw any conclusions. It is apparent, however, that for won-lost records to become 90% reliable as a reflection of how well the pitcher has pitched would take much, much, much longer than any pitcher’s career. 80%, you can argue
In the fourth article of this series, posted tomorrow, we’ll talk about individual pitchers: the pitchers with the highest True Winning Percentages, the pitchers whose record-book winning percentages are most inflated, the pitchers who were better or worse than their won-lost record shows, etc.