Recent Game Vs. Full Season Performance For Starting Pitchers
Hey, Bill: If you were choosing a pitcher to start a critical game, say a one-game playoff or game 7 of a series, would you be more likely to choose the pitcher on your team who has the highest pitcher score or the pitcher who has pitched best over the past four or five starts? Or would you use other criteria or a combination of criteria? --Flying Fish
The general principle is that more information is usually better than less, but I decided to study the specific question. I took my database of games, 1950 to 2014, and looked at how well one game by a starting pitcher predicts the next, how well the previous TWO games predict the next, three games, four games, etc., up to 30 games. To begin with, I took all game lines in the data (a little more than 240,000 lines) and marked each by the career start number. Then I eliminated the first 30 starts from each pitcher’s career, so that pitchers who hadn’t been around for a full season wouldn’t be included in the study, since they aren’t relevant to the Fish’s question. This left 179,844 game lines in the data.
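For readers who want to try this on their own data, the preparation step can be sketched roughly as follows. This is a minimal sketch, not the code used for the study: it assumes each game line is a dict with hypothetical "pitcher" and "date" keys, and that dates sort chronologically.

```python
from collections import defaultdict


def number_career_starts(game_lines):
    """Tag each game line with the pitcher's career start number (1-based).

    Assumes hypothetical keys "pitcher" and "date"; dates must sort
    chronologically within each pitcher.
    """
    counts = defaultdict(int)
    numbered = []
    for line in sorted(game_lines, key=lambda g: (g["pitcher"], g["date"])):
        counts[line["pitcher"]] += 1
        numbered.append({**line, "start_no": counts[line["pitcher"]]})
    return numbered


def drop_early_starts(numbered, min_prior_starts=30):
    """Keep only starts that have at least `min_prior_starts` starts before them."""
    return [g for g in numbered if g["start_no"] > min_prior_starts]
```

Dropping the first 30 starts also guarantees that every remaining game line has a full 30 earlier starts to look back on.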
Sort those, first, by the pitcher’s performance (Game Score) in his previous start. In the top 10% there are 17,984 games. The pitchers who had pitched best in their previous starts went 7,486-6,467 with a 3.55 ERA, while those who had pitched worst in the previous start went 6,328-6,680 with a 4.23 ERA:
Group | Games | Won | Lost | WPct | A GS | IP | H | BB | K | ERA
A1 | 17984 | 7486 | 6467 | .537 | 54.41 | 121970.2 | 115123 | 37427 | 83635 | 3.55
B1 | 17984 | 7156 | 6369 | .529 | 52.71 | 117362.2 | 114433 | 36767 | 77687 | 3.77
C1 | 17984 | 6977 | 6350 | .524 | 52.30 | 116155.2 | 113245 | 36995 | 75307 | 3.84
D1 | 17985 | 6949 | 6449 | .519 | 51.70 | 114741.0 | 113241 | 37350 | 73653 | 3.92
E1 | 17985 | 6743 | 6512 | .509 | 51.11 | 113219.2 | 112942 | 37565 | 70972 | 3.99
F1 | 17985 | 6587 | 6591 | .500 | 50.60 | 112008.1 | 112979 | 38008 | 70029 | 4.08
G1 | 17985 | 6540 | 6620 | .497 | 50.71 | 111756.1 | 112287 | 37516 | 69249 | 4.05
H1 | 17984 | 6389 | 6646 | .490 | 50.17 | 110327.0 | 111832 | 37987 | 67241 | 4.13
I1 | 17984 | 6305 | 6743 | .483 | 49.91 | 109769.1 | 111560 | 38259 | 65320 | 4.17
J1 | 17984 | 6328 | 6680 | .486 | 49.62 | 109961.0 | 112767 | 38240 | 66426 | 4.23
Group A1 is the pitchers who had pitched best (Group A) in their previous ONE (1) start; A2 is those who had pitched best in their previous two starts, etc. I have more data than this, but the additional data doesn’t fit. The "A1" group pitchers pitched 4,406 complete games including 1,049 shutouts, and the won-lost record of their TEAMS, as opposed to the pitchers themselves, was 9,589-8,376 (19 ties).
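The grouping behind a chart like this can be sketched as below. It is a rough illustration under assumptions: each game line is a dict with hypothetical keys "win", "loss", "outs" and "er", and any leftover lines go to the first groups, which is not necessarily how the study split them.

```python
def split_into_groups(lines, key, n_groups=10):
    """Sort best-first by `key` and cut into n_groups nearly equal groups."""
    ordered = sorted(lines, key=key, reverse=True)
    size, extra = divmod(len(ordered), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        # The first `extra` groups take one additional line each.
        end = start + size + (1 if i < extra else 0)
        groups.append(ordered[start:end])
        start = end
    return groups


def summarize(group):
    """Won-lost percentage and ERA for one group of game lines."""
    wins = sum(g["win"] for g in group)
    losses = sum(g["loss"] for g in group)
    outs = sum(g["outs"] for g in group)
    earned_runs = sum(g["er"] for g in group)
    decisions = wins + losses
    return {
        "games": len(group),
        "wpct": wins / decisions if decisions else None,
        # ERA is 9 * ER / IP, and IP is outs / 3, so ERA = 27 * ER / outs.
        "era": 27 * earned_runs / outs if outs else None,
    }
```

As a check on the chart, the A1 group's 7,486 wins and 6,467 losses give 7486 / 13953, which rounds to the .537 shown.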
Anyway, we can see that the previous start predicts the next one to a limited extent, and in the next chart we can see that the extent to which previous performance predicts next-start performance increases when we use the previous two starts, rather than one:
Group | Games | Won | Lost | WPct | A GS | IP | H | BB | K | ERA
A2 | 17984 | 7781 | 6243 | .555 | 55.30 | 123183.2 | 114253 | 36920 | 87266 | 3.44
B2 | 17984 | 7287 | 6395 | .533 | 53.14 | 118501.2 | 114329 | 37111 | 78824 | 3.71
C2 | 17984 | 7164 | 6272 | .533 | 52.67 | 116776.1 | 112992 | 37085 | 76150 | 3.77
D2 | 17985 | 6816 | 6436 | .514 | 51.62 | 114422.1 | 113256 | 37241 | 72794 | 3.92
E2 | 17985 | 6608 | 6712 | .496 | 50.92 | 113539.2 | 114229 | 37603 | 70928 | 4.03
F2 | 17985 | 6686 | 6531 | .506 | 51.01 | 112891.1 | 112852 | 37627 | 69890 | 3.99
G2 | 17985 | 6500 | 6627 | .495 | 50.14 | 111236.1 | 113010 | 38075 | 68153 | 4.16
H2 | 17984 | 6401 | 6785 | .485 | 50.08 | 110421.1 | 111862 | 38094 | 66765 | 4.15
I2 | 17984 | 6188 | 6705 | .480 | 49.52 | 108860.2 | 111736 | 38115 | 65035 | 4.24
J2 | 17984 | 6029 | 6721 | .473 | 48.84 | 107438.1 | 111890 | 38243 | 63714 | 4.36
A GS is "Average Game Score". The predictive power of the previous games increases again when we go from two starts to three:
Group | Games | Won | Lost | WPct | A GS | IP | H | BB | K | ERA
A3 | 17984 | 7869 | 6204 | .559 | 55.86 | 124107.1 | 113545 | 37284 | 89691 | 3.38
B3 | 17984 | 7390 | 6255 | .542 | 53.61 | 119265.1 | 113821 | 37350 | 80069 | 3.64
C3 | 17984 | 6958 | 6495 | .517 | 52.40 | 116823.0 | 114391 | 37156 | 76130 | 3.81
D3 | 17985 | 7045 | 6397 | .524 | 51.99 | 115486.1 | 113406 | 37210 | 73051 | 3.85
E3 | 17985 | 6777 | 6512 | .510 | 51.21 | 113605.0 | 113380 | 37277 | 71642 | 3.98
F3 | 17985 | 6612 | 6571 | .502 | 50.75 | 112537.1 | 113134 | 37376 | 69182 | 4.05
G3 | 17985 | 6386 | 6677 | .489 | 50.28 | 111315.0 | 112666 | 37867 | 67623 | 4.12
H3 | 17984 | 6369 | 6648 | .489 | 49.88 | 109982.1 | 112125 | 38208 | 66094 | 4.17
I3 | 17984 | 6229 | 6771 | .479 | 49.11 | 108340.2 | 112354 | 38153 | 63935 | 4.30
J3 | 17984 | 5825 | 6897 | .458 | 48.14 | 105809.1 | 111587 | 38233 | 62102 | 4.50
Whereas the 10% of starting pitchers who had pitched well in their previous ONE start had a .537 winning percentage and a 3.55 ERA, the 10% of pitchers who had pitched well in their previous THREE starts had a .559 winning percentage and a 3.38 ERA, and whereas the 10% of pitchers who had pitched most badly in their previous one start had a .486 winning percentage and a 4.23 ERA, those who had pitched most poorly in their previous three starts had a .458 winning percentage and a 4.50 ERA. More information is better; the predictive power increases.
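The sort key changes from one chart to the next: the Game Score of the previous start, then the average of the previous two, then three, and so on. A minimal sketch of that trailing average, assuming one pitcher's Game Scores are held in a plain list in chronological order (the function name is hypothetical):

```python
def trailing_average(scores, n):
    """For each start, the mean Game Score of the n starts before it.

    Returns None for starts that have fewer than n earlier starts.
    """
    averages = []
    window_sum = 0
    for i, score in enumerate(scores):
        # window_sum holds the sum of the n scores immediately before start i.
        averages.append(window_sum / n if i >= n else None)
        window_sum += score
        if i >= n:
            window_sum -= scores[i - n]
    return averages
```

With n=1 this reduces to "the previous start", which is the key used for the first chart.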
This continues to be true as we add more starts—up to at least 15 starts. Since really only the "A" group and the "J" group are interesting, I will trim the charts to just those:
Group | Games | Won | Lost | WPct | A GS | IP | H | BB | K | ERA
A1 | 17984 | 7486 | 6467 | .537 | 54.41 | 121970.2 | 115123 | 37427 | 83635 | 3.55
A2 | 17984 | 7781 | 6243 | .555 | 55.30 | 123183.2 | 114253 | 36920 | 87266 | 3.44
A3 | 17984 | 7869 | 6204 | .559 | 55.86 | 124107.1 | 113545 | 37284 | 89691 | 3.38
A4 | 17984 | 8014 | 6035 | .570 | 56.41 | 125072.0 | 113309 | 37386 | 91922 | 3.31
A5 | 17984 | 8061 | 6022 | .572 | 56.77 | 125538.0 | 112768 | 37321 | 93237 | 3.27
A6 | 17984 | 8114 | 6000 | .575 | 57.01 | 126016.1 | 112622 | 37200 | 94439 | 3.25
A7 | 17984 | 8170 | 5929 | .579 | 57.09 | 126194.1 | 112588 | 37416 | 95276 | 3.25
A8 | 17984 | 8124 | 5941 | .578 | 57.20 | 126371.0 | 112532 | 37477 | 95882 | 3.23
A9 | 17984 | 8186 | 5934 | .580 | 57.37 | 126682.0 | 112420 | 37635 | 96889 | 3.22
A10 | 17984 | 8254 | 5938 | .582 | 57.49 | 126822.0 | 112245 | 37546 | 97149 | 3.20
A11 | 17984 | 8264 | 5908 | .583 | 57.73 | 127260.2 | 112098 | 37690 | 97819 | 3.17
A12 | 17984 | 8295 | 5923 | .583 | 57.80 | 127486.2 | 112287 | 37584 | 98498 | 3.16
A13 | 17984 | 8295 | 5914 | .584 | 57.84 | 127482.0 | 112197 | 37622 | 98673 | 3.16
A14 | 17984 | 8357 | 5902 | .586 | 58.00 | 127778.0 | 112032 | 37652 | 99013 | 3.14
A15 | 17984 | 8382 | 5887 | .587 | 58.05 | 127938.1 | 111912 | 37835 | 99256 | 3.14
In this chart, then, we can see that the predictive power of the previous starts increases when more starts are considered; that is, the winning percentage of the "best" pitchers improves, and the ERA declines, as more starts are considered, up to 15. Also, when we look at the worst pitchers, they continue to get worse:
Group | Games | Won | Lost | WPct | A GS | IP | H | BB | K | ERA
J1 | 17984 | 6328 | 6680 | .486 | 49.62 | 109961.0 | 112767 | 38240 | 66426 | 4.23
J2 | 17984 | 6029 | 6721 | .473 | 48.84 | 107438.1 | 111890 | 38243 | 63714 | 4.36
J3 | 17984 | 5825 | 6897 | .458 | 48.14 | 105809.1 | 111587 | 38233 | 62102 | 4.50
J4 | 17984 | 5803 | 6943 | .455 | 47.86 | 105401.1 | 112056 | 38155 | 60987 | 4.54
J5 | 17984 | 5778 | 6896 | .456 | 47.56 | 104591.2 | 111897 | 38231 | 60107 | 4.60
J6 | 17984 | 5730 | 6924 | .453 | 47.35 | 104018.0 | 111640 | 38230 | 59474 | 4.64
J7 | 17984 | 5712 | 6970 | .450 | 47.16 | 103810.1 | 112012 | 38418 | 58909 | 4.67
J8 | 17984 | 5640 | 7022 | .445 | 46.98 | 103305.2 | 111984 | 38149 | 58341 | 4.71
J9 | 17984 | 5552 | 7024 | .441 | 46.78 | 103156.0 | 112183 | 38484 | 58200 | 4.76
J10 | 17984 | 5536 | 7112 | .438 | 46.73 | 103181.2 | 112333 | 38342 | 57968 | 4.76
J11 | 17984 | 5559 | 7064 | .440 | 46.69 | 103097.0 | 112483 | 38366 | 57902 | 4.76
J12 | 17984 | 5534 | 7125 | .437 | 46.56 | 102845.2 | 112468 | 38373 | 57762 | 4.80
J13 | 17984 | 5503 | 7123 | .436 | 46.51 | 102689.2 | 112374 | 38380 | 57411 | 4.81
J14 | 17984 | 5528 | 7088 | .438 | 46.42 | 102539.0 | 112545 | 38428 | 57449 | 4.83
J15 | 17984 | 5556 | 7122 | .438 | 46.31 | 102351.1 | 112764 | 38354 | 57250 | 4.85
The predictive power of the last six starts is twice the predictive power of the last one start, but the predictive power of the last 15 starts is only 20% greater than the predictive power of the last six starts. The predictive value of each additional start is less than the predictive value of the previous one.
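One way to put numbers on "predictive power" is the gap between the best and worst tenths. The article does not say which measure was used, so the gap is an assumption here; the figures are copied from the A and J rows of the charts.

```python
# (WPct, ERA) for the best (A) and worst (J) tenths, keyed by the number
# of previous starts considered. Values copied from the charts.
best = {1: (0.537, 3.55), 6: (0.575, 3.25), 15: (0.587, 3.14)}
worst = {1: (0.486, 4.23), 6: (0.453, 4.64), 15: (0.438, 4.85)}


def era_gap(n):
    """ERA of the worst tenth minus ERA of the best tenth."""
    return worst[n][1] - best[n][1]


def wpct_gap(n):
    """Winning percentage of the best tenth minus that of the worst tenth."""
    return best[n][0] - worst[n][0]


print(round(era_gap(6) / era_gap(1), 2))    # 2.04
print(round(era_gap(15) / era_gap(6), 2))   # 1.23
print(round(wpct_gap(6) / wpct_gap(1), 2))  # 2.39
print(round(wpct_gap(15) / wpct_gap(6), 2)) # 1.22
```

By the ERA gap, six starts carry about twice the separation of one start, and 15 starts add a bit over 20% more; the winning-percentage gap tells a similar story.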
After 15 to 20 starts, the charts flatten out to such an extent that it is difficult to say with confidence that any additional gains are being made. There are two other things that are happening, beyond the natural law of diminishing returns. Since a pitcher only makes about 30 starts in a season, after 15 starts about half of the "old" starts we are adding into the data are from the previous season. It is likely that last year’s data has less predictive value than this year’s data, even if this year’s data is from three months ago.
Also, there is a technical issue with using the Game Score, unadjusted, as the indicator of how well the pitcher has pitched, since the Average Game Score by a pitcher is higher in 1968 than in the steroid era. This is a small effect, and it isn’t an actual problem in the part of the study where we are measuring noticeable effects, but as the effects being measured grow smaller, the technical issue becomes more significant relative to the effects being measured. As a consequence of these things, after 15 starts some measures go one way and some go another, and it is unclear whether we’re actually gaining any more useful information or not. This chart compares the data for the last five starts with the data for the last ten, 15, 20, 25 or 30:
Group | Games | Won | Lost | WPct | A GS | IP | H | BB | K | ERA
A5 | 17984 | 8061 | 6022 | .572 | 56.77 | 125538.0 | 112768 | 37321 | 93237 | 3.27
A10 | 17984 | 8254 | 5938 | .582 | 57.49 | 126822.0 | 112245 | 37546 | 97149 | 3.20
A15 | 17984 | 8382 | 5887 | .587 | 58.05 | 127938.1 | 111912 | 37835 | 99256 | 3.14
A20 | 17984 | 8338 | 5858 | .587 | 58.15 | 127927.1 | 111756 | 37984 | 100165 | 3.12
A25 | 17984 | 8329 | 5937 | .584 | 58.19 | 128074.2 | 111867 | 37721 | 100592 | 3.13
A30 | 17984 | 8393 | 5917 | .587 | 58.31 | 128204.1 | 111756 | 37738 | 100877 | 3.11

Group | Games | Won | Lost | WPct | A GS | IP | H | BB | K | ERA
J5 | 17984 | 5778 | 6896 | .456 | 47.56 | 104591.2 | 111897 | 38231 | 60107 | 4.60
J10 | 17984 | 5536 | 7112 | .438 | 46.73 | 103181.2 | 112333 | 38342 | 57968 | 4.76
J15 | 17984 | 5556 | 7122 | .438 | 46.31 | 102351.1 | 112764 | 38354 | 57250 | 4.85
J20 | 17984 | 5552 | 7111 | .438 | 46.43 | 102754.2 | 112716 | 38087 | 57482 | 4.83
J25 | 17984 | 5522 | 7034 | .440 | 46.27 | 102351.0 | 112901 | 38091 | 57087 | 4.86
J30 | 17984 | 5501 | 7060 | .438 | 46.19 | 102273.1 | 112943 | 38122 | 56993 | 4.87
Probably the data continues to gain predictive significance after 15+ starts have passed, but the gains are very small and we cannot be certain that they are real.
So the answer to your question, at this point, is that the last year’s data is a better predictor of pitcher performance than the last five games, but that one should not ignore the last five games, either. Suppose that a pitcher has pitched moderately well over the last 30 games, but extremely poorly over the last five? Then the short-term effects might be more important than the longer-term effects, and perhaps the answer is that one should go with the hot hand.
Or not.
I did one more study. Suppose that we compare a pitcher who has been pitching well lately but has not pitched well over a longer term with a pitcher who has pitched well over his last 30 starts, but has pitched poorly over his last five starts?
I figured for each pitcher his G5 – G30; that is, his average Game Score over his last 5 starts minus his average Game Score over his last 30 starts, but with the additional qualification that to be in the top group the pitcher must have genuinely pitched poorly over his last 30 starts, and to be in the bottom group the pitcher must have genuinely pitched poorly over his last 5 starts.
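In code, the G5 – G30 figure and the selection of the two extreme groups might look like the sketch below. The article does not give the threshold for "genuinely pitched poorly", so `poor_cutoff` is a made-up placeholder, and the dict keys "g5" and "g30" are hypothetical.

```python
def recent_minus_long(prior_scores, short_n=5, long_n=30):
    """G5 - G30 from a pitcher's earlier Game Scores, oldest first.

    Returns None if fewer than long_n earlier starts are available.
    """
    if len(prior_scores) < long_n:
        return None
    g_short = sum(prior_scores[-short_n:]) / short_n
    g_long = sum(prior_scores[-long_n:]) / long_n
    return g_short - g_long


def extreme_groups(candidates, size=1000, poor_cutoff=45.0):
    """Pick the most extreme starts on each end of G5 - G30.

    poor_cutoff is an assumed average Game Score below which a stretch
    counts as "genuinely poor"; the study's actual threshold is not stated.
    """
    def diff(c):
        return c["g5"] - c["g30"]

    # Hot lately, but genuinely poor over the last 30 starts.
    hot_lately = sorted(
        (c for c in candidates if c["g30"] < poor_cutoff), key=diff, reverse=True
    )[:size]
    # Genuinely poor over the last 5 starts, but better over the last 30.
    cold_lately = sorted(
        (c for c in candidates if c["g5"] < poor_cutoff), key=diff
    )[:size]
    return hot_lately, cold_lately
```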
Conclusion? You want the pitcher who has pitched well over his last 30 starts, absolutely and without question. I looked at the 1,000 most extreme examples on each end.
Among the pitchers who had been cold lately but good over the longer term was, for example, Marty Pattin in his start of May 8, 1973. In his previous five starts (April 16 to May 4, 1973) he had lost all five, had pitched fewer than four innings per start, and had an ERA of 11.17. But in his last 30 starts (June 20, 1972 to May 4, 1973) he had pitched 220 innings and had gone 16-11 with a 3.07 ERA. Pattin pitched very well in that game, although he lost 1-0.
Second example: Gaylord Perry in his start of August 15, 1974. In his previous five starts he had pretty much been pounded every time, giving up a total of 44 hits and 31 runs in 38 and a third innings, and losing all five starts. But in his previous 30 starts he was 20-7 with a 2.27 ERA, meaning that in the 25 starts BEFORE the last five he was 20-2 with an ERA under 1.50. Gaylord pitched well in his start of August 15, and won the game.
On the other end we have, for example, Brian Bohanon in his start of September 15, 1999. In his previous five starts he had pitched 8 innings, 9 innings, 7 innings, 8 innings and 8 innings, and his ERA for those five starts was 2.25, with 32 strikeouts and 12 walks in 40 innings. But over his last 30 starts, although he was 12-11, he had pitched 182 innings, struck out 110, walked 82, and had a 5.79 ERA.
Bohanon was hit hard in his start of September 15, and the next one, and the next one, and the next one, and the next one. He was Brian Bohanon; he was what he was. It’s baseball. The cream doesn’t ALWAYS rise to the top, but the sand always falls to the bottom.
Comparing 1,000 pitchers in each group, the pitchers who had pitched poorly in their last 5 starts but had pitched well over their last 30 starts had a Winning Percentage in their next start of .538, and an ERA of 3.40. The pitchers in the opposite group had a Winning Percentage of .438, and an ERA of 4.70.