Random and Responsive Performance by Starting Pitchers Day III

By Bill James

July 10, 2010

VII. Returning to Effective Runs Allowed

We are dealing here with the issue of whether some pitchers may have an ability to pitch especially well when they have an opportunity to win the game. We were working before with John Burkett, Mike Morgan and Whitey Ford, and let’s throw Jim Clancy and Tom Seaver into the mix as well—Seaver, for obvious reasons, and Clancy, because he is the third man in the Burkett/Morgan comparison:

Pitcher	Innings	W	L	W Pct	ERA	Total Runs Per 9 Innings
John Burkett	2648.1	166	136	.550	4.31	4.67
Jim Clancy	2517.1	140	167	.456	4.23	4.66
Mike Morgan	2772.1	141	186	.431	4.23	4.65

Working with one run of offensive support, Burkett’s teams were 3-30, Clancy’s were 3-37, Morgan’s were 3-39, Ford’s were 10-28, and Seaver’s were 12-56:

First	Last	Games	Won	Lost	Effective Runs Allowed	Effective Runs Allowed Rate
Whitey	Ford	39	10	28	64	1.67
Tom	Seaver	68	12	56	147	2.16
John	Burkett	33	3	30	104	3.16
Jim	Clancy	40	3	37	140	3.51
Mike	Morgan	42	3	39	151	3.61

Except that that’s not exactly the method that I used in some cases. That’s exactly the method I used for Clancy, Seaver and Morgan above, but not for Ford and Burkett.

We get into problems with small groups of data. Suppose that a pitcher went 0-12 when working with one run, or 0-1, or 0-39. What is his runs allowed rate? By the method above, you can’t get an answer. You wind up searching for the square root of infinity, and, of course, only the Dalai Lama and Al Gore know what is the square root of infinity.
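A minimal sketch of the calculation behind the one-run chart. The formula is not restated in this installment; the version here is an assumption, inferred from the published figures and from the later reference to the Pythagorean formula: if W/L = RS^2 / RA^2, then RA = RS * sqrt(L/W). The function name is made up for illustration.

```python
import math

def effective_runs_allowed_rate(runs_scored, wins, losses):
    """Runs-allowed rate implied by a team record at a fixed run-support level.

    Assumption: the Pythagorean relation W/L = RS^2 / RA^2,
    which rearranges to RA = RS * sqrt(L / W).
    """
    if wins == 0:
        # 0-12, 0-1 or 0-39: L/W is unbounded, so there is no finite answer.
        raise ZeroDivisionError("no wins: the implied rate is unbounded")
    return runs_scored * math.sqrt(losses / wins)

# One run of support, team records taken from the chart above.
for name, won, lost in [("Seaver", 12, 56), ("Clancy", 3, 37), ("Morgan", 3, 39)]:
    rate = effective_runs_allowed_rate(1, won, lost)
    total = rate * (won + lost)  # effective runs allowed over those decisions
    print(f"{name}: rate {rate:.2f}, effective runs allowed {total:.0f}")
```

Under that assumption the three pitchers come out at 2.16, 3.51 and 3.61, matching the chart, and a record with no wins raises an error instead of returning a number.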

To prevent this from happening, I added “placekeeping data” to all groups of games involving fewer than 40 decisions. At one run, since the winning percentage of teams with one run was .101, I added one (placekeeping) win and nine losses, treating John Burkett as if he were 4-39 rather than 3-30, and Whitey Ford as if he were 11-37 rather than 10-28. These additions change the chart above to the following (the Won and Lost columns still show the actual records):

First	Last	Won	Lost	Effective Runs Allowed	Effective Runs Allowed Rate
Whitey	Ford	10	28	70	1.83
Tom	Seaver	12	56	147	2.16
John	Burkett	3	30	103	3.12
Jim	Clancy	3	37	140	3.51
Mike	Morgan	3	39	151	3.61

I added these “placekeeping numbers” at all levels, to all pitchers who had fewer than 40 decisions at the level, always adding one “placekeeping win” but varying the number of losses as follows:

	Team Winning Percentage	Wins/Losses Added
1 Run	.101	1 and 9
2 Runs	.247	1 and 3
3 Runs	.389	1 and 1.6
4 Runs	.528	1 and .9
5 Runs	.641	1 and .56
6 Runs	.727	1 and .375
7 Runs	.802	1 and .25
8 Runs or More	.907	1 and .1
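The placekeeping step can be sketched as follows. This is an illustration under assumptions, not the original worksheet: it assumes the Pythagorean-style rate RS * sqrt(L/W), and it assumes the added losses are the number that pairs with one win at the league winning percentage, (1 - pct) / pct, which the chart rounds to 9, 3, 1.6, .9, .56, .375, .25 and .1. The function names and the `threshold` parameter are hypothetical.

```python
import math

# Team winning percentage by runs scored, from the chart above (8 = "8 or more").
TEAM_WIN_PCT = {1: .101, 2: .247, 3: .389, 4: .528,
                5: .641, 6: .727, 7: .802, 8: .907}

def placekeeping_losses(win_pct):
    """Losses that go with one placekeeping win at this winning percentage."""
    return (1 - win_pct) / win_pct

def padded_rate(runs_scored, wins, losses, added_losses, threshold=40):
    """Effective runs allowed rate, padding records with under `threshold` decisions."""
    if wins + losses < threshold:
        wins, losses = wins + 1, losses + added_losses
    # Assumed formula: RA = RS * sqrt(L / W).
    return runs_scored * math.sqrt(losses / wins)

print(round(padded_rate(1, 10, 28, 9), 2))  # Ford, treated as 11-37    -> 1.83
print(round(padded_rate(1, 3, 30, 9), 2))   # Burkett, treated as 4-39  -> 3.12
print(round(padded_rate(1, 12, 56, 9), 2))  # Seaver, 68 decisions, unchanged -> 2.16
```

Because the pad is always one win, an 0-12 record becomes 1-21 at one run and yields a finite rate.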

These are the values for the five pitchers at the levels 2 runs and 3 runs:

TWO RUNS
First	Last	Won	Lost	Effective Runs Allowed	Effective Runs Allowed Rate
Tom	Seaver	43	48	192	2.11
Whitey	Ford	19	22	88	2.15
John	Burkett	15	46	214	3.50
Jim	Clancy	8	36	187	4.24
Mike	Morgan	9	56	324	4.99

THREE RUNS
First	Last	Won	Lost	Effective Runs Allowed	Effective Runs Allowed Rate
Whitey	Ford	45	30	184	2.45
Tom	Seaver	57	56	336	2.97
Jim	Clancy	29	40	243	3.52
Mike	Morgan	27	39	238	3.61
John	Burkett	24	37	227	3.72

Whitey Ford is doing really well here. Some people tend to think of Whitey Ford as a pitcher who won a lot of games because the Bronx Bombers scored seven runs a game for him. As the charts above show, this is anything but true. Ford won a lot of games 2-1, 3-1 and 3-2. This chart summarizes the three charts above:

Total, One to Three Runs
First	Last	Won	Lost	Effective Runs Allowed	Effective Runs Allowed Rate
Whitey	Ford	74	80	342	2.22
Tom	Seaver	112	160	675	2.48
John	Burkett	42	113	544	3.51
Jim	Clancy	40	113	570	3.73
Mike	Morgan	39	134	714	4.13

In the other method, Seaver was number one because I was measuring the total distance away from an average pitcher, which gave Seaver an advantage over Ford because he had pitched 75 to 80% more games within our data. But on a per-game basis, Ford is more than holding his own. (The “Effective Runs Allowed” above are not actually integers, and may not add up exactly as you might expect for that reason.)
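A short sketch of how the summary rows appear to be built from the per-level charts. The assumption, checked only against the published figures, is that the combined rate is total effective runs allowed divided by total decisions. The inputs are the rounded values printed above, so a result can differ from the published one in the last digit.

```python
# (won, lost, effective runs allowed) at one, two and three runs, from the charts above.
LEVELS = {
    "Ford":    [(10, 28, 70), (19, 22, 88), (45, 30, 184)],
    "Seaver":  [(12, 56, 147), (43, 48, 192), (57, 56, 336)],
    "Burkett": [(3, 30, 103), (15, 46, 214), (24, 37, 227)],
}

def combine(levels):
    """Sum wins, losses and effective runs; rate = effective runs per decision."""
    wins = sum(w for w, _, _ in levels)
    losses = sum(l for _, l, _ in levels)
    runs = sum(r for _, _, r in levels)
    return wins, losses, runs, runs / (wins + losses)

for name, levels in LEVELS.items():
    wins, losses, runs, rate = combine(levels)
    print(f"{name}: {wins}-{losses}, {runs} effective runs, rate {rate:.2f}")
```

Dividing by decisions means each level is weighted by how many games the pitcher had there, which is why the combined figure is a per-game measure and not a distance from average.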

At the four-run level Seaver bests Ford and the three musketeers go Burkett-Morgan-Clancy, but the cumulative order stays the same (ERAR in the charts that follow is the Effective Runs Allowed Rate):

		Totals at Four Runs				Total, One to Four Runs
First	Last	Won	Lost	Effective Runs Allowed	ERAR		Won	Lost	Effective Runs Allowed	ERAR
Whitey	Ford	35	25	203	3.38		109	105	544	2.54
Tom	Seaver	49	30	247	3.13		161	190	922	2.63
John	Burkett	25	25	200	4.00		67	138	744	3.63
Jim	Clancy	20	25	201	4.47		60	138	772	3.90
Mike	Morgan	24	25	200	4.08		63	159	914	4.12

Ford was the only one of these pitchers—and one of few pitchers in the study—who was able to deliver a winning record for his team in games with four runs or less. At the five-run level, Whitey Ford’s teams were a fairly astonishing 36-3. Everybody was able to win over half the time with five runs, but Morgan’s teams were only 29-23:

		Totals at Five Runs				Totals, One to Five Runs
First	Last	Won	Lost	Effective Runs Allowed	ERAR	Won	Lost	Effective Runs Allowed	ERAR
Whitey	Ford	36	3	60	1.55	145	108	605	2.39
Tom	Seaver	62	16	198	2.54	223	206	1121	2.61
John	Burkett	28	14	148	3.54	95	152	892	3.61
Jim	Clancy	30	19	195	3.97	90	157	966	3.91
Mike	Morgan	29	23	232	4.45	92	182	1145	4.18

In the first of this series of articles I mentioned that Burkett was 18-4 when working with six runs, whereas Morgan was 16-4, which I represented as an advantage for Burkett. But actually, when we look at the team record, rather than the individual pitcher’s won-lost record, Morgan beats Burkett at six runs. And Whitey Ford once again beats everybody:

		Totals at Six Runs				Totals, One to Six Runs
First	Last	Won	Lost	Effective Runs Allowed	ERAR	Won	Lost	Effective Runs Allowed	ERAR
Whitey	Ford	39	5	95	2.15	184	113	699	2.36
Tom	Seaver	40	7	118	2.51	263	213	1239	2.60
John	Burkett	27	12	156	3.99	122	164	1048	3.66
Jim	Clancy	32	10	141	3.35	122	167	1107	3.83
Mike	Morgan	25	7	102	3.20	117	189	1248	4.08

At the seven-run level Jim Clancy’s teams were 20-2, which is even better than Seaver’s teams. But Ford’s Yankees went 27-2:

		Totals at Seven Runs				Totals, One to Seven Runs
First	Last	Won	Lost	Effective Runs Allowed	ERAR	Won	Lost	Effective Runs Allowed	ERAR
Whitey	Ford	27	2	58	1.98	211	115	757	2.32
Tom	Seaver	47	6	133	2.50	310	219	1371	2.59
John	Burkett	25	9	142	4.18	147	173	1190	3.72
Jim	Clancy	20	2	50	2.29	142	169	1157	3.72
Mike	Morgan	16	6	93	4.24	133	195	1341	4.09

We come, finally, to our last category, which is “eight runs or more”. There’s not much point in tracking it beyond eight runs because the winning percentages are getting so close to 1.000, but there is a difference between the “eight-run” category and the others, which is that up to now we have known exactly how many runs the team was working with on offense. We will assume that the offense, in those games in which it scores eight runs or more, averages nine runs.

Jim Clancy with eight runs or more was 40-1:

		Totals, Eight Runs or More				TOTALS, One or More Runs
First	Last	Won	Lost	Effective Runs Allowed	ERAR	Won	Lost	Effective Runs Allowed	ERAR
Whitey	Ford	73	3	139	1.82	284	118	896	2.23
Tom	Seaver	59	4	148	2.34	369	223	1519	2.57
John	Burkett	77	6	209	2.51	224	179	1398	3.47
Jim	Clancy	40	1	58	1.42	182	170	1216	3.45
Mike	Morgan	48	7	189	3.44	181	202	1530	3.99

OK, I’ve got some housekeeping details to take care of here, but first let me say: I think that we have effectively demonstrated at this point, unless I am missing something, that Mike Morgan was not as effective a pitcher, in terms of delivering a win, as were Clancy and Burkett. In terms of ERA and runs allowed per nine innings, he was just the same. In terms of his effectiveness at delivering a win, given a certain number of runs to work with, he was not the same. We have done that, and we have created a framework for measuring the difference, that is, for measuring the costs to Mike Morgan’s teams of his failure to respond to the situations. We’ll pick up on those things in a moment.

The two housekeeping issues that we need to deal with are

1) What do we do with shutouts? and

2) Why are the “Effective Runs Allowed Rates” so low?

To this point we have not dealt with games in which the pitcher’s team was shut out, because, if your team doesn’t score any runs, it doesn’t make any difference how well you pitch, you can’t win. What can we do about these games?

There seem to be four options, which are:

1) To assume that every pitcher had the same level of effectiveness in these games—let’s say 4.50 runs per game or something like that,

2) To assume that every pitcher had the same level of effectiveness in these games that he had overall,

3) To ignore them entirely, and

4) To ignore them entirely, except that we display them as losses.

I think the best option is (4)—to ignore them entirely, except that we display them as losses in the totals, so that we are displaying accurate totals of wins and losses for the starter’s games.

The other question is, why are the Effective Runs Allowed Rates so low? Mike Morgan, after all, allowed 4.65 runs per nine innings in real life. We are saying here

1) That he was ineffective at pitching well when he had a chance to win, and yet

2) His “effective” runs allowed rate was 3.99.

What up with that?

Everybody is low. The reason for this is that the distribution curve of runs scored in a game (or runs allowed in a game) is asymmetrical. If teams average 4.00 runs in a game, they will never score less than zero, but they will sometimes score more than eight. That means that there have to be more games with less than four than games with more than four, in order to re-balance the system at four.

The Pythagorean formula assumes that, when a team averages four runs a game, this is an average. In our study it is not an average; it is a constant. The “four” runs average for the four-run group is 4, 4, 4, 4, 4. But there will be more “opposing” games under 4.00 than over four, which means that there will be more wins than losses. If you take two teams which both average four runs a game head to head, but one team always scores exactly four but the other scores a varied number averaging four, the team that always scores four will win over half the time. The same is true at every level of offense, including one run. If a team always scored one run in every game and allowed an average of one run per game but in a varied pattern, they would win more than half their games.
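The constant-versus-varied argument can be checked with a toy model. The sketch below assumes the varied team's runs follow a Poisson distribution. That is a convenient stand-in chosen only for illustration, not a claim about real scoring. Tied games are set aside instead of being played out in extra innings.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k runs under a Poisson model (an assumption)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def constant_vs_varied(runs):
    """A team that always scores `runs` against one that averages `runs`.

    Returns (win, loss, tie) probabilities for the constant team.
    """
    win = sum(poisson_pmf(k, runs) for k in range(runs))  # opponent scores fewer
    tie = poisson_pmf(runs, runs)                         # opponent scores the same
    loss = 1 - win - tie                                  # opponent scores more
    return win, loss, tie

for runs in (1, 4):
    win, loss, tie = constant_vs_varied(runs)
    print(f"{runs} run(s): constant team wins {win / (win + loss):.3f} of decided games")
```

In this model the constant team comes out above .500 at both one run and four runs, because more of the opponent's games fall below the average than above it.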

This skews our calculations toward a lower-than-real-life effective runs allowed rate, and we’ll need to adjust for that. It’s actually kind of interesting how it happens. You remember this chart, which I presented in the second of this series of articles?

		Team Winning Percentage
	0 Runs	.000
	1 Run	.101
	2 Runs	.247
	3 Runs	.389
	4 Runs	.528
	5 Runs	.641
	6 Runs	.727
	7 Runs	.802
	8 Runs or More	.907

Based on that chart, we can figure what the “effective runs allowed rate” being calculated at each level of offensive effectiveness is:

	Team Winning Percentage	Effective Runs Allowed Rate
0 Runs	.000
1 Run	.101	2.98
2 Runs	.247	3.49
3 Runs	.389	3.76
4 Runs	.528	3.78
5 Runs	.641	3.74
6 Runs	.727	3.68
7 Runs	.802	3.48
8 Runs or More	.907	2.88

Again, when teams score 8 runs or more, we assume that they have scored an average of nine.
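The league-wide figures in this chart can be reproduced with a few lines, assuming the same Pythagorean-style relation inferred elsewhere (rate = runs * sqrt((1 - pct) / pct)) and the stated nine-run assumption for the top category. The names are illustrative.

```python
import math

# (runs the offense is assumed to have scored, team winning percentage)
LEVELS = [(1, .101), (2, .247), (3, .389), (4, .528),
          (5, .641), (6, .727), (7, .802), (9, .907)]  # 9 stands in for "8 or more"

def implied_rate(runs, win_pct):
    """Runs-allowed rate that would produce `win_pct` at exactly `runs` scored."""
    return runs * math.sqrt((1 - win_pct) / win_pct)

rates = [round(implied_rate(runs, pct), 2) for runs, pct in LEVELS]
print(rates)  # [2.98, 3.49, 3.76, 3.78, 3.74, 3.68, 3.48, 2.88]
```

The zero-run row has no entry because a .000 winning percentage puts a zero in the denominator.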

Anyway, the average runs allowed rate in this study was about 3.50 runs per game, whereas it should have been about 4.40. It’s about 21.4% low, a ratio of 11 to 14. We can correct for this, then, by multiplying the Effective Runs Allowed Rates calculated before by 14, and dividing by 11.
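As a check on the arithmetic, the 14/11 correction can be applied to the one-or-more-runs totals shown earlier. The sketch assumes the uncorrected rate is effective runs allowed divided by decisions, and it works from the rounded published totals.

```python
CORRECTION = 14 / 11  # scales rates that run about 21.4% low back up

# (team wins, team losses, effective runs allowed), one or more runs of support
TOTALS = {
    "Ford":    (284, 118, 896),
    "Seaver":  (369, 223, 1519),
    "Clancy":  (182, 170, 1216),
    "Burkett": (224, 179, 1398),
    "Morgan":  (181, 202, 1530),
}

def corrected_rate(wins, losses, effective_runs):
    """Effective runs allowed per decision, scaled by 14/11."""
    return effective_runs / (wins + losses) * CORRECTION

print(f"shortfall: {1 - 11 / 14:.1%}")
for name, row in TOTALS.items():
    print(f"{name}: {corrected_rate(*row):.2f}")
```

The results are 2.84, 3.27, 4.40, 4.42 and 5.08, which agree with the FINAL DATA chart. Shutout losses change the won-lost totals there but, under option (4), not the rate.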

OK, picking up the “summary” chart before, but incorporating those two changes, this would be the data for the five gentlemen that we have been following here:

FINAL DATA
First	Last	Games	Team Wins	Team Losses	Effective Runs Allowed Rate
Whitey	Ford	423	284	139	2.84
Tom	Seaver	647	369	278	3.27
Jim	Clancy	380	182	198	4.40
John	Burkett	423	224	199	4.42
Mike	Morgan	411	181	230	5.08

Notes:

1) Tom Seaver’s Effective Runs Allowed Rate here is actually higher than his real-life runs allowed rate (3.15 runs per nine innings). Seaver ranked first in our earlier competition because Seaver pitched more games with a low ERA than any other modern-era pitcher—thus, he is going to do very well in any kind of a runs-allowed rate competition among modern pitchers.

2) Whitey Ford’s Effective Runs Allowed Rate is the lowest of any pitcher in our study with a reasonable number of starts. In the last of this series of articles, posted tomorrow, I will show the data for all 663 pitchers with 100 or more starts in the Retrosheet data. Ford is number one on the list; Bryan Rekar is number 663.

3) Jim Clancy surged at the last minute in our data, pushing ahead of John Burkett. People assume that Mike Morgan pitched for terrible teams, although he actually didn’t, on balance. Jim Clancy pitched for worse teams than Morgan did, but Clancy had a better won-lost record. If Clancy had matched the won-lost percentage of his team in every season, his career record would have been 154-161 (.489).

I’m not entirely happy with Clancy’s evaluation. Clancy’s teams went 40-1 when they scored eight runs or more, which creates a very low Effective Runs Allowed Rate for him in those games, but if his teams had gone, say, 38-3, then we would have a very different calculation for him in those games, driving his overall final figure up by 15 points to 4.55. The system allows a disproportionate impact of a very few games in that case, which it should not do; I should have devised some way to prevent that from happening. Charts sometimes act irrationally near their boundaries; you probably all know this.
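The sensitivity described here can be sketched numerically. The code assumes the inferred formula (9 * sqrt(L/W) for the eight-or-more category), the 14/11 correction, and the rounded published career total of 1216 effective runs over 352 decisions, so it only roughly reproduces the quoted figures.

```python
import math

CORRECTION = 14 / 11
DECISIONS = 182 + 170     # Clancy, one or more runs of support
TOTAL_EFF_RUNS = 1216     # published career total (rounded)
ASSUMED_RUNS = 9          # stated assumption for "eight runs or more"

def effective_runs(wins, losses, runs=ASSUMED_RUNS):
    """Effective runs allowed at one level: rate times decisions."""
    return runs * math.sqrt(losses / wins) * (wins + losses)

def final_rate(wins, losses):
    """Career rate if the 8+ record were wins-losses (41 decisions) instead of 40-1."""
    total = TOTAL_EFF_RUNS - effective_runs(40, 1) + effective_runs(wins, losses)
    return total / DECISIONS * CORRECTION

print(f"40-1: {final_rate(40, 1):.2f}")
print(f"38-3: {final_rate(38, 3):.2f}")
```

Moving two games from the win column to the loss column raises the career figure by roughly 15 points. The run comes out within about a point of the quoted 4.55, with the gap attributable to the rounded inputs.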

VIII. Final Thoughts

In saying that Mike Morgan did not pitch well when he had chances to win ball games, in saying that Whitey Ford and Bob Gibson did, I am not offering a moral judgment. We have not established, and I am not saying, that these performance deviations are beyond what could have occurred by chance.

But I am saying that these performance anomalies are “real” in this sense: that wins and losses resulted from them, and that therefore we can and should appropriately consider this in evaluating these players.

Mike Morgan, I think, is shown as being 36 runs below average as a pitcher for his career. The real number, I would argue, is closer to minus 150. Morgan allowed 4.65 runs per nine innings—but he won ballgames with the consistency of a pitcher allowing 5.08 runs. That’s a difference, over the course of his career, of 132 runs.
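The 132-run figure follows from the two rates and Morgan's innings. A small check, reading 2772.1 in the usual baseball notation as 2772 and one-third innings; the function name is illustrative.

```python
def career_run_gap(innings, actual_rate, effective_rate):
    """Extra runs implied by the gap between two per-nine-inning rates."""
    return (effective_rate - actual_rate) * innings / 9

morgan_innings = 2772 + 1 / 3  # listed as 2772.1
gap = career_run_gap(morgan_innings, actual_rate=4.65, effective_rate=5.08)
print(round(gap))  # 132
```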

That completes the written part of this analysis. In tomorrow’s installment we’ll list the data for all pitchers with 100 or more career starts in the Retrosheet data.

COMMENTS (8 Comments, most recent shown first)

CharlesSaeger
A methodical question: why not add placekeeping data to all records? That gives us a means regression, and keeps things consistent for all pitchers.
3:38 PM Jul 14th

MattGoodrich
It seems the moral judgement is hard to avoid - people always want to know who's 'clutch'.
If a pitcher gives up 3 runs and loses 3-2, which is better: if he gives up 3 runs in the first inning, then shuts them down the rest of the way, or if he goes into the 7th inning with a 2-0 lead, then gives up 3 runs? Either way the end result was the same, but I bet more people would chastise the pitcher for 'blowing' the 2-run lead.
2:46 AM Jul 13th

champ
Thank you for this work, Bill. The closer we get to removing a pitcher's wins and losses from any sort of moral judgement, the closer we get to fairer all-star selections, award winners, and contracts.
5:25 PM Jul 12th

CharlesSaeger
Oops, reread part 8. We shan't know until someone combs the data for this.
4:57 PM Jul 12th

Bucky
One of the main differences between Bill James and many other people who talk about and write about baseball is that they assume they are right even if the evidence is against them. Bill James keeps trying to find out if he is wrong, even if his evidence looked pretty good at the time. This series is a great take on the idea of "pitching to the score" and will give us all much to ponder.
Now I'm curious as to how Jack Morris did!
2:02 PM Jul 12th

CharlesSaeger
How much of these differences are due to random chance?
11:05 AM Jul 12th

jdw
Really interesting study, Bill.
2:34 AM Jul 12th

chuck
Thank you for this, Bill. It's a fascinating read; looking forward to the list tomorrow.
Just wondering- when you say the pitcher had x number of runs to work with- baseball-reference shows the run support both for the whole game a pitcher started and during the time the pitcher was in the game. Were you using the latter for this study?
12:54 AM Jul 12th
