
The Practical Effects of Run Volatility

By Bill James

April 9, 2022

Does any team gain a practical advantage in winning games as a result of the distribution by games of their runs scored? There are obvious theoretical advantages to different patterns of runs scored; that is not the question here. The question here is whether the actual differences between teams in terms of run scoring patterns are large enough to be considered a meaningful advantage.

This question arose as part of a "Hey, Bill" discussion, as it has in the past. This time, however, I happened to look in the right direction and saw how the issue could be resolved, so, while watching the Red Sox’ Opening Day loss to the Yankees, I started the research.

Method. First, I downloaded the team batting stats for every team since 2010. I eliminated from the study all of the teams from 2020, when baseball teams played only 60 games each, and figured the runs scored per game for each team. There were 61 teams in the 11 seasons within the study which scored between 4.400 and 4.600 runs per game. . .that is, 4.50 runs per game, but with a tolerance up or down of .10 runs.

I downloaded from Retrosheet the Game Logs for each of those 61 teams, giving us the runs scored and runs allowed in each game. Altogether, these teams played 9,883 games:

Games Played by the 61 teams	9883
Runs Scored	44427
Runs Allowed	44373
Wins	4914
Losses	4968
Ties	1

They scored and allowed 4.50 runs per game and played .500 ball; actually .497, but that isn’t a statistically meaningful discrepancy from .500.

I wanted to study a large number of games from teams with very comparable offenses. The next thing I had to do was to figure the teams’ winning percentage with each number of runs scored in a game. That is summarized in the chart below:

Runs Scored in Game	Wins	Losses	Pct
24	1	0	1.000
21	2	0	1.000
20	2	0	1.000
19	7	0	1.000
18	5	0	1.000
17	9	0	1.000
16	14	0	1.000
15	30	0	1.000
14	42	1	.977
13	71	1	.986
12	111	4	.965
11	173	6	.966
10	208	14	.937
9	315	36	.897
8	422	85	.832
7	597	126	.826
6	651	250	.723
5	678	402	.628
4	675	645	.511
3	524	843	.383
2	300	1039	.224
1	77	945	.075
0	0	571	.000
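For anyone who wants to work with the chart, here is a minimal sketch that stores the wins and losses by runs scored and recomputes the percentages. The numbers are copied from the chart above; the names `RECORD_BY_RUNS` and `win_pct` are made up for this illustration and are not part of the original study.

```python
# (wins, losses) by runs scored in a game, copied from the chart above.
RECORD_BY_RUNS = {
    24: (1, 0), 21: (2, 0), 20: (2, 0), 19: (7, 0), 18: (5, 0), 17: (9, 0),
    16: (14, 0), 15: (30, 0), 14: (42, 1), 13: (71, 1), 12: (111, 4),
    11: (173, 6), 10: (208, 14), 9: (315, 36), 8: (422, 85), 7: (597, 126),
    6: (651, 250), 5: (678, 402), 4: (675, 645), 3: (524, 843),
    2: (300, 1039), 1: (77, 945), 0: (0, 571),
}


def win_pct(runs):
    """Winning percentage in games with exactly `runs` runs scored."""
    wins, losses = RECORD_BY_RUNS[runs]
    return wins / (wins + losses)


# The chart should add back up to the overall won-lost record (ties excluded).
total_wins = sum(w for w, _ in RECORD_BY_RUNS.values())
total_losses = sum(l for _, l in RECORD_BY_RUNS.values())

print(total_wins, total_losses)                            # 4914 4968
print(f"{win_pct(7):.3f}")                                 # 0.826
print(f"{total_wins / (total_wins + total_losses):.3f}")   # 0.497
```

The totals match the 4,914 wins and 4,968 losses given earlier, which is a useful check that the chart was transcribed correctly.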

Probably this chart isn’t very different for teams that score 4.50 runs per game than for any other team; it might be a little different because of park effects or something, but you’d get essentially the same chart if you studied teams that scored 650 runs a season (four a game) or 800 runs a season (five a game).

I take the relevance of this to be self-apparent, but out of an excess of caution, or perhaps it is flatulence, I will explain a little, anyway. Take two teams that each play 2 games, and each score 14 runs in those two games. The one team, however, scores 14 in one game and zero in the other, whereas the other team scores 7 runs in each game. The team that scores 14 runs all in one game has an expectation very close to 1.000 wins, since they are almost certain to win the game in which they score 14 runs, and cannot win the game in which they score none. The team that scores 7 runs in each game, however, has an expectation of winning 1.652 games, out of the two, since the winning percentage of teams that score 7 runs in a game is .826. It is much better to score 7 runs in each of two games than to score 14 in one game and none in another. One pattern is more efficient than the other.
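The two-team example can be worked through in a few lines. This is only a sketch using the chart's percentages; `WIN_PCT` and `expected_wins` are illustrative names, and the 14-run figure is the chart's .977 rather than a flat 1.000.

```python
# Winning percentages from the chart above (only the entries needed here).
WIN_PCT = {0: 0.000, 7: 0.826, 14: 0.977}


def expected_wins(runs_by_game):
    """Sum the per-game winning percentages for a list of run totals."""
    return sum(WIN_PCT[r] for r in runs_by_game)


lumpy = expected_wins([14, 0])    # 14 runs in one game, none in the other
steady = expected_wins([7, 7])    # 7 runs in each game

print(f"{lumpy:.3f} {steady:.3f}")  # 0.977 1.652
```

Same 14 runs, but the steady team expects about two-thirds of a win more over the two games.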

Some teams that score 4.500 runs per game might be expected to win more games than other teams that score the same number, then, based on the efficiency of their runs per game pattern. We can measure, then, each team’s expected wins based on their runs scored in each game.

For the Washington Nationals in 2021, their runs scored in their first 20 games were 6, 6, 0, 0, 5, 0, 5, 3, 6, 6, 1, 6, 2, 5, 3, 1, 0, 7, 0 and 5. Based on that, they could have been expected to win 8 of their first 20 games. When they scored 6 runs in the first game, they had an Expected Win Contribution (Ex W C) for that game of .723. The total expected win contribution for the 20 games, based only on their offense and ignoring everything else, is 8.09 wins. They did, in fact, go 8-12 in those 20 games.

Team	Month	Day	G	Site	Opp	Result	R	OR	Ex W C	Ex Wins (cum.)	Ex Losses (cum.)
Was	April	6		Vs	ATL	W	6	5	.723	0.72	0.28
Was	April	7	1	Vs	ATL	L	6	7	.723	1.45	0.55
Was	April	7	2	Vs	ATL	L	0	2	.000	1.45	1.55
Was	April	9		At	LA	L	0	1	.000	1.45	2.55
Was	April	10		At	LA	L	5	9	.628	2.07	2.93
Was	April	11		At	LA	L	0	3	.000	2.07	3.93
Was	April	12		At	STL	W	5	2	.628	2.70	4.30
Was	April	13		At	STL	L	3	14	.383	3.08	4.92
Was	April	14		At	STL	W	6	0	.723	3.81	5.19
Was	April	15		Vs	ARI	L	6	11	.723	4.53	5.47
Was	April	16		Vs	ARI	W	1	0	.075	4.60	6.40
Was	April	17		Vs	ARI	W	6	2	.723	5.33	6.67
Was	April	18		Vs	ARI	L	2	5	.224	5.55	7.45
Was	April	19		Vs	STL	L	5	12	.628	6.18	7.82
Was	April	20		Vs	STL	W	3	2	.383	6.56	8.44
Was	April	21		Vs	STL	W	1	0	.075	6.64	9.36
Was	April	23		At	NY	L	0	6	.000	6.64	10.36
Was	April	24		At	NY	W	7	1	.826	7.46	10.54
Was	April	25		At	NY	L	0	4	.000	7.46	11.54
Was	April	27		At	TOR	L	5	9	.628	8.09	11.91
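The 8.09 figure can be reproduced from the run totals and the chart. A minimal sketch, with `WIN_PCT` and `NATS_RUNS` as names chosen for this illustration; the percentages are the chart's values for 0 through 7 runs.

```python
# Winning percentage by runs scored, from the chart above (0-7 runs).
WIN_PCT = {0: 0.000, 1: 0.075, 2: 0.224, 3: 0.383,
           4: 0.511, 5: 0.628, 6: 0.723, 7: 0.826}

# Runs scored by the 2021 Nationals in their first 20 games, in order.
NATS_RUNS = [6, 6, 0, 0, 5, 0, 5, 3, 6, 6, 1, 6, 2, 5, 3, 1, 0, 7, 0, 5]

# Each game contributes its winning percentage as a fraction of a win.
ex_wins = sum(WIN_PCT[r] for r in NATS_RUNS)
ex_losses = len(NATS_RUNS) - ex_wins

print(f"{ex_wins:.2f} {ex_losses:.2f}")  # 8.09 11.91
```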

Through 40 games, they had 17.09 expected wins, and they were in fact 17-23. After that, though, because their pitching was poopy, their actual win total fell steadily behind the wins expected based on runs scored. They wound up the year with only 65 wins against an expectation of 80.

ALL of these 61 teams had an expectation of about 81 wins; that’s the implication of the selection criteria. Their offenses are all about the same, ignoring outside effects and focusing just on the runs scored.

The question at the center of this study, however, is to what extent some of these offenses are actually better than others, based on the runs-by-games distribution. Is that a meaningful factor? DO some teams have a meaningful edge over other teams, based on scoring runs in more efficient game patterns? That’s what we’re trying to get to here.

There does not appear to me to be any meaningful advantage to be gained by scoring runs in an efficient pattern. The 61 teams had an average of 80.56 expected wins and 81.46 expected losses (one of the teams, Tampa Bay in 2013, played a 163rd game, making the average 162.02). The standard deviation of expected wins was 1.64; of expected losses, 1.65. That’s 1% of the schedule of 162 games. It is the sort of statistical remnant that has to occur in a study in which there is no underlying effect, simply because everything isn’t going to even out.

The BEST team in the group, in terms of scoring runs in efficient patterns, was the 2013 Baltimore Orioles. The Orioles are the outlier of the study. They scored runs in very efficient patterns, giving them 85.46 expected wins, or 4.90 more than an average team. No other team within the study had a deviation from the average of more than 3.05. No one else is two standard deviations away from the norm; the 2013 Orioles are essentially three standard deviations off.
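The standard-deviation claims follow directly from the figures already quoted (mean 80.56, standard deviation 1.64). A quick sketch of the arithmetic, with variable names chosen for illustration:

```python
MEAN_EX_WINS = 80.56     # average expected wins for the 61 teams
SD_EX_WINS = 1.64        # standard deviation of expected wins
ORIOLES_EX_WINS = 85.46  # 2013 Orioles
NEXT_LARGEST_DEV = 3.05  # largest deviation from average by any other team

# Express each deviation in units of the standard deviation.
orioles_z = (ORIOLES_EX_WINS - MEAN_EX_WINS) / SD_EX_WINS
next_z = NEXT_LARGEST_DEV / SD_EX_WINS

print(f"{orioles_z:.2f} {next_z:.2f}")  # 2.99 1.86
```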

If you look at the chart given earlier, you can see that scoring 2 to 7 runs per game is "efficient", while scoring more than 7, and especially more than 10, is inefficient. Scoring more than 10 runs in a game is wasting runs.

Whereas an average team in the study scored more than 10 runs in a game 8 times, the 2013 Orioles did so only 3 times.

Whereas an average team in the study scored more than 7 runs in a game 26 times, the Orioles did so only 22 times.

On the other end, whereas an average team in the study was shut out 9.4 times, the 2013 Orioles were shut out only 6 times.

And whereas an average team in the study scored 1 or zero runs 26 times, the Orioles did so only 15 times.

So, if you want to, you can make an argument that this is a significant factor in a team’s success, at least sometimes. If a team has a gain in expected wins of five games, that certainly would be hugely significant in a pennant race. If you want to see it as significant, go ahead.

BUT.

It’s only significant IN A PENNANT RACE; otherwise it’s just 10% of the scale, the scale being 56 to 106 wins, let us say.

It’s only really significant if it has an actual cause, as opposed to being something that just happened, and

It’s not really 5 games, quite. The 2013 Orioles scored 4.599 runs per game, putting them at the very top of the range of teams within the study. About 1.6 games of the 4.90-game advantage is actually accounted for by the fact that they scored more runs than the average team in the study.

A point in favor of the "it is TOO significant" argument is that the Orioles did have a home-run offense. Chris Davis hit 53 home runs that year, and the Orioles led the majors with 212 home runs. It is likely and reasonable that home run hitting produces runs in more efficient game patterns than sequential offense. SLIGHTLY more efficient; not notably more efficient. And the Orioles did not have an efficient offense in the surrounding seasons.

So I’ll leave that up to you. The most inefficient offense in the study was the 2016 Dodgers, who lost about 2.97 games due to scoring runs in inefficient patterns. The 2016 Dodgers won their opening day game by a 15-0 score. Whereas the 2013 Orioles had only one game in which they wastefully scored more than 11 runs, the 2016 Dodgers won games 15-0, 12-6, 13-7, 14-3, 15-5, 18-9 and 14-1. But the Dodgers were shut out 12 times, and held to one run another 15 times—a total of 27 games with 1 run or less, whereas the Orioles had only 15 of those. But the Dodgers still won 91 games, and their division.

Bottom line: I don’t see evidence that this is a meaningful advantage for any team. However, a few teams DO increase or decrease their expected wins by 3 or 4 games by scoring in efficient or inefficient patterns.

In general, the advantage goes to CONSISTENCY, not to inconsistency. If there is any advantage in inconsistent offense, that would only be for very, very weak offenses.
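A rough check of that last point, using only the chart's winning percentages: over a two-game stretch, compare scoring the same number in both games against scoring everything in one game. This is a sketch under a loud assumption, namely that the chart (built from teams averaging 4.50 runs per game) also applies to much weaker offenses; `two_game_expectation` is a name made up for the illustration.

```python
# Winning percentage by runs scored, from the chart earlier in the article.
# ASSUMPTION: the same percentages are applied to weaker offenses as well.
WIN_PCT = {0: 0.000, 1: 0.075, 2: 0.224, 3: 0.383, 4: 0.511, 5: 0.628,
           6: 0.723, 7: 0.826, 8: 0.832, 9: 0.897, 10: 0.937}


def two_game_expectation(a, b):
    """Expected wins over two games in which a team scores a and b runs."""
    return WIN_PCT[a] + WIN_PCT[b]


for per_game in (1, 2, 3, 4, 5):
    steady = two_game_expectation(per_game, per_game)
    lumpy = two_game_expectation(2 * per_game, 0)
    print(per_game, f"{steady:.3f}", f"{lumpy:.3f}")
# 1 0.150 0.224
# 2 0.448 0.511
# 3 0.766 0.723
# 4 1.022 0.832
# 5 1.256 0.937
```

Under that assumption, the all-in-one-game pattern comes out ahead only at 1 or 2 runs per game; from 3 runs per game up, the steady pattern wins.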

COMMENTS (10 Comments, most recent shown first)

evanecurb
I wonder if an offense can be constructed to maximize efficiency while controlling for other factors. I’m assuming it would be difficult to foresee.
4:34 PM Apr 13th

djmedinah
About this sentence: “It is likely and reasonable that home run hitting produces runs in more efficient game patterns than sequential offense.” Could that be what is driving the home run rate—that is, that teams are (undoubtedly unconsciously, but you know what I mean) selecting for home run ability in order to “smooth out” offensive production? Another approach to this problem is to see if run production has become "less lumpy" over time.
3:31 AM Apr 12th

FrankD
Interesting article. Two things: Just thinking about it a priori, consistency is the key. Take two teams who avg 5 runs a game, but one either scores 10 or zero while the other team always scores 5. The first team is, at best, a .500 team. The latter should do much better. Of course, if you are a very weak team and avg 2 runs a game, then you'd probably do better scoring 0,0,0,8 for every four games than 2 every game.

The other thing is, how in the heck would you create a team that scores inconsistently: given the same players, could you create a lineup that scores more inconsistently? Or, could you build such a team?
11:30 PM Apr 11th

waisanhart
shthar, Bill pointed out in the 1988 Abstract that while the 1987 Tigers got off to a poor start (11-19), the 11 wins included a number of blow-out wins, so it wasn't very surprising that they came back to post the best record in MLB for the season.
5:42 PM Apr 10th

jgf704
Volatility in runs allowed would be a factor too, wouldn't it -- though in the other direction. That is, a team that allows an *inconsistent* number of runs will win more games than a team that allows a consistent number of runs.

FWIW, I actually did a little study of this myself many years ago (I published it in a post on the old Sons of Sam Horn message board, but I'm no longer a member, so I have no idea if I could retrieve it). I had game-by-game run distribution data for approximately 10 years (1990 to 2000), and I computed the mean and the standard deviation of both runs scored and runs allowed per game. Then I correlated these with winning percentage and came up with a modified Pythagorean relationship.

That is, in just looking at the mean, it is true:

W/L = (RunsScored/RunsAllowed)^1.83

But when I included the standard deviations, I found something like:

W/L = (RunsScored/RunsAllowed)^1.4 * (stdevAllowed/stdevScored)^0.5

I might not be remembering the exponent values exactly right.
11:07 PM Apr 9th

shthar
How does this track with a much earlier observation of Bill's that, 'good teams don't win one run games, they win blowouts'?

He was talking about the 84 tigers, if I recall. And that winning one run games wasn't really an indicator of anything besides luck. So a team's record in those games could be good or bad and did not track the teams overall record.

Now that I think about it, I don't know if he showed that teams that win more games, win more blowouts, but he must have, right?

Now that I think about it, it might have been the 87 tigers.

2:41 PM Apr 9th

raincheck
This makes a ton of sense. Your odds of winning go up as you score more runs. But at a certain number of runs, the curve starts to flatten (maybe at seven runs), and then it actually becomes flat (at 15 runs).

So teams that have a lot of very high run games are scoring inefficiently. They are scoring a lot of runs with very little or no value.

I assume there is a huge random element here. I mean, you can’t save up some of those 16 runs you got today like the announcers will invariably suggest. You might get shut out tomorrow, but laying down your bats once you have 9 runs won’t prevent that.
1:14 PM Apr 9th

StatsGuru
The 1988-1989 Dodgers won't be in your study, but I thought they were an interesting example of consistency vs. inconsistency. Runs allowed by the two teams were nearly identical. In 1988, the Dodgers scored 3.88 runs per game and were +3 in wins over the Pythagorean projection. Their runs were distributed in a somewhat normal curve.

In 1989, their offense dropped to 3.46 runs per game, and they were -5 in wins. The distribution of their runs scored was concave; most of the weight was in the 0-2 run range or the 6+ range.

I'm glad you did that study, since those two seasons always resonated with me about efficient offenses.
10:30 AM Apr 9th

3for3
I think the win chart shows that there is an advantage to being efficient. If we had a choice of averaging 4.5 runs, or always getting 4, getting 4 does slightly better.

The problem is that it is quite impractical to arrange your offense to be efficient. There just isn’t a way to do it, and even if you could, there’d still be opposing pitchers and defenses that will take you away from efficiency. Facing Kershaw one day, and a back of the rotation pitcher on a bad team the next? Well, you see the idea.
7:38 AM Apr 9th
