September 4, 2016

I’ve been looking into monthly breakdowns lately, as part of a larger inquiry into randomness, and non-random patterns, in players’ careers. There are astonishingly huge swings in some players’ monthly performances over the course of their entire careers that don’t really make a lot of sense to me, except as samples of randomness in the data. When we see a player’s career OPS in July dive way below his normal OPS range in the other five months, what are we to make of it? That the player broke down physically July after July, following above-average Junes and preceding above-average Augusts? That he had some sort of psychological problem once the calendar turned to July, which ended happily and promptly when the page turned to August?

I attribute these surprisingly common anomalies to randomness, although I admit I know less about statistics than I do about the estrous cycle of chimpanzees, to quote a renowned Elizabethan poet. I simply can’t find a more plausible explanation, if only by process of elimination. I reject the physical and psychological explanations because over the course of 16- or 22-year careers, I can’t imagine how something like that would crop up year after year. The psychological explanations are particularly troubling, implying as they do issues of character: how could someone be so fragile psychologically as to perform poorly (or well) for mental reasons every time July (or September) comes around, and still be of sound enough mind and character to perform at Hall-of-Fame levels otherwise?

It just makes the most sense to me that, even at the numbers generated over the course of long and successful careers, the sample sizes would still be small enough to result in significant variations on a monthly basis. A HoFer might accrue as many as 600-odd games in a given month over his career, which seems large but which isn’t quite large enough to smooth out all the statistical bumps in the data. A player could conceivably under- (or over-) perform for four straight seasons totaling 600 games, at least to a small degree, although we’d probably search for a more convincing explanation than random data, and maybe we would find it: did he play for a certain manager most of those four years? Bat in a particular spot in the order? Play a certain position? Play in a particular ballpark or league? We could probably find some explanation that would seem, at least, more satisfying than "Hey, that’s just how the numbers shook out."

In the middle of trying to sort this stuff out, I read an article in SI, a pretty unusual look at the relationship between two of my early favorite players, Sandy Koufax and Maury Wills, and it contained an assertion I decided to fact-check. On the night of Koufax’s fourth no-hitter, against the Cubs:

It was a bit cool in Chavez Ravine (which often sees temperatures dip at night in September), while Koufax preferred warmth.

I’d never realized that Koufax preferred warmth, despite being a fan of his for over fifty years, so I decided to check it out. I did know that generally hitters do best in warm months and pitchers in cool weather, but maybe Koufax was an exception or maybe he enjoyed pitching in warm weather despite getting hit harder. I didn’t know, so I checked out Koufax’s monthly performances, which I’ll summarize briefly: nothing stood out about the warm months or the cool ones.

But the in-between months were quite spectacular. June, in particular.

He went 42-12 in June, for a .778 winning percentage. Crudely, multiplying by six, if Koufax had performed all year as he did in June, he would have had 252 wins and only 72 losses by the age of thirty, and we wouldn’t be talking about him being in the Hall despite his low number of career wins, would we?

Of course, there are some very clear reasons for his anomalous Junes that belie the crude numbers: for one thing, he pitched more in June—a LOT more—than in any other month. Of his lifetime 2,324 IP, 21.3% of them took place in June. If you multiply 21.3% by 6, you get 128%, so the 252 wins is a little high. (It works out to more like 197 wins.)   Still, there’s no arguing with that .778 winning percentage, by far the highest of Koufax’s monthly winning percentages. His next highest is .694 in May, and then .650 in July.
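The arithmetic behind that adjustment can be sketched in a few lines of Python. This is only an illustration built on the figures quoted above (42-12 in June, 21.3% of career innings in June); the function name `prorate_june` is made up for the example.

```python
def prorate_june(wins, losses, june_share_of_ip, months=6):
    """Scale a June record up to a full career in two ways."""
    # Crude version: pretend June was exactly one-sixth of the workload.
    crude_wins = wins * months
    crude_losses = losses * months
    # Adjusted version: June held june_share_of_ip of the innings,
    # so the full career is 1 / share times the June totals.
    scale = 1.0 / june_share_of_ip
    return crude_wins, crude_losses, wins * scale, losses * scale


crude_w, crude_l, adj_w, adj_l = prorate_june(42, 12, 0.213)
print(f"June winning percentage: {42 / (42 + 12):.3f}")
print(f"Crude projection: {crude_w}-{crude_l}")
print(f"Workload-adjusted projection: {adj_w:.0f}-{adj_l:.0f}")
```

The adjusted line comes out to roughly 197-56, which leaves the winning percentage untouched and only shrinks the totals.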

His June ERA was also easily the best of his career: 2.43, contrasting sharply with the 3.33 ERA he put up in nearly 400 career IP in August. (So if he enjoyed pitching in warm weather, he must have been a bit of a masochist.) Just to dispose of the monthly variations in career stats, which I believe at this point to be random fluctuations in the numbers (until shown otherwise), my experience (probably borne out by a tedious search of weather records going back a hundred years) tells me that July and August are the hottest months, April and September (or April/March and September/October, as Baseball-Reference has it) are the coolest months, and May and June are somewhere in between. This general statement tells me that if I notice a phenomenon happening in August and it’s due to the weather, well, that phenomenon or something like it ought to show up in July as well. If August is paired with April, however, that’s pretty well telling me that the weather probably has the least to do with the phenomenon.

But Koufax’s two highest monthly ERAs, the only ones he posted above 3.00, are in fact April and August. So I’m assuming that whatever the cause of these monthly splits is, weather is not the place to look first. I can make sense of some of these monthly splits, at least, with simple logic. You’d think, given Koufax’s great performances in the World Series, that he was a pretty good cold-weather pitcher, right? But in fact he threw as many shutouts in the last three games of the 1965 World Series (2) as he threw in his entire career in the month of April, in just about a full season of games: 36 starts, 193 IP. Of course the explanation for this, aside from his "clutchness" in the Series, is the likely fact (given that low IP/starts ratio) that he was pulled from some potential shutouts in his first few starts of any season, which only makes sense.

But given the fact that Koufax had a reputation for clutch pitching, and given the fact that the Dodgers were in many tight pennant races down the stretch, I was a little surprised to find that his record in September was pretty mediocre: 24-19, his lowest winning percentage of any month, and a slightly above-average (for Koufax) ERA of 2.84. Of course, 2.84 is nothing to sniff at, and W/L records are notoriously liable to influences outside of pitching quality, but still... that’s where I would have looked for Koufax’s game to improve, if only by a small margin. And I would have been wrong to look for that.

Since nothing has jumped out at me so far, in a few months of perusing monthly breakdowns, I must admit that I’m inclined to think now of monthly splits as mostly the random result of relatively small sample sizes and the necessity that some months, at these sample sizes, will be better or worse than other months.

That last observation, however banal or obvious it may seem, took me a little time to absorb: looked at from the opposite perspective, though, is it at all likely that even the most consistent player would put up the same exact figures in every month of his career? Of course not. The most consistent player (whom I’m assuming was Hank Aaron, or someone very much like Hank) would have to have slight statistical variations in his synoptic stats, such as OPS, and other, lesser players would have had both smaller sample sizes and less general consistency than Hank, so there must be a certain range in monthly performances that is perfectly normal, something on the order of an .050-.100 monthly OPS variation to which no significance can be attached.
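One way to get a feel for how much monthly variation pure chance can produce is a small simulation. The sketch below is not drawn from any real player: the per-plate-appearance probabilities are invented, the 2,500 plate appearances per calendar month is a rough assumption (about 600 games at four or so trips to the plate), and OPS is simplified by ignoring hit-by-pitches and sacrifice flies. The hitter's true ability never changes, so any gap between his "months" is noise.

```python
import random

# Hypothetical per-plate-appearance probabilities for one unchanging hitter.
OUTCOMES = {"out": 0.635, "walk": 0.100, "single": 0.160,
            "double": 0.050, "triple": 0.005, "homer": 0.050}
BASES = {"out": 0, "walk": 0, "single": 1, "double": 2, "triple": 3, "homer": 4}


def simulate_month_ops(plate_appearances, rng):
    """Simulate one career 'month' and return a simplified OPS."""
    results = rng.choices(list(OUTCOMES), weights=list(OUTCOMES.values()),
                          k=plate_appearances)
    walks = results.count("walk")
    hits = sum(1 for r in results if BASES[r] > 0)
    total_bases = sum(BASES[r] for r in results)
    at_bats = plate_appearances - walks      # walks are not at-bats
    obp = (hits + walks) / plate_appearances
    slg = total_bases / at_bats
    return obp + slg


def simulate_career(pa_per_month=2500, months=6, seed=2016):
    """Return one simplified OPS figure per calendar month of a career."""
    rng = random.Random(seed)
    return [simulate_month_ops(pa_per_month, rng) for _ in range(months)]


monthly_ops = simulate_career()
spread = max(monthly_ops) - min(monthly_ops)
for name, ops in zip(["Apr", "May", "Jun", "Jul", "Aug", "Sep"], monthly_ops):
    print(f"{name}: {ops:.3f}")
print(f"Best-to-worst spread: {spread:.3f}")
```

Changing the `seed` argument and rerunning gives a sense of how wide the best-to-worst gap can get with no cause behind it at all, and lowering `pa_per_month` shows how quickly that gap grows for players with shorter careers.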

I’ll publish my findings in an upcoming article on the various breakdowns of Baseball-Reference’s splits, which continue to fascinate me when they make sense and when they don’t, but I wanted to lay out some of these premises first: monthly variations, and some others for which I can find no general reason or explanation, are a model for random fluctuations when they fall within that small range. At the level of several hundred games, which is about the maximum for typical career monthly figures, variations of .050-.100 OPS points are perfectly normal. But some other stats may have more logical bases behind their variations, which I’ll try to explore.

For example, I was surprised to find that Koufax was a much better pitcher, superficially, in the first halves of seasons over his career than in the second halves. Of his career 165 wins, exactly 100 came before the All-Star game, and only 65 after, while he lost nine fewer games in the first halves of his seasons (39) than he did in the second halves (48). Supporting that .719/.575 W/L split is a half-run difference in his ERAs (2.53/3.04).

Now let’s debunk that illusion a little. In the first place (and this is a quarrel I have with Baseball-Reference), the first half of the season, according to them, doesn’t end between the 81st and 82nd games of a 162-game season, but rather at the All-Star break, which usually takes place a few days after that point. (As one example, Pete Rose, an everyday player if there ever was one and the all-time leader in games played, played 55% of his career games in the "first half" of his seasons, according to Baseball-Reference, rendering the comparison of "halves" of every season statistically suspect. I suppose there are problems with counting the first 81 games as the first half, too, such as the occasional game suspended in the first half but concluded in the second, but even so that would be more useful than simply counting each first half as being up to ten games longer than the second half of each season.) So counting stats, like Wins, are going to be slightly biased towards the first halves.
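A rough sense of the size of that bias comes from a few lines of Python. The sketch borrows Rose's 55% first-half share of games as a stand-in for how the calendar splits a season, which is an assumption: a pitcher's starts need not divide the way an everyday player's games do. The function name is invented for the illustration.

```python
def expected_first_half(total, first_half_share):
    """Counting-stat total expected before the break from schedule length alone."""
    return total * first_half_share


career_wins = 165
first_half_wins = 100
share = 0.55  # Rose's first-half share of games, used as a rough proxy (assumption)

expected = expected_first_half(career_wins, share)
print(f"Actual first-half share of wins: {first_half_wins / career_wins:.1%}")
print(f"Wins expected from the calendar alone: {expected:.1f}")
print(f"Wins beyond the calendar effect: {first_half_wins - expected:.1f}")
```

On those assumptions the longer first half accounts for about 91 of the 100 wins, leaving a surplus of nine or so that still needs some other explanation.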

But non-counting stats, like W/L percentage and ERA, should still be fine, right? Not quite. Koufax’s stats are going to be skewed by certain factors that apply unusually to his case: while I’m touting him as a first-ballot Hall of Famer, his stats are among the least Hall-of-Famey in that his overall career was unusually short for a HoFer, giving him relatively small sample sizes to work from, and it was unusually concentrated into a few seasons, making injuries bear an unusually heavy weight on the random distribution of his stats. For two seasons (1962 and 1964) during Koufax’s brief peak, he physically broke down in August, reducing both his counting stats and his quality for a third of his entire peak period, 1961-1966. (When he tried coming back in late 1962, his ERA was a rather unKoufaxian 10.38. In 1964, he broke down and stayed broken all year.) And finally, his being a pitcher in itself reduces the sample size in certain regards: an everyday player gets up to 30 games a month to put up stats, but a pitcher at best gets only 6 or 7. So Koufax, of all Hall-of-Famers, is going to be unusually susceptible to variations in his numbers, despite his unusual consistency and excellence for his last six seasons. To the extent that monthly breakdowns of Hall-of-Famers are generally going to even out over the course of their careers, Koufax’s monthly records are among the most volatile for reasons we can easily grasp.

Generally, though, Hall-of-Fame players’ stats are going to be relatively stable from month to month, as I hope to show in future essays here, while other breakdowns will show both larger and smaller variations, which I’m hoping to discuss and to explain.


COMMENTS (7 Comments, most recent shown first)

Hah! So do I. Just don't spell it 'Stephen'.
4:31 AM Sep 8th
Steven Goldleaf
No problem, George, I answer to "Steven," "Steve," "Stevie," and "Steverino." Thanks for the encouragement--you can expect a further dive into Bbref's splits in another day or so.
8:18 AM Sep 7th
Sorry *Steven*. I don't know why I called you Steve in my post. My apologies.
5:17 PM Sep 6th
Great article, Steve! I, too, have been fascinated with Baseball-Reference's splits since I discovered them several years ago (they're not that easy to find if you don't know how to look). My current obsession is the home-away split for both hitters and pitchers.

I encourage you to keep analyzing and publishing on months and other split-season stats.

Thanks again,
5:16 PM Sep 6th
Good article, Steven. I saw nothing that showed a lack of understanding of probability (which is not the same as statistics, although you can't do statistics if you don't understand probability). I think that you've suggested weather isn't a factor, so this doesn't matter, but September is, on average, slightly warmer than July in Los Angeles and San Diego and is the warmest month in San Francisco. That's not true of any place I know about east of the Rockies or even many other places in or west of the Rockies; it's not true of Seattle, for example.

I agree that monthly splits aren't worth worrying about unless you have a player with a fairly long career who always does poorly or well in a particular month, and maybe not even then. But over a career I'd look harder for causative or even correlative factors.

It is important, in my opinion, for people to understand just how large a sample needs to be for such variable performance measures as batting and pitching to reliably detect differences. So good for you for helping to make this clear.

I am fairly confident that you know as much about chimpanzee reproductive cycles as I do.
10:02 AM Sep 5th
Steven, it's possible that I have a better understanding of statistics than you do (I understand standard deviation! I've even written code to compute it), but probably not by all that much. FWIW, I think you're right that there is a core of randomness in these breakdowns--though it might be more accurate to say that there are too many factors affecting performance, some knowable, some not, to take into account. And I definitely agree about the dubiousness of where lines are drawn in the calendar. There is even a certain arbitrariness about the division into months: if you're looking at weather, there is usually more difference between 15 Aug and 31 Aug than between 31 Aug and 1 Sep.

I yield to you entirely on the matter of chimps in heat.
7:15 AM Sep 5th
Obviously you need a sample size greater than 1, so looking at some other players (which I assume you are doing for the article, and you're just giving us a teaser here) makes some sense. (I suppose it might make sense to aggregate a bunch of pitchers, but maybe not.) Generally, though, if July/August are supposed to be good months for hitters, that should show up in the data.

Another split--day/night...I have often read that day games are better for hitters because, even with good lighting, visibility is better in day games.
12:06 AM Sep 5th
©2021 Be Jolly, Inc. All Rights Reserved.