
Hot Hand Question

July 11, 2013

From "Hey, Bill". 

Since hitting a baseball requires such coordination of many moving body parts, and it is easy for a piece (and therefore the whole mechanism) to get out of whack, do you think the clusters of hits we take to be hot streaks are just the player executing at a maximum level?



                Yeah, that’s actually a good argument in the way you stated it, and there is a kernel of truth in it.     I relate this to a sport which is more at my athletic level, which is pool.   I have a pool table in my basement, and I run the table several times a day, and count how many shots it takes me to knock down 15 balls.   Sometimes, once in a while, it is 15 shots—I have done it in as few as 11—and sometimes it takes 40 shots.  Or even more.

                I am 100% certain that not ALL of this variance is random.   I am 100% certain that I fall into unproductive habits, as I am playing, that make me less effective—three of them specifically.  I start "sawing" the stroke, pushing the back end of the cue down instead of sending it straight forward into the cue ball.    I start "jerking" the shots, rather than shooting smoothly.   There are some shots for which it is the best approach to flick the cue quickly forward and then jerk it quickly back, but that’s about 15% of the shots, and I find myself doing it repeatedly when it is not appropriate.   Third, I start rushing from shot to shot without taking the time to think through, before each shot, how I want to approach the shot, how I want to spin the ball, where I want to leave the cue ball to set up the following shot, etc.    I know that I do these three things (and have some lesser included offenses); I have known it for years, but I still fall into these bad habits, and find myself missing shots.

                Shouldn’t I believe, then, that baseball players fall into similar bad habits from swing to swing, and that this makes them less effective sometimes than at other times?

                Yes, I should believe that, and I do believe it.   I also believe, for what it is worth, that a batter’s level of effectiveness can vary, to an extent, because of "petty confidence", petty confidence being the transient, unreliable type of confidence that may be here today and totally gone by Sunday.    I don’t question but that these are real variables, and also, players are often playing with manageable injuries that don’t prevent them from competing, but may prevent them from performing at the level they might otherwise reach.  I don’t doubt that there are additional performance variables that I have not mentioned, but which some master of the obvious will find it necessary to point out to us.

                At the same time, there is also the tendency of hits (and all other performance elements), to form random clusters, so then the performance variation has both real and artificial elements.   The question is, to what extent, in watching the games, are we seeing what is real, and to what extent are we seeing an illusion created by random clusters?

                Suppose that you take a player’s statistics, and re-organize his at bats at random . . . or suppose that you take his Strat-O-Matic, APBA or Ethan Allen’s magic spinning wheel game card and recreate a thousand at bats for him.  You will find, absolutely and without question, that there is just as much up and down variation in the random performance as in the player’s actual record.    This has been studied hundreds of times.    There is a very good web site devoted to the issue, the Hot Hand Web Site, maintained by Texas Tech professor Alan Reifman.   I don’t want to overstate the study; the Hot Hand phenomenon has been studied hundreds of times by dozens of different researchers, and occasionally, one of us thinks we possibly have found some tiny and elusive difference between the "actual" and simulated data.    But for the most part, those studies show that the variance in the real-life performance is identical to the variance that would be expected if nothing was operating except the normal randomization.
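
                A minimal sketch of that kind of shuffle test, in Python. The season below is a made-up record for a hypothetical .270 hitter, not real data, and both the 25-at-bat block size and the "streakiness" measure (the standard deviation of block batting averages) are assumptions chosen for illustration, not the method of any particular study.

```python
import random
import statistics

def block_averages(outcomes, block_size=25):
    """Batting average in each consecutive, non-overlapping block of at bats."""
    n_blocks = len(outcomes) // block_size
    return [
        sum(outcomes[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(n_blocks)
    ]

def streakiness(outcomes, block_size=25):
    """Standard deviation of the block averages (bigger = more up and down)."""
    return statistics.pstdev(block_averages(outcomes, block_size))

def shuffle_test(outcomes, n_shuffles=1000, block_size=25, seed=0):
    """Return the observed streakiness and the share of random re-orderings
    that look at least as streaky as the order that actually happened."""
    rng = random.Random(seed)
    observed = streakiness(outcomes, block_size)
    shuffled = list(outcomes)
    at_least_as_streaky = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if streakiness(shuffled, block_size) >= observed:
            at_least_as_streaky += 1
    return observed, at_least_as_streaky / n_shuffles

# Hypothetical record: 600 at bats for a made-up .270 hitter (1 = hit, 0 = out).
rng = random.Random(2013)
season = [1 if rng.random() < 0.270 else 0 for _ in range(600)]
observed, share = shuffle_test(season)
print(f"observed block SD: {observed:.3f}")
print(f"share of shuffles at least as streaky: {share:.2f}")
```

                With a real player's at-bat log in place of the made-up season, a share that is not close to zero would mean the actual ordering is no streakier than a random re-ordering of the same hits and outs.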

                This is not an absolutely convincing argument that nothing is going on here except random variation.    It is certainly possible for other effects to mimic the patterns of random variation closely enough that it might be difficult to distinguish between the two, and it is certainly more than possible that it might be difficult to distinguish between pure random variation and some mix of random and causal variation.    But the question becomes, then, how much causal variation is it reasonable to think might be completely hidden in a mix of causal and random variation?

                Well, if it was 50-50, it would be extremely easy for us to distinguish between the patterns in the two sets of data.    We know how much random variation can be found in any data set; that’s a pretty basic and easy thing to calculate or at least estimate.    If you add a second level of variation equal to the first, it will create obviously non-random patterns.

                If it was 70-30 (in other words, if the causal variation was roughly half the size of the random variation), that, again, would be easy to distinguish from pure random variation.    Even if it was 90-10, we should be able to distinguish between that and pure random variation.   If it was 99-1, maybe we would have a hard time telling one from the other.

                So when you see the variations in performance, what is it that you are seeing?   You’re seeing randomness . . . not pure and absolute randomness, perhaps, but largely randomness.    95% or 99% of what you are seeing is just random variations in performance.

                There is another way to approach the issue.      How much variation in performance is it reasonable to think might occur due to things like petty confidence and a hitter falling into bad habits?

                Well, I don’t know, exactly, but .270 hitters have months in which they hit .400.    Marquis Grissom in June of 1994 hit .385 (47 for 122).    Johnny Damon in 2000 hit almost .500 for a solid month.     It is not reasonable to think that players actually reach that level of performance ability.   If a player could become a .400 hitter for a month, somebody would hit .400 every month.     Somebody would hit .400 in his career.
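
                A rough way to put a number on a month like Grissom's, sketched in Python. It assumes a hitter whose true average is a constant .270 and whose at bats are independent; the 1,500 player-months per season is a hypothetical round figure, not a count from any source.

```python
from math import comb

def prob_at_least(hits, at_bats, true_avg):
    """Chance of at least `hits` hits in `at_bats` independent at bats
    for a hitter with a fixed true average (binomial tail)."""
    return sum(
        comb(at_bats, k) * true_avg**k * (1 - true_avg) ** (at_bats - k)
        for k in range(hits, at_bats + 1)
    )

# The month quoted in the text: 47 for 122.
# 0.270 is an assumed true average for illustration.
p_month = prob_at_least(47, 122, 0.270)

# Hypothetical number of regular-player months in one season (an assumption).
player_months = 1500

print(f"chance for one player-month: {p_month:.4f}")
print(f"expected such months per season: {player_months * p_month:.1f}")
```

                In this sketch the chance for any one player-month is well under 1 percent, which is rare for a given hitter but, spread over enough player-months, is the kind of thing that should turn up somewhere in the league by luck alone.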

                Johnny Damon was essentially the same hitter in 2011 that he was in 1996, and every year in between.    It is not reasonable to think that he suddenly, for a few weeks in his career, became something radically and totally different, and then somehow, time after time, returned to exactly what he always had been.  He is not a shape shifter.   It is much more reasonable to believe that he was the same player or essentially the same player throughout, but that he simply had a cluster of superlative games that made him look different for a short period of time.

                We know that the standard deviation of batting average in a season, for players who bat 500 or more times, is 27 points.   It is not reasonable to suppose that hitters go from being 3 standard deviations above the norm in one month to 3 standard deviations below the norm the next month.   It is even less reasonable to suppose that a hitter flips from standard deviations above to standard deviations below from week to week (in terms of his true underlying ability), but somehow finds himself more or less in the same position time after time, given 600 plate appearances.

                Let’s go back to the issue of the variation that occurs in my pool playing from game to game, which might be analogous to the variation that occurs in your golf game if you are one of them golfer types.   How long does it take you to fix a glitch in your putting style, given that you have a certain level of ability as a putterer?   I do fall into bad habits, as a pool player, and this does cost me a few shots now and then, but I also spot these problems and fix them on a pretty short schedule.   30, 40 shots; I’m going to figure out what I am doing wrong, and get it right.

                If we assume that players spot the flaws in their swings and fix them on a similar schedule in terms of the number of swings, what is that going to take?   A couple of games, maybe?

                Look, Yasiel Puig

                a)  is not a .400 hitter, and

                b)  is not temporarily a .400 hitter, either.

                He’s a .280 hitter, probably, maybe a little less.   He doesn’t temporarily become something other than what he is, just because a string of ground balls scoots through the infield and a bunch of Dodger fans are not strong enough to blow away the smoke.    I don’t actually become a good enough pool player to run the table in 15 shots; it’s just something that happens once in a while.  Hot streaks and slumps are smoke and mirrors.    It’s not that there is nothing there; it is more like random variance is New York City and the actual up and down changes in ability are Immokalee, Florida.  When you talk about the people who live in New York City and Immokalee, you’re mostly talking about the Big Apple.


COMMENTS (36 Comments, most recent shown first)

I think hitters fall into bad habits, lose confidence, and go through periods where they swing at more bad pitches than they normally would. I believe there is a causal effect for bad streaks.

I think hot streaks are random events. Players who are executing their fundamentals properly are going to go through weeks where they have more production than in other weeks.

In other words, sometimes players execute poorly and this results in a sustained period of poor performance. Other times, they execute properly, which, in the case of major leaguers, is a very high level of execution. The highs and lows they experience while executing proper fundamentals (which is most of the time, or they wouldn't be in the big leagues for long) are random events.

I have absolutely no evidence to support this.
7:51 PM Jul 27th
I'll just repeat what I said:

"Yes, for the most part, streaks and the like exist, but the EXTENT to which they exist is the real question. And, it doesn't exist to any extent to which a manager is going to make a real decision, other than as a tie-breaker type of scenario. "
6:46 PM Jul 22nd
My last comment and I'll be quiet. I just read the section on pitchers, and noted with interest that you DID identify a clear and significant skill-based variation in starting pitchers.

Here's my suggestion as to why the measurable difference suddenly appeared...

With batters, your sample size was four games -- perhaps 18 plate appearances to determine wOBA. With pitchers, your sample size was the same four games -- but a starter's wOBA is measured over perhaps 80-100 plate appearances -- a four or five times larger sample size.

I think if you ran the batters numbers over the same 80-100 plate appearances, you would filter out much of the random noise, and end up, as with the pitchers, with a measurable impact from skill.
4:01 PM Jul 22nd
Gregg: and I would bet that we would NOT find any of that.

We don't need guesses in this discussion at this point, but evidence.

I've done my part, and if someone else wants to go for it, feel free. I will say that if they try to do these studies, and spend dozens of hours on it, don't feel bad that you wasted all that time without finding anything. It's almost a guarantee that the destination will be fruitless. Your best hope is to enjoy the journey and maybe you'll find something of value.
2:26 PM Jul 22nd
Thanks Tom. I read the section on Batting Streaks (thanks Google Books). Interesting stuff.

But I'm not surprised by the result, because it is based on wOBA, which is a great statistic for large sample sizes, but I think less instructive when looking at very small sample sizes, such as you used in your example. I wouldn't doubt that a high share, perhaps more than half, of the qualifiers in your study were pure beneficiaries of good luck -- balls finding gaps, strings of weak pitchers, etc. As I understand wOBA, it doesn't filter out lucky hits.

I think if you studied slumps and hot streaks for longer periods -- 20 games, or 30 games -- using wOBA -- you would filter out a high percentage of the luck-based streaks. Or, if you used statistics such as line-drive percentage, or swing-and-miss rate or other statistics that you surely understand better than I, you would come much closer to identifying batters whose streaks have a foundation in skill variation rather than luck.

And I'm guessing you would find that skill-based hot streaks and slumps, while not as common as the baseball announcers imply, are far more of a real factor in everyday baseball than you and Bill seem to believe.
1:35 PM Jul 22nd
That is the kind of streakiness I am talking about, among other kinds of streakiness. Whether it is in-game, or game-to-game, or week-to-week, or batter-v-pitcher, or batter-v-pitchHand or anything you can think of.

And we have a whole chapter devoted to it in The Book, and it's probably available for free via Amazon's Look Inside, or Google Books. And in other parts of The Book, we devote a few studies to the idea of streakiness as well.

Even things like reverse-platoon advantage have a lot of illusion to them (on the hitter side anyway... on the pitcher side, it's more easily explained based on the kinds of pitches they throw). Bill James had a tremendous research piece on it in one of his later Abstracts.
12:55 PM Jul 22nd
Tom, perhaps we have been using the term "streak" differently. Your examples use it literally, e.g., three straight extra-base hits and then measuring impact on the fourth at bat. I've been thinking of it in a broader context, with "hot streak" being the opposite of a slump. No single at bat is significant -- but I'm thinking of the player's performance over 20, 30 or 50 at bats -- sometimes even longer.

When I said it is observable, what I mean is that in watching, for example, Ellsbury's batting (discounting the random factors of which balls find holes and which are caught), you can clearly see that the guy, currently, is able to see and hit the ball with more authority than he could two months ago.

And if you want to look at statistics, his line drive percentage was 13% back in May -- an extended slump. In July, he is clearly "locked in," with a line drive percentage of 28%. (I chose line drive percentage to avoid most of the randomness issues of balls-in-play). I'm sure if I cherry-picked sets of, say, 50 at bats, instead of taking arbitrary full months, the percentages would skew even higher. Is it your position that his line drive percentage more than doubled through random chance of the bat hitting the ball on-center more often? I don't see how you could discount a skill factor being present.

Bill's comment was that on his best day, he didn't actually become a good enough pool player to clear the table in 15 shots. I won't argue with that. But if, for one month, he cleared the table in an average of 20 shots, and the next month cleared it in an average of 40 shots, then I WOULD say he was a better pool player the first month.

Perhaps this isn't the kind of streakiness that you are arguing against -- or perhaps it is. If it is, is there research you can refer us to which indicate something to the contrary?

12:05 PM Jul 22nd
If you add a bit of standard deviation to a lot of standard deviation, it's hard to notice the difference. If the SD due to "talent" streakiness is 1/10 the SD due to luck, the overall SD is only one-half of 1% higher than it was before.

(It's a pythagorean relationship: 1 squared plus 1/10 squared equals 1.01. The square root of 1.01 equals about 1.005.)

If the SD of "talent" streakiness is up to 2/10, then you're still only up about 2% in overall SD.

At 30%, the overall SD is up around 4.5%. At 40%, the overall SD is still only around 8% higher.

So it's really, really hard to spot streakiness. The same thing happens when you look for clutch hitting ... the distribution of clutch difference looks almost exactly like random, but that might be just because clutch talent is, say, only 2/10 as spread out as luck, so it only shows up in a 2% difference that's difficult to spot.

Hope I got this math right ...
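
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the pythagorean relationship as described, and it assumes the "talent" and "luck" sources are independent, so their variances add.

```python
from math import sqrt

def combined_sd_increase(ratio):
    """Fractional rise in overall SD when an independent source whose SD is
    `ratio` times the luck SD is added (variances add, SDs do not)."""
    return sqrt(1 + ratio**2) - 1

for ratio in (0.1, 0.2, 0.3, 0.4, 1.0):
    print(f"talent SD = {ratio:.1f} x luck SD -> overall SD up {combined_sd_increase(ratio):.1%}")
# talent SD = 0.1 x luck SD -> overall SD up 0.5%
# talent SD = 0.2 x luck SD -> overall SD up 2.0%
# talent SD = 0.3 x luck SD -> overall SD up 4.4%
# talent SD = 0.4 x luck SD -> overall SD up 7.7%
# talent SD = 1.0 x luck SD -> overall SD up 41.4%
```

The last row is an added comparison: even a "talent" spread as large as the luck spread raises the overall SD by only about 41%, not 100%.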
11:51 AM Jul 22nd
This is a related thread, for those interested:
9:52 AM Jul 22nd
Gregg: thanks for your comment:

110Phil: excellent post. As we have shown in The Book, we ARE able to identify the existence of things like streaks, being hot/cold, etc. But, it's at a very extreme level, very hard to detect, and virtually impossible to use as an actionable item.

So, when Gregg says "But the existence of a non-negligible skill factor in streaks seems entirely observable ", I don't see the evidence for that statement. If it's "entirely" observable, then it should be fairly easy to show evidence for it. And the fact is that no one has shown evidence for it being "entirely observable" or being beyond "non-negligible".

Do players who get on base three times in their first three times up manage to get on base at a better than normal level their fourth time? No. Do players who get three extra base hits in their first three times up manage to increase production their fourth time? No. Do players who do that while hitting at least one HR increase production? No.

Do pitchers who retire 18 straight batters to start the game do better the third time through than they otherwise would? Yes! But, is it noticeably better? Not really.

This is what we are up against. Yes, for the most part, streaks and the like exist, but the EXTENT to which they exist is the real question. And, it doesn't exist to any extent to which a manager is going to make a real decision, other than as a tie-breaker type of scenario.

9:18 AM Jul 22nd
I think one of the things clouding the conversation, and that was plaguing me as I read the article... is that most of us have experience with baseball (or sport in general) at the amateur performance level. We play in Little League and/or High School and/or College and/or weekend warrior leagues, and as such we experience the variation of skill-- and this feels magnified to us because our games are more infrequent.

When I play once a week I have fewer opportunities to make the adjustments that Zeke referenced earlier. When I get stuck in a bad habit, I can't fix it the next day and so it can compound itself in my petty confidence-- which is already prone to fluctuation due to the fact that I do not perform at a professional level of skill.

And it's this experience that clouds our willingness to see that the professional game's day-in-day-out frequency of opportunity, matched with the incredibly high skill level, evens out that skill & petty confidence variation to negligible levels.

2:07 PM Jul 19th
I am around 5-foot-7. I have not been measured with great accuracy. It's possible that I am actually 5-foot-7.1, or something, or maybe even 5-foot-7.5. But it's NOT possible that I'm 6-foot-3, or even 5-foot-9.

Similarly, the fact that our measurements aren't accurate enough to say that there are NO talent-related streaks/slumps does not change the fact that we have very, very strong confidence that such streaks/slumps are very minor. If 5-foot-7 is normal performance, we haven't ruled out that players have regular hot streaks where they're 5-foot-7.1. But we HAVE ruled out that streaky players are regularly 6-foot-3 -- or even 5-foot-11, or 5-foot-8.

I would certainly agree that there is SOME inherent talent streakiness. As Tango often says on his blog, where you have human beings, you have variation. The question is, HOW MUCH is there? And the evidence says, not enough to matter, and not enough to be able to see with the naked eye.
10:52 AM Jul 19th
Mr. Tango, you are right, apologies for overstating things. I respect you and your work, as well as Mr. James.

But the existence of a non-negligible skill factor in streaks seems entirely observable (especially in slumps), entirely consistent with what we personally experience as athletes, and entirely consistent with the human condition. I don't know any person, in any field, who doesn't experience slumps and "hot streaks" in terms of their own performance level, whether they are writers, programmers, singers, business managers or researchers.

It seems to me that in order to conclude, to the contrary, that a batter's skill level somehow persists at an almost constant level, without peaks and valleys that meaningfully impact results, requires incontrovertible statistical modeling, with no obvious flaws. At least in this article and the discussion, I haven't seen that.

6:39 AM Jul 19th
I don't have any "predetermined" conclusions. My conclusions are based on research I've done. The evidence directs my conclusions, not the other way around. It's absurd for you to suggest otherwise.
8:37 PM Jul 18th
Mr. Tango, I appreciate your acknowledgement that the randomness has not been measured in an accurate way for batters.

But your further comments appear to me to be reasoning backwards to justify your predetermined conclusion. The fact is, we have absolutely no idea (from any previous analysis) of how skill-based variation has entered into the statistics. But the idea that it could be "even less" a factor than before, is completely absurd. If it shows up that way, all that proves is how inadequately we've grasped what is going on.

8:27 PM Jul 18th
Gregg is right. Except the items he points out will INCREASE the variation we expect from non-batter skill (random variation + variation in pitcher talent + variation in parks), and thereby reduce the gap between this non-batter variation and the total variation we actually see.

If someone wants to bet on the hot-hand, there's plenty of bookies willing to take their money to that effect.
10:16 AM Jul 18th
Bill, your argument seems based on your confidence that one can determine the random variation that would typically occur for a given batter -- providing you with a firm baseline to measure variation against. But I don’t share that confidence.

It seems to me that the method used to create a baseline IS only valid in something like coin flips, because the individual probability on every flip is 50%, which matches exactly the coin’s overall “batting average” of .500.

But a player who averages .270 clearly does NOT have a 27% probability in every at bat (even if we assume absolutely no skill-related variation). If he’s a lefty, facing Koufax in his prime, clearly he does NOT have a 27% chance of getting a hit – perhaps more like 16%. Other factors besides the opposing pitcher come into play as well in determining the individual probability of each at bat.

What a .270 batter in fact experiences over a season is a series of probabilities which AVERAGE 27%, but individually vary between, perhaps, 16% and 38%. Furthermore, these probabilities are not randomly scattered, but are at least to some degree clustered (the batter may face Koufax four consecutive times, for instance).

I don’t know what the difference would be in peaks and valleys between a string of varying probabilities that average 27%, and a string of probabilities that are a constant 27%. But certainly we could not expect identical results.

To my thinking, unless someone has actually constructed a credible model of varying probabilities which mimic a baseball season, we don’t have a valid baseline to measure against, and therefore cannot reach the conclusion that the impact of variations in skill level are negligible.
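
A toy version of such a model can be sketched in Python. Everything in it is an assumption made for illustration: the three hit probabilities (16%, 27%, 38%, equally likely, averaging 27%), the idea that the same probability holds for a cluster of 4 consecutive at bats, the 24-at-bat blocks used to measure ups and downs, and the absence of any variation in the batter's own skill. It is not a model of a real schedule.

```python
import random
import statistics

def simulate_block_sd(probs_for_season, n_seasons=1000, at_bats=600, block=24, seed=1):
    """SD of batting average across 24-at-bat blocks, pooled over simulated seasons."""
    rng = random.Random(seed)
    block_avgs = []
    for _ in range(n_seasons):
        probs = probs_for_season(rng, at_bats)
        hits = [1 if rng.random() < p else 0 for p in probs]
        for start in range(0, at_bats - block + 1, block):
            block_avgs.append(sum(hits[start:start + block]) / block)
    return statistics.pstdev(block_avgs)

def constant_probs(rng, at_bats):
    """Every at bat is a flat 27% chance."""
    return [0.27] * at_bats

def clustered_probs(rng, at_bats, cluster=4):
    """Each cluster of 4 at bats shares one chance drawn from 16%, 27% or 38%."""
    probs = []
    while len(probs) < at_bats:
        probs.extend([rng.choice((0.16, 0.27, 0.38))] * cluster)
    return probs[:at_bats]

constant_sd = simulate_block_sd(constant_probs)
clustered_sd = simulate_block_sd(clustered_probs)
print(f"block SD, constant 27%:        {constant_sd:.4f}")
print(f"block SD, clustered 16-38%:    {clustered_sd:.4f}")
```

In this sketch the clustered schedule produces somewhat more block-to-block variation than the constant one, so under these assumptions the purely non-skill baseline gets a little wider, not narrower.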

9:19 AM Jul 18th
Under the "petty confidence" idea, keep in mind that the opposing pitcher's perception of a hitter's "hot streak" will also play a role. Even if it is based on random chance, if the pitcher believes it to be otherwise it can affect the outcome.

Basically, "pitching carefully" to a hitter you believe to be especially dangerous in the current circumstance could and probably does affect the outcome of the plate appearance, even if it simply increases their walk rate.
2:09 PM Jul 16th
I think part of the problem is people think of "luck" as only things that *look* lucky, like "hitting the ball hard but right at someone."

But even "real" hits are lucky, in the sense that -- among other things -- you swing at where you think the ball will be and sometimes you're right and sometimes you're not. And you have only so much control of your body: sometimes your bat goes exactly where you want, and sometimes you're off by half a millimetre and the ball goes foul.

The word "luck" causes a lot of problems because, intuitively, it has to SEEM like luck for us to think of it that way.

It's like ... suppose you think people can control dice rolls, make a die land six. You roll a die and it lands six. People think, "skill!" You roll a die, and it hits a bump on the table and takes a weird bounce and stops on six. People think, "luck!"

But they're both luck.
9:50 AM Jul 16th
Quoting from Greggborgeson:

Some hot streaks are caused by luck (hits falling in) and some are caused by a player being locked in. Some slumps are caused by luck (balls hit hard but right at someone), and some by temporary loss of skills (Papi in the spring a few years ago). Then there is the middle ground where a batter plays more or less at his skill level.

If we can isolate skill-based slumps and skill-based hot streaks from their random cousins, then we have something to study. It seems to me that this could be done by more skilled people than me, with statistical tools available.

Yes, that is exactly my point: That COULD be done--if it was real. The problem is, it can't be done. Hundreds of efforts have been made to do exactly that, and they have essentially all failed. We thus reach the conclusion that there's not very much there.

12:57 PM Jul 15th
Like hitters, pitchers too show variances. It's not that rare for a pitcher to be dominating through three innings, lose some kind of edge and give up four runs in the fourth, and return to dominance. The change is very small. What amazes me is pitchers who can be dominant game after game, inning after inning (Halladay a couple of years ago comes to mind). When the game is based on a round ball being struck by a round object, the tiniest difference in contact can produce a significantly different result.
11:00 AM Jul 15th
I'm still waiting to hear that Puig is actually 27 years old.
5:26 AM Jul 15th
Bill, I'm not a mathematician, but I'll try to answer your question in layman's terms.

Some hot streaks are caused by luck (hits falling in) and some are caused by a player being locked in. Some slumps are caused by luck (balls hit hard but right at someone), and some by temporary loss of skills (Papi in the spring a few years ago). Then there is the middle ground where a batter plays more or less at his skill level.

If we can isolate skill-based slumps and skill-based hot streaks from their random cousins, then we have something to study. It seems to me that this could be done by more skilled people than me, with statistical tools available.

If it turns out that skill-based slumps and hot streaks roughly balance, then you have the answer to your question (how can you add variation in skill to random variation, and not impact the end result).
8:04 AM Jul 14th
In a sequence of Bernoulli trials (events with outcomes Yes or No, such as coin flips or batting average) having different expected outcomes for subgroups of trials REDUCES variance over the complete data set. The formula for Bernoulli variance is Npq, where N is trials, p is probability in one trial, and q is (1-p). A .300 hitter in 600 AB has variance 600x.21 = 126 for hits in a season; if he's actually a .200 hitter half the time and a .400 hitter half the time, his variance is 300x.16 + 300 x.24 = 48 + 72 = 120.

But that's looking at the whole season as one observation. To detect the existence of subsets, you'd need to test the variance among candidate subsets. For example, if you think a hitter is different from week to week, then you need to look at the variance among weeks. If that isn't any different than chance (and quality of opposition and park effects) would predict, then you can conclude that weeks are not a streak event.
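
A short Python sketch of both halves of this argument. The season figures are the ones above; the week-level part assumes a hypothetical 25 at bats per week and a hitter whose true average for a given week is .200 or .400 with equal chance, which is an illustration and not a claim about any real player.

```python
def bernoulli_variance(n, p):
    """Variance of the number of hits in n independent at bats: N * p * q."""
    return n * p * (1 - p)

# Whole season as one observation.
constant = bernoulli_variance(600, 0.300)
split = bernoulli_variance(300, 0.200) + bernoulli_variance(300, 0.400)
print(f"season variance, constant .300: {constant:.2f}")   # 126.00
print(f"season variance, .200/.400 split: {split:.2f}")    # 120.00

def weekly_variance(at_bats_per_week, week_probs):
    """Variance of weekly hit totals when each week's true average is one of
    `week_probs` with equal chance: within-week variance plus the variance
    of the weekly expected totals (law of total variance)."""
    n = at_bats_per_week
    within = sum(bernoulli_variance(n, p) for p in week_probs) / len(week_probs)
    means = [n * p for p in week_probs]
    grand = sum(means) / len(means)
    between = sum((m - grand) ** 2 for m in means) / len(means)
    return within + between

print(f"weekly variance, constant .300: {weekly_variance(25, [0.300]):.2f}")        # 5.25
print(f"weekly variance, .200/.400 weeks: {weekly_variance(25, [0.200, 0.400]):.2f}")  # 11.25
```

So a split that slightly lowers the season-level variance would more than double the variance among weeks in this example, which is why the test has to be run on the candidate subsets.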
5:57 AM Jul 14th
If you add a bit of standard deviation to a lot of standard deviation, it's hard to notice the difference. If the SD due to "talent" streakiness is 1/10 the SD due to luck, the overall SD is only one-half of 1% higher than it was before.

(It's a pythagorean relationship: 1 squared plus 1/10 squared equals 1.01. The square root of 1.01 equals about 1.005.)

If the SD of "talent" streakiness is up to 2/10, then you're still only up about 2% in overall SD.

At 30%, the overall SD is up around 4.5%. At 40%, the overall SD is still only around 8% higher.

So it's really, really hard to spot streakiness. The same thing happens when you look for clutch hitting ... the distribution of clutch difference looks almost exactly like random, but that might be just because clutch talent is, say, only 2/10 as spread out as luck, so it only shows up in a 2% difference that's difficult to spot.

Hope I got this math right ...
11:46 PM Jul 13th
The question to which no one has suggested an answer is this: How can you add variation to a sequence of events without increasing the variation? If there is water in a bathtub and you add more water, the water level rises. If there is iron on a scale and you add more iron, it weighs more. How is it, then, that there is variation on a scale due to randomness, and you add more variation due to other causes, and yet there is no more variation than there was before? How does that happen?
7:28 PM Jul 13th
It's great to see this article. In the context of shooting percentage in basketball, I've heard that the good and bad streaks can be attributed completely to randomness, but that never seemed satisfying to me. Players deal with mental and physical short-term drawbacks that the player has to break out of to get out of a bad streak. Or mental and physical peak performance, which he tends to break out of unwillingly when he starts to think too much (or not enough), or when he gets an injury.

From what others have said, we can think of performance as being on a graph, with the x-axis being low to high mental/physical factors, and the y-axis being low to high luck (randomness). For example, if a player is at a peak mentally and physically and is therefore hitting line drives, but at the same time he has bad luck, then he'd be way to the right on the x-axis, but way down on the y-axis. So in that case there are extreme causes that can average out to average results, which is indistinguishable in outcome from average luck and average mental/physical causes. Does that make sense? When everything is working together, he is way up in the right corner of the graph.

Anyway, when the causes work in synchronicity (high luck, high mental/physical peak or vice versa) I would think that a statistician could detect that the result is not due to randomness, but it seems I never hear that being acknowledged by those who analyze data on streakiness. Of course, as has been mentioned, these mental/physical peaks and valleys are generally of short duration, but still relevant in showing a streak that is not due to

12:51 PM Jul 13th
Great article, and good input from your readers. But I really can't buy into Bill's conclusion that hot streaks and slumps are smoke and mirrors. That may be true of many or most of what we see as hot streaks. But I don't believe it is true of all of them.

Picking up on Raincheck's comments, your eye can tell you which hot streaks are luck and which are because a batter is locked in. The stats may look the same, but the way the ball flies off the bat is entirely different.

These are two very distinct phenomena, and I would think that there might be objective ways they could be distinguished and thus isolated for further study. Perhaps comparing, against a batter's norm, their strike-out rate, swing-and-miss rate or batting average on balls-in-play would enable one to tease out the luck factor and more objectively identify a batter who is truly locked in mechanically for an extended period of time.
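
One rough way to sketch the comparison suggested above is to express a batter's rate during a streak as a z-score against his own norm, using the binomial standard error. The function name `rate_z_score` and every number in the example are invented for illustration; nothing here comes from real player data, and a z-score on a sample this small is only a hint, not a verdict.

```python
import math

def rate_z_score(events, opportunities, baseline_rate):
    """z-score of an observed rate versus a baseline rate.

    Uses the binomial standard error at the baseline rate, so it answers:
    "how unusual is this stretch if the batter's true rate never changed?"
    """
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    if not 0 < baseline_rate < 1:
        raise ValueError("baseline_rate must be strictly between 0 and 1")
    observed = events / opportunities
    std_error = math.sqrt(baseline_rate * (1 - baseline_rate) / opportunities)
    return (observed - baseline_rate) / std_error

if __name__ == "__main__":
    # Hypothetical hitter: career strikeout rate .200, career BABIP .300.
    # Hypothetical hot streak: 6 strikeouts in 60 plate appearances,
    # 18 hits on 42 balls in play.
    print("strikeout rate z:", round(rate_z_score(6, 60, 0.200), 2))
    print("BABIP z:", round(rate_z_score(18, 42, 0.300), 2))
```

On this reading, a streak with a sharply lower strikeout rate would point toward the batter being locked in, while a streak carried mostly by an inflated batting average on balls in play would lean toward luck.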

7:27 AM Jul 13th
(1) Okay, this is true, that a Strat-O-Matic card will show as much variance as a real player would. A real player's variance is not wider than the variance of normal randomization.

(2) But does the conclusion follow?! I don't believe that it does. Just because Effect A is not larger than Effect B, can we conclude that Effects A and B have the same CAUSES? A person's skin can turn red from a Niacin flush, and can turn red from being embarrassed. Supposing the skin turns red to the same degree in both cases. Do we conclude inevitably that the red skin is being caused by the same thing?

An imaginary S-O-M player hits in 9 games in a row due to throws of the dice. A real MLB player hits in 9 games in a row due to his having his timing in rhythm, his wife being nice to him, and his eating chicken instead of Oreo Double Stufs. Why would the 9 games in a row necessarily imply the same cause, just because they happen to occur to the same extent?
5:14 PM Jul 12th
Is everyone's peak the same? Is there an ultimate ceiling for all players, or does each player's vary?

This article made me think of Bull Durham, where Nuke throws what Crash calls and it's perfect, with him saying something like, "that's beautiful, what I do."
I'm reminded of when Greg Maddux would (I believe) throw a complete game shutout on fewer than 70 pitches. Ho-hum, just cruising.

Isn't this hot streak (peak) what scouts look for in players? Their ceiling? Shouldn't they be looking at their average? Guys who drive fans crazy because of their talent level and potential not living up to it or not being able to maintain it over the course of a whole season are all over the place.

Personally, I'd like to have players on my team that never make it on the "hottest" lists. I'd rather a guy hit between .270-.290 every month instead of up and down. Just as there is variance in results, is there not variance in controlling these ebbs and flows? Is it not a skill to remain relatively consistent?
4:38 PM Jul 12th
There are sometimes observable differences in hot streaks. There are the "every ground ball is finding a hole" streak and then there are the "this dude is hitting a lot of balls on the nose" streaks. I follow the Dodgers, and Nick Punto had one of the former earlier this year. A very visible case of fool's gold. Hanley Ramirez has been on one of the latter lately, where he is not just hot, his outs look good and he is hitting majestic foul balls.

Punto has no skills, any hit he ever gets is random. Certainly some of Ramirez' success is random. But there are times when a player has all of his mechanics in place, and his mental state is right and he performs at a peak. We can't stay in that ideal place forever, but any former athlete has felt it, for 15 minutes or a month.

8:37 AM Jul 12th
A Master of the Obvious is chipping in here as requested. Thanks for a thought-provoking article. Following up on some of the comments, it seems obvious to me that streaks fall into three categories: good luck, bad luck, and an actual drop in skill. I think we should very rarely see a temporary hot streak driven by anything but luck. Most professional athletes are focused and well-trained--they are playing near the peak of their ability most of the time. If the hits are falling in for a week or two at an unusual pace, that's luck. But when the hits stop coming for a week or two, that could be luck, or it could be a mechanical flaw, or an illness in the family, or a sore ankle, or a million other things. Nobody disputes that these kinds of things hinder performance. The nearly impossible thing to tease out from data alone is whether you have bad luck or a real problem. There might be statistical ways to determine if a particular player has more slumps than hot streaks, suggesting that he has tangible problems which might be fixed. I'm not sure even a few seasons of data would be enough to tease that out statistically, though. That is why scouts and managers with a strong understanding of human nature are still necessary for developing talent--they can distinguish between luck and skill.
7:23 AM Jul 12th
Under the thread lidsky started, I paraphrased what Bill once said: the fact that most hitters' batting averages are so consistent from season to season makes it extremely unlikely that they have very different skill levels from week to week. But...

The complexity of mechanics argument does make some sense, in that a player could lose focus or fall out of practice and need some coaching help just to get back to where he was. But what would cause a HOT streak, which is what a lot of the debate about streaks focuses on? Is there a reason (other than to reward him) to give the substitute a start if he got a pinch hit yesterday?

Earlier this year, Orioles closer Jim Johnson ended his streak of 35 consecutive saves by blowing three in a row. Then, after converting one save and making a non-save appearance, he blew another. Watching those games, it was clear his mechanics were off. So that explains a COLD streak, but where does HOT come from, and why can't it be made permanent?
6:03 AM Jul 12th
One of the things that I think tends to get forgotten in the "true talent + randomness" vs. "it's a game of adjustments" debate is that while I'm totally willing to assume some hitters notice and fix their flaws faster than others, that's already included in the "true talent" portion of the equation. Seems logical to me that at least part of the reason somebody might be a .280 hitter instead of .260 (or vice versa) would be how fast he can work out the kinks.
5:56 AM Jul 12th

Wow – this is identical to what I was trying to address in the Reader Posts section over the last couple of days. Of course, not as eloquently. (Makes sense – you create prose for a living – I create chips). I did utilize the same pool and golf analogies as I tried to explain my thoughts.
What I have been trying to posit, and having a hard time putting into words, is that there are two causes of the “streakiness”. One is literally the luck – the ground balls find the holes. And the other is the fact that, for the reasons you describe in your playing pool, our skill varies. That skill variation – though we can feel it – is also random in nature and has its own random distribution. The randomness of luck floats on top of this randomness in skill level. If the skill variation dominated we would still have a random distribution of streaks, but the variation would be much larger.
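
The two-layer idea above can be sketched as a small simulation: each week a hitter's "true" skill is drawn at random, and then luck is layered on top as one random trial per at-bat. Everything here is an assumption made for illustration, including the function name `weekly_averages`, the .280 mean, the 25 at-bats per week, and the .030 week-to-week skill spread. It is a toy model, not a claim about how real hitters behave.

```python
import random
import statistics

def weekly_averages(skill_sd, mean_skill=0.280, at_bats=25, weeks=20_000, seed=1):
    """Simulate weekly batting averages under two layers of randomness."""
    rng = random.Random(seed)
    averages = []
    for _ in range(weeks):
        # Layer 1: this week's "true" skill, clamped to a valid probability.
        skill = min(max(rng.gauss(mean_skill, skill_sd), 0.0), 1.0)
        # Layer 2: luck, one independent trial per at-bat at that skill level.
        hits = sum(rng.random() < skill for _ in range(at_bats))
        averages.append(hits / at_bats)
    return averages

if __name__ == "__main__":
    steady = statistics.pstdev(weekly_averages(skill_sd=0.0))
    wavering = statistics.pstdev(weekly_averages(skill_sd=0.030))
    print(f"spread of weekly averages, constant skill: {steady:.4f}")
    print(f"spread of weekly averages, varying skill:  {wavering:.4f}")
```

Under these assumed settings the luck layer accounts for most of the week-to-week spread, and the skill layer widens it only modestly, which is the sense in which luck "floats on top of" skill. Raising `skill_sd` or `at_bats` shifts the balance toward skill.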


Going back to the Puig example. Maybe on average, over a longer period of time, he has the skill of, say, a .280 hitter, but this month he is “feeling it” and is playing at the skill level of a .320 hitter and getting the lucky breaks on top of that.


I think you’re right about the luck variation dominating the skill variation, and it is likely because the larger peaks and valleys of the skill variation are relatively short lived. Just as I'm sure you feel it in pool, and I feel it in golf, once we think we really have it figured out – it quickly vanishes.

Great post - thanks.

1:06 AM Jul 12th
A really stimulating investigation of the subject ... I especially like the question, How many standard deviations can skill deflect? In tournament chess, 200 rating points equate to an SD, and it's well accepted that players can (with difficulty) sustain a "hot streak" of just about 1 SD "over their heads" for a few months at a time. They can (very easily) sustain play 1-2 SD's below their ability ... :- )


I think there is an esoteric point that we overlook here, in our zeal (as sabermetricians) to conclude that Random Is Random and we can capture human imperfections with stats.


The story goes that a stats professor challenged each of his students to write out 100-trial coin flip sequences by hand, "inventing" them, and then to actually flip coins and record those. The students shuffled the two sets together. Then the professor would casually sort through the pile, tossing the real trials into one pile and the "invented" coin flip records into another, and be right almost every time.

This was because REAL COIN FLIP SEQUENCES ARE MUCH MORE RANDOM THAN WE'D EXPECT. The real records would routinely have 7-heads sequences in them, but the students would always limit their streaks to two and three...
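
The claim about long runs in real coin flips is easy to check with a short simulation. The sketch below estimates how often 100 fair flips contain a run of at least 7 heads; the function names and the trial count are choices made here for illustration and are not part of the story.

```python
import random

def longest_heads_run(flips):
    """Return the length of the longest run of consecutive heads (True values)."""
    best = current = 0
    for is_heads in flips:
        current = current + 1 if is_heads else 0
        best = max(best, current)
    return best

def share_with_long_run(n_flips=100, min_run=7, trials=10_000, seed=0):
    """Estimate the share of n_flips-long fair sequences with a heads run >= min_run."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]
        if longest_heads_run(flips) >= min_run:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    print("share of 100-flip sequences with 7+ heads in a row:",
          share_with_long_run())
```

Lowering `min_run` to 3 shows why hand-invented sequences that stop at streaks of two or three are easy to pick out.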


It's only an analogy, but the human mind (and its expectations; the "petty confidence") affects human performance.

It's very likely that Strat-O-Matic cards "should" have hot streaks of (say) magnitude 7 .... and that real humans "should" have hot streaks of magnitude 3 ... and that when you witness humans having hot streaks of magnitude 6-7, this actually IS a reflection of the fact that they are having remarkable hot streaks and cold streaks.

Or not :- )
11:03 PM Jul 11th
©2024 Be Jolly, Inc. All Rights Reserved.|Powered by Sports Info Solutions|Terms & Conditions|Privacy Policy