Oh, Bee Pee? (SECOND TRY)

April 16, 2021

Couple of days ago, I asked the Oracle (i.e., "Hey Bill") an idle question, which is to say a question I had no idea the answer to, and couldn’t have cared less what the answer was. Typical lazy American thinking, I know, which makes it, I guess, an "American Idle" question. Might as well copy out the Q. and A., since they were both short and to the point, and since Bill didn’t have much more to his answer than my question warranted, this can illustrate the futility of both:

 

If you had only two data points, 1) a player's exact age, and 2) his OBP on the day he played his 400th MLB game, how accurately do you think you could predict his career OBP? Crapshoot? As well as you can predict the weather six weeks from today? Within .010-.020 OBP points?

Asked by: Steven Goldleaf


Answered: 4/10/2021

 I would guess I could get within 10-20 points 80% of the time or more.  

 

Thinking back, I was trying to ask HOW Bill would figure this one out, rather than the question I did ask, which was IF he thought he could figure it out. In the larger sense, which was so large that I couldn’t possibly fit it into the narrow frame of a "Hey Bill" question, I was wondering if there is an early point to most players’ careers where their stats were essentially set in stone, and the final decade or so of their careers is just a matter of playing out the hand they’re dealt. Take a youngish player like Pete Alonso—can you look at his stats, particularly his rate stats (because the one big variable in amassing counting stats is career length, which basically rests on health) and predict with any accuracy where they’ll end up?

Now, of course you can do this relatively late in a player’s career. It doesn’t take much smarts to look at a 30-year-old with 1000 games under his belt and a .350 OBP and predict that he’ll wind up with a career OBP in the .340-.360 range, and if pressed further you might go towards the low end, since you know that his last year or two or three, he’ll probably lower that nice cushy OBP a bit and then lose his job.

But if we can confidently predict the future of a 30-year-old with 1000 games played, can we do the same with a 29-year-old with 850 games played? As Don Vito Corleone says in his first spoken line of English in the Godfather saga, "Yeah, sure, why not?"

So how low can we go? How early in a player’s career can we inspect his OBP through that season, and derive a reasonably accurate guess as to his final MLB performance in OBP?  Obviously we can’t do it based on his first few plate appearances (though I didn’t specifically check that out—it would be a goofy thing to do, which might put it right up my alley).  But could you do it after the 400 games I arbitrarily threw out to the Oracle?

I picked that number because it seemed low enough to have predictive value, and because 400 games seemed to represent about three seasons of play, about the point that I think players have shown us who they are.

I also specified ‘a player’s exact age,’ mainly because I suspect Bill has a data base that can actually rank players by their dates of birth, and because I know that Bill places a lot of importance on players’ dates of birth—if a 21-year-old has a much better chance of a strong career than a 22-year-old, given the same exact stats (a thesis of Bill’s that continues to amaze me), then presumably even smaller differences will have some sort of effect, a half-year jump for sure, but maybe even a month or a week could prove advantageous.

Problem here is I have no such database. I wouldn’t even have a clue how to design a study that relied on players’ exact dates of birth (and why not hours of birth while I’m at it?), other than the brute force method, and I’m not a very forceful guy. So ‘exact date of birth’ is a non-starter, and ‘exactly 400 games’ is something I could find, again, only by brute force, the tedious if not impossible task of going through each player’s career to find the season in which he played his 400th game and then to go through his daily log to learn which game that was within that season, and then computing his OBP through that exact game.   I could do it, I suppose, but I could also push a peanut up a mountain with my nose, another task that’s not getting crossed off my "To-Do" list any time soon.

But I could approximate both categories. I decided that "400 games" would yield results not too far from "350-449 games" and "exact dates of birth" could be approximated with the "age as of midnight, June 30" that BB-ref.com supplies. (Very wacky specificity—they chose the single minute of the day that is ambiguous. Literally any other minute would tell you clearly which day they mean, whereas "midnight" makes you wonder if they mean just as June 30 begins or as it ends.) If I could get results using this approximate method, then I might presume that a more exact method would yield more exact results.

I sorted a decade’s worth of MLB players at age 24 (quite tedious enough for me, thanks) and eliminated those players who had played more than 449 games or fewer than 350 games. This in itself was sort of educational. I had figured that most 24-year-olds would have played a few partial seasons (24 is a little old to be playing your first MLB game) and good players would have gotten their first big-league at-bats by 21 or 22. I wanted good players because they would be the ones to have substantial big league careers. No sense tracking what zhlubs and scrubs and bubs would do, right? We know where they’ll be at age 30, long retired.

But to my surprise, most 24-year-olds did NOT have between 350 and 449 games played. Some were playing their rookie seasons at age 24, and some were playing their 600th game by that point. I’d guess that of the 50 or 60 candidates I scrutinized (24-year-old players who seemed to be in their 3rd or 4th MLB season), fewer than 20 fell into the 350-449 games range.
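The screening described above boils down to a simple filter: keep a player only if he is 24 and his career games fall inside the 350-449 window. Here is a minimal Python sketch of that idea. The record layout and the names PlayerSeason and in_window are assumptions made for illustration, not anything the reference site provides; the games totals for Killebrew, McCovey and Scott come from the table in this article, Swoboda's from the P.S., and the last row is invented.

```python
from dataclasses import dataclass

# The article's stand-in for "exactly 400 games".
MIN_GAMES, MAX_GAMES = 350, 449


@dataclass(frozen=True)
class PlayerSeason:
    name: str
    age: int           # seasonal age as listed for that year
    career_games: int  # career games played through the end of that season


def in_window(p: PlayerSeason, age: int = 24) -> bool:
    """True when the player is the target age and inside the games window."""
    return p.age == age and MIN_GAMES <= p.career_games <= MAX_GAMES


candidates = [
    PlayerSeason("Killebrew", 24, 390),            # from the article's table
    PlayerSeason("McCovey", 24, 350),              # from the article's table
    PlayerSeason("Scott", 24, 445),                # from the article's table
    PlayerSeason("Swoboda", 24, 513),              # from the P.S. (too many games)
    PlayerSeason("Hypothetical rookie", 24, 120),  # invented row, too few games
]

kept = [p.name for p in candidates if in_window(p)]
print(kept)
```

Both ends of the window are inclusive, which is why McCovey, at exactly 350 games, stays in.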

For convenience’s sake, and for nostalgia’s sake, I decided to limit this little study to 24-year-olds in the 1960s, mainly because I am so deeply familiar with those players that I could make intelligent guesses as to where each 24-year-old was in his career, helping me eliminate players to look up who I knew were playing their rookie seasons at age 24 (I didn’t for example have to look up Jim Lefebvre in 1966 because I knew that was his rookie year) and players who had long since played their 500th MLB game by age 24 (I didn’t have to look up Orlando Cepeda in 1962 because I knew he’d been a full-time superstar for at least five years at that point). There were about equal numbers of players who fell below 350 games as rose above 449, so I was pleased that I’d chosen a reasonable age for someone to have played between 350 and 449 games.

My first question is: what would you suspect the relationship to be between a player’s OBP through age 24 and his lifetime OBP? A fair guess would be "his OBP rises as he ages" since a 24-year-old still has a few seasons to go before reaching his statistically most likely peak age of 27. But how much does his OBP rise? A lot or a little? And is it predictable, or random?  That is, if it goes up by an average of .020 points after age 24, do some players go up by .080 points and others fall by .040 points or something? That wouldn’t seem like useful information to me, unless there were a pattern of which types of players rise and which types fall.

So go ahead, guess the relationship before you look at the table below. Random? Consistently in the + .010-.020 range? Consistently in the .050-.070 range?

While you’re contemplating, I’ll wax nostalgic. (I already waxed Roth in my last article.) It was fun for me to sort these players by age: I made a table for each season of all MLB players and then I clicked on "age" to group all the players who were 24 that year, but in doing so, I created a big table starting with the 17-year-olds who had gotten at least one plate appearance that year, and the 18-year-olds, and so on. Before I scrolled down to the 24-year-olds, I read those names, and it was amazing how many of those teenagers were total washouts. I mean, you might suppose that if a big-league club decided that a teenager was skilled enough to be on a big-league roster, he must be pretty precociously talented, but no, not really.

Oh, sure, you get an occasional "Rusty Staub" or "Johnny Bench" showing up on these lists of teenaged debuts, but mainly it’s a bunch of Jay Dahls and Jerry Hinsleys and Rick Jameses and Jimmy McMaths and Mike McQueens, an assembly of big-league washouts who nobody ever heard of. Also every single year the player heading up the list of alphabetized players before I sorted them by age was, of course, Henry Aaron, which I find strangely comforting—same guy, same alphabetical placement, year after year after year.

Now to the list itself, of all 18 players from 1960 through 1969 who had had between 350 and 449 games played at the age of 24. I assembled this listing by eye, going through the 24-year-olds, many of whom had only a handful of games, being pitchers, or two handsful of games, being bench players, September callups, etc. The qualification for my checking out his exact career games played through age 24 was if he had at least 100 games played that year: if someone was a part-timer at age 24, he was very unlikely to have played 350 career games, though I might have missed one or two if they played in 60 games that year, say, due to injury but had played fulltime for a couple of seasons before age 24. Also if someone was a full-timer, with over 150 games his age 24 season, and I thought he’d certainly played two previous 150+ game seasons (Ron Santo, Pete Rose, Cepeda, and a few others) I eliminated them. I might have made a mistake or two here, too, but basically this is the list:

24-year-olds with about 400 games played

24-year-old      Year turned 24   Games   OBP that year   Career OBP
Killebrew        1960             390     .349            .376
Taylor           1960             436     .322            .321
C. Boyer         1961             443     .288            .299
Geiger           1961             428     .335            .337
McCovey          1962             350     .369            .374
Cardenas         1963             433     .309            .311
Hershberger      1964             439     .324            .316
Nicholson        1964             374     .316            .318
Jesus Alou       1966             384     .302            .305
Campy            1966             353     .313            .311
K. Harrelson     1966             412     .313            .325
McMullen         1966             406     .302            .316
Wynn             1966             399     .338            .366
Agee             1967             359     .309            .320
Kessinger        1967             405     .281            .314
Petrocelli       1967             385     .311            .332
Schaal           1967             409     .319            .341
Scott            1968             445     .322            .333

"OBP that year" means "OBP through age 24," not just in his age 24 season. As you might have astutely observed, the majority of these eighteen had virtually the same lifetime OBP that they had at age 24: Taylor, Geiger, Cardenas, Nicholson, Alou and Campy were all within .003, basically a rounding error, while Boyer, McCovey, Hershberger, Agee, Scott, Harrelson and McMullen were within .014, an improvement that aspires vainly for the status of "barely noticeable." (Most of these minute differences were improvements, albeit teeny-tiny ones, though Taylor went down by .001, Campy by .002. Damned Cubans. Hershberger LOST .008 after age 24.)

The only differences I would call significant were Killebrew, Kessinger, and Wynn, who basically improved by about .030. (Petrocelli and Schaal, each up by about .020, fall in between.) Otherwise there were virtually no changes that jump out at you. All told, the average difference between age-24 OBPs and career OBPs is under .015, more like .012, I figure.
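For anyone who wants to check the arithmetic in the last two paragraphs, here is a minimal Python sketch. It takes the 18 rows exactly as they appear in the table, stores OBP in thousandths to avoid floating-point noise, and tallies the gaps. The function name summarize and the dictionary keys are my own labels, and the .020 cutoff is borrowed from Bill's "within 10-20 points" guess rather than from anything in the study itself.

```python
# (player, OBP through age 24, career OBP), in thousandths, copied from the table.
ROWS = [
    ("Killebrew", 349, 376), ("Taylor", 322, 321), ("C. Boyer", 288, 299),
    ("Geiger", 335, 337), ("McCovey", 369, 374), ("Cardenas", 309, 311),
    ("Hershberger", 324, 316), ("Nicholson", 316, 318), ("Jesus Alou", 302, 305),
    ("Campy", 313, 311), ("K. Harrelson", 313, 325), ("McMullen", 302, 316),
    ("Wynn", 338, 366), ("Agee", 309, 320), ("Kessinger", 281, 314),
    ("Petrocelli", 311, 332), ("Schaal", 319, 341), ("Scott", 322, 333),
]


def summarize(rows):
    """Tally how far each career OBP ended up from the age-24 OBP."""
    diffs = {name: career - early for name, early, career in rows}
    abs_diffs = [abs(d) for d in diffs.values()]
    n = len(rows)
    return {
        "players": n,
        "mean_abs": sum(abs_diffs) / n,          # average size of the gap
        "mean_signed": sum(diffs.values()) / n,  # average gain (losses count against)
        "within_3": sum(a <= 3 for a in abs_diffs),
        "within_14": sum(a <= 14 for a in abs_diffs),
        "within_20": sum(a <= 20 for a in abs_diffs),
        "biggest_gainers": sorted(diffs, key=diffs.get, reverse=True)[:3],
    }


stats = summarize(ROWS)
print(f"mean absolute gap: {stats['mean_abs'] / 1000:.3f}")
print(f"mean signed gain:  {stats['mean_signed'] / 1000:.3f}")
print(f"within .003: {stats['within_3']}, within .014: {stats['within_14']}, "
      f"within .020: {stats['within_20']} of {stats['players']}")
print("biggest gainers:", ", ".join(stats["biggest_gainers"]))
```

On these rows the average gap works out to about .012 and the average gain to about .011. Six players land within .003, thirteen within .014, and the same thirteen (about 72 percent) within .020, with Kessinger, Wynn and Killebrew as the three biggest gainers. Eighteen players from one decade is a tiny sample, so treat those shares as a description of this table, not a forecast.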

It’s hard to believe, btw, that some of these guys held onto starting jobs with OBPs like .288, .302, .281 after playing 400-odd games in MLB, but I guess teams weren’t paying that much attention to OBP in those days. But essentially the point that got driven home to me was that there was virtually no improvement for most of these guys: what you saw at age 24 was what you got.

One thing I love about running these 1960s studies is that they’re fun: I was following baseball so closely back then (other areas of my life, such as school or personal hygiene, not so much) that I could write a few paragraphs about each one of these guys with zero prep, and it wouldn’t be a strain. I’ve always been puzzled by Dave Nicholson NOT being nicknamed "Swish" (maybe "Swish" was a homophobic slur by the early 1960s?). There was another Nicholson in MLB (1936-1953), a good player who was nicknamed "Swish," and there’s a tradition of sticking the nickname of an earlier player onto a later player with the same last name: Dutch Leonard, Dusty Rhodes, Boog Powell, and all those guys. The 1960s Nicholson, though, was a big old swish, in that he set the all-time record for strikeouts in 1963, 175 K, smashing the old record by 33 big whiffs. Someone can probably explain to me how he avoided being labelled "Swish." BB-ref.com lists no nickname for him.

I also love catching BB-ref.com in a mistake. This is actually an error in prosody, but it’s a pretty bad one.  They give pronunciations of players’ names, telling you which syllables are stressed by CAPitalIZing them. So:

Full Name: Tommie Lee Agee

Pronunciation: \ay-GEE\

 

On what planet is the second syllable in "Agee" the stressed syllable? Totally nuts.

I used to teach prosody, or I tried to, anyway.  (Can you say you taught something if no one ever learned anything?) It’s pretty hard, even for students who want to write verse, to understand which syllables in a line get stressed and which get un-stressed.  And I could sympathize: when I took a 9-week seminar in grad school specifically in prosody (the official name for the course was "Style and Prosody," which some wit turned into "Smilin’ Sodomy," because the professor was a sadistic failed poet who delighted in showing all the Ph.D. candidates how little they knew about their chosen field), I did poorly on the first few exercises, marking unstressed syllables as stressed and vice-versa, and feeling like the idiot the professor assured me I was, but then I suddenly caught on, and for the rest of the term turned in perfectly scanned versions of the fiendishly obscure poems he gave us as scansion exercises.  It was actually fairly scientific, and once I grasped that fact, it became downright easy—there were principles to apply, rather than doing everything by feel, and all I had to do was understand what those principles of scansion were. I could explain those principles to you in about fifteen minutes, but if experience means anything, you’re not going to grasp them in fifteen years.

Anyway, there’s no way on earth, or any other sublunar object, that the second syllable in Agee’s name gets the stress. I’ve heard his name pronounced by a few hundred Mets fans and a few dozen announcers and by Agee’s lifelong pal Cleon Jones, and every one of us hit the first syllable hard.

About Tony Taylor, the other day, someone brought up a positioning peculiarity that I had never considered before, that makes so much sense to me: when the Phillies brought up Dick Allen in 1964 they played him at third base, which he had never played before and which, it turned out, he really couldn’t play. At the same time, they had a gigantic hole at first base, which Allen could play passably. Instead they played a guy named John Herrnstein at first base that year, who couldn’t get on base but compensated for that by also not being able to hit for power. Actually, he couldn’t hit for shit. Played 125 games for the ’64 Phils.

The weird thing is that the same Phillies team had more middle infielders than the entire rest of the league combined: they had two regulars for each middle infield position, Cookie Rojas and Taylor at second base, and Bobby Wine and Ruben Amaro at shortstop. Taylor was also an experienced and decent third baseman, raising the question of why they didn’t make him the regular third baseman, make Allen the regular first baseman, juggle the other three guys in the middle of the infield, send Herrnstein back home to Chillicothe, Ohio, and win two or three pennants in the mid-to-late 1960s. For some reason, Herrnstein’s name never comes up in snide remarks about Gene Mauch’s managing, but maybe it should have been the title of his autobiography: "I THOUGHT JOHN HERRNSTEIN WAS A MAJOR LEAGUE HITTER: The Confessions of Gene Mauch."

Okay, that’s hindsight and totally unfair. But you gotta wonder what would have happened if Mauch had decided to make Tony Taylor his regular third baseman. Maybe some Phillies fan can tell me why this blindingly obvious solution to his crowded middle infield problem didn’t cross Mauch’s capacious mind.

Don’t mean to get sidetracked here in 1960s trivia. Back to OBP: of course this is a very limited study and maybe the ability to grow your OBP past the age of 24 is larger than this little study shows. But if it is true that most players’ OBPs are going to stay right at the level they’re at at age 24, that gives new meaning to the axiom "It is what it is." If your new young stud seems to be getting on base at a .310 clip, guess what? He’s a .310 OBP guy, and never is going to improve very much.

Further study is needed, of course. I’m not arguing that the 1960s is definitive—maybe other decades’ 24-year-olds make Killebrew, Kessinger, and Wynn the rule rather than the exception. That’s certainly what I might have guessed would happen with the 1960s, that most players would show something like .030 improvement after age 24, and that a few would even exhibit .050 or .070 improvements.

Another thing to check for, as a kind of control here, would be to see if there was similar growth, or lack thereof, in batting average, slugging percentage, and OPS. I picked OBP because I thought "learning the strike zone" was literally a learning-type skill that improved a lot as a batter got to know the pitchers and the umpires better, but maybe that was where this study went off the rails.

 

 

P.S. Ron Swoboda, who had played 513 games through age 24, and so was one of my rejectees for this study, was born on June 30. I don’t know the time of day he was born, though it is scary that I knew his birthdate without looking it up. If his mother could have extruded him a few hours later, he would have qualified for inclusion in this study. His OBP difference was .009 points, .315/.324.

 
 

COMMENTS (11 Comments, most recent shown first)

shinsplint
Thanks, Steven. 1968 was the quintessential year of the pitcher, of course. The Orioles may not have been too concerned about Belanger's hitting, since Paul Blair hit .211, and Curt Blefary sunk to .200 with 15 homers after his first promising 3 years, among others.
8:21 AM Apr 21st
 
Steven Goldleaf
Very nice post, shinsplint. I see Belanger had just over 750 PA, but under 350 Games Played. It's hard to see how he held his job after being that bad for his first 750 PA.
5:56 AM Apr 21st
 
shinsplint
This reminded me of a post I did back in 2011 where I evaluated the OPS of players through age-24 and how they did afterwards. I thought it was some interesting stuff, but it got no response for some reason. I showed the highest gainers and losers between the up-to-age-24 and afterwards as well as how different levels of up-to-age-24 OPS progress. Spoiler alert--there's a regression toward the middle that occurs.

boards.billjamesonline.com/showthread.php?2939-hitting-progression-from-age-24-and-onward
7:33 PM Apr 20th
 
bearbyz
Nice article, way better than your last one which I didn't really understand.

Now I'm wondering why those three players improved. I would have guessed staying closer to your current on base average would be the norm, maybe a few points lower depending on the length of a player's career.

Wynn and Killebrew walked a lot. Killebrew led the league in on base percentage when the strike zone was made smaller. Wynn later moved out of Houston, but LA was no bed of roses. I can't start to explain Kessinger at all.
2:57 PM Apr 17th
 
Steven Goldleaf
Just curious about the point raised in my post-script: how many players' dates of birth (hometowns, etc.) have you got memorized? I never consciously tried to remember this stuff, but just reading their bios obsessively in yearbooks, programs, promo materials and so on for years and years as a young boy, I absorbed so much trivia it scares me. (Mostly scary is how much I could have learned about important things instead of having it branded in my memory that Mets 3bman Charlie Smith was born on September 15th.) Among Mets, I know Smith's birthday, Alvin Jackson's (Xmas day, also true of Rickey Henderson), Swoboda's, Bud Harrelson's (D-Day), Keith Hernandez's (because it's also Mickey Mantle's), Lou Brock's (because it's also mine) Pete Rose's (because it's the start of MLB season), Sandy Koufax's and possibly a few others I'm not remembering right now. Mind you, I can't remember my girlfriends' birthdays very well, which may tell you something about my misplaced priorities. I mean, women get pissy about that stuff, MLB players not so much.
5:42 AM Apr 17th
 
villageelliott
Re: "As with Sabrina and Psycho, the original was better,"---mpiafsky...I agree.
5:05 PM Apr 16th
 
mpiafsky
As with Sabrina and Psycho, the original was better
2:06 PM Apr 16th
 
Steven Goldleaf
I thought about that, but these actually were the guys who would BENEFIT from the low-offense period, in that their offensive numbers (including OBP) should go UP in the second half of their careers. The fact that after 1968 or so, they DON'T show much improvement actually makes the thesis stronger, no?
10:15 AM Apr 16th
 
3for3
Nostalgia aside, I am surprised you used a time period with very low offense, followed by a rule change. That aside, cool concept, deserving of more study (SLG, too).
9:54 AM Apr 16th
 
malbuff
A fine article and interesting conclusion. I would have expected the OBPs to go up as these guys learned the 'zone better.

I'd also say that if we were to do this study for current-era players (say, post-2000), we'd have to advance the age to 26 or so. Except for an exceptional few-- the Trouts, the Harpers, the Guerrero Juniors-- players tend to hit the MLB at an older age now, since most are playing at least 2 years in college before turning pro.
9:03 AM Apr 16th
 
Steven Goldleaf
OK, it took this time. Rylan must be on vacation or something--usually he's pretty good about getting back to me about stuff I screw up technically. What happened, if you care, is that when I went to submit this piece last time, a prompt came up on my screen asking me to supply a "custom link" which it never did before. So I found the "custom link" box and typed in a link name and all hell broke loose. This time, as usually happens, a custom link was automatically generated and so no such prompt appeared.

The previous piece's comments were pretty funny, though, so I hope that one will stay up there as a monument to technology's fuckups.
4:52 AM Apr 16th
 
 
©2021 Be Jolly, Inc. All Rights Reserved.