
Triangle

June 25, 2010

            This started about four years ago, when I was appearing on a panel with the great Peter Gammons.   There was a local player who was a Hall of Fame candidate, and the audience asked me whether I thought he should be in the Hall of Fame.  I thought, frankly, that he should be allowed to visit the Hall of Fame like anybody else, but as the room was clearly sympathetic to the candidate, this didn’t seem like the right thing to say.   I would have been happy to lie about it and say that I thought he should be elected without delay, but unfortunately I was on record as saying that he should not, and as explaining at great length why he should not, so I would have looked stupid trying to weasel out of it.   I explained as briefly as I could that I wished him the best, but I didn’t think he was a Hall of Fame player, and talked briefly about secondary averages, home park effects, runs created relative to context, and some other nonsense.   The crowd was getting surly.

            At this point somebody asked Peter Gammons the same question, and Gammons replied “Less than .300 and less than 400 homers is tough.”   It was a perfect answer.   Ten words, and he had closed off the debate.

            There is a simple idea at the core of this answer, which is that a player’s Hall of Fame position can be triangulated with respect to a very few stats.   Since then I have wondered, How accurately can one predict a player’s Hall of Fame status with respect to just three stats?  Hits, home runs, batting average.    How well can you do?

            The first thing we have to describe is what we mean by “accurate”.   This is the scoring method that I used:

            1)  If the system predicts that an eligible player should be in the Hall of Fame and he is in the Hall of Fame, that’s a “+”, or a “Win” for the system.

            2)  If the system predicts that a player should be in the Hall of Fame but he is not, that’s a type-one loss for the system.

            3)  If the system fails to predict that a player should be in the Hall of Fame but he is, that’s a type-two loss for the system.

            4)  If the system predicts that a player should not be in the Hall of Fame and he is not, that’s a non-event.   It doesn’t count for or against the system.

 

            We kind of have to do it that way, because if you count it as a “success” when the system says a player should not be in the Hall of Fame and he is not, then your system will be right 99% of the time whether it is worth a crap or not.   My system says that Rafael Belliard should not be in the Hall of Fame—and he isn’t!  You can rack up a lot of points that way.   We don’t count those points.
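The four scoring rules above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the function names `tally` and `accuracy` and the five-player example are made up for the sketch, not taken from the study.

```python
from typing import Iterable, Tuple


def tally(predicted_in: Iterable[bool], actually_in: Iterable[bool]) -> Tuple[int, int, int]:
    """Count wins, type-one losses and type-two losses for paired predictions."""
    wins = type_one = type_two = 0
    for predicted, actual in zip(predicted_in, actually_in):
        if predicted and actual:
            wins += 1        # rule 1: predicted in, is in
        elif predicted and not actual:
            type_one += 1    # rule 2: predicted in, is not in
        elif actual and not predicted:
            type_two += 1    # rule 3: not predicted in, but is in
        # rule 4: predicted out and is out, a non-event that is not counted
    return wins, type_one, type_two


def accuracy(wins: int, type_one: int, type_two: int) -> float:
    """Wins as a share of all scored cases (non-events are excluded)."""
    scored = wins + type_one + type_two
    return wins / scored if scored else 0.0


# Hypothetical five-player example (not real data).
predicted = [True, True, False, False, False]
actual = [True, False, True, False, False]
print(tally(predicted, actual))
print(accuracy(*tally(predicted, actual)))
```

Note that the two players who were correctly left out contribute nothing to the tally, which is the point of rule 4.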

            Let’s start by simply “predicting” that every player with 3000 career hits will be in the Hall of Fame, and every player with less than 3000 will not.  How accurate is that?

            It’s 16% accurate.   It yields 22 right answers, and zero type-one errors.   No one predicted by this system to be in the Hall of Fame is not in (not counting Pete Rose, because Rose is not eligible.)   But it gives us 119 type-two errors, since there are 119 hitters in the Hall of Fame who did not have 3000 career hits.

            Definitional quibbles. . .not counting players who also managed and are in the Hall of Fame as managers, not counting Negro League stars, pitchers, etc.    You know that stuff.

            So we’re 22 for 141, which is 16%.

            If we go down to 2900 hits, our system improves to 22% accuracy.   At 2800 hits, we’re up to 27%; at 2700 hits, up to 30%, etc.

            But by 2700 hits, we have errors of both kinds.  Harold Baines, Al Oliver, Rusty Staub, Roberto Alomar, Vada Pinson, Bill Buckner, Dave Parker and Doc Cramer had 2700 hits, but are not in the Hall of Fame.  If we predict that all players with 2700 hits should be in the Hall of Fame, we’ll be wrong on those 8 players.  At 2700 hits, we have 45 correct predictions, 8 type one errors, and 96 type two errors, as there are 96 hitters who are in the Hall of Fame with less than 2700 hits.

            This “system”—this one-element measure—continues to improve in accuracy until we reach the level of 2,210 hits.   At 2,210 hits, the system is 48% accurate—91 correct predictions, 47 type one errors, 50 type two errors.   If you go either up or down from 2,210, the system becomes less accurate.
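The search for the best cutoff can be sketched as a sweep over every candidate threshold. The sample below is a hand-picked dozen players taken from the lists in this article, with Hall of Fame status as of the article's date. It is far too small to reproduce the 2,210 result and only shows the mechanics. The names `accuracy_at` and `best_cutoff` are made up for the sketch, and treating "has N hits" as "at least N hits" is an assumption.

```python
from typing import List, Tuple

Player = Tuple[str, int, bool]  # (name, career hits, in the Hall of Fame?)

# Small illustrative sample drawn from the article's lists, not the full data set.
SAMPLE: List[Player] = [
    ("Ty Cobb", 4189, True), ("Hank Aaron", 3771, True),
    ("Willie Mays", 3283, True), ("Rogers Hornsby", 2930, True),
    ("Babe Ruth", 2873, True), ("Ray Schalk", 1345, True),
    ("Harold Baines", 2866, False), ("Al Oliver", 2743, False),
    ("Rusty Staub", 2716, False), ("Bill Buckner", 2715, False),
    ("Steve Garvey", 2599, False), ("Fred McGriff", 2490, False),
]


def accuracy_at(cutoff: int, players: List[Player]) -> float:
    """Accuracy of the rule 'hits >= cutoff means Hall of Famer'."""
    wins = sum(1 for _, hits, in_hall in players if hits >= cutoff and in_hall)
    type_one = sum(1 for _, hits, in_hall in players if hits >= cutoff and not in_hall)
    type_two = sum(1 for _, hits, in_hall in players if hits < cutoff and in_hall)
    scored = wins + type_one + type_two
    return wins / scored if scored else 0.0


def best_cutoff(players: List[Player]) -> Tuple[int, float]:
    """Try each distinct hit total as the cutoff and keep the most accurate one."""
    candidates = sorted({hits for _, hits, _ in players})
    return max(((c, accuracy_at(c, players)) for c in candidates), key=lambda pair: pair[1])


cutoff, acc = best_cutoff(SAMPLE)
print(f"best cutoff on this sample: {cutoff} hits, accuracy {acc:.2f}")
```

Only observed hit totals need to be tried as cutoffs, because the predictions change only when the threshold crosses a player's total.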

 

            Getting ahead of ourselves, we will be able to improve on that level of accuracy—but not by a whole lot.     That’s almost as good as we’re going to do, and I’m telling you that now because I don’t want to set up expectations and then fail to deliver.   But getting back to the line of march, next let’s check Home Runs, as a one-dimensional measure.

            As a single-element predictor of Hall of Fame status, home runs are much less accurate than hits.    The accuracy of home runs as a Hall of Fame predictor peaks at 294 home runs, but that accuracy is just 22%.   At 294 home runs we have 41 accurate predictions, 46 type one errors, and 100 type two errors (meaning that there are 100 Hall of Fame hitters who did not hit 294 homers.)    That’s the best we can do with home runs.

            Marginally better with batting average.  Batting average, as a predictor, peaks at .311, and the accuracy at that level is 35% (ignoring players with less than 500 career hits.)  

 

            OK; now comes the hard part.    The hard part is figuring out how to combine these elements to get maximum accuracy.  Since there are billions of possible ways to combine them, we can’t try them all.    Let’s start with “Hits + Home Runs”.

            Adding together hits and home runs, the accuracy of the prediction system, at its peak, reaches .492.  The peak is at 2,386.   91 players have a total of 2,386 Hits and Homers, and are in the Hall of Fame.   But 44 players have 2,386 Hits + Homers and are not in the Hall of Fame, and 50 players are in the Hall of Fame but don’t have 2,386 Hits + Homers—a total of 94 errors.  91 and 94; 49%. 

            I tried Hits plus 2 * Homers, but this makes the optimal prediction less accurate.   I tried Hits plus 3 * Homers, but that was even less accurate.    Adding additional weight to a home run did not bring us closer to the target.   Working with 1.50*homers or 1.25*homers might have given us some small benefits, I don’t know.   I didn’t try those things. 

            OK; let’s work with Hits + Homers, and try to bring in Batting Average.  Again, there are a million ways to do this:

 

            (Hits + Homers) * Batting Average

            (Hits * Batting Average) + Homers

            2 * (Hits * Batting Average) + Homers

            (Hits + Homers) * (Batting Average - .100)

            (Hits + Homers) * (Batting Average + .100)

            (Hits + Homers + 250) * Batting Average

 

            Etc., etc., without limit.    For each formula that you try, you have to find the “cutoff level” at which that formula reaches its maximum accuracy.
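One way to see why each formula needs its own cutoff is to run a single player through all six candidates listed above. The sketch below does that for Harold Baines, using his line as it appears in this article (2,866 hits, 384 homers, .289). The dictionary layout and the name `score_all` are made up for the illustration.

```python
from typing import Callable, Dict

Formula = Callable[[float, float, float], float]

# The six candidate combinations listed above, written as functions of (H, HR, AVG).
CANDIDATES: Dict[str, Formula] = {
    "(H + HR) * AVG": lambda h, hr, avg: (h + hr) * avg,
    "(H * AVG) + HR": lambda h, hr, avg: (h * avg) + hr,
    "2 * (H * AVG) + HR": lambda h, hr, avg: 2 * (h * avg) + hr,
    "(H + HR) * (AVG - .100)": lambda h, hr, avg: (h + hr) * (avg - 0.100),
    "(H + HR) * (AVG + .100)": lambda h, hr, avg: (h + hr) * (avg + 0.100),
    "(H + HR + 250) * AVG": lambda h, hr, avg: (h + hr + 250) * avg,
}


def score_all(hits: float, homers: float, avg: float) -> Dict[str, float]:
    """Evaluate every candidate formula for one player."""
    return {name: formula(hits, homers, avg) for name, formula in CANDIDATES.items()}


baines = score_all(2866, 384, 0.289)
for name, value in baines.items():
    print(f"{name:26s} {value:8.1f}")
```

The same player lands anywhere from the 600s to the 2000s depending on the formula, so a cutoff found for one formula says nothing about another.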

            I experimented with formulas of this nature for several hours.   The best formula that I was able to find. . .well, actually there were two that tied:

 

            A)   (Hits + Homers) * (Batting Average - .050)

            B)   (Hits + Homers + 1500) * (Batting Average + .100)

 

            (A) is simpler than (B) and just as accurate, so we’ll use (A).   This formula reaches its peak accuracy at 584.   If a player’s Hits plus Homers, times Batting average minus .050. . .if that number is greater than 584, the player will be a Hall of Famer.

            This formula makes 96 correct predictions.    It also predicts Hall of Fame status for 35 players who have not made the Hall Of Fame, and fails to predict Hall of Fame status for 45 players who have been selected.    That’s 96-80, or 55% accurate.

            That is conservatively stated; we could argue that this prediction is 73% accurate.   Of 131 players who have a score of 584 or more, 96 are in the Hall of Fame.   That’s 73% (96-35).   But I think it’s better to say 55%.  It’s not a fantastic level of accuracy, but remember, we’re only using a tiny bit of the information available to us.   We’ve got reams of stats about every player; we’re only using three of them.   We don’t know whether the player is a catcher or a first baseman.   We don’t know whether he stole a thousand bases or a dozen.   We don’t know whether the player was a World Series hero or Bill Buckner.    We’re at 55% without considering any of that.   It’s not too bad.
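The difference between the 55% and 73% figures is just a matter of which errors go in the denominator. A quick arithmetic check using the counts given above (the variable names are only labels for this sketch):

```python
# Counts reported for formula (A) at a cutoff of 584.
wins = 96       # predicted in, and in the Hall
type_one = 35   # predicted in, but not in the Hall
type_two = 45   # in the Hall, but not predicted

# Counting both kinds of error against the system.
overall = wins / (wins + type_one + type_two)

# Counting only the players the system picked (score of 584 or more).
among_picked = wins / (wins + type_one)

print(f"overall: {overall:.1%}")
print(f"among players at 584 or more: {among_picked:.1%}")
```

The second figure ignores the 45 Hall of Famers the formula misses, which is why it looks so much better.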

 

 

Why are We Doing This?

 

            We are doing it because it is useful, as a part of a package, to know where a player stands with respect to the Hall of Fame.   What is a Hall of Fame combination of these three critical stats?   Nomar Garciaparra retired this spring with 1,747 hits, 229 homers and a .313 batting average.   Is that a Hall of Fame triangle, or is it not?

            It is not.   It’s 519; the cutoff is 584. 

            Johnny Damon, on the other hand, does have a Hall of Fame combination.   Damon entered the year with 2,425 hits, 207 homers, and a .288 average.    That’s comparable to Ryne Sandberg (2,386 hits, 282 homers, .285), Enos Slaughter (2,383 hits, 169 homers, .300) or Jim Bottomley (2,313 hits, 219 homers, .310).   It scores at 628.  Most players with combinations similar to that have historically made the Hall of Fame.

            Am I saying that Johnny Damon has the Hall of Fame made in the shade?   No, of course not.    There are always other factors.   Sandberg was a Gold Glove second baseman.   Slaughter missed three years with World War II.   Bottomley had several buddies on the Veterans Committee.   There are always other factors, and besides, our system is only 55% accurate.   Of the 131 players with Hall of Fame combinations, 35 (about 27%) have not made the Hall.   It’s not a mandate; it’s just a yardstick.   It is useful to have yardsticks.
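For anyone who wants to run their own players through the yardstick, here is a minimal sketch of formula (A) with the 584 cutoff. The function names are invented for the sketch, and it works from the three-decimal batting averages quoted in the text, so a result can land a point or two away from a published score; presumably the published scores used unrounded averages, but that is a guess.

```python
CUTOFF = 584  # historical cutoff for formula (A), as given above


def triangle_score(hits: int, homers: int, avg: float) -> float:
    """Formula (A): (Hits + Homers) * (Batting Average - .050)."""
    return (hits + homers) * (avg - 0.050)


def has_hall_of_fame_triangle(hits: int, homers: int, avg: float, cutoff: float = CUTOFF) -> bool:
    """True when the score clears the cutoff (treated here as 'at least')."""
    return triangle_score(hits, homers, avg) >= cutoff


nomar = triangle_score(1747, 229, 0.313)
damon = triangle_score(2425, 207, 0.288)

print(f"Garciaparra: {nomar:.1f}, qualifies: {has_hall_of_fame_triangle(1747, 229, 0.313)}")
print(f"Damon:       {damon:.1f}, qualifies: {has_hall_of_fame_triangle(2425, 207, 0.288)}")
```

With the rounded averages, Garciaparra comes out just under 520 and Damon in the mid-620s, on the same sides of the cutoff as in the text.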

 

 

 

 

But Wait a Minute

 

            But there is another problem with suggesting that Damon’s a Hall of Famer, which is:  that it seems obvious that the standards which have prevailed in the past cannot prevail in the future.

            Expansion.    There are 30 teams now, not 16, and careers are longer than they used to be.   In order for the standards which have applied in the past to apply into the future, there would have to be an increase of 100% or more in the number of players who are selected.    That’s not going to happen, because nobody wants it to happen.   People complain about the declining standards for the Hall of Fame, but. . .it’s not the real world.   In the real world, the standards for Hall of Fame selection started to creep up years ago, and the rate of increase is going to accelerate.

            So the standard in the future isn’t likely to be 584, but something more like 700.

 

            A few lists for you.   These are the top ten scores of all time:

 

First    Last      H      HR    Avg    Score
Ty       Cobb      4189   117   .366   1362
Hank     Aaron     3771   755   .305   1154
Stan     Musial    3630   475   .331   1153
Pete     Rose      4256   160   .303   1117
Tris     Speaker   3514   117   .345   1070
Babe     Ruth      2873   714   .342   1048
Rogers   Hornsby   2930   301   .358    997
Willie   Mays      3283   660   .302    993
Honus    Wagner    3415   101   .327    975
Nap      Lajoie    3242    82   .338    958

 

            All Hall of Famers except the guy who isn’t eligible.   These are the top ten scores for players not in the Hall of Fame:

 

 

First     Last        H      HR    Avg    Score
Pete      Rose        4256   160   .303   1117
Barry     Bonds       2935   762   .298    917
Rafael    Palmeiro    3020   569   .288    856
Manny     Ramirez     2494   546   .313    800
Ken Jr.   Griffey     2763   630   .285    797
Derek     Jeter       2747   224   .317    794
Alex      Rodriguez   2531   583   .305    793
Harold    Baines      2866   384   .289    778
Craig     Biggio      3060   291   .281    775
Gary      Sheffield   2689   509   .292    773

 

            And these are the top ten scores for players who are eligible for the Hall of Fame, but not in:

 

First     Last          H      HR    Avg    Score
Harold    Baines        2866   384   .289   778
Al        Oliver        2743   219   .303   750
Roberto   Alomar        2724   210   .300   734
Dave      Parker        2712   339   .290   732
Vada      Pinson        2757   256   .286   711
Steve     Garvey        2599   272   .294   701
Fred      McGriff       2490   493   .284   699
George    Van Haltren   2532    69   .316   691
Bill      Buckner       2715   174   .289   690
Rusty     Staub         2716   292   .279   690

 

           

            I went through a lot of gyrations trying to find some formula which didn’t show Harold Baines to be well above the normal standard of a Hall of Famer.   I couldn’t find any such formula.   Every combination that I tried, without exception, showed Harold Baines to be the highest-scoring player who is eligible for the Hall of Fame but has not been selected. 

            These are the lowest-scoring players who are in the Hall of Fame:

 

 

First    Last         H      HR    Avg    Score
Ray      Schalk       1345    11   .253   276
Roger    Bresnahan    1252    26   .279   293
Roy      Campanella   1161   242   .276   317
Frank    Chance       1273    20   .296   318
Phil     Rizzuto      1588    38   .273   363
Joe      Tinker       1687    31   .262   365
Johnny   Evers        1659    12   .270   368
Tommy    McCarthy     1496    44   .292   372
Rick     Ferrell      1692    28   .281   397
Hughie   Jennings     1527    18   .311   404

 

 

            And these are the 20 highest-scoring active players as of the close of the 2009 season:

 

 

First      Last        YEAR   H      HR    Avg    Score
Manny      Ramirez     2009   2494   546   .313   800
Ken Jr.    Griffey     2009   2763   630   .285   797
Derek      Jeter       2009   2747   224   .317   794
Alex       Rodriguez   2009   2531   583   .305   793
Gary       Sheffield   2009   2689   509   .292   773
Ivan       Rodriguez   2009   2711   305   .299   751
Chipper    Jones       2009   2406   426   .307   729
Vladimir   Guerrero    2009   2249   407   .321   721
Todd       Helton      2009   2134   325   .328   684
Garret     Anderson    2009   2501   285   .295   682
Johnny     Damon       2009   2425   207   .288   628
Omar       Vizquel     2009   2704    78   .273   619
Jim        Thome       2009   2138   564   .277   615
Ichiro     Suzuki      2009   2030    84   .333   598
Albert     Pujols      2009   1717   366   .334   591
Bobby      Abreu       2009   2111   256   .299   590
Magglio    Ordonez     2009   1974   277   .312   590
Carlos     Delgado     2009   2038   473   .280   577
Miguel     Tejada      2009   2114   285   .289   573
Edgar      Renteria    2009   2185   132   .288   550
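As a rough illustration of what moving the standard from 584 to something like 700 would do, the sketch below applies both cutoffs to the published scores of the 20 active players listed above. The scores are copied from that list; treating a cutoff as "at least" is an assumption, and 700 is only the ballpark figure suggested earlier, not a fitted value.

```python
# Published scores for the 20 highest-scoring active players through 2009.
ACTIVE_SCORES = {
    "Manny Ramirez": 800, "Ken Griffey Jr.": 797, "Derek Jeter": 794,
    "Alex Rodriguez": 793, "Gary Sheffield": 773, "Ivan Rodriguez": 751,
    "Chipper Jones": 729, "Vladimir Guerrero": 721, "Todd Helton": 684,
    "Garret Anderson": 682, "Johnny Damon": 628, "Omar Vizquel": 619,
    "Jim Thome": 615, "Ichiro Suzuki": 598, "Albert Pujols": 591,
    "Bobby Abreu": 590, "Magglio Ordonez": 590, "Carlos Delgado": 577,
    "Miguel Tejada": 573, "Edgar Renteria": 550,
}

HISTORICAL_CUTOFF = 584
FUTURE_CUTOFF = 700  # rough guess at a future standard


def at_or_above(scores: dict, cutoff: int) -> list:
    """Names of players whose score meets the cutoff, highest score first."""
    qualified = [name for name, score in scores.items() if score >= cutoff]
    return sorted(qualified, key=lambda name: scores[name], reverse=True)


historical = at_or_above(ACTIVE_SCORES, HISTORICAL_CUTOFF)
future = at_or_above(ACTIVE_SCORES, FUTURE_CUTOFF)

print(f"{len(historical)} of {len(ACTIVE_SCORES)} clear {HISTORICAL_CUTOFF}")
print(f"{len(future)} of {len(ACTIVE_SCORES)} clear {FUTURE_CUTOFF}: {', '.join(future)}")
```

These are career totals through 2009 for players who were still active, so the scores of those who kept playing would only go up from here.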

 
 

COMMENTS (13 Comments, most recent shown first)

BuddyPC
"There was a local player who was a Hall of Fame candidate, and the audience asked me whether I thought he should be in the Hall of Fame. I thought, frankly, that he should be allowed to visit the Hall of Fame like anybody else, but as the room was clearly sympathetic to the candidate........... and Gammons replied 'Less than .300 and less than 400 homers is tough.' ”

Who is...Jim Rice?

New England Red Sox fans have been as ridiculously zealous in his enshrinement merits in recent years while ignoring his shortcomings, as they were hostile to him for same in the second half of his career. Sort of penitent, really.
8:02 AM Jul 3rd
 
glkanter
Besides, all you other readers keep going on about OPS. Which has a huge BB component:

Player BB OBP SLG OPS
Thome 1,646 .404 .556 .960
Manny 1,312 .411 .589 1.000
1:28 PM Jul 2nd
 
glkanter
It's funny how different Thome's and Manny's scores are...
1:02 PM Jul 2nd
 
hankgillette
Did you try compressing your Hall of Fame Monitor down to just the three statistics?

At any rate, I'll bet you could get a lot more accuracy by using hits, OBP, and SLG. Hits would give you a rough estimate of longevity, and the other two would tell you how good he was.
2:15 AM Jul 1st
 
markj111
There are a lot of 1B/DH types on the highest scores not in the HOF.
1:38 PM Jun 30th
 
3for3
Obviously adding a 4th element, position would make this far more accurate. Even just separating C/2B/SS from the others would make a more accurate square.
11:37 AM Jun 28th
 
tbell
I've got to say, as an avid Bill James reader and student for more than 25 years, this sort of twaddling around, at this late date, is most uninspiring.

Many years ago, Bill James showed me that batting average, hits and home runs need to be understood in context - context that involves a universe of nuance.

Many years ago, Bill James also showed me that the writers with Hall of Fame votes are almost unanimously a horde of nitwits without the slightest grasp that that universe of nuance exists.

So why, in 2010, is Bill James wasting his time with junk toys that serve only to show that Hall of Fame voters are a horde of nitwits whose votes are predictable by the most primitive possible means? We already knew that. Which is why I, and many like me, no longer give a damn about the Hall of Fame - much less quantifications of how primitively HOF voters think.

And why is Bill James wasting his time massaging value out of three stats the value of which is already thoroughly understood by anyone who cares to understand them? An understanding that most of us have achieved primarily through Mr James's writings?

Futile exercises like this, on the arid topic of how Hall of Fame voters think, are a waste of my time. And more importantly, of Bill James's time.
12:51 AM Jun 27th
 
kseesar1
A fun article, for the simple reason, that, as you mentioned, while combining three basic stats is far from perfect, it is a useful way to measure who may or may not be eligible for the Hall of Fame. What is particularly interesting is the cut-off of 700 for the active players list. I know, many of them are about done (or in Junior's case, clearly done) or on the downside of their careers, but still, the 700 cut-off point is about right on an intuitive level. I have never thought of Todd Helton as a future Hall of Famer, but Vlad I think will make it eventually, along with Chipper, Pudge, et al. Again, it's a simple measurement, but a fun one.
2:57 PM Jun 26th
 
CharlesSaeger
Shortened article: you really can't predict who will go to the Hall based on three numbers.

Idea: there ARE points of automatic induction, like 3,000 hits (actually 2,873 by George Ruth -- Harold Baines is the first guy eligible and not in). There are points below which no one is selected. If we leave aside George Wright (866 hits, including the National Association, but NOT including the era before the National Association, which is the real reason why he was picked), the lowest hit total for a Hall of Famer is Campanella, 1161, using only those hitters who played in the white majors for ten years and excluding managers and pitchers. Would the formulas be any more accurate if you were to say, "If you get 2900 hits or 600 home runs, you're in, and if you're not, here's a quick way to figure the chances"?
5:16 PM Jun 25th
 
rgregory1956
Years ago, you made an off-the-cuff remark that if you only had one statistic to judge a player's "greatness", that stat would be Games Played. Just wondering: how accurate is Games Played in predicting HOF status?
1:18 PM Jun 25th
 
tangotiger
Jim Rice is at 703, so this doesn't really help the argument. Tim Raines at 677. I think rather than THOSE three stats, it should be the THREE best stats that make the case. So, for Raines, it would be SB, R, and H, and you'd have to mix them up in a certain way so that it doesn't make him overly qualified, etc, etc. For Rice, it would be BA, SLG, RBI.
12:17 PM Jun 25th
 
Kev
Just as ERA and Wins do not measure a pitcher as well as WHIP, and BB/SO, membership in the Hall of Fame doesn't automatically establish the player's credentials for greatness.

I would think that to measure a player's value, a system of significant depth such as, but not WS. I don't know that one exists or could be established, but the HOF is really no more than a medal of recognition, a society to which entrance is gained by a far from perfect system.

For me, the HOF is overrated, but fun.
11:51 AM Jun 25th
 
wovenstrap
Edgar Renteria?
10:51 AM Jun 25th
 
 
©2024 Be Jolly, Inc. All Rights Reserved.