
U Scores

March 11, 2008

          In the last couple of months there has been some talk about “suspicious numbers” for aging baseball players.  

I should say, in my opening words, that I want nothing to do with the concept of forensic sabermetrics. The problem with indicting baseball players based on shady statistics is that we lack the ability to reach that which is the goal of all forensics: certainty. Those are his fingerprints or they are not; we don’t need to know that they could be. That is her blood or it is not; those are his tire tracks or they are not; that is her handwriting or it is not. We need clear answers, not expert speculation. Saying that a player’s numbers are “suspicious” is dirtying a player’s reputation based on guesswork and inference.

            I’m not talking about that; I am not suggesting that any player’s statistics indicate on any level a likelihood of performance enhancing drug use.  We do, however, have other levels of deviance from the norm.   There is a “hierarchy of departure from the norm” which would go something like this:

A)    Criminal,

B)     Suspicious,

C)    Unusual,

D)    Normal.

We certainly cannot conclude, based on statistical analysis, that any player has engaged in criminal conduct, and I would regard it as reckless to say, based on the statistics, that anyone’s statistics are suspicious. 

We could, however, observe that certain statistical events are unusual.   Davey Johnson hit 43 homers in 1973 after having played in the majors for several years, hitting no more than 18.   That’s unusual.    Zack Wheat had a career-high 221 hits in 1925 at the age of 37.  That’s unusual.   It doesn’t mean that either of them was using steroids, but it’s unusual.  

Well, how unusual is it?   That’s what I’m trying to get here.   I’m working on a system to “score” how unusual the things are that any player does, with the goal in mind of being able to say, with a measure of objectivity, that “this player’s career is highly unusual, or moderately unusual,” or that it is “not unusual at all.”

The $64,000 question is, “What is unusual?”  Many things are unusual.   It’s unusual to hit more triples in a season than doubles.   It is unusual to have a higher on-base percentage than slugging percentage.   It is unusual to have more walks than hits.  It is unusual to drive in twice as many runs as you score.   None of these things is suspicious, but they are unusual.

In the process of this research I am going to isolate and study as many “unusual” types of accomplishments as I can.   However, first let me note two things that I am NOT going to regard as unusual:

1)  Excellence itself is not unusual.  Willie Mays is not “unusual” because he is great.

2)  Unusual consistency is not unusual (meaning, of course, that unusual consistency is not what we are trying to identify with this research.) 

What is unusual would be a long list, and we probably won’t get to everything.  However, let’s start with some of the things we enumerated before:

U1)  It is unusual to have more triples in a season than doubles.    I gave five points for each triple that any player hit in a season above his doubles total.   Harry Davis in 1897 hit 10 doubles, 28 triples, the most unusual relationship between those two categories in baseball history.  I gave him 90 points for that.
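For anyone who wants to check the arithmetic, here is a minimal Python sketch of the U1 rule as described above. The function name `u1_points` is my own label for illustration, not anything from the original database.

```python
def u1_points(doubles: int, triples: int) -> float:
    """Five points for each triple in excess of the doubles total (hypothetical helper)."""
    return 5.0 * max(0, triples - doubles)


# Harry Davis, 1897: 10 doubles, 28 triples
print(u1_points(doubles=10, triples=28))  # 90.0
```

A player with at least as many doubles as triples simply scores zero.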

U2)  It is unusual to have extra base hits on more than one-half or less than one-ninth of your hits.    I gave a player five points for each extra-base hit that his extra-base hit total was

a)      below one-ninth of his total hits (minimum 18 hits), or

b)      above one-half of his hits (minimum 10 hits). 

Juan Pierre in 2000 had 62 hits, of which only 2 were doubles, none triples and none home runs.   That’s unusual.   Pierre gets 24.4 points for that—62 divided by 9 is 6.889, minus 2 is 4.889, times five is 24.4.

Unusual combinations on that end used to be more common.    Willie Keeler in 1898 had 216 hits, of which 206 were singles, and Roy Thomas in 1900 had 168 hits, of which 161 were singles.   Altogether there are 1,051 players in my data base who qualify for points on U2-a, including one—Willie Bloomquist—in 2007.

In modern baseball it is more common (though still quite uncommon) to see a player with more extra-base hits than singles.    There are 348 players in my database who have more extra base hits than singles, led by Barry Bonds, 2001 (107 extra base hits, 49 singles) and Babe Ruth, 1921 (119 extra base hits, 85 singles).    Bonds is 29 hits above one-half of his hits, thus gets 145 “odd performance points” on this account.   There have been players getting points on this account since 1883.
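Both ends of U2 can be sketched in a few lines of Python. This is only an illustration of the rule as stated (exact one-ninth and one-half cutoffs, with the 18-hit and 10-hit minimums); the name `u2_points` is an assumption of mine.

```python
def u2_points(hits: int, extra_base_hits: int) -> float:
    """Sketch of U2: five points per extra-base hit below 1/9 or above 1/2 of total hits."""
    low_cut = hits / 9    # U2-a threshold, minimum 18 hits
    high_cut = hits / 2   # U2-b threshold, minimum 10 hits
    if hits >= 18 and extra_base_hits < low_cut:
        return 5 * (low_cut - extra_base_hits)
    if hits >= 10 and extra_base_hits > high_cut:
        return 5 * (extra_base_hits - high_cut)
    return 0.0


print(round(u2_points(hits=62, extra_base_hits=2), 1))  # Juan Pierre, 2000: 24.4
print(u2_points(hits=156, extra_base_hits=107))         # Barry Bonds, 2001: 145.0
```

With an exact one-ninth, Willie Keeler's 1898 season (216 hits, 10 extra-base hits) comes out at 70 rather than the 69.88 shown in the summary table. That figure is what you get by multiplying hits by .111 instead, which is my inference from the numbers and not something stated in the article.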

U3)  It is unusual to have more walks in a season than hits. There are 42,000 players in my data, of whom 41,000 have at least as many hits as walks, and about 1,000 have more walks than hits. Jack Crooks in 1892 had 95 hits, 136 walks, which stood as the record for its type until Barry Bonds came along. I gave each player three points for each walk that he had in excess of his hit total; Crooks's 41 excess walks are worth 123 points.
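A minimal Python sketch of the U3 calculation follows. The weight per excess walk is left as a parameter here; the U3 summary table (123 for Crooks, whose 136 walks exceed his 95 hits by 41) corresponds to three points per walk. The name `u3_points` is hypothetical.

```python
def u3_points(hits: int, walks: int, points_per_walk: float) -> float:
    """Sketch of U3: points for each walk in excess of the hit total."""
    return points_per_walk * max(0, walks - hits)


# Jack Crooks, 1892: 95 hits, 136 walks
print(u3_points(hits=95, walks=136, points_per_walk=3))  # 123
```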

U4)  It is unusual to have twice as many RBI in a season as runs scored. Under U4 I awarded two points for each run by which 60% of the player’s RBI exceeded his runs scored: RBI, times .6, minus runs scored, times 2.

Vic Wertz in 1960 had 45 runs scored, 103 RBI—the most “unusual” ratio of all time.   That’s worth 33.6 points (103, times .6, minus 45, times 2.)  Bengie Molina in 2007 was 81-38—the 17th most “unusual” ratio of all time, 21.2 points. 
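The Wertz and Molina figures can be reproduced with a short Python sketch of the U4 arithmetic. The function name is mine, and flooring negative results at zero is an assumption based on the fact that only unusual seasons earn points.

```python
def u4_points(runs: int, rbi: int) -> float:
    """Sketch of U4: two points per run by which 60% of RBI exceeds runs scored."""
    return max(0.0, 2 * (0.6 * rbi - runs))


print(round(u4_points(runs=45, rbi=103), 1))  # Vic Wertz, 1960: 33.6
print(round(u4_points(runs=38, rbi=81), 1))   # Bengie Molina, 2007: 21.2
```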

            U5)  It is unusual for a player to increase his career high in home runs after the age of 30.   86% of players have their career high in home runs by age 30.   For those who didn’t I gave them:

            One point for each Home Run by which they increased their career high at age 31.

            Two points for each additional Home Run at age 32.

            Three points at age 33, etc.

            Hank Sauer at age 30 had a career high in home runs of 5.   At age 31 he hit 35 home runs, increasing his career high by 30, and earning 30 points under U5.   Four years later he increased his career high to 37, earning him an additional 10 points (5 * 2 or, if you prefer,  (35 – 30) * (37 – 35)).   Two years after that he increased his career high in home runs by an additional four, earning him another 28 points (7 * 4 or, if you prefer, (37-30) * (41 – 37)).
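The age weighting in U5 is easier to see in code. The Python sketch below walks through a player's seasons in order and, whenever a new career high is set at age 31 or later, multiplies the increase by (age minus 30). The function name and the input format are assumptions for illustration, and the Sauer data is limited to the milestone seasons mentioned above.

```python
def u5_points(seasons: list[tuple[int, int]]) -> int:
    """Sketch of U5. `seasons` is a chronological list of (age, home_runs) pairs."""
    points = 0
    career_high = 0
    for age, home_runs in seasons:
        if home_runs > career_high:
            if age >= 31:
                # weight is 1 at age 31, 2 at age 32, 3 at age 33, and so on
                points += (age - 30) * (home_runs - career_high)
            career_high = home_runs
    return points


# Hank Sauer: career high of 5 through age 30, then 35, 37 and 41
sauer = [(30, 5), (31, 35), (35, 37), (37, 41)]
print(u5_points(sauer))  # 30 + 10 + 28 = 68
```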

U6)  It is unusual for a player to significantly increase or significantly decrease his home run frequency after establishing a major league baseline. To factor that in, I figured each player’s career home runs per plate appearance at the end of the season in which the player played his 500th career game (S-500). I then figured an “expected career home runs” for each player based on his career plate appearances and his career home run rate through S-500. I then gave the player one point for each two home runs, beyond a margin of ten, by which his actual career home run total differed from his expected total.

            Al Dark, for example, played his 500th game in 1951.  At the end of that season he had hit 36 career home runs in 2,507 plate appearances.   He finished his career with 7,829 career plate appearances.   We thus could have expected him to hit 112 home runs—actually, 112.42.  

            We thus treat any number between 102.42 and 122.42 as “normal”, and Dark would get no points under U6 if he were between those points.  But he hit 126 career home runs—3.58 more than we would regard as normal deviation.   For this, he gets 1.79 points.
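Here is the Al Dark calculation as a minimal Python sketch, assuming the U6 rule works exactly as described: project career home runs from the rate through S-500, allow ten home runs either way, and give half a point per home run beyond that. The function and parameter names are my own.

```python
def u6_points(hr_at_s500: int, pa_at_s500: int, career_pa: int, career_hr: int) -> float:
    """Sketch of U6: half a point per home run beyond +/- 10 of the expected career total."""
    expected_hr = hr_at_s500 / pa_at_s500 * career_pa
    beyond_margin = abs(career_hr - expected_hr) - 10
    return max(0.0, beyond_margin / 2)


# Al Dark: 36 HR in 2,507 PA through 1951; 126 HR in 7,829 career PA
print(round(u6_points(36, 2507, 7829, 126), 2))  # 1.79
```

Because the difference is taken as an absolute value, a player who falls well short of his expected total earns points the same way as one who exceeds it.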

Some people will object that U5 and U6 are redundant measures of the same trait. Yes, that’s true, they are, as are U9 and, to an extent, U8. We have imperfect measures of these unusual career paths. My belief is that by measuring this “late in life Home Run growth” in different ways, we get a better approximation of the underlying events than if we simply make one kind of flawed and arbitrary measurement.

            U7)  It is unusual for players to have seasons in which their OPS is more than 150 points (.150) distant from their career norm.    Norm Cash in 1961 had an OPS of 1.148.  His career OPS was .862—a discrepancy of 286 points (.286). 

In this category I ignored discrepancies occurring in 100 plate appearances or fewer, and credited points for discrepancies larger than .150 in more than 100 plate appearances. These were credited by the formula

Season OPS

Minus Career OPS (taking the absolute difference, since the season can be above or below the norm)

Minus .150 (set to zero if total is less than zero)

Times Plate Appearances minus 100

Divided by 1.5.

           

            For Norm Cash, 1961, this is 1.148, minus .862, minus .150, equaling .136.

Cash had 672 plate appearances, so we multiply this by 572, making 77.8.

            Divided by 1.5, making 51.8.

            Cash’s 51.8 “outlying season score” in 1961 is the 7th highest of all time. 
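The Norm Cash example can be checked with a short Python sketch of the U7 formula. It assumes the discrepancy is measured in either direction (hence the absolute value) and uses the 672 figure from the example as plate appearances; the function name is hypothetical.

```python
def u7_points(season_ops: float, career_ops: float, plate_appearances: int) -> float:
    """Sketch of U7: outlying-season score from the OPS gap beyond .150."""
    if plate_appearances <= 100:
        return 0.0  # seasons of 100 plate appearances or fewer are ignored
    gap = abs(season_ops - career_ops) - 0.150
    if gap <= 0:
        return 0.0
    return gap * (plate_appearances - 100) / 1.5


# Norm Cash, 1961: season OPS 1.148, career OPS .862
print(u7_points(1.148, 0.862, 672))
```

Carried through without intermediate rounding, this comes to roughly 51.86, in line with the 51.8 quoted above and the 52 shown in the U7 table.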

You are no doubt asking how I arrive at these contorted and arbitrary formulas. The answer is: I have too much time on my hands. No, seriously: I’m trying to give essentially equal weight to each type of “oddness”. I am trying to get essentially the same number of seasons labeled as “unusual” and the same number of points awarded for unusualness in each category. I fool around with the parameters until I get results in which

a)      about the same number of points are awarded, and

b)      the results seem reasonable.

A system in which Norm Cash, Luis Gonzalez, Brady Anderson and Jim Hickman are listed among the players having the most unusual seasons in history seems like a reasonable system.  

U8)  It is unusual for players to have prime seasons at ages before 24 or after 30. Not terribly unusual, but 72% of prime seasons are between the ages of 24 and 30.

How do we measure this?  I’m only dealing with position players here, not pitchers.  I figured the “season score” for each season, and identified all seasons which exceeded .800 of the player’s career high season score as a prime season.    I then subtracted .800 times the player’s career high season score from each season which exceeded .800 of the career high.

This then was multiplied by

7 if the player was 18 years old

6 if he was 19

5 if he was 20

4 if he was 21

3 if he was 22

2 if he was 23

But zero if he was 24.   This was done, of course, because it is more unusual to have a prime season at a very early age.

On the other end, this was multiplied by

1 if he was 31.

1.5 if he was 32.

2 if he was 33.

2.5 if he was 34.

3 if he was 35,

Etc.

And the product of that was divided by 12.

OK, Bob Horner, 1980. Season score, 225. His career high was 260. .800 times 260 is 208. 225 minus 208 is 17. Horner’s 1980 is a prime season because it exceeds 80% of his career high, and it exceeds 80% of his career high by 17 points.

Horner was 22 years old, so we multiply the 17 points by 3, and divide by 12.   Horner winds up with 4 “odd season points” for having a prime season at age 22. 
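The U8 steps can be sketched in Python as follows. The season score itself is not defined in this article, so it is simply taken as an input. The multiplier table is written as a formula that reproduces the listed values (7 at age 18 down to 2 at age 23, then 1 at age 31 rising by half a point per year); extending it below 18 or past 35 is my reading of the "Etc." and should be treated as an assumption, as should the function names.

```python
def age_multiplier(age: int) -> float:
    """Multiplier for a prime season at an off-prime age (zero for ages 24 through 30)."""
    if age <= 23:
        return 25 - age              # 18 -> 7, 19 -> 6, ..., 23 -> 2
    if age >= 31:
        return 1 + 0.5 * (age - 31)  # 31 -> 1, 32 -> 1.5, ..., 35 -> 3
    return 0


def u8_points(season_score: float, career_high_score: float, age: int) -> float:
    """Sketch of U8: excess over 80% of the career-high season score, age-weighted, over 12."""
    excess = season_score - 0.8 * career_high_score
    if excess <= 0:
        return 0.0  # not a prime season
    return excess * age_multiplier(age) / 12


# Bob Horner, 1980: season score 225, career high 260, age 22
print(round(u8_points(225, 260, 22), 2))  # 4.25
```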

U9) It is unusual for a player to increase his career high in Home Runs after playing more than 500 games. This, of course, is an amalgam of points 5 and 6, which attempted awkwardly to measure the same thing. In this category, we simply award one point for each Home Run by which the player increases his career best in homers after the season in which he plays his 500th game. Davey Johnson, for example, gets 33 points because his career high in home runs at the end of the season in which he played his 500th game was 10, and he subsequently improved that to 43. Hank Sauer, on the other hand, gets only 6 points in this category because, while his home runs were hit late by age, he did hit 35 home runs in his first season as a regular.
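U9 is the simplest of the ten rules, and a Python sketch of it is only a subtraction. The function name is mine, and the Sauer inputs (35 and 41) are taken from the U5 discussion earlier in the article.

```python
def u9_points(high_at_s500: int, final_career_high: int) -> int:
    """Sketch of U9: one point per home run added to the career best after S-500."""
    return max(0, final_career_high - high_at_s500)


print(u9_points(high_at_s500=10, final_career_high=43))  # Davey Johnson: 33
print(u9_points(high_at_s500=35, final_career_high=41))  # Hank Sauer: 6
```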

U10) It is unusual for a player to have an on-base percentage which is 15% higher or 33% lower than his slugging percentage. In this category I ignored players who had fewer than 100 plate appearances. For players whose on base percentage was more than 15% higher than their slugging percentage, I subtracted 1.15 times the slugging percentage from the on base percentage, and multiplied the difference by Plate Appearances minus 100.

Bill North in 1980 had an on base percentage of .373, a slugging percentage of just .292.    His on base percentage was 37 points higher than 15% more than his slugging percentage.   He had 500 plate appearances.   Multiplying .037 times 400, then, North receives 15 “odd points” for having an unusual ratio between his on base percentage and his slugging percentage.

On the other end, players are credited with U-points if their on-base percentage is less than two-thirds of their slugging percentage. Same process, reversed; two-thirds of slugging percentage, minus on base percentage, times plate appearances minus 100.

Victor Diaz in 2007 had a .259 on base percentage, a .538 slugging percentage, which is the most extreme ratio ever between those two stats (in 100 or more plate appearances, although there are a few players not included in my data.)   Two-thirds of his slugging percentage is .359, minus his on base percentage leaves .100.   He had only 108 plate appearances, however, so he is credited with only 8/10 of one “U point”.
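Both directions of U10 fit in one small Python sketch. Only one of the two gaps can be positive for any given player, so taking the larger of the two (or zero) covers both cases. The function name is an assumption for illustration.

```python
def u10_points(obp: float, slg: float, plate_appearances: int) -> float:
    """Sketch of U10: unusual relationship between on base and slugging percentage."""
    if plate_appearances < 100:
        return 0.0
    high_obp_gap = obp - 1.15 * slg     # on base percentage more than 15% above slugging
    low_obp_gap = (2 / 3) * slg - obp   # on base percentage below two-thirds of slugging
    return max(0.0, high_obp_gap, low_obp_gap) * (plate_appearances - 100)


print(u10_points(0.373, 0.292, 500))  # Bill North, 1980
print(u10_points(0.259, 0.538, 108))  # Victor Diaz, 2007
```

Without intermediate rounding North comes out at about 14.9 and Diaz at about 0.8, matching the rounded figures in the text.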

 

 

OK, that’s all I have so far.   There are many, many other kinds of “unusual” accomplishments for hitters, and, if I get time, I’ll add points for other unusual career progressions or unusual combinations of events.   I hope you will tell me, in the space below, what things you would regard as unusual occurrences for hitters and (later) for pitchers, and I hope that I’ll get time to incorporate some of those things into the system. 

But right now, let me summarize the results of points U1 to U10.  

 

U1 is points awarded for having more triples in a season than doubles.   The top ten seasons in this area are:

 

Rank  Player            Year  U1
1     Harry Davis       1897  90
2     Chief Wilson      1912  85
3     Duff Cooley       1895  55
4     Bill Kuehne       1885  50
5     Heinie Reitz      1894  45
6     Hughie Jennings   1899  45
7     Deion Sanders     1992  40
8     Edd Roush         1916  40
9     Tommy Leach       1902  40
10    Eleven tied with        35

 

A total of 9,465 points are awarded (through 2007) under rule U1.

 

U2 is points awarded for having extra base hits on more than one-half or less than one-ninth of the player’s total hits.   The top ten seasons in this area are:

 

Rank  Player           Year  U2     High/Low
1     Barry Bonds      2001  145    High
2     Babe Ruth        1921  85     High
3     Albert Belle     1995  82.5   High
4     Mark McGwire     1998  75     High
5     Mark McGwire     1999  72.5   High
6     Willie Keeler    1898  69.88  Low
7     Babe Ruth        1920  65     High
8     Willie Stargell  1973  60     High
9     Roy Thomas       1900  58.24  Low
10    Jim Edmonds      2003  57.5   High

 

A total of 10,289 points are awarded under rule U2.

 

 

U3 is points awarded to players with more walks in a season than hits.   The top ten players are:

 

 

Rank  Player          Year  U3
1     Barry Bonds     2004  291
2     Barry Bonds     2002  147
3     Jack Crooks     1892  123
4     Barry Bonds     2007  114
5     Jimmy Wynn      1976  102
6     Roy Cullenbine  1947  99
7     Eddie Yost      1956  96
8     Yank Robinson   1890  93
9     Ferris Fain     1955  81
10    Wes Westrum     1951  75

 

And a total of 10,050 points are awarded under U3.   

 

U4 is points awarded to hitters whose Runs Scored are less than 60% of their RBI.  The top ten hitters:

 

Rank  Player         Year  U4
1     Vic Wertz      1960  33.6
2     Earl Sheely    1931  32.4
3     Larry McLean   1910  31.2
4t    Bob Oliver     1974  24.8
4t    John Bateman   1963  24.8
4t    Smoky Burgess  1965  24.8
7     Chief Meyers   1910  24.4
8t    Doc Miller     1914  23.6
8t    Rusty Staub    1983  23.6
8t    Smead Jolley   1931  23.6
8t    Terry Kennedy  1983  23.6

 

A total of 9,794 points are awarded under rule U4.

 

U5 is points awarded to hitters who establish new career highs in home runs at age 31 or later.   The top seasons in this regard are:

 

Rank  Player            Year  U5
1     Barry Bonds       2001  162
2     George Crowe      1957  96
3     Luke Easter       1950  92
4t    Andres Galarraga  1996  80
4t    Bob Thurman       1957  80
6     Luis Gonzalez     2001  78
7     Carlton Fisk      1985  77
8     Terry Steinbach   1996  76
9     Cy Williams       1923  75
10    John Vander Wal   2000  72

 

Bonds in 2001 established a new career high in home runs by 27 (73 vs. 46), at the age of 36.   That’s 162 points—easily the highest total in baseball history.  A total of 8,598 points are awarded under rule U5.   George Crowe and Bob Thurman, 2nd and 4th on the list, were teammates on the 1957 Cincinnati Reds, both of them veterans of the Negro Leagues. 

 

U6 is points awarded to hitters who increase or decrease their home run rates after the season in which they played their 500th career game.   All of the players in the top ten increased their home run rate, although many players—Eddie Mathews, for example—do lose home runs as they age:

 

 

 

 

Rank  Player           U6
1     Barry Bonds      154
2     Rafael Palmeiro  153
3     Sammy Sosa       131
4     Ken Griffey Jr.  109
5     Stan Musial      108
6     Steve Finley     102
7     Rogers Hornsby   88
8     Lou Whitaker     88
9     Gary Sheffield   80
10    Robin Yount      76

 

 

A total of 10,460 points are awarded under rule U6.  These points, of course, are awarded only once in a career, as opposed to being potentially awarded in multiple seasons. 

 

U7 is points awarded to hitters who have an OPS 150 points higher or lower than their career norm.    The top ten seasons of all time under rule U7 are:

 

Rank  Player         Year  U7
1     Barry Bonds    2002  94
2     Barry Bonds    2004  76
3     Hugh Duffy     1894  73
4     Barry Bonds    2001  67
5     Sammy Sosa     2001  60
6     Tip O'Neill    1887  57
7     Norm Cash      1961  52
8     Luis Gonzalez  2001  49
9     Barry Bonds    1989  48
10    Fred Dunlap    1884  44

 

 

 

In 1929 Mel Ott hit .328 with 42 home runs, 151 RBI.   He was 20 years old, and these were easily the best raw numbers that he was ever to have.   This is the most remarkable peak season at an off-prime age of all time:

 

 

Rank  Player          Year  U8
1     Mel Ott         1929  47
2     Barry Bonds     2001  42
3     Barry Bonds     2002  40
4     Barry Bonds     2004  40
5     Joe Jackson     1911  35
6     Al Kaline       1955  33
7     Jim O'Rourke    1890  32
8     Alex Rodriguez  1996  32
9     Eddie Mathews   1953  31
10    Joe Kelley      1894  31

 

A total of 10,371 points are awarded under rule U8. 

 

Points under U9 are awarded to hitters who increase their career-best home run total after the season in which they play their 500th game.   The top ten increases of all time are:

 

Rank  Player           U9
1     Barry Bonds      48
2     Luis Gonzalez    42
3     Rogers Hornsby   34
4t    Rafael Palmeiro  33
4t    Sammy Sosa       33
4t    Davey Johnson    33
7     Tilly Walker     31
8t    Ken Griffey Jr.  29
8t    Brady Anderson   29
10t   Cy Williams      28
10t   Steve Finley     28

 

 

U10 is points awarded for an unusual relationship between on base percentage and slugging percentage.   The top ten players in this area are:

 

Rank  Player         Year  OBP   SLG   U10
1     Roy Thomas     1900  .451  .335  38
2     Yank Robinson  1890  .434  .281  35
3     Sammy Sosa     1999  .367  .635  35
4     Goat Anderson  1907  .343  .225  34
5     Sammy Sosa     1998  .377  .647  34
6     Matt Williams  1994  .319  .607  33
7     Sammy Sosa     2001  .437  .737  33
8     Dave Kingman   1979  .343  .613  32
9     Jack Crooks    1890  .357  .254  32
10    Javier Lopez   2003  .378  .687  32

 

 

A total of 10,693 points are awarded under rule U10, about evenly split between players with high on base percentages and players with high slugging percentages.

 

Summarizing these ten categories of performance, you might guess that Barry Bonds would score as having the most unusual career of all time. In fact, you might suspect that I have set up the system so that Bonds always comes to the fore. I certainly did not. I doubt that it would be possible to measure these types of unusual accomplishments in such a way that any player other than Bonds would come to the front. You could measure them to get different totals, different rankings, but these are the most unusual players of all time by my system so far:

 

Rank  Player            Total
1     Barry Bonds       1974
2     Mark McGwire      803
3     Sammy Sosa        521
4     Babe Ruth         511
5     Roy Thomas        393
6     Yank Robinson     323
7     Max Bishop        303
8     Andres Galarraga  293
9     Ken Griffey Jr.   293
10    Luis Gonzalez     288
11    Rogers Hornsby    288
12    Cy Williams       283
13    Jack Crooks       278
14    Rafael Palmeiro   268
15    Brady Anderson    262
16    Hank Aaron        246
17    Eddie Yost        244
18    Willie Keeler     240
19    Gene Tenace       237
20    Ken Caminiti      237
21    Albert Belle      237
22    Jack Clark        228
23    Edgar Martinez    227
24    Willie McCovey    222
25    Juan Gonzalez     212

           These are all players who had longer careers.   At some point, when I have done more research, I would need to compare the player’s “U Score”—his total of odd accomplishments—against his games played.   I will look at more things, or anyway I plan to.   I will look at pitchers.  Perhaps some of the 1990s/turn of the century players will slide down the list a little.   It is hard to imagine Barry Bonds sliding out of the position of having had the most unusual career of all time.

 
 

COMMENTS (10 Comments, most recent shown first)

benhurwitz
On the U4 all-time list (points awarded to hitters whose Runs Scored are less than 60% of their RBI), at least 2 of these guys (Burgess, Staub) were pinch-hitting specialists. Burgess in 1965 played 80 games, had 77 at-bats. Staub in '83: 104 games, 115 at-bats. Obviously they had some walks too. When reaching base, they must have been pinch-run for frequently, to account for part of their "U". I think the point of this "U" should have been to deal with starting players, not pinch hitters. There could be some interesting "U" categories just for pinch hitting, of course.
9:57 AM Apr 9th
 
Gno
I assume if James looks at pitchers, Clemens may crack the top five, and Randy Johnson will be listed as well.
10:39 PM Mar 20th
 
mskarpelos
Bill,

Have you read _Game of Shadows_, and do you object to the authors' (Lance Williams and Mark Fainaru-Wada) use of statistics in one of the book's appendices as evidence that Bonds took steroids? From this article it seems that you object to using statistics alone as evidence of steroid use (or any other improper behavior), but when used in conjunction with other evidence (as Williams and Fainaru-Wada do in their book), using statistics this way may have some merit. Your thoughts?

For what it's worth, I think Bonds took steroids and probably HGH as well, but if I were on the jury of his perjury trial, I would vote to acquit because I don't think the evidence reaches the reasonable doubt standard.
8:15 PM Mar 20th
 
jdurkee
Bill, why don't you make data sets of performance -- by position, by era, by team, by league, or by whatever population you want to study?

That gives you a distribution.

Then use the "Student" t-test to query your distribution about what is outside of "normal" in that distribution -- to the extent of confidence limits you want it to be outside normal.

That's what the t-test was designed to do -- compare populations to see the chance they are from the same distribution.

Course I could be wrong..... :)

John Durkee
4:41 PM Mar 16th
 
bjames
There are 104 players in history who have 100 more strikeouts than walks--86 through 2000, when Geoff Jenkins did it, and 19 more since 2000. The first to do that were Dave Nicholson and Dick Stuart in '63, then Billy Cowan and Nelson Matthews in '64. So that is an "unusual" feat and could be charted as such, as many other events also could.
1:55 PM Mar 16th
 
tommyr
Hi Bill. After the 2000 season, I noticed that then Milwaukee Brewers slugger Geoff Jenkins amassed 135 strikeouts and only 33 walks. This seemed unusual to me, a player who has 100 more K's than BBs. At the time, I did a search using Lee Sinins Sabermetric Encyclopedia, and I believe that only one or two players in history had more than 100 more strikeouts than walks, previously. I do not have that software on this computer, so maybe someone can confirm this. If so, it would seem to be highly unusual.

regards


2:48 PM Mar 14th
 
cunegonde
Given the large number of current and recent players in the mix, I would speculate that the nature of the "unusual" is based on standards that hold true over time broadly speaking, but don't take sufficiently into account recent changes in the game that are changing the definition of "normality." If that is true, there will be a lot more "unusual" performances in the past 20 years or so than previously. Given that a number of other "unusual" performances are from the first 40 or so years of professional ball, I would speculate that the most "normal" period in baseball history is roughly from 1930 (after Ruthball was an established fact) until about 1990, when Ruthball Chapter 2 settled in for good.
8:04 PM Mar 13th
 
those
I think runs for U4 is correct, because it says 60 percent of RBIs. If it were home runs, Hack Wilson would be listed as unusual for having less than 114 home runs in 1930.
12:35 PM Mar 13th
 
jimbo
Excellent article, detailed and thought-provoking.
7:44 AM Mar 13th
 
rpriske
I assume under U4 it should say 'runs' and not 'home runs' (as that would not be unusual at all, I would think).
12:20 PM Mar 12th
 
 
©2024 Be Jolly, Inc. All Rights Reserved.