BILL JAMES ONLINE

The Right Weight

June 19, 2017
 2017-30

The Proper Weight for the Last Start

 

              OK, first of all, I want you yahoos to know that you kept me up all night, and I need the sleep, frankly.    I do have other things I should be working on.  

              Anyway, I saw how we could "test" whether the proper weighting for the most recent start was 1%, 1.5%, 2%. . . .whatever. . .so I had to do it.   Lot of work.  

              Here’s what I did.   I made a simplified version of the ranking system, meaning that it was basically the same thing but missing some of the bells and whistles.    In the "real" system, I adjust the Game Score for every game by considering the park in which the game is played and also the quality of the opposing offense.   In this system I didn’t do that.   It makes a very minor difference. . .really no practical difference at all in most cases.  

              In the "real" system I create rankings for every day of the calendar year by modifying the pitcher’s last score by the length of time since he has pitched.   That’s a LOT of work, a lot of work, and I didn’t do that.    I only modified the scores when the pitcher actually pitched, and I only made leader boards for each five-day period of time.   If a pitcher started twice in a five-day period he would be listed twice, and if he didn’t start in those five days he wouldn’t be listed at all.   That makes more difference than the other thing does, but I don’t see why it should queer the results of the study, which is all that I am interested in.

              In the "real" system I include post-season starts.   In this version I didn’t include them, because they’re not in the data.   That makes some difference in who would rate where, but again, I don’t see ANY reason why it would queer the study, although if Guy123 was still with us I am sure that he would come up with several reasons.    In the "real" system we move pitchers backward from season to season based on how many days they are inactive; in this version I just moved pitchers backward 100 points at the start of every new season.  

              Anyway, it’s a simplified, stripped-down version of the ranking system, and here’s what I did with it.   I varied the "weighting percentage" for the most recent start from 1% to 6.5%, doing ratings (and thus rankings) at every half-step in between:  1%, 1.5%, 2%, 2.5%, 3%, 3.5%, 4%, 4.5%, 5%, 5.5%, 6% and 6.5%.
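
              To make the "weighting percentage" concrete, here is a minimal sketch of a sweep across the twelve weights.   It assumes the rating is a running score in which each new start replaces the stated share of the old rating, on a scale of ten times Game Score and starting from 300.   The article does not give the exact update formula, so `update_rating`, `SCALE` and the sample Game Scores are illustrative assumptions, not the real system.

```python
START = 300.0  # starting rating used in the article's test
SCALE = 10.0   # assumption: a Game Score of 50 corresponds to a rating of 500

# The twelve weights in the study, in percent: 1.0, 1.5, ..., 6.5
WEIGHTS_PCT = [1.0 + 0.5 * i for i in range(12)]


def update_rating(rating: float, game_score: float, weight_pct: float) -> float:
    """Give the newest start `weight_pct` percent of the rating (assumed form)."""
    w = weight_pct / 100.0
    return (1.0 - w) * rating + w * SCALE * game_score


def final_rating(game_scores, weight_pct: float, start: float = START) -> float:
    """Apply the update once per start, in order, and return the last rating."""
    rating = start
    for gs in game_scores:
        rating = update_rating(rating, gs, weight_pct)
    return rating


if __name__ == "__main__":
    # Made-up Game Scores for an eleven-start hot streak.
    hot_streak = [75, 68, 81, 62, 77, 70, 84, 66, 73, 79, 71]
    for pct in WEIGHTS_PCT:
        print(f"{pct:>4}%  {final_rating(hot_streak, pct):6.1f}")
```

              Under these assumptions a heavier weight pulls the rating toward recent form faster, which is the effect the study sets out to measure.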

 

              Having done that, I figured each pitcher’s record during his next ten starts—his average game score, wins, losses, innings pitched, earned runs allowed, and ERA.    The theory is. . .the theory will eventually be shown true. . .the theory is that (a) pitchers who rate higher should perform better over their next ten starts, and (b) if the system gets to be more effective, more predictive, then the difference between the high-ranking pitchers and the low-ranking pitchers should increase.    Right?
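
              The "next ten starts" record can be sketched as follows.   This is only an illustration of the bookkeeping described above; the `Start` record and the function name are hypothetical, and ERA is computed from outs as 27 times earned runs divided by outs.

```python
from dataclasses import dataclass


@dataclass
class Start:
    game_score: int
    outs: int          # outs recorded (innings pitched times 3)
    earned_runs: int
    decision: str      # "W", "L", or "" for no decision


def follow_up_line(starts, index, window=10):
    """Summarize up to `window` starts that follow starts[index].

    Returns None when the pitcher has no later starts at all.
    """
    nxt = starts[index + 1 : index + 1 + window]
    if not nxt:
        return None
    outs = sum(s.outs for s in nxt)
    earned_runs = sum(s.earned_runs for s in nxt)
    return {
        "games": len(nxt),
        "avg_game_score": sum(s.game_score for s in nxt) / len(nxt),
        "wins": sum(s.decision == "W" for s in nxt),
        "losses": sum(s.decision == "L" for s in nxt),
        "outs": outs,
        "earned_runs": earned_runs,
        # 9 innings = 27 outs; undefined if no outs were recorded
        "era": 27.0 * earned_runs / outs if outs else None,
    }
```

              A pitcher near the end of his career or the end of the study period simply contributes fewer than ten follow-up games.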

              Joedimino suggested that we should "see how they do in their next start or 5 starts, whatever, but a short timeframe, so ability is basically unchanged."   That’s not QUITE right, I don’t think.   One thing that makes #1s #1s and #57s #57s is not just that MOMENT, but that there is STABILITY in their performance.   I used 10 starts rather than 1 or 5, because if that highly-rated pitcher loses his level of effectiveness almost immediately, that’s relevant, rather than irrelevant.   It MATTERS whether he can hold on to that level of effectiveness.    But basically that’s what I did, only I used 10 starts rather than 1 or 5.

 

              Also, there is a very, very, very important difference between EXACTLY what I am trying to do and exactly what Tango is doing, I think, which has potentially major impact on the issue of whether what I have found is relevant to HIS method.   

              Tango says he uses about 1% "decay rate" per start, whereas I am using 3%.    But on the issue of which value creates more accurate ratings, there is this very crucial difference:  that I start every pitcher out at the bottom of the scale and make them fight their way up the chart, whereas I would not assume that Tango has done this.   I am assuming that what Tango is saying is simply that, in valuing the pitcher’s statistics, he de-emphasizes what was done a year ago by 30%, two years ago by 51%, three years ago by 65.7%, etc.    That is very different, in that it assumes that what happened in the more distant past was a vacuum, having no impact on the rating.   My method essentially assumes that there’s a "dead weight" back there at the start of the process, and the pitcher has to prove that he has shaken off that dead weight in order to move up the scale.   Some of you aren’t going to understand what in the hell I am talking about, but Tom will understand it.  Tom is essentially starting everyone off in the middle of the scale. . .that’s not EXACTLY true, but that’s as close as I can come to explaining it.   I’m starting everyone off at the BOTTOM of the scale.   It makes a big difference.  
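
              The 30%, 51% and 65.7% figures are what compounding gives if a year-old result keeps 70% of its weight: 1 - 0.7, 1 - 0.7^2, 1 - 0.7^3.   The sketch below shows that arithmetic and then contrasts the two starting assumptions.   Both rating functions are simplified readings of the paragraph above, with hypothetical names and an assumed scale of ten times Game Score; neither is Tango’s code nor the real ranking system.

```python
def discount_after(years: int, keep_per_year: float = 0.70) -> float:
    """Fraction of weight removed from a result that is `years` seasons old."""
    return 1.0 - keep_per_year ** years


def dead_weight_rating(game_scores, weight, start=300.0, scale=10.0):
    """Running score that begins at `start`; the starting value fades slowly."""
    rating = start
    for gs in game_scores:
        rating = (1.0 - weight) * rating + weight * scale * gs
    return rating


def vacuum_rating(game_scores, weight, scale=10.0):
    """Decay-weighted average of the starts alone, with nothing behind them."""
    n = len(game_scores)
    weights = [(1.0 - weight) ** (n - 1 - i) for i in range(n)]
    total = sum(w * gs for w, gs in zip(weights, game_scores))
    return scale * total / sum(weights)


if __name__ == "__main__":
    print([round(discount_after(y), 3) for y in (1, 2, 3)])
    newcomer = [65] * 10  # ten strong starts from a pitcher with no history
    print(round(dead_weight_rating(newcomer, 0.03), 1))
    print(round(vacuum_rating(newcomer, 0.03), 1))
```

              With no history behind him, the newcomer is rated on his ten starts alone in the second function, while in the first he is still dragging most of the 300 he started with.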

              The first place it makes a difference is in the "backward movement" between seasons.   In the real system I let a pitcher’s "value number" decay slowly when he is not pitching, such as between seasons.   It causes a pitcher to go backward by about 100 points over the winter, more or less; Kershaw ends one season at 620 and starts the next one at 520.   This makes sense, to me, because pitchers DO very routinely lose all effectiveness over the course of a winter, so each pitcher has to "prove himself" again every year.   

              But in this test, when you start everybody out at 300.000 and apply the one-percent per-start adjustment that Tom suggested, nobody gets to 400 points in the first year, so everybody goes back to 300 at the start of every year. . .not absolutely, but generally; it takes a really strong season to get from 300 to 400 in one season, if we’re only making 1% adjustments per game.

              SO I had to fix the system for that problem.   I did that by changing the "backward shift" between seasons to 33 points, rather than 100, when the weighting for each start was only 1%.    When the weighting was changed to 1.5%, then the backward drift was increased to 50 points; when the weighting was increased to 2%, then the backward drift was increased to 66 points.   When the weighting was changed to 2.5%, then the backward drift was 83 points, and when the weighting was 3% or more than 3%, then the backward drift was 100 points between seasons.  

              I don’t know if that makes sense to everybody.   When you give more weight to each start, then the pitchers move more rapidly away from the 300.000 where everybody starts, so that in one season they are able to get to 450 or 500.    Then, when you move everybody backward by 100 points, that doesn’t reset the system so that all pitchers start the season even;  it just re-sets the point totals so that the good pitchers are still ahead, but the numbers are lower. 
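
              A rough sketch of the between-season adjustment, for anyone who wants to see the numbers.   The shift values are the ones listed above.   Everything else is an assumption made for illustration: the update is taken to be a running score on a scale of ten times Game Score, a season is taken to be 33 starts, and the rating is not allowed to fall below the 300 starting point (a reading of "everybody goes back to 300").

```python
START = 300.0
SCALE = 10.0  # assumption: rating scale is ten times Game Score

# Between-season shifts listed in the article, keyed by weight in percent.
SHIFT_BY_WEIGHT = {1.0: 33, 1.5: 50, 2.0: 66, 2.5: 83}


def backward_shift(weight_pct: float) -> int:
    """Points removed over the winter for a given per-start weight."""
    if weight_pct >= 3.0:
        return 100
    return SHIFT_BY_WEIGHT[weight_pct]


def season_end_rating(avg_game_score, weight_pct, starts=33, begin=START):
    """Closed form of `starts` repeated updates at a constant Game Score."""
    keep = (1.0 - weight_pct / 100.0) ** starts
    target = SCALE * avg_game_score
    return target + (begin - target) * keep


def next_season_begin(avg_game_score, weight_pct, starts=33):
    """Rating carried into the next season, floored at START (assumed)."""
    end = season_end_rating(avg_game_score, weight_pct, starts)
    return max(START, end - backward_shift(weight_pct))


if __name__ == "__main__":
    for pct in (1.0, 1.5, 2.0, 2.5, 3.0):
        end = season_end_rating(60, pct)
        print(pct, round(end, 1), round(next_season_begin(60, pct), 1))
```

              Under these assumptions a pitcher averaging a Game Score of 60 finishes a 1% season short of 400, so a flat 100-point shift would put him right back at 300; the 33-point shift lets him keep part of what he earned.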

              Anyway. . . what weighting you use sometimes makes a tremendous difference in where pitchers are rated.     Wayne Twitchell in 1973 pitched a series of brilliant games.   In a stretch of 11 starts he pitched four shutouts and several other excellent games, including a four-hitter in which the only run was un-earned.   If you weight each of those starts at 6.5%, then Twitchell at the end of that run ranks as the #11 starting pitcher in baseball.   If you weight each of them at 1%, he ranks 97th.     Big difference.   Or, perhaps a more relatable example:  remember that tremendous run that Kris Medlen had in 2012?    If you weight each start at 6.5%, Medlen climbs to 26th in the ratings.   If you weight each start at 1%, he is still 98th.  

              Or, on the other end, Pedro Martinez in 2008, when he made 20 starts with a 5.61 ERA.   If you weight each start at 1%, Pedro still ranks as the #4 starting pitcher in baseball.   If you weight each start at 6.5%, he ranks 105th.    Significant difference there.    Those are the extreme examples, but every great pitcher has a phase like that at the end of his career, when he can be ranked anywhere from 10th to 95th based on what weight you give to his recent outings.   Roy Halladay, Steve Carlton, Randy Johnson. . .they all take a quick tumble like that late in their careers.  

 

              OK, so having explained that, here’s what happens.   I rated all pitchers from 1952 to 2013, and I rated them in each five-day window in that period.   Then I eliminated all the data prior to 1960, because there are gaps in the early data and because you need time after you start the rating system to allow pitchers to find their level.   Finally, I eliminated all the data from five-day windows in which fewer than 105 pitchers made a start.

              We have lots and lots of data; we don’t need to mess with the "weak" data.   We can work only with the "strong" data.   Sometimes you’ll have a five-day window at the start of a season or at the end of a season or over the All-Star break in which there are only a few games played, so the pitcher who rates #1 in that time period might be not a worthy representative of a #1 pitcher.   We don’t need that data. 
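
              The window filter might look something like the sketch below.   The article does not say how the five-day windows are anchored or whether the 105 counts distinct pitchers or listed starts, so the `anchor` date and the choice to count listed starts are assumptions, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

WINDOW_DAYS = 5
MIN_LISTED = 105  # the article's cutoff; thinner windows are dropped


def window_index(game_date: date, anchor: date) -> int:
    """Number of whole five-day periods between `anchor` and `game_date`."""
    return (game_date - anchor).days // WINDOW_DAYS


def strong_windows(starts, anchor: date):
    """Group (pitcher, date) starts by window and keep only the full windows.

    A pitcher who starts twice in one window is listed twice, as in the article.
    """
    by_window = defaultdict(list)
    for pitcher, game_date in starts:
        by_window[window_index(game_date, anchor)].append(pitcher)
    return {w: p for w, p in by_window.items() if len(p) >= MIN_LISTED}


if __name__ == "__main__":
    anchor = date(1960, 4, 12)  # arbitrary anchor chosen for the example
    busy = [(f"p{i}", anchor + timedelta(days=i % 5)) for i in range(105)]
    thin = [(f"q{i}", anchor + timedelta(days=5)) for i in range(10)]
    print(sorted(strong_windows(busy + thin, anchor)))
```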

              We are left with 1,412 rating periods, and thus with 14,120 pitchers who rank 1st through 10th during one of those rating periods.     If we look at the next ten starts for those 14,120 pitchers, then, we would have 141,200 games, understanding that many of these are redundant counts, because the pitchers who rank highly in one five-day rating period almost always rank highly in the NEXT five-day rating period as well.    We will call these the "follow up" games. 

              We don’t quite have 141,200 follow up games, because not everybody makes 10 more starts before their career ends or we hit the end of the study period or something.   We have 139,888 follow up games (for pitchers rated 1 through 10).  

              The average Game Score for those 139,888 games by the pitchers rated 1 through 10 (with a 1% weighting for each start) was 56.49.  They had 63,020 wins and 44,428 losses for a .587 winning percentage, and they had a 3.36 ERA. 

              Next, let’s look at the pitchers who ranked 50th to 59th on the list.   There are also 14,120 of these pitchers.     These 14,120 pitchers had 136,885 follow up games.   THEIR average Game Score was 50.32, their winning percentage was .502 (49,832-49,457) and their ERA was 4.12. 

              Finally, we look at the pitchers who ranked in spots 96 to 105 in each of the 1,412 study periods, another 14,120 pitchers.    These pitchers had 125,599 follow up games, with an average Game Score of 48.11, a .450 winning percentage (39,365-48,163), and a 4.51 ERA.  
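
              The averages quoted for the three groups can be reproduced from the totals in the article’s chart.   The sketch below does that for the 1% weighting; the function names are illustrative, and ERA is figured from outs (27 outs to a nine-inning game).

```python
def average_game_score(total_game_score: int, games: int) -> float:
    return total_game_score / games


def winning_percentage(wins: int, losses: int) -> float:
    return wins / (wins + losses)


def era(earned_runs: int, outs: int) -> float:
    return 27.0 * earned_runs / outs  # 9 innings = 27 outs


# Totals for the 1% weighting, copied from the article's chart.
GROUPS_AT_1_PCT = {
    "1 to 10": dict(games=139888, total_gs=7902691, wins=63020,
                    losses=44428, outs=2884593, er=359034),
    "50 to 59": dict(games=136885, total_gs=6888067, wins=49832,
                     losses=49457, outs=2547200, er=388793),
    "96 to 105": dict(games=125599, total_gs=6042409, wins=39365,
                      losses=48163, outs=2176415, er=363927),
}


def summarize(group):
    """Return (average Game Score, winning percentage, ERA), rounded."""
    return (
        round(average_game_score(group["total_gs"], group["games"]), 2),
        round(winning_percentage(group["wins"], group["losses"]), 3),
        round(era(group["er"], group["outs"]), 2),
    )


if __name__ == "__main__":
    for rank, group in GROUPS_AT_1_PCT.items():
        print(rank, summarize(group))
```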

              So we have what a businessman would call proof of concept.    The data shakes out the way it SHOULD shake out.   The highly-rated pitchers do in fact pitch much better, over their next ten starts, than the lower-rated pitchers.    In the study, six pitchers had 10-start stretches in which they went 10-0:  Roger Clemens, Bob Gibson, John Smoltz, Gaylord Perry, Justin Verlander and Brandon Webb.    No one went 0-10; Hideo Nomo went 0-9 with an 8.60 ERA. 

              Now, having established that the process works, we can then compare whether it works better or worse if we increase the weight given to each start.  

              First, I compared the two ends of the chart—1% per start, and 6.5% per start.  

              It turns out, given this structure, that a 6.5% weighting works better than a 1% weighting.   Again, cautionary note:  this may not conflict directly with Tom Tango’s position, given his method.   The heavier weighting works better, in my system, because it moves deserving pitchers up through the rankings more rapidly than a 1% weighting.   But if you don’t start everybody out at the bottom of the scale, then you may not have that effect, so you might do better with a 1% weighting, I don’t know.   (I’ll amend this comment later.)

              But in THIS system. . .as I said, pitchers ranked 1-10 with a 1% weighting have an average Game Score for their next ten starts of 56.49, a winning percentage of .587, and an ERA of 3.36.   But when we increase the weighting to 6.5%, then the average Game Score increases to 57.38, the winning percentage increases to .592, and the ERA drops to 3.28.  

              On the other end of the scale. . .not the EXTREME other end of the scale, because the extreme other end of the scale might be 132 pitchers or 162 or some other weird number. . .but the pitchers ranked 96 to 105, with the 1% weighting, have an average Game Score of 48.11, a winning percentage of .450, and an ERA of 4.51.    But with the 6.5% weighting, the average Game Score drops to 47.28, the winning percentage drops to .443, and the ERA increases to 4.64.   The 6.5% weighting does a better job of predicting future performance than does the 1% weighting.

              1.5% is more effective at predicting future performance than 1%.  1.5% (as opposed to 1%) increases the average Game Score from 56.49 to 57.10, and the winning percentage from .587 to .592.    The ERA drops from 3.36 to 3.29.    Parallel changes on the other end of the scale. . .maybe I shouldn’t write out all of those.  

              2% is as effective as 6.5% at sorting out the top pitchers.   With a 2% weight for each game, the pitchers ranked 1-10 have an average Game Score during their next ten starts of 57.35, a winning percentage of .593, and an ERA of 3.27—basically the same data as the 6.5% weighting.       

              2.5% is. . .well, the data is a little bit mixed, but it appears to be a tiny bit more effective than 2%.    I’ll chart the data for you in a moment.   The average Game Score in the next ten starts increases from 57.35 to 57.44, and the ERA drops from 3.27 to 3.26, although the Winning Percentage also drops.

              3% is a tiny bit more effective than 2.5%. 

              3.5% is almost indistinguishable from 3%. . .just a tiny, tiny, tiny bit worse.

              4% is indistinguishable from 3.5%.  

              4.5% looks the same as 4%.
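
              One rough way to put the comparisons above on a single scale is the gap in average follow-up Game Score between the pitchers ranked 1 to 10 and those ranked 96 to 105.   The values below are copied from the article’s chart.   The gap is only one possible summary, chosen here for illustration; it ignores ERA and winning percentage, and the groups do not have identical numbers of follow-up games at every weight.

```python
# weight in percent -> (avg Game Score, ranks 1 to 10; avg Game Score, ranks 96 to 105)
AVG_GAME_SCORE = {
    1.0: (56.49, 48.11),
    1.5: (57.10, 48.00),
    2.0: (57.35, 47.95),
    2.5: (57.44, 47.82),
    3.0: (57.49, 47.77),
    3.5: (57.48, 47.74),
    4.0: (57.48, 47.74),
    4.5: (57.49, 47.77),
    6.5: (57.38, 47.28),
}


def separation(weight_pct: float) -> float:
    """Top-group average minus bottom-group average, in Game Score points."""
    top, bottom = AVG_GAME_SCORE[weight_pct]
    return round(top - bottom, 2)


if __name__ == "__main__":
    for pct in sorted(AVG_GAME_SCORE):
        print(f"{pct:>4}%  {separation(pct):5.2f}")
```

              By this measure the separation grows quickly from 1% to 2.5% and then changes very little from 3% through 4.5%.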

 

              OK, it’s 7:00 in the morning, I’ve been up all night, I’m tired and I need to get to bed, so I’m not going to run the data for 5%, 5.5% or 6%.   I am almost embarrassed to admit this, but it appears that my instinct (3%) may have been as good a place to put the correct weight as any other.   Not necessarily BETTER than other options, certainly not markedly better, but as good as any.  

              Here’s a chart of the data from the study:

Weight  Rank       Pitchers  Followup Games  Total Game Score  Average Game Score   Wins  Losses  Winning Percentage
1%      1 to 10       14120          139888           7902691               56.49  63020   44428                .587
1%      50 to 59      14120          136885           6888067               50.32  49832   49457                .502
1%      96 to 105     14120          125599           6042409               48.11  39365   48163                .450

1.5%    1 to 10       14120          140300           8011761               57.10  64074   44175                .592
1.5%    50 to 59      14120          137777           6892017               50.02  49559   50349                .496
1.5%    96 to 105     14120          124878           5994503               48.00  38993   47805                .449

2%      1 to 10       14120          140428           8053345               57.35  64489   44288                .593
2%      50 to 59      14120          137841           6876234               49.89  49392   50862                .493
2%      96 to 105     14120          124188           5954408               47.95  38746   47432                .450

2.5%    1 to 10       14120          140500           8070611               57.44  64507   44580                .591
2.5%    50 to 59      14120          138405           6912131               49.94  50010   50975                .495
2.5%    96 to 105     14120          123391           5901065               47.82  38558   47101                .450

3%      1 to 10       14120          140525           8078235               57.49  64674   44537                .592
3%      50 to 59      14120          138626           6926620               49.97  50009   50916                .496
3%      96 to 105     14120          123398           5895252               47.77  38616   47226                .450

3.5%    1 to 10       14120          140526           8077707               57.48  64696   44556                .592
3.5%    50 to 59      14120          138522           6904902               49.85  49558   51364                .491
3.5%    96 to 105     14120          123107           5877328               47.74  38406   47025                .450

4%      1 to 10       14120          140491           8075853               57.48  64716   44580                .592
4%      50 to 59      14120          138447           6910182               49.91  49921   51130                .494
4%      96 to 105     14120          123082           5875938               47.74  38339   46935                .450

4.5%    1 to 10       14120          140469           8076205               57.49  64691   44581                .592
4.5%    50 to 59      14120          138442           6913713               49.94  49886   50956                .495
4.5%    96 to 105     14120          123300           5890179               47.77  38445   46932                .450

6.5%    1 to 10       14120          140380           8055387               57.38  64556   44449                .592
6.5%    50 to 59      14120          138493           6939193               50.11  50193   50824                .497
6.5%    96 to 105     14120          119568           5653119               47.28  36743   46183                .443


Weight  Rank          Outs  Earned Runs   ERA
1%      1 to 10    2884593       359034  3.36
1%      50 to 59   2547200       388793  4.12
1%      96 to 105  2176415       363927  4.51

1.5%    1 to 10    2920220       356199  3.29
1.5%    50 to 59   2552887       393337  4.16
1.5%    96 to 105  2157562       361694  4.53

2%      1 to 10    2933198       355277  3.27
2%      50 to 59   2548823       395337  4.19
2%      96 to 105  2141380       359162  4.53

2.5%    1 to 10    2939608       355180  3.26
2.5%    50 to 59   2565213       397254  4.18
2.5%    96 to 105  2123889       357815  4.55

3%      1 to 10    2942246       355158  3.26
3%      50 to 59   2568881       397638  4.18
3%      96 to 105  2123755       358822  4.56

3.5%    1 to 10    2942569       355362  3.26
3.5%    50 to 59   2562431       399032  4.20
3.5%    96 to 105  2116249       357931  4.57

4%      1 to 10    2943089       355572  3.26
4%      50 to 59   2564417       397879  4.19
4%      96 to 105  2116549       358093  4.57

4.5%    1 to 10    2943181       355529  3.26
4.5%    50 to 59   2564429       397245  4.18
4.5%    96 to 105  2122500       358410  4.56

6.5%    1 to 10    2935716       356274  3.28
6.5%    50 to 59   2569612       394443  4.14
6.5%    96 to 105  2048380       352094  4.64

 

 

 

              The one conclusion that I can firmly reach after doing this is that 1% or 1.5% is clearly not the right answer, and I SUSPECT, although I can’t prove this, that it isn’t the right answer in Tom’s method, either.    The reason that 1% is not the right weight is that it fails to pick up on pitchers’ loss of effectiveness near the end of their careers.  

              "Near the end of their careers" is a fraught term.   It sounds like we are talking about Roger Clemens in 2004, Chuck Finley in 2002, Kevin Appier in 2002, Jack Morris in 1993, etc., and yes, we are talking about them.   Almost every outstanding pitcher has a moment at the end of his career in which (a) he has not been pitching well, and (b) he is going to get really hammered during his next ten starts, but (c) he would still rank as one of the ten top pitchers in baseball at that moment, based on a 1% weighting scale, because the weight isn’t enough to pick up what is happening.  

              But it is not JUST that moment.   There is a Sonny Gray/Jose Quintana moment that happens much earlier to many pitchers.   Jose Quintana is not old and has not yet had a "great" career, but he is not pitching well at the moment, either.     A lot of guys hit the end of the road when they are 26, 27, 28, 29 years old.    A 1% weighting system isn’t going to see that when it is happening.
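
              To see why a light weight is slow to notice a decline, here is a small sketch that counts how many bad starts it takes for a rating to fall below a cutoff.   It assumes the same kind of running-score update on a scale of ten times Game Score, which the article does not spell out.   The 600 rating, the Game Score of 40 and the 500 cutoff are hypothetical numbers chosen for the example.

```python
def starts_until_below(rating, threshold, game_score, weight_pct,
                       scale=10.0, limit=1000):
    """Count consecutive starts at `game_score` until rating < threshold.

    Returns None if the rating has not dropped below the threshold
    after `limit` starts (for example, when it never will).
    """
    w = weight_pct / 100.0
    n = 0
    while rating >= threshold:
        if n >= limit:
            return None
        rating = (1.0 - w) * rating + w * scale * game_score
        n += 1
    return n


if __name__ == "__main__":
    # A hypothetical star rated 600 who starts pitching to a Game Score of 40.
    for pct in (1.0, 3.0, 6.5):
        print(pct, starts_until_below(600.0, 500.0, 40, pct))
```

              Under these assumptions the 1% system needs roughly two full seasons of poor starts before the pitcher drops below the cutoff, while the 6.5% system gets there in about a third of one season.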

 

              Other than that. . .you can use 2%, 3%, anything up to 6%, and one predicts the future about as well as another.  

 

 
 

COMMENTS (21 Comments, most recent shown first)

JohnPontoon
Although wry, I try not to lie. Please take my statement as read. Wait, are most people PRO-unfairness? That'd be sad.
2:22 PM Jun 26th
 
MarisFan61
John: Shall I say 'thanks'? :-)
The reason I put it as a question is, if you're serious and not ironic or sarcastic or some such, it might be the first time in history anyone has expressed such a thing.
12:24 AM Jun 24th
 
JohnPontoon
Maris, you've obviously grasped the semantic pitfall which I had dug, and restated your position in a way that even deliberately obtuse and pedantic guys like myself cannot fairly find quarrel. I'm anti-unfairness, so my work here is done. (Tips rhetorical cowboy hat.)
10:27 PM Jun 23rd
 
MarisFan61
P.S. I see that actually Bill did say it -- in what he called "unnecessarily dramatic language":
"IT'S HAPPENED! Max Scherzer has passed Kershaw to become the #1 starting pitcher in baseball."

I don't think Bill believed this was definitively true, or not close, and in fact I'm pretty sure he wasn't able to write it or say it without a smile; he knew he was being provocative, and he knew it wasn't a real-world "fact." The 'fact' was just that Scherzer had moved ahead in the ranking system. The real-world type of wording was a catch line.

(Bill, consider this an invite to say if any of that's wrong.) :-)
8:35 PM Jun 23rd
 
MarisFan61
John, I wasn't talking about whether it was factual, just whether it's meaningful. It's very possible for a thing to be factual but not meaningful.

If you have two pitchers who were so equal a week earlier (and in fact Scherzer was behind), and then the one who was behind pulls ahead on the basis of 1 or 2 games, there is no way that in terms of anything meaningful the guy is the #1 pitcher in baseball.

Sure, I know what you said, and I know what Bill had said. He didn't say Scherzer IS the #1 pitcher in baseball; he said he's the #1 starting pitcher in the Starting Pitcher Rankings.

I was talking about whether this means it's meaningful to say that Scherzer IS the #1 pitcher -- and I'm saying that ipso facto, it isn't, and cannot be.

The whole thing is based on this evaluation system. (Right?)
And, a week earlier, the system had the other guy ahead. If you wanted to say, yeah, the system is valid for saying who's the best pitcher in baseball, a week earlier it was the other guy.
Now, a week later, evidently on the basis of 1 or 2 games, Scherzer pulls ahead. It simply cannot be meaningful to say that on such a basis, the new guy is clearly the best pitcher in baseball and that it's not close.

It's a trap to be overly focused on the 'fact' of what number an evaluation system puts out. The main reason it's of interest is the extent to which it's meaningful in terms of the actual world. The main reason Bill's finding triggered interest 'out there,' and ultimately the reason any of us are interested in it, is because people assume it is intended to try to represent the real world.

Sure, Scherzer became clearly the #1 pitcher according to the system. But in an instance like this, it can't mean that the guy is the #1 pitcher. No matter how unequal they are on the bottom line numbers, such a difference cannot be sufficient to justify the real-world statement. (Which, BTW, Bill didn't make, not that I noticed anyway.)

7:46 PM Jun 23rd
 
JohnPontoon
Mr. Fan61, I invite you to parse the second sentence of my comment, particularly the bit between the commas. That part establishes my statement as factual.
9:27 AM Jun 23rd
 
Guy123
I don’t see ANY reason why it would queer the study, although if Guy123 was still with us I am sure that he would come up with several reasons.
Oh, I'm still here. I just figured out what the (implied) ground rules are here, and am trying to respect them.
9:33 AM Jun 22nd
 
MarisFan61
John: The sureness with which you say that is not well taken, because it depends on a certain way of seeing it -- and according to how I see such things, which is that 1 or 2 games can't definitely determine such a thing, your way is wrong. Which of course depends on my way of seeing it. :-)

Look: Bill said that according to his system (which is what you're basing it on too), Scherzer had moved ahead of Kershaw last week. That means (I figure) that it probably resulted from 1 or 2 games.

I think that the way you expressed it reflects (borrowing from your phrasing) a lack of ability to discern what is significant and what isn't.
11:47 PM Jun 21st
 
JohnPontoon
As I see it, Scherzer has pulled ahead of Kershaw as the #1 pitcher in baseball beyond even the merest possibility of doubt, as measured by the system Bill created. To imply that the two pitchers are even, as measured by Bill's system, is to lack the ability to discern what number is the larger of any given two numbers. This is not my opinion.
5:35 PM Jun 20th
 
bearbyz
Maris, I figure this is just one part of the argument. Scherzer passing Kershaw has gotten people talking about it.
10:56 AM Jun 20th
 
Fireball Wenz
I'm concerned about you pulling all-nighters to do this work. We're going to put you on a paragraph count, Bill. You can write 25 paragraphs every five days, and you can doodle on the side for 20 minutes between chapters.
8:36 AM Jun 20th
 
danfeinstein
Quick question that I couldn't figure out from the article introducing the leaderboards: is this using the original version of Game Scores or the newer version Bill introduced here a few years back which changed the weightings slightly?


6:16 AM Jun 20th
 
tangotiger
Bill: I feel like an idiot, but in the offseason, I did the WARcels:

tangotiger.com/index.php/site/comments/war-marcels-...-warcels

And that uses this scheme as its basis:
60%: year T
30%: year T-1
10%: year T-2

The decay rate I was using, the 1%, makes sense for rate stats. But, when you include playing time, we have to be far more aggressive.

And since your scheme gives 63% of the weight to the current year, I think your method is wholly sensible.



11:06 PM Jun 19th
 
tangotiger
One thing that I remembered is "aging". There are three components to forecasting, the "minimum base" for any forecasting system:
1. weighting recent performance more strongly
2. applying an aging curve
3. "ballast" (regression toward the mean)

So, Bill's point at the end regarding the end of a pitcher's career: well, it's very possible that Bill was able to get an inferred aging by weighting as he did. As well as combined with his ballast of not using league average (which is the traditional way to do regression) but of using, essentially, replacement level.

Basically, Bill may have devised an elegant system that works better than the "standard" and being able to do so by putting a great deal of weight on the most recent season.

Gotta really think more about this.

7:46 PM Jun 19th
 
tangotiger
Perfect, just perfect.

***

It's very possible that the "ballast" (regression toward the mean basically) is what requires the 3%. It's also possible that the 1% or 1.5% I use is wrong.

It's a bit like Game Score, where the "Classic" version starts everyone at 50, whereas my revised one (and the similarly unofficial revised one from Bill) starts much lower. I start it at 40 and Bill closer to 35 I think.

Regardless, it's an important distinction to make. And it's just as easy for me to test it based on having the ballast component being replacement level rather than average. I'll get to that later in the week and see if that affects my decay rate.
6:40 PM Jun 19th
 
MarisFan61
Bear: I know!
But I guess there can be differing degrees of comfort about not finding an answer.

Like, I never minded, back in kidhood, that there was (IMO) no clear answer to "Mickey or Willie," and I thought the people who had an answer were either fooling themselves, or biased (after all, just about everyone was), or both. I don't need an answer to whether Scherzer or Kershaw is 'better' right now. I'm happy enough with saying they're both ahead of everyone else and that it's essentially arbitrary to try to separate them. I like reality and truth. I prefer it to unreality and untruth:-)
2:30 PM Jun 19th
 
bearbyz
Maris, it is fun to have a race and even though Scherzer passed Kershaw by just a little bit and no system is that accurate, he passed him by the rules of the race. It is more fun to do that than to say now they are so close it should be a tie.
1:11 PM Jun 19th
 
OldBackstop
Awesome. The kid still has it.
1:11 PM Jun 19th
 
MarisFan61
(To be clear: Didn't mean Kershaw should be #1 over Scherzer either.)
12:47 PM Jun 19th
 
MarisFan61
I know that the main thing isn't the results at any given moment, but the method. That's the main thing these articles are about.

But, the controversy (such as there is) was because of saying one guy is #1. Ignoring the moron reactions like that Scherzer isn't even in the top 10, the main issue is saying that Scherzer has moved ahead of Kershaw -- and, it says here (I mean just right here) :-) that's a little ridiculous. It's like how it would be to start including the decimals in Win Share numbers, which you assiduously avoid because it wouldn't be meaningful and it might seem to convey false accuracy.

Both of which are the case when we split hairs like declaring either of these two guys #1 right now. They're too close. And maybe even whatever method you come up with at 7 A.M. today, you might get a different result with what you might do at 7 A.M. tomorrow.
12:10 PM Jun 19th
 
JohnPontoon
I like the dedication it takes to do this work until 7 AM. BUT, Bill, I'd rather you forced healthier habits upon yourself so as to extend your career value.
11:49 AM Jun 19th
 
 
© 2011 Be Jolly, Inc. All Rights Reserved.