
Ranking Baseball Teams

January 5, 2009

            Since I developed the point-comparison ranking system that we have used to rank NFL teams, NBA teams and (in 2007) NCAA football teams, people have been asking me whether it could be used to compare baseball teams.  I have finally approached this problem, and I’ll have a report for you shortly, but first I need to make sure everybody is up to speed on the method, even though I have explained it several times by now.

            Suppose that Albany plays Binghamton, and beats them 70-52.    The only thing we can conclude, based on this one game, is that Albany is 18 points better than Binghamton.    If we assume that an average team is “100”, then, we have to conclude, based on this one game, that Albany is at 109, Binghamton is at 91.  

            Suppose that Cheektowaga plays Deer Park, and defeats them 122 to 41.   The only thing we can conclude, based on the data we have, is that Cheektowaga is 81 points better than Deer Park.   Since this is the only game info we have about either team, we place Cheektowaga at 140.5, and Deer Park at 59.5.

            Suppose, however, that Albany plays Deer Park and beats them 82-68 (Albany, 107; Deer Park, 93), and that Binghamton plays Cheektowaga and defeats them 92-89 (Binghamton, 101.5; Cheektowaga, 98.5).  Now we have two estimates for each team:

 

            Albany            109       107         Average   108.00
            Binghamton         91       101.5       Average    96.25
            Cheektowaga       140.5      98.5       Average   119.50
            Deer Park          59.5      93         Average    76.25

 

            So the teams appear to rank in this order:

 

            1.  Cheektowaga          119.50
            2.  Albany               108.00
            3.  Binghamton            96.25
            4.  Deer Park             76.25

 

            The data—like all such data—is internally inconsistent.    If Cheektowaga is 81 points better than Deer Park, Binghamton is 3 points better than Cheektowaga, and Albany is 18 points better than Binghamton, then Albany must be 102 points better than Deer Park.   But Albany played Deer Park, and defeated them by only 14.   Our estimates, then, must not be exactly right.

            Suppose that we re-run all of the games, but using these averages above (the end point data) as the starting point.   In the first round of games, since all teams are initially assumed to be average, Albany’s “game output score” for the Binghamton game was 109, based on

            (100 + 100 + 70 – 52) / 2   =  109.00

            But in the second round, we assume that Albany is at 108 and Binghamton at 96.25, and score the game for Albany as:

            (108 + 96.25 + 70 – 52) / 2 =  111.125

            The results of the second-round calculations for all games are:

 

            Albany             111.125             99.125           Average     105.125

            Binghamton        93.125           109.375           Average     101.25

            Cheektowaga   138.375           106.375           Average     122.375

            Deer Park          57.375             85.125           Average       71.25

 

            The rank is now

 

            1.  Cheektowaga          122.4

            2.  Albany                    105.1

            3.  Binghamton           101.3

            4.  Deer Park                 71.3

 

            We then use these second-round outputs as the third-round starting figures, and re-calculate again.  

            After the third round, Binghamton has pulled ahead of Albany:

 

            1.  Cheektowaga          123.8
            2.  Binghamton           103.8
            3.  Albany               102.7
            4.  Deer Park             68.8

 

            And, as we continue to re-calculate, they will pull further ahead.  Finally, after a few dozen rounds of re-calculation, we will reach these values:

 

            1.  Cheektowaga (1-1)    125.25
            2.  Binghamton (1-1)     106.25
            3.  Albany (2-0)         102.25
            4.  Deer Park (0-2)       66.25

 

            And when we reach those numbers, they stop moving.   If you re-calculate 10,000 more times, the numbers will stay right there.     Deer Park is supposed to be 81 points worse than Cheektowaga and 14 points worse than Albany; we can’t make that happen, so we compromise on 59 points worse than Cheektowaga (81 – 22) and 36 points worse than Albany (14 + 22).    Albany is supposed to be 18 points better than Binghamton but only 14 points better than Deer Park.   We can’t make that happen, so we compromise on 36 points better than Deer Park (14 + 22) but 4 points worse than Binghamton (18 – 22).   Cheektowaga is supposed to be 81 points better than Deer Park but 3 points worse than Binghamton.  We can’t make that happen, so we compromise on 59 points better than Deer Park (81 – 22) but 19 points better than Binghamton (-3 + 22). 
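            The process described above is easy to reproduce in a few lines of code.  This is a minimal sketch of the four-team example, not the author's actual spreadsheet; the team names and scores come from the article, and the 200-round loop is simply enough for the numbers to stop moving:

```python
# Iterative point-comparison ranking for the four-team example above.
GAMES = [  # (team_a, team_b, score_a, score_b)
    ("Albany", "Binghamton", 70, 52),
    ("Cheektowaga", "Deer Park", 122, 41),
    ("Albany", "Deer Park", 82, 68),
    ("Binghamton", "Cheektowaga", 92, 89),
]

def iterate(ratings):
    """One round: re-score every game using the current ratings, then average."""
    outputs = {team: [] for team in ratings}
    for a, b, sa, sb in GAMES:
        # A game's output for a team is (own rating + opponent rating + margin) / 2
        outputs[a].append((ratings[a] + ratings[b] + sa - sb) / 2)
        outputs[b].append((ratings[b] + ratings[a] + sb - sa) / 2)
    return {team: sum(v) / len(v) for team, v in outputs.items()}

# Every team starts at the assumed average of 100.
ratings = {t: 100.0 for t in ["Albany", "Binghamton", "Cheektowaga", "Deer Park"]}
for _ in range(200):
    ratings = iterate(ratings)

# Converges to: Cheektowaga 125.25, Binghamton 106.25, Albany 102.25, Deer Park 66.25
for team, value in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team:12s} {value:.2f}")
```

Running it confirms both claims in the text: the values stop moving at the figures above, and they come out the same no matter what starting values you feed in.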

            Let me emphasize this:  it makes no difference whatsoever what values you initially assume each team has.    If you initially assume that Deer Park has a value of 400 and all of the other teams are at zero, you get first-round outputs like this:

 

            Albany             9          207      Average             108.0

            Binghamton      -9         1.5      Average              -3.75

            Cheektowaga  240.5    -1.5      Average             119.5

            Deer Park        159.5   193      Average             176.25

 

            But after the second-round calculations, you get these averages:

 

            1.  Cheektowaga          122.4

            2.  Deer Park               121.2

            3.  Albany                    105.1

            4.  Binghamton             51.2

           

            And, after a large number of re-calculations, you wind up with exactly the same numbers we had before.   The initial assumption entirely washes out as the “comparison” data is re-introduced again and again and again. 

            In this example there are only four games, but in real leagues there are hundreds or thousands of games, each of them trying to push the ranking for a team up or down.   It takes many rounds of re-calculation for the system to resolve all the tensions as well as it can, but the process eventually reaches a stopping point, which is the point at which the output values are the same as the input values.  

            In basketball we assume that an average team has a value of 200, in football, an average of 100, and in baseball, an average of 10.   Those values are arbitrary, and it doesn’t matter; it’s just a way of establishing a center.   You could make 27.418 the center, and the relative values of the teams would be just the same.

 

            Anyway, I have done this for various sports, and people keep asking me, “Could you do this for baseball?”    I didn’t think about doing it for baseball, honestly, because

            1)  One doesn’t tend to assume that the outcome of a single baseball game is representative of the ability of the team, and

            2)  The universe of games is immense.

            Once Retrosheet published the data for 2008, however, I took a look at the issue of whether the game logs for each team could be moved into a spreadsheet like the ones we use.   I concluded that they could.   I’m not a programmer; it took me 25, 35 hours of actual work to do this, but I was able to get the games into a spreadsheet hooked up with all the necessary formulas.

            OK, this is the ranking, by this method, for the 30 Major League teams in 2008, not including post-season play:

 

 

Boston          A    11.35
Toronto         A    11.11
Tampa Bay       A    11.10
New York        A    10.84
Chicago         A    10.81
Minnesota       A    10.81
Chicago         N    10.76
Angels          A    10.71
Cleveland       A    10.56
Philadelphia    N    10.37
Detroit         A    10.14
NY Mets         N    10.12
Oakland         A    10.07
Milwaukee       N    10.06
Texas           A    10.02
St. Louis       N    10.00
Baltimore       A    10.00
Kansas City     A     9.86
Los Angeles     N     9.85
Florida         N     9.69
Arizona         N     9.62
Houston         N     9.56
Atlanta         N     9.52
Seattle         A     9.49
Cincinnati      N     9.21
Colorado        N     9.12
Pittsburgh      N     8.90
San Francisco   N     8.88
San Diego       N     8.84
Washington      N     8.63

 

            And this is the ranking for all 30 teams when post-season play is included:

 

 

Boston          A    11.26
Tampa Bay       A    11.13
Toronto         A    11.08
New York        A    10.82
Minnesota       A    10.78
Chicago         A    10.74
Chicago         N    10.68
Angels          A    10.67
Cleveland       A    10.53
Philadelphia    N    10.50
NY Mets         N    10.15
Detroit         A    10.11
Milwaukee       N    10.04
Oakland         A    10.04
St. Louis       N    10.02
Texas           A    10.00
Baltimore       A     9.97
Los Angeles     N     9.96
Kansas City     A     9.84
Florida         N     9.72
Arizona         N     9.65
Houston         N     9.57
Atlanta         N     9.55
Seattle         A     9.46
Cincinnati      N     9.22
Colorado        N     9.15
Pittsburgh      N     8.91
San Francisco   N     8.90
San Diego       N     8.86
Washington      N     8.66

  

            Boston comes out first, but. . .I hope nobody thinks this is what this is about.   If you work with baseball statistics, you probably knew who would come out first before I told you.   I couldn’t care less who ranks first; this isn’t how the championship is determined.

            The Red Sox, by this method, were about 219 runs better than an average major league team, during the regular season.    They outscored their opponents by 151 runs (845 to 694), and they played a schedule that was 68 runs better than a major league average schedule.   (Actually, 66.  The other two runs are created by the interaction of the two forces.)     Washington is 222 runs worse than an average team—184 from being outscored, and 38 more from playing a weaker-than-average schedule.

 

            Having done this, I see more merit in the method than I would have supposed was there going in.   I was reluctant to do this; I thought it would be a lot of work to show us things that

            a)  we already knew, and

            b)  didn’t really matter.  

            Baseball has a very good system to resolve its championship; it doesn’t need rankings. 

            But I see more merit there than I suspected.  First, the issue of predictability. . .do early-season rankings predict the finish of the season?

            Yes, but not really.    Since there are very few inter-league games early in the season, if we rank the teams based on games played through May 31, we get rankings that are sensible within the league, but with little information on how one league compares to the other:

 

 

 

 

 

 

 

 

AMERICAN                          NATIONAL

Team             Rank             Team             Rank
Boston          10.70             Chicago         11.39
Toronto         10.64             Philadelphia    10.93
Oakland         10.64             Atlanta         10.86
Tampa Bay       10.59             Arizona         10.56
Chicago         10.58             NY Mets         10.24
Cleveland       10.28             Los Angeles     10.10
Angels          10.04             St. Louis       10.02
New York        10.00             Houston          9.97
Baltimore        9.79             Florida          9.96
Texas            9.75             Pittsburgh       9.93
Minnesota        9.70             Cincinnati       9.80
Detroit          9.68             Milwaukee        9.78
Kansas City      9.03             Washington       9.36
Seattle          8.96             San Francisco    9.01
                                  San Diego        8.87
                                  Colorado         8.86

            Those rankings are reasonably predictive of the final finishes, but are they better than just looking at the standings?

            Well, yes, probably.    Florida was 31-23 on May 31, in first place in the NL East—but we see them here as a .500 team, and the fourth-best team in the division, Philadelphia being the best.  They were one game under .500 for the rest of the season, and finished third in the division.    Cleveland was 25-30 on May 31, but we see them here as an above-average team, and they did go 56-51 the rest of the way, although they never did get back in the race.    On the other hand, Oakland and Atlanta, which appeared early in the season to be strong teams, ultimately proved not to be.   My conclusion:  Yes, there is probably some predictive significance to the method, but you wouldn’t want to rely on it.   And you’d have to study many more seasons than one to know how reliable the early-season evaluations really were.

            That’s a minor issue, to me, although an obvious one.  The real virtues that I see in this system are:

 

1)  It creates a meaningful and fairly reliable evaluation of how one league compares to the other.  

            Suppose that there was only one game played between the leagues, and that in that one game, San Diego beat Texas 13-2.   If that was the case, in this system that would cause the National League teams, on average, to rank ten to twelve points ahead of the American League teams.   This would happen because there would be nothing in the system resisting the input of that one game.  San Diego would, of necessity, rank 11.0 runs ahead of Texas, and all of the other teams would re-orient themselves within the league based on the interlocking schedule within the league.

            That’s a cautionary note; limited games, not interlocking with the rest of the schedule, can have a disproportionate impact on the rankings in this system.     

            But the AL/NL comparisons are not based on one game, of course; they are based on 250+ games, in which the American League, which has dominated those games for several years, went 149-103 (150-107 if we count the World Series.)   Our estimate is that the average American League team is 0.86 runs per game better than the average NL team.   I am sure we could derive this estimate by some other and simpler method—but I doubt that we could derive any more accurate estimate.

            On the larger point, I think that the greatest potential of this method is in the comparison of leagues, and in particular in the comparison of leagues for college baseball.   Our experiment here suggests that this method works quite well for baseball.   There are 900+ college baseball teams, which play a completely interlocking schedule.   UC-Riverside plays somebody who plays somebody who plays somebody who plays Middlebury College in Vermont and the University of Puget Sound in Tacoma and Bowdoin College in Maine.  

            Major league teams genuinely need to know how one college league compares to another.   This method could definitively resolve that issue.   The people who should do this are us—Bill James Online.   So far, because of financial and programming issues, we haven’t been able to get things like that done, but if we don’t, somebody will.   There is no reason we cannot clearly and definitively rank Bowling Green and Montevallo against USC and Texas.  

 

2)  It creates a sophisticated and accurate estimate of each team’s strength of schedule.

            Because we have accurate rankings for each team on a run scale, we can easily figure the average strength of the opposition for each team.   These are those figures for each of the 30 major league teams:

 

 

Team            Lg   Strength of Schedule   Plus Runs
Baltimore       A           10.51               83
Toronto         A           10.44               72
New York        A           10.44               71
Tampa Bay       A           10.43               70
Boston          A           10.41               66
Texas           A           10.41               66
Kansas City     A           10.39               64
Detroit         A           10.34               55
Seattle         A           10.33               53
Oakland         A           10.32               52
Chicago         A           10.28               46
Angels          A           10.28               45
Cleveland       A           10.26               43
Minnesota       A           10.26               42
Pittsburgh      N            9.83              -27
Cincinnati      N            9.81              -30
Washington      N            9.80              -32
Houston         N            9.76              -38
Florida         N            9.71              -47
Atlanta         N            9.71              -47
Milwaukee       N            9.69              -50
St. Louis       N            9.68              -51
Philadelphia    N            9.65              -56
San Diego       N            9.65              -57
Chicago         N            9.64              -58
San Francisco   N            9.64              -59
NY Mets         N            9.64              -59
Colorado        N            9.61              -62
Arizona         N            9.57              -70
Los Angeles     N            9.55              -73

 

 

            The teams in the AL East play the strongest schedules, because there are four strong teams in that division.   Baltimore plays the toughest schedule because they are the only team that has to play all four of them.  

            There is a lot of talk about strength of schedule. . .some whining about the unbalancing effects of the inter-league matchups, some discussion about playing so many games inside the division.    This method gives us solid, credible information with which to approach that discussion.   I think that’s worthwhile.

Baltimore’s schedule is 156 runs tougher than Los Angeles’ schedule—one run a game, basically.   Baltimore starts the season 156 runs behind the Dodgers.   What do we think about that?   Should we just live with it, or should we try to do something about it?
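            The “Plus Runs” column in the table above appears to be just the schedule strength, minus the 10.00 average, spread over a 162-game season.   This little reconstruction is an assumption on my part, but it reproduces the listed values for Baltimore (+83), Boston (+66) and Los Angeles (-73), up to rounding of the published two-decimal ratings:

```python
# Hypothetical reconstruction of the "Plus Runs" column: the published
# figures are consistent with (schedule strength - 10.00) * 162 games.
def plus_runs(schedule_strength, games=162, center=10.0):
    return round((schedule_strength - center) * games)

print(plus_runs(10.51))  # Baltimore: 83
print(plus_runs(10.41))  # Boston: 66
print(plus_runs(9.55))   # Los Angeles: -73
```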

 

 

3) It is a step toward the possible evolution of methods of adjusting statistical performance for strength of schedule.  

            I don’t talk about what happens in the Red Sox front office, but I think I can tell you this:  We worry a lot about “Can this guy come into our division and compete?”  OK, here’s a pitcher who has a good record pitching in some other division, but. . .that ain’t the AL East.   What’s going to happen to him against this level of competition?

            This information is a step on the road toward a method that can adjust statistical performance for the level of competition.  

 

Alternative Approach

 

            What if we approached this problem not through run differential but through winning percentage?

            In order to do that, we need to be able to state the outcome of each game as a winning percentage.   I outlined a method to do that in another article (Winning Percentage from a Game). . ..a 1-0 win creates a winning percentage of .541, a 12-6 victory is .626, a 4-5 loss is .422.  

            If you post a .626 winning percentage against a team with a winning percentage of .482, what is your winning percentage?  In other words, the .626 assumes a .500 opponent.   It’s not a .500 opponent; it’s a .482 opponent.  What’s the equivalent winning percentage?

            It’s .609.   There’s an old established method for dealing with that. . ..Dallas Adams and I invented it in the 1970s.   I don’t want to get into that now, but:

 

            .626 against a .400 team is equivalent to .527 against a .500 team.

            .626 against a .450 team is equivalent to .578 against a .500 team.

            .626 against a .500 team is .626.

            .626 against a .550 team is equivalent to .672 against a .500 team.

            .626 against a .600 team is equivalent to .715 against a .500 team.
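            The equivalences above are consistent with the standard odds-ratio (log5) form.   Assuming that is the formula intended (the article does not spell it out), it can be sketched as:

```python
# Equivalent winning percentage against a .500 team, given a winning
# percentage p posted against an opponent of quality q.  This log5-style
# form is an assumption, but it reproduces the equivalences listed above.
def vs_average(p, q):
    return p * q / (p * q + (1 - p) * (1 - q))

print(round(vs_average(.626, .482), 3))  # 0.609
print(round(vs_average(.626, .550), 3))  # 0.672
print(round(vs_average(.719, .470), 3))  # 0.694
```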

 

            If you combine the new method (Winning Percentage from a Game) with this old method, you can calculate a winning percentage for each game, adjusting for the quality of the competition.  

            We evaluate each game of the major league season in this way.   Milwaukee played at Cincinnati on April 18, April 19 and April 20, Milwaukee winning 5-2 and 5-3 and losing the third, 3-4.  

            A 5-2 win is a winning percentage of .719.   A 5-3 win is a winning percentage of .635, and a 3-4 loss is a winning percentage of .444, so Milwaukee’s winning percentages for the three games are .719, .635 and .444, and Cincinnati’s are .281, .365 and .556—without adjusting for the quality of competition.

            To adjust for the quality of competition, we go through the process outlined above.   On the first round of calculations, we assume that Cincinnati is a .500 opponent.    Cincinnati’s winning percentage after one round of calculations, however, is .471, so in the second round of calculations we assume that their winning percentage is .471, and we re-calculate again.  

            After many rounds of calculations, Cincinnati’s winning percentage locks in at .470, and then it won’t move anymore; this is the end point data.   We recalculate these games based on that conclusion:

 

            .719 against .470 is .694 (meaning that it is equivalent to .694 against a .500 team.)

            .635 against .470 is .607.

            .444 against .470 is .414.

 

            So Milwaukee’s winning percentage contributions for those games are .694, .607 and .414.  

            By calculating every game in this fashion and running it through many cycles, we get output winning percentages for every team as follows, including the playoff and World Series Games:

 

 

 

 

 

 

Team            Lg   Winning Percentage
Boston          A         .558
Tampa Bay       A         .542
Toronto         A         .539
Chicago         N         .535
Angels          A         .533
Philadelphia    N         .529
Minnesota       A         .528
New York        A         .528
Chicago         A         .521
NY Mets         N         .518
Cleveland       A         .517
Milwaukee       N         .516
Los Angeles     N         .509
St. Louis       N         .508
Houston         N         .500
Florida         N         .499
Texas           A         .495
Oakland         A         .495
Arizona         N         .492
Kansas City     A         .488
Detroit         A         .488
Baltimore       A         .487
Atlanta         N         .484
Colorado        N         .475
Cincinnati      N         .470
Seattle         A         .468
San Francisco   N         .464
Pittsburgh      N         .459
San Diego       N         .452
Washington      N         .438

 

            This is essentially the same as the rankings we got by the other method—a little different, but mostly the same.  

            A strength of this method is that it is more focused on wins and losses, and pays little attention to the difference between a 7-1 win and a 15-1 win.    There are eight runs there that should be depreciated—and are depreciated by this method, not by the other one. 

            A weakness of this method, which was discussed in the companion article (Winning Percentage from a Game), is that the average winning percentage from all games does not track with the team’s actual winning percentage, but with a figure halfway between that number and .500:  .600 becomes .550, .580 becomes .540, etc.

            I was trying to figure out a way to work around this problem, but the best I could come up with was simply to go through the entire process, and then double the spreads (double the distance from .500) at the end of the process:
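            That de-centralizing step is trivial to express.   Note that the published figures were evidently doubled from the unrounded percentages, so doubling the rounded numbers can differ in the last digit (.558 doubles to .616 here, against the .615 listed):

```python
# Double each team's distance from .500, per the adjustment described above.
def decentralize(p):
    return .500 + 2 * (p - .500)

print(round(decentralize(.558), 3))  # 0.616
print(round(decentralize(.438), 3))  # 0.376
```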

 

 

 

 

Team            Lg   Centralized                   De-Centralized
                     Winning Percentage            Winning Percentage
Boston          A         .558        becomes           .615
Tampa Bay       A         .542        becomes           .585
Toronto         A         .539        becomes           .577
Chicago         N         .535        becomes           .571
Angels          A         .533        becomes           .565
Philadelphia    N         .529        becomes           .558
Minnesota       A         .528        becomes           .556
New York        A         .528        becomes           .556
Chicago         A         .521        becomes           .542
NY Mets         N         .518        becomes           .536
Cleveland       A         .517        becomes           .533
Milwaukee       N         .516        becomes           .531
Los Angeles     N         .509        becomes           .517
St. Louis       N         .508        becomes           .516
Houston         N         .500        becomes           .500
Florida         N         .499        becomes           .498
Texas           A         .495        becomes           .491
Oakland         A         .495        becomes           .490
Arizona         N         .492        becomes           .484
Kansas City     A         .488        becomes           .476
Detroit         A         .488        becomes           .475
Baltimore       A         .487        becomes           .474
Atlanta         N         .484        becomes           .467
Colorado        N         .475        becomes           .451
Cincinnati      N         .470        becomes           .441
Seattle         A         .468        becomes           .435
San Francisco   N         .464        becomes           .428
Pittsburgh      N         .459        becomes           .417
San Diego       N         .452        becomes           .404
Washington      N         .438        becomes           .375

 

            

            That’s not a very good way to make that adjustment, and I’m sure somebody will suggest a better way of de-centralizing the data. 

            I experimented with de-centralizing the data during the calculation process—that is, de-centralizing the numbers after each round of calculations, before the next round.   I thought that one of two things might happen:

            1)  That after being de-centralized in the opening rounds of the calculations, the data might stabilize at the de-centralized numbers, or

            2)  That the system might veer out of control, and start giving us irrational calculations.  

            But actually neither of those happens.  What happens—it is in a sense re-assuring—is that the system persistently attempts to stabilize at the “centralized” numbers, and defies the efforts to de-centralize it.   In other words, Boston is headed for .558 and Washington is headed for .438, no matter what you do.  If you double the difference from .500 in the early rounds, the data will home in on the “centralized” numbers as soon as you stop forcing it away from .500.    If you double the difference from .500 after every round, the system homes in on the de-centralized numbers.   Doing the de-centralization during the process is the same as doing it after the process.

            It works OK; I like the other method a little better, but I can see an argument for this one, too.  No matter what we do, we are going to reach the conclusion that the Red Sox were the best team in baseball in 2008, but I’ve checked my finger a number of times, and I’m really certain that there ain’t no ring there.    I’m not pursuing that claim; we’re simply trying to understand the data a little bit better.  By learning to make inferences from the data, we might eventually learn to rank restaurants, high schools, political candidates or movie stars.   We’re starting with baseball teams. 

 
 

COMMENTS (14 Comments, most recent shown first)

Trailbzr
BillJ's strength of schedule for Texas looks suspicious. Every other division's schedule strength is ranked in reverse order by run differential (e.g. BAL TOR NYY TBA BOS). In light of this, the AL West does not seem plausible:
Texas 10.41
Seattle 10.33
Oakland 10.32
Angels 10.28
The 10.41 for Texas could be a simple typo on the webpage, but if it's really the number in Texas' ranking, that would explain a .09 difference, since their schedule strength value should be about the same as Oakland's.
11:21 PM Jan 10th
 
jrickert
The problem of drift from 10.00 is probably caused by rounding error in the calculations propagating until the errors become large enough to notice. This problem can be eliminated by setting up the equations that represent the limiting values and solving the system rather than iterating.
With present day machines solving a 30-by-30 (or 916-by-916) system is fairly straightforward.
When I ran the numbers for 2008, my numbers seem to be closer to clarkshu's. I got Texas at 9.93 and the White Sox at 10.866.
11:58 PM Jan 9th
 
clarkshu
Bill-I write out information about each game, and I don't see any weird baseball scores or game scores in my data. My game scores range from 1.19 to 19.19 (PHI beat STL 20-2 on 6/13). Anyway, my power ratings are 0.09 low for Texas, and 0.05 high for the White Sox, so if there's a discrepancy in our data, it's probably there. Here are the game records I write out for their games:

CHAAL 200807110 TEXAL A 2 7 3.28 4.62 7.9
CHAAL 200807120 TEXAL A 9 7 6.78 4.62 11.4
CHAAL 200807130 TEXAL A 11 12 7.78 2.12 9.9
CHAAL 200807210 TEXAL H 1 6 2.78 5.12 7.9
CHAAL 200807220 TEXAL H 10 2 7.28 7.12 14.4
CHAAL 200807230 TEXAL H 10 8 7.28 4.12 11.4
TEXAL 200807110 CHAAL H 7 2 6.97 5.93 12.9
TEXAL 200807120 CHAAL H 7 9 6.97 2.43 9.4
TEXAL 200807130 CHAAL H 12 11 9.47 1.43 10.9
TEXAL 200807210 CHAAL A 6 1 6.47 6.43 12.9
TEXAL 200807220 CHAAL A 2 10 4.47 1.93 6.4
TEXAL 200807230 CHAAL A 8 10 7.47 1.93 9.4

The first two decimal numbers in each row are offensive and defensive game scores. The third number is the total game score, which is just the sum of the first two. All of these look reasonable and the runs scored are correct.
9:45 PM Jan 9th
 
bjames
CLARKHSU--The Texas bug; my guess would be that one game is reading 1-90 instead of 1-9 or something like that. There's a discrepancy of about 80 runs. I'm sure you can find it if you run the data for each month. . .one month will have a rating of 3.1 or something.
12:17 AM Jan 9th
 
clarkshu
On the rankings numbers moving off-center, you can wait until the rankings stabilize before you recalibrate. I didn't understand what was happening until I computed rankings for the 2007 NFL season and then added in the postseason. When I ran the regular season numbers, the average game score and power score were both 100.0, but when I added the postseason games the average power score fell to 99.7, while the average game score stayed the same.
11:00 PM Jan 7th
 
clarkshu
Bill, I set my program to handle up to 1500 teams and 50000 games. I just ran a test with 47000+ games (albeit only 33 teams) and it ran in about 15 seconds.

When I run the 2008 regular season through my program, I get ratings that are usually .01 or .02 different from yours. The one exception is Texas, where I get a rating of 9.93. Since I didn't do any manual data entry, I'm trying to find a reason for this. I don't make any adjustments for home field, but that can't matter that much.
3:20 PM Jan 6th
 
bjames
Responding to Clarkshu. . .I believe there were 916 college baseball teams last year, or some number like that. We figured it out one time, went through every league we could find and counted the teams. Let's assume teams play 70 games a year; that's 32,000 games, more or less. (916 * 70 / 2).

On the issue of the average drifting from 10.000 to 9.999 when the games are uneven, in my system I actually re-center the numbers at 10.000 once every 15 cycles or something. The off-course drift becomes an actual problem if you re-cycle the data a very large number of times, as you may need to do to calculate values in a large and complex system.
12:34 PM Jan 6th
 
rpriske
J.P.'s hands are tied. He isn't blowing it up, he just can't do very much.

People should not underestimate the devastating effect that the death of Ted Rogers had on the Blue Jays.
11:09 AM Jan 6th
 
mketchen
Bill,

Great stuff. My question is this: was Toronto A) really that unlucky (they seem to rank high in every metric except actual W-L record) and B) if they were this close and they knew it, why is J.P. blowing it up? Did they not know this? Is it strictly financial? Are they going to demand a move to the AL West ; ). Keep up the great work.
9:54 AM Jan 6th
 
ventboys
I wonder if this can also work on the individual level, adding schedule strength to park effects? I imagine that it wouldn't be simply quality of competition, but also a texture of competition, of sorts. Colorado players possibly play a disproportionate number of road games in low run environments, or Baltimore players face more quality pitchers, as potential examples. In the NFL, I can see where looking forward to the schedule can make some difference.
10:53 PM Jan 5th
 
clarkshu
Bill, did you notice that the average rating falls from 10 to 9.99 when you include the postseason games? I noticed this on the NFL rankings, and it's a much bigger deal there. With this method, the average game score will always be 10.00, but the average power rating will only be 10.00 if every team has played the same number of games. Adding postseason games increases the quality of the "average" game, since the extra games are played by good teams, so the power scores go down to compensate.
10:40 PM Jan 5th
 
clarkshu
I could adapt the C program I wrote for NFL rankings for this rather easily. The only information I would need would be an idea of how many teams and games are involved, so I could make the internal tables large enough.
10:26 PM Jan 5th
 
enamee
A system like this for NCAA teams is like the holy grail of college baseball statistics. I've got to think somebody with some programming skills could pull it off.
6:07 PM Jan 5th
 
elricsi
This guy has been ranking all manner of teams by computer for a few years:

http://www.masseyratings.com/rate.php?lg=mlb&yr=2008&sub=MLB&mid=6

You can check those and see how his list compares (he is using wins and losses only). On another part of his site, he compares all the college basketball and football ranking systems.
4:46 PM Jan 5th
 
 
©2024 Be Jolly, Inc. All Rights Reserved.