Since I developed the point-comparison ranking system that we have used to rank NFL teams, NBA teams and (in 2007) NCAA football teams, people have been asking me whether it could be used to compare baseball teams. I have finally approached this problem, and I’ll have a report for you shortly, but first I need to make sure everybody is up to speed on the method, even though I have explained it several times by now.
Suppose that Albany plays Binghamton, and beats them 70-52. The only thing we can conclude, based on this one game, is that Albany is 18 points better than Binghamton. If we assume that an average team is “100”, then we have to conclude, based on this one game, that Albany is at 109 and Binghamton at 91.
Suppose that Cheektowaga plays Deer Park, and defeats them 122 to 41. The only thing we can conclude, based on the data we have, is that Cheektowaga is 81 points better than Deer Park. Since this is the only game info we have about either team, we place Cheektowaga at 140.5, and Deer Park at 59.5.
Suppose, however, that Albany plays Deer Park and beats them 82-68 (Albany 107, Deer Park 93), and that Binghamton plays Cheektowaga and defeats them 92-89. Now we have two estimates for each team:
Team           Game 1    Game 2    Average
Albany         109       107       108.00
Binghamton      91       101.5      96.25
Cheektowaga    140.5      98.5     119.50
Deer Park       59.5      93        76.25
So the teams appear to rank in this order:
1. Cheektowaga 119.50
2. Albany 108.00
3. Binghamton 96.25
4. Deer Park 76.25
The data—like all such data—is internally inconsistent. If Cheektowaga is 81 points better than Deer Park, Binghamton is 3 points better than Cheektowaga, and Albany is 18 points better than Binghamton, then Albany must be 102 points better than Deer Park. But Albany played Deer Park, and defeated them by only 14. Our estimates, then, must not be exactly right.
Suppose that we re-run all of the games, using the averages above as the new starting point. In the first round of games, since all teams were initially assumed to be average, Albany’s “game output score” for the Binghamton game was 109, based on
(100 + 100 + 70 – 52) / 2 = 109.00
But in the second round, we assume that Albany is at 108 and Binghamton at 96.25, and score the game for Albany as:
(108 + 96.25 + 70 – 52) / 2 = 111.125
The results of the second-round calculations for all games are:
Team           Game 1     Game 2     Average
Albany         111.125     99.125    105.125
Binghamton      93.125    109.375    101.25
Cheektowaga    138.375    106.375    122.375
Deer Park       57.375     85.125     71.25
The rank is now
1. Cheektowaga 122.4
2. Albany 105.1
3. Binghamton 101.3
4. Deer Park 71.3
We then use these second-round outputs as the third-round starting figures, and re-calculate again.
After the third round, Binghamton has pulled ahead of Albany:
1. Cheektowaga 123.8
2. Binghamton 103.8
3. Albany 102.7
4. Deer Park 68.8
And, as we continue to re-calculate, they will pull further ahead. Finally, after a few dozen rounds of re-calculation, we will reach these values:
1. Cheektowaga (1-1) 125.25
2. Binghamton (1-1) 106.25
3. Albany (2-0) 102.25
4. Deer Park (0-2) 66.25
And when we reach those numbers, they stop moving. If you re-calculate 10,000 more times, the numbers will stay right there. Deer Park is supposed to be 81 points worse than Cheektowaga and 14 points worse than Albany; we can’t make that happen, so we compromise on 59 points worse than Cheektowaga (81 – 22) and 36 points worse than Albany (14 + 22). Albany is supposed to be 18 points better than Binghamton but only 14 points better than Deer Park. We can’t make that happen, so we compromise on 36 points better than Deer Park (14 + 22) but 4 points worse than Binghamton (18 – 22). Cheektowaga is supposed to be 81 points better than Deer Park but 3 points worse than Binghamton. We can’t make that happen, so we compromise on 59 points better than Deer Park (81 – 22) but 19 points better than Binghamton (-3 + 22).
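The whole procedure can be sketched in a few lines of code. This is a minimal illustration using the four example games, not the actual spreadsheet; the loop simply repeats the averaging step until the numbers stop moving.

```python
# A minimal sketch of the point-comparison iteration described above,
# using the four example games. Not the actual spreadsheet formulas.
games = [("Albany", 70, "Binghamton", 52),
         ("Cheektowaga", 122, "Deer Park", 41),
         ("Albany", 82, "Deer Park", 68),
         ("Binghamton", 92, "Cheektowaga", 89)]

# every team starts at the assumed average of 100
ratings = {t: 100.0 for t in
           ("Albany", "Binghamton", "Cheektowaga", "Deer Park")}

for _ in range(200):  # "a few dozen rounds" is plenty for four games
    outputs = {t: [] for t in ratings}
    for team_a, a_score, team_b, b_score in games:
        # game output score = (own rating + opponent rating + margin) / 2
        outputs[team_a].append(
            (ratings[team_a] + ratings[team_b] + a_score - b_score) / 2)
        outputs[team_b].append(
            (ratings[team_b] + ratings[team_a] + b_score - a_score) / 2)
    # next round's input is each team's average output
    ratings = {t: sum(o) / len(o) for t, o in outputs.items()}

for team, r in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team:12s} {r:.2f}")
# Cheektowaga 125.25, Binghamton 106.25, Albany 102.25, Deer Park 66.25
```

Run from any starting values, the loop settles on the same relative values; the absolute level is set by the average of the starting values, which is the arbitrary center discussed below.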
Let me emphasize this: it makes no difference whatever what values you initially assume each team has. If you initially assume that Deer Park has a value of 400 and all of the other teams are at zero, you get first-round outputs like this:
Team           Game 1    Game 2    Average
Albany           9        207      108.00
Binghamton      -9          1.5     -3.75
Cheektowaga    240.5       -1.5    119.50
Deer Park      159.5      193      176.25
But after the second-round calculations, you get these averages:
1. Cheektowaga 122.4
2. Deer Park 121.2
3. Albany 105.1
4. Binghamton 51.2
And, after a large number of re-calculations, you wind up with exactly the same numbers we had before. The initial assumption entirely washes out as the “comparison” data is re-introduced again and again and again.
In this example there are only four games, but in real leagues there are hundreds or thousands of games, each of them trying to push the ranking for a team up or down. It takes many rounds of re-calculation for the system to resolve all the tensions as well as it can, but the process eventually reaches a stopping point, which is the point at which the output values are the same as the input values.
In basketball we assume that an average team has a value of 200, in football, an average of 100, and in baseball, an average of 10. Those values are arbitrary, and it doesn’t matter; it’s just a way of establishing a center. You could make 27.418 the center, and the relative values of the teams would be just the same.
Anyway, I have done this for various sports, and people keep asking me, “Could you do this for baseball?” I didn’t think about doing it for baseball, honestly, because
1) One doesn’t tend to assume that the outcome of a single baseball game is representative of the ability of the team, and
2) The universe of games is immense.
Once Retrosheet published the data for 2008, however, I took a look at whether the game logs for each team could be moved into a spreadsheet like the ones we use. I concluded that they could. I’m not a programmer; it took me 25 to 35 hours of actual work, but I was able to get the games into a spreadsheet hooked up with all the necessary formulas.
OK, this is the ranking, by this method, for the 30 Major League teams in 2008, not including post-season play:
Team            Lg   Rank
Boston          A    11.35
Toronto         A    11.11
Tampa Bay       A    11.10
New York        A    10.84
Chicago         A    10.81
Minnesota       A    10.81
Chicago         N    10.76
Angels          A    10.71
Cleveland       A    10.56
Philadelphia    N    10.37
Detroit         A    10.14
NY Mets         N    10.12
Oakland         A    10.07
Milwaukee       N    10.06
Texas           A    10.02
St. Louis       N    10.00
Baltimore       A    10.00
Kansas City     A     9.86
Los Angeles     N     9.85
Florida         N     9.69
Arizona         N     9.62
Houston         N     9.56
Atlanta         N     9.52
Seattle         A     9.49
Cincinnati      N     9.21
Colorado        N     9.12
Pittsburgh      N     8.90
San Francisco   N     8.88
San Diego       N     8.84
Washington      N     8.63
And this is the ranking for all 30 teams when post-season play is included:
Team            Lg   Rank
Boston          A    11.26
Tampa Bay       A    11.13
Toronto         A    11.08
New York        A    10.82
Minnesota       A    10.78
Chicago         A    10.74
Chicago         N    10.68
Angels          A    10.67
Cleveland       A    10.53
Philadelphia    N    10.50
NY Mets         N    10.15
Detroit         A    10.11
Milwaukee       N    10.04
Oakland         A    10.04
St. Louis       N    10.02
Texas           A    10.00
Baltimore       A     9.97
Los Angeles     N     9.96
Kansas City     A     9.84
Florida         N     9.72
Arizona         N     9.65
Houston         N     9.57
Atlanta         N     9.55
Seattle         A     9.46
Cincinnati      N     9.22
Colorado        N     9.15
Pittsburgh      N     8.91
San Francisco   N     8.90
San Diego       N     8.86
Washington      N     8.66
Boston comes out first, but … I hope nobody thinks this is what this is about. If you work with baseball statistics, you probably knew who would come out first before I told you. I couldn’t care less who ranks first; this isn’t how the championship is determined.
The Red Sox, by this method, were about 219 runs better than an average major league team, during the regular season. They outscored their opponents by 151 runs (845 to 694), and they played a schedule that was 68 runs better than a major league average schedule. (Actually, 66. The other two runs are created by the interaction of the two forces.) Washington is 222 runs worse than an average team—184 from being outscored, and 38 more from playing a weaker-than-average schedule.
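The conversion from the rating scale to full-season runs is just the distance from the league average (10.00) times 162 games; a quick sketch, using the ratings from the table above:

```python
# Rating-to-runs conversion: distance from the league average of 10.00,
# times a 162-game schedule.
def season_runs_vs_average(rating, games=162, league_average=10.0):
    return (rating - league_average) * games

print(round(season_runs_vs_average(11.35)))  # Boston: 219
print(round(season_runs_vs_average(8.63)))   # Washington: -222
```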
Having done this, I see more merit in the method than I would have supposed was there going in. I was reluctant to do this; I thought it would be a lot of work to show us things that
a) we already knew, and
b) didn’t really matter.
Baseball has a very good system to resolve its championship; it doesn’t need rankings.
But I see more merit there than I suspected. First, the issue of predictability … do early-season rankings predict the finish of the season?
Yes, but not really. Since there are very few inter-league games early in the season, if we rank the teams based on games played through May 31, we get rankings that are sensible within the league, but with little information on how one league compares to the other:
AMERICAN                 NATIONAL
Team           Rank      Team            Rank
Boston         10.70     Chicago         11.39
Toronto        10.64     Philadelphia    10.93
Oakland        10.64     Atlanta         10.86
Tampa Bay      10.59     Arizona         10.56
Chicago        10.58     NY Mets         10.24
Cleveland      10.28     Los Angeles     10.10
Angels         10.04     St. Louis       10.02
New York       10.00     Houston          9.97
Baltimore       9.79     Florida          9.96
Texas           9.75     Pittsburgh       9.93
Minnesota       9.70     Cincinnati       9.80
Detroit         9.68     Milwaukee        9.78
Kansas City     9.03     Washington       9.36
Seattle         8.96     San Francisco    9.01
                         San Diego        8.87
                         Colorado         8.86
Those rankings are reasonably predictive of the final finishes, but are they better than just looking at the standings?
Well, yes, probably. Florida was 31-23 on May 31, in first place in the NL East—but we see them here as a .500 team, and the fourth-best team in the division, Philadelphia being the best. They were one game under .500 for the rest of the season, and finished third in the division. Cleveland was 25-30 on May 31, but we see them here as an above-average team, and they did go 56-51 the rest of the way, although they never did get back in the race. On the other hand, Oakland and Atlanta, which appeared early in the season to be strong teams, ultimately proved not to be. My conclusion: Yes, there is probably some predictive significance to the method, but you wouldn’t want to rely on it. And you’d have to study many more seasons than one to know how reliable the early-season evaluations really were.
That’s a minor issue, to me, although an obvious one. The real virtues that I see in this system are:
1) It creates a meaningful and fairly reliable evaluation of how one league compares to the other.
Suppose that there were only one game played between the leagues, and that in that one game, San Diego beat Texas 13-2. If that were the case, this system would place the National League teams, on average, ten to twelve points ahead of the American League teams. This would happen because there would be nothing in the system resisting the input of that one game. San Diego would, of necessity, rank 11.0 runs ahead of Texas, and all of the other teams would re-orient themselves within the league based on the interlocking schedule within the league.
That’s a cautionary note; limited games, not interlocking with the rest of the schedule, can have a disproportionate impact on the rankings in this system.
But the AL/NL comparisons are not based on one game, of course; they are based on 250+ games, in which the American League, which has dominated those games for several years, went 149-103 (150-107 if we count the World Series). Our estimate is that the average American League team is 0.86 runs per game better than the average NL team. I am sure we could derive this estimate by some other, simpler method—but I doubt that we could derive a more accurate one.
On the larger point, I think that the greatest potential of this method is in the comparison of leagues, and in particular in the comparison of leagues for college baseball. Our experiment here suggests that this method works quite well for baseball. There are 900+ college baseball teams, which play a completely interlocking schedule. UC-Riverside plays somebody who plays somebody who plays somebody who plays Middlebury College in Vermont and the University of Puget Sound in Tacoma and Bowdoin College in Maine.
Major league teams genuinely need to know how one college league compares to another. This method could definitively resolve that issue. The people who should do this are us—Bill James Online. So far, because of financial and programming issues, we haven’t been able to get things like that done, but if we don’t, somebody will. There is no reason we cannot clearly and definitively rank Bowling Green and Montevallo against USC and Texas.
2) It creates a sophisticated and accurate estimate of each team’s strength of schedule.
Because we have accurate rankings for each team on a run scale, we can easily figure the average strength of the opposition for each team. These are those figures for each of the 30 major league teams:
Team            Lg   Strength of Schedule   Plus Runs
Baltimore       A    10.51                   83
Toronto         A    10.44                   72
New York        A    10.44                   71
Tampa Bay       A    10.43                   70
Boston          A    10.41                   66
Texas           A    10.41                   66
Kansas City     A    10.39                   64
Detroit         A    10.34                   55
Seattle         A    10.33                   53
Oakland         A    10.32                   52
Chicago         A    10.28                   46
Angels          A    10.28                   45
Cleveland       A    10.26                   43
Minnesota       A    10.26                   42
Pittsburgh      N     9.83                  -27
Cincinnati      N     9.81                  -30
Washington      N     9.80                  -32
Houston         N     9.76                  -38
Florida         N     9.71                  -47
Atlanta         N     9.71                  -47
Milwaukee       N     9.69                  -50
St. Louis       N     9.68                  -51
Philadelphia    N     9.65                  -56
San Diego       N     9.65                  -57
Chicago         N     9.64                  -58
San Francisco   N     9.64                  -59
NY Mets         N     9.64                  -59
Colorado        N     9.61                  -62
Arizona         N     9.57                  -70
Los Angeles     N     9.55                  -73
The teams in the AL East play the strongest schedules, because there are four strong teams in that division. Baltimore plays the toughest schedule because they are the only team that has to play all four of them.
There is a lot of talk about strength of schedule … some whining about the unbalancing effects of the inter-league matchups, some discussion about playing so many games inside the division. This method gives us solid, credible information with which to approach that discussion. I think that’s worthwhile.
Baltimore’s schedule is 156 runs tougher than Los Angeles’ schedule—one run a game, basically. Baltimore starts the season 156 runs behind the Dodgers. What do we think about that? Should we just live with it, or should we try to do something about it?
3) It is a step toward the possible evolution of methods of adjusting statistical performance for strength of schedule.
I don’t talk about what happens in the Red Sox front office, but I think I can tell you this: We worry a lot about “Can this guy come into our division and compete?” OK, here’s a pitcher who has a good record pitching in some other division, but … that ain’t the AL East. What’s going to happen to him against this level of competition?
This information is a step on the road toward a method that can adjust statistical performance for the level of competition.
Alternative Approach
What if we approached this problem not through run differential but through winning percentage?
In order to do that, we need to be able to state the outcome of each game as a winning percentage. I outlined a method to do that in another article (Winning Percentage from a Game) … a 1-0 win creates a winning percentage of .541, a 12-6 victory is .626, a 4-5 loss is .422.
If you post a .626 winning percentage against a team with a winning percentage of .482, what is your winning percentage? In other words, the .626 assumes a .500 opponent. It’s not a .500 opponent; it’s a .482 opponent. What’s the equivalent winning percentage?
It’s .609. There’s an old, established method for dealing with that … Dallas Adams and I invented it in the 1970s. I don’t want to get into that now, but:
.626 against a .400 team is equivalent to .527 against a .500 team.
.626 against a .450 team is equivalent to .578 against a .500 team.
.626 against a .500 team is .626.
.626 against a .550 team is equivalent to .672 against a .500 team.
.626 against a .600 team is equivalent to .715 against a .500 team.
If you combine the new method (Winning Percentage from a Game) with this old method, you can calculate a winning percentage for each game, adjusting for the quality of the competition.
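The article doesn’t spell out the old method’s formula, but the standard log5-style form reproduces every figure quoted here, so that is what this sketch assumes:

```python
# Log5-style restatement of a winning percentage p earned against an
# opponent of quality q, as the equivalent percentage against a .500 team.
# NOTE: the exact formula is an assumption; it matches the article's examples.
def vs_500(p, q):
    return p * q / (p * q + (1 - p) * (1 - q))

print(round(vs_500(.626, .482), 3))  # 0.609 -- the example above
print(round(vs_500(.626, .400), 3))  # 0.527
print(round(vs_500(.626, .550), 3))  # 0.672
```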
We evaluate each game of the major league season in this way. Milwaukee played at Cincinnati on April 18, April 19 and April 20, Milwaukee winning 5-2 and 5-3 and losing the third, 3-4.
A 5-2 win is a winning percentage of .719. A 5-3 win is a winning percentage of .635, and a 3-4 loss is a winning percentage of .444, so Milwaukee’s winning percentages for the three games are .719, .635 and .444, and Cincinnati’s are .281, .365 and .556—without adjusting for the quality of competition.
To adjust for the quality of competition, we go through the process outlined above. On the first round of calculations, we assume that Cincinnati is a .500 opponent. Cincinnati’s winning percentage after one round of calculations, however, is .471, so in the second round of calculations we assume that their winning percentage is .471, and we re-calculate again.
After many rounds of calculations, Cincinnati’s winning percentage locks in at .470, and then it won’t move anymore; this is the end point data. We recalculate these games based on that conclusion:
.719 against .470 is .694 (meaning that it is equivalent to .694 against a .500 team.)
.635 against .470 is .607.
.444 against .470 is .414.
So Milwaukee’s winning percentage contributions for those games are .694, .607 and .414.
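Here is a toy version of that iteration, restricted to the three Milwaukee-Cincinnati games above. In the real calculation every team’s percentage is pulled on by its whole schedule, so this two-team loop will not land on the article’s .470 for Cincinnati; it only shows the mechanics. The log5-style `vs_500` adjustment is an assumed form of the old method.

```python
# Toy two-team version of the winning-percentage iteration, using only
# the three Milwaukee-Cincinnati games quoted above. The real run uses
# every game of the season, so these values won't match the article's.
def vs_500(p, q):
    # assumed log5-style adjustment: pct p against quality q,
    # restated as the equivalent pct against a .500 team
    return p * q / (p * q + (1 - p) * (1 - q))

mil_games = [0.719, 0.635, 0.444]   # Milwaukee's single-game percentages
mil, cin = 0.500, 0.500             # round one assumes .500 opponents

for _ in range(500):
    new_mil = sum(vs_500(p, cin) for p in mil_games) / 3
    new_cin = sum(vs_500(1 - p, mil) for p in mil_games) / 3
    mil, cin = new_mil, new_cin

print(round(mil, 3), round(cin, 3))  # the two percentages sum to 1
```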
By calculating every game in this fashion and running it through many cycles, we get output winning percentages for every team as follows, including the playoff and World Series Games:
Team            Lg   Winning Percentage
Boston          A    .558
Tampa Bay       A    .542
Toronto         A    .539
Chicago         N    .535
Angels          A    .533
Philadelphia    N    .529
Minnesota       A    .528
New York        A    .528
Chicago         A    .521
NY Mets         N    .518
Cleveland       A    .517
Milwaukee       N    .516
Los Angeles     N    .509
St. Louis       N    .508
Houston         N    .500
Florida         N    .499
Texas           A    .495
Oakland         A    .495
Arizona         N    .492
Kansas City     A    .488
Detroit         A    .488
Baltimore       A    .487
Atlanta         N    .484
Colorado        N    .475
Cincinnati      N    .470
Seattle         A    .468
San Francisco   N    .464
Pittsburgh      N    .459
San Diego       N    .452
Washington      N    .438
This is essentially the same as the rankings we got by the other method—a little different, but mostly the same.
A strength of this method is that it is more focused on wins and losses, and pays little attention to the difference between a 7-1 win and a 15-1 win. There are eight runs there that should be depreciated—and are depreciated by this method, not by the other one.
A weakness of this method, which was discussed in the companion article (Winning Percentage from a Game), is that the average winning percentage from all games does not track with the team’s actual winning percentage, but with a figure halfway between that number and .500 … .600 becomes .550, .580 becomes .540, etc.
I was trying to figure out a way to work around this problem, but the best I could come up with was simply to go through the entire process, and then double the spreads (double the distance from .500) at the end of the process:
Team            Lg   Centralized Pct             De-Centralized Pct
Boston          A    .558       becomes          .615
Tampa Bay       A    .542       becomes          .585
Toronto         A    .539       becomes          .577
Chicago         N    .535       becomes          .571
Angels          A    .533       becomes          .565
Philadelphia    N    .529       becomes          .558
Minnesota       A    .528       becomes          .556
New York        A    .528       becomes          .556
Chicago         A    .521       becomes          .542
NY Mets         N    .518       becomes          .536
Cleveland       A    .517       becomes          .533
Milwaukee       N    .516       becomes          .531
Los Angeles     N    .509       becomes          .517
St. Louis       N    .508       becomes          .516
Houston         N    .500       becomes          .500
Florida         N    .499       becomes          .498
Texas           A    .495       becomes          .491
Oakland         A    .495       becomes          .490
Arizona         N    .492       becomes          .484
Kansas City     A    .488       becomes          .476
Detroit         A    .488       becomes          .475
Baltimore       A    .487       becomes          .474
Atlanta         N    .484       becomes          .467
Colorado        N    .475       becomes          .451
Cincinnati      N    .470       becomes          .441
Seattle         A    .468       becomes          .435
San Francisco   N    .464       becomes          .428
Pittsburgh      N    .459       becomes          .417
San Diego       N    .452       becomes          .404
Washington      N    .438       becomes          .375
That’s not a very good way to make that adjustment, and I’m sure somebody will suggest a better way of de-centralizing the data.
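For concreteness, the doubling transform is just this. Applied to the rounded .558 it gives .616 rather than the table’s .615, which presumably comes from the unrounded percentage:

```python
# "Double the spreads": de-centralize a winning percentage by doubling
# its distance from .500.
def decentralize(pct):
    return 0.500 + 2 * (pct - 0.500)

print(round(decentralize(0.558), 3))  # 0.616
```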
I experimented with de-centralizing the data during the calculation process—that is, de-centralizing the numbers after each round of calculations, before the next round. I thought that one of two things might happen:
1) That after being de-centralized in the opening rounds of the calculations, the data might stabilize at the de-centralized numbers, or
2) That the system might veer out of control, and start giving us irrational calculations.
But actually neither of those happens. What happens—it is in a sense re-assuring—is that the system persistently attempts to stabilize at the “centralized” numbers, and defies the efforts to de-centralize it. In other words, Boston is headed for .558 and Washington is headed for .438, no matter what you do. If you double the difference from .500 in the early rounds, the data will home in on the “centralized” numbers as soon as you stop forcing it away from .500. If you double the difference from .500 after every round, the system homes in on the de-centralized numbers. Doing the de-centralization during the process is the same as doing it after the process.
It works OK; I like the other method a little better, but I can see an argument for this one, too. No matter what we do, we are going to reach the conclusion that the Red Sox were the best team in baseball in 2008, but I’ve checked my finger a number of times, and I’m really certain that there ain’t no ring there. I’m not pursuing that claim; we’re simply trying to understand the data a little bit better. By learning to make inferences from the data, we might eventually learn to rank restaurants, high schools, political candidates or movie stars. We’re starting with baseball teams.