The Log5 Method, Etc, Etc, Etc.

By Bill James

December 11, 2015

Hi Bill. In baseball, there's a nice symmetry when it comes to expected runs scored in a game: Offense X Defense divided by Lg average. Offense and defense are equally responsible for runs scored. What if the offense is more responsible than the defense? Say, for example, 3 pt % in college basketball. The offense is maybe 80% responsible, with the defense only 20%. For example: KU is playing Oregon St on Saturday. KU shoots 43.9% from 3pt land. OSU holds its opponents to 28.4%. D-1 Average is 34.1%. Using the baseball formula gives me this: 43.9% * 28.4% / 34.1% = 36.5%. KU should shoot 36.5% from 3 pt land on Saturday using the standard formula. But if we think the offense is responsible for 80% of 3pt% (just making up that #), how would we adjust that formula above? My eyeball test says it should be around 40-41%; I just can't figure out how to tweak the formula correctly.

--Yakustk

You assumed that if KU has a 3-point (offensive) percentage of .439 and Oregon has a defensive percentage of .284 in a context in which the norm is .341, then KU against Oregon should result in a .365 outcome. In reality, the Log5 method tells us that it would result in an expected outcome of a shooting percentage of .375. I can’t really explain why in this context, but the expected result is .375, not .365. Your question (Yakustk) was, wouldn’t this be higher (than .365) because offense is more of a controlling factor in this set of events than defense is? But if offense is more of a controlling factor than is defense, I believe that the effect that would create is not to cause the Log5 system not to work, but rather, merely to cause the standard deviation of 3-point percentage for offensive teams to be much larger than the standard deviation for defensive teams. It may be that it IS larger; it may be that it is not; I don’t know. I wouldn’t intuitively agree with your assertion that the offensive elements of this are more dominant than the defensive components. KU bombs threes successfully if the other team can’t defend the perimeter. The other team won’t be able to defend the perimeter if (1) the perimeter players have to help out inside, or (2) KU moves the ball too quickly for them to adjust. But we (KU) certainly have games when we can’t get up good three-point shots because the other team is able to defend the perimeter.

--Bill James (Me)

More palatably: an (observed) .350 OBP hitter facing an (observed) .280 OBP pitcher should have an identical expectation as a .350 OBP pitcher facing a .280 OBP hitter. But what about an observed 35% 3-point offense facing an observed 28% 3-point defense: would that necessarily mean that a 35% 3-point defense facing a 28% 3-point offense would result in the same outcome? If the answer can be shown to be "no", and agreeing that the solution is to apply Log5, then this must mean we need to adjust the numbers first before doing so.

--Tango Tiger

I agree that the entirety of the explanation rests on the spread in talent for offense and defense in shooting percentage. And if the spread is equal, then we Log5(*) our way to the solution. (* More commonly known as the Odds Ratio Method.) However, suppose the spread in offense 3 pointers is larger than in defense 3 pointers (which seems reasonable enough at least to have a discussion). Could we then just use Log5 on the shooting percentages? I say no, not without an adjustment (which is really your reader's point). For example, if something was 100% offense and 0% defense, random variation would still give us a non-zero standard deviation for defense. That's because we care about talent, but we only know observations. So, we actually want to remove the random variation portion first, before applying Log5. (We don't have this issue in the 50/50 cases because the random variation would, essentially, cancel out.)

--Tango Tiger

Several initial points—

a) In order for this to make ANY sense to the reader, I’m going to have to back WAY up and explain 15 bits of pre-requisite knowledge.

b) The terminology is tremendously confusing, so it’s always going to be difficult to keep track of what it is that we are actually discussing.

c) I am 99% certain that you are wrong, but

d) I am 98% certain that it would be impossible to determine which theory was correct with data from a baseball context. The data would just never support a real-life experiment. You MIGHT be able to prove which theory was correct with a theoretical experiment or with data from some other area, but it would be difficult.

OK, let’s back up and try to take the reader with us. The discussion starts with this question: Suppose that a .600 team is playing a .450 team? How often will the .600 team win, and how often will the .450 team win?

The .600 team will win 64.7% of the time; the .450 team will win 35.3% of the time.

How do we know this to be true?

We know it in this way. Suppose that each team has an "underlying skill level", and that the underlying skill levels interact to create the output, or the "observed skill levels". What, then, is the "underlying skill level" for a .600 team?

We assume as a starting point that

a) the underlying skill level for a .500 team is .500, and

b) the underlying skill level for a .000 team is .000, and

c) the underlying skill level for a 1.000 team is infinite.

Suppose that a .500 team is playing another .500 team. If the underlying skill of each team is .500, then Team A will win 50% of the time:

Team A WPct = (.500 / (.500 + .500)) = .500

If Team B is a .000 team, then Team A will always win:

Team A WPct = (.500 / (.500 + .000)) = 1.000

If Team B is a 1.000 team, then Team A will always lose:

Team A Wpct = (.500 / (.500 + inf)) = .000

All of that makes sense, right? If a .500 plays another .500 team, they should split. If a .500 team plays a team that never wins, the .500 team will always win, and if a .500 team plays a team that never loses, the .500 team will always lose.

Suppose that a .600 team plays a .500 team? How often will they win then?

There are four points on the graph at which we know how often the .600 team should win. We know that if they play a .000 team, they should always win, just as a .500 team would. We know that if a .600 team plays a .500 team, they should win 60% of the time. We know that if they play another .600 team, they should each win 50% of the time, and we know that if they play a 1.000 team, then they should always lose.

Suppose a .600 team is playing opponents of all different qualities—a .100 team, a .200 team, a .250 team, a .300 team, a .900 team, etc. When we graph their expected winning percentage against quality of opposition, we know intuitively (and logically) that the graph has to rise at some consistent rate, and we also know where that graph should be at these four points--.000, .500, .600 and 1.000. What we do not know at this point is where the graph should be at any other point.

The number that works for .600 is .750:

(.750 / (.750 + .000) ) = 1.000

(.750 / (.750 + .500) ) = .600

(.750 / (.750 + .750) ) = .500

(.750 / (.750 + inf) ) = .000

So we know now the "underlying skill levels" for four teams—

.000 is .000,

.500 is .500,

.600 is .750,

1.000 is Infinite.

What is the underlying skill level for a .400 team?

It’s .333. Same thing:

(.333 / (.333 + .000) ) = 1.000

(.333 / (.333 + .333) ) = .500

(.333 / (.333 + .500) ) = .400

(.333 / (.333 + inf ) ) = .000

So then, what if a .600 team plays a .400 team? If the underlying skill level of the .600 team is .750, and the underlying skill level of the .400 team is .333, then the .600 team should win 69.2% of the time:

(.750 / (.750 + .333) ) = .692

At this point, we don’t know whether this is a correct answer, but we can conclude that it is a reasonable answer. We know that a .600 team has to beat a .400 team more than 60% of the time. But we also know, when we think about it, that the .600 team has to beat a .400 team somewhat less than 70% of the time, because of the curvature of the lines. But that’s getting too technical, and I’d better get away from that point before I lose the audience.

The .750 can be derived from .600 in this way, using the terms "USL" and "OSL" for "Underlying Skill Level" and "Observed Skill Level"

USL = OSL / (2 * (1 – OSL) )

By that method, the Underlying Skill Level for .600 is .750, and the Underlying Skill Level for an Observed Skill Level of .400 is .333:

.750 = .600 / ( 2 * (1 – .600) )

.333 = .400 / (2 * (1 - .400) )
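The conversion and the matchup formula can be sketched in a few lines of Python. This is only an illustration, not the original spreadsheet; the function names `usl` and `win_pct` are made up here, and the sketch assumes a .500 league and at least one team with a non-zero underlying level.

```python
import math


def usl(osl: float) -> float:
    """Underlying Skill Level for an Observed Skill Level (OSL)."""
    if osl >= 1.0:
        return math.inf  # a 1.000 team has infinite underlying skill
    return osl / (2.0 * (1.0 - osl))


def win_pct(osl_a: float, osl_b: float) -> float:
    """Expected winning percentage of team A against team B in a .500 league."""
    a, b = usl(osl_a), usl(osl_b)
    return a / (a + b)


print(round(usl(0.600), 3))             # underlying level of a .600 team
print(round(usl(0.400), 3))             # underlying level of a .400 team
print(round(win_pct(0.600, 0.400), 3))  # .600 team against a .400 team
```

Run as written, this should reproduce the .750, .333 and .692 figures used in the text.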

So, by this method, we can derive the Underlying Skill Levels for ANY Observed Skill Level. Suppose that we do this. Suppose that we derive the Underlying Skill Levels for a 1.000 team, a .900 team, an .800 team, a .700 team, etc.:

OSL	USL
1.000	Infinite
.900	4.500
.800	2.000
.700	1.167
.600	.750
.500	.500
.400	.333
.300	.214
.200	.125
.100	.056
.000	.000

Suppose that we match up the .800 team (and every other team) against a .950 team, a .925 team, a .900 team, an .875 team. . .a .025 team, etc. What would you find?

What you would find is that the Expected Winning Percentage given by this method is reasonable and appears to be correct in every case. In every case in which it should be over .500, it is over .500. In every case in which it should be under .500, it is under .500. In every case in which it should be over .600 but under .700, it is over .600 but under .700. If it SHOULD be closer to .700 than to .600, it IS closer to .700 than to .600. If it should be over .650 but less than .651, it will be over .650 but less than .651.
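A rough way to run that kind of check by machine is sketched below. It is an illustration only: the helper `grid_is_reasonable` and the particular sanity tests it applies are choices made here, not part of the original study. For every pairing on a .025 grid it checks that the better team is favored, that the two teams' chances sum to 1.000, that the result falls as the opponent gets better, and that a team plays its own record against a .500 opponent.

```python
def usl(osl):
    """Underlying Skill Level for an Observed Skill Level."""
    return osl / (2.0 * (1.0 - osl))


def win_pct(osl_a, osl_b):
    """Expected winning percentage of A against B in a .500 league."""
    a, b = usl(osl_a), usl(osl_b)
    return a / (a + b)


def grid_is_reasonable(levels):
    """Return True if every pairing passes the sanity checks described above."""
    for a in levels:
        previous = 1.0  # result against a .000 opponent
        for b in sorted(levels):
            p = win_pct(a, b)
            if a == b and abs(p - 0.5) > 1e-12:
                return False          # equal teams should split
            if a != b and (p > 0.5) != (a > b):
                return False          # the better team should be favored
            if abs(p + win_pct(b, a) - 1.0) > 1e-12:
                return False          # the two chances should sum to 1
            if not p < previous:
                return False          # tougher opponent, lower expectation
            previous = p
        if abs(win_pct(a, 0.5) - a) > 1e-12:
            return False              # a .600 team plays .600 against .500
    return True


levels = [k / 40 for k in range(1, 40)]  # .025, .050, ..., .975
print(grid_is_reasonable(levels))
```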

I am not a theoretical mathematician; I am an empiricist. If it works, that’s good enough for me. This is a system that works. That’s what I care about.

I developed this method probably in 1977; I don’t know, could have been ’75 or ’76. Sometime forty years ago. At that time I was corresponding regularly with Dallas Adams. This method assumes a .500 league; in other words, it works only if the league average is .500. Dallas Adams, a better mathematician than I am, developed a way to generalize this process so that it works if the league average is not .500; in other words, if it is a batting average (like .265) or an on base percentage (like .335). How does Dallas Adams’ adaptation of the method work? Let me see if I can figure this out.

All of this can be expressed in a spreadsheet in a simple form:

Team A Vs. League	.600	.750
Team B Vs. League	.400	.333
Team A Vs. Team B	.750	.333	.692

Which can be condensed to one line:

Team A	Team B	Tm A USL	Tm B USL	A Versus B
.600	.400	.750	.333	.692

At this point I should stop and acknowledge for the benefit of the Anal Retentive Assholes in the audience that of course there are many technical problems with applying this process in practice. A .600 team is not a .600 team in every situation; they may be a .650 team in their home park and a .550 team on the road. If you calculate what their winning percentage should be at home and what it should be on the road and then put the two together 50-50, it won’t come out to precisely .692; it will come out with something very close to that, but not EXACTLY that. Also, we’re assuming that a .600 team will play .600 ball against an average team, but in the real world, when you take the games played by the .600 team out of the league or set them apart in any manner, then the league is no longer a .500 league, so a .500 team is no longer precisely average, so a .600 team may not play .600 ball against a .500 team. Of course the .692 number (against a .400 team) varies with the quality of the starting pitchers, and the sum for a series of starting pitchers may not be exactly .692, and there are no doubt many other internal variables like this that will complicate the calculation in practice. We are estimating the Underlying Skill Level based on the observed winning percentage, but the previously observed winning percentage may not be an accurate measure of the team’s ability, and consequently may not have predictive significance.

In spite of these things, this process DOES work, and it is enormously useful in studying many different problems. Suppose that a .325 hitter is facing a pitcher against whom the league batting average is .280 in a league with a .265 batting average. What will be the expected batting average?

The expected batting average will be .342, but it is a little bit tough to explain how we get there. Intuitively, you can see that the hitter and the pitcher are both pushing the batting average UP, both pushing it HIGHER—the batter, because he hits for an average greater than the league average of .265, and the pitcher, because he allows hitters to hit for an average higher than .265. Both are pushing up, away from .265, so the result is higher than either the average of the hitter or the average of the pitcher. But how do we calculate ‘zactly what it is?

Take the five cells above:

Cell 1 (A1) is the first outcome,

Cell 2 (A2) is the second outcome,

Cell 3 (A3) is the USL of the first outcome, which is (A1/(2*(1-A1))).

Cell 4 (A4) is the USL of the second outcome, which is (A2/(2*(1-A2))).

Cell 5 is the outcome of 1 against 2, which is (A3 / (A3 + A4)).

But in order to make this work, we have to go through the same process five times. We start by comparing the hitter to the league average, in one line:

Htr A	Htr Lg	Htr A USL	Htr Lg USL	A Versus Lg USL
.325	.265	.241	.180	.572

These are the same FORMULAS as before.

The hitter against the League has an underlying strength level of .572. Because the hitter is stronger than the league average, he is above .500. For the pitcher, we do the same thing, except we base it not on the pitcher’s "failure" rate--.280—but on his "success" rate, which is .720:

.720	.735	1.286	1.387	.481

Because the pitcher is below average, this figure is lower than .500. Then we compare the hitter (.572) to the pitcher (.481):

.572	.481	.668	.464	.590

What we have done here is to combine the strength of the hitter and the strength of the pitcher into one number. We’re repeating the same process, over and over—and we’re not done yet. On the next line we compare the .500 norm to the .265 norm:

.500	.265	.500	.180	.735

Then we repeat the process one more time to compare the USL of the hitter vs. the pitcher (.590) to the strength level of .500 vs. the league norm (.735):

.590	.735	.720	1.387	.342

All put together, it looks like this:

Hitter Vs. League:	.325	.265	.241	.180	.572
Pitcher Vs. League:	.720	.735	1.286	1.387	.481
Hitter Vs. Pitcher at .500:	.572	.481	.668	.464	.590
.500 Vs. League Norm:	.500	.265	.500	.180	.735
Hitter Vs. Pitcher at League Norm:	.590	.735	.720	1.387	.342

A hitter who hits .325 facing a pitcher who allows an average of .280 in a league in which the overall average is .265 can be expected to hit .342. Start by figuring the strength of the hitter compared to .500, by the same formula as before:

Step 1 USL = .325 / (2 * .675)

That works out to .240 741.

Then figure an average hitter in this league (.265) against the league:

Step 1 USL = .265 / (2 * .735)

Which is .180 272.

Then compare the specific hitter (.240 741) against the league average (.180 272) by the additive method:

USL (Hitter vs. League) = (.240 741) / (.240 741 + .180 272)

Which is .571 813. This figure is over .500 because the hitter is better than the league average.

Next we do a similar thing with the pitcher. The pitcher allows a .280 batting average, which means that he gets out .720 of hitters. His Step 1 USL is 1.285 714.

Step 1 USL (Pitcher) = .720 / (2 * .280)

Which is 1.285 714.

Do the same for the league average from the pitcher standpoint:

Step 1 USL (Average Pitcher) = .735 / (2 * .265)

Which is 1.386 792. Then we compare the pitcher to the league average pitcher:

Pitcher USL = (1.285 714) / (1.285 714 + 1.386 792)

Which is .481. This figure is less than .500 because the pitcher is not as good as the league average at preventing hits.

That’s Line One of the box and Line Two. Line Three puts the hitter’s strength and the pitcher’s strength into one number (.590). Line Four compares the .500 norm to the .265 norm, and Line Five is a mathematical adjustment which moves the combined Hitter and Pitcher USL (.590) from a .500 context to a .265 context.
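The five-line box can be sketched as a small Python routine. This is a reconstruction from the description above, not Dallas Adams' own code; the names `line` and `five_lines` are invented for the illustration. Each line applies the same two formulas: convert both outcomes to USLs, then divide the first USL by the sum of the two.

```python
def usl(osl):
    """Underlying Skill Level for an Observed Skill Level."""
    return osl / (2.0 * (1.0 - osl))


def line(first, second):
    """One spreadsheet line: two outcomes, their USLs, and first-versus-second."""
    u1, u2 = usl(first), usl(second)
    return (first, second, u1, u2, u1 / (u1 + u2))


def five_lines(hitter_avg, avg_allowed, league_avg):
    """Build the five lines for a hitter, a pitcher and a league average."""
    hitter = line(hitter_avg, league_avg)                # hitter vs. league
    pitcher = line(1.0 - avg_allowed, 1.0 - league_avg)  # pitcher's out rate vs. league
    at_500 = line(hitter[4], pitcher[4])                 # hitter vs. pitcher at .500
    norm = line(0.5, league_avg)                         # .500 vs. league norm
    final = line(at_500[4], norm[4])                     # moved to the league norm
    return [hitter, pitcher, at_500, norm, final]


for row in five_lines(0.325, 0.280, 0.265):
    print("\t".join(f"{value:.3f}" for value in row))
```

The last value on the last line is the expected batting average, which for the .325 hitter, the .280 pitcher and the .265 league should come out at about .342.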

I wish I had Dallas Adams’ address so that I could ask him whether I have all of this right. I never actually understand the method except when I am using it, so every time I use it I have to re-construct it based on my understanding of it, and I’m always afraid that I may be completely wrong, but I think I am OK. Let’s do some real-life 2015 examples. In 2015 the American League batting leader, Miguel Cabrera, hit .338, while the pitcher who held hitters to the lowest average, Marco Estrada of Toronto, limited hitters to a .203 average, in a league in which the overall average was .255. Plug those numbers into the spreadsheet:

Cabrera Vs. League:	.338	.255	.255	.172	.598
Estrada Vs. League:	.797	.745	1.963	1.457	.574
Hitter Vs. Pitcher at .500:	.598	.574	.744	.673	.525

Cabrera is doing more to push the batting average UP from .255 than Estrada is doing to push it DOWN from .255, so the end-result batting average will be MORE THAN .255:

Cabrera Vs. League:	.338	.255	.255	.172	.598
Estrada Vs. League:	.797	.745	1.963	1.457	.574
Cabrera Vs. Estrada at .500:	.598	.574	.744	.673	.525
.500 Vs. League Norm:	.500	.255	.500	.172	.745
Cabrera Vs. Estrada at Lg Batting Average:	.525	.745	.553	1.457	.275

Against Estrada, Cabrera should hit .275, ignoring platoon differentials and park effects and those kinds of things. Let’s stay in the American League, but go for a lesser hitter, let’s say Lorenzo Cain, who hit a mere .307:

Cain Vs. League:	.307	.255	.221	.172	.563
Estrada Vs. League:	.797	.745	1.963	1.457	.574
Cain Vs. Estrada at .500:	.563	.574	.645	.673	.489
.500 Vs. League Norm:	.500	.255	.500	.172	.745
Cain Vs. Estrada at .255:	.489	.745	.479	1.457	.247

Cain pushes the league batting average up by 52 points; Estrada pushes it down by 52 points. However, since .255 is closer to .000 than it is to 1.000, it is more difficult to push the batting average down by 52 points than it is to push it up by 52 points, so Estrada appears to be slightly stronger in pushing the average downward than Cain is at pushing it upward, so the resulting expected average is .247. Suppose that, instead of Cain, we use Manny Machado, who hits .286:

Machado Vs. League:	.286	.255	.200	.172	.539
Estrada Vs. League:	.797	.745	1.963	1.457	.574
Machado Vs. Estrada at .500:	.539	.574	.584	.673	.464
.500 Vs. League Norm:	.500	.255	.500	.172	.745
Machado Vs. Estrada at .255:	.464	.745	.433	1.457	.229

Machado should hit about .229 against Estrada. Suppose that, instead of Machado, we used Steve Pearce, who hits .218:

Pearce Vs. League:	.218	.255	.139	.172	.448
Estrada Vs. League:	.797	.745	1.963	1.457	.574
Pearce Vs. Estrada at .500:	.448	.574	.406	.673	.376
.500 Vs. League Norm:	.500	.255	.500	.172	.745
Pearce Vs. Estrada at .255:	.376	.745	.301	1.457	.171

Pearce, who hit .218 overall, should hit about .171 vs. Estrada (understanding, of course, that Pearce in reality is probably a better hitter than that. But he probably hit about .171 against Estrada last season.) But suppose that, rather than Estrada, Pearce was facing (let’s say) Hisashi Iwakuma, against whom the league average was .240:

Pearce Vs. League:	.218	.255	.139	.172	.448
Iwakuma Vs. League:	.760	.745	1.581	1.457	.520
Pearce Vs. Iwakuma at .500:	.448	.520	.406	.542	.428
.500 Vs. League Norm:	.500	.255	.500	.172	.745
Pearce Vs. Iwakuma at .255:	.428	.745	.374	1.457	.204

Pearce should hit about .204 against Iwakuma. Suppose that Pearce was not facing Hisashi Iwakuma, but, let’s say, Kyle Lobstein, against whom the league batting average was .305:

Pearce Vs. League:	.218	.255	.139	.172	.448
Lobstein Vs. League:	.695	.745	1.141	1.457	.439
Pearce Vs. Lobstein at .500:	.448	.439	.406	.391	.509
.500 Vs. League Norm:	.500	.255	.500	.172	.745
Pearce Vs. Lobstein at .255:	.509	.745	.518	1.457	.262

Pearce should hit .262 against Lobstein. Lobstein is kind of an unfortunate name for a pitcher, isn’t it? And finally, let us suppose that Miguel Cabrera was facing Kyle Lobstein:

Cabrera Vs. League:	.338	.255	.255	.172	.598
Lobstein Vs. League:	.695	.745	1.141	1.457	.439
Cabrera Vs. Lobstein at .500:	.598	.439	.744	.391	.655
.500 Vs. League Norm:	.500	.255	.500	.172	.745
Cabrera Vs. Lobstein at .255:	.655	.745	.950	1.457	.395

This is impossible, as Cabrera and Lobstein were teammates, but then, it’s all theoretical, anyway. Miguel Cabrera against Kyle Lobstein would probably hit about .395. Lorenzo Cain against Kyle Lobstein would probably hit about .361, and Manny Machado against Kyle Lobstein would probably hit about .338.
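All of these matchups can be checked with a shorter calculation. The sketch below uses the Odds Ratio form that Tango Tiger describes in the comments, which works out algebraically to the same thing as the five-line box; the function names are made up for the example, and the league average is fixed at exactly .255. The results land within a point or two of the figures in the boxes, presumably because the spreadsheet carried more decimal places on its inputs (that is a guess, not something stated in the article).

```python
def odds(rate):
    """Convert a rate (hits per at bat) into odds (hits per out)."""
    return rate / (1.0 - rate)


def expected_avg(hitter_avg, avg_allowed, league_avg):
    """Odds Ratio form: oddsHitter * oddsPitcher / oddsLeague, back to a rate."""
    combined = odds(hitter_avg) * odds(avg_allowed) / odds(league_avg)
    return combined / (1.0 + combined)


LEAGUE = 0.255  # rounded 2015 American League average from the text
hitters = {"Cabrera": 0.338, "Cain": 0.307, "Machado": 0.286, "Pearce": 0.218}
pitchers = {"Estrada": 0.203, "Iwakuma": 0.240, "Lobstein": 0.305}

for hitter, hitter_avg in hitters.items():
    for pitcher, allowed in pitchers.items():
        result = expected_avg(hitter_avg, allowed, LEAGUE)
        print(f"{hitter} vs. {pitcher}: {result:.3f}")
```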

With this method, you can figure what ANY hitter should hit against any pitcher. You can go further than that; you can figure what Miguel Cabrera should hit against Aaron Brooks if the league average was .255, or if it was .265, or if it was .275. You can go further than that; if Miguel Cabrera played in one league and Aaron Brooks Brothers played in a DIFFERENT league, you can still put them together (in theory) in a league which hits .255, or .260, or whatever you want the league to hit.

You can go further than that; you can generalize the process to a wide array of questions. Suppose that

a) The Unemployment Rate among Black People is 14%,

b) The Unemployment Rate among Teenagers is 17%, and

c) The Unemployment Rate for the entire population is 6%.

What will be the unemployment rate among Black Teenagers? By this method, you can find the answer to that question—and your answer will in fact be correct if you have done the math correctly. It is a method which applies to a wide array of real-life questions.
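As a sketch of how that would go, the three rates simply take the places of the hitter, the pitcher and the league. The function name `combine` is hypothetical, and the calculation assumes the two group effects stack the same way a hitter and a pitcher do.

```python
def odds(rate):
    """Convert a rate into odds (for example, unemployed per employed)."""
    return rate / (1.0 - rate)


def combine(rate_a, rate_b, base_rate):
    """Combine two group rates measured against the same overall rate."""
    combined = odds(rate_a) * odds(rate_b) / odds(base_rate)
    return combined / (1.0 + combined)


# 14% and 17% group rates against a 6% overall rate, as in the example above
estimate = combine(0.14, 0.17, 0.06)
print(f"{estimate:.1%}")
```

With the three made-up rates above, the method puts the combined rate at roughly 34%.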

OK, the preliminaries are out of the way. Now let’s turn our attention to the specific question that the poster, who as I recall was a refugee from a RISK game, posted. Kansas University’s basketball team shoots three-point shots at a 43.9% clip (actually they’re at 47.2% after Tuesday’s game, but let’s not worry about that). On Saturday at 8:00 the Jayhawks will play Oregon State, a team which defends the three-point shot very well. Yakutsk says their opponents shoot only 28.4% from three-point range, and I’ll take his word for it. He also says that the overall NCAA percentage in this regard is 34.1%, and, again, I’m going to take his word for that.

So with this information, we would expect Kansas University’s basketball team to hit about 37.5%, based on their ability to shoot threes, contrasted with Oregon State’s ability to prevent opponents from hitting their threes. We would figure this as follows:

KU Offense vs. Avg Team:	.439	.341	.391	.259	.602
O State Defense vs. Avg:	.716	.659	1.261	.966	.566
KU vs. O State at .500	.602	.566	.756	.652	.537
.500 vs. NCAA Norm	.500	.341	.500	.259	.659
KU vs. O State at NCAA norm:	.537	.659	.580	.966	.375

Yakutsk argues, however, that this 37.5% is not likely to be predictive in real life, because offense is more dominant than defense in shooting threes. He argues, either because he believes this or merely to create a thesis, that offense is FOUR TIMES AS IMPORTANT as defense in this competition. The method implicitly assumes that they are equally important. The method, then, presumably reaches the wrong answer.

Well, OK, let’s assume that offense IS four times as important as defense in this part of the contest. If that was true, that would likely have an impact on THIS calculation, the one which is highlighted below (the .537 at the end of the third line):

KU Offense vs. Avg Team:	.439	.341	.391	.259	.602
O State Defense vs. Avg:	.716	.659	1.261	.966	.566
KU vs. O State at .500	.602	.566	.756	.652	.537
.500 vs. NCAA Norm	.500	.341	.500	.259	.659
KU vs. O State at NCAA norm:	.537	.659	.580	.966	.375

It is that Cell of the calculation—Cell C5—which implicitly assumes that offense and defense are equal. The formula for Cell C5 is

C5 = (C3 / ( C3 + C4))

If C3 is four times more important in this outcome than C4, then that formula would (presumably) change to:

C5 = ( (4 * C3 ) / (4 * C3 + C4 + 1.5))

Why does the denominator change from (C3 + C4) to (4 * C3 + C4 +1.5), rather than just to (4 * C3 + C4)? You will ask. Enquiring minds will want to know.

It has to. If you just divide (4 * C3) by (4 * C3 + C4), you get a series of absurd answers, leading ultimately to an obviously absurd conclusion. In this structure, when you divide (4 * C3) by (4 * C3 + C4), you are in effect assuming that Oregon State’s defense against the three has an underlying strength of C4 divided by four, which would be .163. You are assuming that three possessions in four, their defense against the three is .000. What you SHOULD assume is that the "missing data" is inert; in other words, that three possessions in four, their defense against the three-point shot is just average, with an underlying strength of .500, which is where the 1.5 (3 * .500) comes from.

So let’s go with that. That changes the calculation block to this:

KU Offense vs. Avg Team:	.439	.341	.391	.259	.602
O State Defense vs. Avg:	.716	.659	1.261	.966	.566
KU vs. O State at .500	.602	.566	.756	.652	.584
.500 vs. NCAA Norm	.500	.341	.500	.259	.659
KU vs. O State at NCAA norm:	.584	.659	.703	.966	.421

Which means that KU SHOULD shoot about 42.1% (on threes) against Oregon State’s defense, if we assume that Yakutsk is not too terribly Yakutski.
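Both versions of the calculation can be sketched together. The `weight` parameter below is an extrapolation made for this example: with `weight=1` the formula is the standard C3 / (C3 + C4), and with `weight=4` it becomes the (4 * C3) / (4 * C3 + C4 + 1.5) form above, on the assumption that the extra shares are padded with league-average (.500) defense. The function names are invented for the illustration.

```python
def usl(osl):
    """Underlying Skill Level for an Observed Skill Level."""
    return osl / (2.0 * (1.0 - osl))


def versus(first, second):
    """First-versus-second result from two observed levels."""
    u1, u2 = usl(first), usl(second)
    return u1 / (u1 + u2)


def expected_pct(offense, allowed, league, weight=1.0):
    """Expected shooting percentage; weight=1 is the standard method."""
    off_strength = versus(offense, league)              # offense vs. average
    def_strength = versus(1.0 - allowed, 1.0 - league)  # defense vs. average
    c3, c4 = usl(off_strength), usl(def_strength)
    # ASSUMPTION: pad the defense with (weight - 1) shares of average (.500)
    # underlying strength, which gives the 1.5 term when weight is 4.
    at_500 = weight * c3 / (weight * c3 + c4 + (weight - 1.0) * 0.5)
    norm = versus(0.5, league)                          # .500 vs. league norm
    return versus(at_500, norm)


print(round(expected_pct(0.439, 0.284, 0.341), 3))              # standard method
print(round(expected_pct(0.439, 0.284, 0.341, weight=4.0), 3))  # offense weighted 4x
```

With the KU and Oregon State inputs this should give about .375 for the standard method and about .421 for the weighted one, matching the two blocks above.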

EXCEPT.

Except that I am arguing that what Yakutsk says is impossible. Tango says that I am wrong (I think); he says that it IS possible. It’s not.

Why is it not possible?

Because it confuses the OBSERVED with the UNDERLYING skill level. This is what Yakutsk wrote:

What if the offense is more responsible than the defense? Say, for example, 3 pt % in college basketball. The offense is maybe 80% responsible, with the defense only 20%. For example: KU is playing Oregon St on Saturday. KU shoots 43.9% from 3pt land. OSU holds its opponents to 28.4%.

--Yakustk

The FIRST sentence, which is highlighted in green, implicitly refers to an UNDERLYING skill level. It makes no sense if it is interpreted as referring to OBSERVED skill level; it only makes sense if it refers to an UNDERLYING skill level. But the second sentence refers explicitly to an OBSERVED skill level. Oregon State holds opponents to a 28.4% shooting percentage from three point range. This is an observed skill level. Without intending to do so, Yakutsk has pulled a bait-and-switch on it, referring to OSU’s OBSERVED skill level as if it was an UNDERLYING skill level.

But if Yakutsk is correct in theorizing that defense is only 20% responsible for the effectiveness of the three-point shooting by the offense, then the OBSERVED skill level will bear little resemblance to the actual or underlying skill level.

If, in fact, OSU has an OBSERVED skill level (in this respect) of .566 (holding opponents to a 3-point percentage of 28.4%), and if it is true that the defense controls only 20% of the output, then it must be true that OSU’s TRUE skill level is dramatically better than it has been measured as being. But what is relevant here is not their UNDERLYING skill level; it is only their OBSERVED skill level.

So the formula would have to work, as it is and without adjustment, because the only thing that is relevant is the OBSERVED skill level.

This can be demonstrated with an experiment, only it is 3:00 in the morning and all of this crap is so complicated that I wasn’t able to make my experiment work the way I wanted it to. My experiment was supposed to work this way. I was going to set up an 8-team league, in which teams had the same UNDERLYING offensive and defensive skills, but the outcomes were determined 80% by the offensive component. There would be 56 "output numbers"—Team A against Team H and Team H against Team A, etc. I expected that:

(a) the average three-point shooting percentage could be established at .341,

(b) the OBSERVED three-point percentages would have a much larger standard deviation on offense than on defense, even though we knew that the underlying percentages were identical.

(b) did in fact happen as I had expected that it would; the standard deviation of three-point percentages on offense was more than three times higher than it was on defense. But I couldn’t make (a) work; I couldn’t make the observed three-point shooting percentage .341. I was doing something wrong; don’t know what, so I won’t print the chart of observed percentages.

Thank you for reading. I think I’ll make this a public article, just as a little joke, like the public is actually going to take the time to figure out what we are arguing about here. There’ll be a test on December 26, so I want you all to spend Christmas Day working on this.

COMMENTS (24 Comments, most recent shown first)

JimPertierra
Bill,
Let me look for Dallas' address. We got in touch again in about 2007 and I know he was in Northern California.

Will let you know.
Best/Jim
9:34 AM Dec 23rd

tangotiger
Yak: in order to figure out how much ballast you need for each (if it's even going to be different, which is your theory), you simply need to run a test to see where you get the least amount of error in predicting future outcomes.

If you have game by game data, then you can do it.
7:45 AM Dec 12th

Yakustk
Here is what I did after reading Tango's and Bill's earlier comments (not this article...I'll have to read that closely later tonight).

3 pt %---
Offense: .439
Defense: .281
Average: .341

1) Took the difference between offense and lg average, and multiplied by .2 (defense responsibility).

(.341-.439)*.2= -.0196

Added actual offense % with the adjustment:

.439 + (-.0196)= .419

Regressed offense 3 pt% = 41.9%

2) Took difference between defense and lg average, and multiplied it by .8 (offense responsibility).

(.341-.281)*.8 = .048

Added defense % with the adjustment:

.281 + .048 = .329

Regressed defense 3 pt %= .329

I then used those numbers for the standard formula:

.419 * .329 / .341

expected 3 pt % = 40.4%

My adjustment seems reasonable but I don't know enough about regression/ Gravity Principle to know if it comes into play here.

Other examples:

if Oregon St has lg average 3 pt defense, then KU's expected % will be 41.9% as opposed to their current 43.9%.

.419 * .341 / .341

Or if OSU allowed 50% shooting on 3pt attempts:

.419 * .373 / .341

expected shooting %= 45.8%

If OSU allowed 10% shooting:

.419 * .293 / .341 =

ex shooting %= 36%

9:11 PM Dec 11th
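The adjustment described in this comment can be written out as a short sketch. The names `regress` and `yak_estimate` are invented here, and the code simply follows the two steps in the comment: pull each side toward the league average by the share it does not control, then apply the multiplicative formula. It carries full precision, so the results can differ from the hand-rounded figures in the comment by about a tenth of a percentage point.

```python
def regress(rate, league, own_share):
    """Pull a rate toward the league average by the share it does NOT control."""
    return rate + (league - rate) * (1.0 - own_share)


def yak_estimate(offense, defense, league, offense_share=0.8):
    """Regress both sides, then apply offense * defense / league."""
    adj_offense = regress(offense, league, offense_share)
    adj_defense = regress(defense, league, 1.0 - offense_share)
    return adj_offense * adj_defense / league


# Offense .439, league .341, and the four defenses discussed in the comment
for defense in (0.281, 0.341, 0.500, 0.100):
    print(f"defense {defense:.3f}: {yak_estimate(0.439, defense, 0.341):.3f}")
```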

tangotiger
Re: StratOMatic

Log5 as I've shown is Odds Ratio Method, which is:
oddsHitter * oddsPitcher / oddsLeague
So, a .400 OBP v .300 OBP in a .320 league yields .378

A simple approximation is:
rateHitter * ratePitcher / rateLeague
So, a .400 OBP v .300 OBP in a .320 league yields .375

A similar approximation is:
rateHitter + ratePitcher - rateLeague
So, a .400 OBP v .300 OBP in a .320 league yields .380

Strat-O-Matic uses a form of the third version. What they would do is give the .400 OBP hitter a card of .480. Half the time, this hitter would have a .480 and the other half, he'd average out to .320, so that he comes out to .400. When he faced a .300 pitcher (who himself would have a .280 card), he'd be .480/2 + .280/2 = .380

3:20 PM Dec 11th
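The three versions in this comment, plus the card arithmetic, can be compared side by side in a short sketch. The function names are made up for the example, and `card` is only the arithmetic described in the comment, not a claim about how the game company actually builds its cards.

```python
def odds(rate):
    """Convert a rate into odds."""
    return rate / (1.0 - rate)


def odds_ratio(hitter, pitcher, league):
    """Log5 / Odds Ratio Method."""
    combined = odds(hitter) * odds(pitcher) / odds(league)
    return combined / (1.0 + combined)


def multiplicative(hitter, pitcher, league):
    """Simple approximation: rateHitter * ratePitcher / rateLeague."""
    return hitter * pitcher / league


def additive(hitter, pitcher, league):
    """Similar approximation: rateHitter + ratePitcher - rateLeague."""
    return hitter + pitcher - league


def card(rate, league):
    """Card value such that card/2 + league/2 equals the player's rate."""
    return 2.0 * rate - league


def card_matchup(hitter, pitcher, league):
    """Half the result from the hitter's card, half from the pitcher's."""
    return card(hitter, league) / 2.0 + card(pitcher, league) / 2.0


for method in (odds_ratio, multiplicative, additive, card_matchup):
    print(f"{method.__name__}: {method(0.400, 0.300, 0.320):.3f}")
```

The card version reduces to the additive formula, which is why the two agree at .380.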

bjames
"You probably could have saved a lot of time just by asking Strat-O-Matic how they figure up the cards." This would be assuming that Strat-O-Matic knows more about this issue than I do. I will assure you that they do not.
2:48 PM Dec 11th

tangotiger
To show how the odds ratio method works, and how it gives the identical results to log5.

You have a .600 team (.6 wins per .4 losses or 1.5 wins ratio per loss) facing a .400 team (.4 wins per .6 losses or 0.67 wins ratio per loss).

When a .600 team faces a .400 team, you would do:
1.5 / 0.67 = 2.25

That's 2.25 wins per 1 loss. To convert a ratio to a percentage:
2.25 / (2.25 + 1) = .692

Just like log5.
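A quick way to see the match is to compute both forms for a few pairings. The sketch below uses the usual closed form of log5 for two teams; the function names are invented for the example.

```python
def log5(a, b):
    """Chance that a team with win rate `a` beats a team with win rate `b`."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def odds_ratio_matchup(a, b):
    """Same question, worked through wins-per-loss ratios."""
    odds = (a / (1 - a)) / (b / (1 - b))  # e.g. 1.5 / 0.67 = 2.25
    return odds / (odds + 1)              # ratio back to a percentage


for a, b in [(0.600, 0.400), (0.600, 0.500), (0.540, 0.441)]:
    print(f"{a:.3f} v {b:.3f}: log5={log5(a, b):.4f}  odds ratio={odds_ratio_matchup(a, b):.4f}")
```

The two columns agree for every pairing, since the odds form is just the log5 fraction with numerator and denominator divided by the same quantity.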

Now, the power of the Odds Ratio form is that you can extend it to include other variables. Say you want to include the Home field advantage. That's a .540 record for the average team, or .54 wins per .46 losses, or 1.17 wins per loss.

A .6 v .4 team at home:
1.5 x 1.17 / .67 = 2.63 wins per 1 loss, or .725 win%

A .6 v .4 team on road:
1.5 / (.67 x 1.17) = 1.92 wins per 1 loss, or .658 win%
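The home and road cases can be sketched as one function with the home-field odds as an extra factor. The function name, the `site` argument and its three labels are invented for this example; the .540 home rate is the figure given above.

```python
def to_odds(p):
    """Wins per loss for a win rate `p`."""
    return p / (1 - p)


def win_prob(team, opponent, home_rate=0.540, site="neutral"):
    """Odds Ratio matchup, multiplied or divided by the home-field odds."""
    odds = to_odds(team) / to_odds(opponent)
    if site == "home":
        odds *= to_odds(home_rate)   # team gets the home-field boost
    elif site == "road":
        odds /= to_odds(home_rate)   # opponent gets it instead
    return odds / (odds + 1)


for site in ("home", "neutral", "road"):
    print(f"{site:8s} {win_prob(0.600, 0.400, site=site):.3f}")
```

The sketch carries full precision rather than the rounded 1.17 and .67, so its results may differ from the hand-worked figures in the third decimal place.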

***

You can also use it for things other than a .500 baseline, and include even more things like batter v pitcher to include home field and platoon advantage, etc. It's a bit more complex to account for the non-.500 baseline, but it flows right in once you see it.

2:16 PM Dec 11th

ghealey
Interesting discussion. I recently wrote a paper showing that log5 is a special case of a binary logit model which allows constrained logistic regression to be used to determine the appropriate coefficients for the offense, defense, and league. The approach also allows additional variables to be incorporated into the model. Many thanks to Bill and Dallas Adams for the log5 model that inspired this work. The paper is publicly available at

http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7069266

1:33 PM Dec 11th

jrickert
Many years ago I did a study looking at Milwaukee Brewers home games and needed their expected won-lost. After getting expected values based on Brewers home W-L vs. League Home W-L and Opponents Road W-L vs. League Road W-L I checked and found that against the teams the Brewers were expected to play .600 ball against they played roughly .600 ball. Against the .550 cohort they played roughly .550, and the good predictions of Log5 held through the 20 or so expected W-L cases that had enough games to make it worthwhile to check.
1:09 PM Dec 11th

jrickert
There was an Article in the February 2015 "Mathematics Magazine" called "The James Function" (written by Christopher N. B. Hammond, Warren P. Johnson, Steven J. Miller) which showed that this Log5 is the only function satisfying some reasonable criteria such as a .600 team beats a .500 team 60% of the time, etc.
1:05 PM Dec 11th

rgregory1956

I'm a lazy sabermetrician. I don't care about exactness as much as I care about "in the ballpark". That .600 teams play at a .594 clip when playing .500 teams is close enough verification of the theoretical for me to know that the premise is generally accurate. And frankly, I didn't want to deal with inter-league games (and the relative strengths of each league) and trying to figure which exactly is a .600 team or which exactly is a .500 team. What's the phrase: close enough for government work?

12:58 PM Dec 11th

MarisFan61
...I thought part of the idea was to see how many people wouldn't be left behind....

The thing for me about the "Underlying Skill Level" thing, Bill, is that it seems to me like literally the first time ever that I've come across a phrase you've used prominently that doesn't convey some intuitive linguistic meaning, and which you didn't explain.
12:56 PM Dec 11th

tangotiger
In this particular case (AL 2015), you needed a ballast of 22 or 23 W and L for each team to their "other" records, before applying the log5.

Not quite the 35-35 I expected, but still quite substantial and enough to make my point.
11:51 AM Dec 11th

tangotiger
As a for instance, and I just did this right now, I took all the head-to-head matchups of the 2015 AL teams.

For each head-to-head, I removed their actual head-to-head results, and looked at each team's OTHER W/L records (AL v AL only). For example, Angels v A's had an 11-8 record. I removed that from each of their records so that I have 66-57 (.537) for Angels and 46-77 for A's (.374). That difference is .163. I took all head to heads where the gap was at least .060 (and so a lopsided matchup). There were 47 such matchups.

The average dominant team had a win% of .540 while the average opponent team had a win% of .441. Log5 would predict .598.

The actual head-to-head result was 293-218, or a .573 win%. Which is very much in line with what I had expected the result to be.
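The two numbers being compared can be reproduced with a short sketch. It applies log5 to the group averages quoted above, which is a shortcut and an assumption on my part: running log5 matchup by matchup and then averaging would not necessarily give exactly the same figure.

```python
def log5(a, b):
    """Chance that a team with win rate `a` beats a team with win rate `b`."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


# Average "other games" win rates for the 47 lopsided matchups.
predicted = log5(0.540, 0.441)

# Actual head-to-head record of the dominant teams.
wins, losses = 293, 218
actual = wins / (wins + losses)

print(f"log5 prediction: {predicted:.3f}")
print(f"actual:          {actual:.3f}")
print(f"shortfall:       {predicted - actual:.3f}")
```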

Someone else can pick it up from here, and look at more years. I'm quite confident we'll see the pattern repeating itself.

Data here:
www.baseball-reference.com/leagues/AL/2015-standings.shtml
11:46 AM Dec 11th

shthar
You probably could have saved a lot of time just by asking Strat-O-Matic how they figure up the cards.
11:25 AM Dec 11th

tangotiger
Correction to this:
" a .600 team should win about 60% of the time against a .500 team"

It should say:
" an observed .600 team should have won about 60% of the time against an observed .500 team"

This nuance is actually quite important, as it pertains to Yak's original question.

You are taking ALL the data, INCLUDING the head-to-head games in question that you are trying to "predict". At the very least, you should remove the head-to-head matchups first, then look for .600 teams and .500 teams. You won't get quite the same results. It'll be closer to .570.

(You can't start with the .600 teams, then remove the head-to-head matchups. In that case, you'll still end up with .600 as the outcome.)
10:59 AM Dec 11th

rgregory1956

In theory, a .600 team should win about 60% of the time against a .500 team. How does it look in actual practice?

Starting with 1961, I looked for teams that played at a .600 clip, meaning either 97 or 98 wins in a 162-game schedule (adjusting for not playing 162, such as strike seasons). I matched them up with .500 teams from the same league-season (but I included 80-81 and 81-80 teams). I then looked at how the .600 teams did against their .500 counterparts. I found 26 such matched-sets. The .600 teams actually played at a .6018 rate (there were more 98 win teams than 97 win teams) and the .500 teams played at a .4993 clip (there were 5 more teams that went 80-81 than 81-80). Here are the results of the head-to-head matchups: the .600 teams won 59.4%.

10:49 AM Dec 11th

tangotiger
On a side note: In terms of win%, the amount of ballast is 35W and 35L for MLB teams (at any point in the season), 7-7 for NBA teams and 6-6 for NFL teams (this is true as of 2005, though things may have changed slightly since).

So, if you were to limit yourself to only a team's past observed W-L record, then you'd predict the future outcome between two teams by first applying the above ballast, and then using Log5.
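The ballast-then-Log5 step might look like the sketch below. The function names are invented, the 35-35 ballast is the MLB figure quoted above, and the 97-65 and 81-81 records are hypothetical teams chosen only to show the mechanics.

```python
def log5(a, b):
    """Chance that a team with win rate `a` beats a team with win rate `b`."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def regressed_rate(wins, losses, ballast):
    """Add `ballast` wins and `ballast` losses of .500 play to the observed record."""
    return (wins + ballast) / (wins + losses + 2 * ballast)


def predict(wins_a, losses_a, wins_b, losses_b, ballast=35):
    """Apply the ballast to both records first, then Log5."""
    return log5(regressed_rate(wins_a, losses_a, ballast),
                regressed_rate(wins_b, losses_b, ballast))


# Hypothetical 97-65 team against a hypothetical 81-81 team.
raw = log5(97 / 162, 81 / 162)
with_ballast = predict(97, 65, 81, 81)
print(f"raw records: {raw:.3f}   with 35-35 ballast: {with_ballast:.3f}")
```

The ballast pulls the stronger team's rate toward .500 before the matchup is computed, so the prediction with ballast comes out lower than the one from raw records.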

9:11 AM Dec 11th

tangotiger
Bill:

Actually, in light of your post regarding "ballast", this is exactly what "regression toward the mean" is.

And for batter BABIP, you would have less "ballast" than for pitcher BABIP. (Based on DIPS theory.)

Similarly, and this is yak's theory, offensive 3-pointers scored requires less ballast than defensive 3-pointers allowed. (A theory still to be proven.)

And therefore, you apply the ballast first, before you use Log5 in terms of predicting outcomes.
9:01 AM Dec 11th

bjames
So, what we would first do is regress their observed BABIP. The .320 observed hitter would be .310, while the observed .320 pitcher would be .305. The observed .280 hitter would be .290, while the observed .280 pitcher would be .295.

Log5 (or Odds Ratio) would then clearly show that the .320 observed hitter v .280 observed pitcher would PREDICT a higher outcome than .280 hitter v .320 pitcher.

At that point you would discover that you had spent hundreds of hours dealing with an entirely irrelevant issue.
8:50 AM Dec 11th

bjames

If we observe a player with a .400 OBP in 600 PA, in a league of .300 OBP, we have to presume that the TRUE talent level for that player should actually have resulted in a .380 OBP or so. That is, at the group level, when we observe a player above average, we have to assume (at the group level) that the player benefited from more good luck than bad luck. The point therefore is to first "regress toward the mean" the observed .400 down to .380. THEN, we can apply the Log5 (Odds Ratio Method).

No; this is NOT what should be done. You're introducing here an irrelevant issue--randomness--which has nothing whatsoever to do with Yakutsk's question or with my discussion of it. This is an irrelevant issue. I agree that this issue WOULD arise in a real-world effort to apply the issue, but it would NEVER arise relevant to Yakutsk' question.

I also see that I swapped ".450" for ".400" near the start of the article.
8:47 AM Dec 11th

tangotiger
Bill:

If we observe a player with a .400 OBP in 600 PA, in a league of .300 OBP, we have to presume that the TRUE talent level for that player should actually have resulted in a .380 OBP or so. That is, at the group level, when we observe a player above average, we have to assume (at the group level) that the player benefited from more good luck than bad luck. The point therefore is to first "regress toward the mean" the observed .400 down to .380. THEN, we can apply the Log5 (Odds Ratio Method).

As an example, let's say we have a batter's BABIP and a pitcher's BABIP. And you want to predict how an observed .320 batter would do against an observed .280 pitcher. And similarly, an observed .280 batter v an observed .320 pitcher.

Given what we know about DIPS, if we want to PREDICT the outcome, we have to therefore bet on the idea that the .320 batter v .280 pitcher will result in a higher outcome than a .280 batter v .320 pitcher.

RETROSPECTIVELY however, if we were to look back at these players and their actual matchups, then we would see the same outcome.

So, what we would first do is regress their observed BABIP. The .320 observed hitter would be .310, while the observed .320 pitcher would be .305. The observed .280 hitter would be .290, while the observed .280 pitcher would be .295.

Log5 (or Odds Ratio) would then clearly show that the .320 observed hitter v .280 observed pitcher would PREDICT a higher outcome than .280 hitter v .320 pitcher.
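A short sketch of that comparison, using the Odds Ratio form. The league BABIP of .300 is an assumption (it is the midpoint implied by the regressed figures above, not a number stated in this comment), and the function and variable names are invented for the example.

```python
def to_odds(rate):
    return rate / (1.0 - rate)


def from_odds(odds):
    return odds / (1.0 + odds)


def odds_ratio(hitter, pitcher, league):
    """Odds Ratio Method: oddsHitter * oddsPitcher / oddsLeague."""
    return from_odds(to_odds(hitter) * to_odds(pitcher) / to_odds(league))


LEAGUE_BABIP = 0.300  # assumed league rate, not given in the comment

# Observed rates: the two matchups are mirror images, so they come out the same.
raw_a = odds_ratio(0.320, 0.280, LEAGUE_BABIP)
raw_b = odds_ratio(0.280, 0.320, LEAGUE_BABIP)

# Regressed rates: hitters keep more of their gap than pitchers do.
pred_a = odds_ratio(0.310, 0.295, LEAGUE_BABIP)  # .320 hitter v .280 pitcher
pred_b = odds_ratio(0.290, 0.305, LEAGUE_BABIP)  # .280 hitter v .320 pitcher

print(f"observed rates:  {raw_a:.4f} v {raw_b:.4f}")
print(f"regressed rates: {pred_a:.4f} v {pred_b:.4f}")
```

With the observed rates the two matchups are identical, and only after the different amounts of regression does the first matchup predict the higher outcome.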

8:05 AM Dec 11th

MarisFan61
(Yes, your mistake about Yakutsk is that you spelled it RIGHT!
He spells it wrong.) :-)
2:46 AM Dec 11th

MarisFan61
I got lost right near the beginning, because I didn't grasp the meaning of "Underlying Skill Level." I do understand how you're using it -- i.e. its role in the stuff you're doing -- but I don't think you convey the concept of what it IS.

What is it?
2:45 AM Dec 11th

bjames
I see that at some point I also lost track of how to spell Yakustk' name. My apologies.
2:42 AM Dec 11th

COMMENTS (24 Comments, most recent shown first)
