
NFL Team Scoring Tendencies

November 15, 2008

 

 

            I have plowed this ground before, but bear with me for a few minutes.  Let us suppose that there are six NFL teams, and six games.

            In week one, Arizona beats Baltimore 13-7, Chicago beats Detroit 34-21, and El Paso beats Fort Worth 28-21.

            In week two, Arizona beats Detroit 23-13, Baltimore beats El Paso 21-17, and Chicago beats Fort Worth 42-24. 

            Standings:  Arizona 2-0, Chicago 2-0, Baltimore 1-1, El Paso 1-1, Detroit 0-2, Fort Worth 0-2. 

            We are assuming, for the sake of simplicity, that all the games are played on neutral fields.

            We can take these scores, and try to figure out how the teams rank.   We assume initially that every team has a value of 100.   When Arizona beats Baltimore by six, then:

            a) the total values for the two teams must total up to 200, and

            b)  Arizona gets six more points than Baltimore.

            Arizona gets 103, Baltimore 97.

            By this method Arizona has game scores of 103.0 (for the game against Baltimore) and 105.0 (for the ten-point win over Detroit).  That’s an average of 104.0. 

            The averages for the six teams at this point are:

Chicago      107.75
Arizona      104.00
El Paso      100.75
Baltimore     99.50
Detroit       94.25
Ft. Worth     93.75

            We then change these “output values” to the input values, and re-run the same process.    The Arizona/Baltimore game now becomes Arizona (104.0) against Baltimore (99.5).   That’s a total of 203.5.  In the second round, then, the scores must total up to 203.5, and Arizona must have six more than Baltimore.   That makes it Arizona, 104.75, vs Baltimore, 98.75.

            After the second round, the rankings of the six teams are:

Chicago      108.6
Arizona      104.4
Baltimore    100.4
El Paso       99.4
Detroit       94.3
Ft. Worth     92.8

            After the third round, the rankings become:

Chicago      108.8
Arizona      104.9
Baltimore    100.7
El Paso       98.8
Detroit       94.7
Ft. Worth     92.1

            After some very large number of cycles of re-calculation, the values become:

Chicago      108.8
Arizona      106.5
Baltimore    100.8
El Paso       97.2
Detroit       96.2
Ft. Worth     90.5

            After which you can re-calculate until hell freezes over, and the numbers are never going to change.  That’s the end point of the system.
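            The whole procedure can be sketched in a few lines of code (Python is my choice, not the article’s; the team names and the loop count are mine, but the arithmetic is exactly the two rules above — the pair’s combined rating is preserved, and the winner gets the margin):

```python
# Iterative ranking sketch for the six-team example.
GAMES = [  # (winner, loser, winner_points, loser_points)
    ("Arizona", "Baltimore", 13, 7),
    ("Chicago", "Detroit", 34, 21),
    ("El Paso", "Fort Worth", 28, 21),
    ("Arizona", "Detroit", 23, 13),
    ("Baltimore", "El Paso", 21, 17),
    ("Chicago", "Fort Worth", 42, 24),
]

def one_round(ratings):
    """One cycle: each team's new rating is the average of its game scores."""
    scores = {team: [] for team in ratings}
    for winner, loser, wp, lp in GAMES:
        combined = ratings[winner] + ratings[loser]  # (a) the pair's total is preserved
        margin = wp - lp                             # (b) the winner gets the margin
        scores[winner].append((combined + margin) / 2)
        scores[loser].append((combined - margin) / 2)
    return {team: sum(s) / len(s) for team, s in scores.items()}

ratings = {t: 100.0 for t in
           ["Arizona", "Baltimore", "Chicago", "Detroit", "El Paso", "Fort Worth"]}
for _ in range(200):  # re-run until the numbers stop moving
    ratings = one_round(ratings)
```

            One round of this reproduces the first table above (Arizona 104.00, Chicago 107.75), and a couple hundred rounds settle on the end-point values (Chicago 108.8, Ft. Worth 90.5, and so on).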

            OK, but there is a second question that can be attacked with the same data:  each team’s tendency to score and allow points.  Chicago has scored and allowed a combined 121 points in their two games; Arizona has scored and allowed 56.  These are the totals for the six teams:

Chicago      121
Ft. Worth    115
Detroit       91
El Paso       87
Baltimore     58
Arizona       56

If we’re going to predict the scores of games, we need to be able to predict the victor and the margin of victory, but we also need to predict how many points will be scored. 

            I had a notion that we could perhaps do this in this way.   An average NFL game this season has featured 44.51 points (44.0 in this data, but 44.51 in real life).    This can be seen as the product of the “scoring and allowing” tendencies of the two teams.   The square root of 44.51 is 6.67.    If each team has a “scoring and allowing” tendency of 6.67, the game figures to produce 44.51 points.

            Let us assume initially, then, that each team has an S&A figure of 6.67, in the same way that we initially assumed before that each team had a “quality” of 100.00.   When Arizona played Baltimore there were only 20 points scored in the game.   Looking at it from Arizona’s standpoint, if Baltimore’s S&A tendency is 6.67, that implies that Arizona’s must be 20 divided by 6.67, which would be three, or some number very close to it with more digits.   Same for Baltimore; their S&A derivative from the first game is also 3.00.

Arizona’s S&A derivatives from their first two games, then, are 2.999 (for the game against Baltimore, 13-7), and 5.397 (for the game against Detroit, 23-13).   The average of these two is 4.198.   These are the averages for the six teams:

Chicago      9.070
Ft. Worth    8.621
Detroit      6.822
El Paso      6.522
Baltimore    4.348
Arizona      4.198

In the second round of calculations, then, Arizona is assumed to have an S&A tendency of 4.198, and Baltimore of 4.348.  I thought perhaps that, by this method, each team’s tendency to score and allow points could be backed out of their games.   

It doesn’t work.   Actually, what happens is quite interesting:  at the end of the second round, every team goes exactly back to the norm, 6.67.  After the third round, they go back to the figures above, and after the fourth round, back to 6.67.   They never creep toward their actual individual S&A tendencies, as we intended, because they keep leaping over the target, first in one direction and then in the other.

How do we stop that from happening?

            We don’t allow them to leap over the goal.  We prevent that from happening by changing the “output score” of the calculation from the average S&A output score (AvgS&AOS) to this:

 

            (AvgS&AOS + 6.67) / 2

 

            In other words, we allow it to move halfway to where it is trying to go.  Chicago becomes not 9.070 but 7.87:

 

            (9.070 + 6.67)  /  2  =  7.87

 

            That works.   That was the second thing I tried, and it worked perfectly; it is unusual to get something like this to work that quickly, but it did.   After one round of calculations, the S&A tendencies appear to be:

Chicago      7.87
Ft. Worth    7.64
Detroit      6.75
El Paso      6.60
Baltimore    5.51
Arizona      5.43

When Arizona plays Baltimore, then, we assume (from Arizona’s standpoint) that Baltimore has an S&A tendency of 5.51.  There are 20 points scored in the game, so Arizona’s S&A tendency must be that number which, when multiplied by 5.51, produces 20.    That is 20 divided by 5.51, which is 3.63.  

After two rounds of calculations, we get these S&A values for each team:

Chicago      7.18
Ft. Worth    7.10
Detroit      6.71
El Paso      6.63
Baltimore    5.97
Arizona      5.91

            And after a large number of cycles of re-calculation, we get these:

Chicago      7.41
Ft. Worth    7.28
Detroit      6.72
El Paso      6.62
Baltimore    5.82
Arizona      5.76

            Not that large a number of cycles of re-calculation, actually; this system zeroes in on its destination targets much more rapidly than the other system.  As in the ranking method, the system at some point stops moving, and delivers output values from each cycle which are the same as the input values, but this happens much more quickly in this system.  I suspect it happens more rapidly because the technique of allowing the system to move only halfway to its goal stabilizes the data more rapidly, and it might actually be adapted into the original system, but that’s a question for another time.
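            The damped iteration can be sketched the same way (again in Python, my choice; this assumes the halfway rule is applied once per cycle to each team’s average output score, which is my reading of the method, so the later decimal places may not match the article’s spreadsheet exactly):

```python
# S&A iteration sketch: each game implies opponent_SA = total_points / own_SA;
# a team's raw output is the average of its per-game implications, and the
# published value moves only halfway from the league norm toward that output.
NORM = 6.67  # square root of the points in an average game

GAMES = [  # (team_a, team_b, total_points_in_game)
    ("Arizona", "Baltimore", 20), ("Chicago", "Detroit", 55),
    ("El Paso", "Fort Worth", 49), ("Arizona", "Detroit", 36),
    ("Baltimore", "El Paso", 38), ("Chicago", "Fort Worth", 66),
]

def one_round(sa):
    implied = {t: [] for t in sa}
    for a, b, total in GAMES:
        implied[a].append(total / sa[b])  # a's S&A implied by this game
        implied[b].append(total / sa[a])
    # move only halfway toward the average output, so it can't leap past the goal
    return {t: (sum(v) / len(v) + NORM) / 2 for t, v in implied.items()}

sa = {t: NORM for t in ["Arizona", "Baltimore", "Chicago",
                        "Detroit", "El Paso", "Fort Worth"]}
for _ in range(100):
    sa = one_round(sa)
```

            The first cycle reproduces the figures above (Chicago 7.87, Arizona 5.43), and the iteration settles down instead of oscillating, with Chicago highest and Arizona lowest at the end point.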

            Anyway, if Arizona has an S&A tendency of 5.76 and Baltimore has an S&A tendency of 5.82, we would expect that, when the two teams meet, the number of points in the game would be 5.76 * 5.82, or 34 (33.52).   When Chicago plays Ft. Worth, we would expect that the number of points in the game would be 7.41 * 7.28, or 54. 

            Let us suppose that Chicago played El Paso, which they didn’t in this phony data sample.   But if they did, we would predict:

            1)  that Chicago would win by 12 points (11.6), and

            2)  that 49 points would be scored in the game.

            If 49 points are scored in the game and Chicago wins by 11.6, we would thus predict that Chicago would win 30-19--actually 30.3 to 18.7, but for obvious reasons we round it off.
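            Folding the two systems together, the prediction itself is simple arithmetic (a sketch; the function name is mine):

```python
def predict_score(rating_a, rating_b, sa_a, sa_b):
    """Predicted score: margin from the quality ratings, total from the S&A product."""
    margin = rating_a - rating_b  # e.g. 108.8 - 97.2 = 11.6
    total = sa_a * sa_b           # e.g. 7.41 * 6.62 = 49.1
    return round((total + margin) / 2), round((total - margin) / 2)

# The hypothetical Chicago/El Paso game:
predict_score(108.8, 97.2, 7.41, 6.62)  # -> (30, 19)
```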

            There are a couple of technical issues here that I will come back to, but these are the S&A tendencies of the 32 NFL teams, based on this season’s data:

Team            Conf   S&A
Denver          A      7.08
Arizona         N      7.05
New Orleans     N      7.02
Houston         A      7.01
San Diego       A      7.00
San Francisco   N      6.96
NY Jets         A      6.91
Green Bay       N      6.90
Dallas          N      6.86
Chicago         N      6.84
Philadelphia    N      6.84
Detroit         N      6.82
Minnesota       N      6.80
NY Giants       N      6.79
St. Louis       N      6.72
Seattle         N      6.67
Indianapolis    A      6.60
Kansas City     A      6.60
Atlanta         N      6.58
Jacksonville    A      6.58
Buffalo         A      6.56
Miami           A      6.52
Cleveland       A      6.48
Baltimore       A      6.45
Cincinnati      A      6.45
New England     A      6.38
Tampa Bay       N      6.36
Washington      N      6.32
Pittsburgh      A      6.31
Tennessee       A      6.30
Oakland         A      6.27
Carolina        N      6.23

            A game between Denver and Arizona would figure to generate 50 points (7.08 * 7.05).  A game between Oakland and Carolina would figure to generate 39 points. 

            A lot of these numbers were surprising to me, by the way.  Who knew that Oakland was one of the NFL’s most conservative, close-to-the-vest teams, or that San Diego and San Francisco were so high on the wild and crazy list?   Anyway, we are now in a position to predict the scores of NFL games, based on the data.  For this week:

Oakland        13    Miami           28
Detroit        10    Carolina        32
Chicago        23    Green Bay       24
New Orleans    26    Kansas City     20
Baltimore      20    Giants          24
Philadelphia   28    Cincinnati      16
Minnesota      17    Tampa Bay       26
Houston        18    Indianapolis    28
Denver         17    Atlanta         29
St. Louis      18    San Francisco   29
Arizona        28    Seattle         20
Tennessee      25    Jacksonville    17
San Diego      17    Pittsburgh      27
Dallas         21    Washington      23
Cleveland      21    Buffalo         22

            The margins here are a little different than they were in my earlier predictions for the week, because I was still following the earlier announced policy of artificially reducing the margin when the game didn’t figure to be close.  

            A couple of things I was concerned about with this method.   You may remember that, in the “ranking” system, it makes no difference whatsoever what initial values are entered for each team, so long as the numbers entered average out to 100.   If you enter initial values of 300.00 for Detroit, zero for Tennessee and zero for the Giants, it makes no difference; you get the same output as if you enter them all initially at 100.  That’s because the final output is a product of the scores of the games, rather than being in any way related to the initial assumptions.
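            This is easy to check with the six-team example (a sketch; reducing each game to its margin of victory is my shorthand):  start Arizona at 300 and the other five teams at 60, which still averages 100, and the iteration lands on exactly the same end point as when everyone starts at 100.

```python
# The six-game schedule, reduced to (winner, loser, margin of victory).
GAMES = [("Arizona", "Baltimore", 6), ("Chicago", "Detroit", 13),
         ("El Paso", "Fort Worth", 7), ("Arizona", "Detroit", 10),
         ("Baltimore", "El Paso", 4), ("Chicago", "Fort Worth", 18)]

def one_round(ratings):
    """One cycle of the ranking system described earlier in the article."""
    scores = {t: [] for t in ratings}
    for winner, loser, margin in GAMES:
        combined = ratings[winner] + ratings[loser]
        scores[winner].append((combined + margin) / 2)
        scores[loser].append((combined - margin) / 2)
    return {t: sum(s) / len(s) for t, s in scores.items()}

lopsided = {"Arizona": 300.0, "Baltimore": 60.0, "Chicago": 60.0,
            "Detroit": 60.0, "El Paso": 60.0, "Fort Worth": 60.0}
uniform = {t: 100.0 for t in lopsided}
for _ in range(300):
    lopsided = one_round(lopsided)
    uniform = one_round(uniform)
# Both starting points converge to the same ratings (Chicago 108.8, etc.).
```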

            So, I wondered, would this be true here, as well?

            Actually, it is MORE true; in this case you don’t even have to get the average right.  If you enter initial values of 2500, -6329, .000001, and -12.74, within a few cycles you get the exact same values as if you had started everybody out at 6.67.   The only thing you can’t do is enter an initial value of zero, since that causes the system to try to divide by zero, which of course doesn’t work.   Otherwise, the end-point values are entirely unrelated to the starting point values. 

            OK, but there is a related problem.  You remember that, to calculate the output, we stabilized the data by adding 6.67 to each team’s average and dividing by two.  I wondered:  does this tend to push the teams back toward 6.67, thus creating an artificially low standard deviation for the S&A tendency scores?

            It doesn’t, and here I really don’t understand what is happening in the process; maybe you can explain it to me, or maybe I’ll figure it out.  I tested the system by adding to each team in each calculation cycle not 6.67, but the arbitrary number 8.00.

            If you had asked me to guess what that would do, my guesses would have been:

            1)  that it might cause the output figures to cluster around 8.00, rather than 6.67,

            2)  that it might cause the entire system to go haywire, and never stabilize, and

            3)  that it might have no effect whatsoever.

            All wrong.  What happens is that it DOES affect the data, but only to some very small extent.  When you add an arbitrary 8 (a number picked at random) rather than 6.67 (the square root of the points scored in an average NFL game), the output numbers DO change, but not meaningfully.  They go down by about 1%.   The teams wind up in exactly the same order with essentially the same numbers, but the lowest numbers go down by slightly less than 1.01%, and the highest by a little less than 1.4%.  

            That doesn’t make any sense to me; I don’t understand why that would happen.   But in any case, apparently adding some out-of-range, arbitrary number does NOT cause the system to meaningfully malfunction—therefore, I think we can conclude that, in this case as well, the output numbers are the result simply of what is in the data, rather than being influenced by any input assumption.   So, until I can figure out why that happens, it seems to me that I don’t need to worry about it.

 

 
 

COMMENTS (7 Comments, most recent shown first)

Trailbzr
At the risk of making more small mistakes, I'll contribute additional gibberish to the discussion. It looks like there's a problem with Bill's spreadsheet in the theoretical example of SA factors. Using the initial method, after one round of calculations, the theoretical league is Chicago 9.070 ... Arizona 4.198. Then Bill says the whole thing cycles back to the initial values of 6.67.????????

Let's take Arizona, who played Baltimore (4.348) to 20 points and Detroit (6.822) to 36 points. So isn't their second round rating the average of 20/4.348=4.600 and 36/6.822=5.277 (=4.938)?

8:02 PM Nov 18th
 
Trailbzr
We're trying to estimate total points scored in games using the SA-factors, right?
Looking at the theoretical example in the A-F league, teams CFDEBA have factors (7.41, 7.28, 6.72, 6.62, 5.82, 5.76). That is the right place to start?
Chicago (7.41) played Detroit (6.72) and FtWorth (7.28). So we should expect Chicago games to score 7.41x6.72=49.8 and 7.41x7.28=53.9; the average of those numbers is 51.9. Chicago's games actually averaged (34+21+42+24)/2 = 60.5.
Similarly, Arizona's games should score 5.76x5.82=33.4 and 5.76x6.72=38.7, the average of which is 36, against an observed average of (20+36)/2=28.
So Chicago games average 60.5 and the factors expect 52 (which is about the mean of 60.5 and 44); while Arizona's average 28 against expected 36 (the mean of 44 and 28).

This is not to imply that including a stabilizing factor is wrong when making predictions for the future based on only two games. I put the actual NFL power ratings from 11/11 and 10/01 into a spreadsheet (yesterday was cold and rainy where I live) and the 11/11 ratings were on average 62% the 10/01 ratings and 38% a stablizing factor of 100.
6:01 AM Nov 17th
 
bjames
Trailblazer. . ..sorry about the tone of the earlier comment. I'd edit it if I could figure out how. I see why you were focusing on Chicago. . .from the theoretical example, rather than the actual data. But. .you're still wrong about the inference.
2:58 AM Nov 17th
 
bjames
I think Clark is also wrong. . not really sure what he is saying, either. The problem he points to is an obvious one, but it doesn't seem to predict the actual consequences. If what was happening was what Clark thinks is happening, the S&A team numbers would drop, on average, from 6.67 to 5.53. In fact, they drop not by 1.14, but by .01. It is difficult to see how this would be predicted by his theory.
2:43 AM Nov 17th
 
bjames
Trailbzr's analysis is obviously wrong, although I suspect that he may have a valid point there if we could just focus on it. I've tried to figure out exactly what he is wrong about, but frankly I am so baffled by his collection of small mistakes and indecipherable gibberish that I am unable to decode what he is trying to say. For one thing, I don't know why he references Chicago, a team in the middle of the charts, as if they were at one end of the chart or the other. The claim that the inclusion of 6.67 as a stabilizing element reduces the standard deviation of the scores by 50% is clearly and absolutely incorrect, if in fact that is what he is trying to say.
2:38 AM Nov 17th
 
clarkshu
By plugging in an 8 you're assuming that the average points per game is 64. This makes all the teams "below average", leading to lower S&A numbers and lower predicted scores. By this stage of the season, we have enough games so that you can get good results by figuring out the actual average and then using that value.
1:36 PM Nov 15th
 
Trailbzr
This can be solved directly using two linear regression, where each row is a game and each column is a team.
In the point-differential regression, the matrix is:
__1_-1__0__0__0__0_=6
__0__0__1_-1__0__0_=13
...
__0__0__1__0__0_-1_=18
This produces +/- of +1.92, -3.74, +4.26, -8.41, -7.41, -14.08.
When normalized to average 100, this is the same as 108.6...90.5.

The scoring/allowing matrix is set up with +1 for each playing team, and the dependent variable is the LOGARITHM of points scored (by both teams) in the game. Then their factor is the exponent of the result.

The above method that averages 6.67 into each S&A for each game is, I'm pretty sure, keeping the scoring predictions halfway between 44 and the actual results. Arizona's games are predicted to average 36, when they actually averaged 28; while Chicago's are predicted to average 52 instead of 60.5.

I expect I've made a typo, so errata should appear below at some point.

12:12 PM Nov 15th
 
 