In basketball, the number of points scored by a team in a game takes a different value from night to night. For example, if you plot the number of times the road team scored <65 pts in a game, 65-70 pts in a game, 70-75 pts, ..., 140-145 pts in a game in 1994-95, you end up with a distribution that looks like a bell curve, with a peak near the mean of 100 ppg and small tails on either end. This is approximately a Gaussian distribution, as statisticians formally call the bell curve. You can do the same thing for the home team scores from 1994-95 and end up with a similar Gaussian distribution, but its peak is near a mean of about 103 ppg and the distribution is spread out more.

If you have a browser that can handle beta applets (like Netscape 2.0b2 or 2.0b3), you can see these Bell Curves in the plots below (applet produced by Sun):

Knowing these distributions provides us with valuable predictive information. Namely, how much these two distributions overlap indicates something about how frequently the home or road team wins a game. In other words, we can estimate the home court advantage based on the above distributions. This comes about because a win by the home team means only that the home team's score is greater than the road team's score. So, if you pick a random point in the home team's point distribution, what is the chance that it is greater than a random point in the road team's point distribution? The answer gives you an estimate of the home court advantage.
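The "pick a random point from each distribution" idea can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the estimates in this article: it assumes the two scores are independent Gaussians, the function names are made up for the example, and the standard deviations (12 and 11) are placeholder values because only the means of about 103 and 100 are given above.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def home_win_prob(home_mean, home_sd, road_mean, road_sd):
    """P(home score > road score) for two independent Gaussian scores.

    The difference of two independent Gaussians is itself Gaussian, with
    mean (home_mean - road_mean) and variance home_sd**2 + road_sd**2.
    """
    diff_sd = math.sqrt(home_sd ** 2 + road_sd ** 2)
    return normal_cdf((home_mean - road_mean) / diff_sd)


def home_win_prob_sampled(home_mean, home_sd, road_mean, road_sd,
                          n=200_000, seed=0):
    """Same quantity, estimated by literally picking random points."""
    rng = random.Random(seed)
    wins = sum(
        rng.gauss(home_mean, home_sd) > rng.gauss(road_mean, road_sd)
        for _ in range(n)
    )
    return wins / n


# Means are from the text; the standard deviations are ASSUMED placeholders.
exact = home_win_prob(103.0, 12.0, 100.0, 11.0)
sampled = home_win_prob_sampled(103.0, 12.0, 100.0, 11.0)
print(f"closed form: {exact:.3f}   sampled: {sampled:.3f}")
```

The two numbers should agree to within sampling noise. The closed form is the quicker route; the sampled version is there only because it mirrors the wording of the paragraph above.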

Actually, this is only mostly true. You can believe it as stated if you want to avoid a little statistical complication that I'll describe in this paragraph; most of the implications for strategy and prediction are understandable from the simple picture alone. The complication that arises in basketball (and all sports, I'm sure) is that the number of points scored by a team (home or road) is correlated with the number of points allowed by that team. In other words, teams play up or down to their competition. Every team does it to some degree and some definitely more than others. This also comes about because of "garbage time", which allows a team to get close without changing the winner of the game. What this means for the analogy in the previous paragraph is that if you pick points in the home and road point distributions with some correlation, you estimate the home court advantage a little better than if you pick the points independently. This correlation is relatively small (but not insignificant), which is why the previous analogy works.
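One way to write the correlated version down, assuming the home score H and the road score R are jointly Gaussian (an assumption added here; the symbols are illustrative and do not come from the original analysis):

$$
P(H > R) = \Phi\!\left(\frac{\mu_H - \mu_R}{\sqrt{\sigma_H^2 + \sigma_R^2 - 2\,\mathrm{Cov}(H, R)}}\right)
$$

Here $\Phi$ is the standard normal cumulative distribution function, $\mu$ and $\sigma$ are the means and standard deviations of the two point distributions, and $\mathrm{Cov}(H, R)$ is the covariance between them. Setting the covariance to zero recovers the case of picking the two points independently. A positive covariance shrinks the denominator, which moves the estimate away from 0.500 in whichever direction the difference in means already points.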

The result of applying this method to last season's point distributions for the home and road teams is a predicted home court winning percentage of 0.587. The actual home court winning percentage was 0.597. The estimate is off by 0.010, or one percentage point, which is pretty good.

We already know the home court advantage, though. This isn't a prediction, but a confirmation that the method is accurate. The predictive ability comes when applying the method to individual basketball teams, like the Suns. The Phoenix Suns won 59 games in 1994-95, but if you look at their offensive and defensive point distributions, they were estimated to win only 52. Such a large difference means Phoenix got a bit lucky to win those other 7 games. It's normal for teams to have a difference of up to about four or five games, but seven is pretty large. Part of the Suns' difference is the fact that they were 4-1 in games decided by two points or less and 8-2 in games decided by three points or less, which are usually games that involve a bit of luck. What this implies for the 1995-96 season is that the Suns won't win as many as 59 games again. It's rare that teams get that lucky two seasons in a row. Through games of 11/27/95, the Suns are at a winning % of 0.500, well below last season's winning % of 0.720. The only other team with as large a difference between estimated win% and actual win% last season was the Lakers, who are also doing worse than they did last season.

A very important aspect of this estimation method is that it incorporates the variability in a team's scoring. A team that is inconsistent in how much it scores or how much it allows has a larger standard deviation (to use statspeak) than a team that is consistent. The spread of the bell curve is larger for an inconsistent team than for a consistent team. By spreading out the distributions, you increase the amount of overlap of the points scored and points allowed, which reflects how often the team wins. What this means for good teams is that they win less often than their averages say they should. What this means for bad teams is that they win more often than their averages say they should. In other words, being more inconsistent brings a team toward 0.500, toward mediocrity. A consistent team that averages 106 points offensively and 103 points defensively wins more often than an inconsistent team with the same offensive and defensive averages. At the extreme, the ultimate consistent team scores 106 points every game and allows 103 points every game and, of course, wins every game. On the other end, a consistent team that averages 103 points offensively and 106 points defensively wins less often than an inconsistent team with the same averages.
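The pull toward 0.500 can be seen numerically with a short Python sketch. It assumes points scored and points allowed are independent Gaussians, uses the 106/103 and 103/106 averages from the paragraph above, and sweeps over standard deviations that are invented for illustration. The function name `win_prob` and the 82-game conversion are likewise just conveniences of this example.

```python
import math


def win_prob(off_mean, def_mean, off_sd, def_sd):
    """P(points scored > points allowed), assuming independent Gaussians.

    Both standard deviations must be positive.
    """
    diff_sd = math.sqrt(off_sd ** 2 + def_sd ** 2)
    z = (off_mean - def_mean) / diff_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


GAMES = 82  # length of an NBA regular season

# The standard deviations below are ASSUMED values, not measured ones.
for sd in (4.0, 8.0, 12.0, 16.0):
    good = win_prob(106.0, 103.0, sd, sd)  # outscores opponents on average
    bad = win_prob(103.0, 106.0, sd, sd)   # is outscored on average
    print(f"sd={sd:4.1f}  good team: {good:.3f} ({GAMES * good:4.1f} wins)"
          f"  bad team: {bad:.3f} ({GAMES * bad:4.1f} wins)")
```

As the standard deviation grows, the good team's winning percentage falls toward 0.500 and the bad team's rises toward it, which is the point made in the text.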

You can see this point graphically in the interactive plots shown below. The amount of overlap of the blue curve, representing the distribution of points *scored*, with the green curve, representing the distribution of points *allowed*, gives a graphical view of how often the team wins. Leave one plot at its default values, but play around with the standard deviations in the other plot. You will see that the overlap goes down when you lower the standard deviations, meaning there are fewer games in which points allowed exceed points scored. (Note that the winning percentages shown are approximate at this point. I am developing the code further to use a better integration method.) There are other ways of affecting the team's winning percentage, of course. Obviously, if you change the means from 102 and 96 to, say, 96 and 102, you will then go from a winning team to a losing team. The second way is to vary the covariance (which is supposed to be in red, but isn't in my browser). Increasing the covariance effectively makes a team more consistent, whether that be in losing or winning. (**NOTE:** Do not increase the covariance to greater than half the sum of the offensive and defensive variances or the winning % is wrong -- another bug I have to fix.)
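For readers without the applet, the effect of the covariance can be sketched in Python. This is not the applet's code: it assumes points scored and allowed are jointly Gaussian, uses the default means of 102 and 96 mentioned above, and plugs in variances of 100 that are made up for the example. In this sketch, the limit described in the note is exactly where the variance of the scoring margin would stop being positive, so the hypothetical function `win_prob_cov` rejects such inputs.

```python
import math


def win_prob_cov(off_mean, def_mean, off_var, def_var, cov=0.0):
    """P(points scored > points allowed) for jointly Gaussian scores.

    The scoring margin has variance off_var + def_var - 2 * cov, which
    must stay positive for the formula to make sense.
    """
    margin_var = off_var + def_var - 2.0 * cov
    if margin_var <= 0.0:
        raise ValueError(
            "covariance must be less than half the sum of the variances"
        )
    z = (off_mean - def_mean) / math.sqrt(margin_var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Means are the defaults quoted in the text; variances are ASSUMED values.
for cov in (0.0, 20.0, 40.0, 60.0):
    winner = win_prob_cov(102.0, 96.0, 100.0, 100.0, cov)
    loser = win_prob_cov(96.0, 102.0, 100.0, 100.0, cov)
    print(f"cov={cov:5.1f}  winning team: {winner:.3f}  losing team: {loser:.3f}")
```

Raising the covariance narrows the spread of the margin, so the winning team wins more often and the losing team loses more often, matching the description of a more consistent team.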