Toying With 'Em

Dateline: 05/20/97

The way Chicago toyed with Miami tonight, I realized that I should finally sit down and figure out how to account for the fact that Chicago does play down to their opponents. It makes their numbers look a little worse than they would be if they were tested, as in the playoffs.

This concept also applies to individuals like Michael Jordan who can seriously turn it up when it counts. Ultimately, I'll extend the method to individuals, but right now, we'll just look at teams.

Below are the results of the method. Shown on the left are the "Uncorrelated Ratings", which remove the "correlation", or the factor associated with teams playing up or down to opponents. Shown on the right are the "Normal Ratings", or the actual number of points per 100 possessions scored and allowed by the teams. Notice, in particular, that Chicago was 2nd offensively and 4th defensively with the "Normal Ratings", but when you realize that the Bulls typically only play in low gear, the Bulls actually have both the best offense and the best defense in the NBA.

NBA Points Per 100 Possessions Ratings, 1997

                   Uncorrelated                    Normal
               Off.         Def.            Off.         Def.
Team         Rating Rank  Rating Rank     Rating Rank  Rating Rank
Atlanta       107.1    9    99.2    4      106.2    9   100.1    3
Boston        100.8   26   111.1   28      102.2   26   109.7   26
Charlotte     109.3    4   106.5   18      108.9    4   106.9   22
Chicago       114.5    1    97.8    1      112.0    2   100.3    4
Cleveland     103.4   18   100.3    5      102.9   20   100.8    5
Dallas         98.4   28   107.4   22       99.4   28   106.4   18
Denver        100.9   25   110.5   26      102.4   25   109.0   24
Detroit       109.0    5   102.4   11      108.7    5   102.7   11
Golden St.    105.0   13   110.3   24      105.2   12   110.1   28
Houston       107.9    6   101.3   10      107.0    7   102.2    9
Indiana       104.3   14   103.0   12      104.2   15   103.1   12
LA Clippers   102.9   21   106.2   16      103.3   18   105.8   16
LA Lakers     107.2    8   101.2    8      106.5    8   101.9    8
Miami         106.1   11    98.1    2      105.2   11    99.1    1
Milwaukee     104.1   15   107.3   21      104.7   14   106.8   19
Minnesota     103.3   19   105.4   15      103.6   17   105.2   15
New Jersey    101.6   24   108.6   23      102.7   23   107.5   23
New York      103.4   17    98.9    3      102.8   21    99.4    2
Orlando       103.6   16   104.1   14      103.7   16   104.0   14
Philadelphia  101.8   23   110.4   25      102.7   22   109.5   25
Phoenix       107.6    7   106.8   19      107.5    6   106.8   20
Portland      106.3   10   101.2    9      106.0   10   101.5    7
Sacramento    102.9   20   107.2   20      103.2   19   106.9   21
San Antonio    99.5   27   112.1   29      101.4   27   110.2   29
Seattle       109.6    3   100.5    6      109.2    3   100.9    6
Toronto       102.4   22   106.2   17      102.7   24   106.0   17
Utah          113.6    2   101.2    7      112.2    1   102.6   10
Vancouver      97.6   29   110.9   27       98.6   29   109.9   27
Washington    105.2   12   103.0   13      105.0   13   103.3   13
Average       104.8        104.8           104.8        104.8

The effect of this is to make the good teams look better and the bad teams look worse. Perhaps you don't like this. Tough. There is a theory behind it, which I will explain below. If you have an argument with the theory, let me know. But just arguing with the results will get you nowhere.

The Theory Behind The Numbers

Before I explain, I will first emphasize that the theory is independent of the results. In other words, I didn't create this formula in order to arrive at the result that the Bulls have the best offense and best defense in the league, even though that is what I expected. That result came out of the theory, not vice versa.

So what is the method? If you are mathematically inclined, you are probably impatient, so I present the formula first:


               (R-1)(ORtg - DRtg)
ORtg! = ORtg + ------------------
                        2
               (R-1)(ORtg - DRtg)
DRtg! = DRtg - ------------------
                        2

where:
ORtg!:	New (or Uncorrelated) Offensive Rating
ORtg:	(Normal) Offensive Rating, not accounting for correlation
DRtg!:	New (or Uncorrelated) Defensive Rating
DRtg:	(Normal) Defensive Rating, not accounting for correlation

                  SQRT[Var(ORtg) + Var(DRtg)]
    R = -----------------------------------------------
        SQRT[Var(ORtg) + Var(DRtg) - 2 Cov(ORtg, DRtg)]

SQRT[]:	Square root of the quantity in []
Var():	Statistical variance from game-to-game of the quantity in ()
		Statistical variance = standard deviation squared
Cov():	Statistical covariance between the two quantities in ()
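
For readers who would rather see this as code, here is a minimal Python sketch of the formula, assuming you have a team's game-by-game offensive and defensive ratings in two lists. The function names and the five games of ratings are made up for illustration; they are not from my data. Note that R is a ratio, so it does not matter whether the variances and covariance are divided by n or by n - 1, as long as all three use the same choice.

```python
from math import sqrt
from statistics import mean, variance


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator, same as statistics.variance)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def uncorrelated(off_games, def_games):
    """Return (ORtg!, DRtg!, R) from game-by-game ratings (hypothetical helper)."""
    ortg, drtg = mean(off_games), mean(def_games)
    total_var = variance(off_games) + variance(def_games)
    r = sqrt(total_var) / sqrt(total_var - 2 * covariance(off_games, def_games))
    shift = (r - 1) * (ortg - drtg) / 2
    return ortg + shift, drtg - shift, r


# Made-up ratings for five games, for illustration only.
off_games = [110.0, 104.0, 118.0, 99.0, 112.0]
def_games = [101.0, 100.0, 107.0, 97.0, 103.0]

new_ortg, new_drtg, r = uncorrelated(off_games, def_games)
print(round(new_ortg, 1), round(new_drtg, 1), round(r, 2))
```

Five games is far too few to mean anything; the point is only to show where each piece of the formula comes from.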

If you aren't mathematically inclined, you can ignore that and just follow along. You'll be learning math without even knowing it. Kinda like when the doctor says the TETANUS SHOT!! won't hurt.

I will explain the method with the Chicago Bulls as an example. The Bulls play every game to win, not to win by as many points as possible, just to win. When they have a safe lead, they take the starters out of the game and let the subs play -- as long as they don't let the game get too close. If the Bulls are winning by 20 with five minutes to go, they have a safe lead and put in people like Jud Buechler, Randy Brown, and Brian Williams. Over the last few minutes, they may let the opponents outscore them by 10, making the ultimate winning margin only 10. In a sense, the important Bulls, the Bulls players that would play in the playoffs or in any important game, were good enough to be better by 20 points, but the subs -- or any players who played when the game was already decided -- "made the starters look bad" by allowing the winning margin to be just 10.

There are various ways to measure this. One way, as suggested to me by a reader, is to look at the Bulls' points differential in the fourth quarter and compare it to other quarters. Presumably, the Bulls would not outscore their opponents as much in the fourth quarter as in other quarters because they typically have such a big lead. In contrast, poor teams would do relatively better in fourth quarters because opponents would play their subs against them in blowouts. Frankly, this measurement technique is too much work for me, though I would be very curious about the results if someone wants to carry it out.

The way I measured this effect is with something called the statistical covariance. The statistical covariance looks at several games and (in a simplistic view) looks at how consistently a team outscores its opponents. For instance, if the Bulls win three games 100-98, 120-118, and 80-78, they are not being very consistent in how many points they score or allow, but they are being very consistent in how much they outscore their opponents. In this case, very consistent translates to a large value of statistical covariance. That value goes into the formula, but the meaning comes from something real: how the Bulls play to the level of their competition.
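
To put numbers on that three-game example, here is a small Python sketch. The helper name sample_cov is my own, and it uses the n - 1 form of the covariance, which is an assumption on my part rather than something the method requires.

```python
from statistics import mean

bulls = [100, 120, 80]      # points scored in the three example games
opponents = [98, 118, 78]   # points allowed in the same games


def sample_cov(xs, ys):
    """Sample covariance with an n - 1 denominator (hypothetical helper)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


var_scored = sample_cov(bulls, bulls)            # variance = covariance with itself
var_allowed = sample_cov(opponents, opponents)
cov = sample_cov(bulls, opponents)

margins = [b - o for b, o in zip(bulls, opponents)]
var_margin = var_scored + var_allowed - 2 * cov  # variance of the winning margin

print(var_scored, var_allowed, cov)  # 400.0 400.0 400.0
print(margins, var_margin)           # [2, 2, 2] 0.0
```

The scoring swings wildly (variance of 400), but the covariance is just as large, so the margin never moves. This example is deliberately extreme: with a margin variance of exactly zero, the denominator of R would be zero and the formula could not be applied. Real seasons have enough noise that this does not come up.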

Playing to the level of competition -- or what I often call "correlation" -- is the hallmark of this method, but the method also relies on the fact that the Bulls are inconsistent offensively and defensively. From one game to the next, the Bulls do not play equally well. In one game, they may play great offensively; in the next, they may play poorly and Jordan will yell at them in the locker room. This inconsistency is measured as the statistical variances of the offensive and defensive ratings (ratings are the number of points scored or allowed per 100 possessions). Again, it is just a way to attach a number to what we see.

This method of finding "uncorrelated" offensive and defensive ratings then asks the question: What happens to a team's ratings if they do not play up or down to their competition? Mathematically, this means setting the statistical covariance equal to zero. Non-mathematically, I am doing a thought experiment where the Bulls leave their starters out on the court even with a big lead in the fourth quarter. I just happen to need mathematics in order to get results that are concrete.

That is all the above formula is supposed to do. If you still don't see how the words translate into the equation, don't worry. Actually, that's good. I am one of the world's experts in knowing how to do all this, so if you did fully understand, I'd be losing some job security...

Really Technical

In order to be complete and to continue my understated mission to educate people on the value of mathematics in real-life things like basketball, I present some gory details here. This way, any overly-interested kid with a calculator can duplicate my results. If you do not like math or if you really cannot handle a technical discussion right now, I encourage you to leave now or forever be addicted to headache pills.

Since we're on a Bulls theme, I will demonstrate the calculations using the Bulls' numbers. This means, unfortunately, introducing the real formula. Trust me, it is essentially the same as above, but it makes life a lot easier to use this one than the one above:


ORtg! = 100*OPPG!/(Poss/G)
DRtg! = 100*DPPG!/(Poss/G)

               (R-1)(OPPG - DPPG)
OPPG! = OPPG + ------------------
                        2
               (R-1)(OPPG - DPPG)
DPPG! = DPPG - ------------------
                        2

where:
ORtg!:	New (or Uncorrelated) Offensive Rating
DRtg!:	New (or Uncorrelated) Defensive Rating
Poss/G:	Average number of possessions per game
OPPG!:	New (or Uncorrelated) Points Scored Per Game
OPPG:	Actual Points Scored Per Game, not accounting for correlation
DPPG!:	New (or Uncorrelated) Points Allowed Per Game
DPPG:	Actual Points Allowed Per Game, not accounting for correlation

                  SQRT[Var(OPPG) + Var(DPPG)]
    R = -----------------------------------------------
        SQRT[Var(OPPG) + Var(DPPG) - 2 Cov(OPPG, DPPG)]

SQRT[]:	Square root of the quantity in []
Var():	Statistical variance from game-to-game of the quantity in ()
		Statistical variance = standard deviation squared
Cov():	Statistical covariance between the two quantities in ()

This formula is a very good approximation to the previous one, but it only requires the statistics to be calculated on the scores of all games, not on the game-by-game ratings, which are time-consuming to calculate. The fundamental assumption allowing this equation to be comparable to the previous one is that the pace of the game has a small effect on the ability of teams to play offense or defense. It is a pretty good assumption.

Plugging in some numbers, this is what we have for Chicago. The Bulls averaged 103.1 points per game scored (OPPG) and 92.3 points per game allowed (DPPG). The standard deviations of these figures are 11.9 and 11.8, respectively, which translate to variances of Var(OPPG) = (11.9)^2 = 141.61 and Var(DPPG) = (11.8)^2 = 139.24, respectively. The covariance between their offensive and defensive points per game was a rather high 70.7 [Cov(OPPG, DPPG)]. The Bulls averaged 92.1 possessions per game (Poss/G).


             SQRT[141.6 + 139.2]
    R = ------------------------------ = 1.42
        SQRT[141.6 + 139.2 - 2 (70.7)]

                (1.42-1)(103.1 - 92.3)
OPPG! = 103.1 + ---------------------- = 105.4
                          2
                (1.42-1)(103.1 - 92.3)
DPPG! =  92.3 - ---------------------- = 90.0
                          2

ORtg! = 100*105.4/92.1 = 114.4
DRtg! = 100*90.0/92.1 = 97.7

Due to some slight round-off errors, we end up with very slight differences between this calculation and the values in the table.
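
The hand calculation can be checked with a short Python sketch that carries full precision instead of rounding at each step. The function name is hypothetical; the inputs are the Bulls figures quoted above.

```python
from math import sqrt


def uncorrelated_from_points(oppg, dppg, sd_scored, sd_allowed, cov, poss_per_game):
    """Return (R, ORtg!, DRtg!) from per-game scoring summaries (hypothetical helper)."""
    var_sum = sd_scored ** 2 + sd_allowed ** 2
    r = sqrt(var_sum) / sqrt(var_sum - 2 * cov)
    shift = (r - 1) * (oppg - dppg) / 2          # same shift up for offense, down for defense
    new_oppg = oppg + shift
    new_dppg = dppg - shift
    return r, 100 * new_oppg / poss_per_game, 100 * new_dppg / poss_per_game


# Chicago: 103.1 scored, 92.3 allowed, std devs 11.9 and 11.8,
# covariance 70.7, 92.1 possessions per game.
r, ortg_new, drtg_new = uncorrelated_from_points(103.1, 92.3, 11.9, 11.8, 70.7, 92.1)
print(f"R = {r:.2f}, ORtg! = {ortg_new:.1f}, DRtg! = {drtg_new:.1f}")
```

Without the intermediate rounding, the defensive figure comes out a shade higher than the 97.7 in the hand calculation, which is the kind of round-off difference mentioned above.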

Finally, I should mention that this method assumes that the correlation of teams affects both the offense and the defense equally.

Derivation

The specific theory here is, as mentioned previously, to find out what a team's offensive and defensive ratings would be if they didn't let their performance slip when a game was effectively over. Mathematically, winning percentage is predicted through the following equation:

            __                                        __
            |               (ORtg-DRtg)                |
Win% = NORM |------------------------------------------|
            |SQRT[Var(ORtg)+Var(DRtg)-2*Cov(ORtg,DRtg)]|
            --                                        --
where NORM means to take the percentile of a mean-zero variance-one normal distribution corresponding to a value given by that in the brackets I faked (see Basketball's Bell Curve for details).
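
As a rough illustration, NORM is the cumulative distribution function of the standard normal, which Python exposes as statistics.NormalDist().cdf. The sketch below assumes the Bulls' per-game points figures quoted earlier can stand in for ratings (the same approximation used in the "Really Technical" section); the function name win_pct is my own.

```python
from math import sqrt
from statistics import NormalDist


def win_pct(off, deff, var_off, var_def, cov):
    """Predicted winning percentage from the bell-curve formula (hypothetical helper)."""
    z = (off - deff) / sqrt(var_off + var_def - 2 * cov)
    return NormalDist().cdf(z)   # percentile of a mean-zero, variance-one normal


# Assumption: per-game points stand in for per-100-possession ratings.
bulls_win_pct = win_pct(103.1, 92.3, 11.9 ** 2, 11.8 ** 2, 70.7)
print(round(bulls_win_pct, 3))
```

With these inputs the prediction lands a little above 80 percent.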

What we are saying is that Win% stays the same when a team plays down to its opponents. In reality, this determines winning percentage:

            __                                        __
            |               (ORtg-DRtg)                |
            |------------------------------------------|
            |SQRT[Var(ORtg)+Var(DRtg)-2*Cov(ORtg,DRtg)]|
            --                                        --
In a world where teams do not play up or down to opponents, this determines winning percentage:
            __                       __
            |      (ORtg!-DRtg!)      |
            |-------------------------|
            |SQRT[Var(ORtg)+Var(DRtg)]|
            --                       --
In both cases, the winning percentage is the same, so we can set the two expressions equal and get
                __                                        __
                |  (ORtg-DRtg) SQRT[Var(ORtg)+Var(DRtg)]   |
(ORtg!-DRtg!) = |------------------------------------------|
                |SQRT[Var(ORtg)+Var(DRtg)-2*Cov(ORtg,DRtg)]|
                --                                        --
At this point, we assume that the offense declines as much as the defense declines when a team feels its lead is safe, and we get the equation demonstrated above.
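
For completeness, that last step can be written out. This is only a sketch of the algebra, assuming the equal decline means a single shift, written here as \Delta (a symbol introduced for this sketch), is added to the offensive rating and subtracted from the defensive rating:

```latex
\begin{align*}
\mathrm{ORtg!} - \mathrm{DRtg!} &= R\,(\mathrm{ORtg} - \mathrm{DRtg}),
  \qquad
  R = \frac{\sqrt{\mathrm{Var}(\mathrm{ORtg}) + \mathrm{Var}(\mathrm{DRtg})}}
           {\sqrt{\mathrm{Var}(\mathrm{ORtg}) + \mathrm{Var}(\mathrm{DRtg})
                  - 2\,\mathrm{Cov}(\mathrm{ORtg}, \mathrm{DRtg})}} \\
\mathrm{ORtg!} &= \mathrm{ORtg} + \Delta,
  \qquad
  \mathrm{DRtg!} = \mathrm{DRtg} - \Delta \\
(\mathrm{ORtg} + \Delta) - (\mathrm{DRtg} - \Delta) &= R\,(\mathrm{ORtg} - \mathrm{DRtg}) \\
2\,\Delta &= (R - 1)(\mathrm{ORtg} - \mathrm{DRtg}) \\
\Delta &= \frac{(R - 1)(\mathrm{ORtg} - \mathrm{DRtg})}{2}
\end{align*}
```

Substituting \Delta back into the second line gives the two formulas presented at the start of the theory section.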

Now, if that hasn't blown you away, I'm impressed.