Introduction to Methods

Half the problem in learning a new subject is getting used to the new vocabulary that comes with it.

The worst part of math classes in school was memorizing all the definitions that were highlighted by blue shading and enclosed by a box with white borders. The definition of a circle, "The locus of points in a plane equidistant from a fixed point," was a lot easier stored in memory by picturing an Oreo cookie than by reciting a bunch of words for nerds. Invariably, though, the easy way to remember a definition wasn't sufficient. When the teacher asked a question on a test requiring knowledge of the details of the definition or if he cruelly asked for the actual definition, a drawing or any definition using words that are common in normal conversation wouldn't garner more than a token pity point.

With the approach I've taken in analyzing basketball, it has been necessary to define certain new terms and, occasionally, those definitions are very wordy in order to be specific and to avoid confusion. In most cases, just general knowledge of the statistics and strategies of basketball should suffice in understanding everything written, but a familiarity with the following terms will make everything a little clearer.

First, the definitions and corresponding formulas, if applicable, then discussions on the subtleties, implications, and/or derivations of each term and the theory behind it.

Possession: The period of play between when one team gains control of the ball and when the other team gains control of the ball.

Possessions=FGA-OR+TO+0.4*FTA

FGA=Field goals attempted    OR=Offensive rebounds
TO=Turnovers                 FTA=Free throws attempted

The possessions formula is for teams. When applied, a team's offensive and defensive (its opponents' offensive) stats are both run through the formula, then the average is taken. Almost without exception, the two estimates are within one percent of each other, making the averaging a safe procedure.
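As a rough illustration of the averaging just described, the estimate could be coded along these lines. This is only a sketch in Python: the function names and the season totals are invented for the example and do not belong to any real team.

```python
def estimate_possessions(fga, orb, to, fta):
    """Possessions = FGA - OR + TO + 0.4*FTA, as given in the text."""
    return fga - orb + to + 0.4 * fta


def team_possessions(team, opp):
    """Average the estimate from a team's own stats with the one from its opponents' stats."""
    return (estimate_possessions(**team) + estimate_possessions(**opp)) / 2.0


# Hypothetical season totals, invented for illustration only.
team = {"fga": 7300, "orb": 1100, "to": 1400, "fta": 2400}
opp = {"fga": 7250, "orb": 1050, "to": 1450, "fta": 2300}

own_est = estimate_possessions(**team)   # estimate from the team's offense
opp_est = estimate_possessions(**opp)    # estimate from the opponents' offense
print(own_est, opp_est, team_possessions(team, opp))
```

With these made-up totals the two estimates differ by about a tenth of a percent, which is the kind of agreement that makes the averaging safe.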

Scoring Possession: Any possession on which at least one point was scored.

Floor Percent: Scoring possessions divided by possessions. The percentage of a team's possessions on which it scored at least one point. Floor % is well approximated for teams by the formula

Floor %= (FG+OR)/(FGA+TO)

Play Percent: Scoring possessions minus scoring possessions on which no field goal was made (only free throws), divided by possessions minus scoring possessions on which no field goal was made. Approximately the percentage of the time a team will score if not sent to the free throw line.

Play %= FG/(FGA-OR+TO)

Points per Possession: Points divided by possessions. Related terms are Adjusted Points per Game and Overall Rating (or just Rating).

Points per possession= Points scored or allowed/possessions

Adjusted points per game (Adjppg) is just points per possession times the league average of possessions per team per game. The overall rating is points per possession times 100.

Adjusted Field Goal Percent: An adjustment made to field goal percentage giving three-halves credit for three point shots made.

Adj FG%= (Total FG+0.5*3ptFG)/Total FGA

Pythagorean 17 Method: A method that gives an expected winning percentage using the fact that the ratio of a team's wins to its losses is closely approximated by the ratio of its points scored to its points allowed, raised to the seventeenth power.

Expected Winning %= (Pts scored)^17/[(Pts scored)^17 + (Pts allowed)^17]

Johnson Effect: A baseball (sabermetric) term that has applications in basketball. It is defined as: "The tendency of teams that exceed their Pythagorean projection for wins in one season to relapse in the following season." (From The Baseball Abstract)

Approximate Value: An integer estimate of a player's value, making no fine distinctions, but, rather, distinguishing easily between very good seasons, average seasons, and poor seasons. There are two ways to calculate approximate value (AV). One uses rules and is not explained here. The other is based on a statistic devised by Martin Manley called credits. Both methods produce essentially the same results.

Credits= PTS+REB+AST+STL+BLK-FG MISSED-FT MISSED-TO

AV= Credits^(3/4)/21

Before the '73-74 season, steals (STL), blocks (BLK), and turnovers weren't kept as official stats. In the credits formula for player seasons before '73-74, those stats are just omitted as they tend to cancel each other out to some degree when included anyway.

If a player makes first team or second team All-Defense, then one point is added to AV.

Trade Value: An estimate using a player's age and his approximate value to determine how much value a player has left in his career.

Y= 27-0.75*Age

Trade Value= (AV-Y)^2*(Y+1)*AV/190 + AV*Y^2/13

A player's Y factor represents an estimate of how many seasons he has left to play and is always assumed to be at least one and a half years.
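A small Python sketch of the trade value calculation follows. It rests on one assumption worth stating loudly: the 2's that follow (AV-Y) and Y in the printed formula are read as squares. The function names and the two sample players are hypothetical.

```python
def y_factor(age):
    """Estimated seasons remaining: Y = 27 - 0.75*Age, never less than 1.5."""
    return max(27 - 0.75 * age, 1.5)


def trade_value(av, age):
    """Trade Value = (AV-Y)^2 * (Y+1) * AV / 190 + AV * Y^2 / 13.

    Assumption: the 2's in the printed formula are exponents (squares).
    """
    y = y_factor(age)
    return (av - y) ** 2 * (y + 1) * av / 190 + av * y ** 2 / 13


# Two hypothetical players with the same AV but different ages.
young = trade_value(av=12, age=24)   # Y = 9
old = trade_value(av=12, age=34)     # 27 - 25.5 = 1.5, right at the floor
print(round(young, 1), round(old, 1))
```

Under that reading, the same AV of 12 is worth far more at age 24 than at age 34, which is the behavior the definition describes.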

Without a doubt the most important term to understand thoroughly is possessions. There are two meanings of the term used in this business and it is often difficult to tell which one is which in normal conversation. One meaning is the one given previously and the other is the following: "A team is said to have possession when it has uninterrupted and complete control of the ball. A possession ends when a field goal is attempted, when there is a turnover, on a jump ball, or after a free throw that is not the first of two."

Though the definitions are similar, there is one key difference. Under the former definition, teams alternate possession, while under the latter definition, a team can have consecutive possessions by getting an offensive rebound or by winning a jump ball after being tied up by the defense. With the former definition, opposing teams in a game will always have the same number of possessions (or be within two of each other). With the latter, a team that gets a lot of offensive rebounds will have more 'possessions' with which to score than their opposition if the opposition doesn't get many offensive rebounds.

The definition to get to know is the former one, which I'll call Definition A. In retrospect, I suppose that I could have done my research with the latter one (Definition B), but I did not.

The benefits of definition A become clear when using possessions to rate offenses and defenses, an invaluable exercise in getting to know basketball. Picture the two following situations: 1) A player brings the ball upcourt, takes a twenty foot jump shot and makes it. 2) A player brings the ball upcourt, takes a twenty foot jump shot and misses, but a teammate rebounds, misses the stickback, then gets his own rebound and finally puts in a layup. The first situation involves one scoring possession and one total possession regardless of which possession definition is used. The second situation has one scoring possession and one total possession using Definition A for possessions. Using the other meaning, the second situation involves one scoring possession and three total possessions.

Which situation represented the better offense? An offense's job is to score as many points as possible before the opponents take control of the ball (go on offense). If you can agree to that, then situations 1 & 2 represent offenses with equal efficiency. Both times the offense came away with two points before the opponents played offense. Looking at it another way, in the first situation, the offense did one 'good' thing (made one shot) and nothing 'bad'. In the second situation, the offense did three 'good' things (one field goal and two offensive rebounds) and two 'bad' things (two missed field goal attempts), netting one 'good' thing. Looking at end results ("the end justifies the means" is a great expression in this case), it can't be disputed that the situations represent offenses of equal quality.

Using the strict definition of floor %, scoring possessions divided by total possessions, efficiencies can be calculated for each situation for both meanings of possession. In the first situation, either definition of possession yields a floor % of 1/1= 100%. In the second situation, floor %= 1/1= 100% using Definition A of possessions and floor %= 1/3= 33% with Definition B. Since we agreed above that both offenses are equal, floor % is not a useful measure of quality with Definition B possessions. As a matter of fact, it would be difficult to come up with a stat that used Definition B possessions in any way to truly measure quality. Therefore, you can forget Definition B possessions. All references to possessions hereafter are meant as Definition A possessions.
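The two counts can be reproduced with a toy play-by-play log. The sketch below is in Python and is deliberately simplified: the event labels are invented, and free throws and jump balls are left out of Definition B because neither situation involves them.

```python
# Each event is (team, action). Labels are hypothetical.
situation_1 = [("home", "fg_made")]
situation_2 = [
    ("home", "fg_miss"),   # twenty foot jumper misses
    ("home", "off_reb"),   # teammate rebounds
    ("home", "fg_miss"),   # missed stickback
    ("home", "off_reb"),   # gets his own rebound
    ("home", "fg_made"),   # layup
]


def possessions_a(events):
    """Definition A: a new possession begins only when control changes teams."""
    count, current = 0, None
    for team, _ in events:
        if team != current:
            count += 1
            current = team
    return count


def possessions_b(events):
    """Definition B (simplified): every field goal attempt or turnover ends a possession."""
    enders = {"fg_made", "fg_miss", "turnover"}
    return sum(1 for _, action in events if action in enders)


scoring = 1  # each situation ends with exactly one score
for name, events in (("situation 1", situation_1), ("situation 2", situation_2)):
    a, b = possessions_a(events), possessions_b(events)
    print(name, "floor % A:", scoring / a, "floor % B:", round(scoring / b, 2))
```

Situation 2 comes out as one possession under Definition A and three under Definition B, giving the 100% and 33% figures above.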

Floor %, as already mentioned, is used to measure offensive efficiency. The not so obvious reason it can be used that way is that almost all scoring possessions for all teams involve two points being scored, not one point or three points. A normal game might have one team scoring on 58 of 100 possessions and the other scoring on 53 of 100 possessions. The team scoring on 58% of its possessions will win 99.9% of the time (that's an educated guess, not a figure based on scores of hundreds of games). The only ways the team with the 53% floor % will win are by making enough three pointers and/or by having several of the opponent's 58 scoring possessions be worth only one point (making only one of two free throws). A typical score for this game would be 116-106. It might be 114-108 or 117-105, but any difference smaller than about six points or larger than about fourteen would be very unusual.

The formula given to approximate floor % is, in many respects, nothing to be proud of. It is sometimes inaccurate, especially with small data samples and it wasn't logically derived as a good formula should be. I basically said to myself, "I wonder what happens if I add field goals and offensive rebounds, then divide by field goals attempted plus turnovers," and figured out a year later what the stat meant.

But, in fact, the formula for floor % is very useful. Because it is currently difficult to approximate scoring possessions, this formula has been the way to get floor %'s for teams in this book. It is also simple to calculate and it is a good first indicator of quality. Back in 1974, for example, the league floor % was 50.4%. Last season, it was 54.0%. Quickly, it is obvious that offenses are better now at scoring than they were fourteen years ago. In '82, the league floor % was 54.4%, indicating that offensive efficiency has gone down in the past six years. In actuality, this drop has been due to the increased use of the three point shot in the NBA and points per possession has actually increased slightly.

Points per possession is the best way available to measure the quality of offenses and defenses. The method takes into account points scored, field goal percentage, turnovers, offensive rebounds, and free throw percentage - everything (except for assists and, maybe, fouls) that can justifiably be looked at in measuring offensive or defensive quality. Possessions, as they were defined previously, make such a complete measurement possible. Repeating what is so important: When a team has the ball, its whole purpose is to score as many points as possible before it becomes the defense. If it were easily accomplished, teams would try to get fouled every time, miss the last free throw, get the offensive rebound, try to get fouled, miss the last free throw, etc., never having to play defense. Score lots of points in a possession and you are not giving the opposition a chance to catch up. The most common way to do that now is to score two points every time down the court. Points per possession shows which offenses do it best and which defenses stop it best.

In practice, points per possession (the number) is rarely used. Numbers like 1.071, which result by dividing points by possessions, are troublesome to handle with three numbers after the decimal and a leading 1. The overall rating (also called the study rating and points per 100 possessions) just multiplies points per possession by 100 to get aesthetically more normal numbers. It is often used in comparing offenses or defenses from different seasons. Adjusted points per game multiplies points per possession by the league average for possessions per team per game to reflect both the quality of the team and the average game pace in the league for that season.

As an example, the best offense of the '73-74 season was Milwaukee's with an offensive rating of 99.3, meaning that the Bucks and Kareem Abdul-Jabbar scored 99.3 points per 100 possessions. A normal NBA game in '73-74 had each of the opposing teams using 110.0 possessions to score their points. In such a normal game, the Bucks would score about 109 points (109.2 to be more exact) against an average defense. The Bucks actually employed a very slow pace that season, averaging only 107.9 possessions per game, meaning that they normally didn't score 109.2 points in a game. There were so many teams that had faster paces than the Bucks that seven teams scored more total points. But the Bucks did it better. Milwaukee led the league in field goal percentage and assists and did well in offensive rebounds. The Bucks' 99.3 rating, though it led the league, would now be among the worst in the NBA. New Jersey had an offensive rating of 99.9, which was second to last in '87-88. Because the pace of the game is so much slower now, the Nets' adjusted points per game rating was 101.7, much lower than the Bucks' 109.2.
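The arithmetic behind the Bucks and Nets figures can be checked with a few lines of Python. The function names are made up for this sketch. The '87-88 league pace is not stated in the text, so the value computed here is only the pace implied by the Nets' two numbers.

```python
def overall_rating(points, possessions):
    """Points per possession times 100."""
    return 100.0 * points / possessions


def adjusted_ppg(rating, league_poss_per_game):
    """Points per possession times the league average possessions per team per game."""
    return rating / 100.0 * league_poss_per_game


# '73-74 Bucks: rating 99.3, league pace 110.0, own pace 107.9 (figures from the text).
bucks_adj = adjusted_ppg(99.3, 110.0)        # about 109.2
bucks_own_pace = adjusted_ppg(99.3, 107.9)   # what the same rating yields at the Bucks' slower pace

# '87-88 Nets: rating 99.9 and adjusted ppg 101.7, so the implied league pace is:
nets_implied_pace = 101.7 / (99.9 / 100.0)   # about 101.8 possessions per game

print(round(bucks_adj, 1), round(bucks_own_pace, 1), round(nets_implied_pace, 1))
```

The same rating produces fewer points at a slower pace, which is why the rating rather than raw points per game is used to compare seasons.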

Play % is a good but forgotten stat. In cases where floor % is obviously inaccurate, play % saves the day. The problem with the floor % formula is that it assumes a fairly normal number of free throws and doesn't take them into account in its calculations even though it clearly should. Play % assumes nothing and tells you how well a team did when it wasn't sent to the free throw line. Because play % and floor % use the exact same base statistics in similar form, it can be inferred that the number of free throws taken and made is a fairly predictable consequence of how the offense operates from the field.

The reason play % is 'forgotten' is that floor % is usually quite accurate and when it's not, points per possession is available. Play % also tells you something that is not as easily grasped as floor % and points per possession. It looks at only those possessions that were ended by a field goal attempt or turnover, ignoring free throws, and necessarily only counting possessions with field goals as scoring possessions.
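Since floor % and play % are built from the same four statistics, a side-by-side sketch in Python may help. The team totals below are invented for illustration, and the function names are my own.

```python
def floor_pct(fg, orb, fga, to):
    """Approximate floor %: (FG + OR) / (FGA + TO)."""
    return (fg + orb) / (fga + to)


def play_pct(fg, orb, fga, to):
    """Play %: FG / (FGA - OR + TO), which ignores free throws entirely."""
    return fg / (fga - orb + to)


# Hypothetical season totals for one team.
stats = {"fg": 3500, "orb": 1100, "fga": 7300, "to": 1400}

fl = floor_pct(**stats)   # 4600 / 8700
pl = play_pct(**stats)    # 3500 / 7600
print(round(fl, 3), round(pl, 3))
```

For this made-up team the floor % is about .529 and the play % about .461. The gap between them is the part of scoring that the floor % formula credits to trips to the free throw line.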

Adjusted field goal percentage is just a real simple modification of field goal percentage that gives proportional extra credit for making three pointers. It has special uses for three point specialists and also helps to identify where a problem might be in a team's offense.
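A short Python sketch shows what the three-halves credit does. The shooters are hypothetical and the function name is only for this example.

```python
def adjusted_fg_pct(fg, fg3, fga):
    """Adj FG% = (Total FG + 0.5 * 3ptFG) / Total FGA."""
    return (fg + 0.5 * fg3) / fga


# Hypothetical shooter: 400 field goals, 100 of them threes, on 900 attempts.
plain = 400 / 900
adjusted = adjusted_fg_pct(400, 100, 900)

# A specialist who makes 100 of 300 attempts, all of them threes,
# grades out the same as a 50% shooter on twos.
specialist = adjusted_fg_pct(100, 100, 300)
print(round(plain, 3), round(adjusted, 3), round(specialist, 3))
```

The first shooter moves from .444 to .500, and the one-in-three three point specialist also lands at .500, matching a player who makes half of his two point shots.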

The Pythagorean 17 Method was derived from the corresponding method in baseball used by Bill James. 'Derived' may not be the proper word because I'm not sure if I knew what I was doing when the formula came out. You see, the corresponding baseball formula is identical to the basketball formula except that the exponents are 2's instead of 17's. What the derivation entailed was estimating average margins of victory for both sports and playing around with the logarithm button on a calculator. The number 16.76... came up on the first try. My expectations were for something between 13 and 20, so 16.76 was rounded up to 17 and tested as a valid possibility. I was amazed then and I'm amazed now that it does actually work. I couldn't reproduce the derivation and the estimates used for average margins of victory were probably wrong, so Lady Luck was most likely the reason for such a successful first derivation.

The principle behind the method - that a team's won-loss record is closely related to the number of points it scores and allows - should be no surprise. It just makes sense that teams that win 60 games outscore their opponents by more than teams that win 50 do.

It may not be as immediately obvious, but statements like the following contradict the Pythagorean principle: "A few points here and there and we would have won ten more games." This should sound familiar to Baseball Abstract readers as Bill James says it every twenty pages or so. A few points here and there may in fact improve a team's record by ten wins in the regular season, but they don't make the team better. The playoffs or the following season will show that. Fifty points in the right situations may win a team ten more games, but eight of those wins are lucky wins. A difference in points scored and points allowed between 25 and 30 points works out to be equivalent to one win in the NBA.
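The points-per-win figure can be checked against the formula itself. The Python sketch below assumes an 82 game season and a hypothetical average team that scores and allows 8900 points. That total is my own round number, not a league figure from the text.

```python
def expected_win_pct(pts_scored, pts_allowed, exponent=17):
    """Pythagorean expectation, written as 1 / (1 + (allowed/scored)^17).

    This is algebraically the same as scored^17 / (scored^17 + allowed^17).
    """
    return 1.0 / (1.0 + (pts_allowed / pts_scored) ** exponent)


GAMES = 82
BASE = 8900  # hypothetical season total for an average team

base_wins = expected_win_pct(BASE, BASE) * GAMES
gain_25 = expected_win_pct(BASE + 25, BASE) * GAMES - base_wins
gain_30 = expected_win_pct(BASE + 30, BASE) * GAMES - base_wins
print(round(base_wins, 1), round(gain_25, 2), round(gain_30, 2))
```

Under those assumptions, 25 extra points of margin are worth a shade under one expected win and 30 are worth a little more than one, in line with the range quoted above.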

                 Pts    Pts    --Projected--   Actual
                 Scor.  Alld.  W/L%    W   L    W   L
Atlantic
Boston           9315   8828   .714   59  23   57  25
Washington       8653   8716   .469   38  44   38  44
New York         8655   8695   .480   39  43   38  44
Philadelphia     8667   8785   .443   36  46   36  46
New Jersey       8235   8900   .211   17  65   19  63

Central
Detroit          8957   8533   .695   57  25   54  28
Atlanta          8844   8549   .640   53  29   50  32
Chicago          8609   8330   .636   52  30   50  32
Cleveland        8566   8504   .531   44  38   42  40
Milwaukee        8697   8653   .522   43  39   42  40
Indiana          8581   8646   .468   38  44   38  44

Midwest
Denver           9573   9239   .647   53  29   54  28
Dallas           8960   8602   .667   55  27   53  29
Utah             8899   8597   .643   53  29   47  35
Houston          8935   8821   .554   45  37   46  36
San Antonio      9314   9714   .329   27  55   31  51
Sacramento       8855   9327   .293   24  58   24  58

Pacific
LA Lakers        9250   8771   .712   58  24   62  20
Portland         9518   9147   .663   54  28   53  29
Seattle          9135   8966   .579   47  35   44  38
Phoenix          8901   9268   .335   27  55   28  54
Golden State     8771   9453   .219   18  64   20  62
LA Clippers      8103   8949   .156   13  69   17  65

Such luck resulting from 'well-timed scoring' is a weak force in the NBA. It doesn't separate the Lakers from the Bulls and it doesn't separate the Clippers from the Kings. The Lakers were a better team than the Utah Jazz last season even though the Lakers were the luckiest team and the Jazz was the unluckiest. The Pythagorean Formula says the Lakers should have won 58 and the Jazz 53 when the Lakers actually won 62 and the Jazz 47. Taking the luckiest and unluckiest teams in the NBA, we find a total deviation of ten wins (four for the Lakers and six for the Jazz). Luck has a place in basketball, just as the weather has a place in football and as Wrigley Field has a place in baseball. Each has an effect on the game, but, in the long run, the better teams win with or without the advantage or disadvantage of such factors.
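The Lakers and Jazz projections quoted above can be reproduced from their point totals in the table. This is a Python sketch with names of my own choosing. Rounding expected wins to the nearest whole game is an assumption about how the projected records were produced.

```python
def projected_record(pts_scored, pts_allowed, games=82, exponent=17):
    """Return (expected winning %, projected wins) from the Pythagorean 17 Method."""
    pct = 1.0 / (1.0 + (pts_allowed / pts_scored) ** exponent)
    return pct, round(pct * games)  # assumption: wins rounded to the nearest game


# (points scored, points allowed, actual wins) as given in the text and table.
teams = {
    "LA Lakers": (9250, 8771, 62),
    "Utah": (8899, 8597, 47),
}

luck = {}
for name, (scored, allowed, actual) in teams.items():
    pct, wins = projected_record(scored, allowed)
    luck[name] = actual - wins
    print(name, round(pct, 3), wins, actual, luck[name])

total_deviation = sum(abs(v) for v in luck.values())
print(total_deviation)
```

The sketch gives .712 and 58 wins for the Lakers and .643 and 53 wins for the Jazz, so the luckiest and unluckiest teams deviate by four and six games, ten in total.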

Occasionally luck plays a major part in a team's season. The '85-86 Clippers won 32 games, while their point totals led to an expectation of only 21 wins. A third of their victories (!) came out of the Twilight Zone. The '86-87 Clippers came back to reality, going through a pitiful 12-70 season in a daze. The '86-87 Warriors exceeded their Pythagorean projection by eight games, winning 42 instead of 34 games. They, too, crashed the following season, winning only 20. Both the Clippers and Warriors lost key personnel in their follow-up seasons, but neither ever showed any signs of life anyway. This sort of collapse can be seen throughout the history of basketball, but it's also seen in baseball (and probably other sports). The baseball people called this the Johnson Effect. It's the same effect in basketball so it gets the same name.

The Value Approximation Method was a major task to come up with, taking me about two months to finally arrive at satisfactory results. The plan for the method was to end up with a scale of integers between 0 and about 20 rating players, with 10 representing an 'average' player. It was to be based upon several standards a player was to meet in order to gain points of approximate value. The whole thing was modeled on Bill James' Value Approximation method for baseball. As James did, I assigned verbal descriptions to ranges of scores in order to see if the method produced results that matched general descriptions of players. Those descriptions are as follows:

A score of about twenty indicates an exceptional MVP season.

A score of seventeen or eighteen indicates a strong MVP candidate or an ordinary MVP season.

A score of sixteen indicates an MVP candidate.

A score of fifteen indicates a definite All-Star who is a marginal MVP candidate.

A score of fourteen indicates a probable All-Star.

A score of thirteen indicates a marginal All-Star.

A score of twelve indicates a very fine season; an All-Star candidate.

A score of eleven indicates an above average regular; an excellent player playing about 1800 minutes.

A score of ten indicates an average regular or a very good sixth man.

A score of nine indicates an average regular or a good sixth man.

A score of eight indicates a fair regular or an average sixth man.

A score of six or seven indicates an average bench player or a good player playing under 1500 minutes.

A score of four or five indicates a player who plays about 1000 minutes and who doesn't deserve many more.

Scores of three or less usually indicate players who are unimpressive in limited playing time.

After all the work to produce rules and standards that would fit the above descriptions, Martin Manley soon came along with a better method to approximate basketball players' values. He called it a Production Rating. Production Rating (PR) was defined by him as credits (as defined by formula earlier) per game. I fooled around with PR a little in hopes of deriving a points created formula, but soon found it to be a fruitless task. Instead, a simple way to calculate approximate values came out.

AV= Credits^(3/4)/21 + 1 if All-Defense

In my conversations with Mr. Manley about this manipulation of his method to fit the verbal descriptions and range of scores above, he pointed out a couple of things. First, he thought that instead of using credits, a player's PR*82 should be used. For players who played a full 82 games, there would be no difference, but for players like Magic Johnson or Adrian Dantley last year, who missed quite a few games, there might be a difference of about two on the AV scale. His reasoning for this suggestion was that "stats on a per game basis is so basketballish whereas total season stats are so baseballish...no one cares how many total points Jordan scored [in '86-87] - only that it was 37.1 per game." The second suggestion Mr. Manley made was that the conversion be simpler, dividing credits by 130 rather than raising credits to the 3/4th power and dividing by 21.

Both suggestions have their merit, but there are reasons not to implement them. His first suggestion to replace credits with PR*82 I hesitate to use for one main reason. While per game stats may be more 'basketballish', they do not represent a player's total value to his team over a whole season, which is what AV tries to measure. No matter how good a player is, if he isn't playing, he isn't contributing to wins and isn't valuable to his team. In the ten games Magic Johnson missed last season, he did not contribute anything to the Lakers (at least nothing we can measure). In those ten games, someone else (Wes Matthews or Milt Wagner) came in to contribute. In a game, Magic Johnson contributes more to the Lakers than Matthews or Wagner does, but, in those ten games, Johnson's value was 0, while the other two put points on the board and were valuable to the Lakers. Potentially, Johnson's value was a lot more than Matthews' or Wagner's, but, to use a cliche (sorry), potential never won a game for anyone. Ask the coaches of Dennis Hopson, Chris Washburn, Benoit Benjamin, Reggie Williams, Ralph Sampson, Darryl Dawkins, Kent Benson, J.B. Carroll, Mychal Thompson, Len Bias or any player who ever had the 'potential' to be one of the best.

Mr. Manley's second suggestion comes down to simplicity and a little more. Dividing by 130 is simpler than doing what I do until you realize that all conversions are unnecessary once ranges of credits are written next to the corresponding AV. From 562.5 to 702.8 credits, the corresponding AV is 6. From 702.9 to 850.5, the corresponding AV is 7. Calculate credits, look on a chart for corresponding AV. That's simple enough. Numerically, the conversion methods don't differ too much until you get to great players with high numbers of credits. Wilt Chamberlain's '61-62 season gives an AV of 25 using the method in this book and an AV of 32 using Mr. Manley's suggestion of dividing by 130. Both are tremendous values and there is almost no reason to argue over the difference because Wilt's '61-62 season was clearly the best ever statistically. What it comes down to is this: the best baseball player ever, Babe Ruth, racked up AV's (using James' method) around 25-27 and if we say that Chamberlain dominated his sport as much as Ruth did his ("Chamberlain was the Babe Ruth of basketball"), then their best AV's should be about the same.
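For anyone who would rather compute than look at a chart, here is a Python sketch of the credits-to-AV conversion next to Mr. Manley's divide-by-130 alternative. Rounding to the nearest integer is inferred from the credit ranges quoted above (562.5 to 702.8 for an AV of 6). The function names and the sample stat line are hypothetical.

```python
import math


def credits(pts, reb, ast, stl, blk, fg_missed, ft_missed, to):
    """Credits = PTS + REB + AST + STL + BLK - FG MISSED - FT MISSED - TO."""
    return pts + reb + ast + stl + blk - fg_missed - ft_missed - to


def approximate_value(cr, all_defense=False):
    """AV = Credits^(3/4) / 21, rounded to the nearest integer, plus 1 for All-Defense.

    Assumptions: round-to-nearest (inferred from the chart ranges) and
    negative credit totals treated as zero.
    """
    av = math.floor(max(cr, 0) ** 0.75 / 21 + 0.5)
    return av + (1 if all_defense else 0)


def manley_value(cr):
    """The simpler conversion Mr. Manley suggested: credits divided by 130."""
    return cr / 130


# Hypothetical season line, invented for illustration.
cr = credits(pts=1800, reb=600, ast=400, stl=120, blk=60,
             fg_missed=700, ft_missed=100, to=200)
print(cr, approximate_value(cr), round(manley_value(cr), 1))
print(approximate_value(600), approximate_value(710))
```

The made-up season is worth 1980 credits and an AV of 14, while 600 and 710 credits fall in the AV 6 and AV 7 ranges given above.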

The purpose of the Value Approximation method is to quickly produce a useful number that represents the sum of all of a player's obvious contributions. In studies involving large groups of players, AV's are the most convenient way to quantify approximately how valuable the players are. If we wanted to find out how well a team drafts, we could add the career AV's of all the players they've drafted and compare that to the total for other teams. If we wanted to find out what position in the NBA was the position with the most valuable players, we'd use AV's and we might see how this compares with the NBA of 15 years ago. If we wanted to find out how productive the bench is on NBA teams, we might use AV's. There are many studies waiting to be done that could use the Value Approximation method to make them a little easier.

Another application of AV's is in the determination of trade values. Trade values, because they take into account both age and production, can be a good indicator of the future success of a team. Teams like Chicago, Cleveland, Sacramento, Portland, New York, and Seattle are all young clubs with high trade values and seem to have good futures. Teams like Denver, Detroit, Milwaukee, and, of course, Boston seem to be facing bad years within the next three years or so as they see their best players turning a little gray and their young players not producing.

Trade value will probably be an important part of one of my more ambitious (and unrealistic?) projects for the future, which is to devise a method that would give an approximate percent chance that a certain team will win a championship in one year, two years, three years, etc. Every year, magazines and newspapers give us their predictions for the coming season and they can be fairly accurate. Teams like New York and Cleveland that have so much young talent inspire questions about a more distant future, though. Of course, the farther in the future you look the less detail you can see and basketball predictions are an uncertain 'science' to begin with. Predictions, though, are an inevitable and unavoidable part of studying basketball from the point of view I've taken. The test of real sciences - physics, biology, chemistry - is not how well they can explain things that have already happened, but whether they can predict outcomes of future experiments. That is the general direction some of the research herein is going. Predictions may in fact be self-defeating because of the psychology involved, but they're worth trying.

The above methods are the primary tools used in the Hoopla to answer questions about the game. Additional methods will be interspersed throughout team or player comments and will be explained when introduced.


Basketball Hoopla, 1988, L. Dean Oliver