The Interpretation of Symbols
A Joint-post by Silverbird5000 and Bethlehem Shoals
Some of you may be following the latest blogosphere contretemps over Hollinger's Player Efficiency Rating (PER), that great Rosetta Stone of NBA statistical analysis, whose benevolent tyranny over our league it is our duty as fans to periodically resist. The argument comes down to the wisdom of the per-minute adjustment, which is a central part of PER, along with pretty much every other Ultimate Metric in the marketplace. On the one hand, adjusting for minutes played seems like a good idea, insofar as it immunizes our judgment from the folly of coaches. If a player who should be getting 40 minutes a game only gets 20, his per-game stats will understate his true value. What per-minute adjustments do is control for mismanagement, as Ziller correctly points out.
The problem with this line of reasoning is that it assumes the homogeneity of court time. It assumes that if a player scored 20 points in 20 minutes, he would also score 40 points in 40 minutes. That there will be systematic differences between these two situations is almost too obvious to point out. It's the difference between sharing the ball with Jordan Farmar while being guarded by Kenny Thomas, and sharing the ball with Kobe Bryant while being guarded by Ron Artest.
Insofar as the problem here is one of rotation, small-scale adjustments in minutes played shouldn't create major distortions (it isn't unrealistic to think that if Tim Duncan played 5 extra minutes per game, his per-minute production, as influenced by the level of defense he'd face, would basically be the same). But when PER catapults bench players into the starting five (or vice-versa), be on the lookout for inflation. Call this the Silverbird-Shoals Hypothesis, or the THEOREM OF INTERTEMPORAL HETEROGENEITY (TOIH).
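For the numerically inclined, here is a rough sketch of what the homogeneity assumption amounts to. The straight-line projection treats every minute alike; the mixed projection splits minutes into those played against bench units and those played against starters. The scoring rates and the minute split are invented for illustration only, not measured from any real player.

```python
def linear_projection(points, minutes, target_minutes):
    """Scale production as if every minute of court time were alike."""
    return points / minutes * target_minutes

def mixed_projection(rate_vs_bench, rate_vs_starters,
                     bench_minutes, starter_minutes):
    """Project points when the scoring rate depends on the opposition faced."""
    return rate_vs_bench * bench_minutes + rate_vs_starters * starter_minutes

# A reserve scores 20 points in 20 minutes, all against second units.
homogeneous = linear_projection(20, 20, 40)

# Hypothetical: promoted to 40 minutes, he still gets 10 minutes against
# bench units (1.0 points per minute) but spends the other 30 against
# starters, where we assume his rate drops to 0.5 points per minute.
heterogeneous = mixed_projection(1.0, 0.5, 10, 30)

print(homogeneous, heterogeneous)  # 40.0 25.0
```

The size of the gap depends entirely on the assumed drop-off against starters, which is exactly the quantity a per-minute rating never observes.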
By way of proof, we propose the following experiment: imagine a league in which the distribution of minutes perfectly reflected the PER rankings, such that the top-ranked player played the most minutes per game (43mpg), the 100th-ranked player played the 100th most minutes (31mpg), and so on and so forth. Now, compare this projected distribution of minutes to that of the actual league. For most players, the difference between actual and projected mpg is fairly small - high PER players play high minutes, low PER players play low minutes. But for a significant minority of players, actual mpg falls far short of what Hollinger's rankings predict. If TOIH is correct, we should observe inflation in the value of these players' PER. We invite you to consider the evidence and decide for yourself.
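A minimal sketch of how that comparison could be computed, assuming the projected minutes for the k-th best player by PER are simply the k-th largest actual minutes in the league. The four-player "league" and all of its numbers are made up to show the mechanics; the real exercise would run over the full list of players.

```python
from typing import NamedTuple

class Player(NamedTuple):
    name: str
    per: float   # per-minute rating (hypothetical values)
    mpg: float   # actual minutes per game (hypothetical values)

def minutes_gap(players):
    """Return (name, actual mpg - projected mpg) in PER order.

    The projected mpg for the k-th ranked player by PER is the
    k-th largest actual mpg found in the league.
    """
    by_per = sorted(players, key=lambda p: p.per, reverse=True)
    mpg_sorted = sorted((p.mpg for p in players), reverse=True)
    return [(p.name, p.mpg - proj) for p, proj in zip(by_per, mpg_sorted)]

league = [
    Player("A", 25.0, 38.0),
    Player("B", 20.0, 14.0),  # high rating, few minutes
    Player("C", 15.0, 34.0),  # middling rating, heavy minutes
    Player("D", 10.0, 22.0),
]

gaps = minutes_gap(league)
# Candidates for inflation: actual minutes fall 12+ short of projection.
inflated = [name for name, gap in gaps if gap <= -12]
# Candidates for deflation: actual minutes exceed projection by 12+.
deflated = [name for name, gap in gaps if gap >= 12]

print(gaps)      # [('A', 0.0), ('B', -20.0), ('C', 12.0), ('D', 8.0)]
print(inflated)  # ['B']
print(deflated)  # ['C']
```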
The following table shows all players (PER rank < 150) whose actual-projected mpg differential is 12 minutes or more:
[That pretty much every player on this list is vastly overrated by PER is, ultimately, a subjective judgment. But it is the kind of subjective judgment only a lunatic doesn't share.]
Note that everyone here ranks in Hollinger's top 150; if PER accurately reflected productivity, then leaving aside issues of position, age, etc., every one of them should be on some team’s starting five. But none of them are, not even Ginobili. Indeed, despite their high per-minute productivity, many of these players see no more than 15 minutes a game. Somewhat paradoxically, this suggests that PER inflation is a matter of being both overvalued AND underplayed; or, more accurately, being overvalued per-minute but under-valued per game.
If PER inflation is basically the problem of over-qualified bench players, it would seem that PER deflation, its counterpart, is somewhat more complicated than just under-qualified starters. Among the worst 150 players in Hollinger’s rankings, the following play significantly more minutes than projected: Bruce Bowen (#372/+19mpg), Speedy Claxton (#371/+14), Adam Morrison (#366/+14), Trenton Hassell (#331/+14), Desmond Mason (#307/+17), Raja Bell (#262/+18), Shane Battier (#261/+17), Larry Hughes (#252/+17), and Marvin Williams (#239/+14). Given our hypothesis, it would stand to reason that all or some of these players have no business regularly facing top-tier competition.
Bowen, Bell, and Hassell can be excluded, for their defensive contributions are partly invisible. But Claxton, Morrison, Mason, Hughes (as currently constituted) and Williams (at this point in his career) are all players who, in a perfect world, would be used more sparingly. That they are not is a function of either missing resources or incompetent coaching. It is interesting to find fantasy and Right Way figurehead Shane Battier on here; 'twould appear that his production is a function of minutes, not value.
To conclude, the virtue of per-minute adjustment is that it adjusts for bad coaches. But in the process it runs afoul of a far greater and more systemic problem: that is, the unequal distribution of talent in the league. In a world of perfect balance, where starters played equal starters, bench players played equal bench players, or everybody just played everybody all at once, then each man’s productivity could be measured under equal conditions, and so without fear of distortion. This world would be called baseball, a sport whose statistical models we have borrowed without regard to their unspoken assumptions. As it happens, our sport is basketball, and its time is ineluctably structured by the inequities below. Whether our statisticians can ever overcome this problem remains to be seen.
Silverbird5000, Bethlehem Shoals
September 10th, 2007
Vienna, AU
24 Comments:
Couldn't agree more (if it's possible to "agree" with an essay...). Hollinger is willfully making a science in places where none can exist. Basketball does not have the ease of stat analysis that baseball does. Fuck Hollinger. I would add that there is a comparable idea in sabermetrics that Hollinger mistakenly presumes translates to basketball. That is the idea of the platoon split. Insofar as basketball is not divisible into the sum of its parts, and baseball TOTALLY FUCKING IS, then it would seem the dude just needs to stop. Start over. Or something. However, in related news (but applying to college hoops) coming soon: basketballprospectus.com !!!
Good points, SB&BS.
As to your closing statement re: inequities of opposition and support w/r/t starters & subs -- that's nothing close to PER, which is a linear-weight summary stat. BUT there is some exciting work going on in the plus-minus realm, especially Dan Rosenbaum's infamous 'adjusted defensive +/-.' It's too messy for the public, but it'll happen. Hell, if Football Outsiders can figure that sport out, a baseballesque understanding of hoops will come.
(Whether that's depressing to some or not is an entirely different issue.)
One more note, something I failed to highlight in my foray into this discussion: if PER did nothing but adjust for pace/opportunities (by using possession-based factors such as TS% and rebound rate in lieu of the mundane), it'd be valuable. The parts which standardize and make it minutes-independent might actually hurt its reputation w/r/t calculations such as the one SB&BS exhibited here.
holly ain't perfect, but i value his work a hell of a lot more than most other stats tossed out there to give credibility to a player or team. but, of course, the NBA is not a stats league when the gavel hits. the truth is stats just give virtue to the ineffable quality of players - but VORP never will exist to gauge the value of the true supernova - wade, yao, nash or whoever. what's more interesting to me is how it helps grade mid-level type players. like reuben patterson and shane battier. anyway, you'd be a fool to use it alone, but you'd be just as bad not to look at it at all.
Dean Oliver expounded on this years ago. Everyone with an inclination towards stats should read Basketball on Paper. He does a great job of explaining what the numbers can and can't measure (which a statistician ought to be able to do), and he makes a concerted effort to figure out ways to correct the flaws of his method. I have a feeling that you guys would appreciate the way he calculates stuff.
He also critiques anyone who tries to distill everything to a single number, and explains why he thinks these metrics are flawed. One thing he says is that Hollinger's PER overvalues rebounding. Oliver uses the stats from past games to evaluate the value of an offensive or defensive rebound. Hollinger gives them a semi-arbitrary number. Now look at your list of guys who deserve more minutes. It's largely a bunch of jumping and banging fours who are good at getting their hands on stuff.
I also tend to think of PER as a basic measuring tool and not the ultimate device to determine a player's worth.
A guy from your list like Turiaf is effective exactly because he plays limited minutes. Not because of the foul trouble he would certainly have if he played for longer stretches, but just in terms of bringing the necessary energy to get those rebound numbers and hustle plays.
Also, the PER does not take into account one of the most important and most interesting aspects - at least to me - which is how a player changes the style of the game and pace of the game when he is on the court and changes how both teams interact.
Allen Iverson from his Finals appearance year is probably a good example. Part of his appeal (again, at least to me) was/is how the flow of the game totally changes when he is on the court, how his teammates play differently and how the opposing team has to adjust simply because of his presence. I assume one could find a player in the Eddie House mold whose adjusted numbers per 48min would look quite similar to Iver Anderson's, but of course having nowhere near the influence on how the game's actually played.
What you've also mentioned is that the PER has no chance to account for a whole set of variables. Whom does a certain player replace and in what role? Whom does he play with in those hypothetical extra minutes and what effect does that have on his chances for shot attempts, rebounds or assists? Just ask Kenyon Martin about his PER with and without Jason Kidd (taking his injuries out of consideration).
Thanks for breaking that down for us SB. I think Hollinger's PER has been labelled a "fun toy" at best but rarely do you see it used as a serious point of defense among fans or even sports writers when evaluating a player's worth. It's too academic, too much ceteris paribus is going on for it to be considered a reliable indicator of worth and when it comes down to it, there just aren't enough stats to capture everything a player does on the court. I haven't gone into his formula but does it adjust for the quality of teammates like kaifa brought up or does it look at players in a vacuum?
Tom:
It's interesting that you say PER is too academic, because technically it isn't academically sound. PER is a summary statistic based on logic that appeals to John Hollinger: ie, take into consideration the number of shots a player creates, his accuracy, his turnovers, % of rebounds he gets, etc. Then add those up after weighting them in various ways, and normalize to the league average.
The problem, as illustrated by Dave Berri at the Wages of Wins blog (http://berri.wordpress.com), is that these stats are weighted to affirm Hollinger's assumptions about basketball, rather than run through a linear regression analysis to actually see how much they contribute to basketball. In Berri's case he analyzes the contribution of various stats towards the number of team wins, and then weights the statistics appropriately and sums them. It's not a perfect method as it can't account well for team interactions, but I think it's useful to also look at this. According to his model, Hollinger overemphasizes scoring rate and underemphasizes rebounding, which is contrary to popular logic.
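To make the distinction concrete, here is a toy sketch of the two approaches: weights picked by intuition versus weights fitted to team wins by least squares. This is neither Hollinger's actual formula nor Berri's actual model. It uses only two invented stats, a handful of made-up teams, and a no-intercept fit solved from the 2x2 normal equations, purely to show how the choice of weights can flip a ranking.

```python
def fit_two_weights(rows, wins):
    """Least-squares weights (w1, w2) for wins ~ w1*x1 + w2*x2 (no intercept),
    solved directly from the 2x2 normal equations."""
    s11 = sum(x1 * x1 for x1, _ in rows)
    s12 = sum(x1 * x2 for x1, x2 in rows)
    s22 = sum(x2 * x2 for _, x2 in rows)
    b1 = sum(x1 * y for (x1, _), y in zip(rows, wins))
    b2 = sum(x2 * y for (_, x2), y in zip(rows, wins))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("stats are collinear; weights are not identifiable")
    return ((b1 * s22 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det)

def rate(stats, weights):
    """Linear-weights summary: a weighted sum of a player's stats."""
    return sum(w * s for w, s in zip(weights, stats))

# Made-up team data: (scoring stat, rebounding stat) relative to league
# average, and wins above .500. All values are hypothetical.
teams = [(1, 0), (0, 1), (1, 1), (2, -1)]
wins_above_500 = [2, 3, 5, 1]

fitted = fit_two_weights(teams, wins_above_500)  # (2.0, 3.0)
hand_picked = (3.0, 1.0)  # intuition-driven weights, also hypothetical

scorer, rebounder = (2, 0), (0, 2)
print(rate(scorer, hand_picked), rate(rebounder, hand_picked))  # 6.0 2.0
print(rate(scorer, fitted), rate(rebounder, fitted))            # 4.0 6.0
```

With the hand-picked weights the scorer looks three times as valuable; with the fitted weights the rebounder comes out ahead. Which set is "right" depends on whether team wins decompose this neatly, which is the part neither approach can guarantee.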
I don't think any of this can replace actually watching basketball and understanding the flow of a game. PER might say that Iverson is really good but not as good as, say, Josh Howard. Berri's Win Score would argue that Iverson is way worse, because he places such a premium on shooting efficiency and hustle stats. But neither statistic can account for the type of terror that cuts through the collective midseason mental fog of a given team when Iver Anderson suddenly gets hot and threatens to drop 60. Everyone on the court changes their game to accommodate, and that gives him an edge that isn't captured by statistics.
But, both PER and something like Berri's Win Score are still useful. They highlight Iverson's strengths and flaws in different ways, and these are important things to note when evaluating players. IE - if I happen to have a low efficiency, incredibly high volume scorer who doesn't do much else? Make damn sure to surround him with some rebounders, preferably ones who don't want the ball. Reggie Evans, etc. Obviously if you have Iverson or Jerry Stackhouse you want hard nosed, low-offense rebounders - this isn't anything new. But what about someone with the same tendencies, but less radically pronounced? Say, Wally World, Devin Brown. Both of them can create a good number of shots, but neither (at this point in Wally's career, at least) is exactly Ray Allen when it comes to shooting efficiency. And Wally happens to be a barely adequate rebounder and passer, while Brown is much better in those statistics. So if you have Wally, you concentrate on getting some serious rebounding and passing help, whereas with Devin Brown you can slack off on that a little and perhaps have someone that can score a little, too.
Ahem. Long winded. Anyway, moving on.
thanks for the comments so far.
I think a lot of the problems Kaifa brings up are, in principle at least, supposed to be solved using the kind of +/- ratings TZ mentioned, which are advocated by people like Rosenbaum, 82games, etc. These models have their own "rotation problem", in that they have to control for the quality of the other 9 players on the floor. Otherwise, how do you know if a player's high +/- rating is due to his own production, or the strength/weakness of the players he typically plays with/against? As I understand it, the way they try to do this is to adjust each player's +/- rating by the +/- ratings of the other players on the floor. This always seemed kind of suspect to me, since all of these +/- ratings are by definition mutually-determined. You can control for Stockton's +/- when measuring Malone, and you can control for Malone when measuring Stockton. But how can you ever know that the production you've controlled for isn't actually attributable to the player you're measuring? I assume they'd argue that no two players are ever on the floor together for their entire season, and so there's enough variation to make the adjustments work out. But this seems to raise a whole bunch of data limitation questions, if not the same logical ones from before.
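A toy sketch of that worry, under a simplified assumption: each stint's point margin is modeled as the sum of the ratings of the players on the floor. This is a stand-in, not the actual method used by Rosenbaum or 82games, and the stints, margins, and the player called "Sub" are invented. If two players never appear apart, any split of their joint contribution fits the data equally well.

```python
from itertools import combinations

def inseparable_pairs(stints):
    """Pairs of players whose on/off pattern is identical in every stint,
    so no fit can tell their individual contributions apart."""
    players = sorted(set().union(*(lineup for lineup, _ in stints)))
    column = {p: tuple(p in lineup for lineup, _ in stints) for p in players}
    return [(a, b) for a, b in combinations(players, 2)
            if column[a] == column[b]]

def predict(ratings, lineup):
    """Predicted margin for a stint: sum of ratings of players on the floor."""
    return sum(ratings[p] for p in lineup)

# Invented stints: (players on the floor, point margin).
stints = [
    ({"Stockton", "Malone", "Sub"}, 8),
    ({"Stockton", "Malone"}, 6),
    ({"Sub"}, 2),
]

pairs = inseparable_pairs(stints)
print(pairs)  # [('Malone', 'Stockton')]

# Two opposite stories about who deserves the credit...
all_stockton = {"Stockton": 6, "Malone": 0, "Sub": 2}
all_malone = {"Stockton": 0, "Malone": 6, "Sub": 2}

# ...that reproduce every observed margin exactly.
for lineup, margin in stints:
    assert predict(all_stockton, lineup) == margin == predict(all_malone, lineup)
```

Real data is never this extreme, but the closer two players' minutes overlap, the closer the fit gets to this situation and the shakier the individual ratings become.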
It’s amazing to me how little statistical energy is devoted to figuring out stuff like “what players/styles complement each other” (82games being the main exception), and how even less energy is devoted to evaluating teams as a whole: what makes for good management, how much does it matter, can it be measured, etc.? Instead, almost everyone is focused on measuring individual players and figuring out who is more/less productive, whether it’s PER, +/- rankings, Dave Berri’s thing, or whatever. Though here it’s probably good to remember that this kind of work doesn’t happen in a vacuum, and that behind all of this a good number of these guys are competing for front-office jobs, where measuring individual productivity is the only thing that counts, and measuring something like “GM inefficiency” probably won't curry much favor. This isn't meant as a criticism - of course who wouldn't want to work for a professional basketball team? - so much as an explanation of why certain questions get ignored.
Yar, SB5K. Adjusted +/- is too complex for most to attempt (including this sap right here), and the intrinsic logical battles you detail are a major part of that.
I speak solely for myself, but I'm definitely more interested in the quantification and understanding of team more than individuals. (Proof.) And in basketball, typification is more important (I think) in analysis of the individual than sheer quality judgments, mostly because there's a need for poor rebounding but deadeye shooting guards. Five 20-PER guys would not necessarily win you the title (let alone three + two sub 15-PER, cough Boston). Ed Kupfer from the APBRmetric board has done some great work in the past on typification which has, I think sadly, gone by the wayside as most wage battle on PER or Dave Berri (or, in some cases, both).
Looks like Reggie Evans will be yanking testicles in Philadelphia now. Somebody tell "Gonad" Krstic to watch out!
I've written enough on this shit at Ballhype to make my English major head explode, but I just wanted to say one thing: when your blog's under attack, it's nice to know that FD's got your back.
TZ - yeah, i thought that mapping post you did was brilliant. exactly the kind of thing i'm thinking of. that last comment (just to be clear) was really directed at the statistical establishment, so to speak, which produces lots of interesting work to be sure, but still seems way biased toward the managerial perspective.
first of all, no offense to chicago and boston, but your nba bloggers are as classy as an anne frank snuff film.
i am not a stats person, but ironically, this wasn't really an anti-stats post. it was intended to show what we felt PER could be useful for, which is, as TZ said, to smoke out gross mis-coaching. or show when a team is unjustly deep. if anything, we're trying to show how this "problem" with PER could be used to make a constructive decision. i.e. PROMOTE THE FUCKING PERSON BECAUSE THEY DESERVE MORE MOMENTS.
What Boston bloggers are you referring to, Shoals?
Good essay. The problem is one of "understanding". What are we to understand by these numbers? Well, we can understand that the stats have been factored to appropriately represent what we already believe - who are the best players. If these players didn't rank near the top of the stats, then the weighting factors would need to be adjusted.
As you point out, the problem is how we interpret this as the "talent" pool grows. It isn't simply a matter of collecting and portioning, since we are no longer sure that the intricacies of the stats correspond to the value of the player. Instead, believing we can extrapolate from the results that were purposely skewed at the top, we "realize" the PER as a factor that appropriately applies to ALL players.
Unless there is predictive value in the stat, I agree with what others have said about it obscuring an issue more than illuminating an issue. The valuation of player ability within a basketball system is not made easier by this stat and it doesn't give me any meaningful information about a player's performance. It is an average of a hypothetical metric that is skewed to conform to our expectations of evaluations of top talent.
You guys wrote this post as if the Wages of Wins had never been published. PER is not exactly state of the art. It's not the Rosetta Stone, it's a stat system that dramatically overrates scoring. This fuss was started because PER rates Odom and Boykins equally, which is silly. If you look over to the Wages of Wins you can find at least one composite stat that thinks Odom has been twice as good as average over the last two years, while Boykins has been below average. That makes a lot more sense doesn't it?
I have to say, I am surprised how much time has gone into discussing the utility of per minute stats across all these posts. It seems like the conclusions generated by PER are casting a pall over the idea of per minute stats. It should be just the opposite. Looking at per minute stats should tell us that there is something very wrong with PER. Per minute stats, when paired with the idea of sample size, are just common sense. The weightings people apply to them are where the problem starts.
And FWIW, I have zero problem believing Renaldo Balkman is a much better basketball player than Marbury.
Instead of PER you can use VAR (value over replacement) which incorporates minutes played
PER should in some way use Chris Gatling's name in some sort of formalist role.
At least two teams looked at Gatling and said "man, this dude scores 20 points in 12 minutes off the bench" and moved him to the starting lineup, where he proceeded to score...20 points.
The point of per-minute stats is not to project what a player's stats would actually look like if he played X minutes, any more than the point of pace adjusted stats is to project what a player's stats would look like if his team played X possessions per game. This is a common and unfortunate misunderstanding.
The point of adjusting for minutes and pace is just to put players' production on an even playing field so that they can be readily compared. If you're measuring who's taller than whom, you better make sure the ground they're standing on is flat.
So if Boykins scores 20 points per 40 minutes (or whatever it is), no, no one is saying that give the man 40 minutes and he gets 20 points. The point is (say) that he gets more scoring bang for the buck than someone who scores 15 points per 40, even if that guy happens to score more points per game by virtue of playing more minutes.
It's definitely fair to point out differences in quality of play a bench player may face, compared to what a starter sees. That is another wrinkle that ideally we would be able to remove from the numbers, just as we can (easily) remove the confounds from minutes played. Short of properly controlling for the confound, we can just qualify our interpretations of the stats and be appropriately cautious.
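As a small illustration of the normalization being described, here is a per-40 conversion applied to two hypothetical stat lines (the players and their numbers are invented). It only rescales what happened in the minutes actually played; it makes no claim about what either player would do with a different workload.

```python
def per_40(points_per_game, minutes_per_game):
    """Rescale a per-game scoring average to a common 40-minute basis."""
    return points_per_game / minutes_per_game * 40

# Hypothetical stat lines: (points per game, minutes per game).
bench_scorer = (10.0, 20.0)
starter = (13.5, 36.0)

bench_rate = per_40(*bench_scorer)
starter_rate = per_40(*starter)

print(bench_rate, starter_rate)  # 20.0 15.0
# The starter scores more per game, the bench scorer more per minute.
```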
The point of per-minute stats is not to project what a player's stats would actually look like if he played X minutes, any more than the point of pace adjusted stats is to project what a player's stats would look like if his team played X possessions per game...The point of adjusting for minutes and pace is just to put players' production on an even playing field so that they can be readily compared.
I know this is the official line but I just don't buy it. Yes, Hollinger acknowledges that older players will be overrated in his system insofar as they don't have the stamina to play a whole game. But other than that, he generally claims that PER reflects productivity overall, not just productivity per X minutes played. Besides, the whole idea that per-minute adjustment puts a 10mpg player and 40mpg player "on an even playing field" implies that productivity doesn't change depending on the number of minutes played. Otherwise, the PER ranking would just be a ranking of itself, rather than a ranking of how good players actually are. If that was how Hollinger presented it, it would be fine. But it's not.
Silverbird:
No matter what Hollinger says, that is what normalizing for minutes played does. It makes a comparison easier, and then you use your head to take the comparison further. IE - if Sean May scores 20 points per 40 minutes, and Emeka Okafor scores 14, it becomes apparent that Sean May is likely a better scorer. However, he plays far fewer minutes, so then you have to take it further: what is holding him back? In his case, he's a bad defender and he's in bad shape. So clearly the problem for him isn't just a matter of small sample size, or playing inferior competition compared to Okafor (at least not much). He needs to get his ass in better shape.
No, prorating stats per 40 minutes isn't perfect, but it's (usually) better than comparing players who play different minutes directly.
yeah, manu's overrated compared to nash. and manu's been just about the central player on how many champeenship teams? and nash?
Stray comments:
PER/Usage is worth consideration
I agree with silverbird that more theoretical and detailed mechanical review of adjusted +/- would be worthwhile
as for silverbird's comment
"It’s amazing to me how little statistical energy is devoted to figuring out stuff like “what players/styles complement each other”"
I agree with that and have suggested a study of team w/l and point differential results for teams in the modern era of a certain 4 factor type or positional strength type (it could be as simple as above average, near average, below average characterizations or precise measurements)