Tuesday, March 8, 2011

How To Find Which Basketball Stats Matter

Dave Johns today looked at whether basketball stats can actually tell us how much a player is helping his team. It seems that there are two key problems:

1) The one stat we know must be good in the long run, adjusted plus-minus, is too noisy to support good inferences from short-run data.

2) We have lots of short-run game data on stats like points, rebounds, and FG%, but we don't know how to interpret them in terms of how much they actually help the team. For example, Kevin Love is piling up boards, but does that actually help the Timberwolves win ballgames?

One approach to solving these problems would be to use a large set of training data for both adjusted +/- and summary stats, spanning many years. For each player and each game (or even each quarter), you use the statistics from (2), like points and rebounds, as features to predict the player's adjusted +/- in that time period. Some of the statistics will predict the +/- really well, whereas others won't. So going forward, we'll be able to say which of the stats are good short-run proxies for long-run +/- and which are not. That's it. It will be beautiful.
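
Here is a rough sketch of what that could look like in Python. Everything in it is an assumption on my part: the data is synthetic (random numbers with a made-up relationship to +/-), the function name `rank_stats_by_predictive_power` is hypothetical, and plain least squares on standardized features is just one simple way to compare stats, not necessarily what anyone actually uses.

```python
import numpy as np

def rank_stats_by_predictive_power(X, y, names):
    """Fit least squares on standardized features and rank stats by |weight|.

    X: (n_samples, n_stats) box-score stats, one row per player-game.
    y: (n_samples,) adjusted +/- for the same player-game.
    Returns a list of (stat_name, weight), strongest predictor first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize each stat so the fitted weights are comparable across stats.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    # Add an intercept column, then solve the least-squares problem.
    A = np.column_stack([np.ones(len(y)), Xz])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    weights = coef[1:]  # drop the intercept
    order = np.argsort(-np.abs(weights))
    return [(names[i], float(weights[i])) for i in order]

# Synthetic stand-in for real data (NOT real NBA numbers).
rng = np.random.default_rng(0)
n = 5000
names = ["points", "rebounds", "assists"]
X = rng.normal(size=(n, 3)) * [8.0, 4.0, 3.0] + [15.0, 6.0, 4.0]
# Made-up ground truth: points and assists move +/-, rebounds do not.
y = 0.5 * X[:, 0] + 0.0 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(scale=10.0, size=n)

ranking = rank_stats_by_predictive_power(X, y, names)
for name, weight in ranking:
    print(f"{name}: {weight:+.2f}")
```

With real data you would want to check the fit on held-out games rather than trust the in-sample weights, since the whole problem is that short-run +/- is noisy.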

For a few minutes I thought about trying to do this myself, but I couldn't find easy enough access to raw +/- data. That is to say, screen scraping nba.com does not sound like much fun. If anyone knows of a nice and clean data set, holla acha boy.

Frankly, I was surprised Johns didn't mention this approach in his article (thus this post), but I assume that it's what teams using bball sabermetrics are doing. The approach is similar to the Netflix Prize or to many articles in machine learning, like Burstein et al. '09, who predict the function of proteins based on training features. I'm pasting their figure 1 below for a schematic of the process, although note that their output is binary whereas ours would be continuous (so strictly speaking, a regression rather than a classification). Think of the "features" as either simple stats like assists, or more complicated ones like under-/over-shooting, and think of "classification algorithms" as either naive things like "how many points did the player score?", or more complicated things like John Hollinger's PER.

doi:10.1371/journal.ppat.1000508