Wednesday, August 13, 2008

Rating incompleteness theorem

Background: The anchoring effect occurs when an initial value gets attached to an object and subsequent guesses or proposals hover around that first number. In the classic study demonstrating the effect, individuals were asked what percentage of African nations were members of the UN. Those who were first asked "is it higher or lower than 10%?" gave lower answers than those asked "is it higher or lower than 65%?".

The anchoring effect's application to IMDb's rating system is obvious. If you are rating a movie that is already rated highly, you will probably give it a higher rating than you otherwise would. Statistically, this means the ratings are not independent: each one adjusts at least slightly based on what the previous data points have been.
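To make the dependence concrete, here is a rough Python sketch of a toy model. Everything in it is an assumption for illustration, not anything the site actually does: each rater has a private opinion drawn from a bell curve centered at 6.5, and reports a blend of that opinion and the average currently on display. The names `simulate_final_average`, `spread_of_final_average`, and `anchor_weight` are made up for this example.

```python
import random
import statistics


def simulate_final_average(n_raters, anchor_weight, rng):
    """Return the displayed average after n_raters vote one after another."""
    total = 0.0
    for i in range(n_raters):
        # Hypothetical private opinion, clamped to a 1-10 scale.
        opinion = min(10.0, max(1.0, rng.gauss(6.5, 2.0)))
        if i == 0:
            rating = opinion  # the first rater has no average to anchor on
        else:
            displayed = total / i
            # Assumed anchoring model: blend own opinion with the displayed average.
            rating = (1 - anchor_weight) * opinion + anchor_weight * displayed
        total += rating
    return total / n_raters


def spread_of_final_average(anchor_weight, trials=500, n_raters=200, seed=0):
    """Standard deviation of the final average across many simulated movies."""
    rng = random.Random(seed)
    finals = [
        simulate_final_average(n_raters, anchor_weight, rng)
        for _ in range(trials)
    ]
    return statistics.pstdev(finals)


independent = spread_of_final_average(0.0)  # nobody looks at the average
anchored = spread_of_final_average(0.5)     # raters lean halfway toward it

print(f"spread with independent ratings: {independent:.3f}")
print(f"spread with anchored ratings:    {anchored:.3f}")
```

In this toy model the anchored average wanders more from movie to movie than the independent one, because the first few raters keep echoing through everyone who votes after them. Real raters are surely messier than a fixed blend weight, so treat this as an illustration of the dependence, not a measurement of it.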

This is a fairly big problem for IMDb, and I've thought a lot about how they could fix it. One option would be to reveal a movie's overall rating only after the user has rated it. But this would cripple the usefulness of IMDb: the rating system's chief utility is to tell us which movies we should watch, not to let us sit around ranking movies we've already seen.

We're left at an impasse, a catch-22, an inconsistent self-referential loop. If you strive to eliminate the anchoring effect, you destroy the utility of the rating system. But if you allow the anchoring effect, then your values are biased. The system is incomplete, and I'm not sure that you can solve it without vastly decreasing the sample size. Obviously I'm open to any of your ideas.