Thursday, April 15, 2010

Any Rating System Trumps None

"Nihilists! Fuck me. I mean, say what you like about the tenets of National Socialism, Dude, at least it's an ethos." - Walter, The Big Lebowski

There are lots of lists that attempt to describe the top movies, with various methodologies. Here are three of the major ones:

1) The American Film Institute determined its top 100 list (here) by having film "experts" each create their own top 100 list from 400 nominated movies. Movies were ostensibly judged based on winning awards (read: the Oscars), popularity (box office, syndication, home video sales), historical significance, and cultural impact. It's by far the list people cite most when I bring up top movie lists, probably because AFI's lists have been on TV a decent amount and, when it comes down to it, Americans watch a shocking amount of TV.

2) Metacritic compiles its top 200 list (here) by averaging the subjective ratings of various movie critics. They ask for user votes but don't actually count them towards the top 200. The big supposed upside of their list is that the ratings should be higher quality because they are based on published reviews. The main disadvantage of their list is the low sample size, which leads to more random noise. For example, Superman II is #2 on their all time list on the basis of a whopping 7 critic votes, while it has the class average of a 6.7 on imdb based on 20,000+ votes. So, Metacritic needs to convert to a Bayesian system that punishes low sample sizes in some way. Unfortunately, they also don't include many older movies, as most of their reviews are from the past 10 years.

3) The internet movie database determines its top 250 (here) via user ratings. Anyone with a valid email address who can pass a CAPTCHA test can rate an individual movie, but you have to have cast a certain number of votes and meet various other criteria, which imdb doesn't disclose, for your vote to count towards the top 250. They use a system that punishes low sample sizes, avoiding the Superman II problem. Compared to the other two lists, theirs is more diverse, including both more old movies (as compared to metacritic) and more foreign movies (as compared to AFI). Their big problem is recent movies, which start off rated much higher than where they end up (see: the Dark Knight), and the system doesn't correct for that. Admittedly, the fact that many others are against imdb's ratings probably makes me like it more, and I am also biased because I'm currently watching the top 250 and have invested a lot of time into it. But I definitely do think that it's the best.
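
To make "punishes low sample sizes" concrete, here is a minimal sketch of one common way to do it: shrink each movie's average toward a global average, with the pull getting weaker as the vote count grows. This is an illustration, not imdb's or Metacritic's actual method. The function name, the prior mean of 6.5, the prior weight of 50, and the example ratings are all made-up assumptions; only the vote counts (7 and 20,000) echo the Superman II example above.

```python
def weighted_rating(mean_rating, num_votes, prior_mean, prior_weight):
    """Shrink a movie's mean rating toward a global prior mean.

    prior_weight acts like a number of "phantom" votes cast at prior_mean,
    so a movie with few real votes stays close to the prior.
    All names and parameters here are hypothetical, for illustration only.
    """
    if num_votes < 0 or prior_weight < 0:
        raise ValueError("vote counts and prior weight must be non-negative")
    total = num_votes + prior_weight
    if total == 0:
        return prior_mean
    return (num_votes * mean_rating + prior_weight * prior_mean) / total


# Hypothetical numbers: a glowing average from 7 votes versus a
# slightly lower average from 20,000 votes.
PRIOR_MEAN = 6.5    # assumed average rating across all movies
PRIOR_WEIGHT = 50   # assumed strength of the prior, in votes

few_votes = weighted_rating(9.9, 7, PRIOR_MEAN, PRIOR_WEIGHT)
many_votes = weighted_rating(8.5, 20000, PRIOR_MEAN, PRIOR_WEIGHT)

print(f"7 votes at 9.9      -> {few_votes:.2f}")
print(f"20,000 votes at 8.5 -> {many_votes:.2f}")
```

With these made-up inputs, the 7-vote movie gets pulled most of the way back toward the global average, while the 20,000-vote movie barely moves, so the heavily rated movie ends up ranked higher even though its raw average is lower. How strong the pull should be (the prior weight) is a judgment call that any real list would have to tune.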

What are some metrics by which we can compare these systems? One way would be to look at other systems that use fan votes as compared to expert votes. For example, the NBA All-Star game relies on fan votes to determine its starters, while it relies on journalists to vote on the MVP. The fan votes tend not to be highly correlated with the quality of the player that year. Allen Iverson has been voted a starter each of the last three years even though his stats have been awful. MVP votes are probably more correlated with a player's statistical success, although experts aren't perfect either: Steve Nash probably shouldn't have won it twice.

So, NBA All-Star votes might count as evidence against imdb. And perhaps that kind of example is why people don't trust imdb? I would argue that sports are qualitatively different because most people don't actually watch all of the games, whereas most everyone who votes on imdb has actually watched the movies.

Regardless, I think sober, intelligent minds can disagree about the relative merits of each of these systems. Your personal preference will probably depend on the extent to which you believe quality is universal, and how much you trust the opinion of insider elites as compared to normal folks.

Much more troubling is the lack of any system at all, of just wandering through the world like a little boy, lost, looking for his mommy. For example, critic Jonathan Rosenbaum thinks that presenting AFI's list of movies in order is "tantamount to ranking oranges over apples or declaring cherries superior to grapes." His attitude is just pure nihilism, through and through.