Wednesday, August 31, 2011

Power To The Wiki People

While reading Robert Kurzban's (mostly good) book Why Everyone (Else) Is a Hypocrite, I came across a fragment of a sentence which annoyed me.

Kurzban is in the midst of explaining a computer science topic when he writes, "according to Wikipedia, which I am usually hesitant to use, but will for this purpose," and then block quotes, starting with "creating a domain-specific language...".

As it turns out, Wikipedia is written by people. Rp, a Dutch software developer, added this sentence on April 24, 2008. The only difference between Rp's change and the current version is a superfluous "of course" which was removed on May 12, 2008. This is not hard to decipher based on a gander at the page's revision history. It took me about four minutes.

Listen, I recognize that Wikipedia is low status, which is why Kurzban had to express his hesitation to use it. But we all know how Wikipedia works now. People edit it. You aren't constrained to merely cite "Wikipedia"; you can look back and see exactly which user wrote that passage first, and cite that user.

Some of my clever readers are probably already mentally defending the status quo by saying that some sentences or portions of articles have been edited so many times that it would be difficult to say who wrote them.

Yes, in very rare cases like The Iraq War this is the case, but it happens much less often than you'd expect. The vast majority of articles are edited in chunks of sentences, paragraphs, or sections, and these chunks are eminently traceable.

Perhaps the best thing about such a shift in citation norms is that it would help incentivize people to edit Wikipedia. If you think this is not an issue, you are sorely mistaken.

Consider the page on epigenetics. Inspired in part by Razib's manifesto about the growing importance of this topic, I have subscribed via RSS to the changes made to the page since January. What I expected were the vitriolic edit wars deserving of such an unfolding, important topic.

But I've seen nothing of the sort. In fact the page hasn't changed significantly since Team Cytokine Storm made some edits last November. In the meantime, how many words have been typed about epigenetics for publication elsewhere?

For example, in the past two months, there have been at least four academic reviews on topics in epigenetics (see here, here, here, and here). See for yourself--here is a pubmed search for "epigenetics review."

How many people will read these reviews? Do you think that more will read those reviews than will read the Wikipedia page? Is this a healthy division of labor?

nobody does homework on saturdays
Bottom Line: When quoting or referencing an article hosted on Wikipedia, cite the major user(s) that contributed, instead of just "Wikipedia."

Saturday, August 27, 2011

Punishing Praise

[B]ecause we tend to reward others when they do well and punish them when they do badly, and because there is regression to the mean, it is part of the human condition that we are statistically punished for rewarding others and rewarded for punishing them.
That's Daniel Kahneman, more here. Regression to the mean will occur in situations that involve at least some luck, which is to say, almost everywhere.
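To make Kahneman's point concrete, here is a minimal simulation sketch. It assumes a toy model that is not from the quoted source: every performance is constant skill plus Gaussian luck, and feedback has no effect at all. The function name, the thresholds, and the trial count are all illustrative choices.

```python
import random
from statistics import mean

def average_change_after_feedback(n_trials=100_000, threshold=1.0, seed=0):
    """Toy model: performance = constant skill + luck; feedback does nothing."""
    rng = random.Random(seed)
    after_praise = []   # change in performance following an unusually good trial
    after_punish = []   # change in performance following an unusually bad trial
    for _ in range(n_trials):
        skill = 0.0                         # assumed constant across trials
        first = skill + rng.gauss(0, 1)     # first attempt
        second = skill + rng.gauss(0, 1)    # second attempt, independent luck
        if first > threshold:               # good trial, so we "praise"
            after_praise.append(second - first)
        elif first < -threshold:            # bad trial, so we "punish"
            after_punish.append(second - first)
    return mean(after_praise), mean(after_punish)

praise_change, punish_change = average_change_after_feedback()
print("average change after praise:    ", round(praise_change, 2))
print("average change after punishment:", round(punish_change, 2))
```

Under these assumptions, performance falls on average after praise and rises on average after punishment, even though the feedback itself changes nothing.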

Luck seems especially inevitable once we consider that spontaneous fluctuations in your brain's dynamic states (as seen in fMRI BOLD responses) can help account for trial-to-trial variability in behavior.

One study found that the amount of pain that people feel following laser stimulation (equivalent to a pinprick) can be predicted (beyond 5% chance) based on baseline, spontaneous activity in certain brain regions three seconds before the stimulation. (pubmed, PNAS)

Another study found that 74% of the within-participant variability in a button press force task could be attributed to ongoing fluctuations in neural activity. (pubmed)

In light of the fact that seemingly uncontrollable neural fluctuations play such an important role in behavior on any given attempt, punishing people for poor performance on small sample sizes seems particularly pernicious.

But then again, most punishment is probably not really intended to improve future performance, but rather to improve the mood and status of the punisher. 

Thursday, August 25, 2011

Are We More Like Chimps Or Bonobos?

In Sex at Dawn, Ryan and Jethá argue that humans are more like bonobos. This is, in part, because we both 1) employ diverse sex positions, 2) have sex for non-reproductive ends, and 3) gaze into each other's eyes during sex (when this jibes with #1).

In his review (pdf, OA), Ryan Ellsworth disputes their thesis and makes the case for the chimp model. He emphasizes that humans and chimps (but not bonobos) share "sex-based hierarchies, sex-biased cooperation and coalitions, and intergroup hostility."

I've only skimmed Sex at Dawn, but I find Ellsworth's review much more persuasive. I'm happy being chimp-like.

(photo credit to patries71)

Sunday, August 21, 2011

Seven Thoughts On Rules And Willpower

1) John Tierney discusses ego depletion, the idea that willpower is an (unconsciously) expendable resource. He relates it to trade-offs: 
Once you’re mentally depleted, you become reluctant to make trade-offs, which involve a particularly advanced and taxing form of decision making.... To compromise is a complex human ability and therefore one of the first to decline when willpower is depleted. You become what researchers call a cognitive miser, hoarding your energy. If you’re shopping, you’re liable to look at only one dimension, like price: just give me the cheapest.
This is why I think learning about trade-offs can be so useful. The less novel a decision is, the fewer resources it should use up. The more general your schemas are, the more easily you'll adapt.

2) How do people typically deal with ego depletion? Apparently,
[E]ventually [you] look for shortcuts, usually in either of two very different ways. One shortcut is to become reckless: to act impulsively instead of expending the energy to first think through the consequences.... The other shortcut is the ultimate energy saver: do nothing. Instead of agonizing over decisions, avoid any choice. Ducking a decision often creates bigger problems in the long run, but for the moment, it eases the mental strain. 
I think Tierney is a bit off here, as he neglects a crucial strategy: devising rules. "No coffee after two, no liquor before five." "Always take the middle option." Even, much more perniciously, statistical discrimination. For better or worse, we advance cognitively by thinking less.

3) Marketers know that we use rules, and they (wisely) use our rules against us. This makes rarer rules more valuable (holding effectiveness equal), as they will be less exploitable. 

4) Before you go off devising your own un-gameable rule system, recognize that there are trade-offs to thinking about topics like this. It might be a better use of your time to just go along with most of the status quo rules and accept that you'll sacrifice some small amount of money to savvy marketers. As Ice T says, it's not about being mad at everything, it's about being really mad at the right things.

5) If willpower is a muscle, can you train it? Yes, self-control training can improve one's willpower to complete unrelated tasks. For example, in one study, maintaining better posture for two weeks (and keeping a diary about it) significantly improved hand-grip persistence (link, pdf). More examples here.

6) As Tierney's article discusses, decision fatigue often helps trap people in poverty. But since willpower is apparently like a muscle, shouldn't exercising decision circuits improve willpower enough over time to escape the trap? 

7) I expect that the confound with the above is an interaction with stress. Making self-control decisions when you feel comfortable and empowered increases your set "willpower" level. But the emotionally stable undergraduates studied are not very representative of the population at large. When people must make decisions under psychological duress, it might instead condition a sort of "decision avoidance." Kind of like how overtraining your muscles can actually decrease strength.

Addendum 8/25: See Robert Kurzban's astute criticisms of this model (HT Brian Potter).  

Saturday, August 20, 2011

The Motion Mystery

Even to eminent thinkers, an explanation for motion was seemingly unknowable near the turn of the 20th century.

For example, in the late 19th century Thomas Huxley said that "existence, motion, and law-abiding operation in nature are more stupendous miracles than any recounted by the mythologies...".

Santiago Ramón y Cajal went further in the early 20th century, writing, "there is no doubt that the human mind is fundamentally incapable of solving these formidable problems (the origin of life, nature of matter, origin of movement, and appearance of consciousness)."

It turns out that in the early 21st century we now have a pretty good answer to this mystery. That answer comes in the form of molecular machines.

As Steven Pinker explains, "the stuff of life turned out to be not a quivering, glowing, wondrous gel but a contraption of tiny jigs, springs, hinges, rods, sheets, magnets, zippers, and trapdoors...".

One example of such a machine is ATP synthase, which literally works like a rotor:

Now, molecular machines don't explain why atoms themselves move (you'd need to go deeper for that), but their existence certainly does explain why you can move in the absence of an outside force and a rock cannot.

If we don't kill ourselves (from, e.g., an environmental disaster or nuclear war) first, we will continue to solve puzzles that some of us currently consider intractable mysteries.

And in the meantime, we should cultivate some doubt in our doubt of the potential of science.

Saturday, August 13, 2011

Searching For The Imdb Of Books, Part II

As watching imdb's top 250 most highly rated movies has proven to be such a smashing success, I have long yearned to find (and fleetingly, to develop) a similarly authoritative list for fiction books. The keys for a good list are: 1) a large sample size, 2) shrinkage estimation of ratings to the average, 3) a continuous scale (the more levels, the better, but yes we'll often have to settle for five stars), 4) defenses against gaming, and 5) a wide index of titles. To the best of my knowledge no site fulfills all of these requirements. These are the current contenders:

Amazon Reviews: Upside: They have a huge incentive to index all available books and are proficient at combining ratings across different editions of the same text. They also have a useful "was this review helpful to you?" tool which could eventually be employed to rate the raters and thus weight the overall ratings. Downsides: Their insistence on showing the average rating in half-star increments (typically 5, 4.5, 4, or 3.5) means that it involves manual calculation to distinguish between the two radically different scores of 4.24 and 3.76. I also often don't trust the resistance of their ratings to gaming. But most damningly, there is simply no attempt to create a good list of the most highly rated fiction books. Filtering by "highest average rating" in "literature and fiction", their #7 best fiction book of all time is currently Jim Gorant's The Lost Dogs: Michael Vick's Dogs and Their Tale of Rescue and Redemption, which is probably a fine book, but I think the author would be insulted to hear that it was considered fiction, and I think more than three-quarters of the English profs across the country would be insulted to hear it called literature.
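The half-star complaint is easy to check with a few lines of Python. This sketch assumes the displayed value is the average rounded to the nearest half star; the site's actual rounding rule is not documented here, and the function name is made up for illustration.

```python
def displayed_stars(average):
    """Round an average rating to the nearest half star (assumed display rule)."""
    return round(average * 2) / 2

for average in (4.24, 3.76):
    print(average, "->", displayed_stars(average))
```

Under that assumption, both 4.24 and 3.76 display as 4 stars, so nearly half a star of real difference is hidden from the reader.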

One-Time Votes: By this I refer to ad hoc competitions on various websites which ask users to vote on their favorite books. There are many of these strewn across the web; for example, check out NPR's top 100 science fiction and fantasy books, or Modern Library's top 100 novels. Upside: These tend to get large sample sizes (NPR had >60,000 votes), which makes them more accurate and harder to game. Downside: The process is not iterative and requires manual input to update, so they won't last or scale. More troublingly, many (such as NPR's) only allow the option to select one's favorite books, without voting others down, which unfairly favors books with high variance as opposed to just high average quality.
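The claim that favorites-only voting rewards variance can be illustrated with a small simulation. Everything in it is a made-up assumption: ten "steady" books that almost everyone enjoys about equally, one "polarizing" book with a lower average but a much wider spread, and voters who each name the single book they personally enjoyed most.

```python
import random
from collections import Counter

def run_poll(n_voters=20_000, n_steady=10, seed=0):
    """Simulate a favorites-only poll; return (favorite counts, true averages)."""
    rng = random.Random(seed)
    # (mean enjoyment, standard deviation) per book; all values are hypothetical
    books = {f"steady_{i}": (4.0, 0.3) for i in range(n_steady)}
    books["polarizing"] = (3.5, 1.5)

    favorites = Counter()
    totals = {title: 0.0 for title in books}
    for _ in range(n_voters):
        # Each voter's private enjoyment score for every book
        scores = {t: rng.gauss(mu, sd) for t, (mu, sd) in books.items()}
        for title, score in scores.items():
            totals[title] += score
        # The voter can only name one favorite and cannot vote anything down
        favorites[max(scores, key=scores.get)] += 1
    averages = {t: totals[t] / n_voters for t in books}
    return favorites, averages

favorites, averages = run_poll()
print("most favorite votes:", favorites.most_common(1)[0][0])
print("lowest average:     ", min(averages, key=averages.get))
```

In this toy setup the polarizing book collects the most favorite votes while also having the lowest average enjoyment, which is the distortion described above.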

Google Books: The site aggregates ratings from elsewhere on the web, including major vendors and online "bookshelves." Upside: Transparent code, takes ratings from diverse sources, and has a clean layout. Downside: Like Amazon, also displays ratings in half-star increments (et tu, google?). But their biggest problem is that different editions of books are stored in different locations and the ratings are not aggregated across editions. See, for instance, the first four results of a search for "pride and prejudice" (here, here, here, and here). Now, even if they did manage to output one total score per novel, it still doesn't seem very google-like to actually curate such a list themselves. But in that case, it wouldn't be hard for someone else to scrape the ratings and convert them into a ranked list. 
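As a rough sketch of what aggregating across editions and ranking could look like, the snippet below pools per-edition averages into one count-weighted score per work. The input tuples, the numbers, and the idea of grouping by an identical title string are all assumptions for illustration; matching editions reliably is the hard part and is not attempted here.

```python
from collections import defaultdict

def combine_editions(editions):
    """Pool (work_title, average_rating, rating_count) rows into one score per work."""
    rating_sums = defaultdict(float)
    rating_counts = defaultdict(int)
    for title, average, count in editions:
        rating_sums[title] += average * count   # recover the edition's rating total
        rating_counts[title] += count
    return {
        title: (rating_sums[title] / rating_counts[title], rating_counts[title])
        for title in rating_counts
        if rating_counts[title] > 0
    }

# Hypothetical per-edition data, not scraped from any real site
editions = [
    ("Pride and Prejudice", 4.5, 200),
    ("Pride and Prejudice", 4.0, 100),
    ("Pride and Prejudice", 3.0, 100),
    ("Some Other Novel", 3.5, 50),
]
combined = combine_editions(editions)
ranked = sorted(combined.items(), key=lambda item: item[1][0], reverse=True)
for title, (average, count) in ranked:
    print(f"{title}: {average:.2f} from {count} ratings")
```

Weighting by rating count keeps an obscure edition with a handful of ratings from dragging the combined score around.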

Library Thing: Upside: They have scale, with over 10 million ratings, and they already have some pretty cool statistics (check out the most "connected" people--Napoleon is #1). They also do have a top 25 books by rating. Downside: They need to split the rankings for non-fiction and fiction. At this point I've given up on searching for a canonical non-fiction ranked list, as those ratings are so context-dependent and world-view driven. And they need to do a better job of categorizing in general. For example, the movie for LoTR:Two Towers, while an awesome movie and in imdb's top 250, should not be among the highest rated 25 books. More importantly, the editors of the site have not implemented a rating system that punishes books with fewer ratings. Instead, books simply need a minimum total of 20 ratings to make the list. This is bothersome, but easily improved, as the editors could simply study and implement the imdb method.
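For readers unfamiliar with shrinkage, here is a minimal sketch of a rating that is pulled toward the site-wide average when a book has few ratings. This weighted-average form is, as far as I know, the general shape of the formula imdb has described for its top 250, but the site mean and prior weight below are invented values, not anyone's real parameters.

```python
def shrunk_rating(average, n_ratings, site_mean, prior_weight):
    """Shrink a book's average toward the site mean.

    WR = (v / (v + m)) * R + (m / (v + m)) * C, where R is the book's average,
    v its number of ratings, C the site-wide mean, and m the prior weight.
    """
    return (n_ratings * average + prior_weight * site_mean) / (n_ratings + prior_weight)

SITE_MEAN = 3.8      # hypothetical average rating across all books
PRIOR_WEIGHT = 500   # hypothetical; acts like 500 "phantom" average ratings

obscure = shrunk_rating(4.9, 20, SITE_MEAN, PRIOR_WEIGHT)     # 20 glowing ratings
popular = shrunk_rating(4.5, 5000, SITE_MEAN, PRIOR_WEIGHT)   # 5000 solid ratings
print(f"obscure book: {obscure:.2f}")
print(f"popular book: {popular:.2f}")
```

With these numbers the book with 20 ratings at 4.9 lands near the site mean, well below the book with 5000 ratings at 4.5. Unlike a hard 20-rating cutoff, the penalty fades smoothly as ratings accumulate.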

Good Reads: Upside: As far as I can tell, this is the largest "bookshelf" site with the most user ratings. Huge potential. Downside: They've made no attempt to publish a list of the highest rated books across the site! All I can ask is, what is holding you back, GoodReads editors? Qualms about alienating authors whose works won't make the list? Fears of being labelled imperialistic? These are both hogwash. Our time is scarce and in order to be informed consumers we need to know what the best books are. If you are worried about the arbitrariness of the minimum votes cut-off, then publish multiple lists with different scaling parameters. You will thank me later when the list gets out-of-control traffic. Indeed, a group of passionate GoodReads users recently called for such a list. To this valiant effort I can only say, Viva la Résistance!