Oddhead Blog

Musings of a computer scientist and yahoo about
prediction markets, gambling, and estimating the odds of everything

November 20th, 2008

Babel: English Lit Syndrome meets Economics 101

My wife and I just finished watching Babel, a movie about people lost in foreign cultures struggling to communicate.

It turns out that when you pop in the DVD and hit play, by default there are no subtitles, despite the fact that the majority of the dialog takes place in Moroccan, Japanese, sign language, and Spanish.

I suffered from English Lit Syndrome, thinking how cool it was that the filmmakers made you feel lost along with the characters, and recalling the spot-on memoryless feel of Memento.

My wife insisted that there must be something wrong. Perhaps we missed a setting or choice among the menu options for subtitles? As the Japanese storyline reached its close, with lengthy and intricate back-and-forth dialog between characters whose relationships I hadn’t the least clue about, I realized that maybe, just maybe, she was right.

When the movie ended, I dug back into the menu. Lo and behold, there in a “settings” submenu was a choice for subtitles: English, Spanish, or none. Default on “none”.

My artistic elitism crumbled into simple annoyance.

Poking around online, it turns out I’m not the only one duped by the DVD bug or struck by ELS.

Just think of all the time wasted by people watching the movie in incomprehension, investigating the problem, getting irked, and most especially complaining about it online.

A classic Econ 101 lesson in efficiency lost.

But wait! The DVD spurred the disorganized masses to work together to produce a tower of criticism. How clever!

November 14th, 2008

The “predict flu using search” study you didn’t hear about

In October, Philip Polgreen, Yiling Chen, Forrest Nelson, and I (representing the University of Iowa, Harvard, and Yahoo!) published an article in the journal Clinical Infectious Diseases titled “Using Internet Searches for Influenza Surveillance”.

The paper describes how web search engines may be used to monitor and predict flu outbreaks. We studied four years of data from Yahoo! Search together with data on flu outbreaks and flu-related deaths in the United States. All three measures rise and fall as flu season progresses and dissipates, as you might expect. The surprising and promising finding is that web searches rise first, one to three weeks before confirmed flu cases, and five weeks before flu-related deaths. Thus web searches may serve as a valuable advance indicator for health officials to spot the onset of diseases like the flu, complementary to other indicators and forecasts.
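To make the lead-time idea concrete, here is a minimal sketch in Python of one way to estimate how far one weekly series runs ahead of another: slide the two series against each other and keep the shift with the highest Pearson correlation. This is not the analysis from the paper. The data is synthetic, and the names `pearson`, `best_lead`, and `bump` are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_lead(searches, cases, max_lead=6):
    """Return (lead, r): the shift in weeks at which searches[t]
    correlates best with cases[t + lead]."""
    best = None
    for lead in range(max_lead + 1):
        # Pair each search week with the case count `lead` weeks later.
        xs = searches[: len(searches) - lead]
        ys = cases[lead:]
        r = pearson(xs, ys)
        if best is None or r > best[1]:
            best = (lead, r)
    return best


def bump(t, peak):
    """A bell-shaped seasonal curve peaking at week `peak` (toy data only)."""
    return math.exp(-((t - peak) ** 2) / 18.0)


# Synthetic data: the case curve is the search curve delayed by two weeks.
weeks = range(30)
searches = [bump(t, 12) for t in weeks]
cases = [bump(t, 14) for t in weeks]

lead, r = best_lead(searches, cases)
print(f"searches lead cases by {lead} weeks (r = {r:.3f})")
```

On this toy data the case curve is an exact two-week-delayed copy of the search curve, so the sketch reports a lead of two weeks. Real surveillance data is far noisier, and a single best shift is only a rough summary of it.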

On November 11, the New York Times broke a story about Google Flu Trends, along with an unusual announcement of a pending publication in the journal Nature.

I haven’t read the paper, but the article hints at nearly identical results:

Google … dug into its database, extracted five years of data on those queries and mapped it onto the C.D.C.’s reports of influenzalike illness. Google found a strong correlation between its data and the reports from the agency…

Tests of the new Web tool … suggest that it may be able to detect regional outbreaks of the flu a week to 10 days before they are reported by the Centers for Disease Control and Prevention.

To the reporter’s credit, he interviewed Philip and the article does mention our work in passing, though I can’t say I’m thrilled with the way it was framed:

The premise behind Google Flu Trends … has been validated by an unrelated study indicating that the data collected by Yahoo … can also help with early detection of the flu.

giving (grudging) credit to Yahoo! data rather than Yahoo! people.

The story slashdigged around the blogomediasphere quickly and thoroughly, at one point reaching #1 on the nytimes.com most-emailed list. Articles and comments praise how novel, innovative, and outside-of-the-box the idea is. The editor in chief of Nature praised the “exceptional public health implications of [the Google] paper.”

I’m thrilled to see the attention given to the topic, and the Google team deserves a huge amount of credit, especially for launching a live web site as a companion to their publication, a fantastic service of great social value. That’s an idea we had but did not pursue.

In the business world, being first often means little. However, in the world of science, being first means a great deal and can be the determining factor in whether a study gets published. The truth is, although the efforts were independent, ours was published first (and Clinical Infectious Diseases scooped Nature), a decent consolation prize amid the go-google din.

Update 2008/11/24: We spoke with the Google authors and the Nature editors and our paper is cited in the Google paper, which is now published, and given fair treatment in the associated Nature News item. One nice aspect of the Google study is that they identified relevant search terms automatically by regressing all of the 50 million most frequent search queries against the CDC flu data. Congratulations and many thanks to the Google/CDC authors and the Nature editors, and thanks to everyone for your comments and encouragement.
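For a rough feel of what picking search terms automatically can look like, here is a small Python sketch that scores each candidate query by how well its weekly volume tracks a flu series and keeps the top few. It is a simplified stand-in, not the Google team's actual pipeline: it ranks by plain Pearson correlation rather than fitting a regression per query, and every query string and number below is invented for illustration, as are the names `pearson` and `rank_queries`.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_queries(query_series, flu_series, top_k=2):
    """Return the top_k query strings whose volume correlates best with flu_series."""
    scored = [
        (pearson(series, flu_series), query)
        for query, series in query_series.items()
    ]
    scored.sort(reverse=True)  # highest correlation first
    return [query for _, query in scored[:top_k]]


# Invented weekly flu counts and invented query volumes (toy data only).
flu = [1, 2, 4, 7, 9, 6, 3, 1]
queries = {
    "flu symptoms": [2, 3, 5, 9, 11, 7, 4, 2],
    "fever remedy": [1, 2, 3, 6, 8, 6, 3, 2],
    "beach holidays": [9, 8, 6, 3, 1, 2, 5, 8],
    "movie times": [5, 5, 4, 5, 6, 5, 5, 4],
}

top = rank_queries(queries, flu)
print(top)
```

With millions of candidate queries some will correlate with flu purely by chance or by shared seasonality, so a real system would need to validate its picks on held-out data.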