In October, I published an article with Philip Polgreen, Yiling Chen, and Forrest Nelson (together representing the University of Iowa, Harvard, and Yahoo!) in the journal Clinical Infectious Diseases titled “Using Internet Searches for Influenza Surveillance”.
The paper describes how web search engines may be used to monitor and predict flu outbreaks. We studied four years of data from Yahoo! Search together with data on flu outbreaks and flu-related deaths in the United States. All three measures rise and fall as flu season progresses and dissipates, as you might expect. The surprising and promising finding is that web searches rise first, one to three weeks before confirmed flu cases, and five weeks before flu-related deaths. Thus web searches may serve as a valuable advance indicator for health officials to spot the onset of diseases like the flu, complementary to other indicators and forecasts.
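To make the "searches rise first" idea concrete, here is a minimal sketch of one way to estimate a lead time: shift the search series against the case series and keep the shift with the highest Pearson correlation. This is an illustration only, not the analysis from the paper. The data is synthetic, and the names `pearson`, `best_lead`, and `bump` are hypothetical helpers invented for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_lead(searches, cases, max_lead=6):
    """Return (lead, r): the number of weeks by which searches best
    anticipate cases, and the correlation at that shift."""
    scores = {}
    for lead in range(max_lead + 1):
        # Pair searches in week t with cases in week t + lead.
        x = searches[:len(searches) - lead]
        y = cases[lead:]
        scores[lead] = pearson(x, y)
    best = max(scores, key=scores.get)
    return best, scores[best]

def bump(center, weeks=30, width=3.0):
    """Synthetic seasonal peak (assumption: a smooth bell-shaped curve)."""
    return [math.exp(-((t - center) / width) ** 2) for t in range(weeks)]

# Synthetic example: search volume peaks in week 12, cases in week 14.
searches = bump(center=12)
cases = bump(center=14)
lead, r = best_lead(searches, cases)
print(f"best lead: {lead} weeks (r = {r:.3f})")
```

On real data the peak would be far less clean, and a high correlation at some shift is only a hint of predictive value, not proof of it.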
On November 11, the New York Times broke a story about Google Flu Trends, along with an unusual announcement of a pending publication in the journal Nature.
I haven’t read the paper, but the article hints at nearly identical results:
Google … dug into its database, extracted five years of data on those queries and mapped it onto the C.D.C.’s reports of influenzalike illness. Google found a strong correlation between its data and the reports from the agency…
Tests of the new Web tool … suggest that it may be able to detect regional outbreaks of the flu a week to 10 days before they are reported by the Centers for Disease Control and Prevention.
To the reporter’s credit, he interviewed Philip, and the article does mention our work in passing, though I can’t say I’m thrilled with the way it was framed:
The premise behind Google Flu Trends … has been validated by an unrelated study indicating that the data collected by Yahoo … can also help with early detection of the flu.
That framing gives (grudging) credit to Yahoo! data rather than Yahoo! people.
The story slashdigged around the blogomediasphere quickly and thoroughly, at one point reaching #1 on the nytimes.com most-emailed list. Articles and comments praise how novel, innovative, and outside-of-the-box the idea is. The editor in chief of Nature praised the “exceptional public health implications of [the Google] paper.”
I’m thrilled to see the attention given to the topic, and the Google team deserves a huge amount of credit, especially for launching a live web site as a companion to their publication, a fantastic service of great social value. That’s an idea we had but did not pursue.
In the business world, being first often means little. In the world of science, however, being first means a great deal and can be the determining factor in whether a study gets published. The truth is, although the efforts were independent, ours was published first (and Clinical Infectious Diseases scooped Nature), a decent consolation prize amid the go-google din.
Update 2008/11/24: We spoke with the Google authors and the Nature editors. Our paper is cited in the Google paper, which is now published, and is given fair treatment in the associated Nature News item. One nice aspect of the Google study is that they identified relevant search terms automatically by regressing all of the 50 million most frequent search queries against the CDC flu data. Congratulations and many thanks to the Google/CDC authors and the Nature editors, and thanks to everyone for your comments and encouragement.
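The automatic term selection mentioned above can be pictured with a toy sketch: score each candidate query by how closely its weekly volume tracks the CDC series, then keep the top scorers. This is an assumption-laden simplification and not the Google team's actual procedure. It uses plain Pearson correlation as the score, and the query strings, numbers, and the `rank_queries` helper are all made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_queries(query_series, cdc_series, top_k=2):
    """Return the top_k (query, r) pairs, best-correlated first."""
    scored = [(pearson(series, cdc_series), query)
              for query, series in query_series.items()]
    scored.sort(reverse=True)
    return [(query, r) for r, query in scored[:top_k]]

# Made-up weekly counts; real inputs would be far larger and noisier.
cdc = [1, 2, 4, 7, 9, 6, 3, 1]
queries = {
    "flu symptoms":      [2, 3, 6, 10, 13, 9, 5, 2],
    "fever remedy":      [1, 1, 3, 5, 6, 5, 2, 1],
    "basketball scores": [5, 6, 5, 7, 6, 5, 6, 5],
    "beach weather":     [9, 8, 6, 3, 1, 4, 7, 9],
}

top = rank_queries(queries, cdc)
for query, r in top:
    print(f"{query}: r = {r:.3f}")
```

With millions of candidates, some unrelated queries will correlate well by chance (anything seasonal, for instance), so a real pipeline would need held-out validation before trusting the selected terms.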
