Category Archives: prediction markets

Predicting media success

Often, predicting success is being a success. Witness Sequoia Capital or Warren Buffett.

In the media industry (e.g., books, celebs, movies, music, tv, web), predicting success largely boils down to predicting popularity.

Predicting popularity would be wonderfully easy, if it weren’t for one inconvenient truth: people herd. If only people were as fiercely independent as they sometimes claim to be — if everyone decided what they liked independently, without regard to what others said — then polling would be the only technology we would need. A small audience poll would foreshadow popularity with high accuracy.

Alas, such is not the case. No one consumes media in a vacuum. People are persuaded by influencers and influenced by persuaders. People respond in whole or in part to the counsel of critics, peers, viruses, and (yes) advertisers. So, what becomes popular is not simply a matter of what is good. What becomes popular depends on a complex dynamic process of spreading influence that’s hard to track and even harder to predict.

Columbia sociologist (and I’m happy to note future Yahoo) Duncan Watts and his colleagues conducted an artful study, described eloquently in the NY Times, asking just how much of media success reflects the true quality of the product, and how much is due to the quirks of social influence. In a series of carefully controlled experiments, the authors tease apart two distinct factors in a song’s ultimate success: (1) the inherent quality of the song, or the degree to which people like it when presented with it in isolation, and (2) dumb luck, or the extent to which the song happens by chance to get some of the best early buzz, snowballing it to the top of the charts in a self-fulfilling prophecy. Lo and behold, they found that, while inherent quality does matter, the luck of the draw plays at least as big a role in determining a song’s ultimate success.

If so, Big Media might be forgiven for their notoriously poor record of picking winners. Over and over, BM foists on us stinkers like Gigli and stale knockoffs like Treasure Hunters. (In prediction lingo, these are false positives.) At the same time, BM snubbed (at least initially) some cultural institutions like Star Wars and Seinfeld. (False negatives.)

So, are media executives making the best of a bad situation, eking out as much signal as possible from an inherently noisy process? Or might some other institution yield forecasts with fewer false-atives?

I think you know where this is going. Prediction markets for media!

Media Predict is exactly that: a new prediction market aimed at forecasting media success. I’d like to congratulate founder Brent Stinski on a spectacular launch done right. Media Predict sprinted out of the gates with a deal with Simon & Schuster’s Touchstone Books and a companion piece in the NY Times, spawning coverage in The Economist and NPR. (Also congrats to Inkling Markets, the “powered by” provider.) More importantly, the website is clean, clear, complete (enough), and ready for launch.

I first met Brent Stinski in 2006 at Collabria’s NYC Prediction Markets Summit and his concept impressed me. Among the flurry of recent play-money PM startups, Media Predict’s business plan seems one of the most credible. The site taps simultaneously into the wisdom-of-crowds ethos, the user-generated content explosion, artists’ anti-establishment streak, and the public’s ambivalence toward Big Media. (The latter two factors are epitomized nowhere more vehemently or eloquently than in an essay by Courtney Love, and they stoke the fires of sites like Garage Band, Magnatune, Creative Commons, Lulu, Kinooga, and even MySpace, not to mention mashup fever, open source, anti-DRM-ism, etc.)

The New York publishing world is ridiculing Simon & Schuster for ceding its editorial power to the crowd. (In fact, S&S reserves the right to choose any book or none at all.)

Time will tell whether prediction markets can be better than (or at least more cost effective than) traditional media executives. One thing is for certain: one way or another, the power structure in the publishing world is changing rapidly and dramatically (no one sees and explains this better than Tim O’Reilly). My bet is that many artists and consumers will emerge feeling better than ever.

My first best answer

I felt bad about this. So I made sure to answer it. And whaddya know? They like me, they really like me!


——– Original Message ——–
Subject: Yahoo! Answers: Your answer has been chosen as the best answer
Date: 24 Apr 2007 09:44:59 -0700
From: Yahoo! Answers
To: pennockd

Hey, Dave, look what you got!

Congratulations, you’ve got a best answer and 10 extra points!

Your answer to the following question really hit the spot and has been chosen as the best answer:

Who will win in 2008 and why (real answers)?

Go ahead, do your victory dance. Celebrate a little. Brag a little.
Then come back and answer a few more questions!

Take me to Yahoo! Answers

Thanks for sharing what you know and making someone’s day.

The Yahoo! Answers Team

Get the Yahoo! Toolbar for one-click access to Yahoo! Answers.



[My Yahoo! Answers stats]
So now I’m 1 for 1! I can see how this gets addictive.

Betting on Sirius and XM to … die

One of the great things about intrade (recently split from TradeSports) is that they are open to suggestions from wide-eyed academics. For example, at Justin and Eric‘s urging, intrade listed several simple combinatorial markets, including baskets of states (e.g., “FL+OH”) in the 2004 US Presidential election and an October surprise market probing for a statistical correlation between Bush’s 2004 reelection and bin Laden’s capture.

Recently, again at Eric and Justin’s request, intrade launched a Sirius-XM merger market to predict whether the two satellite radio companies’ wedding vows will be blessed by the U.S. Department of Justice and the Federal Communications Commission.

The picture of XiriuM as a powerful monopoly threatening consumer choice is, to put it bluntly, laughable.

If the DOJ nixes this merger, it can be due only to a horrible misunderstanding of the march of communications technology. One by one, nearly every communications medium is converging to operate “over IP”: data, voice, music, print, TV, video, you name it. Audio in your car should be no exception. Does anyone doubt that sooner rather than later every car (indeed every person) will be connected to the Internet? Then why would I pay extra (a good deal extra if the naysayers are to be believed) for one-size-fits-all satellite programming when I can simply hop on the Internet and tap into my personalized Launchcast radio or my iTunes account? Orbital space machinery must weigh a little more heavily on the balance sheet than rack space in Quincy: how will XiriuM possibly compete once Internet radio has equal access into consumers’ cars?

The problem I see for XiriuM is that one-way purely broadcast technologies are nearing extinction. Even if some media don’t directly utilize the Internet or even TCP/IP, they will almost surely use a two-way communications link of some kind. Why? Ostensibly, because consumers want personalization and interactivity. Perhaps more to the point, because publishers and advertisers want better targeting and performance metrics.

The only “way out” I see for XiriuM is to actually become an Internet service provider for cars, much like the (formerly broadcast-only) cable companies did, for example by bundling high speed satellite downloads with a low bandwidth cellular uplink. Even so, I imagine that latency would be a serious problem, as with HughesNet (formerly Direcway) satellite Internet service, meant for use in rural areas with no broadband alternatives.

So, although I have no idea how the DOJ will rule, and thus have no advice for intrade bettors, I do know how the DOJ should rule: “sure, knock yourselves out”. Plus I have some throwaway advice for SIRI and XMSR shareholders: Sell!

Bix's American Idol prediction market

When one corporate fish swallows another, a lot can happen. Sure, temporary indigestion, remorse, culture clash, layoffs, posturing, Borg assimilation, chaos, panic, flight, or even disaster may ensue. More likely the carnivore hiccups and life moves on. But, in rare cases, the kid fish just so happens to be a visionary thinker and kickbutt coder with exactly the right skills and temperament to turn mama into a bigger, badder, better, and youthier fish, newly invigorated for survival in the pond. Witness Microsoft’s Ray Ozzie and Yahoo!’s Flickr-ization.

Now Yahoo! has a new Bixation.

I recently had the pleasure of visiting Bix at their (old) headquarters in the heart of downtown Palo Alto. These folks are impressive. Simply put, they build cool stuff, fast. The typical product cycle?: Two weeks. They grok the rinse and repeat development cycle of the new web world and, more importantly, have the experience and talent to pull it off. Oh, and this can never hurt: they’re supremely smart.

Case in point: Two supremely smart Bixies — John Hayes and Mike Speiser — developed a supremely cool prediction market from the ground up in about two weeks of spare cycles. (To predict the American Idol winner, of course: what else?) Check out the brilliantly simple one-page UI, powered by ajax-ian magic. The attention to detail is clear, from the inline sparkline graphs, to the minimalist yet clear descriptions.

[Screenshot: Bix American Idol prediction market -- Doolittle]

Under the hood, the site is running independent Hanson market makers for each contestant. The payoff structure is designed to predict a full ranking, projecting the eventual winner as well as the expected losers each week. Play around with it and see what you think. John and Mike would love your feedback — to a large extent user reactions will drive where this project goes next.
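
For the curious, here is a minimal sketch of how a Hanson-style market maker works: the logarithmic market scoring rule (LMSR) for a single binary contract (“this contestant wins”). The class, the liquidity parameter b=100, and the toy trade are my own illustrative choices, not Bix’s actual implementation.

```python
import math

class LMSRMarketMaker:
    """Minimal Hanson logarithmic market scoring rule (LMSR) market maker
    for one binary contract: "contestant wins" (YES) vs. "loses" (NO)."""

    def __init__(self, b=100.0):
        self.b = b          # liquidity parameter: larger b = deeper, slower-moving market
        self.q_yes = 0.0    # outstanding YES shares
        self.q_no = 0.0     # outstanding NO shares

    def _cost(self, q_yes, q_no):
        # LMSR cost function: C(q) = b * ln(exp(q_yes/b) + exp(q_no/b))
        return self.b * math.log(math.exp(q_yes / self.b) + math.exp(q_no / self.b))

    def price_yes(self):
        # The instantaneous YES price doubles as the market's probability estimate.
        e_yes = math.exp(self.q_yes / self.b)
        e_no = math.exp(self.q_no / self.b)
        return e_yes / (e_yes + e_no)

    def buy(self, shares_yes=0.0, shares_no=0.0):
        """Sell the requested shares to a trader and return what the trader pays:
        the difference in the cost function before and after the trade."""
        before = self._cost(self.q_yes, self.q_no)
        self.q_yes += shares_yes
        self.q_no += shares_no
        return self._cost(self.q_yes, self.q_no) - before


mm = LMSRMarketMaker(b=100.0)
print(f"opening price: {mm.price_yes():.3f}")     # 0.500
paid = mm.buy(shares_yes=50.0)                    # someone bets on the contestant
print(f"trader paid {paid:.2f}; new price: {mm.price_yes():.3f}")
```

Running one such market maker independently for each contestant, as described above, gives a continuously updated probability per contestant that can be sorted into the projected ranking, though fully independent markets do not force those probabilities to sum to one.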

CFP: Second Workshop on Prediction Markets

We’re soliciting research paper submissions and participants for the Second Workshop on Prediction Markets, to be held June 12, 2007 in San Diego, California, in conjunction with the ACM Conference on Electronic Commerce and the Federated Computing Research Conference. The workshop will have an academic/research bent, though we welcome both researchers and practitioners from academia and industry to attend to discuss the latest developments in prediction markets.

See the workshop homepage for more details and information.

You can signal your intent to attend at upcoming.org, though official registration must go through the EC’07 conference.

The wisdom of the ProbabilitySports crowd

One of the purest and most fascinating examples of the “wisdom of crowds” in action comes courtesy of a unique online contest called ProbabilitySports run by mathematician Brian Galebach.

In the contest, each participant states how likely she thinks it is that a team will win a particular sporting event. For example, one contestant may give the Steelers a 62% chance of defeating the Seahawks on a given day; another may say that the Steelers have only a 44% chance of winning. Thousands of contestants give probability judgments for hundreds of events: for example, in 2004, 2,231 ProbabilityFootball participants each recorded probabilities for 267 US NFL Football games (15-16 games a week for 17 weeks).

An important aspect of the contest is that participants earn points according to the quadratic scoring rule, a scoring method designed to reward accurate probability judgments (participants maximize their expected score by reporting their best probability judgments). This makes ProbabilitySports one of the largest collections of incentivized[1] probability judgments, an extremely interesting and valuable dataset from a research perspective.
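
To make the incentive concrete, here is a tiny sketch of a quadratic score in the form quoted later in this archive (100 - 400*(lose_PR)^2). I believe this matches the ProbabilitySports point formula for a two-outcome game, but treat the exact constants as an assumption.

```python
def quadratic_score(prob_winner):
    """Quadratic score for one game, in the form 100 - 400 * lose_PR^2,
    where lose_PR is the probability given to the team that lost.
    For a two-outcome game, lose_PR = 1 - prob_winner."""
    lose_pr = 1.0 - prob_winner
    return 100.0 - 400.0 * lose_pr ** 2

# A confident, correct pick pays well; a confident, wrong pick is punished harder.
print(quadratic_score(0.62))   # ~ 42.2  (said 62%, and that team won)
print(quadratic_score(0.38))   # ~ -53.8 (said 62% for the team that lost)
print(quadratic_score(0.50))   #    0.0  (the do-nothing 50% default)
```

Hedging at 50% guarantees exactly zero (which is why the do-nothing predictor described below scores 0), a confident correct prediction earns up to 100 points, and a confident wrong one can lose up to 300.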

The first striking aspect of this dataset is that most individual participants are very poor predictors. In 2004, the best score was 3747. Yet the average score was an abysmal -944 points, and the median score was -275. In fact, 1,298 out of 2,231 participants scored below zero. To put this in perspective, a hypothetical participant who does no work and always records the default prediction of “50% chance” for every team receives a score of 0. Almost 60% of the participants actually did worse than this by trying to be clever.

[Figure: ProbabilitySports participants’ calibration]

Participants are also poorly calibrated. To the right is a histogram dividing participants’ predictions into five regions: 0-20%, 20-40%, 40-60%, 60-80%, and 80-100%. The y-axis shows the actual winning percentages of NFL teams within each region. Calibrated predictions would fall roughly along the x=y diagonal line, shown in red. As you can see, participants tended to voice much more extreme predictions than they should have: teams that they said had a less than 20% chance of winning actually won almost 30% of the time, and teams that they said had a greater than 80% chance of winning actually won only about 60% of the time.
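
Reproducing this kind of plot from raw data is just a binning exercise. A minimal sketch, using made-up (prediction, outcome) pairs rather than the actual ProbabilitySports records:

```python
from collections import defaultdict

def calibration_table(predictions, outcomes, n_bins=5):
    """Group predicted probabilities into n_bins equal-width regions
    (0-20%, 20-40%, ... for n_bins=5) and report the empirical winning
    percentage within each region. Well-calibrated predictions land
    near the middle of their own bin."""
    totals, wins = defaultdict(int), defaultdict(int)
    for p, won in zip(predictions, outcomes):
        b = min(int(p * n_bins), n_bins - 1)   # clamp p = 1.0 into the top bin
        totals[b] += 1
        wins[b] += won
    return {
        f"{b / n_bins:.0%}-{(b + 1) / n_bins:.0%}": wins[b] / totals[b]
        for b in sorted(totals)
    }

# Hypothetical example: predicted win probabilities vs. 0/1 outcomes
preds = [0.15, 0.15, 0.85, 0.85, 0.85, 0.55, 0.45, 0.90, 0.10, 0.70]
outs  = [0,    1,    1,    0,    1,    1,    0,    1,    0,    1]
print(calibration_table(preds, outs))
```

The contest analysis would of course use the thousands of real predictions and game results in place of the toy lists.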

Yet something astonishing happens when we average together all of these participants’ poor and miscalibrated predictions. The “average predictor”, who simply reports the average of everyone else’s predictions as its own prediction, scores 3371 points, good enough to finish in 7th place out of 2,231 participants! (A similar effect can be seen in the 2003 ProbabilityFootball dataset as reported by Chen et al. and Servan-Schreiber et al.)

Even when we average together the very worst participants — those participants who actually scored below zero in the contest — the resulting predictions are amazingly good. This “average of bad predictors” scores an incredible 2717 points (ranking in 62nd place overall), far outstripping any of the individuals contributing to the average (the best of whom finished in 934th place), prompting someone in the audience to call the effect the “wisdom of fools”. The only explanation is that, although all these individuals are clearly prone to error, somehow their errors are roughly independent and so cancel each other out when averaged together.
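
Here is a minimal sketch of how the “average predictor” is constructed and scored; the three participants and their probabilities below are made up purely to illustrate the mechanics.

```python
def quadratic_score(p_win):
    # Same quadratic score as before: 100 - 400 * (1 - p_win)^2,
    # where p_win is the probability assigned to the team that actually won.
    return 100.0 - 400.0 * (1.0 - p_win) ** 2

def season_score(p_wins):
    # Total score over a list of games.
    return sum(quadratic_score(p) for p in p_wins)

def average_predictor(predictions_by_person):
    # For each game, report the plain average of everyone's probability.
    n_games = len(predictions_by_person[0])
    n_people = len(predictions_by_person)
    return [sum(person[g] for person in predictions_by_person) / n_people
            for g in range(n_games)]

# Three noisy (hypothetical) participants, three games; each number is the
# probability that participant gave to the team that actually won the game.
people = [
    [0.9, 0.2, 0.7],
    [0.3, 0.6, 0.9],
    [0.6, 0.7, 0.2],
]
avg = average_predictor(people)                     # [0.6, 0.5, 0.6]
print([round(season_score(p)) for p in people])     # [4, 36, -56]
print(round(season_score(avg)))                     # 72
```

Even in this tiny invented example the simple average (72 points) beats its best contributor (36 points); the real dataset shows the same effect at a vastly larger scale.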

Daniel Reeves and I follow up with a companion post on Robin Hanson’s OvercomingBias forum with some advice on how predictors can improve their probability judgments by averaging their own estimates with one or more others’ estimates.

In a related paper, Dani et al. search for an aggregation algorithm that reliably outperforms the simple average, with modest success.

[1] Actually, the incentives aren’t quite ideal even in the ProbabilitySports contest, because only the top few competitors at the end of each week and each season win prizes. Participants’ optimal strategy in this all-or-nothing type of contest is not to maximize their expected score, but rather to maximize their expected prize money, a subtle but real difference that tends to induce greater risk taking, as Steven Levitt describes well. (It doesn’t matter whether participants finish in last place or just behind the winners, so anyone within striking distance might as well risk a huge drop in score for a small chance of vaulting into one of the winning positions.) Nonetheless, Wolfers and Zitzewitz show that, given the ProbabilitySports contest setup, maximizing expected prize money instead of expected score leads to only about a 1% difference in participants’ optimal probability reports.

Evaluating probabilistic predictions

A number of naysayers [Daily Kos, The Register, The Big Picture, Reason] are discrediting prediction markets, latching onto the fact that markets like TradeSports and NewsFutures failed to call this year’s Democratic takeover of the US Senate. Their critiques reflect a clear misunderstanding of the nature of probabilistic predictions, as many others [Emile, Lance] have pointed out. Their misunderstanding is perhaps not so surprising. Evaluating probabilistic predictions is a subtle and complex endeavor, and in fact there is no absolute right way to do it. This fact may pose a barrier for the average person to understand and trust (probabilistic) prediction market forecasts.

In an excellent article in The New Republic Online [full text], Bo Cowgill and Cass Sunstein describe in clear and straightforward language the fallacy that many people seem to have made, interpreting a probabilistic prediction like “Democrats have a 25% chance of winning the Senate” as a categorical prediction “The Democrats will not win the Senate”. Cowgill and Sunstein explain the right way to interpret probabilistic predictions:

If you look at the set of outcomes estimated to be 80 percent likely, about 80 percent of them [should happen]; events estimated to be 70 percent likely [should] happen about 70 percent of the time; and so on. This is what it means to say that prediction markets supply accurate probabilities.

Technically, what Cowgill and Sunstein describe is called the calibration test. The truth is that the calibration test is a necessary test of prediction accuracy, but not a sufficient test. In other words, for a predictor to be considered good it must pass the calibration test, but at the same time some very poor or useless predictors may also pass the calibration test. Often a stronger test is needed to truly evaluate the accuracy of probabilistic predictions.

For example, suppose that a meteorologist predicts the probability of rain every day. Now suppose this meteorologist is lazy and he predicts the same probability every day: he simply predicts the annual average frequency of rain in his location. He doesn’t ever look at cloud cover, temperature, satellite imagery, computer models, or even whether it rained the day before. Clearly, this meteorologist’s predictions would be uninformative and nearly useless. However, over the course of a year, this meteorologist would perform very well according to the calibration test. Assume it rains on average 10% of the time in the meteorologist’s city, so he predicts “10% chance” every day. If we test his calibration, we find that, among all the days he predicted a 10% chance of rain (i.e., every day), it actually rained about 10% of the time. This lazy meteorologist would get a nearly perfect score according to the calibration test. A hypothetical competing meteorologist who actually works hard to consider all variables and evidence, and who thus predicts different percentages on different days, could do no better in terms of calibration.
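
A quick simulation makes the point concrete. In the sketch below (every number is invented for illustration), both the lazy forecaster and an idealized diligent forecaster pass the calibration test, but their log scores, a stronger metric discussed below, tell them apart:

```python
import math
import random

random.seed(0)

# A made-up city: it rains 10% of days on average, but a quarter of days
# are genuinely risky (30% chance) and the rest are fairly safe (~3.3%).
days = 365
true_rain_prob = [0.30 if random.random() < 0.25 else 0.0333 for _ in range(days)]
rained = [1 if random.random() < p else 0 for p in true_rain_prob]

lazy = [0.10] * days        # always predicts the annual base rate
diligent = true_rain_prob   # idealized forecaster who knows each day's risk

def calibration(preds, outcomes, edges=(0.0, 0.2, 0.4, 0.6, 0.8, 1.01)):
    """Empirical rain frequency among days grouped by predicted probability."""
    table = {}
    for lo, hi in zip(edges, edges[1:]):
        days_in_bin = [o for p, o in zip(preds, outcomes) if lo <= p < hi]
        if days_in_bin:
            table[f"{lo:.0%}-{min(hi, 1.0):.0%}"] = sum(days_in_bin) / len(days_in_bin)
    return table

def avg_log_score(preds, outcomes):
    """Average log score (base 2): log of the probability assigned to what
    actually happened. Higher (closer to zero) is better."""
    total = 0.0
    for p, o in zip(preds, outcomes):
        p_actual = p if o else 1.0 - p
        total += math.log2(max(p_actual, 1e-9))   # guard against log(0)
    return total / len(preds)

print("lazy calibration:    ", calibration(lazy, rained))
print("diligent calibration:", calibration(diligent, rained))
print("lazy log score:      ", round(avg_log_score(lazy, rained), 3))
print("diligent log score:  ", round(avg_log_score(diligent, rained), 3))
```

Both calibration tables come out reasonable (the lazy forecaster’s lone 0%-20% bin sits near 10%), yet the diligent forecaster’s average log score should come out clearly higher (less negative), even though both forecasters look equally calibrated.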

The above example suggests that good predictions are not just well calibrated: good predictions are, in some sense, both variable AND well calibrated. So what is the “right” way to evaluate probabilistic predictions? There is no single absolute best way, though several tests are appropriate, and probably can be considered stronger tests than the calibration test. In our paper “Does Money Matter?” we use four evaluation metrics:

  1. Absolute error: The average over many events of lose_PR, the probability assigned to the losing outcome(s)
  2. Mean squared error: The square root of the average of (lose_PR)^2
  3. Quadratic score: The average of 100 - 400*(lose_PR)^2
  4. Logarithmic score: The average of log(win_PR), where win_PR is the probability assigned to the winning outcome

Note that the absolute value of these metrics is not very meaningful. The metrics are useful only when comparing one predictor against another (e.g., a market against an expert).
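
For concreteness, here is a small sketch of these four metrics for binary-outcome events. The function and variable names are mine, and I am assuming per-event scores are simply averaged, so check the paper for the exact conventions.

```python
import math

def evaluate(win_probs):
    """Compute the four metrics above. win_probs holds, for each event, the
    probability the predictor assigned to the outcome that actually occurred
    (win_PR); the probability given to the losing outcome is lose_PR = 1 - win_PR."""
    n = len(win_probs)
    lose_probs = [1.0 - p for p in win_probs]
    return {
        # 1. Absolute error: average probability given to the losing outcome (lower is better)
        "absolute_error": sum(lose_probs) / n,
        # 2. Mean squared error: square root of the average squared lose_PR (lower is better)
        "mean_squared_error": math.sqrt(sum(p * p for p in lose_probs) / n),
        # 3. Quadratic score: average of 100 - 400 * lose_PR^2 (higher is better)
        "quadratic_score": sum(100.0 - 400.0 * p * p for p in lose_probs) / n,
        # 4. Logarithmic score: average log of win_PR (higher, i.e. closer to 0, is better);
        #    base 2 here so the units are bits, matching the entropy interpretation below.
        "logarithmic_score": sum(math.log2(p) for p in win_probs) / n,
    }

# Hypothetical comparison: a sharper predictor vs. a vaguer one on the same five events.
sharp = [0.80, 0.70, 0.90, 0.60, 0.75]   # probabilities assigned to the actual outcomes
vague = [0.55, 0.50, 0.60, 0.50, 0.55]
print(evaluate(sharp))
print(evaluate(vague))
```

As the note above says, the raw numbers mean little on their own; the comparison (same events, two predictors) is what carries the information.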

My personal favorite (advocated in papers and presentations) is the logarithmic score. The logarithmic score is one of a family of so-called proper scoring rules designed so that an expert maximizes her expected score by truthfully reporting her probability judgment (the quadratic score is also a proper scoring rule). Stated another way, experts with more accurate probability judgments should be expected to accumulate higher scores on average. The logarithmic score is closely related to entropy: the negative of the logarithmic score gives the amount (in bits of information) that the expert is “surprised” by the actual outcome. Increases in logarithmic score can literally be interpreted as measuring information flow.
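
For readers who like to see the math, the “truthful reporting maximizes expected score” property is a two-line exercise for the logarithmic score on a binary event, where p is the expert’s true belief and q is her report:

```latex
\mathbb{E}_p\big[S(q)\big] = p \log q + (1-p)\log(1-q), \qquad
\frac{d}{dq}\,\mathbb{E}_p\big[S(q)\big] = \frac{p}{q} - \frac{1-p}{1-q} = 0
\;\Longleftrightarrow\; q = p.
```

The same calculation for the quadratic score 100 - 400*(1-q)^2 also gives q = p, which is why it too is proper; and plugging q = p back into the expected log score yields the negative of the binary entropy, which is the information connection mentioned above.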

Actually, the task of evaluating probabilistic predictions is even trickier than I’ve described. Above, I said that a good predictor must at the very least pass the calibration test. Actually, that’s only true when the predicted events are statistically independent. It is possible for a perfectly valid predictor to appear miscalibrated when the events he or she is predicting are highly correlated, as discussed in a previous post.

confab.yahoo: Thanks everyone!

Thanks to all two hundred and seventy (!) of you who attended the confab.yahoo last Wednesday, as far as I know a record audience for an event devoted to prediction markets. [View pictures]

Thanks for spending your evening with us. Thanks for waiting patiently for the pizza and books! Thanks to the speakers (Robin, Eric, Bo, Leslie, myself, Todd, Chris, and Adam) who, after all, make or break any conference: in this case IMO definitely “make”. The speakers delivered wit and wisdom, and did it within their allotted times! It’s nice to see Google, HP, Microsoft, and Yahoo! together in one room discussing a new technology and — go figure — actually agreeing with one another for the most part. Thanks to James Surowiecki for his rousing opening remarks and for doing a fabulous job moderating the event. Thanks to the software demo providers Collective Intellect, HedgeStreet, HSX, and NewsFutures: next time we’d like to give that venue more of the attention it deserves. Thanks to Yahoo! TechDev and Yahoo! PR for planning, marketing, and executing the event. A special thanks to Chris Plasser, who orchestrated every detail from start to finish flawlessly while juggling his day job, making it all look easy in the process.

Many media outlets and bloggers attended. Nice articles appear in ZDNet and CNET, the latter of which was slashdotted yesterday. The local ABC 11 o’clock news even featured a piece on the event [see item #35 in this report]. I’m collecting additional items under MyWeb tag ‘confab.yahoo’.

CNET and Chris Masse (on Midas Oracle) provide excellent summaries of the technical content of the event. So I’ll skip any substantive comments (for now) and instead mention a few fun moments:

  • Bo began by staring straight into the camera and giving a shoutout to Chris Masse, the eccentric Frenchman who also happens to be a sharp, tireless, and invaluable (and don’t forget bombastic) chronicler of the prediction markets field via his portal and blog.
  • Todd had the audience laughing with his story of how a prediction market laid bare the uncomfortable truth about an inevitable product delay, to the incredulousness of the product’s manager. (Todd assured us that this was a Microsoft internal product, not a consumer-facing product.)
  • I had the unlucky distinction of being the only speaker to suffer from technical difficulties in trying to present from my own Mac Powerbook instead of the provided Windows laptop. Todd later admitted that he was tempted to make a Windows/Mac quip like “Windows just works”.
  • Adam finished with a Jobsian “one more thing” announcement of their latest effort, worthio, a secret project they’ve been hacking away at nights and weekends even as they run their startup Inkling at full speed. (Yesterday Adam blogged about the confab.)

Our Yootles currency seems to have caught the public’s imagination more than any of the other topics I covered in my own talk. (What’s wrong with you folks? You’re not endlessly fascinated with the gory mathematical details of my dynamic parimutuel market mechanism? ;-)) And so a meme is born. The lead on the Yootles project is Daniel Reeves and he is eager to answer questions and hear your feedback.

I enjoyed the confab immensely and it was great to meet so many people: thanks for the kind words from so many of you. Thanks again to the speakers, organizers, media, and attendees. I hope the event was valuable to you. Archive video of the event is available [100k|300k] for those who could not attend in person.

confab.yahoo update

Here is an update on the confab.yahoo on prediction markets happening this Wed Dec 13 at 5:30pm at Yahoo!’s Sunnyvale headquarters, Building C, Classroom 5.

  1. We’ve added Stanford b-school professor Eric Zitzewitz as a speaker
  2. We’ll hold an ad-hoc vendor session immediately following the event, tentatively featuring Collective Intellect, HedgeStreet, HSX, Inkling Markets, NewsFutures, and RIMDEX
  3. There will be food!
  4. We’ll be giving away a limited number of copies of Surowiecki’s book The Wisdom of Crowds
  5. We’re planning to webcast the event at two connection speeds: 100k | 300k

Again, the event is free and open to the public. Hope to see you there!

confab.yahoo on prediction markets: Sunnyvale, Dec 13 5:30p

I’m happy to announce the following public event:

confab.yahoo on prediction markets: tapping the wisdom of crowds
5:30-8:00pm Wed Dec 13, 2006 (sign up on Upcoming.org!)
Yahoo! Headquarters, Building C, Classroom 5
701 First Avenue, Sunnyvale, CA 94089 USA

Join us for a public “how to” session on prediction markets moderated by James Surowiecki, New Yorker columnist and best-selling author of The Wisdom of Crowds. Speakers from Google, HP, Microsoft, and Yahoo! will describe how they are using prediction markets to aid corporate forecasting and decision making. Other speakers include the developer of Zocalo, an open source prediction market platform; the co-founder of InklingMarkets.com, a Paul Graham Y Combinator startup; and Robin Hanson, the visionary economist and inventor whose pioneering work paved the way. The event is open to the public and will emphasize practical lessons and hands-on advice. After brief presentations from each speaker, Surowiecki will open up the session for discussion with the audience.

confab.yahoo is Yahoo! TechDev’s new open microconference series. Join academic and industry experts from across the valley and the country as they discuss the latest technologies and their applications and see for yourself what’s next on the web. Attendance is free and open to the public.