Category Archives: prediction markets

Predictalot! (And we mean alot)

I’m thrilled to announce the launch of Predictalot, a combinatorial prediction market for the NCAA Men’s Basketball playoffs. Predict almost anything you can think of, like Duke will advance further than UNC, or Every final four team name will start with U. Check the odds and invest points on your favorites. Sell your predictions anytime, even as you follow the basketball games live.

The basic game play is simple: select a prediction type, customize it, and invest points on it. Yet you’ll never run out of odds to explore: there are hundreds of millions of predictions you can make. The odds on each update continuously based on other players’ predictions and the on-court action.

Predictalot is a Yahoo! App, so you can play it at apps.yahoo.com or you can add it to your Yahoo! home page. I have to admit, it’s an incredible feeling to play a game I helped design right on the Yahoo! home page.

Predictalot app on the Yahoo! home page

That’s all you need to get started. If you’re curious and would like a peek under the hood, read on: there’s some interesting technology hidden in the engine.

Background and Details

Predictalot is a true combinatorial prediction market of the sort academics like us and Robin Hanson have been dreaming about since early in the decade. We built the first version during an internal Yahoo! Hack Day. Finally, we leveraged the Yahoo! Application Platform to quickly build a public version of the game. (Note that anyone can develop a YAP app that’s visible to millions — there’s good sample code, it supports YUI and OpenSocial, and it’s easy to get started.) After many fits and starts, late nights, and eventually all nights, we’re proud and excited to go live with Predictalot version 1.0. I can’t rave enough about the talent and dedication of the research engineers who gave the game a professional look and feel and production speed, turning a pie-in-the-sky idea into reality. We have many features and upgrades in mind for future versions, but the core functionality is in place and we hope you enjoy the game.

In the tournament, after the play-in game, the 64 top college basketball teams play 63 games in a single elimination tournament. So there are 2 to the power 63 or 9.2 quintillion total possible outcomes, or ways the entire tournament can unfold. Predictalot implicitly keeps track of the odds for them all. To put this in perspective, it’s estimated that there are about 10 quintillion individual insects on Earth. Of course, for all practical purposes, we can’t store 9.2 quintillion numbers, even with today’s computers. Instead, we compute the odds for any outcome on the fly by scanning through the predictions placed so far.
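The outcome count is easy to check by hand or in a few lines of Python. This is only a back-of-the-envelope sketch; the variable names are chosen here for illustration.

```python
# Illustrative arithmetic check of the outcome count quoted above.
teams = 64
games = teams - 1      # single elimination: every game eliminates exactly one team
outcomes = 2 ** games  # each game has two possible winners

print(f"{games} games, {outcomes:,} outcomes (about {outcomes / 1e18:.1f} quintillion)")
# prints: 63 games, 9,223,372,036,854,775,808 outcomes (about 9.2 quintillion)
```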

A prediction is a statement, like Duke will win in the first round, that will be either true or false in the final outcome. In this case, the prediction is true in exactly half, or 2 to the power 62 outcomes. (Note this does not mean the odds are 50% — remember the outcomes themselves are not all equally likely.) In theory, Predictalot can support predictions on any set of outcomes. That’s 2 to the power 2 to the power 63, or more than a googol predictions. For now, we restrict you to “only” hundreds of millions of predictions categorized into thirteen types. Computing the odds of a prediction precisely is too slow. Technically, the problem is #P-hard: as hard as counting SAT and harder than the travelling salesman problem. So we must resort to approximating the odds by randomly sampling the outcome space. Sampling is a tricky business — equal parts art and science — and we’re still actively exploring ways to increase the speed, stability, and accuracy of our sampling.
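To make the sampling idea concrete, here is a minimal Python sketch. It is not Predictalot's actual engine. It draws brackets uniformly at random (every game a fair coin flip), so it estimates the fraction of outcomes in which a prediction is true, not the market odds, which would weight outcomes by the bets placed so far. The function names, the bracket layout, and the convention that team 0 stands in for Duke are all assumptions made for illustration.

```python
import random

def sample_bracket(num_teams, rng):
    """Play out one single-elimination bracket, flipping a fair coin per game.

    Returns a list of rounds; rounds[r] holds the teams that won in round r.
    """
    alive = list(range(num_teams))
    rounds = []
    while len(alive) > 1:
        # Pair up neighbours (0 vs 1, 2 vs 3, ...) and pick each winner at random.
        winners = [rng.choice(pair) for pair in zip(alive[0::2], alive[1::2])]
        rounds.append(winners)
        alive = winners
    return rounds

def estimate_fraction(prediction, num_teams=64, samples=5000, seed=0):
    """Estimate the fraction of outcomes in which `prediction` is true."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples) if prediction(sample_bracket(num_teams, rng)))
    return hits / samples

DUKE = 0  # hypothetical team index, for illustration only

def wins_first_round(rounds):
    return DUKE in rounds[0]

def wins_tournament(rounds):
    return rounds[-1] == [DUKE]

if __name__ == "__main__":
    print("first round:", estimate_fraction(wins_first_round))
    print("whole tournament:", estimate_fraction(wins_tournament))
```

Under these uniform-sampling assumptions the first estimate should land near 0.5 and the second near 1/64. Every sampled bracket in which the team wins the tournament is also one in which it wins the first round, because each prediction is just a set of complete outcomes.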

Because we track all possible outcomes, the predictions are automatically interconnected in ways you would expect. A large play on Duke to win the tournament instantly and automatically increases the odds of Duke winning in the first round; after all, Duke can’t win the whole thing without getting past the first round.

With 9.2 quintillion outcomes, Predictalot is to our knowledge the largest prediction market built, testing the limits of what the wisdom of crowds can produce. Predictalot is a game, and we hope it’s fun to play. We’d also like to pave the way for serious use of combinatorial prediction market technology.

Why did Yahoo! build this? Predictalot is a smarter market, letting humans and computers each do what they do best. People enter predictions in simple terms they understand like how one team fares against another. The computer handles the massive yet methodical number crunching needed to combine all the pieces together into a coherent overall prediction of a complex event. Markets like Predictalot, WeatherBill, CombineNet, and Internet advertising systems, to name a few, represent the evolution of markets in the digital age, empowering users with extreme customization. More and more, matching buyers with sellers — the core function of markets — requires sophisticated algorithms, including machine learning and optimization. Predictalot attempts to illustrate this trend in an entertaining way.

David Pennock
Mani Abrol, Janet George, Tom Gulik, Mridul Muralidharan, Sudar Muthu, Navneet Nair, Abe Othman, Daniel Reeves, Pras Sarkar

Psst: WeatherBill doesn’t know New Jersey is the new Florida: Place your bets now

Quantifying New York’s 2009 June gloom using WeatherBill and Wolfram|Alpha

In the northeastern United States, scars are slowly healing from a miserably rainy June — torturous, according to the New York Times. Status updates bemoaned “where’s the sun?”, “worst storm ever!”, “worst June ever!”. Torrential downpours came and went with Florida-like speed, turning gloom into doom: “here comes global warming”.

But how extreme was the month, really? Was our widespread misery justified quantitatively, or were we caught in our own self-indulgent Chris Harrisonism, “the most dramatic rose ceremony EVER!”?

This graphic shows that, as of June 20th, New York City was on track for near-record rainfall in inches. But that graphic, while pretty, is pretty static, and most people I heard complained about the number of days, not the volume of rain.

I wondered if I could use online tools to determine whether the number of rainy days in June was truly historic. My first thought was to try Wolfram|Alpha, a great excuse to play with the new math engine.

Wolfram|Alpha queries for “rain New Jersey June 200Y” are detailed and fascinating, showing temps, rain, cloud cover, humidity, and more, complete with graphs (hint: click “More”). But they don’t seem to directly answer how many days it rained at least some amount. The answer is displayed graphically but not numerically (the percentage and days of rain listed appear to be hours of rain divided by 24). Also, I didn’t see how to query multiple years at a time. So, in order to test whether 2009 was a record year, I would have to submit a separate query for each year (or bypass the web interface and use Mathematica directly). Still, Wolfram|Alpha does confirm that it rained 3.8 times as many hours in 2009 as 2008, already one of the wetter months on record.

WeatherBill, an endlessly configurable weather insurance service, more directly provided what I was looking for on one page. I asked for a price quote for a contract paying me $100 for every day it rains at least 0.1 inches in Newark, NJ during June 2010. It instantly spat back a price: $694.17.



WeatherBill rainy day contract for June 2010 in Newark, NJ

It also reported how much the contract would have paid — the number of rainy days times $100 — every year from 1979 to 2008, on average $620 for 6.2 days. It said I could “expect” (meaning one standard deviation, or 68% confidence interval) between 3.9 and 8.5 days of rain in a typical year. (The difference between the average and the price is further confirmation that WeatherBill charges a 10% premium.)

Below is a plot of June rainy days in Newark, NJ from 1979 to 2009. (WeatherBill doesn’t yet report June 2009 data so I entered 12 as a conservative estimate based on info from Weather Underground.)


Number of rainy days in Newark, NJ from 1979-2009

Indeed, our gloominess was justified: it rained in Newark more days in June 2009 than any other June dating back to 1979.

Intriguingly, our doominess may have been justified too. You don’t have to be a chartist to see an upward trend in rainy days over the past decade.

WeatherBill seems to assume as a baseline that past years are independent unbiased estimates of future years — usually not a bad assumption when it comes to weather. Still, if you believe the trend of increasing rain is real, either due to global warming or something else, WeatherBill offers a temptingly good bet. At $694.17, the contract (paying $100 per rainy day) would have earned a profit in 7 of the last 7 years. The chance of that streak being a coincidence is less than 1%.
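One way to sanity-check the “less than 1%” figure is sketched below in Python. The post does not say how the figure was derived, so the 50/50 chance of a profitable year is an assumption made here for illustration; the price and the payout per rainy day come from the quote above.

```python
import math

PRICE = 694.17          # quoted contract price in dollars (from the post)
PAYOUT_PER_DAY = 100.0  # dollars paid per rainy day (from the post)

# Fewest rainy days whose payout covers the price.
break_even_days = math.ceil(PRICE / PAYOUT_PER_DAY)

def streak_probability(p_profit, years):
    """Chance of a profit in every one of `years` independent years."""
    return p_profit ** years

# ASSUMPTION: with no underlying trend, a profitable year is a 50/50 coin flip.
p = streak_probability(0.5, 7)
print(f"Need at least {break_even_days} rainy days; streak chance = {p:.2%}")
# prints: Need at least 7 rainy days; streak chance = 0.78%
```

With a per-year chance above roughly 52%, the seven-year streak probability would rise above 1%, so the conclusion is sensitive to that assumed baseline.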

If anyone places this bet, let me know. I would love to, but as of now I’m roughly $10 million in net worth short of qualifying as a WeatherBill trader.

Thank you Bangalore

Sunday I returned from a trip to Bangalore, India, where I gave a talk on “The Automated Economy” about how computers can and should take over the mechanical aspects of economic activity, optimizing and learning from data in the way people cannot, with detailed case studies in online advertising and prediction markets. You can read the abstract, watch archive video of the talk, view my talk slides, browse the official pictures of the event, or see my personal pictures of the trip.

Some say everything’s bigger in Texas (most vociferously Texans). They haven’t been to India. My talk is part of Yahoo!’s Big Thinkers India series — four talks a year from (so far) Yahoo! Research speakers. If the Thinking isn’t Big, the crowds certainly are — the events can draw close to 1000 attendees from, apparently, all over India. Duncan Watts says it’s the largest crowd he’s spoken to; me too. This time they barred Yahoo! employees from attending the main event and the hotel ballroom still filled to capacity.

Here is a linked-up version of my journal entry for the trip, a kind of windy and winded thank you letter to Bangalore. If you’re not interested in personal details, you might skip to Thoughts on Bangalore.

Getting there

The Philadelphia airport international terminal is dead empty. I breeze through security — the only one in line. I’m inside security two hours early thinking that either the recession is still in full force or traveling internationally on a Monday night out of Philadelphia is the best ever. Maybe not. Get on plane. Wait two hours on tarmac. Apparently a two hour layover isn’t enough leeway on international flights. Miss my connecting flight in Frankfurt by a few minutes. Team up with a fellow passenger in the same boat. We are rebooked via Dubai. Fly directly over Baghdad. Dubai is an impressive airport. Endless terminals lined with upscale shopping. Packed with Asians, Europeans at midnight and beyond. From there, Emirates to Bangalore. Only 9 hours behind schedule. Sneezing fits begin after 28 hours of airplane air.

Day 0: Yahoo! internal practice talk

Driver right there outside baggage claim, nice guy. Takes me to hotel. Over an hour. Traffic. Time for shower, NeilMed nasal rinses (bottled water), Sudafed, but not sleep. Call home. Yahoo! Messenger with Voice doesn’t roll off the tongue like ‘Skype’, but it rocks. Super clear and dirt cheap. Lauren and the girls are so sweet. Miss them. To Yahoo! office. Meet Anita, Mani. Time for Yahoo! internal version of Big Thinkers talk. Nose is still running. Drips and wipes during my talk. Talk goes well but I run out of time for prediction market section and this seems to be what people are most interested in. I’m glad I had the practice run to work out the kinks and rebalance the talk. Back to hotel. Call home again for a recharging dose of home. I missed Ashley’s graduation from pre-school: she did great: they sang six songs and she knew them all. She was dressed up in a yellow cap and gown. I’m upset I had to miss such an adorable milestone but am proud of my little girl (and dismayed she is rapidly becoming not so little!). More NeilMed. Room service. (Called “private dining” here — sounds illicit.) Sleep! For a few hours at least. Wake up in the middle of the night since it’s NY daytime. Finally get back to sleep again.

Day 1: Meetings

Hard to wake up at 9am = midnight. Shower. Feel 1000% better. Driver takes me to the Yahoo! office. It’s in a complex with Microsoft, Google, Target, Dell, and many other US brands. Once you’re inside it’s like every other Yahoo! office except the food — built essentially to corporate spec. Meet with Anita, Raghu, and Rajeev: go over PR angles and they brief me on the media interviews. These guys and gal are on top of things. Meet with Mani and her team: great group. Skip intern pizza talks because I can’t eat cheese, going for the cafeteria instead. Mistake. Order a veggie grill thinking that since it’s grilled, it’s cooked enough. I only take a few bites of this before thinking it’s too risky. I eat some bread and Indian mixtures. Not sure what the culprit is but something doesn’t sit well in my stomach. Give prediction markets portion of my talk to a few interested people in labs. Very sharp group. Meet with Dinesh and Sachin, their intern, and one other. Interesting work. Meet with Chid and Preeti on Webscope. Back to hotel. Call Lauren. Good to hear her voice. Ashley wants to say hi. She’s so adorable. She finds it hilarious that I am about to have dinner while she is eating breakfast. I can hear her laughing uncontrollably at the thought. Sarah says hi too and even ends our conversation without prompting with a “bye, love you”. I go down to the restaurant for dinner. Have a chicken Indian dish with paratha (is it lachha paratha?) bread. Spicy (sweat inducing) yet so delicious. The bread is fantastic — round white with flaky layers. Back to room. TV. CNN. CNBC. ESPN. Hard to sleep. There is an incredible thunderstorm with torrents of rain. I open my balcony door briefly to catch its power. I find out later that monsoon season is just beginning. I also find out that it rained so hard and so long that the roads flooded to the point of becoming impassable. In fact, Anita, the Bangalore PR lead, had a near-disastrous experience in the rapidly flooding streets on her way home and had to turn back and check into a hotel before going home briefly in the morning and then back to Yahoo! for our am meeting. Finally get to sleep.

Day 2: My talk!

Hard to wake up at 8:30am too. Talk’s today! Nerves begin. Media interviews are first! Even worse. Turns out they went fine. Two nice/sharp reporters, especially the second one who really knows her stuff and spoke to us (Rajeev and me) for 1.5 hours. She’s especially interested in the prediction market stuff since that is something new. She may write two articles (for Business World India). Lunch, then a bit of time to rest and freshen up. Stomach is not doing well. Pepto to the rescue. Back down to lobby. They take my picture in the courtyard. Then into the ballroom. Miked. Soundchecked. They accept a final last minute change to my slides: hooray! Room starts filling. 100 people. 200. 300. Now 500. It’s time to start! Rajeev gives a very nice intro. I walk up the stairs onto the stage. I’m miked, in lights, speaking in front of 500 people expecting a Big Thinker. Here I go! “Four score and seven years…” Ha ha. Actually: “Thanks Rajeev, and thanks everyone for your time and attention. I am happy and honored to be here. I’m going to talk about trends in automation in the economy…”

David Pennock speaking at Yahoo! Big Thinkers India June 2009

Audience at Yahoo! Big Thinkers India June 2009

65 minutes later “Thank you very much.” Applause. I think it went well: one of my better talks. I covered everything, including the prediction market stuff. It turns out, like at Yahoo!, and like the journalists, the audience is more interested in prediction markets than advertising. Lots of questions. Some I follow, some I can’t parse the words, others I hear the words but just don’t understand. I do my best. Several people mention they follow my blog: gratifying. After the official Q&A session ends, there is a line up of folks with questions or comments and business cards. It’s the closest I’ll ever be to a rock star. A handful of people wait patiently around me while I try to get to everyone. Eventually the PR folks rescue me and take me to a “high tea” event with Yahoo! Bangalore execs and some recruiting targets. Relief and euphoria kick in. It’s over. I talk with a number of people. I make my exit. Private dining. Call home. Lauren has explained to Ashley that I am on the other side of the world, so when she has the sun, I have the moon. So I can hear Ashley asking in the background, “does Daddy have the moon?” I do. She can’t stop laughing. A repeat of game 6 of the Stanley Cup is on Ten Sports India. I watch it, getting psyched for Game 7. I check online for Ten Sports schedule. Game 7 will be on at 5:30am! I can’t miss that! Set my alarm. Try to sleep. Can’t sleep. Try to sleep. Can’t sleep. Try with TV on. Can’t sleep. Try with TV off. Can’t sleep. Finally fall asleep… Alarm!

Day 3a: Penguins win the Stanley Cup!

Really hard to wake up at 5:30am. Actually maybe not quite as hard since it’s 8pm in my head. Game on! Nerves are racked up. Can’t sit down: bad luck. Pacing. No score first period. Tons of commercials, all for Ten Sports programming: wrestling, cricket, tennis. Every commercial repeats three times. Is period two coming? Yes, it’s back on! Pens score first! Fist pumping and muted cheering. Can they really do this? No sitting rule in full effect. Pacing. Pens score again! Talbot second goal. Wow, is this real? Can it be? Don’t think about it yet. Don’t celebrate too soon. Plenty of time left. Period two ends at 2-0. Unbelievable. All the same commercials come back, three times each. Period three begins. Stand up. Pace. Clock ticks. Pens are playing too defensive: not taking shots, just throwing the puck out of their zone. This isn’t good. Detroit is getting tons of chances. Fleury is awesome. Five minutes left. I let myself think about winning the cup. Mistake! Detroit scores! It’s 2-1! Nerves are ratcheted up beyond ratcheting. I think about it all slipping away. How awful that would feel. If Detroit ties it up, imagine the let down, the blown opportunity. Clock ticks. More chances. More saves. More defense. It’s working! Detroit pulls their goalie. Pressure. Final seconds. Faceoff in our zone. Detroit wins control. Shot. Rebound. Right to a Red Wing — Nick Lidstrom — in perfect position. He shoots. Fleury swings around. He saves it! It’s over! Pens win the Cup! Super fist pumping, jumping around, dancing, muted cheering. They did it! How amazing it feels after last year’s loss to the same team. After falling behind 2-0 and 3-2 in the series. They came back! A delicious payback with the same but opposite script as last year: a two goal lead cut in half in the waning minutes, a flurry of attempts at the end including a few-inch miss of the tying goal in the last seconds. These guys are young and have the potential to rule hockey for several years if they’re lucky. Mario Lemieux is on the ice. How sweet. Twice as player, now as owner, the one who saved hockey in Pittsburgh. What a year for Pittsburgh sports! Two nail biter games, two comebacks, two championships. City of Champions again. Too bad the Pirates have no shot to join them in a trifecta. Back to sleep.

Day 3b: Sightseeing

Phone rings at 11am — my driver is here. Off to do some whirlwind sightseeing. Everyone here who finds out I have a day off recommends I leave Bangalore — Bangalore is just not that nice, nothing really to see, they say. They all recommend Mysore, 3.5 hours away, but that is too far for my comfort level given that my flight is late tonight and it’s supposed to thunderstorm. We start with some souvenir shopping on “MG Road”. My driver takes me to a store and waits in the car outside. I walk in and instantly there are people greeting me and showing me things. One aggressive man takes over and remains my “tour guide” through the whole store. The fact that I reward his aggressiveness by following along and eventually buying stuff will only bolster him to do more of the same in the future. Annoying but clearly it works. I do negotiate him down, but I leave still feeling I didn’t bargain hard enough and with a bit of distaste in my mouth that I fueled and validated the pushy tactics. Next we drive past parliament and the courthouse. Impressive, large, old buildings. But I can just gaze and take photos from the car — can’t go inside. Next we drive past Cubbon Park — tree lined paths and flower gardens in center city. Next is ISKCON temple. But it’s closed. So one more round of shopping at a place called Cottage Industries. I’m wary given the last experience, but go anyway. This one is better. Again one person escorts me around but I feel less pressure. Plus I’m more prepared to say no and negotiate harder. I leave with what seems like a fair amount of value in goods. I recommend Cottage Industries to future visitors: more professional, more familiar (items have price tags), lower pressure, greater variety, and higher quality than at least the first shop I visited. Now we’ve killed enough time and the ISKCON temple is open. It’s a giant Hare Krishna temple. The parking lot is full. I tell the driver it’s ok — we don’t need to go. He says “you go, you go”. “Ok” I say. We drive around again to the same full parking lot. The attendant waves at us to leave, blowing a whistle. My driver is talking to him. They are talking quite heatedly. The attendant in his official looking uniform is waving us on vigorously. Although I can’t understand the words, he is clearly telling us the lot is full and we must leave immediately — we are holding up traffic. My driver is getting more insistent. They are yelling back and forth. I have no idea what he says but it works. The guard lets us in. Meanwhile another car sees our success and tries to argue his way in too but to no avail. I ask my driver what he said: he simply replies “don’t talk”. Indeed once we’re in, there is an empty spot. We put all my bags in my suitcase in the trunk and cover my backpack. We take off our shoes and my driver leads me to the temple. He knows the back entrance and is guiding me to cut in front of lines everywhere. We walk past the main attraction: the altar with some people on the floor worshiping. Then the line weaves past a gift shop of course: I buy a crazy looking book (Easy Journey to Other Planets). We need to kill some time. We go to the gardens again to walk around. We walk into the public library. Most books are in English. Most seem old and worn. The attendant says the library is 110 years old. We start walking through the garden but I am paranoid about mosquitoes/malaria so we turn around early to return to the car. We go to UB City where I meet Rajeev. It’s a thoroughly modern office tower half owned by Kingfisher of Kingfisher Airlines. The building is full of high-end shopping like almost any upscale western mall with all the same brands. Here is the Apple Store. Here is Louis Vuitton. We have dinner at an Italian restaurant that could be anywhere in the western world, owned by an Italian expat. The only seating is outside and I remain worried about mosquitoes but don’t see any. The food is good and the conversation is good.

This place is the closest I’ve seen to the future of Bangalore. In the center of town, a gorgeous building filled with gleaming shops and tantalizing restaurants and bars, with apartments and condos within walking distance, and a palm-tree-lined street leading to the central town circle and the park. As Rajeev says, though, whereas New York has hundreds of similar scenes, Bangalore has one. For now.


Thoughts on Bangalore

Bangalore is a city of jarring contradictions, a hard-to-fathom mix of modernity and poverty. Signs with professional logos and familiar brands are set askew on dilapidated shacks and garages lining the road. While many live on dollars a day and others beg, the majority are smartly dressed (men invariably in button-down shirts), have mobile phones, and are intelligent and friendly. There are gleaming office towers indistinguishable from their western counterparts, yet a strong rain can flood the roads to the point of becoming impassable for hours and day-long blackouts aren’t uncommon. Many billboards are in English, sporting familiar brands and messages. Others, like sexy stars promoting a Bollywood film, are entirely familiar, English or not. Others are impenetrable. Still another advertises a phone number to learn why Obama quoted the Koran.

BMWs and Toyotas join bikes, motorcycles, pedestrians, aging trucks and buses, and colorful open-air motorized rickshaws in a sea of disorganized line-ignoring sign-ignoring traffic. People drive here the way New Yorkers walk sidewalks: weaving past one another in a noisy self-organized tangle that somehow — mostly — works. You can eat outside in a restaurant bar next to upscale shops, a fountain, and smiling yuppies, yet worry that a malaria-infected mosquito lurks nearby or that a washed vegetable will turn a western-coddled stomach deathly ill. When two people ride a motorcycle, as is common, only the driver wears a helmet — the passenger clinging on behind does not: new and old rules on display atop a single vehicle. And the traffic. Oh, the traffic. Roads are clogged nearly every hour of every day. My Saturday of sightseeing was as bad or worse than weekday rush hour. The extent of congestion itself illustrates Bangalore’s two faces: so many people with youth (India is one of the youngest countries in the world), energy, purpose, and the means and intelligence to accomplish it overtaxing a primitive infrastructure. Buildings are going up according to western specs, but under old-time rules where corruption reigns and bribery is an accepted fact of life by even the western-educated aspirational class (about 20% and growing, according to Rajeev).

Thoughts on Yahoo! Labs Bangalore

The folks I met are impressive. Rajeev has done a great job hiring talented, driven folks. Mani’s group of research engineers is fantastic. One is headed to Berkeley for grad school and asks great questions about CentMail. Another proposes an attack on Pictcha. Another (Rahul Agrawal) has read up deeply on prediction markets, including Hanson’s LMSR.

Thoughts on the Yahoo! Big Thinkers India program

The whole event was organized to precision. Anita, the PR lead, was incredible. I especially appreciated the extra “above and beyond” touches like having someone pick up Yahoo! India schwag for my family and send it to my hotel after I forgot: so nice. Raghu, who arranged the media interviews, is supremely organized and on top of his game. The fact that the event draws such a large crowd shows that there is great thirst for events like this in Bangalore. I’m not sure whose idea it is, but it’s a brilliant one: great marketing and great for recruiting.

Thank you Bangalore

In sum, thanks to the people of Bangalore for a fascinating and rewarding trip. Thanks to Rahul at the travel desk whose instant replies about the driver arrangements calmed my nerves on the stressful day of my departure. Thanks to the Yahoo! folks who arranged and organized my talk, and the Yahoo! Labs members for seeding an exceptional science organization. Thanks to my driver who got me everywhere — including into full parking lots, back entrances, and fronts of lines — with efficiency, safety, and a smile (when I tipped him, I tried to think wwsd and wwdd: what would Sharad or Dan do?). Thanks to those who attended my talk and whom I met afterward: it’s gratifying and invigorating to see your level of interest and enthusiasm (and your numbers). And thanks Bangalore chefs for keeping any stomach upset relatively mild and brief.

At the airport on the way out, the flight is overbooked and they are offering close to US$1000 plus hotel to leave tomorrow. Not a chance. It’s been fun and an adventure but my nerves are on high and I miss my family: it’s time to make the 20+ hour journey home.

Where to find the Yahoo!-Google letter to the CFTC about prediction markets

At the Prediction Markets Summit1 last Friday April 24 2009, I mentioned that Yahoo! and Google jointly wrote a letter to the U.S. Commodity Futures Trading Commission encouraging the legalization of small-stakes real-money prediction markets, and that Microsoft had recently written its own letter in support of the effort. (The CFTC maintains a list of all public comments responding to their request for advice on regulating prediction markets.)

I told the audience that they could learn more by searching for “cftc yahoo google” in their favorite search engine, showing the Yahoo! Search results with MidasOracle’s coverage at the top.2

It turns out that was poor advice. 63.7% of the audience probably won’t find what they’re looking for using that search.3


Yahoo! versus Google search for "cftc yahoo google"

If some search engines don’t surface the MidasOracle post, I’m hoping they’ll find this.

And back to the effort to guide the CFTC: I hope other people and companies will join. The CFTC’s request for help itself displays a clear understanding of the science and practice of prediction markets and a real willingness to listen. The more organizations that speak out in support, the greater chance we have of convincing the CFTC to take action and open the door to innovation and experimentation.

1Which I hesitated to attend and host a reception for and now regret endorsing in any way.
2In September 2008, journalist Chris Masse uncovered the letter on the CFTC website before Google or Yahoo! had announced it. We should have known: Masse is extraordinarily skilled at finding anything relevant anywhere, and has been a tireless, invaluable (and unpaid) chronicler of all-things-prediction-markets for years now.
3Even Microsoft Live has the “right” result in position 3. Interestingly, Daniel Reeves got slightly different, presumably personalized, results in Google, even less excuse for not knowing what two MO junkies were looking for with that query.

A tale of two insurance/prediction markets

Chris Masse has the scoop (once again proving how indispensable he is) on a new real-money prediction market coming soon, one of the few with the CFTC’s blessing to operate in the United States: The American Civics Exchange. Their tag line focuses on the insurance angle: “Your greatest financial risks may be hiding in plain sight — market-based solutions for political risk management”.

Meanwhile, Carlos Saieh, a sharp student in Justin Wolfers’ class where I just gave a guest lecture, found an apparent pricing bug in another insurance-oriented prediction market, WeatherBill (proving how indispensable attentive students with laptops and wifi are):



WeatherBill pricing mistake


Let’s see: for a mere $770, you can purchase a contract that pays out at most $700 in the absolute best case, possibly much less. Hmm, let me think about that one.

Finally, a financial contract that makes mortgage-backed securities look good.

KISS prediction markets (lingo) goodbye

The lingo of prediction markets varies widely.

The same “thing” might be called an information market, idea future, virtual stock market, financial market, securities market, event market, binary option, betting exchange, bookmaker, market in uncertainty, or gambling/wagering. Only recently has the name prediction market emerged with some sort of consensus.

To place a prediction in the market, you might do any of the following:

[bid/buy/bet on/back] the “yes” [security/contract/coupon/future/outcome] at [price/probability/fractional odds/decimal odds/moneyline] X

Predicting something won’t happen gets even uglier. You might:

[ask/short sell yes/buy no/buy bundle & sell yes/bet against/lay] at [price/probability/fractional odds/decimal odds/moneyline] X

For example, InklingMarkets uses the “short sell yes” variation:

InklingMarkets' explanation of short selling

So what is the clearest language for prediction markets?

A good guiding principle in this regard is KISS: Keep It Simple Stupid. Or, in more grandiose terms, Occam’s razor. All else being equal, one should choose the simplest and most straightforward option.

By this measure, it seems that betting lingo wins hands down. It’s vastly simpler to say “I bet $10 that Obama will lose” than to say “I short sell three shares of Obama at price 67”. The former is more direct and intuitive. Almost everyone understands what it means to place a bet, including subtleties like risk, uncertainty, and competition. On the other hand, even avid stock traders get tripped up by the concept of selling short.

Every prediction can be stated as: “I bet that outcome O will/won’t happen; I’ll risk $X to win $Y”. Betting for things and against things is symmetric. There is no need to short sell, buy bundles first, etc.
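To make the symmetry concrete, here is a minimal Python sketch that translates financial-style trades into the "risk $X to win $Y" form. It assumes a winner-take-all contract that pays a fixed amount per share (100 here) if the outcome happens; the function names are hypothetical and chosen only for illustration.

```python
def buy_yes_as_bet(price, shares, payout=100):
    """Buying yes: you risk what you pay and win the rest of the payout."""
    risk = price * shares
    win = (payout - price) * shares
    return risk, win


def short_yes_as_bet(price, shares, payout=100):
    """Short selling yes: you collect the price now and owe the payout if it happens."""
    risk = (payout - price) * shares
    win = price * shares
    return risk, win


def implied_probability(risk, win):
    """Probability at which the bet breaks even, for the side the bettor is backing."""
    return risk / (risk + win)


# "I short sell three shares of Obama at price 67" is the same as
# "I bet that Obama will lose; I'll risk $99 to win $201".
risk, win = short_yes_as_bet(67, 3)
print(risk, win)                       # 99 201
print(implied_probability(risk, win))  # 0.33
```

Both directions reduce to the same two numbers, which is why the betting phrasing needs no separate notion of short selling.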

Yet most prediction markets don’t KISS, going with financial terminology instead, reflected even in the name itself. Why? I believe it’s because of the legal and social stigma attached to gambling. It’s a shame that such considerations force vendors to make the technology harder to understand and more complicated to use.

Should there be a Prediction Market Institute?

There’s a Prediction Market Industry Association (sort of).

Is it time for a Prediction Market Institute dedicated to scientific advancement and engineering innovation in prediction markets?

On the face of it, the concept is ludicrous: there is no “Support Vector Machine Institute”, for example. But a bunch of tech companies have PM research efforts of some sort, including Google, HP, Microsoft, and Yahoo!. Folks at these companies have come together to lobby, to speak, and to exchange academic research results. Would YaHPooglesoft fund such an institute? If not, who? Chris Masse, who adds “PM journalism” to the list of institute goals, is on the case.

Wall Street's version of a combinatorial market

I was poking around TD AMERITRADE and came across this description of conditional orders (login required, or look here), or sequences of orders that are synchronized in various ways:

What is a conditional order and how do I place one?

Conditional orders let you combine two or three individual orders that will, if filled, either cancel or trigger additional orders. Conditional orders are available for both stocks and single-leg option orders (in option-approved accounts).

The following types of conditional orders are available:

  • OCA (one cancels another) – submit two orders simultaneously; if one order is filled, the other is canceled.
  • OTA (one triggers another) – submit an order and if that order is filled, submit another order.
  • OTT (one triggers two) – submit an order and if that order is filled, submit two additional orders.
  • OT/OCA (one triggers an OCA order) – submit an order; if that order is filled, submit two orders simultaneously; if one of these orders is filled, cancel the other.
  • OT/OTA (one triggers an OTA order) – submit an order; if that order is filled, submit another order. If that order is filled, submit a third order.

At first glance these resemble combinatorial bids that allow traders to buy several things at once, but they’re not. They’re more like bidding agent programs that describe exactly what to do under various conditions: more complex than limit orders and stop-loss orders, but not fundamentally different from them. They can be executed without any cooperation from the exchange.
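A rough Python sketch shows why such orders need nothing new from the exchange. The `ToyExchange` class is a hypothetical stand-in that understands only plain orders, and the two agents (names invented here) implement OCA and OTA entirely on the client side by watching order status. It is an illustration of the idea, not a description of how TD AMERITRADE implements these orders.

```python
class ToyExchange:
    """Hypothetical exchange that knows only plain, independent orders."""

    def __init__(self):
        self.orders = {}  # order id -> "open", "filled", or "canceled"

    def submit(self, description):
        order_id = len(self.orders)
        self.orders[order_id] = "open"
        return order_id

    def cancel(self, order_id):
        if self.orders[order_id] == "open":
            self.orders[order_id] = "canceled"

    def fill(self, order_id):
        # Stands in for the market filling the order.
        if self.orders[order_id] == "open":
            self.orders[order_id] = "filled"

    def status(self, order_id):
        return self.orders[order_id]


class OneCancelsAnother:
    """Client-side agent: submit two orders, cancel one when the other fills."""

    def __init__(self, exchange, first, second):
        self.exchange = exchange
        self.ids = (exchange.submit(first), exchange.submit(second))

    def on_update(self):
        a, b = self.ids
        if self.exchange.status(a) == "filled":
            self.exchange.cancel(b)
        elif self.exchange.status(b) == "filled":
            self.exchange.cancel(a)


class OneTriggersAnother:
    """Client-side agent: submit a second order only after the first fills."""

    def __init__(self, exchange, first, then):
        self.exchange = exchange
        self.first_id = exchange.submit(first)
        self.then = then
        self.then_id = None

    def on_update(self):
        if self.then_id is None and self.exchange.status(self.first_id) == "filled":
            self.then_id = self.exchange.submit(self.then)
```

The exchange in this sketch never sees the link between the orders; all of the conditional logic lives in the agents.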

This brings to light a key distinction: some forms of expressiveness can be achieved by layering increasingly complicated bidding agents on top of an existing exchange. Other types of expressiveness, for example true combinatorial bids, require new optimization routines put directly into the exchange.

The distinction arises in advertising as well. In a sponsored search auction, advertisers can bid lower during the day when people tend to browse and higher in the evening when people tend to buy, and they can even write a program to do it for them automatically. However, an advertiser cannot execute a “guaranteed delivery” contract in sponsored search without changing the underlying auction mechanism.

Why should we care about the latter type of expressiveness that requires “smarter” exchange mechanisms? One word: efficiency. Economic efficiency, that is. With greater expressiveness, resources can be shuffled to align more precisely with who wants them the most. Advertising opportunities (a particular user’s attention on a particular page) can go to advertisers who value them most. Financial transactions that otherwise might go unmet can be consummated. Insurance buyers can get better coverage. And gamblers can have more fun.

What is (and what good is) a combinatorial prediction market?

What exactly is a combinatorial prediction market?

2010 Update: Several of us at Yahoo! Labs, along with academic researchers, have theorized and written about combinatorial prediction markets for several years, as you’ll see below. But now we’ve gone beyond talking about them and actually built one. So the best way to answer the question is to see the market we built and play with it. It’s called Predictalot. The first version was based on the NCAA Men’s College Basketball tournament known as March Madness.

Combinatorial Madness

March Madness is the anything-can-happen-and-often-does tournament among the top 64 NCAA Men’s College Basketball teams. The “madness” of the games is rivaled only by the madness of fans competing to pick the winners. In Las Vegas, you can bet on many things, from individual games to the overall champion to more exotic “propositions” like which conference of teams will do best. Still, each gambling venue defines in advance exactly what you are allowed to bet on, offering an explicit list of usually no more than a few thousand choices.

A combinatorial market maker fulfills an almost magical promise: propose any obscure proposition, click “accept”, and your bet is placed: no doubt and no waiting.

In contrast, a combinatorial market could allow you to make up nearly any proposition you want on the fly, for example, “Duke will advance further than UNC” or “At least one of the top four seeds will lose in the first round”, or “ACC conference teams will win every game they play against lower-seeded SEC conference teams”. How many such propositions are there? Let’s count. There are 63 games (ignore the new play-in game), each of which could go either to the favorite or the underdog, so there are 2^63 or over 9,220,000,000,000,000,000 (9.22 quintillion) outcomes, or ways the tournament in its entirety could unfold. Propositions are collections or sets of outcomes: for example “Duke will advance further than UNC” is a statement that’s true in something less than half of the 9.2 quintillion outcomes. Technically, then, there are 2^(2^63) possible propositions, a number that dwarfs the number of atoms in the universe. Clearly we could never write down a list that long, even inside a computer. However, that doesn’t necessarily mean we can’t operate such a market if we are a little clever about how we implement it, as we’ll see below.
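The counting argument is easy to check in a few lines of Python. The number of propositions, 2 raised to the power 2 to the 63rd, is far too large to construct, so the sketch only counts its decimal digits using logarithms. The figure of roughly 10^80 atoms in the observable universe is a commonly quoted ballpark, used here as an assumption for comparison.

```python
import math

GAMES = 63

# Each game has two possible winners, so the outcomes are all 63-bit strings.
outcomes = 2 ** GAMES
print(f"{outcomes:,}")  # 9,223,372,036,854,775,808

# A proposition is any subset of outcomes, so there are 2**outcomes of them.
# That number cannot be stored, but its length in decimal digits can be estimated.
proposition_digits = outcomes * math.log10(2)
print(f"about {proposition_digits:.2e} decimal digits")

# Assumption: roughly 10**80 atoms in the observable universe, an 81-digit number.
ATOM_COUNT_DIGITS = 81
print(proposition_digits > ATOM_COUNT_DIGITS)  # True
```

In other words, just writing down the number of propositions would take a few quintillion digits, while the atom count fits on one line.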

So here is my informal definition: a combinatorial market is one where users can construct their own bets by mixing and matching options in myriad ways, sort of like ordering a Wendy’s hamburger. (Or highly customized insurance.)

The Details

Now I’ll try for a more precise definition.

Just to set the vocabulary straight, outcomes are all possible things that might happen: for example all five candidates in an election, all 30 teams in an NBA Championship market, all 3,628,800 (or 10!) finish orderings in a ten-horse race, or all 9.2 quintillion March Madness tournament results. Among the outcomes, in the end one and only one of them will actually occur; traders try to predict which.

Bids express what outcome(s) traders think will happen. Bids also contain the risk-reward ratio the trader is willing to accept: the amount she wins if correct and the amount she is willing to lose if incorrect.

There are two reasons why we might call a market “combinatorial”: either the bids are combinatorial or the outcomes are combinatorial. The latter poses a much harder computational problem. I’ll start with the former.

  1. Combinatorial bids. A combinatorial bid or bundle bid is a concise expression representing a collection or set of outcomes, for example “a Western Conference team will win the NBA Championship”, encompassing 15 possible outcomes, or “horse A will finish ahead of horse B” in a ten-horse race, encoding 1,814,400, or half, of the possible outcomes. Yoopick, our experimental sports prediction market on Facebook, features a type of combinatorial bidding called interval bidding. Traders select the range they think the final score difference will fall into, for example “Pittsburgh will win by between 2 and 11 points”. An interval bet is actually a collection of bets on every outcome between the left and right endpoints of the range.

    For comparison, a non-combinatorial bid is a bet on a single outcome, for example “candidate O will win the election”. The vast majority of fielded prediction markets handle only non-combinatorial bids.

    What are examples of combinatorial bids besides Yoopick? Abe Othman built an interval betting interface similar to Yoopick (he came up with it on his own, proving that great minds think alike) to predict when the new CMU computer science building will finish construction. Additional examples include Bossaerts et al.’s concept of combined value trading and the parimutuel call market mechanism [Baron & Lange, Lange & Economides, Peters et al.]. 2010 Update: Predictalot is our latest example of a market featuring both combinatorial bids and outcomes.

  2. Combinatorial outcomes. The March Madness scenario is an example of combinatorial outcomes. The number of outcomes (e.g., 9.2 quintillion) may be so huge that we could never hope to track every outcome explicitly inside a computer. Instead, outcomes themselves are defined implicitly according to some counting process that involves enumerating every possible combination of base objects. For example, the outcome space could be all n! possible finish orderings of an n-horse race. Or all 2^n combinations of n binary events. In both cases, the number of outcomes grows exponentially in the number of base objects n, quickly becoming unimaginably large as n grows.

    A market with combinatorial outcomes is almost nonsensical without allowing combinatorial bids as well, since individual outcomes are like microbes on a needle on a cruise ship of hay in a universe-sized sea. No one wants to bet on these minuscule possibilities one at a time. Instead, traders bet on high-level properties of outcomes, like “Duke will advance further than UNC”, that encode sets of outcomes. Here are some example forms of combinatorics and corresponding bidding languages that seem natural:

    • Boolean betting. Outcomes are combinations of binary events. Bids are phrased in Boolean logic. So if base objects are “Democrat will win in Alabama”, “Democrat will win in Alaska”, etc. for all fifty US states, and outcomes are all 2^50 possible ways the election might swing across all 50 states, then bids may be of the form “Democrat will win in Ohio and Florida, but not Virginia”, or “Democrat will win Nevada if they win California”, etc. For further reading, see Hanson’s paper on combinatorial market makers and our papers on the computational complexity of Boolean betting auctioneers and market makers.
    • Tournament betting. This is the March Madness example and a special case of Boolean betting. See our paper on tournament betting market makers.
    • Permutation betting. Outcomes are possible finish orderings in a horse race. Bids are properties of orderings, for example “Horse B will finish ahead of horse D”, or “Horse B will finish between 3rd and 7th place”. See our papers on permutation betting auctioneers and market makers.
    • Taxonomy betting. Base objects are (discretized) numbers arranged in a taxonomy, for example web site page views organized by topic, subtopic, etc. Outcomes are all possible combinations of the numbers. Bets can be placed on the range of any number in the taxonomy, for example page views of a sports web site, page views of the NBA subsection of the web site, etc. Coming soon: a paper on taxonomy betting led by Mingyu Guo at Duke. [Update: here is the paper.]

    We summarize some of these in a short article on Combinatorial betting and a more detailed book chapter on Computational aspects of prediction markets.

    2009 Update: Gregory Goth writes an excellent and accessible summary in the March 2009 Communications of the ACM, p.13.
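The idea that a bid is a concise description of a set of outcomes can be sketched in Python by writing each bid as a true/false test on an outcome and counting how many outcomes pass. The example uses a five-horse race and three states so that brute-force enumeration stays small; the helper names are hypothetical.

```python
from itertools import permutations, product
from math import factorial


def count_outcomes(outcomes, bid):
    """Number of outcomes in which the bid would pay off."""
    return sum(1 for outcome in outcomes if bid(outcome))


# Permutation betting: outcomes are finish orderings of a five-horse race.
orderings = list(permutations("ABCDE"))  # 5! = 120 outcomes


def a_ahead_of_b(order):
    return order.index("A") < order.index("B")


print(count_outcomes(orderings, a_ahead_of_b))  # 60, exactly half

# By the same symmetry, the ten-horse version covers half of 10! orderings.
print(factorial(10) // 2)  # 1814400

# Boolean betting: outcomes are win/lose combinations across three states.
states = ("OH", "FL", "VA")
elections = [dict(zip(states, wins)) for wins in product((True, False), repeat=3)]


def ohio_and_florida_not_virginia(result):
    return result["OH"] and result["FL"] and not result["VA"]


print(count_outcomes(elections, ohio_and_florida_not_virginia))  # 1 of 8
```

Enumeration like this only works for toy sizes, which is exactly why markets with combinatorial outcomes need the cleverer machinery discussed below.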

Auctioneer versus market maker

So far, I’ve only talked about the form of bids from traders. Next I’ll discuss the actual mechanics of the marketplace, or how bids are processed. How does the market operator decide which bids to accept or reject? At what prices?

I’ll focus on two major possibilities: either the market operator acts as an auctioneer or he acts as an automated market maker.

An auctioneer only matches up willing traders with each other — the auctioneer never takes on any risk of his own. This is how most financial exchanges like the stock market operate, and how Intrade and Betfair operate. (A call market is a special case where the auctioneer collects many bids over a period of time, then processes them all together in a single batch.)

An automated market maker will quote a price for any bet whatsoever. Even lone traders can place their bet with the market maker as long as they accept the price, greatly enhancing liquidity. The liquidity comes at a cost though: an automated market maker can and often does lose money, though clever pricing algorithms can guarantee that losses won’t mount beyond a fixed amount set in advance. Hanson’s logarithmic market scoring rule market maker is far and away the most popular for prediction markets, and for good reason: it’s simple, has nice modularity properties, and behaves well in practice. We catalog a number of bounded-loss market makers in this paper. The dynamic parimutuel market used in the (now closed) Yahoo! Tech Buzz Game can be thought of as another type of automated market maker.
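For readers who want to see the mechanics, here is a minimal Python sketch of Hanson's logarithmic market scoring rule in its usual cost-function form: the market maker tracks the number of shares `q[i]` sold on each outcome, charges the change in C(q) = b · log(Σ exp(q[i]/b)) for any purchase, and quotes prices proportional to exp(q[i]/b). The liquidity parameter `b` and the function names are illustrative choices, not taken from any particular implementation.

```python
import math


def lmsr_cost(q, b):
    """Cost function C(q) = b * log(sum_i exp(q_i / b)), computed stably."""
    m = max(q)
    return m + b * math.log(sum(math.exp((x - m) / b) for x in q))


def lmsr_prices(q, b):
    """Instantaneous prices; they are positive and sum to 1."""
    m = max(q)
    weights = [math.exp((x - m) / b) for x in q]
    total = sum(weights)
    return [w / total for w in weights]


def cost_to_buy(q, b, outcome, shares):
    """What a trader pays for `shares` that each pay $1 if `outcome` occurs."""
    after = list(q)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(q, b)


b = 100.0
q = [0.0] * 5                       # five candidates, nothing sold yet
print(lmsr_prices(q, b))            # every candidate starts at 0.2
print(cost_to_buy(q, b, 0, 50))     # price paid for 50 shares of candidate 0

# The market maker's worst-case loss from this starting point is b * ln(5).
print(b * math.log(len(q)))
```

The last line is the "fixed amount set in advance": starting from equal prices, the loss can never exceed b times the logarithm of the number of outcomes, so a larger `b` buys more liquidity at the price of a larger possible subsidy.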

A market with combinatorial outcomes almost requires a market maker to function smoothly. When traders have such a mind-boggling array of choices, the chance that two or more of their bets will exactly counter each other seems remote. If trades are rarely filled, then traders won’t bother bidding at all, causing a no-chicken-no-egg spiral into failure.

On the other hand, a market maker allows anyone to get a price quote at any time on any bet, no matter how convoluted or specific, even if no other traders had thought about that particular possibility. Thus interacting with a combinatorial market maker can be highly satisfying: propose any obscure proposition, click “accept price”, and your bet is placed: no doubt and no waiting.

I’ll discuss one more technicality. An auctioneer must decide whether bids can be partially filled, giving traders both less risk and less reward than they requested, in the same ratio. This makes sense. If I’m willing to risk $100 to win $200, I’d almost surely risk $50 to win $100 instead. Allowing partial fills greatly simplifies life for the auctioneer too. If bids are divisible, or can be filled in part, the auctioneer can use efficient linear programming algorithms; if bids are indivisible, the auctioneer must use integer programming algorithms that may be intractable. For more on the divisible/indivisible distinction, see Bossaerts et al. and Fortnow et al. Allowing divisible bids seems the logical choice in most scenarios, since the market functions better and most traders won’t mind.

The benefits of combinatorial markets

Why do we need or want combinatorial markets? Simply put, they allow for the collection of more information, the life-blood of every prediction market. Combinatorial outcomes allow traders to assess the correlations among base objects, not just their independent likelihoods, for example the correlation between Democrats winning in Ohio and Pennsylvania. Understanding correlations is key in many applications, including risk assessment: one might argue that the recent financial meltdown is partly attributable to an underestimation of correlation among firms and securities and the chances of cascading failures.

Although financial and betting exchanges, bookmakers, and racetracks are modernizing, turning their operations over to computers and moving online, their core logic for processing bids hasn’t changed much since auctioneers were people. For simplicity, they treat all bets like apples and oranges, processing them independently, even when they are more like hamburgers and cheeseburgers. For example, bets on a horse “to win” and “to finish in the top two” are managed separately at the racetrack, as are options to buy a stock at “strike price 30” and “strike price 20” on the CBOE. In both cases it’s a logical truism that the first is worth less than the second, yet the market pleads ignorance, leaving it to traders to enforce consistent pricing.

In a combinatorial market, a bet on “Duke will win the tournament” automatically increases the odds on “Duke will win in the first round”, as it logically should. Mindless mechanical tasks like this are handled automatically, by algorithms that are far better at it anyway, freeing up traders for the primary task a prediction market asks them to do: provide information. Traders are free to express their information in whatever form they find most natural, and it all flows into the same pool of liquidity.
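A toy version of this effect fits in a short Python sketch. It assumes a four-team bracket (the two underdogs are placeholders), enumerates all eight tournament outcomes explicitly, and runs a logarithmic market scoring rule over them, which is feasible only because the example is tiny. A bet on a proposition adds shares to every outcome in which it is true, and a proposition's price is the total price of those outcomes. The liquidity value is arbitrary.

```python
import math
from itertools import product

B = 10.0  # liquidity parameter, chosen arbitrarily for the illustration

# Semifinal 1: Duke vs. Underdog 1. Semifinal 2: UNC vs. Underdog 2.
# An outcome is (semifinal 1 winner, semifinal 2 winner, champion).
outcomes = [
    (w1, w2, champion)
    for w1, w2 in product(("Duke", "Underdog 1"), ("UNC", "Underdog 2"))
    for champion in (w1, w2)
]
shares = {outcome: 0.0 for outcome in outcomes}


def price(proposition):
    """Sum of outcome prices over the outcomes where the proposition holds."""
    weights = {o: math.exp(s / B) for o, s in shares.items()}
    total = sum(weights.values())
    return sum(w for o, w in weights.items() if proposition(o)) / total


def buy(proposition, amount):
    """Buy `amount` shares of a proposition: add them to each matching outcome."""
    for outcome in shares:
        if proposition(outcome):
            shares[outcome] += amount


def duke_wins_tournament(outcome):
    return outcome[2] == "Duke"


def duke_wins_first_round(outcome):
    return outcome[0] == "Duke"


before = price(duke_wins_first_round)   # 0.5 with no bets placed
buy(duke_wins_tournament, 5.0)
after = price(duke_wins_first_round)    # higher, although nobody bet on it directly
print(before, after)
```

Because both propositions are priced from the same pool of outcomes, the price of the first-round win can never fall below the price of the championship, with no trader needed to enforce it.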

I discuss the benefits of combinatorial bids further in this post, including one benefit I don’t mention here: smarter accounting, or making sure no more is reserved from a trader’s balance than necessary to cover their worst-case loss.

The disadvantages of combinatorial markets

I would argue that there is virtually no disadvantage to allowing combinatorial bids. They are more flexible and natural for traders, and they eliminate redundancy and thus concentrate liquidity (again I refer the reader to this previous post). Allowing indivisible combinatorial bids can cause computational problems, but as I argue above, divisible bids make more sense anyway.

On the other hand, there can be disadvantages to markets with combinatorial outcomes. First, trader attention and liquidity may be severely fractured, since there are nearly limitless things to bet on.

Second, and perhaps more troublesome, running an auctioneer with combinatorial outcomes is computationally intractable (specifically, NP-hard, or as hard as solving SAT) and running a market maker is even harder (specifically, #P-hard, as hard as counting SAT). In practice this means that all known exact algorithms take time that, in the worst case, grows roughly in line with the number of outcomes, which is exponential in the number of base objects.

It gets worse. Even if we place strict limits on what types of bets traders can make, the market may still be infeasible to run. For example, even if all bets are pairwise, like “Horse B will finish ahead of horse D”, the auctioneer and market maker problems for permutation betting remain NP-hard and #P-hard, respectively. Likewise, Boolean betting remains hard even if the most complicated bet allowed is joining two events, like “E will happen and F will not” [see Chen et al. and Fortnow et al.].

How to build one

Now for some good news: in some cases, fast algorithms are possible. If all bets are subset bets of the form “Horse A will finish in position 1, 2, or 10” or “Horse B, C, or E will finish in position 3”, then permutation betting with an auctioneer is feasible (using a combination of linear programming and maximum matching), even though the corresponding market maker problem is #P-hard. If all bets are of the form “Team B will advance to round k”, tournament betting with a market maker is feasible (using Bayesian network inference). Taxonomy betting with a market maker is feasible (using dynamic programming).

Finally, even better news: fast market maker approximation algorithms are not only possible and practical, they work without limiting what people can bet on, fulfilling the almost magical promise I made at the outset of constructing any bet you can imagine on the fly. Approximation works because people like to bet on things that have a decent chance of happening, say between a 1% and 99% chance. Standard sampling algorithms, including importance sampling and MCMC, are good at approximating prices for such reasonable events. For the extreme (e.g., 1-in-a-billion) events, sampling may fail, so the market maker will have to round off in its own favor to be safe.
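As a rough illustration of pricing by sampling, the Python sketch below estimates a market-scoring-rule price for Boolean betting without enumerating outcomes. It draws outcomes uniformly at random and weights each one by exp(shares/b), a simple self-normalized form of importance sampling. This is one plausible scheme under stated assumptions, not the algorithm any fielded market uses; the bets, the parameter values, and the function names are invented for the example, and the exact price is computed by brute force only to check the estimate.

```python
import math
import random
from itertools import product

B = 10.0        # liquidity parameter (arbitrary)
N_EVENTS = 12   # small enough that brute force can still check the answer

# Bets already accepted: (proposition, shares sold). Purely illustrative.
bets = [
    (lambda o: o[0] and o[1], 8.0),
    (lambda o: o[2] or not o[3], 5.0),
]


def weight(outcome):
    """Unnormalized price of one outcome: exp(total shares held on it / B)."""
    held = sum(amount for proposition, amount in bets if proposition(outcome))
    return math.exp(held / B)


def exact_price(proposition):
    """Brute force over all 2**N_EVENTS outcomes; hopeless for large N_EVENTS."""
    numerator = denominator = 0.0
    for outcome in product((False, True), repeat=N_EVENTS):
        w = weight(outcome)
        denominator += w
        if proposition(outcome):
            numerator += w
    return numerator / denominator


def sampled_price(proposition, samples, rng):
    """Estimate the same ratio from uniformly sampled outcomes."""
    numerator = denominator = 0.0
    for _ in range(samples):
        outcome = tuple(rng.random() < 0.5 for _ in range(N_EVENTS))
        w = weight(outcome)
        denominator += w
        if proposition(outcome):
            numerator += w
    return numerator / denominator


def event_zero_happens(outcome):
    return outcome[0]


rng = random.Random(0)
print(exact_price(event_zero_happens))
print(sampled_price(event_zero_happens, 20000, rng))
```

For an event like this one, with a price well away from 0 and 1, a few thousand samples land close to the exact value. For a 1-in-a-billion proposition almost no sample would satisfy it, which is the failure mode mentioned above.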

Wrapping up, in my mind, the best way to implement a combinatorial-outcome prediction market is as follows:

  • Use a market maker. Without one, traders are unlikely to find each other in the sea of choices. Specifically, use Hanson’s LMSR market maker.
  • Use an approximation algorithm for pricing. Importance sampling seems to work well. MCMC is another possibility. See Appendix A of this paper.
  • The interface is absolutely key, and the aspect I’m least qualified to opine on. I think Predictalot, WeatherBill, Yoopick, and WhenWillWeMove point in the right direction.

2010 Update: Predictalot is our first pass at carrying through on this vision of how to build a combinatorial prediction market. In building it, we learned a great deal already, for example that sampling is much much trickier than I had initially imagined, and that it’s easy to accidentally create arbitrage loopholes if you’re not extremely careful with the math.

I glossed over a number of details. For example, care must be taken for the market maker to always round approximations in its own favor to avoid opening itself up to arbitrage attacks. Another difficulty is how to implement smart accounting to allow traders maximum leverage when they place many interrelated bets. The assumption that traders could lose all their bets is far too conservative — they might have bets that provably cannot simultaneously lose — but may serve as a reasonable starting point in practice.
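The accounting point can be illustrated with a small Python sketch that compares the conservative reserve (assume every bet loses) with the true worst case over outcomes. It enumerates outcomes directly, which is only possible for toy markets; doing the same thing efficiently over a combinatorial outcome space is the hard part. The bets and names are made up for the example.

```python
def worst_case_loss(outcomes, bets):
    """Largest net loss over all outcomes; bets are (proposition, risk, win)."""
    worst_net = 0.0
    for outcome in outcomes:
        net = sum(win if proposition(outcome) else -risk
                  for proposition, risk, win in bets)
        worst_net = min(worst_net, net)
    return -worst_net


# Outcome = the winning horse in a three-horse race.
horses = ["A", "B", "C"]

# Two bets that provably cannot both lose.
bets = [
    (lambda winner: winner == "A", 10.0, 20.0),  # risk $10 to win $20 that A wins
    (lambda winner: winner != "A", 10.0, 5.0),   # risk $10 to win $5 that A loses
]

conservative_reserve = sum(risk for _, risk, _ in bets)
smart_reserve = worst_case_loss(horses, bets)
print(conservative_reserve, smart_reserve)  # 20.0 5.0
```

Here the conservative rule ties up $20 of the trader's balance, while the worst outcome actually costs $5, so the remaining $15 could be freed for other bets.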

The "predict flu using search" study you didn't hear about

In October, Philip Polgreen, Yiling Chen, myself, and Forrest Nelson (representing University of Iowa, Harvard, and Yahoo!) published an article in the journal Clinical Infectious Diseases titled “Using Internet Searches for Influenza Surveillance”.

The paper describes how web search engines may be used to monitor and predict flu outbreaks. We studied four years of data from Yahoo! Search together with data on flu outbreaks and flu-related deaths in the United States. All three measures rise and fall as flu season progresses and dissipates, as you might expect. The surprising and promising finding is that web searches rise first, one to three weeks before confirmed flu cases, and five weeks before flu-related deaths. Thus web searches may serve as a valuable advance indicator for health officials to spot the onset of diseases like the flu, complementary to other indicators and forecasts.

On November 11, the New York Times broke a story about Google Flu Trends, along with an unusual announcement of a pending publication in the journal Nature.

I haven’t read the paper, but the article hints at nearly identical results:

Google … dug into its database, extracted five years of data on those queries and mapped it onto the C.D.C.’s reports of influenzalike illness. Google found a strong correlation between its data and the reports from the agency…

Tests of the new Web tool … suggest that it may be able to detect regional outbreaks of the flu a week to 10 days before they are reported by the Centers for Disease Control and Prevention.

To the reporter’s credit, he interviewed Philip and the article does mention our work in passing, though I can’t say I’m thrilled with the way it was framed:

The premise behind Google Flu Trends … has been validated by an unrelated study indicating that the data collected by Yahoo … can also help with early detection of the flu.

giving (grudging) credit to Yahoo! data rather than Yahoo! people.

The story slashdigged around the blogomediasphere quickly and thoroughly, at one point reaching #1 on the nytimes.com most-emailed list. Articles and comments praise how novel, innovative, and outside-of-the-box the idea is. The editor in chief of Nature praised the “exceptional public health implications of [the Google] paper.”

I’m thrilled to see the attention given to the topic, and the Google team deserves a huge amount of credit, especially for launching a live web site as a companion to their publication, a fantastic service of great social value. That’s an idea we had but did not pursue.

In the business world, being first often means little. However, in the world of science, being first means a great deal and can be the determining factor in whether a study gets published. The truth is, although the efforts were independent, ours was published first — and Clinical Infectious Diseases scooped Nature — a decent consolation prize amid the go-google din.

Update 2008/11/24: We spoke with the Google authors and the Nature editors and our paper is cited in the Google paper, which is now published, and given fair treatment in the associated Nature News item. One nice aspect of the Google study is that they identified relevant search terms automatically by regressing all of the 50 million most frequent search queries against the CDC flu data. Congratulations and many thanks to the Google/CDC authors and the Nature editors, and thanks everyone for your comments and encouragement.