All posts by David Pennock

Reporting prediction market prices

Reuters recently ran a story on political prediction markets, quoting prices from intrade and IEM. (Apparently the story was buzzed up to the Yahoo! homepage and made the Drudge Report.)

The reporter phrased prices in terms of the candidates’ percent chance of winning:

Traders … gave Democratic front-runner Barack Obama an 86 percent chance of being the Democratic presidential nominee, versus a 12.8 percent for Clinton…

…traders were betting the Democratic nominee would ultimately become president. They gave the Democrat a 59.1 percent chance of winning, versus a 48.8 percent chance for the Republican.

The latter numbers imply an embarrassingly incoherent market, giving the Democrats and Republicans together a 107.9% chance of winning. This is almost certainly the result of a typo, since the Republican candidate on intrade has not been much above 40 since mid 2007.

Still, typos aside, we know that the last-trade prices of candidates on intrade and IEM often don’t sum to exactly 100. So how should journalists report prediction market prices?

Byrne Hobart suggests they should stick to something strictly factual like "For $4.00, an investor could purchase a contract which would yield $10.00" if the Republican wins.

I disagree. I believe that phrasing prices as probabilities is desirable. The general public understands “percent chance” without further explanation, and interpreting prices in this way directly aligns with the prediction market industry’s message.

When converting prices to probabilities, is a journalist obligated to normalize them so they sum to 100? Should journalists report last-trade prices or bid-ask spreads or something else?

My inclination is that bid-ask spreads are better. Something like "traders gave the Democrats between a 22 and 30 percent chance of winning the state of Arkansas". These will rarely be inconsistent (otherwise arbitrage is sitting on the table) and the phrasing is still relatively easy to understand.
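To make the two reporting options concrete, here is a minimal sketch. The function names are my own, and the Republican bid and ask are invented purely for illustration. It rescales the prices quoted in the Reuters story so they sum to 100, and checks whether a set of bid-ask quotes leaves arbitrage on the table.

```python
def normalize(prices):
    """Rescale prices so they sum to 100 while keeping their ratios."""
    total = sum(prices.values())
    return {name: 100 * p / total for name, p in prices.items()}


def is_coherent(quotes):
    """quotes maps outcome -> (bid, ask) in percent, for outcomes that are
    mutually exclusive and exhaustive. Coherent means there is no risk-free
    profit from selling every outcome at its bid or buying every outcome
    at its ask."""
    total_bid = sum(bid for bid, _ in quotes.values())
    total_ask = sum(ask for _, ask in quotes.values())
    return total_bid <= 100 <= total_ask


# Prices as quoted in the story (the 48.8 is probably a typo).
reported = {"Democrat": 59.1, "Republican": 48.8}
normalized = normalize(reported)
for name, pct in normalized.items():
    print(f"{name}: {pct:.1f}%")

# Democratic spread from the Arkansas example; the Republican spread is made up.
quotes = {"Democrat": (22, 30), "Republican": (68, 76)}
print(is_coherent(quotes))
```

Normalizing hides the incoherence from the reader, whereas reporting the spread keeps it visible, which is part of why I lean toward spreads.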

Avoiding this (admittedly nitpicky) dilemma is another advantage of automated market makers like Hanson’s. The market maker’s prices always sum to exactly 100, and the bid, ask, and last-trade prices are one and the same. Auction-type mechanisms like intrade’s can also be designed better so that prices are automatically kept consistent.

A freakonomist takes on Big Weather and, … stumbles

It seems that even D.I.Y. freakonomists aren’t sure how to judge probability forecasts.

In measuring precipitation accuracy, the study assumed that if a forecaster predicted a 50 percent or higher chance of precipitation, they were saying it was more likely to rain than not. Less than 50 percent meant it was more likely to not rain.

That prediction was then compared to whether or not it actually did rain…

New York Post Video: Gambling on Politics

Two New York Post video reporters came to Yahoo!’s midtown NYC office last Friday to interview me for a piece they were producing on intrade’s political prediction markets. The video is now up on NYPOST.COM and in the embedded player below. The reporters were friendly and professional (thankfully they cut out most of my word-fumbling moments), and the end result is an entertaining, polished, informative video geared toward newbies. My own role came out at least not terrible.

If you look carefully, you’ll see subtle product placement of the Yahoo! Election Dashboard, which aggregates a ton of election numbers including intrade prices. You can also see short clips of the conference room, whiteboard scribbles, ylogo, and cubes at our Y! Research NYC office.

See also: Chris Masse’s comments

Crowdsourcing meets crowd wisdom

I met Lukas Biewald at CI Foo [1 2 3 4 5 6]. Lukas is involved in a fascinating startup called Dolores Labs that helps crowdsource your problem to Amazon’s Mechanical Turk. Read his manifesto.

As an experiment, they hired Turkers to label a sample of news items about Barack Obama and Hillary Clinton as either positive or negative for each of the candidates. As it turns out, every news source was pro-Obama except ABC News, with Digg being the pro-est of the pro-Obama camp.

They then plotted changes in news sentiment alongside the price of Obama’s intrade contract:

News sentiment and intrade price for Obama vs Clinton Feb-March 2008

Visually, there appears to be a correlation, and news sentiment may actually be the leading indicator of the two. However, it would be great to see statistical confirmation, if that is even possible with such a small sample.

I sent Lukas some poll data and search buzz data that we’ve been collecting for the Yahoo! Election Dashboard. I’ll post an update if anything interesting results from lining up all four signals.

Gambling advertising legal silliness

Google AdSense ads on intrade.com

The absurdity of gambling laws in the US leads to such silliness as:

  • In 2007, Google, Microsoft, and Yahoo! paid millions in penalties for placing gambling ads, something they haven’t done since they were told to stop in 2004.
  • Yahoo! can quote prices from intrade, but can’t link to intrade.
  • Google can’t advertise for intrade/tradesports, but can place AdSense ads on intrade.com and tradesports.com. In other words, Google can’t sell eyeballs to gambling sites, but can sell eyeballs on gambling sites.

The right way to implement a multi-outcome prediction market: Linear programming

There are many examples of multi-outcome prediction markets, for example election markets with more than two candidates, or sports championship markets with dozens of teams.

What is the best way to implement a multi-outcome prediction market?

The simplest way is to effectively ignore the fact that there are multiple outcomes, breaking up the market into a bunch of separate binary markets, one for each outcome. Each outcome-market is an independent instrument with its own order flow and processing.

This seems to be the most common approach, taken by for example intrade, IEM, racetracks, and most financial exchanges. IMHO, it’s the wrong way, for three reasons.

  1. Splitting up a market can hurt liquidity. In a split market, there are effectively two ways to do everything (e.g., buy outcome 1 equals sell outcomes 2 through N), so traders may not see the best price for what they want to do, and orders may not fill at the best price available. There may even be orders that together constitute an agreeable trade, yet are stuck waiting in separate queues.
  2. A split market may also slow information propagation. Price changes in one outcome do not directly affect prices of other outcomes; it’s left to arbitrageurs to propagate logical implications.
  3. Finally, a naïve implementation of a split market may limit traders’ leverage, forcing them to set aside more money than necessary to complete a set of trades. For example, on IEM, short selling one share at $0.99 requires that you have $1 in your account, even though the most you could possibly lose in this transaction is $0.01. The reason is that to short sell on IEM you must first buy the bundle of all outcomes for $1, then sell off the outcome that you don’t want.
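The leverage point in item 3 is just arithmetic, sketched below in integer cents to avoid floating-point noise. The function names are hypothetical and the $1 payout per winning share is taken from the example above.

```python
PAYOUT_CENTS = 100  # each winning share pays $1.00


def worst_case_loss_short(price_cents, shares=1):
    """Most a short seller can lose: $1 is owed per share if the outcome
    occurs, and the sale price has already been received."""
    return shares * (PAYOUT_CENTS - price_cents)


def bundle_cash_required(shares=1):
    """Cash needed up front when shorting means first buying the full
    $1 bundle of all outcomes."""
    return shares * PAYOUT_CENTS


loss = worst_case_loss_short(99)
cash = bundle_cash_required()
print(f"worst-case loss: {loss} cent(s); cash tied up via the bundle: {cash} cents")
```

An exchange that margins on worst-case loss would ask for the first number, while the bundle route asks for the second.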

IEM has possibly the worst implementation, suffering from all three problems.

Intrade’s implementation is slightly better: they at least handle leverage correctly.

Newsfutures is smarter still.1 They generate phantom bids to reflect the redundant ways to place bets. For example, if there are bids for outcomes 2 through N that add up to $0.80, they place a phantom ask on outcome 1 for $0.20. A trader who accepts the ask, buying outcome 1 for $0.20, actually sells outcomes 2 through N behind the scenes, an entirely equivalent transaction. Chris Hibbert has a more elaborate methodology for eking out as much liquidity as possible using phantom bids, an approach he plans to implement in his Zocalo platform.
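Here is a rough sketch of the phantom-ask arithmetic described above. It is my own illustration, not Newsfutures’ actual code: prices are in integer cents, `phantom_asks` is a hypothetical name, and only the single best bid per outcome is considered.

```python
def phantom_asks(best_bids, payout=100):
    """best_bids maps outcome -> best bid in cents, or None if there is no bid.
    Returns a phantom ask for each outcome whose rivals all have bids:
    buying that outcome at the phantom ask is equivalent to selling
    every other outcome at its bid."""
    asks = {}
    for outcome in best_bids:
        others = [bid for name, bid in best_bids.items() if name != outcome]
        if any(bid is None for bid in others):
            continue  # cannot synthesize an ask without a bid on every rival
        asks[outcome] = payout - sum(others)
    return asks


# Bids on outcomes 2 through 4 add up to 80 cents, as in the example above.
bids = {"outcome 1": None, "outcome 2": 50, "outcome 3": 20, "outcome 4": 10}
print(phantom_asks(bids))
```

Only outcome 1 gets a phantom ask here, because the other outcomes would each need a bid on outcome 1 to complete the set.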

Yet phantom bids are a band-aid that cannot entirely heal a fractured market. Still missing is the ability to trade bundles of outcomes in a single transaction.

For example, consider the US National Basketball Association championship market, with 30 teams. A split market (possibly with phantom bids) works great for betting on individual teams one at a time, but is terribly cumbersome for betting on groups of teams: betting that a Western conference team will win requires 15 separate transactions. A common fix is to open yet another market for each popular bundle, but this limits choice and exacerbates all three problems above.

Bundling is especially useful with interval bets. For example, consider this bet on the peak price of gasoline through September 2008, broken up into intervals $3-$3.25, $3.25-$3.40, etc. In order to bet that gas prices will peak between, say, $3.40 and $4.30, you must buy all six outcomes spanning the interval, one at a time. (Moreover, you must sum the six outcome prices manually to compute a price quote.)

Fortunately, there is a trading engine that solves all three problems above and also allows bundle bets…

It’s linear programming!

Bossaerts et al. call it combined value trading. Baron & Lange, Lange & Economides and Peters et al. call it a parimutuel call market. Fortnow et al. and Chen et al. describe it in the context of combinatorial call markets.

Whatever you call it, the underlying principle is relatively straightforward, and it seems inherently the right way to implement a multi-outcome market. Yet I’ve rarely seen it done. The only example I know of is the now defunct economic derivatives markets run by Longitude, Goldman Sachs, and Deutsche Bank.

The setup of the linear program is as follows. Each order is associated with a decision variable x that ranges between 0 and 1, encoding the fraction of the order that the auctioneer can accept.2 There is one constraint per outcome that ensures that the auctioneer never loses money across all outcomes. The choice of objective function depends on the auctioneer’s goals, but something like maximizing the fill fraction makes sense.
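One way to write this down, under assumptions of my own choosing: there are m buy orders and n outcomes; order i asks for q_i shares of an outcome bundle S_i at a limit price of p_i per share; each share pays $1 if the realized outcome lies in S_i; accepted orders pay their limit price; and the objective is the total number of shares filled. Other objectives and pricing rules are possible, so treat this as a sketch rather than the formulation used in any of the papers cited above.

```latex
\begin{align*}
\max_{x}\quad & \sum_{i=1}^{m} q_i x_i \\
\text{s.t.}\quad & \sum_{i=1}^{m} q_i x_i \left( p_i - \mathbf{1}[\, j \in S_i \,] \right) \ge 0,
  \qquad j = 1, \dots, n \\
& 0 \le x_i \le 1, \qquad i = 1, \dots, m
\end{align*}
```

The j-th constraint says that if outcome j occurs, the money collected from accepted orders covers the payouts owed on every accepted order whose bundle contains j.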

Once the program is set up, the auctioneer solves for the x variables to determine which orders to accept in full (x=1), which to accept partially (0<x<1), and which to reject (x=0). The program can be solved either in batch mode, after waiting to collect a number of orders, or in continuous mode immediately as new orders arrive. Batch mode corresponds to a call market. Continuous mode corresponds to a continuous auction, a generalization of the continuous double auction mechanism of the stock market.

Each order consists of a price, a quantity, and an outcome bundle. Traders can just as easily bet on single outcomes, negations of outcomes, or sets of outcomes (e.g., all Western Conference NBA teams). Every order goes into the same pool of liquidity no matter how it is phrased.
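To illustrate how differently phrased orders land in one pool, here is a small sketch of the per-outcome no-loss check for a candidate set of fill fractions. It does not solve the linear program (a real engine would hand these constraints to an LP solver); it only evaluates them. The `Order` class, function names, four-team league, and prices are all hypothetical, and I assume buy orders that pay their limit price with a $1 payout per share.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    price: float        # limit price per share, in dollars
    quantity: float     # number of shares requested
    bundle: frozenset   # outcomes on which each share pays $1


def auctioneer_profit(orders, fills, outcomes):
    """Auctioneer's profit under each outcome when order i is filled
    at fraction fills[i] (0 = rejected, 1 = accepted in full)."""
    collected = sum(o.price * o.quantity * x for o, x in zip(orders, fills))
    return {
        j: collected - sum(o.quantity * x for o, x in zip(orders, fills) if j in o.bundle)
        for j in outcomes
    }


def is_safe(orders, fills, outcomes, tol=1e-9):
    """True if the auctioneer loses money under no outcome."""
    return all(v >= -tol for v in auctioneer_profit(orders, fills, outcomes).values())


# Hypothetical four-team league: W1 and W2 in the West, E1 and E2 in the East.
outcomes = {"W1", "W2", "E1", "E2"}
orders = [
    Order(0.55, 10, frozenset({"W1", "W2"})),  # bundle bet: any Western team
    Order(0.30, 10, frozenset({"E1"})),        # single-outcome bet
    Order(0.20, 10, frozenset({"E2"})),        # single-outcome bet
]

print(is_safe(orders, [1, 1, 1], outcomes))  # all three orders together
print(is_safe(orders, [1, 0, 0], outcomes))  # the bundle bet on its own
```

The bundle order on the West can be matched against two single-team orders on the East even though no two of them name the same instrument, which is exactly the liquidity a split market leaves stranded.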

Price quotes are queries to the linear program of the form “at what price p will this order be accepted in full?” (I believe that bounds on the dual variables of the LP can be interpreted as bid and ask price quotes.)

Lange & Economides and Peters et al. devise clever ways to make prices unique rather than bid-ask ranges, by injecting a small subsidy to seed the market at the outset.

Note that Hanson’s market scoring rules market maker also elegantly solves all the same problems as the LP formulation, including handling bundle bets. However, the market maker requires a patron to subsidize the market, while the LP auctioneer formulation is budget balanced — that is, can never lose money.
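For comparison, here is a minimal sketch of a market scoring rule market maker, assuming the logarithmic variant (LMSR), which is the most commonly described one. The function names and the liquidity parameter `b` are my own labels. It shows the two properties mentioned above: prices always sum to one, and a bundle bet is priced in a single transaction.

```python
import math


def lmsr_cost(quantities, b):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b)),
    computed with the max subtracted for numerical stability."""
    m = max(q / b for q in quantities)
    return b * (m + math.log(sum(math.exp(q / b - m) for q in quantities)))


def lmsr_prices(quantities, b):
    """Instantaneous prices; they are positive and sum to 1."""
    m = max(q / b for q in quantities)
    weights = [math.exp(q / b - m) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]


def bundle_cost(quantities, bundle, shares, b):
    """Amount a trader pays for `shares` of every outcome index in `bundle`."""
    after = [q + shares if i in bundle else q for i, q in enumerate(quantities)]
    return lmsr_cost(after, b) - lmsr_cost(quantities, b)


b = 10.0
q = [0.0, 0.0, 0.0]                     # three outcomes, no shares sold yet
print(lmsr_prices(q, b))                # uniform starting prices
print(bundle_cost(q, {0, 1}, 5.0, b))   # one trade covering outcomes 0 and 1
print(b * math.log(len(q)))             # the patron's worst-case subsidy
```

The last line is where the patron comes in: with this rule the market maker can lose up to b times the log of the number of outcomes, whereas the LP auctioneer’s constraints rule out any loss.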

Also note that I am not talking about a combinatorial-outcome market here. In this post, I am imagining that the number of outcomes is tractable — small enough so that we can explicitly list, store, and compute across all of the outcomes. A true combinatorial-outcome market, on the other hand, has an exponentially large number of outcomes making it impossible to even list them all explicitly, and forcing all calculations to operate on an implicit representation of outcomes, for example Boolean combinations of base events.

1 Apparently worked out in conjunction with Brian Galebach, a mathematician and Newsfutures fan extraordinaire who runs the prediction contest probabilitysports.com.
2 Alternatively, the variables can range between 0 and q, where q is the quantity of shares ordered.

Death in artificial intelligence

Until just reading about it in Wired, I knew little1 of the apparent suicide of Push Singh, a rising star in the field of artificial intelligence.

Singh seemed to have everything going for him: brilliant and driven, he became the protégé of his childhood hero Marvin Minsky, eventually earning a faculty position alongside him at MIT. Professionally, Singh earned praise from everyone from IEEE Intelligent Systems, who named Singh one of AI’s Ten to Watch (apparently revised), to Bill Gates, who asked Singh to keep him apprised of his latest publications. Singh’s social life seemed healthy and happy. The article struggles to uncover a hint of why Singh would take his own life, mentioning his excruciating chronic back pain (and linking it to a passage on the evolutionary explanation of pain as a “programming bug” in Minsky’s new book, a book partly inspired by Singh).

The article weaves Push’s story with the remarkable parallel life and death of Chris McKinstry, a man with similar lofty goals of solving general AI, and even a similar approach of eliciting common sense facts from the public. (McKinstry’s Mindpixel predated Singh’s OpenMind initiative.) McKinstry’s path was less socially revered, and he seemed on a never ending and aching quest for credibility. The article muses whether there might be some direct or indirect correlation between the eerily similar suicides of the two men, even down to their methods.

For me, the story felt especially poignant, as growing up I was nourished on nearly the same computer geek diet as Singh: Vic 20, Apple II, Star Trek, D&D, HAL 9000, etc. In Singh I saw a smarter and more determined version of myself. Like many, I dreamt of solved AI, and of solving AI, even at one point wondering if a neural network trained on yes/no questions might suffice, the framework proposed by McKinstry. My Ph.D. is in artificial intelligence, though like most AI researchers my work is far removed from the quest for general AI. Over the years, I’ve become at once disillusioned with the dream2 and, hypocritically, upset that so many in the field have abandoned the dream in pursuit of a fractured set of niche problems with questionable relevance to the whole.

Increasingly, researchers are calling for a return to the grand challenge of general AI. It’s sad that Singh, one of the few people with a legitimate shot at leading the way, is now gone.

Push Singh Memorial Fund

1 Apparently, details about Singh’s death have been slow to emerge, with MIT staying mostly quiet, for example not discussing the cause of death and taking down a memorial wiki built for Singh.
2 My colleague Fei Sha, a new father, put it nicely, saying he is “constantly amazed by the abilities of children to learn and adapt and is losing bit by bit his confidence in the romantic notion of artificial intelligence”.

The proverbial wisdom of crowds

I am fascinated by thingnaming.

In some ways there is no more straightforward way to certify your influence on the world than to count the number of times people use a word or phrase you invented.

On this count, James Surowiecki is a champion.1 His catch phrase the wisdom of crowds — a brilliant feat of thingnaming — has in four short years spread to over 2.1 million nooks and crannies around the web.2

In fact, BusinessWeek reporter Jennifer L. Schenker recently termed it the “proverbial wisdom of the crowd”. [Finding faces in the e-crowd, Businessweek, Dec 24, 2007, p.70]

At first I meant to poke fun at Schenker for attributing this adjective associated with adages of ancient origin to a four-year-old artifact.

However, digging further, I noticed that Schenker is right. Another use of the word proverbial is “having become an object of common mention or reference”, for example “your proverbial inability to get anywhere on time”.

Interestingly, a pun on Surowiecki’s phrase appears in the same issue of BusinessWeek. Stephen Baker’s long (yet remarkably content-free) piece on cloud computing is titled Google and the wisdom of clouds.

It’s amazing how crucial a good thingname can be to the success of a thing. Thanks James!

1 Of course, beyond thingnaming, Surowiecki wrote a fantastic book that helped catalyze an industry, among his other plentiful contributions and accomplishments.
2 For examples of unsuccessful thingnaming look here and here.

Search engine futures!

I am happy to report that on my suggestion intrade has listed futures contracts for 2008 search engine market share.

Here is how they work:

A contract will expire according to the percentage share of internet searches conducted in the United States in 2008. For example, if 53.5% of searches conducted in the United States in 2008 are made using Google then the contract listed for Google will expire at 53.5…

…Expiry will be based on the United States search share rankings published by Nielson Online.

I think this could be a fascinating market because:

  • Search engine market share is very important to these major companies, with dramatic effects on their share prices.
  • Search engine market share is fluid, so far with Google growing inexorably. However, Microsoft has cash, determination, Internet Explorer, and the willingness to experiment. Ask.com has erasers, 3D, ad budgets, and The Algorithm. Yahoo!, second in market share, often tests equal to or better than Google, and new features like Search Assist are impressive.
  • The media loves to write about it.
  • A major search company might use the market to hedge. Well, this seems far-fetched but you never know. Certainly, from an economic risk management standpoint it would seem to make a great deal of sense. (Here, as always on this blog, I speak on behalf of myself and not my company.)

Finally, I have to comment on how refreshingly easy the process was in working with intrade. They went from suggestion to implementation in a matter of days. It’s a shame that US-based companies are in contrast stuck in stultifying legal and regulatory mud.

Addendum 2008/01/26: Here are links to some market research reports:
Nielsen | ComScore | HitWise | Compete

(It seems that Nielsen Netratings homepage is down; getting 404 error at the moment)

Addendum 2008/03/07: If you prefer, you can now also bet on search share just for fun with virtual currency at play.intrade.com.

(Nielsen Netratings homepage is still down, now for over a month. It’s even more ridiculous given that their own Nielsen Online website points to this page.)

1 year is more than 1% of your life

“No duh,” you might say.

Or, “no it’s not,” you might say.

Still, I find it a powerful thought. One year seems like almost nothing — it can pass in a flash. I’ve procrastinated many projects and reunions well past one year without blinking. Yet 1% of a life seems monstrous. Thinking of a year in this way seems to put it in perspective.

Coming soon: 3.65 days is 1% of a year…
53 minutes is…