The right way to implement a multi-outcome prediction market: Linear programming

There are many examples of multi-outcome prediction markets, for example election markets with more than two candidates, or sports championship markets with dozens of teams.

What is the best way to implement a multi-outcome prediction market?

The simplest way is to effectively ignore the fact that there are multiple outcomes, breaking up the market into a bunch of separate binary markets, one for each outcome. Each outcome-market is an independent instrument with its own order flow and processing.

This seems to be the most common approach, taken by for example intrade, IEM, racetracks, and most financial exchanges. IMHO, it’s the wrong way, for three reasons.

  1. Splitting up a market can hurt liquidity. In a split market, there are effectively two ways to do everything (e.g., buy outcome 1 equals sell outcomes 2 through N), so traders may not see the best price for what they want to do, and orders may not fill at the best price available. There may even be orders that together constitute an agreeable trade, yet are stuck waiting in separate queues.
  2. A split market may also slow information propagation. Price changes in one outcome do not directly affect prices of other outcomes; it’s left to arbitrageurs to propagate logical implications.
  3. Finally, a naïve implementation of a split market may limit traders’ leverage, forcing them set aside more money than necessary to complete a set of trades. For example, on IEM, short selling one share at $0.99 requires that you have $1 in your account, even though the most you could possibly lose in this transaction is $0.01. The reason is that to short sell on IEM you must first buy the bundle of all outcomes for $1, then sell off the outcome that you don’t want.

IEM has possibly the worst implementation, suffering from all three problems.

Intrade’s implementation is slightly better: they at least handle leverage correctly.

Newsfutures is smarter still.1 They generate phantom bids to reflect the redundant ways to place bets. For example, if there are bids for outcomes 2 through N that add up to $0.80, they place a phantom ask on outcome 1 for $0.20. A trader who accepts the ask, buying outcome 1 for $0.20, actually sells outcomes 2 through N behind the scenes, an entirely equivalent transaction. Chris Hibbert has a more elaborate methodology for eking out as much liquidity as possibly using phantom bids, an approach he has implemented plans to implement in his Zocalo platform.

Yet phantom bids are a band-aid that cannot entirely heal a fractured market. Still missing is the ability to trade bundles of outcomes in a single transaction.

For example, consider the US National Basketball Association championship market, with 30 teams. A split market (possibly with phantom bids) works great for betting on individual teams one at a time, but is terribly cumbersome for betting on groups of teams. For example, betting that a Western conference team will win requires 15 separate transactions. A common fix is to open yet another market in each popular bundle, however this limits choice and exacerbates all three problems above.

Bundling is especially useful with interval bets. For example, consider this bet on the peak price of gasoline through September 2008, broken up into intervals $3-$3.25, $3.25-$3.40, etc. In order to bet that gas prices will peak between, say, $3.40 and $4.30, you must buy all six outcomes spanning the interval, one at a time. (Moreover, you must sum the six outcome prices manually to compute a price quote.)

Fortunately, there is a trading engine that solves all three problems above and also allows bundle bets…

It’s linear programming!

Bossaerts et al. call it combined value trading. Baron & Lange, Lange & Economides and Peters et al. call it a parimutuel call market. Fortnow et al. and Chen et al. describe it in the context of combinatorial call markets.

Whatever you call it, the underlying principle is relatively straightforward, and it seems inherently the right way to implement a multi-outcome market. Yet I’ve rarely seen it done. The only example I know of is the now defunct economic derivatives markets run by Longitude, Goldman Sachs, and Deutsche Bank.

The set up of the linear program is as follows. Each order is associated with a decision variable x that ranges between 0 and 1, encoding the fraction of the order that the auctioneer can accept.2 There is one constraint per outcome that ensures that the auctioneer never loses money across all outcomes. The choice of objective function depends on the auctioneer’s goals, but something like maximizing the fill fraction makes sense.

Once the program is set up, the auctioneer solves for the x variables to determine which orders to accept in full (x=1), which to accept partially (0<x<1), and which to reject (x=0). The program can be solved either in batch mode, after waiting to collect a number of orders, or in continuous mode immediately as new orders arrive. Batch mode corresponds to a call market. Continuous mode corresponds to a continuous auction, a generalization of the continuous double auction mechanism of the stock market.

Each order consists of a price, a quantity, and an outcome bundle. Traders can just as easily bet on single outcomes, negations of outcomes, or sets of outcomes (e.g., all Western Conference NBA teams). Every order goes into the same pool of liquidity no matter how it is phrased.

Price quotes are queries to the linear program of the form “at what price p will this order be accepted in full?” (I believe that bounds on the dual variables of the LP can be interpreted as bid and ask price quotes.)

Lange & Economides and Peters et al. devise clever ways to make prices unique rather than bid-ask ranges, by injected a small subsidy to seed the market at the onset.

Note that Hanson’s market scoring rules market maker also elegantly solves all the same problems as the LP formulation, including handling bundle bets. However, the market maker requires a patron to subsidize the market, while the LP auctioneer formulation is budget balanced — that is, can never lose money.

Also note that I am not talking about a combinatorial-outcome market here. In this post, I am imagining that the number of outcomes is tractable — small enough so that we can explicitly list, store, and compute across all of the outcomes. A true combinatorial-outcome market, on the other hand, has an exponentially large number of outcomes making it impossible to even list them all explicitly, and forcing all calculations to operate on an implicit representation of outcomes, for example Boolean combinations of base events.

1Apparently worked out in conjunction with Brian Galebach, a mathematician and Newsfutures fan extraordinaire who runs the prediction contest probabilitysports.com.
2Alternatively, the variables can range between 0 and q, where q is the quantity of shares ordered.

9 thoughts on “The right way to implement a multi-outcome prediction market: Linear programming”

  1. Good summary article Dave. I can’t think of anything that needs to be added. I do feel compelled to correct you on the claim about Zocalo. I’ve described my approach in reasonable detail, but the code doesn’t exist yet. I’d really like to implement it, but my clients have had other priorities so far.

  2. There is one constraint per outcome that ensures that the auctioneer never loses money across all outcomes.

    Why not tell readers exactly what that constraint is?

  3. Cool. The LP formulation of the problem seems inherently pretty powerful.

    But this approach doesn’t strike me as an alternative to the phantom bid method, instead it seems like a more general/more powerful way to generate the phantom bids.

    Also, it seems to me that there is nothing inherent in bundling that makes it incompatible with, say, the existing phantom bid approach of Newsfuture. If Newsfuture wanted to enable bundling, they could define the necessary relationships (i.e., candidates 1…3 are in party A, candidates 4…9 are in party B) and then accept bids/asks for either a candidate to win or a party to win.

    But using the LP formulation seems inherently more flexible and likely to be more efficient at finding matches among a diverse set of bids.

  4. Thanks Mike. You’re exactly right: phantom bids could be much more general. In the extreme limit where every possible type of phantom bid is generated, I believe you’d have something very similar to if not exactly like linear programming.

  5. Thanks Robin. Good idea.

    I believe the constraint corresponding to outcome w is something like:

    Sum_O q_O*x_O(1_{w\in O} – p_O) <= 0 where O is an order; we sum over all O p_O & q_O are the price & quantity of order O x_O is the fraction of order O accepted 1_{w\in O} equal 1 if the outcome is part of order O, and equals 0 otherwise there is one such contraint for every outcome w

  6. So, by my count the equations seem to go something like this…

    O is your set of outstanding orders.
    W is your set of possible states.

    Each order, o, specifies a maximum quantity q(o), a price-per-unit p(o) and a proportion of each state wanted (A(o,w) for each state w in W).

    At the end of the day, x(o) (amount satisfied) is found for each order, and the order submitter receives A(o,w)*x(o) shares in state w, at a cost of p(o)*x(o).

    We add some “phantom” orders to round things off:

    The “creation” order will create or destroy shares, so assigning it the identifier C, with A(c,w) = 1 for all w, and p(o) = 1. x(c) is unconstrained.

    Some “leftovers” orders, L_w, will account for leftover shares for each w (say Alice buys share 1 for 50c, B buys share 2 for 50c, the share 3 onwards are leftover). In this case A(L_w, w’) = 1 if w=w’ and 0 otherwise, p(L_w) = 0, and x(L_w) >= 0.

    Then it’s just a question of balancing everything. So, set O’ to all the orders (O + {C} + {L_w…}), then…

    (a) All the money goes somewhere:

    Sum {o \in O’} x(o) * C(o) = 0

    (b) All the shares are accounted for:

    Forall {w \in W}
    Sum {o \in O’} x(o) * A(o,w) = 0

    For completeness there’s also:

    Forall {o in O} 0 long time since I did inequalities though so maybe I’m just confused.

  7. Ooops. HTML and less-than isn’t friendly, is it…

    So, by my count the equations seem to go something like this…

    O is your set of outstanding orders.
    W is your set of possible states.

    Each order, o, specifies a maximum quantity q(o), a price-per-unit p(o) and a proportion of each state wanted (A(o,w) for each state w in W).

    At the end of the day, x(o) (amount satisfied) is found for each order, and the order submitter receives A(o,w)*x(o) shares in state w, at a cost of p(o)*x(o).

    We add some “phantom” orders to round things off:

    The “creation” order will create or destroy shares, so assigning it the identifier C, with A(c,w) = 1 for all w, and p(o) = 1. x(c) is unconstrained.

    Some “leftovers” orders, L_w, will account for leftover shares for each w (say Alice buys share 1 for 50c, B buys share 2 for 50c, the share 3 onwards are leftover). In this case A(L_w, w’) = 1 if w=w’ and 0 otherwise, p(L_w) = 0, and x(L_w) \gteq 0.

    Then it’s just a question of balancing everything. So, set O’ to all the orders (O + {C} + {L_w…}), then…

    (a) All the money goes somewhere:

    Sum {o \in O’} x(o) * C(o) = 0

    (b) All the shares are accounted for:

    Forall {w \in W}
    Sum {o \in O’} x(o) * A(o,w) = 0

    For completeness there’s also:

    Forall {o in O} 0 \lteq x(o) \lteq q(o)
    Forall {o in O} x(o)*p(o) \lteq BALANCE(o)

    Sum {w \in W} p(L_w) represents an increment of profit for the marketplace.

    x(C)*p(C) represents how much (less) the marketplace is holding in trust until the market can be settled as a result of these trades.

    I’m not following quite how that all works though — shouldn’t this be just as hard as the (multiple?) knapsack problem, ie Hard?

    But aren’t the inequalities defining a continuous solution space that includes the origin, and thus, in some sense Easy? It’s been a long time since I did inequalities though so maybe I’m just confused.

  8. Ah, of course; the linear programming bit is easy if you’re only solving it in the reals and hard if you’re limiting yourself to integers.

Comments are closed.