
Oddhead Blog

Musings of a computer scientist and yahoo1,2 about
prediction markets, gambling, and estimating the odds of everything

March 7th, 2010

Countdown to web sentience

In 2003, we wrote a paper titled 1 billion pages = 1 million dollars? Mining the web to play Who Wants to be a Millionaire?. We trained a computer to answer questions from the then-hit game show by querying Google. We combined words from the questions with words from each answer in mildly clever ways, picking the question-answer pair with the most search results. For the most part (see below), it worked.

It was a classic example of “big data, shallow reasoning” and a sign of the times. Call it Google’s Law. With enough data nothing fancy can be done, but more importantly nothing fancy need be done: even simple algorithms can look brilliant. When it comes to, say, identifying synonyms, simple pattern matching across an enormous corpus of sentences beats the most sophisticated language models developed meticulously over decades of research.

Our Millionaire player was great at answering obscure and specific questions: the high-dollar questions toward the end of the show that people find difficult. It failed mostly on the warm-up questions that people find easy: the truly trivial trivia. The reason is simple. Factual answers like the year that Mozart was born appear all over the web. Statements capturing common sense for the most part do not. Big data can only go so far.*

That was 2003.

In the paper, our clearest example of a question that we could not answer was How many legs does a fish have?. No one on the web would actually bother to write down the answer to that. Or would they?

I was recently explaining all this to a colleague. To make my point, we Googled that question. Lo and behold, there it was: asked and answered, verbatim, on Yahoo! Answers. How many legs does a fish have? Zero. Apparently Yahoo! Answers also knows the number of legs of a crayfish, rabbit, dog, starfish, mosquito, caterpillar, crab, mealworm, and “about 133,000” more.

Today, there are way more than 1 billion web pages: maybe closer to 1 trillion.

What’s the new lesson? Given enough time, everything will be on the web, including the fact that hungry poets blink (✓). Ok, not everything, but far more than anyone ever imagined.

It would be fun to try our Millionaire experiment again now that the web is bigger and search engines are smarter. Is there some kind of Moore’s Law for artificial intelligence as the web grows? Can sentience be far behind? :-)

__________
* Lance agreed, predicting that IBM’s quest to build a Jeopardy-playing computer would succeed but not tell us much.

March 5th, 2010

Predictalot! (And we mean alot)

I’m thrilled to announce the launch of Predictalot, a combinatorial prediction market for the NCAA Men’s Basketball playoffs. Predict almost anything you can think of, like Duke will advance further than UNC, or Every final four team name will start with U. Check the odds and invest points on your favorites. Sell your predictions anytime, even as you follow the basketball games live.

The basic game play is simple: select a prediction type, customize it, and invest points on it. Yet you’ll never run out of odds to explore: there are hundreds of millions of predictions you can make. The odds on each update continuously based on other players’ predictions and the on-court action.

Predictalot is a Yahoo! App, so you can play it at apps.yahoo.com or you can add it to your Yahoo! home page. I have to admit, it’s an incredible feeling to play a game I helped design right on the Yahoo! home page.

Predictalot app on the Yahoo! home page

That’s all you need to get started. If you’re curious and would like a peek under the hood, read on: there’s some interesting technology hidden in the engine.

Background and Details

Predictalot is a true combinatorial prediction market of the sort academics like us and Robin Hanson have been dreaming about since early in the decade. We built the first version during an internal Yahoo! Hack Day. Finally, we leveraged the Yahoo! Application Platform to quickly build a public version of the game. (Note that anyone can develop a YAP app that’s visible to millions — there’s good sample code, it supports YUI and OpenSocial, and it’s easy to get started.) After many fits and starts, late nights, and eventually all nights, we’re proud and excited to go live with Predictalot version 1.0. I can’t rave enough about the talent and dedication of the research engineers who gave the game a professional look and feel and production speed, turning a pie-in-the-sky idea into reality. We have many features and upgrades in mind for future versions, but the core functionality is in place and we hope you enjoy the game.

In the tournament, after the play-in game, the 64 top college basketball teams play 63 games in a single elimination tournament. So there are 2 to the power 63 or 9.2 quintillion total possible outcomes, or ways the entire tournament can unfold. Predictalot implicitly keeps track of the odds for them all. To put this in perspective, it’s estimated that there are about 10 quintillion individual insects on Earth. Of course, for all practical purposes, we can’t store 9.2 quintillion numbers, even with today’s computers. Instead, we compute the odds for any outcome on the fly by scanning through the predictions placed so far.
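As a quick sanity check on that arithmetic, here is a small Python sketch. The variable names are illustrative and are not taken from the Predictalot code.

```python
games = 63                # single-elimination games after the play-in
outcomes = 2 ** games     # each game has two possible winners

print(f"{outcomes:,}")    # 9,223,372,036,854,775,808
print(f"{outcomes:.1e}")  # 9.2e+18, i.e. about 9.2 quintillion
```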

A prediction is a statement, like Duke will win in the first round, that will be either true or false in the final outcome. In this case, the prediction is true in exactly half, or 2 to the power 62 outcomes. (Note this does not mean the odds are 50% — remember the outcomes themselves are not all equally likely.) In theory, Predictalot can support predictions on any set of outcomes. That’s 2 to the power 2 to the power 63, or more than a googol predictions. For now, we restrict you to “only” hundreds of millions of predictions categorized into thirteen types. Computing the odds of a prediction precisely is too slow. Technically, the problem is #P-hard: as hard as counting SAT and harder than the travelling salesman problem. So we must resort to approximating the odds by randomly sampling the outcome space. Sampling is a tricky business — equal parts art and science — and we’re still actively exploring ways to increase the speed, stability, and accuracy of our sampling.
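To make the sampling idea concrete, here is a minimal Python sketch. It is not Predictalot's actual sampler: in the real market the probability of each outcome comes from the predictions placed so far, whereas this toy assumes every game is decided by made-up team strengths. The names `simulate_bracket`, `estimate_odds`, and `strengths` are hypothetical.

```python
import random

def simulate_bracket(strengths, rng):
    """Sample one complete tournament outcome.

    Returns a list of rounds; each round is the list of team indices that
    won in that round. Assumes the team count is a power of two and that
    team a beats team b with probability s_a / (s_a + s_b).
    """
    alive = list(range(len(strengths)))
    rounds = []
    while len(alive) > 1:
        winners = []
        for a, b in zip(alive[0::2], alive[1::2]):
            p_a = strengths[a] / (strengths[a] + strengths[b])
            winners.append(a if rng.random() < p_a else b)
        rounds.append(winners)
        alive = winners
    return rounds

def estimate_odds(prediction, strengths, samples=5000, seed=0):
    """Estimate P(prediction is true) as the fraction of sampled outcomes
    in which it holds, instead of enumerating all 2**63 outcomes."""
    rng = random.Random(seed)
    hits = sum(bool(prediction(simulate_bracket(strengths, rng)))
               for _ in range(samples))
    return hits / samples

if __name__ == "__main__":
    strengths = [1.0] * 64  # assumption: all 64 teams equally strong
    print(estimate_odds(lambda r: 0 in r[0], strengths))     # team 0 wins round one
    print(estimate_odds(lambda r: r[-1] == [0], strengths))  # team 0 wins the title
```

Because both estimates are computed over complete sampled outcomes, a title for team 0 is always counted inside the samples where team 0 also won its first game, so the two estimates stay consistent with each other.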

Because we track all possible outcomes, the predictions are automatically interconnected in ways you would expect. A large play on Duke to win the tournament instantly and automatically increases the odds of Duke winning in the first round; after all, Duke can’t win the whole thing without getting past the first round.

With 9.2 quintillion outcomes, Predictalot is to our knowledge the largest prediction market built, testing the limits of what the wisdom of crowds can produce. Predictalot is a game, and we hope it’s fun to play. We’d also like to pave the way for serious use of combinatorial prediction market technology.

Why did Yahoo! build this? Predictalot is a smarter market, letting humans and computers each do what they do best. People enter predictions in simple terms they understand like how one team fares against another. The computer handles the massive yet methodical number crunching needed to combine all the pieces together into a coherent overall prediction of a complex event. Markets like Predictalot, WeatherBill, CombineNet, and Internet advertising systems, to name a few, represent the evolution of markets in the digital age, empowering users with extreme customization. More and more, matching buyers with sellers — the core function of markets — requires sophisticated algorithms, including machine learning and optimization. Predictalot attempts to illustrate this trend in an entertaining way.

David Pennock
Mani Abrol, Janet George, Tom Gulik, Mridul Muralidharan, Sudar Muthu, Navneet Nair, Abe Othman, Daniel Reeves, Pras Sarkar

March 3rd, 2010

Wanted: Bluetooth sethead

In a typical pairing of a cell phone and a bluetooth device, the “smart” phone drives the “dumb” bluetooth. The computational brains and user interface controls live inside the cell phone together with the antenna. The bluetooth device simply follows orders. For example, a bluetooth headset acts as an alternate microphone and speaker for the phone. The bluetooth truly is an accessory to the phone.

I’d like a reverse sort of bluetooth device. A bluetooth “sethead”, if you will. The cellular antenna lives inside the earpiece, or maybe stays inside your pocket or bag — technically this is the “phone” but it is a dumb device with no screen or interface. The “bluetooth” part is the thing you hold in your hand with all the smarts: the processor, the address book, the screen, the controls, the camera, the gps, another microphone and speaker — everything you normally expect in a phone except the antenna.

Why do I want this? If it existed, I could choose any carrier with any phone. I select a dumb phone from the best carrier and a smart sethead from the best hardware company. A version of an iPod touch with a camera, microphone, and gps would make an ideal sethead.

A MiFi device comes close: it’s a dumb cellular antenna that acts as a mobile wifi hotspot that can connect you to Skype, etc. (I have one from Verizon Wireless and love it.) But it’s not “always on”. MiFi + iPod is great for making calls but not for receiving calls, so it is not sufficient for replacing a cell phone.

Sure, the advent of setheads would speed the carriers’ transformation into “dumb pipes”, something they are resisting, but that is inevitable anyway.

October 21st, 2009

Notes from Yahoo! Open Hack Day NYC

Here are my notes from Yahoo! Open Hack Day NYC. For other perspectives read New York Times open sourcerer Nick Thuesen or the Yahoo! devel blog. You can watch videos of some of the talks or browse pictures.

First off, I cheated. I went to sleep in a hotel room rather than hack all through the night. (Even in college I woke up at 4am rather than pull an all nighter.) Still, I made decent progress on some pet projects including combinatorial betting. Daniel, Sharad, and Winter from Yahoo! Research New York participated for real, working through the night. Returning in the morning showered and caffeinated to greet the sleepwalkers was a little surreal. A number of ex-Yahoos joined the festivities including David Yang, Mor Naaman, and Chad Dickerson. (Havi joked that Yahoo! is like finishing school for entrepreneurs. If you count Yahoo! capture and releases like Mark Cuban and Paul Graham, the spreading influence is enormous.)

Clay Shirky kicked off the event. He’s a fantastic speaker — watch his talk here. His punch line — that successful communities like facebook, twitter, flickr, and wikipedia start small and cohesive (as opposed to large and fragmented: see Yahoo! 360) — was aimed perfectly at the many founders and foundreamers in the audience. There were speakers from Mint and foursquare and tutorials on the Yahoo! Application Platform, Yahoo! Query Language (the most popular service), Yahoo! TV widgets, and more. There was a round of Ignite NYC, a barrage of twenty-slides-in-five-minutes talks, some educational (geek’s guide to patents), some charitable (aid to South America), some hilarious (spaceman from outerspace), some thought provoking (makerbot 3d printers), and many all of the above (meta mechanical turk; the Emoji translation of Moby Dick). Watch the Ignite talks here.

A bunch of small touches made the event memorable, including a steampunk-themed hacking hall complete with retroRed Victorian couches, portraits of hackers through history, funky tweet-streaming sculptures, chalk drawings of old patents, power cords dangling from hanging bird cages, and a guitarhero-foosball corner. The food was tasty and at times eccentric, like the hot dog stand and toppings bar under a rainbow umbrella, ice cream cart, and old-fashioned popcorn machine. There was plenty of beer, coffee, red bull, sliders, and cookies, and even (gasp) vegan fare, salmon, and salad.

I give the event an A for style (decor, food) and content (talks, hacks, organization). The one sour note was the wireless — certainly a key ingredient for a good hack day — which began flaky and ended slow but acceptable.

I attended the YAP tutorial and created a rudimentary application. I was pleasantly surprised how simple the process was — the documentation and sample code are great. You can get the hello world app (complete with social hooks) running and add some ajax magic within minutes.

By far one of the coolest sights was the MakerBot Industries 3D printer in action. It sucks in plastic wire, melts it, and deposits it in perfect formation to produce coins, busts, parts for itself, or almost anything in the thingiverse. For Hack Day, the device printed news headlines in peanut butter on toast. We met an nyc resistor who was working on a conveyer belt mechanism for his own MakerBot printer, and he invited us to craft night at their shared hackspace in Brooklyn (a place that would be heaven for my dad and brother; Sharad, Jake, Daniel, and Bethany went to check it out).

I missed the tutorial on Yahoo! TV widgets but I’d like to learn more. They are now in most major TV brands including Sony, Samsung, and LG: millions of sets around the world in the coming months. (The Sony won editor’s choice in the Sept 2009 issue of Wired magazine; the Samsung and LG rated close behind. The sole TV reviewed without Yahoo! Widgets, a Panasonic, was ridiculed for its clunky Viera Cast online interface.) If you’re an internet video startup, like my friend, you need a widget channel. Personally, I’d love to see a sports game tracker that highlights pivotal moments by monitoring in-game betting odds.

Footnote: Two Yahoos made a humorous video (that’s both self-promotional and -deprecating) on what people in Times Square think ‘hacker’ means:

See Paul Tarjan and Christian Heilmann for real definitions.

July 16th, 2009

Psst: WeatherBill doesn’t know New Jersey is the new Florida: Place your bets now

Quantifying New York’s 2009 June gloom using WeatherBill and Wolfram|Alpha

In the northeastern United States, scars are slowly healing from a miserably rainy June — torturous, according to the New York Times. Status updates bemoaned “where’s the sun?”, “worst storm ever!”, “worst June ever!”. Torrential downpours came and went with Florida-like speed, turning gloom into doom: “here comes global warming”.

But how extreme was the month, really? Was our widespread misery justified quantitatively, or were we caught in our own self-indulgent Chris Harrisonism, “the most dramatic rose ceremony EVER!”?

This graphic shows that, as of June 20th, New York City was on track for near-record rainfall in inches. But that graphic, while pretty, is pretty static, and most people I heard complained about the number of days, not the volume of rain.

I wondered if I could use online tools to determine whether the number of rainy days in June was truly historic. My first thought was to try Wolfram|Alpha, a great excuse to play with the new math engine.

Wolfram|Alpha queries for “rain New Jersey June 200Y” are detailed and fascinating, showing temps, rain, cloud cover, humidity, and more, complete with graphs (hint: click “More”). But they don’t seem to directly answer how many days it rained at least some amount. The answer is displayed graphically but not numerically (the percentage and days of rain listed appear to be hours of rain divided by 24). Also, I didn’t see how to query multiple years at a time. So, in order to test whether 2009 was a record year, I would have to submit a separate query for each year (or bypass the web interface and use Mathematica directly). Still, Wolfram|Alpha does confirm that it rained 3.8 times as many hours in 2009 as 2008, already one of the wetter months on record.

WeatherBill, an endlessly configurable weather insurance service, more directly provided what I was looking for on one page. I asked for a price quote for a contract paying me $100 for every day it rains at least 0.1 inches in Newark, NJ during June 2010. It instantly spat back a price: $694.17.



WeatherBill rainy day contract for June 2010 in Newark, NJ

It also reported how much the contract would have paid — the number of rainy days times $100 — every year from 1979 to 2008, on average $620 for 6.2 days. It said I could “expect” (meaning one standard deviation, or 68% confidence interval) between 3.9 and 8.5 days of rain in a typical year. (The difference between the average and the price is further confirmation that WeatherBill charges a 10% premium.)

Below is a plot of June rainy days in Newark, NJ from 1979 to 2009. (WeatherBill doesn’t yet report June 2009 data so I entered 12 as a conservative estimate based on info from Weather Underground.)


Number of rainy days in Newark, NJ from 1979-2009

Indeed, our gloominess was justified: it rained in Newark more days in June 2009 than any other June dating back to 1979.

Intriguingly, our doominess may have been justified too. You don’t have to be a chartist to see an upward trend in rainy days over the past decade.

WeatherBill seems to assume as a baseline that past years are independent unbiased estimates of future years — usually not a bad assumption when it comes to weather. Still, if you believe the trend of increasing rain is real, either due to global warming or something else, WeatherBill offers a temptingly good bet. At $694.17, the contract (paying $100 per rainy day) would have earned a profit in 7 of the last 7 years. The chance of that streak being a coincidence is less than 1%.
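One way to arrive at a figure like that is sketched below. It assumes that a fairly priced contract turns a profit in any given year with probability at most one half, independently from year to year. That is an illustrative assumption, not something stated by WeatherBill, and the function name is hypothetical.

```python
def streak_probability(p_profit, years):
    """Chance of a profit in every one of `years` independent years."""
    return p_profit ** years

# Assumption: a coin-flip chance of profit each year under fair pricing.
chance = streak_probability(0.5, 7)
print(f"{chance:.2%}")  # 0.78%
```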

If anyone places this bet, let me know. I would love to, but as of now I’m roughly $10 million in net worth short of qualifying as a WeatherBill trader.

March 29th, 2009

Data-driven Dukie

“The No-Stats All-Star” is an entertaining, fascinating, and (warning) extremely long article by Michael Lewis in the New York Times Magazine on Shane Battier, a National Basketball Association player and Duke alumnus whose intellectual and data-driven play fits perfectly into the Houston Rockets’ new emphasis on statistical modeling.

For Battier, every action is a numbers game, an attempt to maximize the probability of a good outcome. Any single outcome, good or bad, cannot be judged in isolation, as much as human nature desires it. Actions and outcomes have to be evaluated in aggregate.

Michael Lewis is a fantastic writer. Battier is an impressive player and an impressive person. Houston is not the first and certainly not the last sports team to turn to data as the arbiter of truth. This approach is destined to spread throughout industry and life, mostly because it’s right. (Yes, even for choosing shades of blue.)

March 27th, 2009

Pricing the cloud, circa 1968

This article (membership required) is remarkable mostly for the fact that it was published in 1968. (Hat tip to Jonathan Smith.) It describes an experiment in creating an artificial economy to buy and sell computer time in the cloud, an idea that has been kicked around a number of times in the intervening decades but never quite took hold, until recently if you count literal pricing in dollars in EC2. The concept of buying time on your company’s compute cluster in a pseudo currency may come back into vogue as such installations become commonplace and over demanded.

Also check out the hand drawn figure and the advertisement at the end:


COBOL extensions to handle data bases

March 23rd, 2009

Remembering greasemonkey

As part of an internal hack day I’ve been diving back into greasemonkey, and remembering how much the monkey mentality changes the way you think about the web. Greasemonkey seems to have lost some mindshare momentum, probably due to a natural hype/fatigue cycle, the still minority share of Firefox browsers, and the very real “laziness barrier” that keeps the vast majority of people from installing new stuff.

In any case, rediscovering how easy it is to muck with any and every website, usually for fun, and sometimes to truly improve usability or productivity, brings back the giddy avalanche of ideas of ways to “reclaim the web”.

For example, it wouldn’t be terribly hard to add a bit of xmlhttpRequest to WebVocab to create a shortcut that, with one click, inserts a custom signature into any comment you leave on any web page, at the same time notifying your favorite social feed service (e.g., friendfeed, Facebook, Yahoo! updates) and/or your own server of the comment location and content. Your friends see where and what you’re commenting, and you get a searchable archive of all the breadcrumbs you leave around the web. It’s like a comment aggregator service that users control rather than publishers, and thus that works on any website, putting the user back into user-generated content.

March 13th, 2009

Challenge: Derive the Kelly criterion for play money

The Kelly criterion is a money management strategy for gamblers and investors. The strategy says that, when faced with a positive-expectation bet, you should invest a fraction of your budget that is proportional to your expected profit. The more you expect to gain, the more you should risk, but you never risk your entire budget.

The Kelly strategy is optimal in several senses: (1) it minimizes your “doubling time”, or the time it takes to go from having X dollars to having 2X dollars; (2) it minimizes the time it takes to achieve any given level of wealth; (3) it maximizes your long-run wealth.

(It turns out that the Kelly strategy is equivalent to maximizing a logarithmic utility function.)
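For a single bet that wins with probability p and pays b-to-1, the standard Kelly fraction is f = (bp - (1 - p)) / b, which is the expected profit per dollar divided by the odds. The Python sketch below computes it and checks numerically that it maximizes expected log wealth. The function names are illustrative, and the example bet (60% chance at even money) is made up.

```python
import math

def kelly_fraction(p, b):
    """Fraction of bankroll to stake on a bet won with probability p
    that pays b-to-1; zero when the bet has no positive expectation."""
    return max(0.0, (b * p - (1 - p)) / b)

def expected_log_growth(f, p, b):
    """Expected log of the wealth multiplier when staking fraction f."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)

p, b = 0.6, 1.0                      # hypothetical bet: 60% to win at even money
f_star = kelly_fraction(p, b)

# Brute-force check over stakes 0%, 1%, ..., 99% of the bankroll.
grid = [i / 100 for i in range(100)]
best = max(grid, key=lambda f: expected_log_growth(f, p, b))

print(round(f_star, 4), best)        # 0.2 0.2
```

Note that the log growth tends to minus infinity as f approaches 1, which is the formal version of "never risk your entire budget".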

A key reason the Kelly strategy is optimal is that it is very careful to never take you completely bankrupt: you spend only a fraction of your money, always reserving a bit for tomorrow, however small. This is sound advice when dealing with real money. (Aside: this all assumes you have a strict budget cap, which is not entirely realistic: you can almost always borrow at least some amount, even in today’s economy.)

But what about maximizing your virtual “wealth” inside a play-money game like NewsFutures, InklingMarkets, HubDub, or MediaPredict? The problem is not quite the same, precisely because you cannot really go bankrupt. Almost every game offers an option to “recharge” your account if you go bust. Even if the option is not explicit, you can always just abandon your account and start a new one with a fresh initial bankroll they typically give to new players.

So what is the Kelly criterion for play money? What is the optimal strategy that minimizes your doubling time when you’re always allowed to recharge back to a fixed starting value any time you go bankrupt? The answer is not obvious to me, so I’m crowdsourcing the problem: can readers derive the right rule?

My only conjecture is that it might become optimal to go “all in” on every single bet. But I’m not sure. [Update: I've convinced myself this is not optimal. Imagine two sequential bets, the first with minuscule expected profit and the second with huge expected profit: surely you should not go "all in" on the first.]

Note that finding the optimal solution may not just help you win more bragging rights in online games. There is a fascinating sports betting site called CentSports that gives everyone ten real cents to start with. If you can turn that ten cents into twenty dollars, they’ll cut you a check. Moreover, if you ever go to zero, they’ll restore you right back to ten cents. In other words, the system works just like play-money games except the potential for profit is real. So another way to phrase the challenge question is: what strategy in CentSports minimizes the time it takes you to go from ten cents to twenty dollars?
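This does not answer the challenge, but a simulation is one way to start exploring it. The Python sketch below rests on several simplifying assumptions that do not come from CentSports: every bet is at even money with the same win probability p, the player stakes a fixed fraction of current wealth, stakes are whole cents with a one-cent minimum, and going bust resets the bankroll to ten cents. All names are hypothetical.

```python
import random

def bets_to_target(p, fraction, start=10, target=2000, rng=None,
                   max_bets=1_000_000):
    """Count even-money bets needed to turn `start` cents into `target` cents.

    Each bet stakes round(fraction * wealth) cents, with a 1-cent minimum.
    Hitting zero recharges the bankroll to `start`.
    Returns None if the target is not reached within `max_bets` bets.
    """
    rng = rng or random.Random()
    wealth = start
    for n in range(1, max_bets + 1):
        stake = min(wealth, max(1, round(fraction * wealth)))
        if rng.random() < p:
            wealth += stake
        else:
            wealth -= stake
        if wealth >= target:
            return n
        if wealth == 0:
            wealth = start  # the "recharge" rule
    return None

def mean_bets(p, fraction, runs=200, seed=0):
    """Average bets-to-target over the runs that reached the target."""
    rng = random.Random(seed)
    results = [bets_to_target(p, fraction, rng=rng) for _ in range(runs)]
    finished = [n for n in results if n is not None]
    return sum(finished) / len(finished) if finished else None

if __name__ == "__main__":
    # Compare a few fixed fractions, including "all in" (1.0), at p = 0.6.
    for fraction in (0.1, 0.2, 0.5, 1.0):
        print(fraction, mean_bets(0.6, fraction))
```

A fixed fraction is only one family of strategies. The two-bet example in the update above suggests the best rule may need to depend on the quality of each bet, which this sketch holds constant.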

February 23rd, 2009

March is World Blogging Month (WoBloMo)

I’m planning to take the World Blogging Month (WoBloMo) challenge in March. Join me!

The goal is simple: blog at least every other day from March 1 to March 31. Post something — anything — on every odd day of the month and you win. Skip any day not divisible by 2 and you lose.

Many bloggers already write every day or nearly so. More power to them. For the rest of us, who blog infrequently and spend copious time arguing with our inner editors, ludicrous and artificial pretenses can be a good thing.

WoBloMo resembles the write-a-novel-in-a-month contest NaNoWriMo and other timed artistic challenges premised on the idea that quantity and quality can be friends. By suppressing the Spock-like perfectionist inside you, you can bring out your inner Kirk and “just do it”. Agonizing over details always has diminishing returns and sometimes, perversely, can make things worse. Or so the theory goes. You be the judge once (if) my WoBloMo fountain erupts.

Added 2009/02/26: Full disclosure.