
Third Annual New York Computer Science and Economics Day

Join us at NYCE Day 2010 on Friday October 15 at the lovely New York Academy of Sciences, a gathering of “researchers in the larger New York metropolitan area with interests in Computer Science, Economics, Marketing and Business and a common focus in understanding and developing the economics of internet activity.”

If you’d like to speak in the rump session, submit your topic by Monday September 13: details on the meeting webpage. The rump session is a series of five-minute talks by a variety of speakers, including students, and is often one of the most interesting parts of the program.

Three Crowd-ed events this fall

Research and Analysis of Tail Phenomena Symposium

August 20, 2010, Sunnyvale, CA

The last decade has witnessed the emergence of enormous scale artifacts resulting from the independent action of hundreds of millions of people; for example, web repositories, social networks, mobile communication patterns, and consumption in “limitless” stores… the first Research and Analysis of Tail phenomena Symposium (RATS)… will explore the different computational, statistical, and modeling problems related to tail phenomena… We are particularly encouraging summer interns in any of the Bay Area research centers to join us in the event.
We will start with a video welcome by Chris Anderson (Wired), followed by a series of invited talks by Michael Mitzenmacher (Harvard), Aaron Clauset (Univ. of Colorado), Neel Sundaresan (eBay), Sharad Goel (Yahoo! Research, NY) and Michael Schwarz (Yahoo! Research, CA).

We invite proposals for short (20 minute) talks from students and researchers working in the area.

CrowdConf 2010: 1st Annual Conference on the Future of Distributed Work

October 4, 2010, San Francisco, CA

Were you crowdsourcing before it was cool? We want to hear about your projects.

We are inviting submissions on all topics regarding crowdsourcing, including:

  • Past, present, and future of crowdsourcing
  • Quality assurance and metrics
  • Social and economic implications of crowdsourcing
  • Task design/Worker incentives
  • Innovative projects, experiments, and applications
  • Submission Guidelines

Deadline: Sept. 1

CrowdConf will bring together researchers, technologists, outsourcing entrepreneurs, legal scholars, and artists for the first time to discuss how crowdsourcing is transforming human computation and the future of work.

Confirmed Speakers:
Sharon Chiarella: Vice President, Amazon Mechanical Turk
Tim Ferriss: Author, The 4-Hour Workweek
David Alan Grier: Author, When Computers Were Human
Barney Pell: Partner, Search Strategist, and Evangelist, Microsoft
Maynard Webb: CEO, LiveOps
Jonathan Zittrain: Professor of Law and Computer Science, Harvard

Computational Social Science and the Wisdom of Crowds Workshop at NIPS 2010

December 10th or 11th, 2010, Whistler, Canada

We welcome contributions on theoretical models, empirical work, and everything in between, including but not limited to:

  • Automatic aggregation of opinions or knowledge
  • Prediction markets / information markets
  • Incentives in social computation (e.g., games with a purpose)
  • Studies of events and trends (e.g., in politics)
  • Analysis of and experiments on distributed collaboration and consensus-building, including crowdsourcing (e.g., Mechanical Turk) and peer-production systems (e.g., Wikipedia and Yahoo! Answers)
  • Group dynamics and decision-making
  • Modeling network interaction content (e.g., text analysis of blog posts, tweets, emails, chats, etc.)
  • Social networks

[Covers] computational social science… [and] social computing… with an emphasis on the role of machine learning…

Deadline for submissions: Friday October 8, 2010

How high can high-level programming go?

Our first prototype of Predictalot was written mainly in Mathematica with a rudimentary web front end that Dan Reeves put together (with editable source code embedded right on the page via etherpad!). It proved the concept but was ugly and horribly slow.

Screenshot of pre-alpha Predictalot: Mathematica + etherpad + web

Dan and I built a second prototype in PHP. It was even uglier but about twice as fast and somewhat usable on a small scale (at least by users willing and able to formulate their own propositions in PHP). Yet it still wasn’t good enough to serve thousands of users accustomed to simplicity and speed.

Screenshot of alpha Predictalot: PHP + YAP

The final live version of Predictalot was not only pleasing to the eye (thanks to Sudar, Navneet, and Tom) but pleasingly fast, due almost entirely to the heroic efforts of Mridul M, who wrote a mini PHP parser inside of Java and baked in a number of database and caching optimizations.

Screenshot of live beta Predictalot: Java + Javascript + YAP

It seems that high-level programming languages haven’t climbed high enough. To field a fairly constrained web app that looks good and works well, we benefit greatly from having at least three specialists, for the app front end, the app back end, and the platform back end (apache, security, etc.).

Here’s a challenge to the programming language community: anything I can whip up in Mathematica I should be able to run at web scale. Math majors should be able to create Predictalot. Dan and I can mock up the basic idea of Predictalot, but it still takes tremendous talent, time, and effort to turn it into a professional-looking and well-behaved system.

The core market math of Predictalot, a combinatorial version of Hanson’s LMSR market maker, involves summing thousands of e^x terms. Here we are in the second decade of the new millennium and in order for a sum of exponentials to execute quickly and without numeric overflow, we had to work out a transformation to conduct all our summations in log space. In other words, programming still requires me to think about how my machine represents my numbers. That shouldn’t qualify as “high level” thinking in 2010.
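To make the log-space trick concrete, here is a minimal sketch. It assumes the standard LMSR cost function C(q) = b log(sum_i exp(q_i / b)); the function names and the sample numbers are illustrative and are not taken from the Predictalot code base.

```python
import math


def lmsr_cost(quantities, b):
    """LMSR cost C(q) = b * log(sum_i exp(q_i / b)), evaluated in log space."""
    scaled = [q / b for q in quantities]
    m = max(scaled)  # factor out the largest exponent so every exp() argument is <= 0
    return b * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def lmsr_prices(quantities, b):
    """Prices p_i = exp(q_i / b) / sum_j exp(q_j / b), using the same shift."""
    scaled = [q / b for q in quantities]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical share counts. The naive formula needs math.exp(100000 / 10),
# which raises OverflowError in double precision.
q = [100000.0, 99990.0, 0.0]
cost = lmsr_cost(q, b=10.0)
prices = lmsr_prices(q, b=10.0)
```

The shift by the maximum is the usual log-sum-exp identity: it changes nothing mathematically, but it keeps every intermediate value between 0 and 1, so the sum can underflow harmlessly and never overflows.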

I realize I may be naively asking too much. Solving the challenge fully is AI-complete. Still, while we’re making impressive strides in artificial intelligence, programming feels much the same today as it did twenty years ago. It still requires learning specialized tricks, arcane domain knowledge, and optimizations honed only over years of experience, and the most computationally intensive applications still require that extra compilation step (i.e., it’s still often necessary to use C or Java over PHP, Perl, Python, or Ruby).

Some developments hardly seem like progress. Straightforward HTML markup like border=2 has given way to unwieldy CSS like style="border:2px solid black". In some ways the need for specialized domain knowledge has gone up, not down.

Visual programming is an oft-tried, though so far largely unsuccessful way to lower the barrier to programming. Pipes was a great effort, but YQL proved more useful and popular. Google just announced new visual developer tools for Android in an attempt to bring mobile app creation to the masses. Content management systems are getting better and broader every day, allowing more and more complex websites to be built with less time touching source code.

I look forward to the day that computational thinking can suffice to create the majority of computational objects. I suspect that day is still fifteen to twenty years away.

Why automated market makers?

Why do prediction markets need automated market makers?

Here’s an illustration why. Abe Othman recently alerted me to intrade’s market on where basketball free agent LeBron James will sign, at the time a featured market. Take a look at this screenshot taken 2010/07/07:

Wide bid-ask spread for Lebron James contract on intrade -- needs a market maker 2010-07-07

The market says there’s between a 42 and 70% chance James will sign with Cleveland, between a 5 and 40% chance he’ll sign with Chicago, etc.

In other words, it doesn’t say much. The spreads between the best bid and ask prices are wide and so its predictions are not terribly useful. We can occasionally tighten these ranges by being smarter about handling multiple outcomes, but in the end low liquidity takes the prediction out of markets.

Even if someone does have information, they may not be able to trade on it and so may simply go away. (Actually, the problem goes beyond apathy. Placing a limit order is a risk, since whoever accepts it will have a time advantage, and it reveals information. If there is little chance the order will be accepted, the costs may outweigh any potential gain.)

Enter automated market makers. An automated market maker always stands ready to buy and sell every outcome at some price, adjusting along the way to bound its risk. The market maker injects liquidity, reducing the bid-ask spread and pinpointing the market’s prediction to a single number, say 61%, or at least a tight range, say 60-63%. From an information acquisition point of view, precision is important. For traders, the ability to trade any contract at any time is satisfying and self-reinforcing.

For combinatorial prediction markets like Predictalot with trillions or more outcomes, I simply can’t imagine them working at all without a market maker.

Abe Othman, Dan Reeves, Tuomas Sandholm, and I published a paper in EC 2010 on a new automated market maker algorithm. It’s a variation on Robin Hanson’s popular market maker called the logarithmic market scoring rule (LMSR) market maker.

Almost anyone who implements LMSR, especially for play money, wonders how to set the liquidity parameter b. I’ve been asked this at least a dozen times. The answer is I don’t know. It’s more art than science. If b is too large, prices will hardly move. If b is too small, prices will bounce around wildly. Moreover, no matter what b is set to, prices will be exactly as responsive to the first dollar as the million and first dollar, counter to intuition.
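A small sketch shows both effects. It assumes the standard two-outcome LMSR price formula p_i = exp(q_i / b) / sum_j exp(q_j / b); the share counts and the three values of b are made up for illustration.

```python
import math


def lmsr_prices(quantities, b):
    """LMSR prices, shifted by the max exponent to avoid overflow."""
    scaled = [q / b for q in quantities]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def price_after_buy(quantities, b, outcome, shares):
    """Price of `outcome` after someone buys `shares` of it."""
    updated = list(quantities)
    updated[outcome] += shares
    return lmsr_prices(updated, b)[outcome]


# The same 10-share purchase in a fresh market, under three liquidity settings.
for b in (1.0, 10.0, 1000.0):
    p = price_after_buy([0.0, 0.0], b, outcome=0, shares=10.0)
    print(f"b = {b:>6}: price moves from 0.500 to {p:.3f}")

# LMSR prices depend only on differences between share counts, so the same
# purchase moves the price identically after a million shares of each outcome.
fresh = price_after_buy([0.0, 0.0], 10.0, 0, 10.0)
busy = price_after_buy([1e6, 1e6], 10.0, 0, 10.0)
```

With b = 1 the purchase pushes the price almost to 1, with b = 1000 it barely registers, and `fresh` equals `busy` no matter how much has already been traded.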

Our market maker automatically adjusts its level of liquidity depending on trading volume. Prices start off very responsive and, as volume increases, liquidity grows, obviating the need to somehow guess the “right” level before trading even starts.

A side effect is that predictions take the form of ranges, like 60-63%, rather than exact point estimates. We prove that this is a necessary trade off. Any market maker that is path independent and sensitive to liquidity must give up on providing point estimates. In a way, our market maker works more like real bookies who maintain a vig or spread for every outcome.
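Here is a rough sketch of the idea, not the algorithm as published. It assumes liquidity scales linearly with the total shares outstanding, b(q) = alpha * sum(q), plugged into the LMSR cost function; alpha = 0.05, the share counts, and the finite-difference prices are all illustrative choices, and the paper has the precise construction and proofs.

```python
import math

ALPHA = 0.05  # hypothetical setting; larger values mean wider spreads


def cost(q, alpha=ALPHA):
    """LMSR-style cost with volume-dependent liquidity b(q) = alpha * sum(q).

    Assumes at least one share is outstanding, so that b(q) > 0.
    """
    b = alpha * sum(q)
    scaled = [x / b for x in q]
    m = max(scaled)
    return b * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def marginal_prices(q, alpha=ALPHA, eps=1e-4):
    """Approximate each price dC/dq_i with a central finite difference."""
    prices = []
    for i in range(len(q)):
        up = list(q)
        up[i] += eps
        down = list(q)
        down[i] -= eps
        prices.append((cost(up, alpha) - cost(down, alpha)) / (2 * eps))
    return prices


def price_move(q, shares, alpha=ALPHA):
    """How far the price of outcome 0 moves when `shares` of it are bought."""
    before = marginal_prices(q, alpha)[0]
    after_q = list(q)
    after_q[0] += shares
    return marginal_prices(after_q, alpha)[0] - before


thin = [10.0, 10.0]         # little trading so far
thick = [10000.0, 10000.0]  # heavy trading so far

spread_sum = sum(marginal_prices(thin))  # exceeds 1: a built-in spread
move_thin = price_move(thin, 5.0)        # large move in a thin market
move_thick = price_move(thick, 5.0)      # much smaller move in a thick market
```

Because the prices sum to slightly more than 1, each outcome gets a range rather than a point: in a two-outcome market the price of one outcome is the upper end, and 1 minus the price of the other outcome is the lower end.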

The market maker algorithm is theoretically elegant and seems more practical than LMSR in many ways. However, I’ve learned many times that nothing can replace implementing and testing a theory with real traders. The final word awaits such a trial. Stay tuned.

It’s official: More people are playing Predictalot than Mafia Wars

It’s true.

More people are playing Predictalot today than Mafia Wars or Zynga Poker… On Yahoo!, that is.

In fact, Predictalot is the #1 game app on Yahoo! Apps by daily count. By monthly count, we are 5th and rising.

A prediction is being made about every three minutes.

Come join the fun.

predictalot most popular game app on yahoo 2010-06-12

Predictalot for World Cup: Millions of predictions, stock market action

I just left the 2010 ACM Conference on Electronic Commerce, where six (!) out of 45 papers were about prediction markets.

Yahoo! Labs’ own Predictalot market is now live and waiting for you to place almost any prediction your heart desires about the World Cup in South Africa.

Here are some terribly useful things you can learn this time around. All numbers are subject to change, and that’s kind of the point:

  • There’s a 37% chance Brazil and Spain will both make it to the final game; there’s only a 15% chance that neither of them will make it
  • There’s a 1 in 25 chance Portugal will win the cup; 1 in 50 for Argentina
  • 42.92% chance that a country that has never won before will win
  • 19.07% chance that Australia will advance further than England
  • 65.71% chance that Denmark, Italy, Mexico and United States all will not advance to Semifinals
  • Follow Predictalot on twitter for more

If you think these odds are wrong, place your virtual wager and earn some intangible bragging rights. You can sell your prediction any time for points, even in the middle of a match, just like the stock market.

There are millions of predictions available, yet I really believe ours is the simplest prediction market interface to date. (Do you disagree, Leslie?) We have an excellent conversion rate, or percent of people who visit the site who go on to place at least one prediction — for March Madness, that rate was about 1 in 5. One of our main goals was to hide the underlying complexity and make the app fast, easy, and fun to use. I personally am thrilled with the result, but please go judge for yourself and tell us what you think.

In the first version of Predictalot, people went well beyond picking the obvious like who will win. For example, they created almost 4,000 “three-dimensional” predictions that compared one team against two others, like “Butler will advance further than Kentucky and Purdue”.

If you’re not sure what to predict, you can now check out the streaming updates of what other people are predicting in your social circle and around the world:

Predictalot recent activity screenshot 2010-06-11 18:45

Also new this time, you can join a group and challenge your friends. You can track how you stack up in each of your groups and across the globe. We now provide live match updates right within the app for your convenience.

If you have the Yahoo! Toolbar (if not, try the World Cup toolbar), you can play Predictalot directly from the toolbar without leaving the webpage you’re on, even if it’s Google. 😉

playing predictalot from the yahoo! toolbar

Bringing Predictalot to life has been a truly interdisciplinary effort. On our team we have computer scientists and economists to work out the market math, and engineers to turn those equations into something real that is fast and easy to use. Predictalot is built on the Yahoo! Application Platform, an invaluable service (open to any developer) that makes it easy to make engaging and social apps for a huge audience with built-in distribution. And we owe a great deal to promotion from well-established Yahoo! properties like Fantasy Sports and Games.

We’re excited about this second iteration of Predictalot and hope you join us as the matches continue in South Africa. We invite everyone to join, though please do keep in mind that the game is in beta, or experimental, mode. (If you prefer a more polished experience, check out the official Yahoo! Fantasy Sports World Soccer game.) We hope it’s both fun to play and helps us learn something scientifically interesting.

Read more here, here, and here.

Or watch a screencast of how to play:

CS ∩ Econ news

Here are some news items about the field with no name (at least not yet, see below) that lies at the intersection of computer science and economics.

  1. The Sixth Workshop on Ad Auctions is soliciting papers. The workshop will be held June 8, 2010, in Cambridge, MA, in conjunction with the ACM Conference on Electronic Commerce (EC’10). There is a terrific organizing committee this year spanning industry and academia, CS and business schools.
  2. The EC’10 list of accepted papers is out and looks great.
  3. The first-ever Behavioral and Quantitative Game Theory Conference on Future Directions will be held May 14-16 in Newport Beach, CA. The program looks fantastic.
  4. Last fall, the University of Pennsylvania announced the first-ever undergraduate degree program in Market and Social Systems Engineering. Kudos to UPenn: the move shows impressive vision and leadership.
  5. The NSF is funding research in the CS-Econ area. They support efforts to “explore the emerging interface between computer science and economics, including algorithmic game theory, automated mechanism design, computational tractability of basic economic problems, and the role of information, trust, and reputation in markets” (page 7).
  6. The NBER Market Design working group is soliciting papers for a workshop October 8-9, 2010 in Cambridge, MA.
  7. We are now reviewing some amazing submissions to Yahoo!’s 2010 Key Scientific Challenges program. Read the challenges for the area we call Algorithmic Economics.
  8. Members of Yahoo! Labs can submit proposals to fund collaborative research with academic colleagues through the Yahoo! Faculty Research and Engagement program. If you’re interested, contact a Yahoo! Labs employee.

What should be the name? CS ∩ Econ is accurate but cryptic. At Yahoo!, we call it Algorithmic Economics. At Google, they call it Market Algorithms. The ACM Special Interest Group in this area calls it Electronic Commerce, causing complaints every year. I’ve heard people suggest Economics and Computation. The name Algorithmic Game Theory has emerged as something of a standard within the CS theory community. [Update: Noam suggests Algorithmic Game Theory and Economics and even renamed his blog accordingly.] The phrase Computational Economics makes sense but is already in use by a different field. A fun suggestion is Economatics (or Autonomics), meant to invoke a mashup of economics and automation.

Prediction markets had a similar naming/identity crisis. They’ve been called information markets, idea markets, securities markets, event markets, binary options, markets in uncertainty, and more. But now almost everyone has settled on prediction markets. I’ve come to like the name and I think it’s helped establish the field in its own right. I hope we can settle on a good name for CS ∩ Econ in part so we can create the Journal of PerfectNameForCSEcon, an outlet sorely missing from the field.

Update 2011/10/11: The journal now exists! Called the ACM Transactions on Economics and Computation, it circumvented the naming issue.

Let the madness begin

Sixty-five men’s college basketball teams have been selected. Tomorrow there will be sixty-four. Half of the remaining teams will be eliminated twice every weekend for the next three weekends until only one team remains.

On April 5th, we will know who is champion. In the meantime, it’s anybody’s guess: any of 9.2 quintillion things could in principle happen.

At Predictalot it’s your guess. Make almost any prediction you can think of, like Duke will go further than both Kansas and Kentucky, or the Atlantic Coast will lose more games than the Big East. There’s even the alphabet challenge: you pick six letters that include among them the first letters of all four final-four teams.

Following Selection Sunday yesterday, the full range of prediction types is now enabled in Predictalot, encompassing hundreds of millions of predictions about your favorite teams, conferences, and regions. Check it out. Place a prediction or just lurk to see whether the crowd thinks St. Mary’s is this year’s Cinderella.

Come join our mad science experiment where crowd wisdom meets basketball madness. We’ve had many ups and downs already (for example, sampling is way trickier than I naively assumed initially) and I’m sure there is more to come, but that’s part of what makes building things based on unsolved scientific questions fun. Read more about the technical details in my previous posts and on the Yahoo! Research website.

And for the best general-audience description of the game, see the Yahoo! corporate blog.

Update: Read about us on the New York Times and VentureBeat.

You can even get your fix on Safari on iPhone!

Dave playing Predictalot on iPhone

Below is a graph of our exponential user growth over the last couple days. Come join the stampede!

graph of YAP installs for Predictalot

Computer science = STEAM

At a recent meeting of the Association for Computing Machinery, the main computer science association, the CEO of ACM John White reported on efforts to increase the visibility and understanding of computer science as a discipline. He asked “Where is the C in STEM?” (STEM stands for Science, Technology, Engineering, and Math, and there are many policy efforts to promote teaching and learning in these areas.) He argued that computer science is not just the “T” in “STEM”, as many might assume. Computer science deserves attention of its own from policy makers, teachers, and students.

I agree, but if computer science is not the “T”, then what is it? It’s funny. Computer science seems to span all the letters of STEM. It’s part science, part technology, part engineering, and part math. (Ironically, even though it’s called computer science, the “S” may be the least defensible.*)

The interdisciplinary nature of computer science can be seen throughout the university system: no one knows quite where CS departments belong. At some universities they are part of engineering schools, at others they belong to schools of arts and sciences, and at still others they have moved from one school to another. That’s not to mention the information schools and business schools with heavy computer science focus. At some universities, computer science is its own school with its own Dean. (This may be the best solution.)

Actually, I’d go one step further and say that computer science also involves a good deal of “A”, or art, as Paul Graham popularized in his wonderful book Hackers and Painters, and as seen most clearly in places like the MIT Media Lab and the NYU Interactive Telecommunications Program.

So where is the C in STEM? Everywhere. Plus A. Computer science = STEAM.**

__________
* It seems that those fields that feel compelled to append the word “science” to their names (social science, political science, library science) are not particularly scientific.
** Thanks to Lance Fortnow for contributing ideas for this post, including the acronym STEAM.

Predictalot! (And we mean alot)

I’m thrilled to announce the launch of Predictalot, a combinatorial prediction market for the NCAA Men’s Basketball playoffs. Predict almost anything you can think of, like Duke will advance further than UNC, or Every final four team name will start with U. Check the odds and invest points on your favorites. Sell your predictions anytime, even as you follow the basketball games live.

The basic game play is simple: select a prediction type, customize it, and invest points on it. Yet you’ll never run out of odds to explore: there are hundreds of millions of predictions you can make. The odds on each prediction update continuously based on other players’ predictions and the on-court action.

Predictalot is a Yahoo! App, so you can play it at apps.yahoo.com or you can add it to your Yahoo! home page. I have to admit, it’s an incredible feeling to play a game I helped design right on the Yahoo! home page.

Predictalot app on the Yahoo! home page

That’s all you need to get started. If you’re curious and would like a peek under the hood, read on: there’s some interesting technology hidden in the engine.

Background and Details

Predictalot is a true combinatorial prediction market of the sort academics like us and Robin Hanson have been dreaming about since early in the decade. We built the first version during an internal Yahoo! Hack Day. Finally, we leveraged the Yahoo! Application Platform to quickly build a public version of the game. (Note that anyone can develop a YAP app that’s visible to millions — there’s good sample code, it supports YUI and OpenSocial, and it’s easy to get started.) After many fits and starts, late nights, and eventually all nights, we’re proud and excited to go live with Predictalot version 1.0. I can’t rave enough about the talent and dedication of the research engineers who gave the game a professional look and feel and production speed, turning a pie-in-the-sky idea into reality. We have many features and upgrades in mind for future versions, but the core functionality is in place and we hope you enjoy the game.

In the tournament, after the play-in game, the 64 top college basketball teams play 63 games in a single elimination tournament. So there are 2 to the power 63 or 9.2 quintillion total possible outcomes, or ways the entire tournament can unfold. Predictalot implicitly keeps track of the odds for them all. To put this in perspective, it’s estimated that there are about 10 quintillion individual insects on Earth. Of course, for all practical purposes, we can’t store 9.2 quintillion numbers, even with today’s computers. Instead, we compute the odds for any outcome on the fly by scanning through the predictions placed so far.

A prediction is a statement, like Duke will win in the first round, that will be either true or false in the final outcome. In this case, the prediction is true in exactly half, or 2 to the power 62 outcomes. (Note this does not mean the odds are 50% — remember the outcomes themselves are not all equally likely.) In theory, Predictalot can support predictions on any set of outcomes. That’s 2 to the power 2 to the power 63, or more than a googol predictions. For now, we restrict you to “only” hundreds of millions of predictions categorized into thirteen types. Computing the odds of a prediction precisely is too slow. Technically, the problem is #P-hard: as hard as counting SAT and harder than the travelling salesman problem. So we must resort to approximating the odds by randomly sampling the outcome space. Sampling is a tricky business — equal parts art and science — and we’re still actively exploring ways to increase the speed, stability, and accuracy of our sampling.

Because we track all possible outcomes, the predictions are automatically interconnected in ways you would expect. A large play on Duke to win the tournament instantly and automatically increases the odds of Duke winning in the first round; after all, Duke can’t win the whole thing without getting past the first round.
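The following toy sketch illustrates both points on an 8-team bracket, which is small enough to check by brute force. It assumes LMSR-style odds, where each outcome is weighted by exp(q / b) and q is the total number of shares bought on predictions that are true in that outcome. The uniform sampler, the bet size, and b = 20 are hypothetical; our production sampler is more involved.

```python
import itertools
import math
import random

TEAMS = 8          # toy bracket; the real tournament has 64 teams
GAMES = TEAMS - 1  # single elimination: 7 games, so 2**7 = 128 outcomes
ROUNDS = 3         # the champion of an 8-team bracket wins 3 games


def wins_per_team(bits):
    """Play out the bracket; bits[g] picks the winner of game g."""
    wins = [0] * TEAMS
    alive = list(range(TEAMS))
    g = 0
    while len(alive) > 1:
        survivors = []
        for first, second in zip(alive[0::2], alive[1::2]):
            winner = second if bits[g] else first
            wins[winner] += 1
            survivors.append(winner)
            g += 1
        alive = survivors
    return wins


# A prediction is a true/false function of a complete outcome.
def wins_title(team):
    return lambda bits: wins_per_team(bits)[team] == ROUNDS


def wins_first_round(team):
    return lambda bits: wins_per_team(bits)[team] >= 1


def weight(bits, bets, b):
    """Unnormalized weight exp(q / b) of one outcome given the bets so far."""
    q = sum(shares for pred, shares in bets if pred(bits))
    return math.exp(q / b)


def exact_odds(pred, bets, b):
    """Brute force over every outcome; hopeless at 2**63 outcomes."""
    num = den = 0.0
    for bits in itertools.product((0, 1), repeat=GAMES):
        w = weight(bits, bets, b)
        den += w
        if pred(bits):
            num += w
    return num / den


def sampled_odds(pred, bets, b, n, rng):
    """Estimate the same ratio from n uniformly sampled outcomes."""
    num = den = 0.0
    for _ in range(n):
        bits = tuple(rng.randrange(2) for _ in range(GAMES))
        w = weight(bits, bets, b)
        den += w
        if pred(bits):
            num += w
    return num / den


bets = [(wins_title(0), 50.0)]  # one large play on team 0 to win it all
before = exact_odds(wins_first_round(0), [], b=20.0)
after = exact_odds(wins_first_round(0), bets, b=20.0)
estimate = sampled_odds(wins_first_round(0), bets, 20.0, 5000, random.Random(0))
```

Nobody bet on the first round directly, yet `after` is well above `before`, because every outcome in which team 0 wins the title is also one in which it wins its first game. The sampled `estimate` lands close to `after` here; with 63 games instead of 7, uniform samples rarely hit the outcomes that matter, which is part of why sampling is tricky.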

With 9.2 quintillion outcomes, Predictalot is to our knowledge the largest prediction market built, testing the limits of what the wisdom of crowds can produce. Predictalot is a game, and we hope it’s fun to play. We’d also like to pave the way for serious use of combinatorial prediction market technology.

Why did Yahoo! build this? Predictalot is a smarter market, letting humans and computers each do what they do best. People enter predictions in simple terms they understand like how one team fares against another. The computer handles the massive yet methodical number crunching needed to combine all the pieces together into a coherent overall prediction of a complex event. Markets like Predictalot, WeatherBill, CombineNet, and Internet advertising systems, to name a few, represent the evolution of markets in the digital age, empowering users with extreme customization. More and more, matching buyers with sellers — the core function of markets — requires sophisticated algorithms, including machine learning and optimization. Predictalot attempts to illustrate this trend in an entertaining way.

David Pennock
Mani Abrol, Janet George, Tom Gulik, Mridul Muralidharan, Sudar Muthu, Navneet Nair, Abe Othman, Daniel Reeves, Pras Sarkar