Category Archives: incentives

2011 ACM Conference on Electronic Commerce and fifteen other CS conferences in San Jose

If you’re in the Bay Area, come join us at the 2011 ACM Conference on Electronic Commerce, June 5-9 in San Jose, CA, one of sixteen conferences that comprise the ACM Federated Computing Research Conference, the closest thing we have to a unified computer research conference.

The main EC’11 conference includes talks on prediction markets, crowdsourcing, auctions, game theory, finance, lending, and advertising. The papers span a spectrum from theoretical to applied. If you want evidence of the latter, look no further than the roster of corporate sponsors: eBay, Facebook, Google, Microsoft, and Yahoo!.

There are also a number of interesting workshops and tutorials in conjunction with EC’11 this year, including:


Workshops:

  • 7th Ad Auction Workshop
  • Workshop on Bayesian Mechanism Design
  • Workshop on Social Computing and User Generated Content
  • 6th Workshop on Economics of Networks, Systems, and Computation
  • Workshop on Implementation Theory


Tutorials:

  • Bayesian Mechanism Design
  • Conducting Behavioral Research Using Amazon’s Mechanical Turk
  • Matching and Market Design
  • Outside Options in Mechanism Design
  • Measuring Online Advertising Effectiveness

The umbrella FCRC conference includes talks by 2010 Turing Award winner Leslie G. Valiant, IBM Watson creator David A. Ferrucci, and CMU professor, CAPTCHA co-inventor, and Games With a Purpose founder Luis von Ahn.

Hope to see many of you there!

Workshops @ACM Electronic Commerce: Ad Auctions, Social Computing, June 5, 2011

The 2011 ACM Conference on Electronic Commerce will be held June 5-9 in San Jose as part of the ACM Federated Computing Research Conference. FCRC is a collection of seventeen computer science conferences with joint plenary speakers, this year featuring David A. Ferrucci, head of IBM’s Watson project; CMU professor and GWAP founder Luis von Ahn; and 2010 Turing Award winner Leslie Valiant. I’d love to someday see a truly unified computer science conference in the style of the math or economics national meetings. Barring that, FCRC is the next-best thing. I hope more conferences will join.

The EC’11 list of accepted papers is out and the program looks great (including six papers from Yahoo! authors). And it’s not too late to submit a paper to one of the associated workshops. Two of particular interest, both on June 5, 2011, are:

Workshop on Social Computing and User Generated Content

The workshop will bring together researchers and practitioners from a variety of relevant fields, including economics, computer science, and social psychology, in both academia and industry, to discuss the state of the art today, and the challenges and prospects for tomorrow in the field of social computing and user generated content.

Social computing systems are now ubiquitous on the web: Wikipedia is perhaps the most well-known peer production system, and there are many platforms for crowdsourcing tasks to online users, including Games with a Purpose, Amazon’s Mechanical Turk, the TopCoder competitions for software development, and many online Q&A forums such as Yahoo! Answers. Meanwhile, the user-created product reviews on Amazon generate value to other users looking to buy or choose amongst products, while Yelp’s value comes from user reviews about listed services…

SUBMISSIONS DUE April 15, 2011, 5pm EDT

Seventh Ad Auctions Workshop

In the past decade we’ve seen a rapid trend toward automation in advertising, not only in how ads are delivered and measured, but also in how ads are sold… The rapid emergence of new modes for selling and delivering ads is fertile ground for research from both economic and computational perspectives…

We solicit contributions of two types: (1) research contributions, and (2) position statements…

Submission deadline: April 15th, 2011 (midnight Hawaii Time)

Three Crowd-ed events this fall

Research and Analysis of Tail Phenomena Symposium

August 20, 2010, Sunnyvale, CA

The last decade has witnessed the emergence of enormous scale artifacts resulting from the independent action of hundreds of millions of people; for example, web repositories, social networks, mobile communication patterns, and consumption in “limitless” stores… the first Research and Analysis of Tail phenomena Symposium (RATS)… will explore the different computational, statistical, and modeling problems related to tail phenomena… We are particularly encouraging summer interns in any of the Bay Area research centers to join us in the event.
We will start with a video welcome by Chris Anderson (Wired), followed by a series of invited talks by Michael Mitzenmacher (Harvard), Aaron Clauset (Univ. of Colorado), Neel Sundaresan (eBay), Sharad Goel (Yahoo! Research, NY) and Michael Schwarz (Yahoo! Research, CA).

We invite proposals for short (20 minute) talks from students and researchers working in the area.

CrowdConf 2010: 1st Annual Conference on the Future of Distributed Work

October 4, 2010, San Francisco, CA

Were you crowdsourcing before it was cool? We want to hear about your projects.

We are inviting submissions on all topics regarding crowdsourcing, including:

  • Past, present, and future of crowdsourcing
  • Quality assurance and metrics
  • Social and economic implications of crowdsourcing
  • Task design/Worker incentives
  • Innovative projects, experiments, and applications

Deadline: Sept. 1

CrowdConf will bring together researchers, technologists, outsourcing entrepreneurs, legal scholars, and artists for the first time to discuss how crowdsourcing is transforming human computation and the future of work.

Confirmed Speakers:
Sharon Chiarella: Vice President, Amazon Mechanical Turk
Tim Ferriss: Author, The 4-Hour Workweek
David Alan Grier: Author, When Computers Were Human
Barney Pell: Partner, Search Strategist, and Evangelist, Microsoft
Maynard Webb: CEO, LiveOps
Jonathan Zittrain: Professor of Law and Computer Science, Harvard

Computational Social Science and the Wisdom of Crowds Workshop at NIPS 2010

December 10th or 11th, 2010, Whistler, Canada

We welcome contributions on theoretical models, empirical work, and everything in between, including but not limited to:

  • Automatic aggregation of opinions or knowledge
  • Prediction markets / information markets
  • Incentives in social computation (e.g., games with a purpose)
  • Studies of events and trends (e.g., in politics)
  • Analysis of and experiments on distributed collaboration and consensus-building, including crowdsourcing (e.g., Mechanical Turk) and peer-production systems (e.g., Wikipedia and Yahoo! Answers)
  • Group dynamics and decision-making
  • Modeling network interaction content (e.g., text analysis of blog posts, tweets, emails, chats, etc.)
  • Social networks

[Covers] computational social science… [and] social computing… with an emphasis on the role of machine learning…

Deadline for submissions: Friday October 8, 2010

Meet the splORGers: The latest breed of web spam parasites

Via Muthu. This is mind boggling to me.

Sparasites on the web now somehow find it worth their while to invade ultra-specialized academic conferences. Call them splORGers. (In close analogy to sploggers).

The website appears to be the official home of the 49th Annual IEEE Symposium on Foundations of Computer Science. (In fact, it’s the top result for the search “focs 2008” in Bing, Google, and Yahoo!.) Historically a few hundred people attend to hear talks like “A Hypercontractive Inequality for Matrix-Valued Functions with Applications to Quantum Computing and LDCs”.

The website appears fully functional: you can browse the entire website structure including internal links like the list of accepted papers and external links like the online registration form.

But look more closely at the lower left corner of the front page. What do you see? SPAM KEYWORDS!: “Data Recovery Dell Memory HP Memory PC RAM wow accounts WoW gold”.

[Image: spam keywords on splORG site]


It turns out that is NOT the official FOCS 2008 conference home page. Rather, it’s (Yahoo! ranks this site in second place, Bing and Google in seventh.)

This doesn’t seem like a zero-cost no-brainer automated attack. It involves identifying the appropriate domain name and mirroring another website, not as one-click as it sounds. There’s even a small sign of manual effort: the fox graphic in the upper left links to rather than 2008, as in the original. And of course there’s the cost to register and host the domain.

So why bother? Clearly, the perpetrator is not expecting real people to click on the spam links. At its peak, about as many people searched for “focs 2008” as for “pennock”, and the offending links are fairly obscure. This is most certainly about siphoning link juice from seemingly legitimate .orgs that search engines trust.

But can that benefit really outweigh the cost? Again and again I simply fail to grok the economics of spam.

SplORGers have also set up camp at and Curiously, has a more transparent yet still head-scratching disclaimer.

Today, I stumbled onto a similar spamfiltration on, the first external link on the Wikipedia definition of mortgage points, prompting me to finally write this post. Look what our ultra open web has wrought!

Intelligent blog spam

As I alluded to previously, I seem to be getting “intelligent spam” on my blog: comments that pass the re-captcha test and seem on-topic, yet upon further inspection clearly constitute link spam: either the author URI or a link in the comment body is spam.

Here is one of the clearest cases, received on January 9 as a comment to my post on the CFTC’s call for proposals to regulate prediction markets:

Date: Fri, 9 Jan 2009 01:28:01 -0800
From: Matt.Herdy
New comment on your post #71 “A historic MayDay: The US
government’s call for help on regulating prediction markets”
Author : Matt.Herdy
Thanks for that post. I’ll put a note in the post.

1. It’s nothing new. The CFTC will just formalize the current
status quo.
2. We are prisoner of the CFTC regulations and the US Congress’
distaste of sports “gambling”. As for the profitability of prediction
exchanges in that strict environment, I don’t see how you can deny that
HedgeStreet went bankrupt even though it was well funded. Isn’t that a
hard fact?
3. You’re right, but all “pragmatists” should follow a business
plan and make profits. See point #2. Pragmatists won’t make miracles.

<a href=””>Removing stretch marks</a>

At first blush, the comments seem to come from a knowledgeable person: they refer to HedgeStreet, an extremely relevant yet mostly unknown company that’s not mentioned anywhere else in the post or other comments.

It turns out the comments seem intelligent because they are. In fact, they’re copied word for word from Chris Masse’s comments on his own blog.

Chris Masse’s page has a link to my page, so it could have been discovered with a “link:” query to a search engine.

Though I now understand what this spammer did, I remain puzzled about exactly how they did it and especially why.

  1. Are these comments being inserted by people, perhaps hired on Mechanical Turk or some underground equivalent? Or are they coming from robots that have broken either re-captcha or the security of my blog? (John suspects a security breach.)
  2. Is it really worth it economically? All links in blog comments are NOFOLLOW links anyway, and disregarded by search engines for ranking purposes, so what is the point? Are they looking for actual humans to click these links?

In any case, it seems an intriguing development in the spam arms race. Are other bloggers getting “intelligent spam”? Does anyone know how it’s done and why?

Update 2010/07: Oh, the irony. I got a number of intelligent-seeming comments on this post about SEO, nofollow, economics of spam, etc. that were… promoting spammy links. I left them up for humor value, though I disabled the links.

The seedy side of Amazon's Mechanical Turk

I mostly side with Lukas and Panos on the fantastic potential of Amazon’s Mechanical Turk, a crowdsourcing service specializing in tiny payments for simple tasks that require human brainpower, like labeling images. Within the field of computer science alone, this type of service will revolutionize how empirical research is done in communities from CHI to SIGIR, powering unprecedented speed and scale at low cost (here are two examples). My guess is that the impact will be even larger in the social sciences; already, a number of folks in Yahoo’s Social Dynamics research group have started running studies on mturk. (A side question is how university review boards will react.)

However, there is a seedier side to mturk, and I’m of two minds about it. Some people use the service to hire sockpuppets to enter bogus ratings and reviews about their products and engage in other forms of spam. (Actually this appears to violate mturk’s stated policies.)

For example, Samuel Deskin is offering up to ten cents to turkers willing to promote his new personalized start page samfind.



1. Set up an anoymous email account likke gmail or yahoo so you can register on #2 anonymously

2. Visit and sign up for an account – using your anonymous email account.

3. Visit and vote for:


By clcking “Pick”


4. Visit the COMMENTS Page on The Search Race, it is the Button Right Next to “Picks” on this page: and

5. Say something awesome about samfind ( on The Search Race’s Comments page.

Make sure to:

1. Tell us that you Picked us.
2. Copy and Paste the Comment you typed on The Search Race’s Comment page here so we know you wrote it and we will give you the bonus!

In fact, Deskin is currently offering bounties on mturk for a number of different spammy activities to promote his site. On the other hand, what Deskin is doing is not illegal and is arguably not all that different from paying PRWEB to publish his rah-rah press release (Start-up, samfind, Launches Customizable Startpage to Compete with Google, Yahoo & MSN, Los Angeles, California (PRWEB) August 4, 2008). And I have to at least give him credit for offering the money under his own name.

Another type of task on mturk involves taking a piece of text and paraphrasing it so that the words are different but the meaning remains the same. Here is an example:

Paraphrase This Paragraph

Here’s the original paragraph:

You’re probably wondering how to apply a wrinkle filler to your skin. The good news is that it’s easy! There are a number of different products on the market for anti aging skin care. Each one comes with its own special application instructions, which you should always make sure to read and carefully follow. In general, however, most anti aging skin care products are simply applied to the skin and left to soak in.

1. Use the same writing style as much as possible.
2. Vary at least 50% of the words and phrases – but keep the same concepts. Use obviously different sentences! Your paragraph should not be just a copy of the first with a few word replacements.
3. Any keywords listed in bold in the above paragraph must be included in your paraphrase.
4. The above paragraph contains 75 words… yours must contain at least 64 words and not more than 101 words.
5. Write using American English.
6. No obvious spelling or grammar mistakes. Please use a spell-checker before submitting. A free online spell checker can be found at

If you find it easier to paraphrase sentence-by-sentence, then do that. Please do not enter anything in the textbox other than your written paragraph. Thanks!

I have no direct evidence, but I imagine such a task is used to create splogs (I once found what seems like such a “paraphrasing splog”), ad traps, email spam, or other plagiarized content.

It’s possible that paid spam is hitting my blog (either that or I’m overly paranoid). I’m beginning to receive comments that are almost surely coming from humans, both because they clearly reference the content of the post and because they pass the re-captcha test. However, the author’s URL seems to point to an ad trap. I wonder if these commenters (who are particularly hard to catch — you have to bother to click on the author URL) are paid workers of some crowdsourcing service?

Can and should Amazon try to filter away these kinds of dubious uses of Mechanical Turk? Or is it better to have this inevitable form of economic activity out in the open? One could argue that at least systems like mturk impose a tax on pollution and spam, something long argued as an economic force to reduce spam.

My main objection to these activities is the lack of disclosure. Advertisements and press releases are paid for, but everyone knows it, and usually the funding source is known. However, the ratings, reviews, and paraphrased text coming out of mturk masquerade as authentic opinions and original content. I absolutely want mturk to succeed — it’s an innovative service of tremendous value, one of many to come out of Amazon recently — but I believe Amazon is risking a minor PR backlash by allowing these activities to flow through its servers and by profiting from them.

Fred, Fran, and baby makes three

Two mathematicians, Fred and Fran, were having a baby girl, their first child! They sought the perfect name, a name that would simultaneously reflect togetherness, relationships, and individuality in their burgeoning family. Day and night they debated, rejecting name after name. Finally, they had it! The perfect name!

They named her Erin.


[Yootleoffer: 1 Yootle for first correct response.]

  • 2008/06/18 Addendum: Fred and Fran both study set theory.
  • 2008/07/27 Addendum: It turns out I didn’t need the 6/18 hint-addendum: commenters had already chimed in with correct answers but, due to a combination of mechanical and pilot error, I didn’t realize it.

    So, … drum roll please…
    the winner is… John! His is the first correct response. Commenter d is also correct with a more succinct and mathematical explanation. Dennis is close but not quite complete. So I’ll award John 2 yootles, d 1 yootle, and Dennis 1/2 yootle. John and d please let me know your contact info to claim your bounty.

    Dennis asks what a yootle is worth. A yootle is a quantified “thanks, I owe you one”. So it’s worth a return favor from me, someone who trusts me, someone who trusts someone who trusts me, etc.

    Bonus challenge: come up with a family of four with the same property and reasonable names (necessarily of eight letters each).

  • 2008/08/13 Addendum: The bonus round winner is… aj! He hacked up a script and discovered one of apparently many possible “perfectly” named families of four. Details are in the comments of this post. Thanks aj!
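For anyone who wants to hack up a similar script, here is a minimal sketch of a checker. It rests on an assumption about what the "perfect" property is, based only on the set-theory hint and the eight-letter remark above: treat each name as a set of letters, and require every region of the resulting Venn diagram to hold exactly one letter. The function name `is_perfect_family` is made up for illustration and is not taken from aj's script.

```python
def is_perfect_family(names):
    """Check the assumed property: each Venn region of the names' letter
    sets contains exactly one letter."""
    letter_sets = [set(name.lower()) for name in names]
    # Assumption: a name with a repeated letter cannot qualify, since each
    # letter of a name should occupy its own region.
    if any(len(s) != len(name) for s, name in zip(letter_sets, names)):
        return False
    all_letters = set().union(*letter_sets)
    # A letter's "signature" is the set of names that contain it, which
    # identifies the Venn region the letter falls in.
    signatures = [
        frozenset(i for i, s in enumerate(letter_sets) if letter in s)
        for letter in all_letters
    ]
    # n names give 2**n - 1 non-empty regions; each must be hit exactly once.
    return (len(signatures) == 2 ** len(names) - 1
            and len(set(signatures)) == len(signatures))


if __name__ == "__main__":
    print(is_perfect_family(["Fred", "Fran", "Erin"]))
    print(is_perfect_family(["Fred", "Fran", "Eric"]))
```

Under this reading, a family of n names needs 2^n - 1 distinct letters and each name covers 2^(n-1) regions, which is consistent with four letters per name for three people and eight letters per name for four.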

Predictions: Apple bites, Google eats

Happy 5768 everyone!

Time for some predictions.

  1. Apple bites into PC pie. Apple Computer (remember them?) will attain at least 30% PC market share by 5772.

    Probability: 40% ; Willing to stake: $Y20

    On the front lines, silver Powerbooks are infiltrating in droves. At techie conventions and computer science conferences, penetration has gone from almost zero to something approaching 1/3 by anecdotal evidence. Wandering about these venues, it’s not terribly uncommon to see a table of three or four who apparently all agree to think different. At Yahoo!, more and more of Jobs’s ministers are simply preaching to the converted. In our Yahoo! Research New York office, for example, laps are topped at least two to one with half-eaten half-glowing apples. Even tech celeb Marc Andreessen has returned to the fold.

    But can the Apple bug jump from geeks to grandmas? (Well, my daughters’ grandma is already infected.) I’m guessing so. After all, these same alphadopters led the way to mp3s, Google, Wikipedia, Slashdot, blogs, Firefox, Digg, and Homestar Runner, unlocking remarkable truths along the way like “web search can be monetized”, “Really Simple trumps Really Smart”, and “give up now, Friendster has already won”. (Oops.)

    Why is there an Apple renaissance on the desktop? A big reason is that the OS’s natural monopoly is not so natural anymore. Today, the browser is the most important piece of software on your computer, and a viable cross-platform browser (Firefox) exists that almost every web site designs to. A second reason: it turns out that Intel chips are faster and better than PowerPC chips after all, despite decades of vehement Apple fanboy arguments to the contrary. Third, Apple’s built-in iLife software suite really is astonishingly useful and well designed and speaks to the new killer apps of the desktop: pictures, music, video, web, and email. A final reason is, well, Apple is cool, and technology is at least as much about fashion as function, or at least more than geeks would like to admit.

    Disagreers can accept my yootleoffer or put your play money where your mouth is on related bets at PPX and Inkling.

    (Side note: My take on Apple’s fumbled iPhone price cut: I believe that Apple reacted in fear of the looming gPhone. However, if history is a guide, that fear may be an exaggerated fear of the unknown.)

  2. Google eats its own dog food. Google buys an advertisement by the end of 5768.

    Probability: 60% ; Willing to stake: $Y20

    Google is the king of selling advertisements. So they must believe that advertising is effective, right? Then why doesn’t Google advertise for itself? (I’m not counting recruiting ads.) I’m guessing the reason is that they don’t have to. As a media darling, they get more than enough free press to catalyze their already monstrous word of mouth. I expect that as the glow wears off, as some of the not not evil jabs — deserved or not — start to stick, and as they settle into Big Company mode, you will start to see Google spots on TV and elsewhere.

2007/09/17 Update: Sean McNee noticed that Google is advertising Google Apps to enterprise customers on VentureBeat and the Seattle Times [example ad image]. As a result, let me update my prediction to “Google buys a TV ad aimed at mass consumers”.

2007/09/19 Update: Maverick blogger, Maverick owner, Yahoo! benefactor, and uber alphadopter Mark Cuban is dancing with the Steves.

2010 Update: I was right, just 1.5 years too early. In other words, I was wrong.

Challenge: Low variance craps strategy

This is the first of a series of challenge posts. I’ll pose a problem in the hopes of convincing the wise Internauts to come forth with solutions. I intend the problems to be do-able rather than mind boggling: simply intriguing problems that I’d love to know the answer to but haven’t found the time yet to work through. Think of it as Web 2.0 enlightenment mixed with good old fashioned laziness. Or think of it as Yahoo! Answers, blog edition.

Don’t expect to go unrewarded for your efforts! I’ll pay ten yootles, plus an optional and unspecified tip, to the respondent with the best solution. What can you do with these yootles? Well, to make a long story short, you can spend them with me, people who trust me, people who trust people who trust me, etc. (In lieu of a formal microformat specification for yootle offers, for now I’ll simply use the keyword/tag “yootleoffer” to identify opportunities to earn yootles, in the spirit of “freedbacking”.)

So, on with the challenge! I just returned from a pit stop in Las Vegas, so this one is weighing on my mind. I’d like to see an analysis of strategies for playing craps that take into account the variance of the bettor’s wealth, not just the expectation.

Every idiot knows the best strategy to minimize the casino’s edge in craps: bet the pass line and load up on the maximum odds possible. The odds bet in craps is one of the only fair bets in the casino, so the more you load up on odds, the closer the casino’s edge is to zero. But despite the fact that craps is one of the fairest games on the casino floor, it’s also one of the highest variance games, meaning that your money can easily swing wildly up or down in a matter of minutes. So on a fixed budget, craps can be exceedingly dangerous. What I’m looking for is one or more strategies that have lower variance, and are thus less risky.
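To put numbers on "the closer the casino’s edge is to zero", here is a small sketch that computes the pass line edge exactly with fractions, with and without odds. It assumes standard craps rules (pass line wins on a come-out 7 or 11, loses on 2, 3, or 12, and the odds bet pays true odds); the function names are just illustrative.

```python
from fractions import Fraction

# Number of ways to roll each total with two fair dice.
WAYS = {total: 6 - abs(total - 7) for total in range(2, 13)}
POINTS = (4, 5, 6, 8, 9, 10)


def pass_line_win_probability():
    """Exact probability that a pass line bet wins."""
    p = Fraction(WAYS[7] + WAYS[11], 36)  # immediate win on the come-out roll
    for point in POINTS:
        # Establish the point, then roll it again before a 7.
        p += Fraction(WAYS[point], 36) * Fraction(WAYS[point], WAYS[point] + WAYS[7])
    return p


def edge_with_odds(odds_multiple):
    """Casino edge per unit wagered when backing every point with odds."""
    # The odds bet has zero expectation, so the expected loss per round
    # comes entirely from the flat pass line bet.
    expected_loss = 1 - 2 * pass_line_win_probability()
    p_point = Fraction(sum(WAYS[p] for p in POINTS), 36)
    average_wager = 1 + odds_multiple * p_point
    return expected_loss / average_wager


if __name__ == "__main__":
    for multiple in (0, 1, 2, 5, 10):
        print(multiple, float(edge_with_odds(multiple)))
```

Note what this does and does not say: odds shrink the edge as a fraction of money wagered, but the expected dollar loss per round stays the same while the swings get bigger, which is exactly the variance problem.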

So that this challenge is not vague and open ended, let me boil this overall goal down into something fairly specific:

The Challenge: Suppose that I walk into a casino with $200. I arrive at a craps table that has a $5 minimum bet and allows 2X odds. I’m looking for a strategy that:

  1. Has at least some chance of making a profit (otherwise, why bother?), and
  2. Maximizes the expected amount of time (number of dice rolls) that my $200 will last.

I prefer if you ignore the center bets in your analysis. Bonus points if you examine what happens with different budgets, table limits, and/or allowed odds. Another way to motivate this is as follows: I have a small fixed budget but want to hang around a high-limit table for as long as possible, because I get a better atmosphere, more drinks, and a glimpse of life as a high roller.

As an example, here is a strategy that appears to have very low variance: On the come out roll, bet on both the pass line and the don’t pass line. If the shooter rolls 2, 3, 7, or 11 you break even. If the shooter rolls 4, 5, 6, 8, 9, or 10, you’re also guaranteed to eventually break even. The only time you lose money is when the shooter rolls a 12 on a come out roll, in which case you lose your pass line bet and keep your don’t pass bet (i.e., you lose half your total stake). There’s only one problem with this strategy: it’s moronic. You have absolutely no possibility of winning: you can only either break even or lose. One thing you might add to this strategy to satisfy condition (1) is to take or give odds whenever the shooter establishes a point. Will this strategy make my $200 last longer on average than playing the pass line only?
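As a rough back-of-the-envelope check on that last question, here is a sketch comparing the two strategies with no odds bets at all. It assumes a $200 bankroll, flat $5 bets, and play that continues until the next round can no longer be covered; the pass-line-only figure uses Wald's identity, which applies cleanly here because the bankroll is a multiple of the bet and so ends at exactly zero. Function names are made up for illustration.

```python
from fractions import Fraction

# Expected dice rolls per round (come-out roll plus rolls to resolve a point).
# A point with w ways takes 36 / (w + 6) rolls on average to resolve.
ROLLS_PER_ROUND = 1 + sum(Fraction(w, w + 6) for w in (3, 4, 5, 5, 4, 3))


def expected_rounds_hedge(bankroll=200, bet=5):
    """Pass plus don't pass, no odds: lose one bet only on a come-out 12."""
    losses_until_stop = 0
    while bankroll >= 2 * bet:  # each round needs a bet on both lines
        bankroll -= bet
        losses_until_stop += 1
    # Each loss takes 36 rounds on average (a come-out 12 has probability 1/36).
    return losses_until_stop * 36


def expected_rounds_pass_only(bankroll=200, bet=5):
    """Pass line only, no odds: bankroll divided by expected loss per round."""
    expected_loss_per_round = bet * Fraction(7, 495)
    return Fraction(bankroll) / expected_loss_per_round


if __name__ == "__main__":
    for name, rounds in (("hedge", expected_rounds_hedge()),
                         ("pass only", expected_rounds_pass_only())):
        print(name, float(rounds), "rounds,", float(rounds * ROLLS_PER_ROUND), "rolls")
```

Under these assumptions the hedge loses $5 with probability 1/36 per round (about 14 cents on average), roughly twice the 7 cents or so that a lone $5 pass line bet loses, so on expectation alone the hedge burns through the budget faster. Adding odds to either strategy changes the variance and the stopping behavior, which is where simulation comes in.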

For bonus points, I’d love to see a graph plotting a number of different strategies along the efficient frontier, trading off casino edge and variance. Another bonus point question: In terms of variance, is it better to place a single pass line bet with large odds, or is it better to place a number of come bets all with smaller odds?

To submit your answer to this challenge, post a comment with a link to your solution. If you can dig up the answer somewhere on the web, more power to you. If you can prove something analytically, I bow to you. Otherwise, I expect this to require some simple Monte Carlo simulation. Followed of course by some Monte Carlo verification. 🙂 Have fun!
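To get the ball rolling, here is a minimal Monte Carlo sketch, not a solution. It simulates only the pass line with optional odds, under assumptions of my choosing: flat $5 bets, full odds taken only when the bankroll can cover them, play until the flat bet can no longer be covered, and a cap on the number of rolls so that lucky runs terminate. All names and defaults are illustrative.

```python
import random

# True-odds payouts (numerator, denominator) for the odds bet on each point.
ODDS_PAYOUT = {4: (2, 1), 10: (2, 1), 5: (3, 2), 9: (3, 2), 6: (6, 5), 8: (6, 5)}


def roll(rng):
    """Roll two fair six-sided dice and return the total."""
    return rng.randint(1, 6) + rng.randint(1, 6)


def rolls_until_broke(bankroll=200, bet=5, odds_multiple=2,
                      max_rolls=100_000, rng=random):
    """Count dice rolls until the bankroll cannot cover another pass line bet.

    The cap is checked between rounds, so the count may exceed max_rolls
    by the few rolls needed to finish the last round.
    """
    rolls = 0
    while bankroll >= bet and rolls < max_rolls:
        total = roll(rng)
        rolls += 1
        if total in (7, 11):        # come-out win
            bankroll += bet
        elif total in (2, 3, 12):   # come-out loss
            bankroll -= bet
        else:                       # a point is established
            point = total
            odds = odds_multiple * bet
            if bankroll < bet + odds:
                odds = 0            # cannot afford full odds: play flat only
            while True:
                total = roll(rng)
                rolls += 1
                if total == point:
                    num, den = ODDS_PAYOUT[point]
                    # Integer payout; exact for $10 odds, rounds down otherwise.
                    bankroll += bet + odds * num // den
                    break
                if total == 7:
                    bankroll -= bet + odds
                    break
    return rolls


def average_rolls(trials, seed=0, **kwargs):
    """Average rolls_until_broke over independent trials with a seeded RNG."""
    rng = random.Random(seed)
    return sum(rolls_until_broke(rng=rng, **kwargs) for _ in range(trials)) / trials


if __name__ == "__main__":
    for multiple in (0, 1, 2):
        print(multiple, average_rolls(200, odds_multiple=multiple))
```

Extending this to the pass plus don't pass hedge, come bets, or different table limits mostly means adding more bet types to the round logic; tracking the spread of outcomes as well as the mean would give the variance side of the picture.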

Addendum: The winner is … Fools Gold!