All posts by David Pennock

The long tail of science: Good, bad, or ugly?

(First in a series of “random thoughts on science”)

A mind-boggling number of academic research conferences and workshops take place every year. Each fills a thick proceedings with publications, some containing hundreds of papers. High-profile conferences can attract five times that many submissions, often of low average quality. Smaller venues can seem absurdly specialized (unless it happens to be your specialty). Every year, new venues emerge. Once established, rarely do they “retire” (there is still an ACM Special Interest Group on the Ada programming language, in addition to a SIG on programming languages). It’s impossible for all or even most of the papers published in a given year to be impactful. Most of them, including plenty of my own, will never be cited or even read by more than the authors and reviewers.

No one can deny that incredible breakthroughs emerge from the scientific process — from Einstein to Shannon to Turing to von Neumann — but scientific output seems to have a (very) long tail.

Is this a good thing, a bad thing, or just a thing?

Is the tail…

Good?
Is the tail actually crucial to the scientific process? Are some breakthroughs the result of ideas that percolate through long chains — person to person, paper to paper — from the bottom up? Is science less dwarfs standing on the shoulders of giants than giants standing on the shoulders of dwarfs? I published a fairly straightforward paper that applies results in social choice theory to collaborative filtering. Then a smarter scientist wrote a better paper on a more widely applicable subject, apparently partially inspired by our approach. Could such virtuous chains actually lead, eventually, to the truly revolutionary discoveries? Is the tail wagging the dog?
Bad?
Are the papers in the tail a waste of time, energy, and taxpayer dollars? Do they have virtually no impact, at least compared to their cost? Should we try hard to find objective measures that identify good science and good scientists and target our funding to them, starving out the rest?
Ugly?
Is the tail simply a messy but necessary byproduct (I can’t resist: a “messessity”) of the scientific process? Under this scenario, breakthroughs are fundamentally rare and unpredictable hits among an enormous sea of misses. To get more and better breakthroughs, we need more people trying and mostly failing — more monkeys at typewriters trying to bang out Shakespeare. Every social system, indeed almost every natural system, has a long tail. Maybe it’s simply unavoidable, even if it isn’t pretty. Was the dog simply born with its (long and scraggly) tail attached?

Jamesburg, New Jersey: Per-capita bank branch capital of the world

By 2007, Jamesburg, New Jersey, a town of 6,000, had four walk-in bank branches — Bank of America, Constitution, PNC, and Sovereign — complete with bricks, mortar, tellers, and aura of trust along its quaint “Main Street” downtown corridor.

Apparently that wasn’t enough.

In 2008, Chase Bank and TD Bank broke ground. Thousands of motorists now pass them every weekday morning on their way to the New Jersey Turnpike and again every evening on their way home. If I had a hand in it, I might insert a drive-thru restaurant, of which there are currently none, into the path of commuters. But I don’t and the Invisible Hand chose otherwise: to erect two more banks for a total of six banks within one square mile, or one for every 1000 residents. (To be fair, the surrounding township has 30,000 people, but probably a dozen more banks.)


Six walk-in bank branches within one square mile in Jamesburg, NJ USA

We live in an era of electronic banking when ATMs dispensing paper money seem horribly analog. Walking through a door under a roof of a building representing the shelter for my money to talk to a person is, I’ll admit, occasionally reassuring, and even less occasionally useful. But everyone must admit that this is an activity growing rarer by the day.

So why are bank branches staging a last stand in this small New Jersey town?

Probably because the surrounding community, Monroe Township, is home to several retirement communities whose residents select banks based on the accessibility of branches. (They also buy newspapers and watch ABC’s World News with Charles Gibson at 6:30 and hence commercials for prescription drugs.)

Several new shopping centers have gone up in the area and each seems to have the same collection of stores, anchored by a drug store and a bank.

The data may say that these are profitable investments, but for how long?

Jamesburg would seem to have great potential as a consumer destination: a walkable urban strip in the center of a relatively affluent suburban township, on the bank of a gorgeous lake adjacent to a 675-acre park. Yet it has a few mom-and-pop shops, one Subway, one Dunkin’ Donuts, and one gas station. And six banks. Go figure.

KISS prediction markets (lingo) goodbye

The lingo of prediction markets varies widely.

The same “thing” might be called an information market, idea future, virtual stock market, financial market, securities market, event market, binary option, betting exchange, bookmaker, market in uncertainty, or gambling/wagering. Only recently has the name prediction market emerged with some sort of consensus.

To place a prediction in the market, you might do any of the following:

[bid/buy/bet on/back] the “yes” [security/contract/coupon/future/outcome] at [price/probability/fractional odds/decimal odds/moneyline] X

Predicting something won’t happen gets even uglier. You might:

[ask/short sell yes/buy no/buy bundle & sell yes/bet against/lay] at [price/probability/fractional odds/decimal odds/moneyline] X

For example, InklingMarkets uses the “short sell yes” variation:

InklingMarkets' explanation of short selling

So what is the clearest language for prediction markets?

A good guiding principle in this regard is KISS: Keep It Simple Stupid. Or, in more grandiose terms, Occam’s razor. All else being equal, one should choose the simplest and most straightforward option.

By this measure, it seems that betting lingo wins hands down. It’s vastly simpler to say “I bet $10 that Obama will lose” than to say “I short sell three shares of Obama at price 67”. The former is more direct and intuitive. Almost everyone understands what it means to place a bet, including subtleties like risk, uncertainty, and competition. On the other hand, even avid stock traders get tripped up by the concept of selling short.

Every prediction can be stated as: “I bet that outcome O will/won’t happen; I’ll risk $X to win $Y”. Betting for things and against things is symmetric. There is no need to short sell, buy bundles first, etc.
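To make the symmetry concrete, here is a minimal sketch that restates a short sale as a plain bet, and a plain bet against as a short sale. It rests on an assumption that not every market shares: each share pays $1 if the outcome happens and nothing otherwise, with prices quoted on a 0 to 100 scale. The function names are made up for illustration.

```python
def short_sale_as_bet(shares, price):
    """Restate 'short sell `shares` at `price`' as (risk, win) in dollars.

    Assumes each share pays $1 if the outcome happens, $0 otherwise,
    and that `price` is quoted on a 0-100 scale.
    """
    p = price / 100.0
    risk = shares * (1.0 - p)   # net loss if the outcome does happen
    win = shares * p            # sale proceeds kept if it does not happen
    return risk, win


def bet_against(risk_dollars, price):
    """Restate 'I risk `risk_dollars` that it won't happen' as (shares to short, win)."""
    if not 0 < price < 100:
        raise ValueError("price must be strictly between 0 and 100")
    p = price / 100.0
    shares = risk_dollars / (1.0 - p)
    win = shares * p
    return shares, win


if __name__ == "__main__":
    risk, win = short_sale_as_bet(3, 67)
    print(f"Short 3 shares at 67: risk ${risk:.2f} to win ${win:.2f}")
    shares, win = bet_against(10, 67)
    print(f"Risk $10 against at 67: short {shares:.2f} shares to win ${win:.2f}")
```

Under those assumptions, shorting three shares at 67 is the same thing as risking $0.99 to win $2.01. The bet phrasing needs only the last two numbers.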

Yet most prediction markets don’t KISS, going with financial terminology instead, reflected even in the name itself. Why? I believe it’s because of the legal and social stigma attached to gambling. It’s a shame that such considerations force vendors to make the technology harder to understand and more complicated to use.

A world without roads and wires

Take the Earth and subtract just two things: roads and wires. How much more pleasant a place would it be? No asphalt arteries devouring tax dollars as they carve a dense grid through the world’s grass and trees. No endless rows of poles and towers draped with miles and miles of wires coming between our eyes and our skies. Imagine the makeover the space around and under your desk would receive!

Actually, the vision may not be as far-fetched as it seems: we just need personal flying vehicles and wireless power & communications.

Challenge: Derive the Kelly criterion for play money

The Kelly criterion is a money management strategy for gamblers and investors. The strategy says that, when faced with a positive-expectation bet, you should invest a fraction of your budget that is proportional to your expected profit. The more you expect to gain, the more you should risk, but you never risk your entire budget.

The Kelly strategy is optimal in several senses: (1) it minimizes your “doubling time”, or the time it takes to go from having X dollars to having 2X dollars; (2) it minimizes the time it takes to achieve any given level of wealth; (3) it maximizes your long-run wealth.

(It turns out that the Kelly strategy is equivalent to maximizing a logarithmic utility function.)
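As a rough illustration, the standard textbook form of the rule for a simple repeated bet fits in a few lines of Python. The sketch assumes a bet that wins with probability p and pays net odds of b to 1, for which the Kelly fraction is f* = (b·p − (1 − p)) / b. The function names and the example numbers are illustrative assumptions.

```python
import math


def kelly_fraction(p_win, net_odds):
    """Textbook Kelly fraction f* = (b*p - q) / b for a bet paying b-to-1.

    Returns 0.0 when the bet has no positive expectation.
    """
    q = 1.0 - p_win
    return max(0.0, (net_odds * p_win - q) / net_odds)


def expected_log_growth(fraction, p_win, net_odds):
    """Expected log growth of the bankroll per bet when staking `fraction`."""
    return (p_win * math.log(1.0 + net_odds * fraction)
            + (1.0 - p_win) * math.log(1.0 - fraction))


def doubling_time(fraction, p_win, net_odds):
    """Approximate number of bets needed to double, given positive growth."""
    growth = expected_log_growth(fraction, p_win, net_odds)
    return math.log(2.0) / growth if growth > 0 else math.inf


if __name__ == "__main__":
    f = kelly_fraction(0.55, 1.0)
    for candidate in (f / 2, f, 2 * f):
        print(f"stake {candidate:.2f}: doubling time "
              f"{doubling_time(candidate, 0.55, 1.0):.0f} bets")
```

For an even-money bet that wins 55% of the time, the formula says to stake 10% of your bankroll. Staking either less or more than that lowers the expected log growth, which is the logarithmic-utility connection in numeric form.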

A key reason the Kelly strategy is optimal is that it is very careful to never take you completely bankrupt: you spend only a fraction of your money, always reserving a bit for tomorrow, however small. This is sound advice when dealing with real money. (Aside: this all assumes you have a strict budget cap, which is not entirely realistic: you can almost always borrow at least some amount, even in today’s economy.)

But what about maximizing your virtual “wealth” inside a play-money game like NewsFutures, InklingMarkets, HubDub, or MediaPredict? The problem is not quite the same, precisely because you cannot really go bankrupt. Almost every game offers an option to “recharge” your account if you go bust. Even if the option is not explicit, you can always just abandon your account and start a new one with a fresh initial bankroll they typically give to new players.

So what is the Kelly criterion for play money? What is the optimal strategy that minimizes your doubling time when you’re always allowed to recharge back to a fixed starting value any time you go bankrupt? The answer is not obvious to me, so I’m crowdsourcing the problem: can readers derive the right rule?

My only conjecture is that it might become optimal to go “all in” on every single bet. But I’m not sure. [Update: I’ve convinced myself this is not optimal. Imagine two sequential bets, the first with minuscule expected profit and the second with huge expected profit: surely you should not go “all in” on the first.]

Note that finding the optimal solution may not just help you win more bragging rights in online games. There is a fascinating sports betting site called CentSports that gives everyone ten real cents to start with. If you can turn that ten cents into twenty dollars, they’ll cut you a check. Moreover, if you ever go to zero, they’ll restore you right back to ten cents. In other words, the system works just like play-money games except the potential for profit is real. So another way to phrase the challenge question is: what strategy in CentSports minimizes the time it takes you to go from ten cents to twenty dollars?
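One way to poke at the question before deriving anything is simulation. Below is a rough Monte Carlo sketch under assumptions that are entirely made up for illustration: every bet is the same even-money wager that wins with probability 0.55, the smallest allowed stake is one cent, and going bust resets you to the ten-cent start. None of these numbers come from CentSports itself, and the strategy tested is just "always stake a fixed fraction".

```python
import random


def bets_to_target(fraction, p_win=0.55, start=0.10, target=20.0,
                   min_bet=0.01, rng=None, max_bets=1_000_000):
    """Count bets until the bankroll reaches `target`, recharging on bust."""
    if rng is None:
        rng = random.Random()
    bankroll = start
    for n in range(1, max_bets + 1):
        # Stake a fixed fraction: at least the minimum bet, at most everything.
        stake = min(bankroll, max(min_bet, fraction * bankroll))
        if rng.random() < p_win:
            bankroll += stake      # even-money win
        else:
            bankroll -= stake
        if bankroll >= target:
            return n
        if bankroll < min_bet:
            bankroll = start       # bust: recharge to the starting amount
    return max_bets                # gave up at the (assumed) cap


def average_bets(fraction, trials=200, seed=0):
    """Average number of bets to hit the target over seeded trials."""
    rng = random.Random(seed)
    total = sum(bets_to_target(fraction, rng=rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for f in (0.10, 0.25, 0.50, 1.00):
        print(f"fraction {f:.2f}: about {average_bets(f):.0f} bets on average")
```

Fixed fractions are only one family of strategies. As the update above notes, when bets differ in quality the right stake presumably depends on the bet at hand, which this sketch ignores.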

Should there be a Prediction Market Institute?

There’s a Prediction Market Industry Association (sort of).

Is it time for a Prediction Market Institute dedicated to scientific advancement and engineering innovation in prediction markets?

On the face of it, the concept is ludicrous: there is no “Support Vector Machine Institute”, for example. But a bunch of tech companies have PM research efforts of some sort, including Google, HP, Microsoft, and Yahoo!. Folks at these companies have come together to lobby, to speak, and to exchange academic research results. Would YaHPooglesoft fund such an institute? If not, who? Chris Masse, who adds “PM journalism” to the list of institute goals, is on the case.

Innovation (or lack thereof) in casino gambling

Casino floors from Macau to Mississippi look eerily similar. The slot machine seas. The table game islands. The high-limit oases. The restaurants, shows, buffets. The colorful currency. The slot machines. The excruciating check-in lines. Minimum bet forced scarcity. The bleeping beeping slot machines.

The games themselves are for the most part the same that people have played for centuries, with rare exceptions. People flock to the games they already know: blackjack, craps, baccarat. Is this a matter of making gamblers comfortable wherever they go, luring them into a wallet-emptying rhythm? Have casinos evolved to perfection, like sharks? It seems ironic that gamblers who clearly exhibit risky behavior only want to deal with games that are known and familiar. Is there room for innovation in casino gambling? Is this a fat satiated industry resting on its laurels ready for a spark of creativity to ignite a shakeup, or a smart, precisely tuned machine already operating at full throttle in optimized mode, thank you very much?

For example, innovation in slot machine design seems to involve replacing spinning wheels with LCD screens that display in gorgeous 3D detail… spinning wheels. The greatest advance in poker technology has been the hole-card camera, enabling more engaging television coverage.

Outside of the casino, companies like betfair and twinspires are shaking up their respective industries. Why do casinos seem to be standing still?

I’d love to see an experimental marketplace where people play and invent new gambling games, and where breakout winners move on to trials in the “big leagues”. Would it ever fly? Would gamblers bother to play, or are they by and large unimaginative creatures of habit?

P.S. Did I mention that the woblomo deadline is midnight Hawaii time?



Bem+Wom happens: The ALL-ETT wallet anecdote




Ernie told me about it. Sid and I told Lance who blogged it. Bill Gasarch read it, bought it, loved it, and blogged it again.

And so it goes for ALL-ETT, the ultimate wallet. Bem+Wom: BEtter Mousetrap + Word Of Mouth. It actually works.

It works for Google too:

[Google’s] growth has come not through TV ad campaigns, but through word of mouth from one satisfied user to another

And now, a viral restaurant.

But, beyond anecdote, continuing from the previous post, is this sort of thing worth $15 billion?

Companies with Bem benefit hugely from Wom and will happily pay for it.

And social networks are nothing if not mouths exchanging words, so it’s natural to think of some paid version of Bem+Wom as their killer app. Facebook Beacon is an innovative attempt despite the overblown backlash.

Paying mouths for words is affiliate marketing, a respectable if not Google-sized business. But turning friends (or celebrities) into salespeople induces a threshold of skepticism, as it should. Paid mouths’ faces must be awfully trustworthy, their words especially persuasive to be believed. Is it even “word of mouth” anymore?

Can Bem+Wom be monetized without mucking it up?

The social advertising puzzle

There’s no doubt that social ties have tremendous value: people find love and work largely through the people they know and the people the people they know know.

And there’s no doubt that digital representations of social ties add value. Facebook does improve people’s lives.[1]

The puzzle, and one of the key challenges facing companies like Facebook, Google, and Yahoo!, is how social media can make money. So far the evidence is that most users won’t pay directly, which leaves ideas like virtual goods, community marketplaces, app stores, and, of course, advertising. Unfortunately, although we know great ways to advertise to people searching, and decent ways to advertise to people viewing content, it’s less clear how to advertise to people communicating.

P&G’s Ted McConnell puts it bluntly:

What in heaven’s name made you think you could monetize the real estate in which somebody is breaking up with their girlfriend?

Riffing off of this quote, Wired asks the $15 billion question: Is social advertising an oxymoron?:

So, what if social media and advertising just don’t mix?

SocialMedia.com, a social advertising startup, begs to differ (hat tip to Cong Yu), reacting to the same provocative McConnell quote. Their answer:

Advertisers only pay when users volunteer to say something about the brand to their friends.

Indeed, this sort of paid version of Bem+Wom (“BEtter Mousetrap + Word Of Mouth”) — more on this in the next post — is one of the first things people think of when pondering how to monetize a social network. But can it work well and if so, how?


Three disjoint friends like Rooster Sauce. Who knew?

[1] For example, I never would have guessed that three completely disjoint friends of mine are all fans of Sriracha Rooster Sauce. Who knew?

Wall Street's version of a combinatorial market

I was poking around TD AMERITRADE and came across this description of conditional orders (login required, or look here), or sequences of orders that are synchronized in various ways:

What is a conditional order and how do I place one?

Conditional orders let you combine two or three individual orders that will, if filled, either cancel or trigger additional orders. Conditional orders are available for both stocks and single-leg option orders (in option-approved accounts).

The following types of conditional orders are available:

  • OCA (one cancels another) – submit two orders simultaneously; if one order is filled, the other is canceled.
  • OTA (one triggers another) – submit an order and if that order is filled, submit another order.
  • OTT (one triggers two) – submit an order and if that order is filled, submit two additional orders.
  • OT/OCA (one triggers an OCA order) – submit an order; if that order is filled, submit two orders simultaneously; if one of these orders is filled, cancel the other.
  • OT/OTA (one triggers an OTA order) – submit an order; if that order is filled, submit another order. If that order is filled, submit a third order.

At first glance these resemble combinatorial bids that allow traders to buy several things at once, but they’re not. They’re more like bidding agent programs that describe exactly what to do, and when, under various conditions: more complex than, but not fundamentally different from, limit orders and stop-loss orders. They can be executed without any cooperation from the exchange.
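To illustrate the bidding-agent view, here is a toy sketch in which the exchange understands only "submit" and "cancel", while all the conditional logic lives in a client-side agent. Every class and function name here is hypothetical; this is not how TD AMERITRADE or any real broker implements conditional orders.

```python
class Order:
    """A plain order plus client-side notes on what to do when it fills."""

    def __init__(self, name):
        self.name = name
        self.status = "pending"   # pending -> open -> filled or canceled
        self.then_submit = []     # orders to send once this one fills
        self.then_cancel = []     # orders to pull once this one fills


class ToyExchange:
    """Hypothetical exchange that knows nothing about conditions."""

    def submit(self, order):
        order.status = "open"

    def cancel(self, order):
        if order.status == "open":
            order.status = "canceled"


class BiddingAgent:
    """Client-side agent that reacts to fills; the exchange never sees the rules."""

    def __init__(self, exchange):
        self.exchange = exchange

    def submit(self, order):
        self.exchange.submit(order)

    def on_fill(self, order):
        order.status = "filled"
        for other in order.then_cancel:
            self.exchange.cancel(other)
        for follow_up in order.then_submit:
            self.submit(follow_up)


def one_cancels_another(a, b):
    a.then_cancel.append(b)
    b.then_cancel.append(a)


def one_triggers(first, *follow_ups):
    first.then_submit.extend(follow_ups)


# OT/OCA: an entry order triggers two exits, and either exit cancels the other.
entry = Order("entry")
take_profit = Order("take profit")
stop_loss = Order("stop loss")
one_triggers(entry, take_profit, stop_loss)
one_cancels_another(take_profit, stop_loss)

agent = BiddingAgent(ToyExchange())
agent.submit(entry)
agent.on_fill(entry)          # both exit orders are now open
agent.on_fill(take_profit)    # the stop loss gets canceled
```

All five order types in the list above reduce to combinations of the two helper functions, which is the sense in which they add convenience rather than new exchange machinery.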

This brings to light a key distinction: some forms of expressiveness can be achieved by layering increasingly complicated bidding agents on top of an existing exchange. Other types of expressiveness, for example true combinatorial bids, require new optimization routines put directly into the exchange.
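By contrast, a true combinatorial bid changes the clearing problem the exchange itself has to solve. Here is a toy, brute-force sketch of choosing the revenue-maximizing set of non-overlapping bundle bids. The bidders, items, and prices are invented, and trying every subset is only workable for a handful of bids; a real exchange would need much smarter optimization.

```python
from itertools import combinations


def best_allocation(bids):
    """Pick the highest-value set of bids whose bundles share no items.

    `bids` is a list of (bidder, bundle, price) tuples, where `bundle`
    is a set of item names. Brute force: tries every subset of bids.
    """
    best_value, best_bids = 0, ()
    for size in range(1, len(bids) + 1):
        for subset in combinations(bids, size):
            items = [item for _, bundle, _ in subset for item in bundle]
            if len(items) != len(set(items)):
                continue  # two bids want the same item: infeasible
            value = sum(price for _, _, price in subset)
            if value > best_value:
                best_value, best_bids = value, subset
    return best_value, best_bids


if __name__ == "__main__":
    bids = [
        ("A", {"x", "y"}, 10),   # A wants both items or nothing
        ("B", {"x"}, 6),
        ("C", {"y"}, 5),
    ]
    value, winners = best_allocation(bids)
    print(value, [bidder for bidder, _, _ in winners])
```

No client-side agent can reproduce this: bidder A's all-or-nothing request only means something if the exchange weighs it against B and C together.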

The distinction arises in advertising as well. In a sponsored search auction, advertisers can bid lower during the day when people tend to browse and higher in the evening when people tend to buy, and they can even write a program to do it for them automatically. However, an advertiser cannot execute a “guaranteed delivery” contract in sponsored search without changing the underlying auction mechanism.

Why should we care about the latter type of expressiveness that requires “smarter” exchange mechanisms? One word: efficiency. Economic efficiency, that is. With greater expressiveness, resources can be shuffled to align more precisely with who wants them the most. Advertising opportunities (a particular user’s attention on a particular page) can go to advertisers who value them most. Financial transactions that otherwise might go unmet can be consummated. Insurance buyers can get better coverage. And gamblers can have more fun.