Category Archives: politics

A toast to the number 303: A redemptive election night for science, and The Signal

The night of February 15, 2012, was an uncomfortable one for me. Not a natural talker, I was out of my element at a press dinner organized by Yahoo! with journalists from the New York Times, Fast Company, MIT Tech Review, Forbes, SF Chronicle, WIRED, Reuters, and several more [1]. Even worse, the reporters kept leading with, “Wow, this must be a big night for you, huh? You just called the election.”

We were there to promote The Signal, a partnership between Yahoo! Research and Yahoo! News to put a quantitative lens on the election and beyond. The Signal was our data-driven antidote to two media extremes: the pundits who commit to statements without evidence; and some journalists who, in the name of balance, commit to nothing. As MIT Tech Review billed it, The Signal would be the “mother of all political prediction engines”. We like to joke that that quote undersold us: our aim was to be the mother of all prediction engines, period. The Signal was a broad project with many moving parts, featuring predictions, social media analysis, infographics, interactives, polls, and games. Led by David “Force-of-Nature” Rothschild, myself, and Chris Wilson, the full cast included over 30 researchers, engineers, and news editors [2]. We confirmed quickly that there’s a clear thirst for numeracy in news reporting: The Signal grew in 4 months to 2 million unique users per month [3].

On that night, though, the journalists kept coming back to the Yahoo! PR hook that brought them in the door: our insanely early election “call”. At that time in February, Romney hadn’t even been nominated.

No, we didn’t call the election, we predicted the election. That may sound like the same thing but, in scientific terms, there is a world of difference. We estimated the most likely outcome (Obama would win 303 Electoral College votes, more than enough to return him to the White House) and assigned a probability to it. Of less than one. Implying a probability of more than zero of being wrong. But that nuance is hard to explain to journalists and the public, and not nearly as exciting.

Although most of our predictions were based on markets and polls, the “303” prediction was not: it was a statistical model trained on historical data of past elections, authored by economists Patrick Hummel and David Rothschild. It doesn’t even care about the identities of the candidates.

I have to give Yahoo! enormous credit. It took a lot of guts to put faith in some number-crunching eggheads in their Research division and go to press with their conclusions. On February 16, Yahoo! went further. They put the 303 prediction front and center, literally, as an “Exclusive” banner item on Yahoo.com, a place that 300 million people call home every month.

The Signal 303 prediction "Exclusive" top banner item on Yahoo.com 2012-02-16

The firestorm was immediate and monstrous. Nearly a million people read the article and almost 40,000 left comments. Writing for Yahoo! News, I had grown used to the barrage of comments and emails, some comic, irrelevant, or snarky; others hateful or alert-the-FBI scary. But nothing could prepare us for that day. Responses ranged from skeptical to utterly outraged, mostly from people who read the headline or reactions but not the article itself. How dare Yahoo! call the election this far out?! (We didn’t.) Yahoo! is a mouthpiece for Obama! (The model is transparent and published: take it for what it’s worth.) Even Yahoo! News editor Chris Suellentrop grew uncomfortable, especially with the spin from Homepage (“Has Obama won?”) and PR (see “call” versus “predict”), keeping a tighter rein on us from then on. Plenty of other outlets “got it” and reported on it for what it was – a prediction with a solid scientific basis, and a margin for error.

This morning, with Florida still undecided, Obama had secured exactly 303 Electoral College votes.

New York Times 2012 election results Big Board 2012-11-07

Just today Obama wrapped up Florida too, giving him 29 more EVs than we predicted. Still, Florida was the closest vote in the nation, and for all 50 other entities — 49 states plus Washington D.C. — we predicted the correct outcome back in February. The model was not 100% confident about every state of course, formally expecting to get 6.8 wrong, and rating Florida the most likely state to flip from red to blue. The Hummel-Rothschild model, based only on a handful of variables like approval rating and second-quarter economic trends, completely ignored everything else of note, including money, debates, bail outs, binders, third-quarter numbers, and more than 47% of all surreptitious recordings. Yet it came within 74,000 votes of sweeping the board. Think about that the next time you hear an “obvious” explanation for why Obama won (his data was biggi-er!) or why Romney failed (too much fundraising!).
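The "expecting to get 6.8 wrong" figure is nothing more exotic than a sum: if a forecast gives each race a probability that its call is right, the expected number of misses is the total of the leftover probabilities. Here is a minimal Python sketch of that arithmetic. The confidences below are made up for illustration and are not the Hummel-Rothschild model's actual state-by-state numbers; the function names are hypothetical too.

```python
# Hypothetical confidence that the predicted winner takes each race.
# Illustrative values only; NOT the model's real output.
confidence = {
    "Florida": 0.55,
    "Ohio": 0.70,
    "Virginia": 0.65,
    "Texas": 0.98,
    "California": 0.99,
}


def expected_misses(confidence):
    """Expected number of wrong calls: the sum of (1 - p) over all races."""
    return sum(1.0 - p for p in confidence.values())


def shakiest_call(confidence):
    """The race whose predicted winner the forecast is least sure about."""
    return min(confidence, key=confidence.get)


print(round(expected_misses(confidence), 2))
print(shakiest_call(confidence))
```

Run over all 51 races with the real probabilities, the same sum would give the model's own estimate of how many calls it should expect to miss.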

Kudos to Nate Silver, Simon Jackman, Drew Linzer, and Sam Wang for predicting all 51 states correctly on election eve.

As Felix Salmon said, “The dominant narrative, the day after the presidential election, is the triumph of the quants.” Mashable’s Chris Taylor remarked, “here is the absolute, undoubted winner of this election: Nate Silver and his running mate, big data.” ReadWrite declared, “This is about the triumph of machines and software over gut instinct. The age of voodoo is over.” The new news quants “bring their own data” and represent a refreshing trend in media toward accountability at least, if not total objectivity, away from rhetoric and anecdote. We need more people like them. Whether you agree or not, their kind — our kind — will proliferate.

Congrats to David, Patrick, Chris, Yahoo! News, and the entire Signal team for going out on a limb, taking significant heat for it, and correctly predicting 50 out of 51 states and an Obama victory nearly nine months prior to the election.

Footnotes

[1] Here was the day-before guest list for the February 15 Yahoo! press dinner, though one or two didn’t make it:
-  New York Times, John Markoff
-  New York Times, David Corcoran
-  Fast Company, EB Boyd
-  Forbes, Tomio Geron
-  MIT Tech Review, Tom Simonite
-  New Scientist, Jim Giles
-  Scobleizer, Robert Scoble
-  WIRED, Cade Metz
-  Bloomberg/BusinessWeek, Doug MacMillan
-  Reuters, Alexei Oreskovic
-  San Francisco Chronicle, James Temple

[2] The extended Signal cast included Kim Farrell, Kim Capps-Tanaka, Sebastien Lahaie, Miro Dudik, Patrick Hummel, Alex Jaimes, Ingemar Weber, Ana-Maria Popescu, Peter Mika, Rob Barrett, Thomas Kelly, Chris Suellentrop, Hillary Frey, EJ Lao, Steve Enders, Grant Wong, Paula McMahon, Shirish Anand, Laura Davis, Mridul Muralidharan, Navneet Nair, Arun Kumar, Shrikant Naidu, and Sudar Muthu.

[3] Although I continue to be amazed at how much greener the grass is at Microsoft compared to Yahoo!, my one significant regret is not being able to see The Signal project through to its natural conclusion. Although The Signal blog was by no means the sole product of the project, it was certainly the hub. In the end, I wrote 22 articles and David Rothschild at least three times that many.

Raise your WiseQ to the 57th power

One of the few aspects of my job I enjoy more than designing a new market is actually building it. Turning some wild concept that sprang from the minds of a bunch of scientists into a working artifact is a huge rush, and I can only smile as people from around the world commence tinkering with the thing, often in ways I never expected. The “build it” phase of a research project, besides being a ton of fun, inevitably sheds important light back on the original design in a virtuous cycle.

In that vein, I am thrilled to announce the beta launch of PredictWiseQ, a fully operational example of our latest combinatorial prediction market design: “A tractable combinatorial market maker using constraint generation”, published in the 2012 ACM Conference on Electronic Commerce.

You read the paper.1  Now play the game.2 Help us close the loop.

PredictWiseQ Make-a-Prediction screenshot October 2012

PredictWiseQ is our greedy attempt to scarf up as much information as is humanly possible and use it, wisely, to forecast nearly every possible detail about the upcoming US presidential election. For example, we can project how likely it is that Romney will win Colorado but lose the election (6.2%), or that the same party will win both Ohio and Pennsylvania (77.6%), or that Obama will paint a path of blue from Canada to Mexico (99.5%). But don’t just window shop, go ahead and customize and buy a prediction or ten for yourself. Your actions help inform the odds of your own predictions and, crucially, thousands of other related predictions at the same time.

For example, a bet on Obama to win both Ohio and Florida can automatically raise his odds of winning Ohio alone. That’s because our market maker knows and enforces the fact that Obama winning OH and FL can never be more likely than him winning OH. After every trade, we find and fix thousands of these logical inconsistencies. In other words, our market maker identifies and cleans up arbitrage wherever it finds it. But there’s a limit to how fastidious our market maker can be. It’s effectively impossible to rid the system of all arbitrage: doing so is NP-hard, or computationally intractable. So we clean up a good bit of arbitrage, but there should be plenty left.
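To make the Ohio-and-Florida example concrete, here is a toy Python sketch of a single consistency rule: the price of "A and B" may never exceed the price of "A". It is not the constraint-generation algorithm from the paper, just one hand-picked constraint with one simple repair rule, and the prices and function names are hypothetical.

```python
from itertools import product


def worst_case_profit(p_both, p_single):
    """Sell one share of 'A and B' at p_both and buy one share of 'A' at p_single.

    Returns the minimum profit over all four outcomes of (A, B).
    A positive result means risk-free money, i.e. arbitrage.
    """
    upfront = p_both - p_single
    profits = []
    for a, b in product([False, True], repeat=2):
        # The 'A' share pays 1 if A happens; we owe 1 if both A and B happen.
        payoff = (1 if a else 0) - (1 if (a and b) else 0)
        profits.append(upfront + payoff)
    return min(profits)


def fix_conjunction_price(p_both, p_single):
    """Raise the single-event price, if needed, so that P(A and B) <= P(A).

    One simple repair rule chosen for illustration; a real market maker
    could adjust many related prices at once.
    """
    return p_both, max(p_both, p_single)


# Hypothetical prices right after someone buys "Obama wins OH and FL".
p_oh_and_fl, p_oh = 0.62, 0.60
print(worst_case_profit(p_oh_and_fl, p_oh) > 0)   # an arbitrage exists

p_oh_and_fl, p_oh = fix_conjunction_price(p_oh_and_fl, p_oh)
print(p_oh)                                        # Ohio alone rises to match
print(worst_case_profit(p_oh_and_fl, p_oh) > 0)   # the free lunch is gone
```

Checking one pair of prices is easy. The hard part is that the number of related pairs, triples, and larger combinations explodes, which is why cleaning up every inconsistency is out of reach.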

So here’s a reader’s challenge: try to identify arbitrage on PredictWiseQ that we did not. Go ahead and profit from it and, when you’re ready, please let me and others know about it in the comments. I’ll award kudos to the reader who finds the simplest arbitrage.

Why not leave all of the arbitrage for our traders to profit from themselves? That’s what nearly every other market does, from Ireland-based Intrade, to Las Vegas bookmakers, to the Chicago Board Options Exchange. The reason is, we’re operating a prediction market. Our goal is to elicit information. Even a completely uninformed trader can profit from arbitrage via a mechanical plug-and-chug process. We should reserve the spoils for people who provide good information, not those armed (solely) with fast or clever algorithms. Moreover, we want every little crumb of information that we get, in whatever form we get it, to immediately impact as many of the thousands or millions of predictions that it relates to as possible. We don’t want to wait around for traders to perform this propagation on their own and, besides, it’s a waste of their brain cells: it’s a job much better suited for a computer anyway.

Intrade offers an impressive array of predictions about the election, including who will win in all fifty states. In a sense, PredictWiseQ is Intrade to the 57th power. In a combinatorial market, a prediction can be any (Boolean) function of the state outcomes, an ungodly degree of flexibility. Let’s do some counting. In the election, there are actually 57 “states”: 48 winner-takes-all states, Washington DC, and two proportional states (Nebraska and Maine) that can split their electoral votes in 5 and 3 unique ways, respectively. Ignoring independent candidates, all 57 base “states” can end up colored Democratic blue or Republican red. So that’s 2 to the power 57, or 144 quadrillion possible maps that newscasters might show us after the votes are tallied on November 6th. A prediction, like “Romney wins Ohio”, is the set of all outcomes where the prediction is true, in this case all 72 quadrillion maps where Ohio is red. The number of possible predictions is the number of sets of outcomes, or 2 to the power 144 quadrillion. That’s more than a googol, though less than a googolplex (maybe next year). To get a sense of how big that is, if today’s fastest supercomputer started counting at the instant of the big bang, it still wouldn’t be anywhere close to reaching a googol yet.
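For anyone who wants to check the counting, here is a short Python sketch. The number of predictions is far too large to write out, so the code compares base-10 logarithms instead. The supercomputer speed (10^16 counts per second) and the age of the universe (about 13.8 billion years) are round-number assumptions of this sketch, not figures from the paper.

```python
import math

# 48 winner-takes-all states + DC + Nebraska (5) + Maine (3), as counted above.
BASE_STATES = 48 + 1 + 5 + 3
maps = 2 ** BASE_STATES          # every base "state" is either red or blue
ohio_red = maps // 2             # maps in which one given state is red

# Number of predictions = 2**maps. Work with its base-10 logarithm.
log10_predictions = maps * math.log10(2)

GOOGOL_LOG10 = 100               # googol = 10**100
GOOGOLPLEX_LOG10 = 10 ** 100     # googolplex = 10**googol

print(f"{maps:,} possible maps")
print(f"{ohio_red:,} maps with Ohio red")
print("more than a googol:", log10_predictions > GOOGOL_LOG10)
print("less than a googolplex:", log10_predictions < GOOGOLPLEX_LOG10)

# Counting since the big bang, under the assumed speed and age.
COUNTS_PER_SECOND = 1e16                         # assumption
seconds_since_big_bang = 13.8e9 * 365.25 * 24 * 3600
counted_log10 = math.log10(COUNTS_PER_SECOND * seconds_since_big_bang)
print("digits counted so far: about", round(counted_log10))
```

Under those assumptions the counter would have reached a number with a few dozen digits, nowhere near the 101 digits of a googol.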

Create your own league to compare your political WiseQ among friends. If you tell us how much each player is in for, we’ll tell you how to divvy things up at the end. Or join the “Friends Of Dave” (FOD) league. If you finish ahead of me in my league, I’ll buy you a beer (or beverage of your choice) the next time I see you, or I’ll PayPal you $5 if we don’t cross paths.

PredictWiseQ is part of PredictWise, a fascinating startup of its own. Founded by my colleague David Rothschild, PredictWise is the place to go for thousands of accurate, real-time predictions on politics, sports, finance, and entertainment, aggregated and curated from around the web. The PredictWiseQ Game is a joint effort among David, Miro, Sebastien, Clinton, and myself.

The academic paper that PredictWiseQ is based on is one of my favorites — owed in large part to my coauthors Miro and Sebastien, two incredible sciengineers. As is often the case, the theory looks bulletproof on paper. But I’ve learned the hard way many times that you don’t really know if a design is good until you try it. Or more accurately, until you build it and let a crowd of other people try it.

So, dear crowd, please try it! Bang on it. Break it. (Though please tell me how you did, so we might fix it.) Tell me what you like and what is horribly wrong. Mostly, have fun playing a market that I believe represents the future of markets in the post-CDA era, a.k.a. the digital age.

__________
1 Or not.
2 Or not.

The key to understanding net neutrality: Anonymity=good, egalitarianism=bad

For a long time I was terribly confused and conflicted about net neutrality (and embarrassed about being uncommitted on such a core issue in my industry). On the one hand, paying more for higher quality of service is only natural and leads to better provisioning of resources and less waste. HD movie watchers can pay for low latency streaming while email users need not. Treating their packets the same is madness, even worse legislating it so. On the other hand, many people I respect including economically literate ones vociferously argue for net neutrality. And Comcast “shaping” Skype traffic scores an 88 on the Ticketmaster scale of evil.

The key to understanding this debate is recognizing the difference between anonymity and egalitarianism. A mechanism is anonymous if the outcome does not depend on the identity of the players: two players who bid the same are treated equally. It doesn’t matter what their name, age, or wealth is, what company they represent, or how they plan to use the item — all that matters is what they bid. This is a good property for almost any public marketplace that ensures fair treatment, and one worth fighting for on the Internet. AppleT&T should not block Google Voice just because it’s a threat. In fact, even without legislation, it’s almost impossible to bar anonymous participation on the Internet. Service providers can, if forced to, encrypt their packets and hide their content, origin, and purpose, making them indistinguishable from others.

However, no one would argue that everyone in a marketplace should receive identical resources. Players who bid more can and must be distinguished (for example, by winning more items) from players who bid less. So, while it’s wrong to discriminate based on identity, it’s absolutely essential to discriminate based on willingness to pay. That is the difference between an egalitarian lottery (silly) and an anonymous marketplace (good).
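The distinction can be sketched in a few lines of Python. This is a toy allocation rule with hypothetical players and bids, not a model of any real network or auction: it hands out a fixed number of items to the highest bids, looks only at the bid amounts, and so gives the same result to the same bids no matter whose name is attached.

```python
def allocate(bids, items):
    """Return the names of the players holding the `items` highest bids.

    `bids` maps a player label to a bid amount. The labels are used only to
    report who won; the ranking depends on the amounts alone. (Ties are not
    handled specially in this sketch, so the example avoids them.)
    """
    ranked = sorted(bids, key=bids.get, reverse=True)
    return ranked[:items]


def winning_bids(bids, items):
    """The bid amounts that win, with the player labels stripped off."""
    return [bids[name] for name in allocate(bids, items)]


# Hypothetical bids, then the same amounts with two labels swapped.
original = {"telecom": 9, "startup": 5, "hobbyist": 2}
relabeled = {"hobbyist": 9, "startup": 5, "telecom": 2}

print(allocate(original, 2))      # the two highest bidders win
print(allocate(relabeled, 2))     # different names, same winning amounts
print(winning_bids(original, 2) == winning_bids(relabeled, 2))
```

The rule is anonymous because relabeling the players never changes which bids win. It is not egalitarian, because a higher bid still beats a lower one.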

Somehow the net neutrality debate has confounded these two issues. I agree that any Internet constitution should include that all packets are equal regardless of their creator or purpose (charging $30 for “unlimited” data and in addition 30 cents per 160-char text message scores 72 on the Ticketmaster index). However, users or services who are willing to pay for it can and should receive higher quality. To do otherwise virtually guarantees wasting resources.

Update 2009/08/27: Mark Cuban (as always) says it well. [Via Tom Murphy]

2 weeks, 2 geeks: My two new fearless leaders

Well, geeks are certainly inheriting my earth.

On January 13, my company named Carol Bartz, a self-avowed math nerd and former punch-card carrying member of her college computer club, as its CEO. In her own words:

I was a real nerd. I love, love, love, love math. Back in the late ’60s, math meant being a teacher if you were a woman. I wasn’t interested in teaching. Then I took my first computer course. It was crazy. It was like math, only more fun. I switched to computer science.

Exactly one week later, on January 20, my country turned over executive control to Barack Obama, a CrackBerry addicted comic book geek. In his inauguration speech, Obama vowed to “restore science to its rightful place”, “wield technology’s wonders”, and even addressed “non-believers” — wording that in any sane universe should be entirely unremarkable, yet in ours appears to represent an unprecedented milestone.

I can’t recall a two-week span filled with so much geek pride and cautious optimism.

Back to the Carol Bartz quote. Reading it brings a smile to my face. It also reminds me of my mom, who, convinced it was her only option, taught middle school for a few years before returning to medical school to pursue her passion, enjoying a successful career as one of the first women radiologists.

I highly recommend Bartz’s essay, which mixes biography with prescience and insight. Bartz describes how technology and the Internet are transforming collaboration and improving productivity, at the same time ushering in an era of information overload, email bankruptcy, and misuse of the extra time technology affords. Remarkably, she wrote about these things in 1997!

It’s amazing to think how things have changed since 1997. My own first web experience, courtesy Mosaic, came in 1994, the same year Yahoo! was founded. In 1996, PayPal predecessor and public company First Virtual wrote their own keystroke-sniffing malware as a stunt to bolster their urgent call to “NEVER TYPE YOUR CREDIT CARD NUMBER INTO A COMPUTER”. eBay was founded in 1995, PayPal in 1998. In 1997, Friendster had neither come nor gone, and Facebook CEO Mark Zuckerberg was 13.

Yet Bartz’s words seem more relevant than ever today.

Find where your polling place isn’t

Just in time for Election Day Tuesday November 4, 2008, here is an extremely un-useful mapping service to help you find exactly where not to go on election day in order to cast your vote.

Click here to find where your polling place isn’t for this election

For example, here is precisely where I would not go to vote if I lived where I work, which I don’t:

Map Where Dave's Polling Place is Not


Ok, what’s the point of this you ask?

Well, first, there is little point — it’s mostly a joke.

Beyond that, it’s meant as a satirical commentary on the inability of computers to understand satirical commentary.

Search engine algorithms and search advertising algorithms can’t distinguish well between “polling place is” and “polling place is not”.
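As a toy illustration of why, here is a Python sketch of a naive keyword matcher that lowercases text and throws away common "stopwords" before comparing. It is a deliberately crude stand-in, not a description of how any real search or ad engine works, and the stopword list is my own assumption.

```python
# Hypothetical stopword list; real systems differ.
STOPWORDS = {"is", "not", "isn't", "your", "where", "the", "to", "find"}


def keywords(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    words = text.lower().replace("’", "'").split()
    return {word for word in words if word not in STOPWORDS}


is_page = keywords("Find where your polling place is")
is_not_page = keywords("Find where your polling place isn't")

print(is_page)
print(is_page == is_not_page)
```

Once the negation is filtered out as noise, the two phrases reduce to the same bag of keywords, and the matcher has no way to tell them apart.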

Enough googlebombing and I’d wager the above link could rise in the ranks for search queries like polling place.

Enough money and a griefer serious about policing the Internet’s un-seriousness could advertise the link to people searching for their polling place in battleground zip codes, keeping the ad text perfectly factual with a few well placed negations, bypassing human editors at least for a few crucial hours.

In a way, it’s a thought experiment into our future as robots replace humans in the workforce, in this case librarians and editors.

The site is not meant to fool people, even foolish people, only computers.

Not the lesser of two evils

Every election, many voters rationalize their choice as the “lesser of two evils”.1

However, for me, this year’s election is not about the lesser of evils.

In fact, for the first time I can remember, I actually like both major candidates in the US Presidential election.

I like Obama more and I voted for him2 — I think he’s smarter, inspires optimism, and has better policies and people surrounding him. But I like many aspects of McCain including how he denounced Pat Robertson, Jerry Falwell, and the extreme religious right they represent.3

If the party of less government could ever manage to stop legislating morality, I might actually consider voting for them. By the opposite logic, I imagine some evangelicals actually hope that Obama wins, thus strengthening their argument that Republicans can’t win without them.

On a related note, I received an email chain letter from a Snopes-averse source4 warning that McCain’s campaign is sending out erroneous absentee ballot applications to Obama supporters in an attempt to disqualify voters. Initially I dismissed it as conspiracy theory. Then, a few days ago, I received an absentee ballot application in the mail myself, even though I had just finished voting! For a moment, I thought I was a target of the scam with the evidence right in my hand. I could feel the blograge composing in my head.

So I investigated. (Read: conducted a few web searches.)

The Wisconsin State Journal (in)concludes that McCain either meddled or messed up, with benefit of the doubt going to the latter. Blackboxvoting.com (not affiliated with Bev Harris’s more cited blackboxvoting.org) paints a picture of more widespread fraud and malicious intent.

And it seems that the application I received was a legitimate and well intentioned mailing from the League of Conservation Voters Education Fund, a left-leaning environmental organization. The application’s return address had one line missing and an incorrect zip code by one digit, but the address was “correct” in the sense that it would almost surely end up at the right place, so I believe this was not part of any intentional plot to mislead.5 Still, the whole ordeal got me thinking that perhaps all unsolicited applications for absentee ballots should be outlawed — there’s just too much room for error, both malicious and inadvertent.

1 Likely more a testament to the effectiveness of attack ads than anything else, and one of the many maddening features of a duop-racy.
2 If the choice had been between Clinton and McCain, I think I would have had a harder decision.
3 I also like the fact that he defied Bush on torture and held firm on the Iraq surge, a strategy that seems to have helped, despite the political consequences. On the other hand, I cringe at the thought of a President Palin, an outcome with a better than 1 in 7 chance of happening if elected, according to one estimate.
4 My mom! 🙂
5 You decide: The return address on the application is: Middlesex County Clerk, PO Box 1110, New Brunswick, NJ 08903-1110. The correct address is County Clerk, Hon. Elaine Flynn, P.O. Box 1110, 75 Bayard Street, 4th Floor, New Brunswick, NJ 08901-1110.

New Yahoo! News election dashboard

Cross-posted on midasoracle.org

The Yahoo! News Political Dashboard has re-launched for the general election stretch run of the 2008 US Presidential election.

Yahoo! News political dashboard for the 2008 US general Presidential election

From the main map you can see the status of the election in every state according to either polls or Intrade prediction market odds. Hover your mouse over a state to see current numbers or click on a state to see historical trends. On the side, you can see search trends, blogs, news stories, and demographic breakdowns at national and state levels.

You can also “create your own scenario” by picking who will win in every state. You can save and share your prediction and compare against markets, polls, history, or celebrities. More on ycorpblog.

In the markets view, states are colored either bright red or bright blue, regardless of how close the race is in that state. To see a visualization that blends colors to reflect the tightness of the race, see electoralmarkets.com.
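One simple way to blend colors by closeness is to interpolate linearly between pure red and pure blue according to the market probability. The Python sketch below shows that idea only. I am not claiming it is how electoralmarkets.com computes its colors, and the function names are hypothetical.

```python
def blend_color(p_dem):
    """Map a Democratic win probability in [0, 1] to an (r, g, b) tuple.

    0.0 gives pure red, 1.0 gives pure blue, and a toss-up lands on purple.
    """
    if not 0.0 <= p_dem <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    red = round(255 * (1.0 - p_dem))
    blue = round(255 * p_dem)
    return (red, 0, blue)


def to_hex(rgb):
    """Format an (r, g, b) tuple as a CSS-style hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


for p in (0.0, 0.5, 1.0):
    print(p, to_hex(blend_color(p)))
```

A bright-red-or-bright-blue map is the special case where the probability is rounded to 0 or 1 before coloring.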

Yahoo! News also offers a candidate badge that you can display on your blog declaring your choice. The badge features national-level polls, prediction markets, search buzz, and money raised.

Pipes dream

If you haven’t played around with Yahoo! Pipes, I highly recommend it. It’s a usable and useful service that brings web mashups to the masses, making this favorite hacker pastime as easy as dragging objects around on the screen.

For example, it took me probably about ten minutes as a first-time user to create a map mashup showing Barack Obama’s upcoming campaign stops. I “piped” the output of Washington Post’s RSS feed to a location-extractor module that identifies and geo-codes place names and renders them on a map. Here’s a screenshot of the output:

Screen shot of Yahoo! Pipe: Barack Obama 2008 US Presidential Election Campaign Travel Map

The easiest way to get started is to find an existing Pipe, clone it, and modify it as your own. Using this feature, I cloned my Obama map and in about one minute had a McCain map too.

Pipes uses a visual programming interface. The idea of “programming by picture” (I recall playing with one in the 1980s) never took hold as a mainstream tool. However, as a metaphor for mashups, where the goal is to chain together a number of sources and services, the visual approach seems exactly right. The implementation in a browser is a feat of ajaxian magic that I still find remarkable, even as Yahoo! and others are commoditizing the art. I imagine that even non-programmers should have little trouble constructing their own Pipes. Here is a screenshot of the source “code” for my Obama map:

Source code of Yahoo! Pipe: Barack Obama 2008 US Presidential Election Campaign Travel Map

Pipes has dozens of useful modules, including user input, Yahoo! Search, Flickr, and regular expressions.
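For readers who think better in text than in boxes and wires, here is a rough Python analogy of the campaign-map Pipe: read a feed, pull out place names, attach coordinates. It is only an analogy. It does not use any Pipes API, the feed contents are invented sample headlines, and the tiny lookup table stands in for a real location-extractor module.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; a real Pipe would fetch a live RSS URL instead.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Obama to hold rally in Columbus, Ohio</title></item>
<item><title>Campaign stop planned for Denver</title></item>
<item><title>Obama releases new economic plan</title></item>
</channel></rss>"""

# Toy gazetteer standing in for a geocoder: place name -> (lat, lon).
GAZETTEER = {
    "Columbus": (39.96, -83.00),
    "Denver": (39.74, -104.99),
}


def fetch_titles(rss_text):
    """Stage 1: parse the feed and return the item titles."""
    root = ET.fromstring(rss_text)
    return [item.findtext("title") for item in root.iter("item")]


def extract_locations(titles):
    """Stage 2: keep titles that mention a known place and attach coordinates."""
    points = []
    for title in titles:
        for place, (lat, lon) in GAZETTEER.items():
            if place in title:
                points.append(
                    {"title": title, "place": place, "lat": lat, "lon": lon}
                )
    return points


# "Pipe" the output of one stage into the next.
points = extract_locations(fetch_titles(SAMPLE_RSS))
for point in points:
    print(point["place"], point["lat"], point["lon"])
```

Cloning the pipe for another candidate amounts to swapping the feed and leaving the rest alone, which is why the McCain map took about a minute.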

You can embed the Pipe on your own website with a single line of JavaScript. I did this with my Obama and McCain campaign travel maps here. Or you can grab the output as an XML feed to use however you wish.

Pipes allows you to create human-readable URLs (e.g., http://pipes.yahoo.com/oddhead/obamatravelmap), a nice touch.

The icing on the cake for me is how Pipes — unlike so many other web sites, including some on Yahoo! — treats me and my Opera browser like adults:

Yahoo! Pipes treats me and my Opera browser like adults

(BTW, Pipes seems to work fine on Opera).

Unfortunately, Daniel Raffel, one of the key founders of Yahoo! Pipes, left Yahoo!. However, the team seems to be strong and continues to innovate, so I’m hopeful this fantastic service will continue to improve and thrive.