<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Oddhead Blog &#187; spam</title>
	<atom:link href="http://blog.oddhead.com/category/spam/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.oddhead.com</link>
	<description>Musings of a computer scientist on predictions, odds, and markets</description>
	<lastBuildDate>Wed, 25 Jan 2012 15:12:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Recaptcha poetry: The barking</title>
		<link>http://blog.oddhead.com/2010/03/25/recaptcha-poetry-the-barking/</link>
		<comments>http://blog.oddhead.com/2010/03/25/recaptcha-poetry-the-barking/#comments</comments>
		<pubDate>Fri, 26 Mar 2010 03:13:27 +0000</pubDate>
		<dc:creator>David Pennock</dc:creator>
				<category><![CDATA[challenges]]></category>
		<category><![CDATA[fun]]></category>
		<category><![CDATA[ideas]]></category>
		<category><![CDATA[spam]]></category>
		<category><![CDATA[woblomo]]></category>

		<guid isPermaLink="false">http://blog.oddhead.com/?p=1323</guid>
		<description><![CDATA[My first attempt. I&#8217;m sure you can do better.]]></description>
			<content:encoded><![CDATA[<p>My first attempt. I&#8217;m sure you can do better.</p>
<p><br/><br />
<img src="http://blog.oddhead.com/wp-content/uploads/2010/03/recaptcha-poem-the-barking.png" alt="The barking. Reenact task. Arnold understanding a caveated consideration. Snowmen ammunition, shoutouts stumbling. A gridded courtroom, community pumping. Cussions berate" title="recaptcha-poem-the-barking" width="692" height="493" class="aligncenter size-full wp-image-1324" /></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.oddhead.com/2010/03/25/recaptcha-poetry-the-barking/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Meet the splORGers: The latest breed of web spam parasites</title>
		<link>http://blog.oddhead.com/2009/06/24/meet-the-splorgers/</link>
		<comments>http://blog.oddhead.com/2009/06/24/meet-the-splorgers/#comments</comments>
		<pubDate>Wed, 24 Jun 2009 14:16:34 +0000</pubDate>
		<dc:creator>David Pennock</dc:creator>
				<category><![CDATA[advertising]]></category>
		<category><![CDATA[economics]]></category>
		<category><![CDATA[incentives]]></category>
		<category><![CDATA[spam]]></category>

		<guid isPermaLink="false">http://blog.oddhead.com/?p=752</guid>
		<description><![CDATA[Via Muthu. This is mind boggling to me. Sparasites on the web now somehow find it worth their while to invade ultra-specialized academic conferences. Call them splORGers. (In close analogy to sploggers). The website focs2008.org appears to be the official home of the 49th Annual IEEE Symposium on Foundations of Computer Science. (In fact, it&#8217;s <a href='http://blog.oddhead.com/2009/06/24/meet-the-splorgers/'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p>Via <a href="http://mysliceofpizza.blogspot.com/2009/04/focs-for-real.html">Muthu</a>. This is mind boggling to me.</p>
<p>Sparasites on the web now somehow find it worth their while to invade ultra-specialized academic conferences. Call them splORGers. (In close analogy to <a href="http://en.wikipedia.org/wiki/Splog">sploggers</a>).</p>
<p>The website <a href="http://focs2008.org">focs2008.org</a> appears to be the official home of the 49th Annual IEEE Symposium on Foundations of Computer Science. (In fact, it&#8217;s the top result for the search <a href="http://blindsearch.fejus.com/?q=focs+2008&#038;type=web">&#8220;focs 2008&#8221; in Bing, Google, and Yahoo!</a>.) Historically a few hundred people attend to hear talks like &#8220;A Hypercontractive Inequality for Matrix-Valued Functions with Applications to Quantum Computing and LDCs&#8221;. </p>
<p>The website appears fully functional: you can browse the entire website structure including internal links like the list of accepted papers and external links like the online registration form.</p>
<p>But look more closely at the lower left corner of the front page. What do you see? SPAM KEYWORDS!: &#8220;Data Recovery Dell Memory HP Memory PC RAM wow accounts WoW gold&#8221;.</p>
<p><center><img src="http://blog.oddhead.com/wp-content/uploads/2009/06/spam-keywords-on-focs2008-org.gif" alt="spam keywords on splORG site focs2008.org" title="spam-keywords-on-focs2008-org" width="244" height="73" class="size-full wp-image-777" /></center></p>
<p>WTF??!!</p>
<p>It turns out that focs2008.org is NOT the official FOCS 2008 conference home page. Rather, it&#8217;s <a href="http://www.cs.cmu.edu/~FOCS2008/"><span style="font-family: courier new;"><strong>http://www.cs.cmu.edu/~FOCS2008/</strong></span></a>. (<a href="http://blindsearch.fejus.com/?q=focs+2008&#038;type=web">Yahoo! ranks this site in second place, Bing and Google in seventh.</a>)</p>
<p>This doesn&#8217;t seem like a zero-cost no-brainer automated attack. It involves identifying the appropriate domain name and mirroring another website, not as one-click as it sounds. There&#8217;s even a small sign of manual effort: the fox graphic in the upper left links to focs<strong>2007</strong>.org rather than 2008, as in the original. And of course there&#8217;s the cost to register and host the domain.</p>
<p>So why bother? Clearly, the perpetrator is not expecting real people to click on the spam links. At its peak, about as many people searched for <a href="http://www.google.com/trends?q=focs+2008%2C+pennock&#038;ctab=0&#038;geo=all&#038;date=all&#038;sort=0">&#8220;focs 2008&#8221; as for &#8220;pennock&#8221;</a>, and the offending links are fairly obscure. This is most certainly about siphoning <a href="http://thekeywordacademy.com/link-juice-explained/">link juice</a> from seemingly legitimate .orgs that search engines trust.</p>
<p>But can that benefit really outweigh the cost? <a href="http://blog.oddhead.com/2009/06/22/un-hacking-my-blog/">Again</a> and <a href="http://blog.oddhead.com/2009/01/14/intelligent-blog-spam/">again</a> I simply fail to grok the economics of spam.</p>
<p>SplORGers have also set up camp at focs2007.org and ioi2008.org. Curiously, focs2009.org has a more transparent yet still head-scratching disclaimer.</p>
<p>Today, I stumbled onto a similar spamfiltration on mortgagepoints.com, the first external link on the <a href="http://en.wikipedia.org/wiki/Point_(mortgage)">Wikipedia definition of mortgage points</a>, prompting me to finally write this post. Look what our ultra open web has wrought!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.oddhead.com/2009/06/24/meet-the-splorgers/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Recovering from swine&#8217;s infection (my blog, that is)</title>
		<link>http://blog.oddhead.com/2009/06/22/un-hacking-my-blog/</link>
		<comments>http://blog.oddhead.com/2009/06/22/un-hacking-my-blog/#comments</comments>
		<pubDate>Mon, 22 Jun 2009 21:43:57 +0000</pubDate>
		<dc:creator>David Pennock</dc:creator>
				<category><![CDATA[oddhead blog]]></category>
		<category><![CDATA[spam]]></category>

		<guid isPermaLink="false">http://blog.oddhead.com/?p=708</guid>
		<description><![CDATA[For the second time, a hacker (in the swine sense of the word) broke in and defaced Oddhead Blog. Once again, I&#8217;m left impressed by the ingenuity of web malefactors and entirely mystified as to their motivation. Last week several readers notified me that my rss feed on Google Reader was filled with spam (&#8220;Order <a href='http://blog.oddhead.com/2009/06/22/un-hacking-my-blog/'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.flickr.com/photos/odisie/3460906590/"><img src="http://farm4.static.flickr.com/3593/3460906590_1dc3fa7066_t.jpg" hspace="10" align="left" alt="Odd head hacker" /></a>For the <a href="http://blog.oddhead.com/2007/06/07/hacked-and-splogged-and-left-for/">second time</a>, a hacker (in the <a href="http://en.wikipedia.org/wiki/Hacker_(computer_security)">swine sense of the word</a>) broke in and defaced Oddhead Blog. <a href="http://blog.oddhead.com/2009/01/14/intelligent-blog-spam/">Once again</a>, I&#8217;m left impressed by the ingenuity of web malefactors and entirely mystified as to their motivation.</p>
<p>Last week several readers notified me that my RSS feed on Google Reader was filled with spam (&#8220;Order Emsam No RxOrder Emsam Overnight DeliveryOrder&#8230; BuyBuy&#8230;&#8221;).</p>
<p>The strange part was, the feed looked fine when accessed directly on my website or via <a href="http://www.bloglines.com/preview/http://blog.oddhead.com/feed">Bloglines</a>. Only when <em>Google</em> requested the feed did it become corrupted, thus mucking up my content inside Google Reader but not on my website.</p>
<p>(Hat tip to <a href="http://www.erisian.com.au/wordpress/">Anthony</a>, who diagnosed the ailment: calling <span style="font-family: courier new;">curl http://blog.oddhead.com/feed/</span> yielded clean output, while the same request masquerading as coming from Google, <span style="font-family: courier new;">curl -A 'Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 10 subscribers; feed-id=12312313123123)' http://blog.oddhead.com/feed/</span>, yielded the <a href="http://blog.oddhead.com/wp-content/uploads/2009/06/oddhead-blog-rss-feed-hacked-but-only-when-google-requests-it.xml">spammed-up version</a>.)</p>
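<p>The same two-request comparison can be scripted. What follows is only a minimal sketch in Python, not the method anyone actually used here: the names <span style="font-family: courier new;">fetch</span> and <span style="font-family: courier new;">is_cloaked</span> and the <span style="font-family: courier new;">fetcher</span> parameter are made up for illustration, and the User-Agent string is simply the one from the curl command above. A byte-for-byte comparison is crude and will also flag a feed that legitimately changed between the two requests.</p>

```python
import urllib.request

# The User-Agent string from the curl command quoted above.
GOOGLE_FEEDFETCHER_UA = (
    "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; "
    "10 subscribers; feed-id=12312313123123)"
)


def fetch(url, user_agent=None, timeout=10):
    """Return the response body for url, optionally sending a custom User-Agent."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def is_cloaked(url, user_agent=GOOGLE_FEEDFETCHER_UA, fetcher=fetch):
    """Return True if the body served to user_agent differs from the default body.

    fetcher is a hypothetical hook so the comparison can be tried without a network.
    """
    plain = fetcher(url)
    disguised = fetcher(url, user_agent)
    return plain != disguised
```

<p>Calling <span style="font-family: courier new;">is_cloaked("http://blog.oddhead.com/feed/")</span> would then issue both requests and report whether the two bodies differ.</p>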
<p>In the meantime, Google Search had apparently deduced that my site was compromised and categorized my blog as spam. Look at the difference between these <a href="http://www.google.com/webhp#hl=en&amp;q=oddhead+blog+%22thank+you+bangalore%22">two</a> <a href="http://www.google.com/webhp#hl=en&amp;q=site%3Aoddhead.com+%22thank+you+bangalore%22">searches</a>. Nearly every page containing the query terms, no matter how tangential, takes precedence over blog.oddhead.com in the results. <strong>[2009/06/23 Update: This is no longer the case: Apparently Google Search has <a href="http://blog.oddhead.com/2009/06/22/un-hacking-my-blog/comment-page-1/#comment-502">reconsidered my blog</a>.]</strong></p>
<p>So began a lengthy investigation to find and eradicate the invader. The offending text did not appear anywhere in my WordPress code or database. Argg. I found that my plugins directory was world-writable: uh oh. Then I found a file named remv.php in my themes directory containing a decidedly un-<a href="http://automattic.com/">automattic</a> jumble of code. Apparently this is an <a href="http://jasoncosper.com/archives/wordpress-remvphp-and-you/">especially nasty bugger</a>:</p>
<blockquote cite="http://jasoncosper.com/archives/wordpress-remvphp-and-you/"><p>I’ve never seen a hack crop up with the tenacity of “remv.php” tho.  Seriously, it’s kind of scary.</p></blockquote>
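<p>A world-writable directory like the plugins directory mentioned above is the kind of thing a short script can look for. This is a rough sketch under stated assumptions, not a substitute for a real security audit: the function name <span style="font-family: courier new;">world_writable_paths</span> is hypothetical, and it only inspects Unix-style permission bits under a directory you point it at.</p>

```python
import os
import stat


def world_writable_paths(root):
    """Return sorted paths under root (including root) with the other-write bit set."""
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(os.path.join(dirpath, name) for name in dirnames + filenames)

    flagged = []
    for path in candidates:
        mode = os.lstat(path).st_mode
        # Symbolic links usually report permissive modes of their own, so skip them.
        if stat.S_ISLNK(mode):
            continue
        if mode & stat.S_IWOTH:
            flagged.append(path)
    return sorted(flagged)
```

<p>Run against a WordPress install's wp-content directory, it would list any directory or file that every user on the server can write to.</p>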
<p>I&#8217;m still not sure how or even if an attacker used remv.php to corrupt my feed in such a subtle way. I decided on surgery by chainsaw rather than scalpel. I exported all my content into a WordPress XML file, deleted my entire installation of WordPress, reinstalled WordPress, then imported my content back in. I restored my theme and re-entered some metadata, but I still have many ongoing repairs to do, like importing my blogroll and other links.</p>
<p>The attack was clever: a virus that sickens but does not kill the patient. The disease left my web site functioning perfectly well, making the infection less likely to be noticed and harder to track down. The bizarre symptom &#8212; corrupting the RSS feed but only inside Google Reader &#8212; led <a href="http://www.midasoracle.org/">Chris</a> to wonder if the attacker knew I was a Yahoo! loyalist. That seems unlikely. I don&#8217;t think I have enemies who care that much. Also, the spammy feed appeared in Technorati as well. Almost surely I was the victim of an indiscriminate robot attack. Still, after searching around, I couldn&#8217;t find another example of exactly this form of RSS feed &#8220;selective corruption&#8221;: has anyone seen or heard of this attack, or can anyone find another instance of it? And can anyone explain <em>why</em>?</p>
<p>What did I learn? I learned to <a href="http://www.midasoracle.org/2009/02/11/upgrading-wordpress-2-7-1/">listen to Chris</a> and <a href="http://www.midasoracle.org/2009/06/15/beating-david-pennock/">not make him mad</a>. <img src='http://blog.oddhead.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>I also found a bunch of WordPress security tips, resources, and plugins that might be useful to others, including my future self:</p>
<ul>
<li><a href="http://jasoncosper.com/archives/wordpress-remvphp-and-you/">WordPress, remv.php and you</a></li>
<li><a href="http://www.dailyblogtips.com/3-must-apply-security-tips-for-wordpress/">3 must apply security tips for WordPress</a></li>
<li><a href="http://codex.wordpress.org/Hardening_WordPress">Hardening WordPress</a></li>
<li><a href="http://www.dailyblogtips.com/5-plugins-to-keep-wordpress-secure/">5 plugins to keep WordPress secure</a></li>
<li><a href="http://jaredwsmith.com/2009/02/15/anatomy-of-a-wordpress-hack/">Anatomy of a WordPress hack</a> (&#8220;The kicker? All these sites were on Dreamhost.&#8221;)</li>
<li><a href="http://ocaoimh.ie/did-your-wordpress-site-get-hacked/">Did your WordPress site get hacked?</a></li>
<li><a href="http://wiki.dreamhost.com/Troubleshooting_Hacked_Sites">DreamHost: Troubleshooting hacked sites</a></li>
<li><a href="http://www.sinosplice.com/life/archives/2009/05/30/dealing-with-a-hacker-on-dreamhost">Dealing with a hacker on DreamHost</a></li>
<li><a href="http://codex.wordpress.org/WordPress_Feeds">Docs on WordPress feeds</a></li>
<li><a href="http://www.askapache.com/htaccess/rewriterule-viewer-plugin.html">AskApache plugin to display all the internal WordPress URL rewrite rules</a> (<a href="http://www.askapache.com/htaccess/redirecting-wordpress-feeds-to-feedburner.html">example use</a>) (I couldn&#8217;t discern how to interpret the output)</li>
<li><a href="http://wordpress.org/extend/plugins/exploit-scanner/">WordPress exploit scanner plugin</a> (I didn&#8217;t use it after <a href="http://wordpress.org/support/topic/265783">this question</a> spooked me)</li>
<li><a href="http://wordpress.org/extend/plugins/secure-wordpress/">Secure WordPress plugin</a></li>
<li><a href="http://www.askapache.com/wordpress/htaccess-password-protect.html">AskApache password protect plugin</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.oddhead.com/2009/06/22/un-hacking-my-blog/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Intelligent blog spam</title>
		<link>http://blog.oddhead.com/2009/01/14/intelligent-blog-spam/</link>
		<comments>http://blog.oddhead.com/2009/01/14/intelligent-blog-spam/#comments</comments>
		<pubDate>Wed, 14 Jan 2009 22:08:19 +0000</pubDate>
		<dc:creator>David Pennock</dc:creator>
				<category><![CDATA[economics]]></category>
		<category><![CDATA[incentives]]></category>
		<category><![CDATA[oddhead blog]]></category>
		<category><![CDATA[spam]]></category>

		<guid isPermaLink="false">http://blog.oddhead.com/2009/01/14/intelligent-blog-spam/</guid>
		<description><![CDATA[As I alluded to previously, I seem to be getting &#8220;intelligent spam&#8221; on my blog: comments that pass the re-captcha test and seem on-topic, yet upon further inspection clearly constitute link spam: either the author URI or a link in the comment body is spam. Here is one of the most clear cases, received on <a href='http://blog.oddhead.com/2009/01/14/intelligent-blog-spam/'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p>As I alluded to <a href="http://blog.oddhead.com/2008/08/13/the-seedy-side-of-amazons-mechanical-turk/">previously</a>, I seem to be getting &#8220;intelligent spam&#8221; on my blog: comments that pass the <a href="http://recaptcha.net/">re-captcha test</a> and seem on-topic, yet upon further inspection clearly constitute link spam: either the author URI or a link in the comment body is spam.</p>
<p>Here is one of the most clear cases, received on January 9 as a comment to <a href="http://blog.oddhead.com/2008/05/02/a-historic-mayday-the-us-governments-call-for-help-on-regulating-prediction-markets">my post on the CFTC&#8217;s call for proposals to regulate prediction markets</a>:</p>
<blockquote><p>
Date: Fri, 9 Jan 2009 01:28:01 -0800<br />
From: Matt.Herdy<br />
New comment on your post #71 &#8220;A historic MayDay: The US<br />
government&#8217;s call for help on regulating prediction markets&#8221;<br />
Author : Matt.Herdy<br />
Comment:<br />
Thanks for that post. I’ll put a note in the post.</p>
<p>1. It’s nothing new. The CFTC will just formalize the current<br />
status quo.<br />
2. We are prisoner of the CFTC regulations and the US Congress’<br />
distaste of sports “gambling”. As for the profitability of prediction<br />
exchanges in that strict environment, I don’t see how you can deny that<br />
HedgeStreet went bankrupt even though it was well funded. Isn’t that a<br />
hard fact?<br />
3. You’re right, but all “pragmatists” should follow a business<br />
plan and make profits. See point #2. Pragmatists won’t make miracles.</p>
<p><strong>&lt;a href=&#8221;http://www.stretch-marks-help.com/&#8221;&gt;Removing stretch marks&lt;/a&gt;</strong>
</p></blockquote>
<p>At first blush, the comments seem to come from a knowledgeable person: they refer to HedgeStreet, an extremely relevant yet mostly unknown company that&#8217;s not mentioned anywhere else in the post or other comments.</p>
<p>It turns out the comments seem intelligent because they are. In fact, they&#8217;re copied word for word from <a href="http://www.midasoracle.org/2008/05/01/cftc-anouncement/">Chris Masse&#8217;s comments</a> on his own blog.</p>
<p>Chris Masse&#8217;s page has a link to my page, so it could have been discovered with a <a href="http://siteexplorer.search.yahoo.com/search?p=http%3A%2F%2Fblog.oddhead.com%2F2008%2F05%2F02%2Fa-historic-mayday-the-us-governments-call-for-help-on-regulating-prediction-markets&#038;bwm=i&#038;bwmf=u&#038;bwms=p&#038;fr=flo2&#038;fr2=seo-rd-se">&#8220;link:&#8221;</a> query to a search engine.</p>
<p>Though now I understand <em>what</em> this spammer did, I remain puzzled about exactly how they did it and especially <em>why</em>.</p>
<ol>
<li>Are these comments being inserted by <em>people</em>, perhaps hired on <a href="http://www.mturk.com/">Mechanical Turk</a> or some underground equivalent? Or are they coming from <em>robots</em> that have broken either re-captcha or the security of my blog? (<a href="http://hunch.net/">John</a> suspects a security breach.)</li>
<li>Is it really worth it economically? All links in blog comments are NOFOLLOW links anyway, and <a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-12.html">disregarded by search engines for ranking purposes</a>, so what is the point? Are they looking for actual humans to click these links?</li>
</ol>
<p>In any case, it seems an intriguing development in the spam arms race. Are other bloggers getting &#8220;intelligent spam&#8221;? Does anyone know how it&#8217;s done and why?</p>
<p><strong>Update 2010/07:</strong> Oh, the irony. I got a number of intelligent-seeming comments on this post about SEO, nofollow, economics of spam, etc. that were&#8230; promoting spammy links. I left them for humor value, though I disabled the links.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.oddhead.com/2009/01/14/intelligent-blog-spam/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>The seedy side of Amazon&#039;s Mechanical Turk</title>
		<link>http://blog.oddhead.com/2008/08/13/the-seedy-side-of-amazons-mechanical-turk/</link>
		<comments>http://blog.oddhead.com/2008/08/13/the-seedy-side-of-amazons-mechanical-turk/#comments</comments>
		<pubDate>Wed, 13 Aug 2008 19:31:32 +0000</pubDate>
		<dc:creator>David Pennock</dc:creator>
				<category><![CDATA[advertising]]></category>
		<category><![CDATA[commentary]]></category>
		<category><![CDATA[economics]]></category>
		<category><![CDATA[incentives]]></category>
		<category><![CDATA[ratings]]></category>
		<category><![CDATA[spam]]></category>

		<guid isPermaLink="false">http://blog.oddhead.com/2008/08/13/the-seedy-side-of-amazons-mechanical-turk/</guid>
		<description><![CDATA[I mostly side with Lukas and Panos on the fantastic potential of Amazon&#8217;s Mechanical Turk, a crowdsourcing service specializing in tiny payments for simple tasks that require human brainpower, like labeling images. Within the field of computer science alone, this type of service will revolutionize how empirical research is done in communities from CHI to <a href='http://blog.oddhead.com/2008/08/13/the-seedy-side-of-amazons-mechanical-turk/'>[...]</a>]]></description>
			<content:encoded><![CDATA[<p>I mostly side with <a href="http://doloreslabs.com/">Lukas</a> and <a href="http://behind-the-enemy-lines.blogspot.com/search/label/mechanical%20turk">Panos</a> on the fantastic potential of <a href="http://www.mturk.com/">Amazon&#8217;s Mechanical Turk</a>, a crowdsourcing service specializing in tiny payments for simple tasks that require human brainpower, like labeling images. Within the field of computer science alone, this type of service will revolutionize how empirical research is done in communities from <a href="http://sigchi.org/">CHI</a> to <a href="http://www.sigir.org/">SIGIR</a>, powering unprecedented speed and scale at low cost (here are <a href="http://pages.stern.nyu.edu/~panos/publications/kdd2008.pdf">two</a> <a href="http://blog.doloreslabs.com/2008/04/search-engine-relevance-an-empirical-test/">examples</a>). My guess is that the impact will be even larger in the social sciences; already, a number of folks in <a href="http://research.yahoo.com/Econ_and_Social_Sys">Yahoo&#8217;s Social Dynamics research group</a> have started running studies on mturk. (A side question is how university review boards will react.)</p>
<p>However, there is a seedier side to mturk, and I&#8217;m of two minds about it. Some people use the service to hire <a href="http://en.wikipedia.org/wiki/Sockpuppet_(Internet)">sockpuppets</a> to enter bogus ratings and reviews about their products and engage in other forms of spam. (Actually, this appears to violate mturk&#8217;s <a href="http://requester.mturk.com/mturk/help?helpPage=policies#restrictions_use_mturk">stated policies</a>.)</p>
<p>For example, Samuel Deskin is <a href="http://www.mturk.com/mturk/preview?groupId=T96ZJ1YGTVDJRYYGVRKZ">offering up to ten cents</a> to turkers willing to promote his new personalized start page samfind.</p>
<blockquote cite="http://www.mturk.com/mturk/preview?groupId=T96ZJ1YGTVDJRYYGVRKZ"><p>
EARN TEN CENTS WITH THE BONUS &#8211; EASY MONEY &#8211; JUST VOTE FOR US AND COMMENT ABOUT US</p>
<p> EARN FOUR CENTS IF YOU:</p>
<p>1. Set up an anoymous email account likke gmail or yahoo so you can register on #2 anonymously</p>
<p>2. Visit http://thesearchrace.com/signup.php   and sign up for an account &#8211; using your anonymous email account.</p>
<p>3. Visit http://www.thesearchrace.com/recent.php   and vote for:</p>
<p>samfind</p>
<p>By clcking &#8220;Pick&#8221;</p>
<p>SIX CENTS BONUS:</p>
<p>4. Visit the COMMENTS Page on The Search Race, it is the Button Right Next to &#8220;Picks&#8221; on this page: http://www.thesearchrace.com/recent.php   and</p>
<p>5. Say something awesome about samfind (http://samfind.com)  on The Search Race&#8217;s Comments page.</p>
<p>Make sure to:</p>
<p>1. Tell us that you Picked us.<br />
2. Copy and Paste the Comment you typed on The Search Race&#8217;s Comment page here so we know you wrote it and we will give you the bonus!
</p></blockquote>
<p>In fact, Deskin is currently offering bounties on mturk for <a href="http://www.mturk.com/mturk/searchbar?selectedSearchType=hitgroups&#038;searchWords=samfind&#038;minReward=0.00&#038;.x=0&#038;.y=0">a number of different spammy activities</a> to promote his site. On the other hand, what Deskin is doing is not illegal and is arguably not all that different than paying PRWEB to publish his rah-rah press release (<a href="http://news.yahoo.com/s/prweb/20080804/bs_prweb/prweb1167214_1">Start-up, samfind, Launches Customizable Startpage to Compete with Google, Yahoo &#038; MSN</a>, Los Angeles, California (PRWEB) August 4, 2008). And I have to at least give him credit for offering the money under his own name.</p>
<p>Another type of task on mturk involves taking a piece of text and paraphrasing it so that the words are different but the meaning remains the same. Here is an <a href="http://www.mturk.com/mturk/preview?groupId=FYTA32HJ2ZPZNM1ZSA10">example</a>:</p>
<blockquote cite="http://www.mturk.com/mturk/preview?groupId=FYTA32HJ2ZPZNM1ZSA10"><p>
Paraphrase This Paragraph</p>
<p>Here&#8217;s the original paragraph:</p>
<p>You&#8217;re probably wondering how to apply a wrinkle filler to your skin. The good news is that it&#8217;s easy! There are a number of different products on the market for anti aging skin care. Each one comes with its own special application instructions, which you should always make sure to read and carefully follow. In general, however, most anti aging skin care products are simply applied to the skin and left to soak in.</p>
<p>Requirements:<br />
1. Use the same writing style as much as possible.<br />
2. Vary at least 50% of the words and phrases &#8211; but keep the same concepts. Use obviously different sentences! Your paragraph should not be just a copy of the first with a few word replacements.<br />
3. Any keywords listed in bold in the above paragraph must be included in your paraphrase.<br />
4. The above paragraph contains 75 words&#8230; yours must contain at least 64 words and not more than 101 words.<br />
5. Write using American English.<br />
6. No obvious spelling or grammar mistakes. Please use a spell-checker before submitting. A free online spell checker can be found at www.spellcheck.net.</p>
<p>If you find it easier to paraphrase sentence-by-sentence, then do that. Please do not enter anything in the textbox other than your written paragraph. Thanks!
</p></blockquote>
<p>I have no direct evidence, but I imagine such a task is used to create <a href="http://en.wikipedia.org/wiki/Splog">splogs</a> (I once found what seems like such a &#8220;paraphrasing splog&#8221;), ad traps, email spam, or other plagiarized content.</p>
<p>It&#8217;s possible that paid spam is hitting my blog (either that or I&#8217;m overly paranoid). I&#8217;m beginning to receive comments that are almost surely coming from humans, both because they clearly reference the content of the post and because they pass the re-captcha test. However, the author&#8217;s URL seems to point to an <a href="http://en.wikipedia.org/wiki/Domaining">ad trap</a>. I wonder if these commenters (who are particularly hard to catch &#8212; you have to bother to click on the author URL) are paid workers of some crowdsourcing service?</p>
<p>Can and should Amazon try to filter away these kinds of dubious uses of Mechanical Turk? Or is it better to have this inevitable form of economic activity out in the open? One could argue that at least systems like mturk impose a tax on pollution and spam, something <a href="http://stiet.cms.si.umich.edu/icd/ICDforSpam">long argued as an economic force to reduce spam</a>.</p>
<p>My main objection to these activities is the lack of disclosure. Advertisements and press releases are paid for, but everyone knows it, and usually the funding source is known. However, the ratings, reviews, and paraphrased text coming out of mturk masquerade as authentic opinions and original content. I absolutely want mturk to succeed &#8212; it&#8217;s an innovative service of tremendous value, one of many to come out of Amazon recently &#8212; but I believe Amazon is risking a minor PR backlash by allowing these activities to flow through its servers and by profiting from them.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.oddhead.com/2008/08/13/the-seedy-side-of-amazons-mechanical-turk/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
	</channel>
</rss>

