My blog has been hacked yet again. For those keeping track, that’s infection number three. This latest exploit is very similar to the previous one. To humans arriving via browser (e.g., me), the site appears perfectly normal and healthy. Even upon clicking ‘view source’, nothing untoward is revealed. The <title> of my blog is, as always, Oddhead Blog.
However, when Google’s or Bing’s crawlers arrive to index my corner of the web, they see a different <title> altogether, Buy Cheap Cialis Online, and immediately roll their eyes. (Actually, even if you run 'curl http://blog.oddhead.com', you’ll see the spam keywords.) The effect of the attack is a kind of reverse cloaking. Cloaking is the black-hat SEO practice of serving legitimate content to crawlers and spam content to people. Here, the spam content is shown to the crawlers and the legitimate content to the people.
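The browser-versus-curl difference described above can be checked mechanically. Here is a minimal sketch in Python that requests the same URL under several User-Agent headers and compares the <title> each one gets back. The User-Agent strings and the helper names (extract_title, titles_by_agent, looks_cloaked) are my own illustrative assumptions, not anything taken from the infected site. It also only catches cloaking keyed on the User-Agent header; an exploit that keys on the crawler's IP address would pass this check unnoticed.

```python
import re
import urllib.request

# Illustrative User-Agent strings (assumptions; real strings vary by version).
USER_AGENTS = {
    "browser": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
               "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "crawler": "Mozilla/5.0 (compatible; Googlebot/2.1; "
               "+http://www.google.com/bot.html)",
    "curl": "curl/8.0.1",
}

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text of the first <title> element, or None if there is none."""
    match = TITLE_RE.search(html)
    if match is None:
        return None
    # Collapse runs of whitespace so formatting differences do not matter.
    return " ".join(match.group(1).split())


def fetch(url, user_agent, timeout=10):
    """Download a page while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def titles_by_agent(url, fetcher=fetch, agents=USER_AGENTS):
    """Map each agent label to the <title> the server returned for it."""
    return {label: extract_title(fetcher(url, ua)) for label, ua in agents.items()}


def looks_cloaked(titles):
    """True when at least two agents were served different titles."""
    return len(set(titles.values())) > 1
```

Calling looks_cloaked(titles_by_agent("http://blog.oddhead.com")) performs three live requests and reports whether the titles disagree. Passing a different fetcher makes it possible to try the comparison logic without touching the network.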
Once the crawlers report this appalling information back to their respective mother ships, the search engines have no choice but to delist and demote my blog in their pagerankings. Right now, if you search for or within Oddhead Blog on Google, you’ll see how poorly the bots in Mountain View think of me:
You can hardly find any deep links into my blog by searching Google. For example, try searching for Bem+Wom, my invented term for “BEtter Mousetrap, Word of Mouth”. Even try “Bem+Wom oddhead blog”. You’ll find aggregators republishing my content, but no links to the original source, my blog, anywhere in sight. (Note to self: the Bing results for Bem+Wom are awful.)
Once again I am at a loss to understand my attacker’s motivation. Clearly it’s not to sell Cialis to my users, as they remain blissfully ignorant of any changes. The only benefit to anyone is to remove one relatively obscure blog from the search engine rankings and thus to move the attacker one slot up. Having a blog tangentially about gambling probably puts me into a shady neighborhood of the web, yet reverse-cloaking your competition (even if it can be somewhat automated and strike more than one competitor) seems like an awfully indirect way to improve one’s standing in Google. It’s also possible this is an act of pure vandalism.
So what should I do? Although I partly blame WordPress for writing insecure software, I may end up paying WordPress protection money to make this problem go away. I am seriously considering giving up on self-hosting and moving my whole operation to wordpress.com’s hosted service, where presumably security is tighter, or at least it’s not my responsibility any more. My web hosting service, DreamHost, may also be partly to blame, yet I like the company and have been quite happy with them in many respects. Any advice, dear reader? WordPress.com? Blogger? Try again and hope the fourth time is the charm? Should I be looking to ditch DreamHost as well?
My guess is that Dreamhost is the problem, just based on the fact that my dreamhost account was hacked recently but I haven’t had these problems with my wordpress installation. (Though holy crap with break-ins like the one you just described I wouldn’t be likely to notice if I did!)
I do really like the idea of outsourcing the hosting of wordpress. I’m thinking of doing that and have decided the best option is WPengine.com based mostly on my admiration for the founder, Jason Cohen.
Thanks Daniel. WPengine looks incredible though it may be a bit more than I need.
Huh, what a strange world SEO is. I’d wager that you’re right about the gambling connection being what got you targeted (probably automatically). The key, I think, is that even if an orchestrated attack like this helps the perpetrator go up 2 spaces, while traditional black hat SEO would have netted them 20, it’s a low risk tactic. Since nobody knows which site is responsible, they can’t be punished. Real fly-by-night sites might be fine with having a good ranking for a week, selling their scam, then having the site delisted and starting over. But a legitimate gambling site may have spent years advertising and building up a customer base, and would have to pay a hefty cost getting caught cheating and losing search engine largess.
Thanks Paul. ‘Strange’ may not be the word I was looking for. Interesting, that explanation does make sense: very insightful.
I filed a bug with Google search; hopefully they will fix the problem soon. I am curious how this is done. If they’ve actually hacked into the server, they could replace the contents with their own content, which seems more profitable for them. Why have they chosen to only report the wrong contents to the search engine, and not to the viewer? Could it be that they are taking advantage of an insecurity in the connection between the search engine and the server?
I like that somehow I’ve managed to +1 your Cialis ad. 🙂
Are you registered with Google’s Webmaster tools? If not, that should hopefully help identify and potentially pre-warn you on hacking attacks. Also, how good are you at keeping your WordPress installation up-to-date? I believe that can help quite a bit.
And here are more things to check out or consider:
After you get started with Google Webmaster Tools, see this: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=158587 and then fix it like this: http://googlewebmastercentral.blogspot.com/2011/08/submit-urls-to-google-with-fetch-as.html
Here is someone else that was hacked in what sounds like a similar manner: http://intertwingly.net/blog/2012/04/02/Hacked
Finally, I saw this quote from a colleague:
“I’ve helped two friends disinfect their WP sites recently. In both cases the infected content was only revealing itself to Google IP addresses; to everyone else the site looked fine.
In one case the infected content was in the mysql database, and two seemingly innocuous modifications of the core WP php scripts were calling the infected content. In another case, the theme they were using was corrupted.”
Hopefully some of that helps.
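The tampering with core WP php scripts mentioned in the quote above is the kind of change a checksum comparison can surface. Here is a rough sketch in Python that hashes every file in a known-clean copy (for example, a fresh download of the same WordPress version, which is an assumption about what you have on hand) and in the live installation, then lists the differences. The function names and directory layout are illustrative only. It will not find anything hidden in the mysql database, and legitimately installed themes, plugins, and uploads will show up as "added" and need to be reviewed by hand.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_trees(clean_root, live_root):
    """Report files that differ between a clean copy and the live install."""
    clean = tree_digests(clean_root)
    live = tree_digests(live_root)
    return {
        # Present in both trees but with different contents.
        "modified": sorted(n for n in clean if n in live and clean[n] != live[n]),
        # Present only in the live install.
        "added": sorted(n for n in live if n not in clean),
        # Present only in the clean copy.
        "missing": sorted(n for n in clean if n not in live),
    }
```

Anything listed under "modified" is a core file whose contents no longer match the clean copy, which is where I would start looking.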
@Mohammad, replacing the site with their own contents entails risk of getting punished. And replacing it for humans means it will be noticed and fixed immediately.
Thanks Mohammad. I appreciate it very much. I don’t think this is Google’s responsibility though and there may not be much they can do about it without having their crawlers pretend to be browser agents, which is probably against their policy. Once I fix the problem I’ll resubmit my site as Jed suggests. Thanks a lot Jed: that info is very useful. Thanks Dan. Yes, it seems like the goal here is to nuke a competitor without them knowing it.
Yipes, am I out of my depth on this one! I surely think, however, that this is a personal and targeted attack rather than a result of some bot trying to suppress the competition for those who might search for gambling related terms. Remember that most would-be gamblers would search for more specific terms such as slot machine or casino rather than a general term such as gambling. That bot might have found you but it surely would not consider you to be the enemy. I would say your site was targeted by a human being.
I’ve long noticed some odd searches by spam sites supposedly looking at specialized blogs of no interest to spammers and I’ve wondered what was going on. Why would spam sites search obscure specialized blogs for postings that were obsolete and of little current interest? If it is a spammer then they may be stealing your content for their own cloaking purposes but poisoning the search results to down-rank your site.
As to solutions, I assume more expensive hosting companies have better security checks and perform them more frequently than you care to check your own blog’s code integrity. The trouble is that you will get into a situation similar to that involving bugging devices and the phone company. All the phone company ever says is “we have checked your line and it is free of unauthorized equipment”. They do not report that they found a device and removed it. So more experienced security personnel does not mean more helpful and informative security specialists even if you are paying more for your hosting.
Good luck.
I had similar problems with my WordPress blog and before that one that ran on Pivot. My conclusion was that if your pages are being served up dynamically you are open to attacks like these.
These days I run my blog (and the rest of my site) using Jekyll. This is just a script that converts a bunch of markdown files (one for each blog post or page) into static HTML files. I then sync these with my web host and that server just serves the static HTML – no databases, no PHP, no exploits. Comments are handled by a bit of JavaScript in each page that uses Disqus to manage discussions.
It’s not a solution for everyone as it is certainly not as straightforward to post a new article, but it drastically reduces the chance of having your blog hijacked.
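To make the static-site idea concrete, here is a toy sketch in Python of the general approach: read one plain-text file per post, write one HTML file per post, and let the web server do nothing but hand out files. This is not Jekyll and does not parse markdown; the input format (first line is the title, blank lines separate paragraphs), the .txt extension, and the function names are all assumptions made for illustration.

```python
import html
import re
from pathlib import Path

PAGE = (
    "<!DOCTYPE html>\n"
    '<html><head><meta charset="utf-8"><title>{title}</title></head>\n'
    "<body>\n<h1>{title}</h1>\n{body}\n</body></html>\n"
)


def render_post(text):
    """Render one post: first line is the title, blank lines split paragraphs."""
    title, _, rest = text.strip().partition("\n")
    blocks = re.split(r"\n\s*\n", rest)
    paragraphs = [" ".join(block.split()) for block in blocks if block.strip()]
    # Escape everything so post text can never inject markup into the page.
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return PAGE.format(title=html.escape(title.strip()), body=body)


def build_site(source_dir, output_dir):
    """Convert every *.txt post in source_dir to an .html file in output_dir."""
    source, output = Path(source_dir), Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)
    written = []
    for post in sorted(source.glob("*.txt")):
        target = output / (post.stem + ".html")
        target.write_text(render_post(post.read_text(encoding="utf-8")), encoding="utf-8")
        written.append(target.name)
    return written
```

The output directory is what gets synced to the host. Nothing in it executes on the server, so a request has no PHP or database to exploit.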
Some people are just idiots. There are a couple of good security plugins for WordPress; I’d advise you to use one.
That’s too bad. There’s always a hacker who can find a vuln in a website.
Especially in WordPress, which is used by most blogs.
Check the WP plugin directory. You can find a very good one for sure.