Referer Spam


So far I’ve nuked comment-spam by using embedded graphic security codes in my comments. It sucks to have to do this, but we all know that this is war.

Next up is referer-spam, which is at an all-time high. It seems that the developers of my stats analyzer, AWStats, either don’t care or don’t know how to set up an easy method for referer blacklisting. If anyone knows of a good way to do this with AWStats, or a better stats analyzer, let me know.

Meanwhile, I’ve had some luck with modifying my .htaccess file. Here’s a sample:

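<Limit GET HEAD POST>
# Flag any request whose Referer header contains one of these words (case-insensitive).
SetEnvIfNoCase Referer ".*(porn).*" BadReferrer
SetEnvIfNoCase Referer ".*(casino).*" BadReferrer
SetEnvIfNoCase Referer ".*(sex).*" BadReferrer
# Refuse flagged requests; everyone else is allowed by default.
order deny,allow
deny from env=BadReferrer
</Limit>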

This method sucks, because it is manual and purely reactive, but it does help. I’ve also password-protected my stats page, making it completely un-crawlable by any web spider. If you’re having problems with referer spam, I recommend you make the same change. If you really want people to be able to see your stats, you can post the username/password as an image.
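Since the block above is maintained by hand, one way to take some of the tedium out of it is to generate it from a keyword list. This is only a minimal Python sketch: `build_referer_block` and its parameters are hypothetical names, not part of any tool mentioned here. It emits bare keywords rather than the `.*(word).*` form, which should match the same referers because the pattern is unanchored.

```python
import re


def build_referer_block(keywords, env_name="BadReferrer"):
    """Return .htaccess text that denies requests whose Referer contains a keyword.

    Hypothetical helper: the name and arguments are illustrative assumptions.
    """
    lines = ["<Limit GET HEAD POST>"]
    for word in keywords:
        # Escape regex metacharacters so something like "foo.com" matches literally.
        pattern = re.escape(word)
        lines.append(f'SetEnvIfNoCase Referer "{pattern}" {env_name}')
    lines.append("order deny,allow")
    lines.append(f"deny from env={env_name}")
    lines.append("</Limit>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints a block equivalent to the hand-written sample.
    print(build_referer_block(["porn", "casino", "sex"]), end="")
```

It is still purely reactive, but adding a new word becomes a one-line change to the list instead of another copied directive.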

Phase III Complete!


As Kayvan suggested, on Susi’s blog, “You could implement MT-Blacklist.”

MT-Blacklist did nothing to help me, during the comment-spam attack. The botnet connections occurred regardless of blacklisting. Server load was up to 110 — where 0.2 is normal — before DP stopped the Apache process. I removed my mt-comments.cgi script entirely, but the requests still came in from over 100 infected clients across the Internets.

I’ve disliked MT since the whole blogspam thing started, and I’ve hated it since 6Apart decided to screw everyone on licensing fees. It would not surprise me to see that they had a helping hand in some blogspamming, just to push people to pay the upgrade fee.

It took about 2 hours to install WordPress. I exported my MT posts and comments, and split the file into thirds (anything over 1MB is bad for importing). I downloaded and unpacked WordPress, made a few config changes, and then imported all of my old posts. The import took less time than the export.

The biggest bastard was fixing my MT index template to work with WordPress. It wasn’t terribly complicated, but it did require matching up each system’s tags.

Now that it’s done, I can say that WordPress truly PWNZ MoveableType. It has some killer features, like built-in link-management [all my sidebar links], OPML importing, and future-dated posting [the post won’t show up until after the future date]. And it is stupidly fast — no more waiting for my site to ‘rebuild’.

More on the link-management and OPML import features: a while back I played around with Bloglines, adding RSS feeds of all of my linked sites to my Bloglines account. WordPress will import all of your links from your Bloglines RSS page, and add them into its Link Manager. From the Link Manager, you can create categories for links, and then specify which category a given link is in [along with things like rating, description, and so on]. Once you get your links set up, it is very easy to incorporate separate category-based divs into your main site template — and WordPress will auto-sort them for you.

So yeah, screw MoveableType. All the cool kids use WordPress now.

Phase B Complete


Phase B is now complete!

The import from MT to WP went very quickly. The only ‘gotcha’ is that PHP hates importing text files of over 1MB (unless you screw with the php.ini settings), so I had to manually split the import.txt file into several smaller chunks.
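The manual split could probably be scripted. Here is a rough Python sketch, assuming the MT export marks the end of each entry with a line of exactly eight hyphens (worth checking against your own export file before trusting it). `split_export` and `max_bytes` are hypothetical names, and the default just mirrors the 1MB limit mentioned above.

```python
import re

# Assumption: each entry in the export ends with a line of exactly 8 hyphens.
ENTRY_SEPARATOR = "--------\n"


def split_export(text, max_bytes=1_000_000):
    """Split export text into chunks of whole entries.

    Each chunk is at most max_bytes when UTF-8 encoded, except that a single
    entry larger than max_bytes ends up alone in its own oversized chunk.
    """
    # Split only where the separator occupies a whole line.
    raw_entries = re.split(r"(?m)^--------\n", text)
    entries = [e + ENTRY_SEPARATOR for e in raw_entries if e.strip()]

    chunks, current, size = [], [], 0
    for entry in entries:
        entry_size = len(entry.encode("utf-8"))
        # Start a new chunk when adding this entry would exceed the limit.
        if current and size + entry_size > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(entry)
        size += entry_size
    if current:
        chunks.append("".join(current))
    return chunks
```

Each returned chunk can then be written to its own file and imported separately, so no entry gets cut in half.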

As you can see, a few minor cosmetic glitches remain. I’m still not sure what’s up with the post title, but I’ll work on that later.

Phase One Complete


Phase One of the move to WordPress is complete: all MT entries are imported, and the site is mostly active, although the main Index redirector still sends people to the MT page.

Tomorrow I’ll start working on the templates and such.


Not entirely sure what’s going on, but lately I’ve got a ton of fake referrer entries from various sites’ mt-comments.cgi — and they all come from one IP: .

Luckily, I know the magic of .htaccess, so now nobody from “Everyone’s Internet” can read my page. And good damn riddance.

Here’s an example of the log files:

- - [02/Sep/2004:20:05:43 -0400] "GET /mt/archives/002005.html HTTP/1.1" 200 16969 "http://WWW.katiehood.COM/cgi/mt-comments.cgi" "Windows XP Internet Explorer 6.x"
- - [02/Sep/2004:21:22:24 -0400] "GET /mt/archives/001302.html HTTP/1.1" 200 17057 "" "Windows XP Internet Explorer 6.x"
- - [02/Sep/2004:21:39:55 -0400] "GET /mt/archives/001981.html HTTP/1.1" 200 17365 "" "Windows XP Internet Explorer 6.x"
- - [02/Sep/2004:22:26:51 -0400] "GET /mt/archives/001467.html HTTP/1.1" 200 16895 "" "Windows XP Internet Explorer 6.x"

And the .htaccess file (doesn’t block EI’s whole range, but it’s good enough):

order allow,deny
deny from 66.98.210
allow from all
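To find out which addresses are behind entries like these, a quick tally of the access log helps. The following is a minimal Python sketch, assuming the log uses Apache’s combined format; `count_suspect_clients`, the `marker` argument and the regular expression are illustrative assumptions, not something taken from the actual server setup.

```python
import re
from collections import Counter

# Assumed layout (Apache combined log format):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "[^"]*"'
)


def count_suspect_clients(lines, marker="mt-comments.cgi"):
    """Count requests per client IP whose Referer contains the marker."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not fit the assumed format are skipped silently.
        if match and marker in match.group("referer").lower():
            counts[match.group("ip")] += 1
    return counts
```

Feeding it the lines of an access log and calling `most_common()` on the result lists the busiest offenders first, which is what you would then put in the `deny from` line.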

Don’t Do It Here


According to my referrer stats, someone just came here via a Google search for “have to pee”.

Look, folks. I know that Google is a great invention and all, but you should know what to do if you have to pee. There’s no need to perform a quoted query in a search engine.

Just go, already.

But don’t do it here.

Referrer spam, again


Although I think the whole concept of ‘referrer spam’ is stupid and a waste of time, here’s a quick fix for those of you running Analog:

REFEXCLUDE http://**
REFEXCLUDE http://**
REFEXCLUDE http://**
REFEXCLUDE http://**
REFEXCLUDE http://**
REFEXCLUDE http://**
REFEXCLUDE http://**
REFEXCLUDE http://**
REFEXCLUDE http://**
REFEXCLUDE http://**
REFEXCLUDE http://**