Monday, August 28, 2006

Keeping stats clean with mod_security

Anyone who really knows me knows that I tend to be a bit obsessive-compulsive at times. I know that at least part of it came from working at the bakery at a ski resort, especially during the week between Christmas and New Year's. We baked a lot of cookies, especially during that week. They took about 19 to 22 minutes to bake, and by the end of the season I had managed to set my mental timer for about 18 minutes.

Something else that I can be a little obsessive-compulsive about is checking my web stats. I've gotten better lately, but it was pretty bad for a while. One thing that really bugs me is when spammers screw up my stats with what's called "referer spam" (and yes, that's how web servers spell referrer). You see, when you click on a link to my site from another site, your web browser tells my server which site you came from. This is really handy for a lot of things. For example, when a person's site is really slow, they can look at their stats and tell whether they're being Slashdotted or not. Even better, the stats program that I use gives me a report with a bunch of links that I can click on to look at referring sites.

Here's the kicker. There are more browsers out there than you can imagine. Some of these are called spiders, because they crawl the web for you (get it?) without you necessarily having to watch them the whole time. Sites like Google do this to build their database, so that when you search for something, they have enough search results for you. But spammers use spiders to do things like harvest email addresses and other information. And since they control the spiders, they control the "referer address" that the spider reports to the site. Some spammers like to put links to icky spammer sites in there, in the hopes that the person in charge of the site will leave their web stats out in the open for just anyone to see.
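To show just how little effort that takes, here's a minimal Python sketch that builds a request carrying whatever referer the client feels like claiming. It only constructs the request and never sends it. The function name and both URLs are made up for illustration; nothing here comes from an actual spammer's tool.

```python
from urllib.request import Request


def build_request(url, claimed_referer):
    """Build (but do not send) a request with a caller-chosen Referer header.

    The server has no way to verify this value; it simply logs what it is told.
    """
    return Request(url, headers={"Referer": claimed_referer})


# Hypothetical URLs, for illustration only.
req = build_request("http://example.com/", "http://spam.example/pills")
print(req.get_header("Referer"))  # prints http://spam.example/pills
```

The point is that the referer is just a line of text supplied by the client, so a stats program that trusts it will happily list the spammer's link.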

This is where mod_security comes in. Like most webmasters (statistically speaking), I run Apache as my web server software. It may have had a little bit of a learning curve when I started (though much less than I expected), but I've found it offers a great deal of flexibility and security. One of the nice things you can do is just add or remove modules, depending on what you're going for. I recently added mod_security to help battle referer spam. It allowed me to add a blacklist to block not only referer spam, but also comment spam. I remember the first time somebody posted an advertisement for their commercial MySpace site in one of my comments. I first deleted the comment, and then added Blogger's CAPTCHA feature to keep that from happening again. I later added mod_security.

Every so often I find a new spammer in my referer stats and have to add it to my blacklist. In addition to the thousands of sites that came with my mod_security blacklist, I have added 14 more sites on my own. If you know how to run Apache, setting up mod_security is easy. In fact, from the time I clicked download, it took me less than five minutes. Go ahead and check it out.
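If you're curious what a referer blacklist boils down to, here's a rough Python sketch of the idea. This is not how mod_security works (it blocks requests at the server, before they ever reach the logs); this version just filters lines out of an access log after the fact. It assumes Apache's combined log format, and the blacklist entries and function names are hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical blacklist entries; a real list would hold many more.
BLACKLIST = ["spam.example", "cheap-pills.example"]

# The combined log format ends with: "referer" "user-agent"
LOG_TAIL = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$')


def referer_of(log_line):
    """Return the referer field of a combined-format log line, or None."""
    match = LOG_TAIL.search(log_line.strip())
    return match.group("referer") if match else None


def is_blacklisted(referer, blacklist=BLACKLIST):
    """True if the referer's host is a blacklisted domain or a subdomain of one."""
    host = (urlparse(referer).hostname or "").lower()
    return any(host == bad or host.endswith("." + bad) for bad in blacklist)


def clean_lines(lines, blacklist=BLACKLIST):
    """Keep only the log lines whose referer is not on the blacklist."""
    kept = []
    for line in lines:
        referer = referer_of(line)
        if referer and is_blacklisted(referer, blacklist):
            continue  # drop the spammer's hit from the stats
        kept.append(line)
    return kept
```

Matching on the host name rather than the whole URL means one entry covers every page and subdomain on a spammer's site, which is handy when the list is maintained by hand.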
