si-blog

Referral spam problems continue

Posted Nov 12, 2004 in Technology.

I am at a loss. Referral spam continues to plague me, despite my best efforts. My .htaccess file is getting longer and longer, and I can no longer keep up with all the clever tricks employed by the spammers. The problem is exacerbated by my brain's complete refusal to understand mod_rewrite and regular expressions.

At the moment, my referral spam is dominated by the top-level domains .biz and .info. I am trying to deny access to both, but it doesn't seem to be working properly. I'd appreciate it if some of you mod_rewrite gurus would take a look at my rewriting efforts and comment on them.

Why must the World Wide Web, a beautiful and noble thing, be fucked up by spammers and the businesses and peddlers who pay them?

Comments

  1.

    That was the ultimate "fuck you". The first reply to this entry was some comment spam, which I naturally deleted.

    Posted by Simon Jessey on Nov 12, 2004.

  2.

    Well, I have to thank you for sharing your .htaccess file with us. With it, I finally understood how mod_rewrite works! Thank you very much! :)

    Posted by Remi Prevost on Nov 13, 2004.

  3.

    I read this a few days back on Wired:
    http://www.wired.com/news/culture/0,1284,56017,00.html

    Thought it may be of interest if you haven't read it.

    Posted by wil on Nov 15, 2004.

  4.

    I noticed an upsurge in referer spam too.

    So I redirected their bots to a little CGI script which does nasty things to them. It took a couple of days for the twerps to get the message, but then they abruptly went away.

    Posted by Jacques Distler on Nov 16, 2004.

  5.

    I'd be interested to know more about how you handled them, Jacques.

    Posted by Simon Jessey on Nov 17, 2004.

  6.

    Email me, and I'll supply details.

    Posted by Jacques Distler on Nov 17, 2004.

  7.

    I'm getting a huge amount of spam in my referral logs and comment spam this week. I found this posting via Google, as I'm trying to find a cure. I had to institute mandatory registration on my blog, but the spammers are trying to sign up for accounts (I delete and ban them, of course).

    Must be something in the air. Good luck fighting those spammers; they really have made my blog less usable.

    Posted by Tony Walsh on Nov 17, 2004.

  8.

    I have addressed referral spammers in an Apache Tomcat-centric way. Thanks for providing your .htaccess file. I am running a diff between your IPs and the ones I have blocked. You can get a listing of the IPs I blocked at http://www.javablog.com/deniedRanges.xml

    Thanks again

    Posted by Ben Simpson on Dec 03, 2004.

  9.

    I've also been getting a lot of referral spam lately, which I attribute to the free .info domain promotions some registrars have been running recently. Spammers can basically register as many domains as they want to avoid being blacklisted. And thanks for your .htaccess; the rules look complicated :) I only have

    SetEnvIf Referer "-\w+-\w+\.info" KEEPOUT
    SetEnvIf Referer "-4u\.info" KEEPOUT
    Deny From env=KEEPOUT

    That blocks the majority of referral spam, and possibly some legitimate referrers as well (false positives), but I don't really care :)

    Posted by Scott Yang on Dec 05, 2004.
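
The SetEnvIf patterns in the comment above can be tried against sample referrers before they go live. This is only a rough sketch in Python, assuming the patterns were meant to read `-\w+-\w+\.info` and `-4u\.info`, and that Python's `re` module behaves like Apache's regex engine for patterns this simple. The function name `keepout` and every sample domain are invented for illustration.

```python
import re

# Assumed reading of the two SetEnvIf patterns (backslashes restored).
KEEPOUT_PATTERNS = [
    re.compile(r"-\w+-\w+\.info"),
    re.compile(r"-4u\.info"),
]

def keepout(referrer: str) -> bool:
    """Mimic SetEnvIf: flag the request if any pattern matches anywhere."""
    return any(p.search(referrer) for p in KEEPOUT_PATTERNS)

# Made-up referrers: two spam-like, one plain, one plausible innocent site.
samples = [
    "http://buy-cheap-pills.info/",
    "http://poker-4u.info/",
    "http://example.info/",
    "http://my-local-library.info/",
]
for s in samples:
    print(s, "->", "KEEPOUT" if keepout(s) else "allowed")
```

The last sample is the kind of false positive the comment mentions: any .info host with two hyphens gets flagged, whether or not it belongs to a spammer. SetEnvIf is also case-sensitive, so a referrer sent in capitals would slip past these patterns; SetEnvIfNoCase is the case-insensitive variant.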

  10.

    Rather than just banning the referrers, I also bounce them. Every hit they send you is redirected back to their own site, so their own efforts show up in their own logs the next day. If you just absorb their hits, they'll never catch on.
    As a bonus, if enough admins implement the bounce and a spammer hits enough updated sites, they will effectively DDoS their own sites with all the mirrored bounces.
    Simply replace the
    RewriteRule .* - [F,L]
    with
    RewriteRule ^(.*)$ %1 [R=301,L]
    And you're all set. (Note that %1 is filled from the first parenthesized group of the last matching RewriteCond, so that condition has to capture the referrer for the redirect to have a target.)
    Peace

    Posted by Pascal on Jan 06, 2005.

  11.

    Thank you for that bounce trick, Pascal. I will do as you suggest.

    Posted by Simon Jessey on Jan 07, 2005.

  12.

    The problem with your *.info blocking seems to be that you're requiring a . or a - after the word info. If you put the following two rules in, you might have better luck:

    RewriteCond %{HTTP_REFERER} ^http://[a-z0-9.-]+\.info/.*$ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://[a-z0-9.-]+\.biz/.*$ [NC,OR]

    You may also want to use [NC] on the rest of your rules, to avoid someone bypassing them by hitting you with a referer in all caps.

    I've made my own rules available at https://www.resonant.org/badreferers.conf.txt if you are interested. I don't use that file exactly, since I have those rules embedded among other unrelated rewrites, but they're a cut-and-paste. At some point I really ought to go through yours and merge the ones that haven't hit me yet.

    Posted by Zed Pobre on Jan 24, 2005.
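
Rules like the two in the comment above are easy to check offline before they go into .htaccess. The following is a minimal sketch in Python, assuming its `re` module is close enough to Apache's regex engine for patterns this simple. The function name `is_blocked` and the sample referrers are made up, the two rules are folded into one pattern, the dots before the TLDs are escaped so they match only a literal dot, and digits are allowed in the host name.

```python
import re

# Modeled on the two RewriteCond lines; re.IGNORECASE stands in for [NC].
BLOCKED_TLDS = re.compile(r"^http://[a-z0-9.-]+\.(info|biz)/.*$", re.IGNORECASE)

def is_blocked(referrer: str) -> bool:
    """Return True if the referrer would match either TLD rule."""
    return BLOCKED_TLDS.match(referrer) is not None

# Made-up referrers: two that should be blocked, two that should not.
samples = [
    "http://cheap-pills.info/",
    "HTTP://CHEAP-PILLS.BIZ/INDEX.HTML",
    "http://example.com/info/",
    "http://information.example.org/",
]
for s in samples:
    print(s, "->", "blocked" if is_blocked(s) else "allowed")
```

One thing this kind of check brings out: because the pattern requires a slash after the TLD, a referrer such as http://spam.info with no trailing slash would not match. Whether spammers actually send referrers in that form is something only the logs can answer.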

  13.

    Thank you for the advice, Zed. I'll modify my rules accordingly.

    Posted by Simon Jessey on Jan 24, 2005.

  14.

    I've got one more trick for you, just tested today. I was going through my logs checking for more link spam, and realized that 95% of the spam that was still leaking through was looking for the nonexistent file adserver/campaign.php:

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_URI} (adserver/campaign\.php) [NC]
    RewriteCond %{HTTP_REFERER} !=""
    RewriteRule ^(.*) %{HTTP_REFERER} [R=301,L]

    In plain English, what this says is that if the file requested contains "adserver/campaign.php", and that file doesn't exist on your server either as a file or a directory, and a referrer is set, redirect back to the referrer. Otherwise, proceed normally.

    I'm guessing that there's a common commercial piece of linkspamming software that behaves this way. Nice of them to identify themselves. I wonder how long it will stay that way.

    Posted by Zed Pobre on Jan 31, 2005.
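
The plain-English reading of that last rule set can be written out as ordinary code, which may help anyone who finds mod_rewrite conditions hard to follow. This is only an illustrative sketch in Python, not something Apache runs: the function name `bounce_decision` and its parameters are invented here, and `exists_on_disk` stands in for the two `!-f` and `!-d` checks.

```python
from typing import Optional, Tuple

def bounce_decision(
    request_uri: str, referrer: str, exists_on_disk: bool
) -> Optional[Tuple[int, str]]:
    """Return (status, location) for a bounce, or None to proceed normally."""
    # Case-insensitive substring test, standing in for the [NC] condition.
    wants_campaign = "adserver/campaign.php" in request_uri.lower()
    # All conditions must hold: spam target requested, nothing real at
    # that path (neither file nor directory), and a referrer was sent.
    if wants_campaign and not exists_on_disk and referrer != "":
        return (301, referrer)
    return None

# Hypothetical requests: a spam probe, then an ordinary page view.
print(bounce_decision("/adserver/campaign.php", "http://spam.example/", False))
print(bounce_decision("/index.html", "http://spam.example/", True))
```

Every condition has to hold before the redirect fires, which mirrors how consecutive RewriteCond lines are ANDed together unless they carry the [OR] flag.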