Win the SPAM Arms Race

by Dan Benjamin

78 Reader Comments

  1. How about providing people a form rather than a textlink? I always prefer being independent of a mail client when I want to contact someone.

    Some simple creatures seem to get a real kick out of filling your forms with bla@bla.com or test@test.com, but filtering that out is a job for your backend script. Especially with PHP, a small mail() script is sooo easy to do.

    A thing I consider completely useless are forms with mailto: actions.

    To add to the idea of putting NOSPAM or the like in the text link: I do that myself, but I found that also writing something in the message body helps even slow people grasp the idea:

    mailto:you@***NOSPAM***server.org?subject=Contact&body=Please remove the NOSPAM in the email, this is to prevent spamming, thank you
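
A link like the one above could be assembled with a small helper so that the subject and body are percent-encoded. This is only a sketch: `buildMailto` is a hypothetical name, and the address is the placeholder from the comment.

```javascript
// Hypothetical helper: build a mailto: URL whose subject and body are
// percent-encoded so spaces and punctuation survive in the link.
function buildMailto(address, subject, body) {
  const query = "subject=" + encodeURIComponent(subject) +
                "&body=" + encodeURIComponent(body);
  return "mailto:" + address + "?" + query;
}

const link = buildMailto(
  "you@***NOSPAM***server.org",
  "Contact",
  "Please remove the NOSPAM in the email, this is to prevent spamming, thank you"
);
console.log(link);
```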

    cool article by the way and the http://www.hivelogic.com/safeaddress/ online tool is such a nice idea.

  2. For anybody who is considering replacing their MAILTO with a feedback form make sure not to include your email address anywhere in the html. This is accomplished by not using javascript and instead using an old fashioned form/cgi combination. I know that in PHP it is very easy to do, because there is a built in “mail” function so all in all it shouldn’t take more than a minute or two of coding. This is the method I would recommend if you have PHP on your server…

  3. <cfset mymail = "joe@joe.com">

    <cfoutput>
    <a href="mailto:#mymail#">#mymail#</a>
    </cfoutput>

  4. Hi,

    I was just wondering whether spambots like SSL pages or not. For my pages I use SSL encryption and a form mailer for those paranoid people who like to contact me by email. But nevertheless, because of my hosting service, I have to use their form mail script (cannot run custom cgi) and put the address in a hidden field in the code.

  5. Another great website providing a more permanent throw-away address is http://www.spamgourmet.com/ which allows you to create an infinite number of limited-use addresses which forward to your real address.

    It’s worth a look.

  6. If you use Apache and want to block the most used spambots, then read this discussion (very good!):

    http://www.webmasterworld.com/forum13/687.htm

    Also, if you happen to use ColdFusion as your scripting engine, then here’s a function that converts emails to safe format:

    http://www.cflib.org/udf.cfm?ID=405

    .erki

  7. Matt Wilkie

    It wouldn’t be difficult to add a check box to the mail form so that users could choose to have a copy of their message sent to themselves.

    regards,
    matt

  8. Just a little additional info on a free online tool that will automatically scramble your email address into ASCII:

    http://alicorna.com/obfuscator.html

  9. I refuse to put a mailto link of any form on my personal website. I get a good deal of spam from doing that through a variety of different practices and attempts to evade the spam-harvesters. I use a form oriented approach for emails; a simple HTML/CSS bit that sends all data to a php script which emails me directly.

    Using this script I can limit the number of emails a person sends (no war-emailing) and can have it go to any email address I specify. If I want, I can then reply to this person, and not a single spambot gets my email address. I have control over who KNOWS my email address and who does not.

    Now, if I need to use an email address in any post or public forum, I’ll use a cheap disposable hotmail account. :)

  10. the solution xavier gave on the previous page is the perfect one, if relying on javascript is not a problem. It’s fantastically easy to code and would be prohibitively hard to scan for. I was looking to see if anyone suggested it before I posted it too.

    as far as the problem with form based emails not going in your visitor’s records (a real problem for some, including me), if you have a sufficiently savvy audience you could offer (with a checkbox) to copy the email to the sender as well for filing.

    -wg <><

  11. Dan writes that he believes the best solution is to use both numerical equivalents and JavaScript wrapping.

    This solution has a few flaws:
    First, what happens if the user has JavaScript disabled? No email address.
    Second, what happens if someone decides to use a harvester which understands numerical equivalents and JavaScript wrapping? (One might argue that many spammers are too stupid to think of such things, but that’s another discussion entirely.) Voila, another email address to add to the spammer’s list.

    I believe the final solution to this problem is also one of the simpler ones: A feedback form. There are numerous scripts for virtually any platform out there that will create and send emails based on form input.

    For Linux and other *nix platforms there are the formmail scripts (just don’t use older versions of Matt Wright’s formmail scripts, as they allow your server to be exploited as a spam-sending source by virtually anyone), which I presume use sendmail and/or other SMTP daemons. (I’m not very well versed in the realms of *nix, but I know just enough to be dangerous.)

    For Win32, there are numerous components that can be installed and used for sending email. NT4 and 2000 even come with a set of components that include an SMTP component, namely CDONTS (just don’t rely on it too much, because this feature will be replaced by CDO in .Net Server, which will require modifications). You rarely see a Windows webhost that does not boast such a component.

    In many cases, the responsibility for creating the form and the script to parse the form falls upon the web designer or developer. For those, I have one bit of advice: Do not put the recipient’s email address in a hidden field in the form or some such, because that will give away the email address to some harvesters. You should rather place the email address as a variable in the script itself or in a database and only refer to it with an ID in the form.
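
The ID-instead-of-address advice might look roughly like the sketch below. The table, the IDs, the addresses, and the name `resolveRecipient` are all made up for illustration; the form would carry only the number.

```javascript
// Server-side table of recipients. The IDs and addresses are made-up
// placeholders; in practice this could live in the script or a database.
const RECIPIENTS = {
  1: "webmaster@example.com",
  2: "sales@example.com",
};

// Resolve the ID submitted by the form to an address, or return null for
// anything unknown, so the form can never be pointed at arbitrary recipients.
function resolveRecipient(formId) {
  const id = Number.parseInt(formId, 10);
  return Object.prototype.hasOwnProperty.call(RECIPIENTS, id) ? RECIPIENTS[id] : null;
}

console.log(resolveRecipient("2"));
console.log(resolveRecipient("evil@spam.example"));
```

Because unknown values resolve to null, a tampered form field cannot turn the script into a relay for other addresses.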

    Right now, I am, on my own website (as linked by my name above), using mailto: links, but this has to end very soon, as I have already (after about one month) started to receive spam on the email addresses mentioned there. I am going to replace these with forms, but as I am pretty swamped with school work atm, things are not progressing as fast as I want…

    -Morten

  12. An excellent article and some very nice alternatives mentioned. Let’s hope this will slow the suckers down for a while…

    Funnily enough, I had been thinking about making a solution similar to some of the javascript ones mentioned here… I love it when I find someone else has already done exactly what I planned but even better =)

    I think the method of using a tricky javascript function and letting non-javascript users fall through to a form mail page or a PHP header redirect is an excellent all round option.

  13. I think I have a foolproof solution, if such a thing exists. I made an image file containing my email address. Humans can read it but not spambots. To make it a link, I used an EXTERNAL .js file to write the <a > tag before the <img> tag, and another to write the </a> tag after it. By linking to two separate JavaScript files, the email address is entirely removed from the HTML. The page still validates as XHTML 1.0 or whatever. If JavaScript is disabled or unsupported by the client, he can still see the address in the image file.

    If you use this technique, be sure not to put the full email address in the “alt” attribute of the image tag! You could, however, use “joe at joe dot com” or something like that, as someone else already mentioned.

  14. Sorry. I didn’t know that the form would encode my input as HTML. This is how the fourth sentence should read:

    To make it a link, I used an EXTERNAL .js file to write the <a > tag before the <img> tag, and another to write the </a> tag after it.

  15. Sorry to be so late in the discussion, but I wrote one because I noticed that a talented spammer (please, I realize that is an oxymoron) could effectively use the callback in LWP and unencode the entries.

    Instead, what I did was write a program that randomizes between hex and ascii encodings for the e-mail address, encodes the prompt and even encodes the target if you want. It also obfuscates the mailto: portion of the href argument to throw off simple spambot word searches.
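
For readers who want to see the idea, here is a rough sketch of per-character random encoding. It is not the code behind the tool linked below; `obfuscate` is a hypothetical name, and reading "ascii" as decimal character references is an assumption.

```javascript
// Encode each character as either a decimal (&#64;) or a hexadecimal
// (&#x40;) HTML character reference, chosen at random per character.
// The random source is a parameter so the choice can be pinned in tests.
function obfuscate(text, random = Math.random) {
  let out = "";
  for (const ch of text) {
    const code = ch.codePointAt(0);
    out += random() < 0.5 ? "&#" + code + ";" : "&#x" + code.toString(16) + ";";
  }
  return out;
}

// Encoding the whole href hides the literal "mailto:" and "@" from simple
// text searches of the page source.
const href = obfuscate("mailto:name@example.com");
console.log('<a href="' + href + '">' + obfuscate("name@example.com") + "</a>");
```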

    It needs work, so be gentle:
    http://www.HealYourChurchWebSite.com/cgi-bin/obfuscate.cgi

    Dean

  16. As far as I know, email addresses in Flash swf files cannot be harvested.

    That just makes me all warm and fuzzy inside :)

  17. Okay, I’ve fixed the obfuscator so it works with our friends across the great pond, and down under as well as other places non-us.

    I’ve also moved it. The new location is:
    http://www.healyourchurchwebsite.com/obfuscator/

  18. I use a simple redirection script that can be written in any scripting language you like. The link calls, for instance, “mailme.php?username=myname&host=myserver.com”. The script then redirects the browser to “mailto:myname@myserver.com”. This seems to work well in most browsers, although for some reason it works once in Opera, then gives an error on each successive attempt unless you close it and restart.
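
The redirect step could be sketched as follows. The commenter's script is PHP and is not shown, so this is an illustration only: `mailRedirect` and the returned object shape are hypothetical, and the character checks are an added precaution, not part of the original description.

```javascript
// Sketch of the redirect step: turn the query string of a request such as
// "mailme.php?username=myname&host=myserver.com" into a Location header.
function mailRedirect(queryString) {
  const params = new URLSearchParams(queryString);
  const user = params.get("username") || "";
  const host = params.get("host") || "";
  // Accept only plain name and host characters so the script cannot be
  // used to redirect to arbitrary URLs.
  if (!/^[\w.+-]+$/.test(user) || !/^[\w.-]+$/.test(host)) {
    return { status: 400, headers: {} };
  }
  return { status: 302, headers: { Location: "mailto:" + user + "@" + host } };
}

console.log(mailRedirect("username=myname&host=myserver.com"));
```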

  19. Using a jaacript to break up the mailto: will likely solve the problem, anthough using an entity for the symbol might help bulletproof it.

    [removed]
    <!--
    [removed]("<a ref=" + ">" + "namedo" + "main.ca" + "</a>")
    //-->
    [removed]

    However, the real question is, to spambots read the source code or the rendered page? If the read the source, we’re covered. If they read the rendered page, we’re wasting our time unless we don’t want the address to be the displayed text.

    I wonder if simply adding a sace before or after the in the ...>text</A> portion of the <A> tag would invalidate it without completely ruining it on screen, as in “...>name domain.ca</A>”

    Any thoughts?

    For the moment, I’m going to create a trap email address and pt it in a link on my main page to at least raise alarm if the spambots are trolling my site.

  20. I can’t believe how bad the spelling is on that last post. Allow me another try:

    Using a javascript to break up the mailto: will likely solve the problem, although using an entity for the @ symbol might help bulletproof it.

    [removed]
    <!--
    [removed]("" + "name@do" + "main.ca")
    //-->
    [removed]

    However, the real question is, do spambots read the source code or the rendered page? If they read the source, we’re covered. If they read the rendered page, we’re wasting our time unless we don’t want the address to be the displayed text.

    I wonder if simply adding a space before or after the @ in the “...>text</A>” portion of the <A> tag would invalidate it as an email address while still leaving the appearance of an email address, as in “...>name domain.ca</A>”

    Any thoughts?

    For the moment, I’m going to create a trap email address forwarded to my address and put it in a link on my main page to at least raise alarm if the spambots are trolling my site.

  21. I dunno, I like my little obfuscator
    http://www.healyourchurchwebsite.com/obfuscator/

    the mailto: disappears as plain text, the @ disappears as main text, and there is a random mixture of ascii and hex encodings that apparently have worked well enough over the past few years at a church site I’ve developed that our spam count is very low – usually guys clicking on the address and letting it rip.

    That said, I am GLAD there is more than one way to skin this cat. If there were 1 best way, you can bet your bottom dollar the spammers would be coding up a storm to get past it – instead, by us having several anti-spam tools out there, the targets are too numerous to overcome – like a grasshopper taking on a bunch of little ants (we’re the guys in black !-)

  22. Of course, you can always make your email address as plain as day, and let live this little beauty:
    http://www.perlmonks.org/index.pl?node_id=103656

    On a more serious note, if encoding efforts happen to fail (or you just couldn’t resist signing up for that “free” “toothbrush”) there exists Spam Assassin (http://www.spamassassin.org), whose extremely powerful perl-driven parser can filter out more than 90% of spam.

  23. What you have to remember is that spambots aren’t people reading your page; they’re automated robots. They don’t SCAN the page for email addresses, they PARSE it. That means they look through the source code for email addresses, not the text; depending on the parser, an address in a meta tag is no different than one in a link. It all depends on how the spambot does the parsing. For instance, it may just look at links in the page:

    #!/usr/bin/perl -w
    use strict;
    use HTML::TokeParser;
    use LWP::Simple;

    # Collect addresses from the href and the link text of every <a> tag.
    sub grab_email_using_links
    {
        my ($url) = @_;
        my $content = get($url);
        return unless defined $content;
        my $parse = HTML::TokeParser->new(\$content);

        my @addresses;
        while (my $token = $parse->get_tag("a"))
        {
            my $href = $token->[1]{href} || "";
            my $text = $parse->get_trimmed_text("/a");
            # Take everything after mailto: up to any ?subject=... part.
            my ($email) = $href =~ /mailto:([^?]*)/;
            push (@addresses, $1) if (defined $email && $email =~ /([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)/);
            push (@addresses, $1) if ($text =~ /([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)/);
        }
        return @addresses;
    }

    However, what if he just looks at the source code in general, hoping to pick some out of the body text? The address finding is then greatly simplified:

    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple;

    # Pull every address-like string straight out of the raw page source.
    sub get_email_from_page
    {
        my ($url) = @_;
        my $content = get($url) || "";
        my @addresses = $content =~ /([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)/g;
        return @addresses;
    }

    Even if you don’t know perl, you can at least realize that the above is most definitely not a lot of code; if a spammer REALLY wanted your address, I’m sure he’d be able to get past many of the solutions you are posting…

    As you can see, if you are serious about protecting yourself, it’s pretty stupid to post your email address in any shape or form (of course, spambots don’t tend to go after personal sites; as such I wouldn’t be too worried about posting your email address on your homepage). A form based mailer is much safer, and pretty much a necessity for larger sites. The best (meaning most secure and featureful) is the NMS (http://nms-cgi.sourceforge.net) formmail (and the brand new TFMail, formmail’s big brother). To answer the argument that form-based mailers don’t give the mailer a copy of the email, formmail can be configured to email the mailer a copy of the email, or to simply display the message to the screen (in which case it’s their own damn fault if they don’t save it! :P)

  24. Just for fun, I wrote a page of JavaScript to change

    name@example.com

    to

    [removed]s="?`.=lnb/dmql`ydAdl`o?#dondlnr!mh`ld#<dmuhu#lnb/dmql`ydAdl`o;numh`l#<gdsi!`=";for(i=s.length;i;i--)[removed](String.fromCharCode(1^s.charCodeAt(i-1)))[removed]

    The relevant JavaScript functions on the generation page were

    // Return a string that's reversed with bit 0 flipped
    function Flip(s)
    {
        var i;
        var r = '';
        for (i = s.length; i; i--) {
            r += String.fromCharCode(s.charCodeAt(i - 1) ^ 1);
        }

        return r;
    }

    // Update the form
    function Refresh()
    {
        var s = '<a title="' + document.all.LinkHover.value + '">' + document.all.LinkText.value + '</a>';
        document.all.ResultPlain.value = s;
        document.all.ResultObfuscated.value = '[removed]s="' +
            Flip(s).replace('\\\\', '\\\\\\\\').replace('"', '\\"').replace("'", "\\'") +
            '";for(i=s.length;i;i--)[removed](String.fromCharCode(1^s.charCodeAt(i-1)))[removed]';
    }

    As mentioned before, this works until the spambots figure out how to run JavaScript.
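
The reverse-and-flip step is its own inverse, which is why the same loop works for both generating and decoding. A small self-contained sketch of that round trip follows; the anchor markup used as input is an assumed example (the comment form stripped parts of the original), and `flip` is just a lowercase copy of the function above.

```javascript
// Reverse a string and flip bit 0 of every character code. Applying the
// function twice returns the original string, so it both encodes and decodes.
function flip(s) {
  let r = "";
  for (let i = s.length; i; i--) {
    r += String.fromCharCode(s.charCodeAt(i - 1) ^ 1);
  }
  return r;
}

const original = '<a href="mailto:name@example.com">name@example.com</a>';
const encoded = flip(original);
console.log(encoded);        // contains no "@" and no "mailto"
console.log(flip(encoded));  // the original anchor markup again
```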

  25. Spam is one of the main factors holding back the potential of the internet. Hotmail accounts are generally where most people begin their experiences with email, and considering the amount of crap that gets through to your average hotmail account, people just aren’t going to take it seriously. Fortunately there are moves under way to make spamming illegal in parts of Europe, and then hopefully the rest of the world will follow suit. I can’t imagine how spam can be an effective way of marketing when everyone sees it as an infringement on their privacy ???

  26. I’ve had great success just listing my email like this:

    mailto:name@domain.com

    Have yet to get spammed. Quite nice actually. Quite simple too. The best protection is, of course, a form submitted to the server.

  27. Hi all,

    here we go with the easiest fix ever:

    instead of name@domain.com use the following:
    name@domain.com

    It doesn’t need any javascript (not everyone browses with javascript ON!!!) and still is fully functional. This is to help standards compliance and the new way of supplying information through the web. I got it from (personal comment by David at) http://stilleye.com

    Ciao.

  28. I promise I hadn’t read above before posting ;-)

  29. I think I stated back on page 2 of this discussion that anyone handy with LWP and HTML::Entities could snarf up e-mail addresses that were hex encoded only. This is why I randomize between hex & ascii encoding. Not foolproof by any means, but it requires enough extra coding to get overlooked by all but the most determined ‘bots … and for them, I have some SSI induced chaff in their way, so by the time they get to a real e-mail address, it’s lost in the white noise or they’ve given up.
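
To make the point concrete, here is roughly how little it takes to undo numeric entity encoding. This is a sketch in JavaScript rather than the Perl modules named above; the function names and the sample markup are invented, and the address pattern is deliberately simple.

```javascript
// Decode decimal (&#64;) and hexadecimal (&#x40;) character references.
function decodeNumericEntities(html) {
  return html
    .replace(/&#x([0-9a-f]+);/gi, (_, hex) => String.fromCodePoint(parseInt(hex, 16)))
    .replace(/&#([0-9]+);/g, (_, dec) => String.fromCodePoint(parseInt(dec, 10)));
}

// A deliberately simple address pattern, good enough for illustration.
const ADDRESS = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g;

// Decode first, then scan the decoded source for address-like strings.
function harvest(html) {
  return decodeNumericEntities(html).match(ADDRESS) || [];
}

console.log(harvest('<a href="&#109;ailto:joe&#x40;example&#46;com">mail</a>'));
```

Mixing decimal and hex references does not stop a decoder like this one, which handles both; the protection comes from harvesters that never bother to decode at all.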

    But like I said in my last post, I hope EVERYONE here implements a variety of tactics. If we all did the same thing, they’d easily code around that one solution. By offering a milieu of methods, we keep’m spinning their wheels and hopefully drive up their cost of doing business.

  30. Anyone tried out Cloudmark (http://www.cloudmark.com) ? I’ve just signed up. Not server-side, but it is a step in the right direction.

  31. I am trying out the hivelogic method displayed.

    However, the displayed name is coloured in a way that does not suit my dark background. Could someone show me the exact code, and where it should go in the generated code, so I can choose the colour? I know nothing about coding.

    Many thanks

    Robin

  32. I created a database with MySQL/PHP that stores the email addresses, which are never viewable (even through viewing the source) on the website. Am I still at risk?

    I’m also in the process of configuring my email server to block everyone in the Open Relays database (http://ordb.org). Also, I’m using a tool called Mailscanner (mailscanner.info), which scans all of the emails for viruses and has a SpamAssassin plugin.

  33. As noted above any technique that uses client side javascript is useless, the end user can turn it off.

    That’s why you use an .asp or another server side solution if you want to foil spambots.

  34. I agree that the reliance on client-side javascript is a problem; however, it’s possible to get around the problem using something like this after where you’ve embedded the [removed]

    <noscript>
    sk inthesoup. org
    (How do I use this address?)
    </noscript>

    And just make How do I use this address? text a link to a page containing instructions (using a dummy email, of course!).

  35. I see all these 20+ lines of code just to hide email addresses from html… it’s so simple to just publish the damn thing in .swf format and stick it in your page

  36. The Way

    “The Way is shaped by use,
    But then the shape is lost.
    Do not hold fast to shapes
    But let sensation flow into the world
    As a river courses down to the sea.”
    Tao Te Ching; 32 Shapes

    I know how to do this in php but you could use anything that makes this possible.

    When the client clicks on an email link, a box pops up asking them to enter their email address and then the site emails them the address so all they have to do is reply to that email.

    Easy.

    all data is stored in a MySQL database which is passworded so only the php on the server can access it.

    It also cuts out any display on the web of either email address. Thus bypassing the spam issue.

  37. Another solution, used by BeSweet’s author, is to simply replace an email link with a link to a forum page where one can leave a message on the system for the user. In my case I have forums on my site and private messages (phpnuke), so I can do that for myself too. I just did that today – what a coincidence.

  38. simplify the javascript code to:

    email me

  39. Here’s the weakness, and a suggestion:

    IT SEEMS TOO EASY to write a script that will harvest any consistently applied technique of obscuring mailto addresses. The trick is for us all to USE SOMETHING DIFFERENT. Use SSI to create throw-away e-mail addresses from some part of the user’s IP address, or use PHP to use the time of day. Mix this up with the break-apart technique, but don’t break the address at logical places. Throw in a little encoding, here and there. Keep the harvesters on their toes! Make it easier to get their addresses from other sites. The hard work they’ll go to, just for a handful of our obscured addresses, won’t be worth it.

    In general I use and recommend the “caller ID” method, creating a custom e-mail address for myself each time I register for web sites, etc. I know from whom the mail came, by the address they sent it to, and can easily filter it out. Example: ebay.me@mydomain.com

    I’m also using SpamAssassin, which is great by itself. I learned how to write a user_prefs file, and how to write simple procmail recipes. Together, it’s really, really effective. Because we can’t stop’em all from getting our e-mail addresses.

  40. Blocking those nasty spambots is a real pain I agree. I have seen bots that go through the trouble of parsing email addresses after processing the web page through a browser. So how do we kill it in my shop and get even?

    The Block: We have email sent to us through form submission. IP addresses are logged with the date to prevent abuse (only 3 messages an hour). This is an easy script to write. You do not have to show the address in a .cgi file.
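
The per-IP limit described above might be sketched like this. It is an in-memory illustration only, not the shop's actual script; `createRateLimiter` and its defaults are assumptions based on the "3 messages an hour" figure.

```javascript
// Allow at most `limit` messages per `windowMs` from each IP address.
// State is kept in memory only; a real script would persist its log.
function createRateLimiter(limit = 3, windowMs = 60 * 60 * 1000) {
  const log = new Map(); // ip -> timestamps of accepted messages
  return function allow(ip, now = Date.now()) {
    // Keep only the timestamps that still fall inside the window.
    const recent = (log.get(ip) || []).filter((t) => now - t < windowMs);
    const ok = recent.length < limit;
    if (ok) recent.push(now);
    log.set(ip, recent);
    return ok;
  };
}

const allow = createRateLimiter();
// Four attempts within a few milliseconds: the fourth is refused.
console.log([0, 1, 2, 3].map((t) => allow("192.0.2.1", t)));
```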

    The Kill: Knowing what these bots look for and how they operate is key. They spider hyperlinks over your entire site, wherever there are likely to be email addresses. We have a mailto:bsadress@bsdomain.com generator, which is named directory.cgi and hyperlinked to our contacts section.
    Imagine if the spammer was caught in the mail out by getting tens of thousands of mailerdaemons. Their server admin would catch them before the complaints.

    The link provided isn’t my site, just one that i found that had these scripts for free/practically nothing.

    Granted, this article was intended to have this be done in JavaScript, and in that case consider calling the function as an external .js file to generate the email addresses and to display email links. BTW, if you feel you have to display an email address for people to click on, use an image that is linked the way Xavier Defrang mentioned above if you have no cgi access.

    Whew I think I have said Enough,

    David Smith

  41. If you use cgi to provide web based email through forms, you do not have to display your email address. I do that in my shop and it works.

    If you generate bogus random addresses like bs@bs.com you will tip them off to the sys admin when they begin a mailing getting thousands of “address unknowns”, or wasting their bandwidth if they are a service.
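
A decoy generator along these lines could look something like the sketch below. The function names are made up, and using the reserved .invalid top-level domain (instead of something like bs.com, which may belong to someone) is my substitution, not part of the comment.

```javascript
// Build a random lowercase string of the given length.
function randomWord(length) {
  const letters = "abcdefghijklmnopqrstuvwxyz";
  let word = "";
  for (let i = 0; i < length; i++) {
    word += letters[Math.floor(Math.random() * letters.length)];
  }
  return word;
}

// Emit `count` decoy mailto links. The .invalid top-level domain is
// reserved, so the decoys can never belong to a real person.
function decoyLinks(count) {
  const links = [];
  for (let i = 0; i < count; i++) {
    const address = randomWord(8) + "@" + randomWord(10) + ".invalid";
    links.push('<a href="mailto:' + address + '">' + address + "</a>");
  }
  return links.join("\n");
}

console.log(decoyLinks(3));
```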

    If you must display an email address, make an image that uses the method Xavier Defrang mentions above to link it, with the function as an external .js script.

    The link is not my site, just a place i found where you can get these simple scripts if you are too lazy to write them.

    Avitar

  42. דגכעדשגכשדגכדגכדגכ

  43. Check out spamgourmet.com
    It works the same as sneakEmail.com

  44. http://www.neilgunton.com/spambot_trap/

    Just search for “Balu” to find my PHP solution, which generates a unique mailto: for each visitor. It looks like

    web-32bitIP.timestamp@example.com

    This way I can easily reject addresses that were found by bots and are used for SPAMming. I even know where the bot came from and when. I can even find them in the webserver-logfiles and analyze their activity.
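
The address format above can be produced and read back along these lines. This is a sketch rather than the PHP from the linked page; the function names are hypothetical, and treating the timestamp as Unix seconds is an assumption.

```javascript
// Pack a dotted IPv4 address into an unsigned 32-bit number.
function ipToInt(ip) {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// Unpack the number back into dotted form.
function intToIp(n) {
  return [24, 16, 8, 0].map((shift) => Math.floor(n / 2 ** shift) % 256).join(".");
}

// Build a per-visitor address of the form web-<32bitIP>.<timestamp>@domain.
function visitorAddress(ip, unixSeconds, domain = "example.com") {
  return "web-" + ipToInt(ip) + "." + unixSeconds + "@" + domain;
}

// Recover who was shown an address that later receives spam.
function parseVisitorAddress(address) {
  const m = /^web-(\d+)\.(\d+)@/.exec(address);
  return m ? { ip: intToIp(Number(m[1])), unixSeconds: Number(m[2]) } : null;
}

const addr = visitorAddress("192.0.2.1", 1030000000);
console.log(addr, parseVisitorAddress(addr));
```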

    There are many other ideas and hints on that page too…

    Balu
  45. Just use a contact form that then mails you the content of the form. The user doesn’t need to know your e-mail address at all.
    I use one at my site and users find it easy to use.
    Cheers

  46. By far the most elegant approach AFAIK;
    Catch the email harvester in a tar pit and destroy its database.
    (including the email address just snatched from your HTML)

    All mail links can remain unmodified.

    http://www.monkeys.com/wpoison/

  47. The easiest and as far as I know most fool-proof method is one that Xavier almost alluded to. It involves a simple Javascript function that assembles an email address when the user clicks a hyperlink.

    function mail(user) {
        locationstring = "mailto:" + user + "@" + "domain.com";
        [removed] = locationstring;
    }

    You can of course add more variables so that you may use multiple domains, ie:

    function mail(user,dom,tld) {
        locationstring = "mailto:" + user + "@" + dom + "." + tld;
        [removed] = locationstring;
    }

    In the hyperlink, just call the function:

    [removed] mail('johndoe','domain','com');
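
Since the comment form stripped parts of the snippets above, here is a self-contained sketch of the same idea. It assumes the stripped assignment target was the browser's location (written here as `window.location.href`); `mailtoFor` is a hypothetical helper added so the string building can be checked outside a browser.

```javascript
// Assemble the address only at click time, so the complete string never
// appears in the page source.
function mailtoFor(user, dom, tld) {
  return "mailto:" + user + "@" + dom + "." + tld;
}

function mail(user, dom, tld) {
  const target = mailtoFor(user, dom, tld);
  // In a browser, navigate to the mailto: URL; elsewhere just return it.
  if (typeof window !== "undefined") {
    window.location.href = target;
  }
  return target;
}

console.log(mail("johndoe", "domain", "com"));
```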

    This method has proved exceptionally reliable. To test it, I put a page with a normal email link to spam@mydomain.com and one to my real address assembled through this Javascript function. Spam comes to the spam address, but not to my real address.

    Hopefully this helps someone!

  48. In pure HTML you can add extra tags that will not get in the way.
    You can also use the character entities.
    JavaScript would be needed to dynamically add this to the HREF of the mailto link.
    You can even distribute the parts of your email address in invisible tags around the page, even put some bits in attributes, and use JavaScript to reconstruct it.

    Problem with non-javascript browsers though.

    Maybe have <? include my_email.txt ?> in the mailto href and on the page for a server side solution. But don’t bots get to the page after server-side processing?
