Graceful E-Mail Obfuscation

by Roel Van Gils

101 Reader Comments

  1. I wonder why it is not mentioned to ‘build’ the email address with a mouse-over and an innerHTML JavaScript swap. This is what I always do:

    http://blog.thingsdesigner.com/index.php?/archives/199-Spam-safe-email-link.html

  2. Most bots now use Rhino (a Java implementation of JavaScript) to parse and execute scripts on pages. Thus, they would have no trouble defeating this.

  3. For some time, I’ve been using CSS and JavaScript to hide emails from harvesters and, so far, it has proved quite effective. You can read more about this on “my blog”:http://www.it-base.ro/2007/06/20/email-anti-spam-protection/.

  4. @Gareth Adams:

    You’re right. I’m aware of the fact that, according to the RFC, a plus sign is allowed in the local part of an e-mail address. In reality, however, e-mail service providers typically don’t allow users to create addresses that contain a plus sign. I did point this out in the article.

    Though, you can easily adapt the regex (both the JavaScript and the PHP one) so that the @ is replaced with something else (instead of ‘+’).

  5. Why are we discussing this? Just use email with a good filter like SpamAssassin, or route all your mail through Gmail and/or Google Apps for your Domain.
    I have my email address plastered all over the internet, and I see maybe 1 spam per week in my inbox, while my Gmail spam folder fills up with 100+ per day. The false-positive rate is zero, as far as I can tell, for the past several months. So instead of trying ultimately futile methods of security through obscurity, just let Google or someone else do the hard work for you.

  6. @joe:

    So the idea of thousands of spam messages being sent to your mail server doesn’t bother you as long as you don’t see them in your in-box? Does it concern you that bandwidth is being wasted on these messages?

    To me it’s like heating your house in the summer and then running air conditioning to lessen the heat. Sure, if your air conditioner is powerful enough, you can lower the temperature to a comfortable level—but look at all the power you’ll waste in the process.

    I agree that no solution is perfect (so does the author, and says so) and that some spam will inevitably find its way to your server. But it is still better to try to prevent addresses from being harvested. Belt and suspenders. Better levees and better evacuation procedures.

  7. I wonder why it is not mentioned to ‘build’ the email address with a mouse-over and an innerHTML JavaScript swap.
                   
    Possibly because this is then unusable to anyone who can’t use a mouse?

    For me, the bigger problem is that it requires a specific server type, or use of a specific language. I want something that I can use on different platforms, with different server languages that meets the original requirements plus platform independence.

    But I accept I’m probably in cloud-cuckoo land for the time being: so I’ll just stick to my contact forms and/or published email addresses with spam filters…

  8. I think you’re missing the point when you’re saying that “e-mail service providers typically don’t allow users to create addresses that contain a +”. The point of plus addressing is to add on to your email address. It’s a great “native” feature to fight spam by allowing you to track who sold your email or block specific incoming emails, all from a single email account… (It works great with Gmail, btw.)

    I think your solution is interesting (although it would seem that it might strain the server too much for what it does, but that’s just a guess at this point) but since this article is on ALA, it would be good if it was updated to replace “+” in your regex. I think most people will just use your script without modifications and thus ALA will effectively participate in indirectly furthering the obsolescence of plus addressing which is itself a great anti-spam feature… Sort of ironic since the point of your article is to fight spam ;) No?

  9. “A ‘+’ is typically not allowed in real e-mail addresses and it doesn’t have to be URL-encoded”

    That’s wrong on both counts, actually. As Yann B already pointed out, “+” is perfectly valid in an email address. As for URL encoding, if you don’t encode the “+”, it gets turned into a space when the query string is decoded. So, unless that’s the desired effect, you DO have to encode “+” in a URL.
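
    The decoding behaviour described here can be checked with a minimal sketch. It assumes a JavaScript runtime that provides the standard URLSearchParams class; the address and the "to" parameter name are made up for illustration and are not taken from the article.

```javascript
// Hypothetical address, used only for illustration.
const address = "joe+news@example.com";

// Unencoded: a query-string parser reads "+" as an encoded space.
const raw = new URLSearchParams("to=" + address).get("to");

// Percent-encoded: "+" becomes "%2B" and survives the round trip.
const encoded = encodeURIComponent(address);
const safe = new URLSearchParams("to=" + encoded).get("to");

console.log(raw);     // "joe news@example.com"
console.log(encoded); // "joe%2Bnews%40example.com"
console.log(safe);    // "joe+news@example.com"
```

    The unencoded address comes back with a space where the plus sign was, which is the corruption this comment warns about.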

  10. On sites I build, I don’t have email links anymore. Everything goes to a form, which in turn has a logic question (not a captcha image) to defeat bots. Yes, that presents a certain barrier to users, but if the forms are accessible and the user even slightly motivated, it works. The email servers are no longer swamped and the request forms don’t become porn magnets.

  11. @joe:

    Agreed. The bandwidth is already being used, and it’s Google’s; thus, no spam will ever end up on any server belonging to me. Not to mention, I won’t need to have this script run every time someone visits my page, so extra savings there as well.

    If someone really wants to take the hard way to stopping spam, maybe they should write a letter to their congressman. Not only is it less effective, but takes many times longer to see results!

  12. I run my own mail server for me and my family, and I encourage them to use a + in their email addresses for different sites, so they know which sites are sharing their e-mail addresses.

    It annoys me to no end when I try and sign up to a site and it says my e-mail address isn’t valid!

    I would agree with Yann B that a lot of people are going to use the ALA scripts unmodified and, in that respect, would encourage fixing the code to allow the + to be used.

  13. I (also) wrote a tool for obfuscation that works better than most of the other online tools I’ve seen. It was written mostly for myself, but it does work, so maybe someone else can benefit from it. Details about why I think it’s awesome are on the FAQ page if anyone cares.

    The techniques it uses all suffer from the negative points mentioned in this article, however.

    http://www.obfuscatr.com/

  14. In reality, however, e-mail service providers typically don’t allow users to create addresses that contain a +. I did point this out in the article.

    Errm. You mean like Gmail. I’m always giving out e-mail addresses with a + in them. It allows for easy filtering.

  15. JZ notes: “So the idea of thousands of spam messages being sent to your mail server doesn’t bother you as long as you don’t see them in your in-box? Does it concern you that bandwidth is being wasted on these messages?”

    Of course bandwidth is a concern, but the solution isn’t an arms race, particularly an arms race where the website owner is degrading the usability or accessibility experience of the site visitor. Treating all visitors as bad until they confirm themselves as humans (or humans who correctly answer a generated question) is treating visitors as culprits.

    If you want to avoid email spam, don’t use email. If you insist on using email as a means of communication then it is your responsibility to deal with the implications of using email, and you should not belabour the visitor and treat them like a guilty party until they prove their innocence.

    The techniques in this article are both simple and easily reversible. Even the “human proof” question is easily scraped, parsed, and answered. It’s generally trivial to write a regex that reduces the question to a form that can quickly be calculated.

    I’m watching black-hat SEOers and comment spammers automate the signup process on Blogger and Yahoo, their code works fairly stably, and both systems use Captcha and other anti-scripting techniques. (I did a presentation on this in the last Barcamp London)

    The problem here is that spammers have a monetary incentive to break through these flimsy defences. So any solution is merely temporary until the spammer is incentivised enough to spend an hour coding their way through these obstacles.

    There is no long term benefit with this solution, but there is a long term usability cost. How do visitors typically react when they are presumed guilty the first time they visit a website? Is that the experience we really want to be recommending?

  16. A very well written article as well as a brilliant idea.  Thank you for sharing. I work for a public university so accessibility and security are also major concerns for us.  We have had a lot of success using SpamSpan (http://www.spamspan.com/), which relies entirely on JavaScript and doesn’t embed quite as much data into a page as some other solutions.  It is also based on the DOM and is easy for Contribute users to remove (which by default protects embedded forms and JavaScript).

    I do have a question about the solution you suggest to handle if a user has JavaScript disabled.  The form presents a question to the user, which when answered correctly, proves they are not a machine.  As a person working in an accessibility-minded social-services agency, do you think that this form poses a problem for persons with cognitive disabilities?

  17. While I certainly agree that a + sign is allowed in email, and many people do in fact use it as an anti-spam measure (though, to be accurate, it’s more about tracking spam than preventing it), I honestly can’t expect it to serve this purpose for very long.

    If it’s standard behavior for email servers to simply discard the + and anything after it, what’s to stop people from simply doing the same thing? If I run a site which collects email addresses, and I intend to sell them, I’ll most certainly just run a quick regex and strip the + and everything after it before selling the address. Harvesters will likely do the same soon, if they don’t already.

    I’m not saying you shouldn’t use a + in email addresses you give out, just don’t pretend it’s a cure-all, when what limited usefulness it currently serves is bound to be lost in the not-so-distant future.
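
    The “quick regex” mentioned above might look something like the following sketch. The function name and the sample address are hypothetical, and the pattern only covers the simple case of a single tag in the local part.

```javascript
// Removes a "+tag" suffix from the local part of an address.
function stripPlusTag(address) {
  // Match "+" and everything after it, up to (but not including) the "@".
  return address.replace(/\+[^@]*(?=@)/, "");
}

console.log(stripPlusTag("bob+fakenamegenerator@example.com")); // "bob@example.com"
console.log(stripPlusTag("bob@example.com"));                   // "bob@example.com"
```

    That it takes one line is the point being made: the tag only helps for as long as list cleaners don’t bother to strip it.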

  18. Given that the example in the article uses preg_replace() in the callback for the output buffer, a large page on a high-traffic site can introduce some performance problems. The degraded version, however, has proven time and time again to help thwart spam. If you use a contact form (something like /contact/sales or /contact?who=sales), you still achieve the removal of the email address from the site, your users can still contact you, and you drop the expensive preg, the JS reversal of obfuscated addresses, and the Apache rewrite rules. While this probably goes without saying, be sure to benchmark before deployment.

    (Yes, it’s possible to automate the posting of data to the form in order to achieve the same goal, but this requires a custom tooling of a harvester which continues to be more effort than it is worth when there are thousands of other emails floating around on the Internet.)

  19. First off. Before I rant. Excellent article and ideas. I will be using the ideas and techniques in the future.

    And now, rant one:
    Congratulations on using some of the most unrepresentative of “average” stats available—W3Schools, a site visited mainly by Web designers and developers, not the general public. Let’s stop using stats as an argument whether or not to adopt something as best practice. There’s no such thing as a universally representative set of stats. Stats are only useful from your own site for analyzing your own users—and even then, it’s only useful for analyzing the users that you currently support (not the potential customers that are getting a sub-par experience through bugs, bad code or obtrusive practices).

    Rant two:
    I was going to mention the + as being valid. Others did. You also had a fine point: modify the RegEx to fit your own needs.

    Again, thanks for some excellent ideas. Please be responsible with those quotes of statistics. ;)

  20. I’d be very interested in seeing stats on the cost of processing these scripts on the server. Have you run a comparison study?

  21. I prefer using contact forms, especially on my work web site where there are many departments. I can use a pull-down to route the email to the appropriate department based on the inquiry, and the email addresses are maintained in a database so that should a contact person change I can change it in one place instead of searching the whole site to make changes.

    Digital Web recently had a helpful article on building “bulletproof contact forms”:http://www.digital-web.com/articles/bulletproof_contact_form_with_php/ .

  22. An interesting read, but I was disappointed to see that it was very platform specific.  ASP developers are out of luck.

  23. A similar but less complicated solution to this problem can be found here: http://www.scss.com.au/family/andrew/webdesign/emaillinks/#files

    It reads <a> tags with class="email" and gives them a mailto: href based on the link text, e.g. ‘“My Name” <me (at) here dot com>’.
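
    As a rough illustration of that approach (not the linked script’s actual code), the sketch below turns spelled-out link text back into an address. The function name is hypothetical, and the “(at)” and “dot” conventions are assumed from the example text above.

```javascript
// Converts text such as '"My Name" <me (at) here dot com>' into "me@here.com".
function textToAddress(linkText) {
  // If the text looks like 'Name <address>', keep only the bracketed part.
  const bracketed = linkText.match(/<([^>]*)>/);
  const spelled = bracketed ? bracketed[1] : linkText;
  return spelled
    .replace(/\s*\(at\)\s*/gi, "@") // "(at)" becomes "@"
    .replace(/\s+dot\s+/gi, ".")    // a standalone "dot" becomes "."
    .trim();
}

// In a browser, give every <a class="email"> a mailto: href built from its text.
if (typeof document !== "undefined") {
  document.querySelectorAll("a.email").forEach((link) => {
    link.href = "mailto:" + textToAddress(link.textContent);
  });
}

console.log(textToAddress('"My Name" <me (at) here dot com>')); // "me@here.com"
```

    Without JavaScript the link text is still readable by a human, which is what lets this kind of script degrade gracefully.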

  24. Great techniques posted here, but sadly there is one hugely gaping hole: using a real email address when registering a site. I made the mistake of registering with my real address (as opposed to either of my Yahoo and MSN spam-catcher addresses), and my spam went from 1 every few weeks to 300 a day in the space of 2 weeks!

  25. I found the article very interesting from a technical, problem-solving perspective, but as others have pointed out it’s not really necessary to go to this much trouble.  My email address is all over the Internet and there’s nothing I can do about it, so I pipe all mail through Gmail and then access it with POP3 or IMAP.  My own Web sites use mail forms instead of mailto links.  I see perhaps one false-negative (spam not identified) per week and have never seen a false-positive.

  26. E-mail obfuscation is a solution, but to an invalid issue.

    You are missing the big picture. The ultimate objective is not to find a clever way to publish e-mail addresses, readable by users, hidden from automated harvesters. Or to filter spam downstream, as some have suggested here. It’s to offer your readers/clients/users an effective way to reach you, fast and easily. This is what really counts. And that is why you published an e-mail address in the first place. Since spam renders the e-mail approach ineffective, you need an alternative to client-side mailing.

    You need a server-side communication system. The solution is obvious: a mailing form, handled server-side, to initiate the contact. That way, there are no e-mail addresses to be harvested and users can still reach you. It’s that simple.

  27. Microformats are one of the biggest things on the horizon. The idea here is to provide visitors of your website with useful information including your email address.

    So one of the questions for me is how do we manage to provide this information without opening the flood gates to spamming. Any thoughts on that?

  28. Rudi Gens asks: “So one of the questions for me is how do we manage to provide this information without opening the flood gates to spamming. Any thoughts on that?”

    Tackle the problem at the source. Link email harvesters to spammers, and then prosecute email harvesters. See www.projecthoneypot.org/

    It’s better than wallpapering over cracks.

  29. I was checking out your example page in Firefox and selected the email addresses, right-clicked, and selected View Selection Source, which converted the email addresses for me.  I’m not sure what Firefox does to the code to generate that, but it showed them to me in plain text.

  30. I don’t sell my visitors’ email addresses, but I purposely strip out everything after the + in their email addresses (e.g., bob+fakenamegenerator becomes bob).

    It wouldn’t surprise me if people who sell email addresses “clean” them first, or if spammers “cleaned” their mailing lists before sending out spam.

  31. Several people have commented that using forms eliminates the problem of spam from web sites. If only that were true. Spammers are already fully automated in posting to forms; one of my sites with just two forms produces nearly 1,000 bogus posts every day. Which means you’re right back to where you started… filtering for spam.

  32. Have to agree with the comments Mike Davies made in his post: “The site visitor should not have to prove their innocence – they are not guilty”.

    Spam is a cost of doing business.  E-mail spoofing is a more critical issue, I believe. Ensuring validity of e-Mail to clients and customers by use of a digital signature is a fundamental requirement, also.

    Attempts to defeat the keyboard-banging monkey should never undermine the ease or convenience with which customers communicate.

  33. As Marty points out, the use of + to append to the local part of an email address will be ineffective as long as the suffix can easily be filtered out.

    The answer to this is to make it common practice to junk all mail which does not include a suffix that the account owner has given out and does not come from whitelisted contacts and does not include some secret.

    Usage: friends mail joe; registrations are made to joe+somesite; friends-of-friends either use e.g. joe+bill or include [a987sd] in the subject, until they are whitelisted in their own right.

    Force people not to junk the + if they want to contact you. Then you deal only with spam that has been sent through friends’ addresses and have not required everyone to install PGP or something.
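
    The acceptance rule described above could be sketched as follows. Everything in the policy object (the user name, the issued suffixes, the whitelisted sender, the secret) is a hypothetical example, and the message is assumed to arrive as a plain object with to, from, and subject fields.

```javascript
// All values below are hypothetical examples.
const policy = {
  user: "joe",
  issuedSuffixes: new Set(["somesite", "bill"]), // suffixes the owner has handed out
  whitelist: new Set(["friend@example.org"]),    // senders trusted in their own right
  secret: "[a987sd]",                            // token friends-of-friends put in the subject
};

// Returns the "+suffix" part of the recipient's local part, or null if there is none.
function suffixOf(recipient, user) {
  const local = recipient.split("@")[0];
  const prefix = user + "+";
  return local.startsWith(prefix) ? local.slice(prefix.length) : null;
}

// Accept mail that carries an issued suffix, comes from a whitelisted sender,
// or includes the secret; everything else is junked.
function shouldAccept(message, policy) {
  const suffix = suffixOf(message.to, policy.user);
  if (suffix !== null && policy.issuedSuffixes.has(suffix)) return true;
  if (policy.whitelist.has(message.from.toLowerCase())) return true;
  return message.subject.includes(policy.secret);
}

console.log(shouldAccept(
  { to: "joe@example.com", from: "stranger@example.net", subject: "Buy now" },
  policy
)); // false
```

    Mail sent to the bare address by a stranger is junked, so a harvester that strips the suffix gains nothing.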

  34. This article just gave me a funky idea. I tested it with some browsers I had available, and it does seem to work.

    The idea is to use some “/contact.cgi/encoded-e-mail-address” URL, and then have the contact.cgi do a 3xx redirect to the mailto: URL.

    Sure, the bots can now try to follow all your links in the hope of finding a redirect to an actual mail address there. But I think it would be prohibitively inefficient for them, and it would also leave a recognizable mark in the logs.

  35. Personally, I find ‘mailto:’ links annoying, so I try to keep first-time contacts limited to a form handled by PHP. However, a mailto link could be returned by PHP just as easily.

    For example, you could have a link like this:
      email Jim Turner

    In PHP, you could keep an array which says who everyone is, or connect to MySQL and get the email addresses that way.

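      <?php
        // Assumption: the link passes the name in a "who" query parameter.
        $who = isset($_GET['who']) ? $_GET['who'] : '';
        // An array with Jim Turner in slot 1 (placeholder address).
        $email = array(1 => 'jim.turner@example.com');
        if ($who == 'jim_turner') {
          header('Location: mailto:' . $email[1]);
          exit;
        }
      ?>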

    While it might be an annoying script to write, you could code this once and forget about it as long as you can remember your co-workers’ names.

  36. one of my sites with just two forms produces nearly 1,000 bogus posts every day.

    It means only one thing: you have designed forms for robots to use, not humans. Your forms don’t require any intelligence and they don’t speak “human”. You are still thinking of forms as a bunch of input fields put together, and possibly you don’t validate the supplied data properly.

    Let me guess. You have published a form with inputs labelled so robots can use them:

    Name: [___________]
    E-Mail: [___________]
    Message:[_____________(multi-line textarea)_____________]

    How about a single textarea instead?

    If you need to contact us, please leave us a message in the following box. Don’t forget to identify yourself and give us a means to reach you (a phone number or an e-mail address will do just fine): [____(textarea)____]

    There are a multitude of other creative ways to deal with spammers. Require intelligence.

    I have two questions for you. How many bogus posts do you see in this thread? How do you explain the difference between this thread and your 1000 bogus posts/day form design?

  37. If you need to contact us, please leave us a message in the following box. Don’t forget to identify yourself and give us a means to reach you (a phone number or an e-mail address will do just fine): (textarea)

    A form that requires people to follow instructions? As if!

  38. “Contact the webmaster” doesn’t make it clear to the user that (s)he is about to hit a mailto link which will make their email client pop up (which is especially annoying when this user only uses webmail). “Email the webmaster” is better, but could still lead to a form.
    I think the best way is to use the actual email address as a label. But that means that the address is still in the code and can therefore be harvested. So I think I’ll stick to the name(at)domain.nl notation, which will be translated into a mailto link using JavaScript.

  39. BTW, nice illustration ;-)

  40. I’ve just checked two days’ worth of email and only 11% of the spam is to the address that’s included, without obfuscation, on every page of our company website.  In our case, the biggest spam magnets by far are the people who read HTML emails and automatically download external images.  Web bugs give the spammers an excellent record of which messages get through our filters and which don’t.

    That said, well done for a nice article with some interesting points.  I’m just glad I don’t need it!

  41. I also tried the 3xx redirect to a mailto URL, and while it does indeed bring up an email client, it often leaves the browser with a blank white page. So by clicking a contact button, your site itself disappears.

    You could get around this by opening the contact link in a new window, so your site is still open, but then you end up opening two windows (browser and email), with one of them serving no purpose other than to confuse your users. Not recommended.

  42. How will this code adapt to the situation of an email address such as john.doe@subdomain.domain.tld.countrycode ?

    If both periods and the ‘at’ sign are encoded to plus signs (or any other common character), it’s impossible to know that the email wasn’t supposed to be:
    john.doe.subdomain@domain.tld.countrycode

    Isn’t it?
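
    The ambiguity being asked about can be shown with a deliberately naive encoder. This is a sketch of the concern only, not the article’s actual scheme; the function names are hypothetical, and the “++” marker in the second half is just one assumed way of keeping the two separators distinct.

```javascript
// Naive scheme (assumed for illustration): "." and "@" both become "+".
function naiveEncode(address) {
  return address.replace(/[.@]/g, "+");
}

const first = "john.doe@subdomain.domain.tld.countrycode";
const second = "john.doe.subdomain@domain.tld.countrycode";
console.log(naiveEncode(first) === naiveEncode(second)); // true: the two collide

// One way out: use a different marker for "@" so the encoding can be reversed.
// This assumes the address itself contains no literal "+".
function encode(address) {
  return address.replace("@", "++").replace(/\./g, "+");
}
function decode(encoded) {
  return encoded.replace("++", "@").replace(/\+/g, ".");
}

console.log(encode(first));                   // "john+doe++subdomain+domain+tld+countrycode"
console.log(decode(encode(first)) === first); // true
```

    Any scheme that keeps the “@” position recoverable avoids the collision; the cost in this sketch is that addresses which really do contain a plus sign can no longer be encoded.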

  43. So what about blogs on subdomains that have no access to scripts? Script-based solutions leave out millions of people on Blogger, LiveJournal, and WordPress who get spammed just as badly as people with root access to their own domains. I personally get a Nigerian scam and a you-just-won-the-lottery scam every single day, and this has been going on for months now (although my email addresses have been public on my blog for years, it’s only lately that spammers are so aggressive). Can only the people spending money on hosting get access to the best protection, or is there a way to equalize such protection among the masses? I hate the inequality here.

  44. marah marie: Do you realize how cheap it is to get a domain/host these days? LiveJournal, Blogger, etc. clearly aren’t hosting solutions. They’re just available as featureless freebies. If you want access to PHP etc., you will usually have to pay for it like everyone else. How’s that for equality?

  45. @Kit Grose: If you had read the comments (specifically, comment 9)—or looked at the code—you’d have discovered that this code is smart enough to take multiple periods into account.

  46. @Walter Wlodarski:

    How about a single textarea instead?

    Which will be promptly filled with spam and submitted anyway. Lack of an “email” field will hardly be a deterrent.

    How many bogus posts do you see in this thread? How do you explain the difference between this thread and your 1000 bogus posts/day form design?

    Most likely due to ALA requiring an actual login, and to being policed/filtered via script, not because of the particular form fields present. What I get on my feedback form is absolutely no different than the average comment spam posted automatically to any blog site (even though my site is not a blog). So my point—still—is that feedback forms end up requiring exactly the same kinds of spam filtering that email requires.

    Given that not all such feedback forms on all my sites produce volumes of spam, I’m assuming that these spammers keep lists of forms and form field names that they use for their junk. Some of my forms are now on their lists, and some are not. They probably sell those lists to each other, too, as the volume only increases with time.

  47. This will probably work great for a while, maybe even years. But one of these days spammers are going to get smart and start using something like htmlunit (http://htmlunit.sourceforge.net/) to emulate all browser functionality (including JavaScript) and get around even the fanciest solutions.

    I think the best long-term solution will always be spam filters. We have our company’s contact email out in the open, unobfuscated, on the site. What’s stopping the spam? Google Apps and the same spam filter that powers Gmail. We’ve had the address out there for over 2 years and received exactly 4 spam messages (that got to the inbox) in that time.

  48. It seems that we are doomed to be always behind the spammers, trying to find a solution for their damaging conduct. So far this kind of spam has never been a big issue for me, however, I agree with the previous comments that spam filters might be a more elegant long term solution to the problem than resorting to codes which are anyway potentially vulnerable.

  49. I’m joining the chorus: please do revise the published code to take RFC 2822 into account. Seems like ALA should be encouraging compliance with published standards, not brushing them aside.

    If you need another example of why you might want to use a plus sign in your email address, read this NYT article about a woman who had a “stillbirth at 31 weeks”:http://www.nytimes.com/2005/09/20/health/20case.html and was still getting baby-related mail a year later. Every time she got a portrait studio ad it reminded her how old her daughter should have been (“Smiling: Your 3-month-old!”). Using an address like janedoe+baby@domain.com, she could safely have registered at as many pregnancy sites as she wanted, and then if she needed to, she could shut off the baby email with one filter.

  50. I enjoyed the article and think that’s an interesting solution.  However, it seems like an awful lot of work.

    For anyone interested, I came up with (yet another) JavaScript-based email obfuscator a few months back.  I don’t think mine is better (in the end, all of these systems are hacks that will eventually be defeated), but I think mine is simpler and may be easier to use.

    http://pipwerks.com/journal/2007/06/13/email-address-obsfucation

    My version relies on JavaScript, but also degrades decently.

    Hopefully in the future we won’t even need to have this discussion!  :)

  51. Elements are not tags…

    /Wash
    //Rinse
    ///Repeat

  52. I couldn’t see anywhere else to submit mistakes / bug reports:

    When I try out the “demo page”:http://www.roelvangils.be/geo/demo/ in a browser with JavaScript disabled, I keep being presented with the Turing test. Each time I answer the question, I get a new page asking me a different question rather than the contact email address.

    I also note that in the article, the author states that for this technique to work, Apache 2 or greater is required. However the test site runs on Apache 1.3.37.

  53. @Anthony Geoghegan: that’s odd, because we’ve tested this on all major platform/browser combinations and it always worked fine (it’s just a PHP script that evaluates to true or false, so there’s not much that can go wrong). I dare to ask: you did multiply (and not add) the two numbers in the Turing test, didn’t you? ;)

    About the required Apache version: yup, you’re right. Apache 2 isn’t even required (I only found this out recently).

  54. Sites like TinyURL are protocol agnostic, so you can create an http://tinyurl.com/ehjwehwjkewh URL for something like mailto:text@example.com. Spam bots will then treat it as an ordinary http link and ignore it, and your email address will be protected.

  55. Roel, you are correct to ask. For some reason, I read “sum” even though the text clearly said “product”. That’s what I get for staying on too late after work when I should be at home eating dinner.

    It’s a bit embarrassing that the first time I comment on ALA, I show myself being caught out by a (very) simple intelligence test. :-(

    Thanks for a great article and top technique for defeating spam-bots.

  56. Until recently, I too was under the impression that bots can’t parse JavaScript to make sense of hidden addresses.  But I’ve started working with Java’s JDIC WebBrowser object, and have realized how easy it would be for a Java-based bot to parse “post-processed” pages.  And if it CAN be done, I’m assuming it IS.

    Bottom line: IF USERS CAN SEE THE ADDRESS, SO CAN BOTS!

  57. @C Deardorff: you’re right: bots/spiders with Java-based JavaScript parsing capabilities do exist, but I doubt if they are used for the purpose of e-mail harvesting yet (because of speed issues etc.)  Have you any idea about their ability to initiate events such as onclick/onmouseup and ‘see’ the DOM changes that happen as a reaction to that? Because that’s what happens with this script (the JavaScript processing of the page doesn’t just happen after loading).

    I have a collection of (newly created) e-mail addresses that have been published on several of my own websites for over 6 months now, all protected by this technique, and none of them seems to have been harvested yet (I don’t receive any spam on these addresses). So, for now (!), it all works fine. Fingers crossed?

  58. There seems to be a small thread building up within this comments list about “Why not just use forms?”.

    Using a form is what we do mostly on our websites, but it is not always the solution: Roel’s solution is aimed at enabling “real” users to copy and paste an email address for use elsewhere, whereas forms don’t let the end user save the email address for later use. Some clients are OK with this, some aren’t.

    The other mention was that forms are also prone to Spam. Yes they are, but the point of the article was to prevent harvesting of email addresses, not to prevent form spam, for which there are some good solutions.

    Personally, though, I feel that this is always going to be a losing battle until the people receiving the spam actually stop reading it. The truth is that spamming makes someone a lot of money, which means there are people who act on spam messages. We will therefore always have spam and we just need to be pragmatic about it. Most spam filters are reasonably good. I certainly have few problems with spam and my email has been plastered all over the web for the last 10 years!

  59. I suppose preventing spam from being lucrative is a good idea, but curing it is much easier nowadays. Greylisting incoming email has proved itself where I work – 90% (dare I say even more than that) of all spam will filter out just by using greylisting. The rest will go through the common spam filters and won’t survive that filter. The few that do come through are filtered out by Thunderbird’s filter that I’ve trained over the months.

    I used to receive up to 600 spam mails each Monday (read: after a weekend of not downloading my mail); it’s down to an average of 2 now.

  60. I never expected I was gonna be excited to read this article. You offered a ‘WOW’ solution!
    Definitely gonna use it!

  61. I’m working on a Drupal site for a non-profit organization where many folks who don’t know how to construct a mailto link will be updating the site.  Drupal automatically converts email addresses to links, which is handy, but it doesn’t obfuscate the addresses at all.  I found some code that’s similar to this solution (http://drupal.org/node/62881), but both that code and the code featured here choke when at symbols are used in unconventional ways.

    In Spanish, many words are gendered (for example, ellos is the masculine form of “they” while ellas is the feminine.)  Some folks want to get away from this default gendering, and to do so, they replace the “a” or “o” in gendered words with “@.” Linguistic quibbles and obscurity aside, it’s something I need to accommodate on the site that I’m working on.  However, both this code and the Drupal-specific code I found mangle words like ell@s, interpreting them as email addresses. I think it’s something to do with the way the regular expression is constructed, but for the life of me I can’t figure out how to make it differentiate between example@domain.com and ell@s.

  62. Jack: You might try replacing with “a” or “o” and running it through a spellcheck. If it passes, it’s a word. If it fails, it’s probably an email address. You might also check for the existence of both “@” and “.” in the same word.
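
    A minimal sketch of the second suggestion (requiring both “@” and a dotted domain in the same word), in JavaScript. The pattern and the `looksLikeEmail` name are assumptions for illustration, not the regular expression used by the GEO script or the Drupal code:

```javascript
// Hypothetical helper, not part of the GEO script or any Drupal module.
// A token counts as an e-mail address only if the part after "@" ends in
// a dot followed by at least two letters, so "ell@s" is left alone.
var EMAIL_LIKE = /^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}$/;

function looksLikeEmail(token) {
  return EMAIL_LIKE.test(token);
}

// Split on whitespace and keep only the tokens that look like addresses.
function findEmails(text) {
  return text.split(/\s+/).filter(looksLikeEmail);
}
```

    With this pattern, `looksLikeEmail('example@domain.com')` is true and `looksLikeEmail('ell@s')` is false. It is deliberately stricter than the RFC and will miss some valid addresses; it only aims to separate the two cases described above.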

  63. ReachBy.com hosts contact pages with spam protection so you can post and share your link, e.g. http://yourname.reachby.com instead of exposing email address.

  64. Brian,

    I loved the idea of using TinyURL.  Unfortunately, I just discovered that Safari kicks out a redirect error.

    Off to try plan b.

    bob

  65. Although I love the content of this article (the technical side), it’s like shooting a small bird with a Kalashnikov when you can simply throw a stone at it! I once asked a friend of mine, “Why don’t you do anything about the spam you’re getting?” and he said: spam is a good way to know your email is actually working! :)
    The point, as mentioned above: you want to kill spam? Stop reading it. As for using the email as a link, I always found that too unfriendly for users, since most probably I do not wish to launch my email client; I just want to copy and paste it later into Yahoo, for example… a truly useful contact is by means of a form… that said, email images are probably the best.

  66. I like to see creative people coming up with solutions to common problems like spam-n-such.

    Everybody has different needs and I think this is a fine method of helping to prevent spam on a site where you may not be able to control the recipient’s spam filter and/or adding a form isn’t an option. Different Needs.

    In re: mailto links, how about adding a tiny icon after each email that copies the address to your clipboard? I would use something like that so as not to launch my e-mail client. I don’t know if this has already been thought of, although I’d have to believe it has. It’s the little things.

  67. I’ve just created a new possibility to use reCAPTCHA’s Mailhide functionality (http://mailhide.recaptcha.net/):

    http://code.google.com/p/mailhide-tag/
    It is a JSP tag which helps developers hide mail addresses from spambots.

  68. I’m a bit disappointed that the writer of an article published on alistapart.com was not aware that + is absolutely allowed in email addresses. Yeah, it was a good article, but every web “professional” should know the “plus in email” fact.

  69. @Jaakko Holster: firstly, I’d like you to read “this comment”:http://www.alistapart.com/comments/gracefulemailobfuscation?page=2#14.

    Secondly: sure, according to the official RFC, a plus sign is allowed in e-mail addresses, but:

    a) In reality, e-mail service providers don’t allow users to create addresses that contain a plus sign (!)

    b) I’m perfectly aware that ‘plus addressing’ is an interesting and commonly used technique (by geeks, at least) to tag/filter incoming mail and/or to backtrace where spammers got your address from, but these are not the addresses that you publish on a public website; you use plus addressing when signing up for online services, newsletters etc. Don’t forget that the plus sign is not a part of the actual address.

    c) However, if you insist, you could very easily adapt the regular expressions that the GEO technique uses, to separate the name/domain/tld in an e-mail address with something else (a ‘/’ would be a good idea).

    I hope this helps ;)

  70. The simplest solution I found on one website is

    mail me

    I don’t actually know if it’s effective, but when someone has JavaScript turned off, we have a problem.

  71. Are any of the current bots smart enough to catch something like this?

    <span>address</span><span style="display: none;"></span><span>@</span><a>example.com</a>

    The address is not clickable (which is actually preferable to me), but if you select it and copy it for pasting into your e-mail client, it only pulls the e-mail address.

  73. I do like the fact that people are always trying new methods (such as this one) yet the main disadvantages of this method as far as I can see are:

    • developer time to implement this on their site
    • nearly attempting to apply a method that is a one-size-fits-all: every site comes with a different user base.
    • falling back to a case where it is not user friendly (i.e. how many processes do users have to go through before they get the email?) in absence of Javascript
    • the + is a server setting which requires investigation by the developer
    • simply not being able to copy/paste an email

    I contacted ALA several times a few years ago, when I was about to publish an article which was a compilation of “methods to hide emails from the page source”:http://www.csarven.ca/hiding-email-addresses, talking about the pros and cons of each method and the resources spammers would need to beat it.

    My question to you is why did I not get any response from you and why am I reading this: http://alistapart.com/articles/gracefulemailobfuscation/ now?

    Don’t get me wrong, I do appreciate the effort that went into writing this article.

  74. Hi, just don’t forget that the “+” sign is used for Gmail filtering, which may cause issues with this particular method.

    “A ‘+’ is typically not allowed in real e-mail addresses and it doesn’t have to be URL-encoded—which will come in handy later on.”

    e.g. email.address+ala@gmail.com

    I would suggest a “$” instead as this is an invalid email character.

    Great article nevertheless. I love your site and the great reading it provides.

  75. Sorry, I just read through the comments and I noticed you had already made mention of the “+” sign in email addresses.

  76. No server access means no way to create the non-JavaScript option. For these instances, and the ones where the client insists on showing their email in plain text, I no longer worry about it but use an email address I don’t mind changing every 6 or 12 months (or when spam is taking over). As the email address on the website is generally not the main email address in the first place, but just a first-contact one, this approach seems to work quite well.

  77. Anyone have an idea as to how to get this to work in Drupal?

  78. Okay, got it working in Drupal; however, it seems to always assume JavaScript is turned off (when it’s not – I’m assuming Drupal is blocking something…)

  79. I like what this script is doing, but there’s a big problem with using “window.onload” because it conflicts with any other script(s) that you’re using on a page.
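
    One common way around this kind of conflict is to register the handler without overwriting whatever is already assigned to `window.onload`. The sketch below is an illustration only: `addLoadHandler` is a hypothetical helper, not a function from geo.js, and it assumes the script's setup code can be passed in as a plain function.

```javascript
// Hypothetical helper (not part of geo.js): adds a load handler without
// discarding one that another script has already assigned.
function addLoadHandler(target, handler) {
  if (typeof target.addEventListener === 'function') {
    // Standard DOM: several listeners can coexist on the same event.
    target.addEventListener('load', handler, false);
    return;
  }
  // Fallback: chain the new handler after the existing onload, if any.
  var previous = target.onload;
  target.onload = function (event) {
    if (typeof previous === 'function') {
      previous.call(target, event);
    }
    handler.call(target, event);
  };
}
```

    In a page this would be called as `addLoadHandler(window, someInitFunction)`, where `someInitFunction` stands in for whatever the script currently assigns to `window.onload`.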

  80. Actually, I don’t think the problem in my previous post has to do with Drupal.  I just tried installing this on an empty site and the JavaScript won’t kick in.  Can anyone think of something obvious I might be missing?

  81. Never mind – JavaScript works, but it still won’t kick in using Drupal…

  82. I wrote a Rails plugin that makes it a snap for Rails people to incorporate this idea into their apps. The writeup is here: http://playtype.net/past/2008/3/1/graceful_email_obfuscation_in_ruby/

  83. I’m also having a hard time with JavaScript not kicking in. I’m not using a Drupal site either! I’ve tinkered for over an hour and still have no idea why it won’t kick in. Even the simplest test page (no additional scripts) does not work.

    I also get a message in Firebug that the geo.js has an error: missing ; before statement
    [Break on this error] var tooltip_js_off = ‘To reveal this e-mail address, you’ll need to answer a si…

    Nice idea. I love the idea…
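
    One plausible cause of a “missing ; before statement” error on that line, assuming the file contains a plain apostrophe inside a single-quoted string (as in “you'll”): the apostrophe ends the string early and the rest of the sentence is parsed as code. The snippet below is a sketch that demonstrates the effect with a made-up message; the `parses` helper and the sample strings are hypothetical, and the real text in geo.js is truncated in the error above.

```javascript
// Hypothetical helper: reports whether a piece of source code parses.
// new Function() compiles the source without running it and throws a
// SyntaxError if the source is invalid.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    return false;
  }
}

// Unescaped apostrophe inside single quotes: the string ends after "you".
var unescaped = "var tip = 'you'll need to answer';";

// Two possible fixes: escape the apostrophe, or use double quotes.
var escaped = "var tip = 'you\\'ll need to answer';";
var doubleQuoted = 'var tip = "you\'ll need to answer";';
```

    Here `parses(unescaped)` is false, while `parses(escaped)` and `parses(doubleQuoted)` are both true. Typographic (curly) quotes used as string delimiters would fail in a similar way, so it is worth checking that the file was not saved through a word processor.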

  84. Why, the article is great and pretty helpful! I used to get a lot of spam because my e-mail address is quite public… But I resolved my problem in a much easier way: I just signed up with Gafana, and I don’t get spam any more. That’s it, guys.

  85. The solution described in the article is great; however, I think it is too much of an effort to implement. What I don’t want is to waste more time on spam scumbags than necessary.

    Spamspan [1] is a nice, simple, pretty clean solution using JS+CSS. It degrades nicely when JS is turned off and is easy to customize, so it is the solution of my choice, especially because there is a Drupal [2] module [3] for it.

    [1] http://spamspan.com http://spamspan.de
    [2] http://drupal.org
    [3] http://drupal.org/project/spamspan

  86. This seems to be an attractive technique, but I have concerns about any added pre-processing and post-processing time. If each file has to run through this filter before it is served, and then processed by another javascript function, does this noticeably affect the page load time?

  87. Roel,
    why don’t you use your technique on your own sites, like Anysurfer.be? Is it not that accessible after all?

  88. My method of displaying email addresses is simple enough that it doesn’t require heavy scripting nor massive legwork on the user’s part.

    “If you would like to get in contact with me, you may email gibson at the domain this site currently resides.”
    (I haven’t really put much time into this short line. I just wanted to get the point across.)

    I feel that displaying an email in a more cunning fashion, such as this, can accomplish a number of goals when trying to filter email.

    1. The amount of spam is greatly reduced, of course. As far as I know, no currently effective bots can grab an email address from this text.
    2. The number of “superfluous” messages is greatly reduced. I feel that all other messages can go to the comments box if a user is too lazy to manually type in an email address when they would like to speak with me. (Please tell me if I’m biased or just plain wrong.)
    3. Completely cross-browser compatible. No scripting == fewer browser incompatibilities.

    This method has proven to be quite useful and has eliminated 100% of spam messages (not to mention a number of unneeded ones).

  89. Excellent article. Unfortunately I have to work with an undocumented proprietary content management system written in ASP. I have come up with a simple email obfuscator based on numeric character references, JavaScript and CSS. Take a look at my blog post at www.pixelwisedesign.com/blog/?p=40 if you are in a similar situation and cannot utilize a server side language.

  90. For those of you wanting to use this with WordPress: amazingly, there did not seem to be any implementations of quite this method as of a week ago. I knocked up a rudimentary version, available on “Wordpress.org plugins”:http://wordpress.org/extend/plugins/graceful-email-obfuscation/, which should do the job. Contact me if you are interested in improvements, which I would be happy to bang in if you want to use it on a bigger site.

    I have made some slight changes to the method to hook into WP’s processing, avoiding Apache dependency, and tweaked the encoding method a little.

    For details and links to any future plans, I will update “the post on my site”:http://www.nicholaswilson.me.uk/2010/04/notes-on-good-email-obfuscation/ if I come back to this.

  91. I wrote an email obfuscation routine at http://www.php-ease.com/functions/email_link.html that does what you did, but steps it up a notch.  I wrote a function that base64-encodes the entire mailto link and I put it in the title attribute.  When a user hovers over it with a mouse (yeah, I know, not everyone has JavaScript, and for this I don’t care), the title is decoded and placed into the href attribute before they click on the link, and when their mouse leaves the anchor area it becomes obfuscated again.  To go one step further, I also rot13 it, but that’s not included in the script I provide.  This solves all of the problems of +’s, extra periods, even Chinese characters.
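
    The hover-to-decode idea described above can be sketched roughly as follows. This is not the code from the linked routine: the function names (`encodeMailto`, `decodeMailto`, `armLink`) are hypothetical, the rot13 step is left out, and the `#` placeholder for the idle href is an assumption.

```javascript
// Hypothetical sketch, not the commenter's actual script.
// encodeURIComponent first, so that non-ASCII characters survive base64.
function encodeMailto(address) {
  var text = encodeURIComponent('mailto:' + address);
  return typeof btoa === 'function'
    ? btoa(text)
    : Buffer.from(text, 'ascii').toString('base64'); // Node fallback
}

function decodeMailto(encoded) {
  var text = typeof atob === 'function'
    ? atob(encoded)
    : Buffer.from(encoded, 'base64').toString('ascii'); // Node fallback
  return decodeURIComponent(text);
}

// Reads the encoded value from the link's title attribute, reveals the
// real href while the pointer is over the link, and hides it again after.
function armLink(link) {
  var encoded = link.title;
  link.onmouseover = function () {
    link.href = decodeMailto(encoded);
  };
  link.onmouseout = function () {
    link.href = '#';
  };
}
```

    As the comment itself concedes, visitors without JavaScript (or without a pointing device that triggers hover) never get a working link with this approach.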
