A LIST Apart: For People Who Make Websites

No. 248

Discuss: Graceful E-Mail Obfuscation


21 RE: JUST FILTER IT

@joe:

Agreed. The bandwidth is already being used, and it’s Google’s, so no spam will ever end up on any server belonging to me. Not to mention, I won’t need to have this script run every time someone visits my page, so there are extra savings there as well.

If someone really wants to take the hard way to stopping spam, maybe they should write a letter to their congressman. Not only is it less effective, but it takes many times longer to see results!

posted at 08:36 pm on November 6, 2007 by Inejiro Asanuma

22 The + is incredibly useful!

I run my own mail server for me and my family, and I encourage them to use a + in their email addresses for different sites, so they know which sites are sharing their e-mail addresses.

It annoys me to no end when I try and sign up to a site and it says my e-mail address isn’t valid!

I would agree with Vann B that a lot of people are going to use the ALA scripts unmodified, and in that respect, I would encourage fixing the code to allow the + to be used.
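
A minimal sketch of what such a fix might look like, written in Python as a stand-in for the article's PHP. The pattern and the name is_plausible_email are assumptions made for illustration, not the regex from the ALA script, and the check is deliberately loose rather than a full address validator.

```python
import re

# Hypothetical pattern (not the one from the article). The "+" inside the
# first character class is what lets tagged addresses through.
PLAUSIBLE_EMAIL = re.compile(
    r"[A-Za-z0-9._%+-]+"          # local part, "+" allowed
    r"@"
    r"[A-Za-z0-9-]+"              # first domain label
    r"(\.[A-Za-z0-9-]+)+"         # at least one further ".label"
)


def is_plausible_email(address: str) -> bool:
    """Return True if the whole string looks like an e-mail address."""
    return PLAUSIBLE_EMAIL.fullmatch(address) is not None
```

With this pattern, an address such as wyatt+ala@example.com is accepted, while a string with no @ or no dot in the domain is rejected.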

posted at 09:01 pm on November 6, 2007 by Wyatt Draggoo

23 Obfuscatr

I (also) wrote a tool for obfuscation that works better than most of the other online tools I’ve seen. It was written mostly for myself, but it does work, so maybe someone else can benefit from it. Details about why I think it’s awesome are on the FAQ page if anyone cares.

The techniques it uses all suffer from the negative points mentioned in this article, however.

http://www.obfuscatr.com/

posted at 09:04 pm on November 6, 2007 by bugmenot bugmenot

24 Gmail supports +addresses

“In reality, however, e-mail service providers typically don’t allow users to create addresses that contain a +. I did point this out in the article.”

Errm. You mean like Gmail? I’m always giving out e-mail addresses with a + in them. It allows for easy filtering.

posted at 09:21 pm on November 6, 2007 by bugmenot bugmenot

25 The site visitor should not have to prove their in

JZ notes: “So the idea of thousands of spam messages being sent to your mail server doesn’t bother you as long as you don’t see them in your in-box? Does it concern you that bandwidth is being wasted on these messages?”

Of course bandwidth is a concern, but the solution isn’t an arms race, particularly an arms race where the website owner is degrading the usability or accessibility experience of the site visitor. Treating all visitors as bad until they confirm themselves as humans (or humans who correctly answer a generated question) is treating visitors as culprits.

If you want to avoid email spam, don’t use email. If you insist on using email as a means of communication then it is your responsibility to deal with the implications of using email, and you should not belabour the visitor and treat them like a guilty party until they prove their innocence.

The techniques in this article are both simple and easily reversible. Even the “human proof” question is easily scraped/parsed and answered. It’s generally a trivial regex to reduce the question to a form that can be quickly calculated.
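
To illustrate how little effort that takes, here is a rough Python sketch. It assumes the question is simple arithmetic phrased along the lines of “What is 3 plus 4?”; that format, the QUESTION pattern, and the answer_question name are hypothetical, and the article's actual wording may differ.

```python
import re

# Hypothetical question format: two integers joined by a word or symbol.
QUESTION = re.compile(
    r"(\d+)\s*(plus|minus|times|\+|-|\*)\s*(\d+)",
    re.IGNORECASE,
)

# Map each recognised operator token to the calculation it stands for.
OPERATIONS = {
    "plus": lambda a, b: a + b,
    "+": lambda a, b: a + b,
    "minus": lambda a, b: a - b,
    "-": lambda a, b: a - b,
    "times": lambda a, b: a * b,
    "*": lambda a, b: a * b,
}


def answer_question(text: str):
    """Return the numeric answer, or None if no arithmetic is found."""
    match = QUESTION.search(text)
    if match is None:
        return None
    left, operator, right = match.groups()
    return OPERATIONS[operator.lower()](int(left), int(right))
```

A scraper would only need to pull the question text out of the form, pass it through something like answer_question, and post the result back.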

I’m watching black-hat SEOers and comment spammers automate the signup process on Blogger and Yahoo. Their code works fairly stably, even though both systems use CAPTCHA and other anti-scripting techniques. (I did a presentation on this at the last Barcamp London.)

The problem here is that spammers have a monetary incentive to break through these flimsy defences. So any solution is merely temporary until the spammer is incentivised enough to spend an hour coding their way through these obstacles.

There is no long term benefit with this solution, but there is a long term usability cost. How do visitors typically react when they are presumed guilty the first time they visit a website? Is that the experience we really want to be recommending?

posted at 09:24 pm on November 6, 2007 by Mike Davies

26 Wow

A very well written article as well as a brilliant idea. Thank you for sharing. I work for a public university so accessibility and security are also major concerns for us. We have had a lot of success using SpamSpan (http://www.spamspan.com/), which relies entirely on JavaScript and doesn’t embed quite as much data into a page as some other solutions. It is also based on the DOM and is easy for Contribute users to remove (which by default protects embedded forms and JavaScript).

I do have a question about the solution you suggest to handle if a user has JavaScript disabled. The form presents a question to the user, which when answered correctly, proves they are not a machine. As a person working in an accessibility-minded social-services agency, do you think that this form poses a problem for persons with cognitive disabilities?

posted at 09:50 pm on November 6, 2007 by Ashley Callahan

27 Regarding the usefulness of +

While I certainly agree that a + sign is allowed in email, and many people do in fact use it as an anti-spam measure (though, to be accurate, it’s more about tracking spam than preventing it), I honestly can’t expect it to serve this purpose for very long.

If it’s standard behavior for email servers to simply discard the + and anything after it, what’s to stop people from simply doing the same thing? If I run a site which collects email addresses, and I intend to sell them, I’ll most certainly just run a quick regex and strip the + and everything after it before selling the address. Harvesters will likely do the same soon, if they don’t already.
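
The “quick regex” in question could be as small as the following Python sketch. The strip_plus_tag name is made up for illustration, and the sketch assumes the tag runs from the first + up to the @.

```python
import re

# Matches a "+" and everything after it in the local part, stopping
# just before the "@" (the lookahead leaves the "@" in place).
PLUS_TAG = re.compile(r"\+[^@]*(?=@)")


def strip_plus_tag(address: str) -> str:
    """Remove a +tag from the local part of an address, if present."""
    return PLUS_TAG.sub("", address, count=1)
```

Given marty+somesite@example.com, this returns marty@example.com; an address without a + comes back unchanged.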

I’m not saying you shouldn’t use a + in email addresses you give out, just don’t pretend it’s a cure-all, when what limited usefulness it currently serves is bound to be lost in the not-so-distant future.

posted at 09:56 pm on November 6, 2007 by Marty Alchin

28 Performance

Given the example in the article uses preg_replace() in the callback for the output buffer, a large page on a high traffic site can introduce some performance problems. The degraded version, however, has proven time and time again to help thwart spam. If you fill out a contact form (something like /contact/sales or /contact?who=sales), you still achieve the removal of the email address from the site, your users can still contact you, and you drop the expensive preg, the JS reversal of obfuscated addresses, and the Apache rewrite rules. While this probably goes without saying, be sure to benchmark before deployment.
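
One way to get a first feel for that cost is a micro-benchmark along these lines. This is a Python sketch rather than PHP, so the absolute numbers will not carry over to preg_replace() in an output-buffer callback. The EMAIL pattern, the placeholder transformation, and the synthetic page are all assumptions made for illustration, not the article's code.

```python
import re
import timeit

# Hypothetical address pattern and transformation, for timing only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def obfuscate_page(html: str) -> str:
    """Rewrite every address on the page into a spelled-out form."""
    return EMAIL.sub(
        lambda m: m.group(0).replace("@", " at ").replace(".", " dot "),
        html,
    )


def make_page(paragraphs: int = 2000) -> str:
    """Build a synthetic page with one address per paragraph."""
    filler = "<p>Lorem ipsum dolor sit amet, write to sales@example.com today.</p>\n"
    return filler * paragraphs


def seconds_per_page(html: str, runs: int = 20) -> float:
    """Average wall-clock seconds for one pass over the page."""
    return timeit.timeit(lambda: obfuscate_page(html), number=runs) / runs


if __name__ == "__main__":
    page = make_page()
    print(f"{len(page)} bytes: {seconds_per_page(page):.6f} s per request")
```

Running the same measurement against real page sizes and real traffic patterns, in the language the site actually uses, is what would settle whether the overhead matters.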

(Yes, it’s possible to automate the posting of data to the form in order to achieve the same goal, but this requires a custom tooling of a harvester which continues to be more effort than it is worth when there are thousands of other emails floating around on the Internet.)

posted at 10:04 pm on November 6, 2007 by Jakob Heuser

29 W3School Stats?!

First off. Before I rant. Excellent article and ideas. I will be using the ideas and techniques in the future.

And now, rant one:
Congratulations on using some of the most unrepresentative of “average” stats available: W3Schools, a site visited mainly by Web designers and developers, not the general public. Let’s stop using stats as an argument whether or not to adopt something as best practice. There’s no such thing as a universally representative set of stats. Stats are only useful from your own site for analyzing your own users, and even then, they’re only useful for analyzing the users that you currently support (not the potential customers that are getting a sub-par experience through bugs, bad code or obtrusive practices).

Rant two:
I was going to mention the + as being valid. Others did. You also had a fine point: modify the RegEx to fit your own needs.

Again, thanks for some excellent ideas. Please be responsible with those quotes of statistics. ;)

posted at 10:08 pm on November 6, 2007 by John Lascurettes

30 Costs of running these scripts on the server?

I’d be very interested in seeing stats on the cost of processing these scripts on the server. Have you run a comparison study?

posted at 10:15 pm on November 6, 2007 by Art Wagner
