Anti-Spam Techniques In PHP - Part 1

This tutorial provides a few simple techniques for protecting yourself and your web site from spammers.
Provided by Quentin Zervaas

Introduction

This short series of articles provides a few simple techniques for protecting yourself and your web site from spammers.

It does this from two perspectives: protecting people, and protecting your web site.

Protecting people: if you publish your email address on a web site, there's a good chance it will be harvested for use by spammers. Many harvesting tools are crude and poorly written, so it can be easy to protect yourself from them, although some are more sophisticated.

This first article in the series contains techniques for protecting your own email address and those of the people who post to your web site.

Protecting your web site: if your web site allows anybody to submit content, chances are spammers will bombard it with links to their products. They do this to increase their backlink count and therefore their Google PageRank, giving them better search engine results.

There are a number of anti-spam techniques for preventing this, which will be discussed in the second article of this series.

Protecting people

Technique 1: Obfuscating

This technique allows your email address to be displayed on the web page exactly as it is, while hiding it in the HTML source of your page. Since email harvesters don't "see" your page but just read its source, it is harder to write a pattern matcher against it.

Let's see how it's achieved with Smarty. Smarty has built-in functionality for this, which you can read about in the Smarty manual, but it is basically achieved like this:

Listing 1. Code in template.
{assign var="email" value="antispam@example.com"}
<a href="mailto:{$email|escape:'hex'}">{$email|escape:'hexentity'}</a>

This will output:
Listing 2. Output in HTML file.
<a href="mailto:%61%6e%74%69%73%70%61%6d%40%65%78%61%6d%70%6c%65%2e%63%6f%6d">
    &#x61;&#x6e;&#x74;&#x69;&#x73;&#x70;&#x61;&#x6d; <!-- antispam -->
    &#x40;                                           <!-- @ -->
    &#x65;&#x78;&#x61;&#x6d;&#x70;&#x6c;&#x65;       <!-- example -->
    &#x2e;                                           <!-- . -->
    &#x63;&#x6f;&#x6d;                               <!-- com -->
</a>

The lines have been broken up and commented just for readability.

When you view it in your browser it will just appear as antispam@example.com.

To achieve this without Smarty, we just borrow Smarty's code (from Smarty/plugins/modifier.escape.php):

Listing 3. PHP code.
<?php
// encode every character as %XX, for use inside the mailto: URL
function escapeHex($string)
{
    $return = '';
    for ($x = 0; $x < strlen($string); $x++) {
        $return .= '%' . bin2hex($string[$x]);
    }
    return $return;
}

// encode every character as an &#xXX; entity, for the visible link text
function escapeHexEntity($string)
{
    $return = '';
    for ($x = 0; $x < strlen($string); $x++) {
        $return .= '&#x' . bin2hex($string[$x]) . ';';
    }
    return $return;
}

$email = 'antispam@example.com';
echo '<a href="mailto:' . escapeHex($email) . '">' . escapeHexEntity($email) . '</a>';
?>

Realistically though, it would not be terribly difficult to extend an email harvester to decode these hex entities, but hopefully the encoding is enough to defeat some of them.

The advantage of this method is that you can still keep the email linked so users can send emails directly in their email client.
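To make the caveat above concrete, here is a small sketch of how little work a harvester would need to undo this encoding. The two strings are the encoded values from Listing 2 (without the comments); rawurldecode() and html_entity_decode() are standard PHP functions, and the variable names are just for illustration.

```php
<?php
// What the page source contains after Technique 1
$href = '%61%6e%74%69%73%70%61%6d%40%65%78%61%6d%70%6c%65%2e%63%6f%6d';
$text = '&#x61;&#x6e;&#x74;&#x69;&#x73;&#x70;&#x61;&#x6d;&#x40;'
      . '&#x65;&#x78;&#x61;&#x6d;&#x70;&#x6c;&#x65;&#x2e;&#x63;&#x6f;&#x6d;';

// One standard call per encoding is enough to recover the address
$fromHref = rawurldecode($href);
$fromText = html_entity_decode($text, ENT_QUOTES, 'UTF-8');

echo $fromHref, "\n"; // antispam@example.com
echo $fromText, "\n"; // antispam@example.com
```

So this technique only stops harvesters that match raw text without decoding it first.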

Technique 2: Rewriting

This technique is used commonly by people who post to forums, blogs or discussions to keep their email address human-readable, but without using standard symbols like @ and period. This means if anyone wants to email you they will have to type out your email address manually, but at least if somebody has gone to that effort it's unlikely they are sending you spam.

So basically what we are doing here is to turn an email address that looks like antispam@example.com into antispam [AT] example [DOT] com, or something along those lines.

Once again, Smarty has built-in functionality for this, using the escape modifier.

Listing 4. Code in template.
{assign var='email' value='antispam@example.com'}
{$email|escape:'mail'}

This will output:
Listing 5. Output in HTML file.
antispam [AT] example [DOT] com

There's no point in making this a hyperlink as it would not be valid anyway.

To achieve this without Smarty, use something like this:

Listing 6. PHP code.
<?php
function escapeEmail($string)
{
    // swap the symbols harvesters look for with human-readable words
    return str_replace(array('@', '.'), array(' [AT] ', ' [DOT] '), $string);
}

$email = 'antispam@example.com';
echo escapeEmail($email);
?>

A further technique you could employ is to use random replacements when showing multiple email addresses, so each time you show an address it could look different: antispam [AT] example [DOT] com, then antispam AT example DOT com. Simple variations like these make harvester writers work that extra bit harder.
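As a rough sketch of that idea, the function below picks one replacement style at random each time it is called. The name escapeEmailRandom and the particular list of styles are made up for this example; use whatever variations suit your site.

```php
<?php
// Hypothetical helper: rewrite an address using one randomly chosen style.
function escapeEmailRandom($email)
{
    // Each entry is array(replacement for '@', replacement for '.').
    // These particular variations are only examples.
    $styles = array(
        array(' [AT] ', ' [DOT] '),
        array(' AT ', ' DOT '),
        array(' (at) ', ' (dot) '),
    );

    // array_rand() returns a random key of the array
    $style = $styles[array_rand($styles)];

    return str_replace(array('@', '.'), $style, $email);
}

echo escapeEmailRandom('antispam@example.com');
```

Because the style is chosen on every call, the same address may be written differently each time the page is loaded.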

Technique 3: Images

This technique is the most difficult for email harvesters to deal with. It involves never having your email address appear in text (be it encoded or in plaintext), but rather displaying an image which contains your email address. If you put it in the same font and size as the rest of your site (keeping it anti-aliased), most people will never realise. As with technique 2, though, you can't hyperlink it directly.

There are two ways to implement this.

The first way is to use something like Photoshop to create an image containing your email address. This takes only a minute or two and you're done. The problem with this is that if your email address ever changes you need to recreate the image. Also, this has to be done for every email address you want to publish, so you can't do any dynamic image creation based on, say, the email addresses of people who submit content.

The second way is to use something like the PHP image functions to create email images. Just create a script which draws the text on a small canvas (there are millions of tutorials on the web for doing this), and display it.

Listing 7. HTML code.
My email address is
<img src="/images/emailAddress.php" alt="email" class="email-image" />

Here's an example of what the emailAddress.php script will look like:

Listing 8. emailAddress.php
<?php
// the email address to draw
$email = 'antispam@example.com';
// select the font. '4' is a builtin kind,
// with each letter about 8px wide
$font = 4;
$width = strlen($email) * 8; // 8px wide per letter
$height = 15; // this font size needs about this height
// create the GD image
$im = ImageCreate($width, $height);
// allocate the background colour. The first call
// to this function sets the background
$white = ImageColorAllocate($im, 255, 255, 255);
// the text colour.. black
$textColor = ImageColorAllocate($im, 0, 0, 0);
// write the email address to the image
ImageString($im, $font, 0, 0, $email, $textColor);
// output the content-type header and the image
header('Content-type: image/jpeg');
imageJpeg($im);
?>

You can also use ImageTTFText() to draw the text with TrueType fonts, and use the ImageTTFBBox() function to work out how big to make the image, but this code is at least a starting point.
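Here is a rough sketch of the TrueType approach, assuming GD was built with FreeType support and that you have a .ttf file on your server. The function names canvasSizeFromBox and outputEmailImage, the 2px padding and the default font size are all made up for this example. It also outputs PNG rather than the JPEG used in Listing 8, since PNG tends to keep text edges crisper; that is a substitution on my part, not a requirement.

```php
<?php
// Turn the 8-value array returned by imagettfbbox() into a canvas size.
// Indexes: 0,1 = lower left x,y; 2,3 = lower right x,y; 6,7 = upper left x,y.
function canvasSizeFromBox(array $box, $padding = 2)
{
    return array(
        'width'  => ($box[2] - $box[0]) + 2 * $padding,
        'height' => ($box[1] - $box[7]) + 2 * $padding,
        // imagettftext() positions text by its baseline, not its top edge
        'x'      => $padding - $box[0],
        'y'      => $padding - $box[7],
    );
}

// Hypothetical helper: $fontFile must be the path to a .ttf file.
function outputEmailImage($email, $fontFile, $fontSize = 11)
{
    // measure the text first, then create a canvas that fits it
    $box  = imagettfbbox($fontSize, 0, $fontFile, $email);
    $size = canvasSizeFromBox($box);

    $im        = imagecreate($size['width'], $size['height']);
    $white     = imagecolorallocate($im, 255, 255, 255); // first colour = background
    $textColor = imagecolorallocate($im, 0, 0, 0);

    imagettftext($im, $fontSize, 0, $size['x'], $size['y'], $textColor, $fontFile, $email);

    header('Content-Type: image/png');
    imagepng($im);
}
```

You would call it with something like outputEmailImage('antispam@example.com', '/path/to/font.ttf'), where the font path is a placeholder for a real file.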

If this image doesn't line up quite right, perhaps use this little bit of CSS:

Listing 9. listing-9.css
.email-image { vertical-align: middle; }

Alternatively, let's say Joe submitted an article to a web site, and the article has an ID of 132. If you wanted to display Joe's email address when viewing his article, you could do:

Listing 10. listing-10.html
Contact Joe at <img src="/images/emailAddress.php?id=132" alt="" class="email-image" />

Then your emailAddress.php script looks up the email address to draw based on $_GET['id'] (which in this case is 132).
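The lookup could be sketched something like this. The function name findEmailForArticle and the $emailsById array are stand-ins for this example; on a real site the address would most likely come from a database query keyed on the article ID.

```php
<?php
// Hypothetical lookup: $emailsById stands in for whatever storage your
// site uses to map an article ID to its author's email address.
function findEmailForArticle(array $emailsById, $rawId)
{
    // accept only positive whole numbers such as "132"
    $id = filter_var($rawId, FILTER_VALIDATE_INT, array('options' => array('min_range' => 1)));
    if ($id === false) {
        return null;
    }

    return isset($emailsById[$id]) ? $emailsById[$id] : null;
}

$emailsById = array(132 => 'joe@example.com'); // sample data only

$email = findEmailForArticle($emailsById, isset($_GET['id']) ? $_GET['id'] : null);

if ($email === null) {
    // unknown or malformed ID: send a 404 and draw nothing
    http_response_code(404);
} else {
    // ...continue with the drawing code from Listing 8, using $email
}
```

Validating the ID before using it matters here, since anybody can request the script with whatever value they like.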

Technique 4: Forms

The final technique in this article is to not display your email address at all.

Simply provide your users with a web form that links to your email address. This is really simple to do with something like PHP's mail() function.

Obviously spammers can then send you spam through your web form, but this is extremely unlikely as it is time consuming and not automated (although technically it could be automated; this will be discussed in the next article, where a technique known as CAPTCHA will be suggested).
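A minimal sketch of the script behind such a form might look like the following. It assumes the form posts two fields named email and message; those field names, the subject line and the helper name buildContactMail are made up for this example. The key point is that the recipient address lives only in the PHP code and never appears in the HTML.

```php
<?php
// Hypothetical helper: validate a submitted contact form and build the
// arguments for mail(). Returns null if the submission should be rejected.
function buildContactMail(array $post, $recipient)
{
    $from    = isset($post['email']) ? trim($post['email']) : '';
    $message = isset($post['message']) ? trim($post['message']) : '';

    // FILTER_VALIDATE_EMAIL rejects values containing line breaks, which
    // stops visitors from injecting extra mail headers via this field
    if ($message === '' || filter_var($from, FILTER_VALIDATE_EMAIL) === false) {
        return null;
    }

    return array(
        'to'      => $recipient,
        'subject' => 'Message from the web site',
        'body'    => $message,
        'headers' => 'Reply-To: ' . $from,
    );
}

if (isset($_SERVER['REQUEST_METHOD']) && $_SERVER['REQUEST_METHOD'] === 'POST') {
    // the recipient is hard-coded here, so it is never sent to the browser
    $mail = buildContactMail($_POST, 'antispam@example.com');
    if ($mail !== null) {
        mail($mail['to'], $mail['subject'], $mail['body'], $mail['headers']);
    }
}
```

Checking the sender's address before putting it in a header is worth the extra few lines, otherwise the form itself can be abused to send mail to other people.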

Conclusion

This article discussed a few ways to protect email addresses that are displayed on your website from spammers and email address harvesters.

Hopefully these simple anti-spam techniques make it much harder for email addresses to be stolen and misused.

All of these techniques have their own advantages and disadvantages; it all comes down to what you feel is most appropriate for what you are presenting.

I guess the best suggestion is to create a separate email account if you need to publish an address on the web. Nearly all email clients and webmail services, such as Microsoft's Outlook and Hotmail, Thunderbird and Yahoo! Mail, provide spam protection and spam filters, so combining these with the techniques in this article will hopefully minimize the spam you receive.

Part 2 of this series will discuss preventing spam on your web site, such as from blog spammers.
