Rendering a CAPTCHA image in PHP

It’s been a while since I posted anything web or programming related (I honestly don’t even remember the last time), so I thought I’d post an update with something a friend asked about in an email. He was putting together something which I’ve been asked to co-write, and we came across the CAPTCHA issue again. We’re thinking of using these in a somewhat different way.

What they don’t tell you about CAPTCHA

They’re, more often than not, completely ineffective. Trying to keep bots out only works against simple drive-by spammers, forum flooders and the like; for anything more than that, you’re better off finding something else.

What they ARE useful for is making sure only people who really have something to say end up voicing their opinion, i.e. it’s a think-before-you-speak buffer in many ways. This is especially helpful when you have anonymous posting enabled.

I’ve seen tons of examples of how to generate CAPTCHAs, but many of these (especially for PHP) either depend on an existing background image or are so unreadable that they’re not only bot-proof, they’re human-proof. Worse yet, I’ve seen examples longer than one page of code. And I’m not even talking about session handling.

How someone writes something as simple as a CAPTCHA render in more than a couple of functions is beyond me. OK, that’s just me being lazy, but to a programmer, laziness is a virtue sometimes.

Here’s another thing they don’t tell you about CAPTCHAs: anything over 3 characters is useless. If they’ve managed to use OCR to break 3 characters, they’ve got the rest, and your efforts will only frustrate legitimate users. Using 4 characters is a bit excessive; 5 and you’re getting on my nerves. 6 is ridiculous, and with any more, chances are I’d rather not participate in whatever it is you have behind your unreadable gibberish.
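
For a sense of scale, here's a quick sketch that counts how many distinct codes a given length allows. The helper name codeSpace is made up for illustration, and the pool is the 46-character one from the code further down. Keep in mind that extra length only raises the cost of blind guessing, which isn't how OCR-based cracking works anyway.

```php
<?php

// Hypothetical helper: how many distinct codes a pool and length allow
function codeSpace( string $pool, int $length ): int {
	return strlen( $pool ) ** $length;
}

// Same pool as the random() function in the code file below
$pool = '2345689abcdfghkmnpqrstwxyzABCDEFGHKMNPQRSTWXYZ';

foreach ( [ 3, 4, 5, 6 ] as $length ) {
	printf( "%d characters: %d possible codes\n", $length, codeSpace( $pool, $length ) );
}
```

With 3 characters that's 46 x 46 x 46 = 97,336 combinations, which is plenty against a script that just guesses.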

Another thing a lot of these CAPTCHAs seem to overlook is the character pool. In some of these things, I’ve seen an s that looks like a 5 and a u that looks like a v. Don’t even get me started on 0, o, 1, i, and j. The best option in this case is to get rid of these similar-looking characters. In fact, you’re better off getting rid of most characters that could even remotely be confused with another. This is why I’m leaving out e as well, since that’s too easily confused with ‘c’ sometimes.
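
If you'd rather not type the pool out by hand, one way to build it is to start from all digits and letters and subtract the look-alikes. This is only a sketch: buildPool is a made-up helper, and the exclusion list is my reading of which characters are missing from the pool in the code further down, so adjust it for whatever font you end up using.

```php
<?php

// Hypothetical helper: build a character pool with look-alikes removed
function buildPool( string $exclude ): string {
	$all = array_merge(
		str_split( '0123456789' ),
		range( 'a', 'z' ),
		range( 'A', 'Z' )
	);

	// array_diff compares values as strings (case-sensitive) and keeps the order
	return implode( '', array_diff( $all, str_split( $exclude ) ) );
}

// Assumed exclusion list, inferred from the pool used in the code file below
$pool = buildPool( '017eijlouvIJLOUV' );

echo $pool, "\n";
```

With that exclusion list, the result is the same 46-character string used in the random() function below.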

Here’s a sample of a CAPTCHA that hopefully doesn’t suck.

You should at least be able to read the bloody thing.

Here’s the code file that generated it (I didn’t include sessions and stuff, but plenty of examples are available elsewhere):

<?php

ini_set( "display_errors", true );

// Rudimentary random string generator
function random( $length ) {
	
	$out = '';
	$pool = str_split( '2345689abcdfghkmnpqrstwxyzABCDEFGHKMNPQRSTWXYZ' );
	
	// This doesn't need to be any more complicated
	for($i=0; $i < $length; $i++)
		$out .= $pool[ array_rand( $pool ) ];
	
	return $out;
}

// The business end
function captcha( $txt ) {
	
	// Height of 50 is usually good enough
	$sizey = 50;
	
	// Character length
	$cl = strlen( $txt );
	
	// We'll expand the image with the number of characters
	$sizex = ( $cl * 19 ) + 10;
	
	// I used monofont, but you can download another font to use
	// Try http://dafont.com (don't pick crazy fonts, a nice monospace will do)
	// GD may not find a bare filename, so point at the file next to this script
	$font = __DIR__ . '/monofont.ttf';
	
	// Some initial padding
	$w = floor( $sizex / $cl ) - 13;
	
	$img = imagecreatetruecolor( $sizex, $sizey );
	$bg = imagecolorallocate( $img, 255, 255, 255 );
	imagefilledrectangle( $img, 0, 0, $sizex, $sizey, $bg );
	
	// Random lines
	for( $i=0; $i < ( $sizex * $sizey ) / 250; $i++ ) {
		
		// Select colors in a comfortable range
		$t = imagecolorallocate( $img, rand( 150, 200 ), rand( 150, 200 ), rand( 150, 200 ) );
		imageline($img, 
			mt_rand( 0, $sizex ), 
			mt_rand( 0, $sizey ), 
			mt_rand( 0, $sizex ), 
			mt_rand( 0, $sizey ), $t );
	}
	
	// Insert the text (with random colors and placement)
	// Start at the last character: index $cl is one past the end of the string
	for ( $i = $cl - 1; $i >= 0; $i-- ) {
		
		$l = substr( $txt, $i, 1 );
		
		// Again, colors in a comfortable range. I was thinking pastels
		$tc = imagecolorallocate( $img, rand( 0, 150 ), rand( 10, 150 ), rand( 10, 150 ) );
		imagettftext( $img, 30, 
			rand( -10, 10 ), 
			$w + ( $i * rand( 18, 19 ) ), 
			rand( 30, 40 ), $tc, $font, $l );
	}
	
	// Move the header up the code page if this is going in as part of a bigger project
	header("Content-type: image/png");
	
	imagepng( $img );
	imagedestroy( $img );
}

// Store the generated string in a session before rendering it. Remember to clear it after each attempt (success or failure)
captcha( random( 3 ) );
?>
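
Since the code above leaves sessions out, here's a rough sketch of the store-and-clear part mentioned in the last comment. None of this is from the original file: captchaStore, captchaVerify and the 'captcha' session key are names I made up for illustration, and the functions take the session array as a parameter so they can be tried without a live session.

```php
<?php

// Hypothetical helper: remember the expected code before rendering the image
function captchaStore( array &$session, string $code ): void {
	$session['captcha'] = $code;
}

// Hypothetical helper: check an answer and clear the stored code either way
function captchaVerify( array &$session, string $answer ): bool {
	$expected = $session['captcha'] ?? '';

	// Clear on every attempt, success or failure
	unset( $session['captcha'] );

	// An empty expected value means no challenge was issued
	return $expected !== '' && hash_equals( $expected, trim( $answer ) );
}
```

In the image script, that would look something like session_start(), then $code = random( 3 ), then captchaStore( $_SESSION, $code ), then captcha( $code ). The form handler would call captchaVerify( $_SESSION, $_POST['captcha'] ?? '' ), assuming the form field is named captcha. The comparison is case-sensitive because the pool mixes upper and lower case.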

4 thoughts on “Rendering a CAPTCHA image in PHP”

  1. Don’t even get me started – I could write a blog called Captchas-should-not-suck! How many times have I just given up because I can’t tell what the hell I’m supposed to type?!

  2. I’ve worked extensively on both cracking and developing CAPTCHAs. The key to designing a strong CAPTCHA is to see it as a bot would, not as a human; too many people get lost in how it _looks_ and forget the mathematical patterns that a bot can pick up on. Unfortunately the example you posted could be easily broken.

    Still… an interesting GD/CAPTCHA introduction for newbies, perhaps.

    • Thanks for the comment.

      A bot today would see it almost exactly as a human would. Most industrial-scale CAPTCHA cracking (the kind that isn’t research only) is run by spammers with cash to spare, so no CAPTCHA that is convenient for humans can withstand it. One that is inconvenient for humans isn’t worth having.

      As I said in the post, if OCR has resolved more than 3 characters, the system is pretty much broken at that point. I specifically limited it to 3 characters because this isn’t meant to keep away those spammers. No CAPTCHA can. And in my experience, it’s cheaper for spammers to use humans for their work, so CAPTCHAs won’t keep that spam out. My example wasn’t meant to prevent these from accessing a site.

      The example here was meant to make things easier on the humans, who are more likely to be your visitors, so they can contribute without being hampered, while keeping casual scripts (cheap/free bots) from doing drive-by spam. It also acts as a bit of a thought buffer before submission (for anonymous users in our case).

      Some users prefer to disable JavaScript while browsing online (for the obvious security benefits), so JS-based spam prevention solutions won’t always work either. Casual spamming can be prevented with something like a CAPTCHA without resorting to JS, but only if it’s unobtrusive.
