Firewall.php

Since yesterday, I’ve been working on my forum script again (oh, you mean the one you’ve been working on since 2009?! Er… yes). The good news is that I’m finally getting somewhere. Bad news, I had to scrap everything I wrote so far since that turned out not to be the direction I wanted to go. The one sticking point was protecting the forum from all sorts of unsavory things the internet has an abundance of.

There are all sorts of plugins and apps available to protect your software from spammers and things, but most of them are hardly drop-in caliber. I’ve looked at Akismet (which isn’t as transparent as I had hoped), Fail2ban (which was too involved) and Bad Behavior. All in all, BB turned out to be the thing closest to what I was looking for, but it didn’t quite… match.

The premise behind Bad Behavior is that it’s a module/plugin or what-have-you that sits listening to any requests to your site and piles through a blacklist of bad bots in the form of User Agent fragments and rubbish IP addresses. It optionally downloads blacklists and does host matching, but this aspect seems to be broken due to a PHP bug (surprise!). There’s also the problem of layout. BB seems a bit all-over-the-place as a piece of software. After scanning the code for a while, I realized it wasn’t really what I wanted or how I’d like to lay out my forum.

I needed something that can be deeply integrated into the forum so that I’ll have the option of pushing requests to a log of some sort, like BB does, but I also wanted to block users based on user name in other portions of the site. Getting there would have meant hacking BB itself and, considering the differing approaches, that wasn’t going to work. There should be two sections to this: a main firewall script and a model. The model is a “firewall entry object” that I can save to a database. Eventually, I also want it to hold the username and other information, so it isn’t finished yet.

So last night, I sat down and sketched out a few things into a class. This is a non-functional draft for what might be a firewall script I can reuse elsewhere. You can think of this script as me thinking out loud.

There are many different ways to do this, so I’ll be scrubbing this in the future. But for now, here’s the overview:

Update: Well that was quick. This went from non-functional draft to semi-functional draft. I’ve also added a sketch of a FireEntry model which can show what would be saved if this was connected to a database. Also moved all the ‘lists’ to separate config files (‘Config/’ folder).

I haven’t had a chance to do a proper update yet since I’ve been extremely busy over the past month. As soon as the next few days are done, I’ll get back to more important things. I.E. Cabins!

<?php
/**
 * Bot and bad client blocking script (NON FUNCTIONAL DRAFT) 
 * This should NOT be considered foolproof as it uses a blacklist approach.
 * Parts of this code were inspired by the Bad Behavior plugin. No code was shared.
 *
 * @author Eksith Rodrigo <reksith at gmail.com>
 * @license http://opensource.org/licenses/ISC ISC License
 * @version 0.1
 */

class Firewall extends \Singleton {
	
	/**
	 * Message to return if a user is blocked
	 * Right now, it's identical to the router 'not found' message to avoid
	 * returning too much information.
	 */
	const DIE_MESSAGE = 'Couldn\'t find that';
	
	private static $botsIni = 'Config/verifiedbots.ini';
	
	private static $uasIni = 'Config/baduas.ini';
	
	private static $urisIni = 'Config/baduris.ini';
	
	
	/**
	 * @var object Firewall model object
	 */
	private $fire	= null;
	
	public $userhash = '';
	
	
	/**
	 * Forbidden request methods
	 */
	public static $rms = array(
		'trace', 'track', 'delete'
	);
	
	
	public static $searchEngines = array(
		'Google',
		'Bing',
		'Live',
		'MS Search',
		'MSN',
		'Inktomi',
		'Slurp',
		'SearchMonkey',
		'Yahoo',
		'Baidu',
		'Yandex'
	);
	
	/**
	 * Begin working as soon as the module is loaded.
	 * Starts from least expensive checks (IP) to most expensive (Headers)
	 */
	public function __construct() {
		$this->init();
		
		if ( empty( $this->fire->ip ) ) {
			$this->fire->ip		= $_SERVER['REMOTE_ADDR'];
			$this->fire->response	= 'Failed: Martian IP';
			$this->killReq( self::DIE_MESSAGE );
		}
		
		$this->checkRequest();
		$this->checkURI();
		$this->checkHeaders();
		$this->verifiedBotScan();
		
	}
	
	private function init() {
		$this->fire		= new \Models\FireEntry();
		$this->fire->method	= 
			strtolower( $_SERVER['REQUEST_METHOD'] );
		
		$this->fire->uri	= $this->getURI();
		$this->fire->headers	= $this->headers();
		
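		// Some clients send no User-Agent at all; avoid an undefined index
		$this->fire->ua		= isset( $_SERVER['HTTP_USER_AGENT'] ) ?
						$_SERVER['HTTP_USER_AGENT'] : '';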
		$this->fire->protocol	= $_SERVER['SERVER_PROTOCOL'];
		$this->fire->reqtime	= isset( $_SERVER['REQUEST_TIME'] ) ?
						$_SERVER['REQUEST_TIME'] : 
						time();

		$this->fire->ip		= $this->getIP();
	}
	
	private function checkRequest() {
		if ( in_array( $this->fire->method, self::$rms ) ) {
			$this->fire->response = 'Failed: Request check';
			$this->killReq( self::DIE_MESSAGE );
		}
	}
	
	private function checkURI() {
		$uris =  parse_ini_file( self::$urisIni );
		
		foreach( $uris['u'] as $uri ) {
			if ( false === stripos( 
				$this->fire->uri, $uri ) ) {
				continue;
			} else {
				$this->fire->response = 'Failed: URI check';
				$this->killReq( self::DIE_MESSAGE );
				break;
			}
		}
	}
	
	private function checkHeaders() {
		$headers = $this->fire->headers;
		
		/**
		 * Accept missing. Not acceptable.
		 */
		if ( $this->missing( $headers, 'Accept' ) ) {
			$this->fire->response = 'Failed: Accept header missing';
			$this->killReq( self::DIE_MESSAGE );
		}
		
		/**
		 * No UA or it's too short
		 */
		if ( $this->missing( $headers, 'User-Agent', 10 ) ) {
			$this->fire->response = 'Failed: User agent too small';
			$this->killReq( self::DIE_MESSAGE );
		}
		
		/**
		 * Shouldn't see MSIE *and* Windows ME/XP/2000 in the same 
		 * UA string
		 */
		if ( 
			$this->has( $headers, 'User-Agent', '; MSIE' ) && (
			$this->has( $headers, 'User-Agent', 'Windows 2000' ) || 
			$this->has( $headers, 'User-Agent', 'Windows ME' ) || 
			$this->has( 
				$headers, 'User-Agent', 'Windows XP' ) 
			)
		) {
			$this->fire->response = 'Failed: Fake MSIE bot';
			$this->killReq( self::DIE_MESSAGE );
		}
		
		/**
		 * Check against blacklist of User agents.
		 * This is the most expensive operation and should be 
		 * reserved for last.
		 */
		$uas =  parse_ini_file( self::$uasIni );
		if ( $this->has( $headers, 'User-Agent', $uas['u'] ) ) {
			$this->fire->response = 'Failed: Bad User Agent';
			$this->killReq( self::DIE_MESSAGE );
		}
	}
	
	/**
	 * It's opposites day! This function returns *true* if a particular 
	 * header value is completely missing, contains an empty string or
	 * is below the minimum length
	 */
	private function missing( &$h, $k, $min = 0 ) {
		if ( array_key_exists( $k, $h ) ) {
			if ( empty( $h[$k] ) ) {
				return true;
			}
			if ( $min > 0 && mb_strlen( $h[$k] ) < $min ) {
				return true;
			}
			return false;
		}
		
		return true;
	}
	
	/**
	 * Helper to see if a key exists in an array, has a component
	 * to search in the value or matches to an optional regular expression
	 */
	private function has( &$h, $k, $v = null, $regex = false ) {
		$has = array_key_exists( $k, $h );
		
		/**
		 * Only checking for key existence
		 */
		if ( null === $v || !$has ) {
			return $has;
		}
		
		if ( is_array( $v ) ) {
			foreach( $v as $name ) {
				// Haystack is the header value, needle is the fragment
				if ( false === stripos( $h[$k], $name ) ) {
					continue;
				} else {
					return true;
				}
			}
			
			/**
			 * Made it this far. The key wasn't in the array
			 */
			 return false;
		}
		
		/**
		 * The key value should be a regular expression match
		 */
		if ( $regex ) {
			// preg_match() returns 1, 0 or false; normalize to a boolean
			return (bool) preg_match('/\b'. $v .'\b/i', $h[$k] );
		}
		
		if ( false === stripos( $h[$k], $v ) ) {
			return false;
		}
		
		return $has;
	}
	
	private function uaInSearchBot() {
		foreach( self::$searchEngines as $bot ) {
			if ( false === strpos( $this->fire->ua, $bot ) ) {
				continue;
			} else {
				return $bot;
			}
		}
		return null;
	}
	
	/**
	 * Check bot UA against IPs that are known for it
	 */
	private function verifiedBotScan() {
		if ( !$this->uaInSearchBot() ) {
			return;
		}
		$out	= null;
		$ua	= $this->fire->ua;
		
		$var =  parse_ini_file( self::$botsIni, true );
		$bots	= array_keys( $var );
		
		foreach( $bots as $b ) {
			$bua = explode( '_', $b );
			foreach( $bua as $a ) {
				
				/**
				 * User agent didn't match any bot aliases
				 */
				if ( false === strpos( $ua, $a ) ) {
					continue;
				} else {
					
					/**
					 * User agent claims to be a known bot
					 */
					$out = $this->rangeScan( 
						$var[$b]['i']
					);
					break; // Bot checking done
				}
			}
			
			/**
			 * We have a result (anything other than null)
			 */
			if ( null !== $out ) { break; }
		}
		
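		/**
		 * Either no verified bot alias matched (null) or the IP is
		 * inside one of that bot's known ranges (true)
		 */
		if ( null === $out || true === $out ) {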
			$this->fire->response = 'Passed';
			return;
		}
		
		/**
		 * Didn't pass bot scan
		 */
		$this->fire->response = 'Failed: Spoofed popular bot';
		$this->killReq( self::DIE_MESSAGE );
	}
	
	/**
	 * Checks a given IP range in CIDR format
	 */
	private function rangeScan( $ips = array() ) {
		$out = false;
		foreach( $ips as $ip ) {
			if ( $out = $this->cidr( $ip, $this->fire->ip ) ) {
				/**
				 * IP is in the given list. Exit loop
				 */
				break;
			}
		}
		return $out;
	}
	
	
	/**
	 * This may fail... hard!
	 * 
	 * @return string IPv4/6 address extrapolated from relevant 
	 * 		headers, or an empty string if none is valid
	 */
	private function getIP() {
		
		$vars = array(
			'HTTP_CLIENT_IP', 
			'HTTP_X_FORWARDED_FOR', 
			'HTTP_X_FORWARDED', 
			'HTTP_X_CLUSTER_CLIENT_IP', 
			'HTTP_FORWARDED_FOR', 
			'HTTP_FORWARDED', 
			'REMOTE_ADDR'
		);
		
		foreach( $vars as $v ) {
			
			if ( true === array_key_exists( $v, $_SERVER ) )  {
				
				$ip = explode( ',', $_SERVER[$v] );
				
				foreach( $ip as $test ) {
					$test = trim( $test );
					if ( $this->checkIP( $test ) ) {
						return $test;
					}
				}
			}
		}
		
		/**
		 * If we made it this far, the IP was invalid
		 */
		return '';
	}
	
	private function formatIP4( $ip, $pad = '0' ) {
		$ip	= str_replace( '*', $pad, $ip );
		$bits	= null;
		$p	= strpos( $ip, '/' );
		if ( false !== $p ) { 
			$bits	= substr( $ip, $p, strlen( $ip ) - 1 );
			$ip	= substr( $ip, 0, $p );
		}
		
		$sr	= explode( '.', $ip );
		while( count( $sr ) < 4) {
			$sr[] = $pad;
		}
		$ip	= implode('.', $sr );
		
		return $ip . $bits;
	}
	
	private function matchIP4StartToEnd( $start, &$end ) {
		if ( empty( $end ) ) {
			$parts	= array();
			$d	= explode( '.', $start );
			$c	= count( $d );
			
			for( $i = 0; $i < $c; $i++ ) {
				if ( empty( $d[$i] ) ) {
					$parts[$i] = '255';
				} else {
					$parts[$i] = $d[$i];
				}
			}
			
			// formatIP4() expects a string, not an array
			$end	= implode( '.', $parts );
		} else {
			$end = str_replace( '*', '255', $end );
		}
		
		$end = $this->formatIP4( $end, '255' );
	}
	
	/**
	 * Checks if an IP is between an IPv4 range
	 */
	public function ip4Range( $start, $end, $ip ) {
		
		$start	= $this->formatIP4( $start, '0' );
		
		/**
		 * Bits E.G.'/16' was present. Send to CIDR validation
		 */
		if ( false !== strpos( $start, '/' ) ) {
			return $this->cidr( $start, $ip );
		}
		
		$this->matchIP4StartToEnd( $start, $end );
		
		$start	= ip2long( $start );
		$ip	= ip2long( $ip );
		$end	= ip2long( $end );
		
		if ( $start <= $ip && $end >= $ip ) {
			return true;
		}
		
		return false;
	}
	
	
	/**
	 * TODO: Create IPv6 matching
	 */
	private function ip6Range( $start, $end, $ip ) {
		return false;
	}
	
	
	/**
	 * CIDR format IP matching
	 */
	private function cidr( $r, $ip ) {
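		// A bare address without '/bits' is treated as a /32
		$parts	= explode( '/', $r );
		$sub	= $parts[0];
		$bits	= isset( $parts[1] ) ? (int) $parts[1] : 32;
		
		$ip	= ip2long( $ip );
		$sub	= ip2long( $sub );
		
		// ip2long() returns false for anything that isn't IPv4
		if ( false === $ip || false === $sub ) {
			return false;
		}
		
		$mask	= ( -1 << ( 32 - $bits ) );
		
		$sub	&= $mask; // Fix inconsistencies
		
		return ( $ip & $mask ) == $sub;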
	}
	 
	 /**
	  * Converts an IP4 address to IP6.
	  * Convenient to store as a single format
	  */
	private function ip4Toip6( $ip ) {
		if ( filter_var( $ip, 
			FILTER_VALIDATE_IP, FILTER_FLAG_IPV6 ) ) {
			return $this->cleanIPv6( $ip ); // Already IPv6
		}
		
		$ia = array_pad( explode( '.', $ip ), 4, 0 );
		$b1 = sprintf( '%04x', ( $ia[0] * 256 ) + $ia[1] );
		$b2 = sprintf( '%04x', ( $ia[2] * 256 ) + $ia[3] );
		
		// IPv4-mapped IPv6 address (::ffff:a.b.c.d), fully expanded
		return "0000:0000:0000:0000:0000:ffff:$b1:$b2";
	}
	 
	 /**
	  * Expand IPv6 to proper storage
	  * 
	  * @link http://php.net/manual/en/function.inet-pton.php
	  */
	private function cleanIPv6( $ip ) {
		$h	= unpack( "H*hex", inet_pton( $ip ) );
		$ip	= preg_replace( '/([a-f0-9]{4})/', "$1:", $h['hex'] );
		
		return substr( $ip , 0, -1 );
	}
	
	
	/**
	 * Checks for martians E.G. 10.0.0.0/8
	 * These should really be blocked at the router/switch
	 */
	private function checkIP( $ip ) {
		return filter_var( $ip, FILTER_VALIDATE_IP, 
			FILTER_FLAG_NO_RES_RANGE | FILTER_FLAG_NO_PRIV_RANGE );
	}
	
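	private function killReq( $msg ) {
		$this->logReq();
		// Draft: the request is only logged for now. A finished 
		// version would send $msg and stop execution here.
		//echo $this->fire->response;
	}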
	
	private function logReq() {
		$this->fire->save();
	}
	
	private function headers() {
		if ( function_exists( 'getallheaders' ) ) {
			return getallheaders();
		}
		
		$headers = array();
		
		foreach( $_SERVER as $k => $v ) {
			
			if ( 0 === strpos( $k, 'HTTP_' ) ) {
				
				/**
				 * Remove HTTP_ and turn '_' to spaces
				 */
				$hd	= substr( $k, 5 );
				$hd	= str_replace( '_', ' ', $hd );
				
				/**
				 * E.G. ACCEPT LANGUAGE to Accept-Language
				 */
				$uw	= ucwords( strtolower( $hd ) );
				$uw	= str_replace( ' ', '-', $uw );
				
				$headers[ $uw ] = $v; 
			}
		}
		
		return $headers;
	}
	
	private function getURI() {
		if ( isset( $_SERVER['REQUEST_URI'] ) ) {
			return $_SERVER['REQUEST_URI'];
		}
		
		$_SERVER['REQUEST_URI'] = substr( $_SERVER['PHP_SELF'], 1 );
		
		if ( isset($_SERVER['QUERY_STRING'] ) ) {
			$_SERVER['REQUEST_URI'] .= '?' . 
				$_SERVER['QUERY_STRING'];
		}
		
		return $_SERVER['REQUEST_URI'];
	}
}
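
As a side note on getIP() and checkIP() above: the following is a minimal standalone sketch of the same idea, picking the first address out of a comma-separated header value that passes filter_var() with the private and reserved range flags. The function name firstPublicIP() is made up for this illustration and isn't part of the class. Keep in mind that forwarding headers such as X-Forwarded-For are supplied by the client, so a value taken from them is a hint at best.

```php
<?php
/**
 * Sketch only. firstPublicIP() is a hypothetical helper that returns
 * the first public IP in a comma-separated list, or '' if none qualifies.
 */
function firstPublicIP( $raw ) {
	foreach ( explode( ',', $raw ) as $candidate ) {
		$candidate = trim( $candidate );
		
		// Rejects malformed input plus private and reserved ranges
		$ok = filter_var(
			$candidate,
			FILTER_VALIDATE_IP,
			FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE
		);
		
		if ( false !== $ok ) {
			return $candidate;
		}
	}
	
	return '';
}

echo firstPublicIP( '10.0.0.5, 8.8.8.8' ), "\n"; // 8.8.8.8
```

An empty string coming back corresponds to the 'Martian IP' branch in the constructor.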

The bad user agents ini file

; Partial (I.E. never ending) list of User Agents and partial matches
; Courtesy of the following:
; 
; http://bad-behavior.ioerror.us/
; https://github.com/bluedragonz/bad-bot-blocker/blob/master/.htaccess
; http://forum.joomla.org/viewtopic.php?t=494485
;
;Last count at 278 fragments checked


u[] = '**'
u[] = '\\\\'
u[] = '.NET CLR 1)'
u[] = '.NET CLR1'
u[] = '\r'
u[] = '<sc'
u[] = '; Widows'
u[] = '360Spider'
u[] = '8484 Boston Project'
u[] = 'a href='
u[] = 'Aboundex'
u[] = 'Acunetix'
u[] = 'adwords'
u[] = 'Alexibot'
u[] = 'AIBOT'
u[] = 'asterias'
u[] = 'attach'
u[] = 'autoemailspider'
u[] = 'BackDoorBot'
u[] = 'BackWeb'
u[] = 'Bad Behavior Test'
u[] = 'Bandit'
u[] = 'BatchFTP'
u[] = 'Bigfoot'
u[] = 'Black.Hole'
u[] = 'BlackHole'
u[] = 'BlackWidow'
u[] = 'blogsearchbot-martin'
u[] = 'BlowFish'
u[] = 'Bot mailto:craftbot@yahoo.com'
u[] = 'BotALot'
u[] = 'BrowserEmulator'
u[] = 'Buddy'
u[] = 'BuiltBotTough'
u[] = 'Bullseye'
u[] = 'BunnySlippers'
u[] = 'Cegbfeieh'
u[] = 'CheeseBot'
u[] = 'CherryPicker'
u[] = 'ChinaClaw'
u[] = 'Clearswift'
u[] = 'clipping'
u[] = 'Cogentbot'
u[] = 'Collector'
u[] = 'compatible ; MSIE'
u[] = 'compatible-'
u[] = 'CoralWebPrx'
u[] = 'core-project'
u[] = 'Copier'
u[] = 'CopyRightCheck'
u[] = 'cosmos'
u[] = 'Crescent'
u[] = 'Custo'
u[] = 'Diamond'
u[] = 'Digger'
u[] = 'DIIbot'
u[] = 'DISCo'
u[] = 'DittoSpyder'
u[] = 'discovery'
u[] = 'dragonfly'
u[] = 'Drip'
u[] = 'Download'
u[] = 'eCatch'
u[] = 'Easy'
u[] = 'Email'
u[] = 'Emulator'
u[] = 'Enchanc'
u[] = 'EroCrawler'
u[] = 'Exabot'
u[] = 'Express WebPictures'
u[] = 'Extrac'			; Extractors
u[] = 'EyeNetIE'
u[] = 'Fail'
u[] = 'Fatal'
u[] = 'FlashGet'
u[] = 'FHscan'
u[] = 'Firebird'		; Too old to be viable
u[] = 'flunky'
u[] = 'Foobot'
u[] = 'Forum Poster'
u[] = 'FrontPage'
u[] = 'Gecko/2525'
u[] = 'GetRight'
u[] = 'GetWeb!'
u[] = 'Go!Zilla'
u[] = 'Go-Ahead-Got-It'
u[] = 'gotit'
u[] = 'Grab'
u[] = 'Grafula'
u[] = 'grub'
u[] = 'hanzoweb'
u[] = 'Harvest'
u[] = 'Havij'
u[] = 'hloader'
u[] = 'HMView'
u[] = 'HttpProxy'
u[] = 'HTTrack'
u[] = 'humanlinks'
u[] = 'IlseBot'
u[] = 'Indy Library'
u[] = 'InfoNaviRobot'
u[] = 'InfoTekies'
u[] = 'Intelliseek'
u[] = 'InterGET'
u[] = 'Internet Explorer'	; *Not* IE. UA is likely a bot
u[] = 'Intraformant'
u[] = 'ISC Systems iRc'
u[] = 'Iria'
u[] = 'Java'
u[] = 'Jakarta'
u[] = 'Jenny'
u[] = 'JetCar'
u[] = 'JOC'
u[] = 'JustView'
u[] = 'Jyxobot'
u[] = 'Kenjin'
u[] = 'Keyword'
u[] = 'larbin'
u[] = 'Leacher'
u[] = 'LexiBot'
u[] = 'LeechFTP'
u[] = 'libwww-perl'
u[] = 'lftp'
u[] = 'libWeb/clsHTTP'
u[] = 'likse'
u[] = 'LinkScan'
u[] = 'LNSpiderguy'
u[] = 'LinkWalker'
u[] = 'Lobster'
u[] = 'Locator'
u[] = 'LWP'
u[] = 'Magnet'
u[] = 'Mag-Net'
u[] = 'MarkWatch'
u[] = 'Mata.Hari'		; Well, now I've seen everything
u[] = 'Memo'
u[] = 'Microsoft URL'
u[] = 'Microsoft.URL'
u[] = 'MIDown'
u[] = 'Ming Mong'
u[] = 'Missigua'
u[] = 'Mister'
u[] = 'MJ12bot/v1.0.8'
u[] = 'moget'
u[] = 'Morfeus'
u[] = 'Movable Type'		; Not the blog engine
u[] = 'Mozilla.*NEWT'
u[] = 'Mozilla/0'
u[] = 'Mozilla/1'
u[] = 'Mozilla/2'
u[] = 'Mozilla/3'
u[] = 'Mozilla/4.0('
u[] = 'Mozilla/4.0+(compatible;+'
u[] = 'Mozilla/4.0 (Hydra)'
u[] = 'MSIE 7.0;  Windows NT 5.2'
u[] = 'Murzillo'
u[] = 'MVAClient'
u[] = 'Navroad'
u[] = 'NearSite'
u[] = 'NetAnts'
u[] = 'NetMechanic'
u[] = 'NetSpider'
u[] = 'Net Vampire'
u[] = 'NetZIP'
u[] = 'Nessus'
u[] = 'NG'
u[] = 'NICErsPRO'
u[] = 'Nikto'
u[] = 'Ninja'
u[] = 'Nimble'
u[] = 'NPbot'
u[] = 'Nomad'
u[] = 'NutchCVS'
u[] = 'Nutscrape'
u[] = 'NextGen'
u[] = 'Octopus'
u[] = 'OmniExplorer'
u[] = 'Opera/9.64('
u[] = 'Offline'		 ; 'Offline' anything is a scraper
u[] = 'Openfind'
u[] = 'OutfoxBot'
u[] = 'Papa Foto'
u[] = 'pavuk'
u[] = 'pcBrowser'
u[] = 'Perman Surfer'
u[] = 'PHP'
u[] = 'Pockey'
u[] = 'PMAFind'
u[] = 'POE'
u[] = 'ProPowerBot'
u[] = 'psbot'
u[] = 'psycheclone'
u[] = 'Pump'
u[] = 'PussyCat'
u[] = 'PycURL'
u[] = 'Python-urllib'
u[] = 'QueryN'
u[] = 'RealDownload'
u[] = 'Reaper'
u[] = 'Recorder'
u[] = 'ReGet'
u[] = 'RepoMonkey'
u[] = 'RMA'
u[] = 'revolt'
u[] = 'Siphon'
u[] = 'SiteSnagger'
u[] = 'SlySearch'
u[] = 'SmartDownload'
u[] = 'Snake'
u[] = 'Snapbot'
u[] = 'sogou'
u[] = 'SpaceBison'
u[] = 'Spank'
u[] = 'spanner'
u[] = 'sqlmap'
u[] = 'Sqworm'
u[] = 'Stripper'
u[] = 'Sucker'
u[] = 'SuperBot'
u[] = 'Super Happy Fun'
u[] = 'SuperHTTP'
u[] = 'Surfbot'
u[] = 'suzuran'
u[] = 'Szukacz'
u[] = 'tAkeOut'
u[] = 'TightTwatBot'		; WTF?!
u[] = 'Titan'
u[] = 'Teleport'
u[] = 'Telesoft'
u[] = 'TrackBack'
u[] = 'True_Robot'
u[] = 'Turing Machine'
u[] = 'turingos'
u[] = 'TurnitinBot'
u[] = 'Ubuntu/9.25'
u[] = 'unspecified'
u[] = 'user'
u[] = 'User Agent:'
u[] = 'User-Agent:'
u[] = 'VoidEYE'
u[] = 'w3af'
u[] = 'Warning'
u[] = 'Web Image Collector'
u[] = 'WebaltBot'
u[] = 'WebAuto'
u[] = 'WebFetch'
u[] = 'WebGo'
u[] = 'WebmasterWorldForumBot'
u[] = 'WebSauger'
u[] = 'WebSite-X Suite'
u[] = 'Website eXtractor'
u[] = 'Website Quester'
u[] = 'Webster'
u[] = 'WebWhacker'
u[] = 'WebZIP'
u[] = 'Whacker'
u[] = 'Widow'
u[] = 'Winnie Poh'
u[] = 'Win95'			; These are too old. Likely bots
u[] = 'Win98'
u[] = 'WinME'
u[] = 'Win 9x 4.90'
u[] = 'Windows 3'
u[] = 'Windows 95'
u[] = 'Windows 98'
u[] = 'Windows NT 4'
u[] = 'Windows NT;'
u[] = 'Windows NT 5.0;)'
u[] = 'Windows NT 5.1;)'
u[] = 'Windows XP 5'
u[] = 'WISEbot'
u[] = 'WISENutbot'
u[] = 'Wordpress'		; Vulnerability scanner
u[] = 'WWWOFFLE'
u[] = 'Vacuum'
u[] = 'VCI'
u[] = 'Xaldon'
u[] = 'Xenu'
u[] = 'Zeus'
u[] = 'ZmEu'
u[] = 'Zyborg'
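
To show how a list like this gets used, here is a small sketch, assuming the file is read with PHP's ini parser so that the repeated u[] keys end up as one array. It uses parse_ini_string() on a three-entry excerpt instead of the real file, and matchesFragment() is a name invented for the example.

```php
<?php
// Three fragments borrowed from the list above, standing in for the file
$ini = <<<'INI'
u[] = 'HTTrack'
u[] = 'libwww-perl'
u[] = 'sqlmap'
INI;

// Repeated "u[]" keys become a numerically indexed array under 'u'
$list = parse_ini_string( $ini );

/**
 * Hypothetical helper: returns the first fragment found in the
 * user agent string, or null if nothing matched.
 */
function matchesFragment( $ua, array $fragments ) {
	foreach ( $fragments as $fragment ) {
		// Haystack first, needle second. stripos() ignores case.
		if ( false !== stripos( $ua, $fragment ) ) {
			return $fragment;
		}
	}
	
	return null;
}

var_dump( matchesFragment( 'sqlmap/1.0 (http://sqlmap.org)', $list['u'] ) );
```

Since this is plain substring matching, short fragments will catch far more than intended, so the list is worth trimming for your own traffic.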

The verified search engines

; Whitelisted popular bots and corresponding IP addresses
; Note: This isn't exhaustive and will likely fail on a few 
; legitimate visits from these. This is mostly to prevent spoofers.
; 
; http://chceme.info/ips/
; http://www.webmasterworld.com/search_engine_spiders/4475767.htm
; http://www.internetofficer.com/web-robot/yahoo/

[Google]
i[] = '64.233.160.0/19'
i[] = '66.102.0.0/20' 
i[] = '66.249.64.0/19'
i[] = '72.14.192.0/18' 
i[] = '74.125.0.0/16' 
i[] = '209.85.128.0/17' 
i[] = '216.239.32.0/19'

[Bing_Live_MS Search_MSN]
i[] = '64.4.0.0/18'
i[] = '65.52.0.0/14'
i[] = '131.253.21.0/24'
i[] = '131.253.22.0/23'
i[] = '131.253.24.0/21'
i[] = '131.253.32.0/20'
i[] = '157.54.0.0/15'
i[] = '157.56.0.0/14'
i[] = '157.60.0.0/16'
i[] = '207.46.0.0/16'
i[] = '207.68.128.0/18'
i[] = '207.68.192.0/20'

[Inktomi_Slurp_SearchMonkey_Yahoo] 
i[] = '8.12.144.0/24'
i[] = '66.196.64.0/18'
i[] = '66.228.160.0/19'
i[] = '67.195.0.0/16'
i[] = '68.142.192.0/18'
i[] = '68.180.128.0/17'
i[] = '72.30.0.0/16'
i[] = '74.6.0.0/16'
i[] = '202.160.176.0/20'
i[] = '209.191.64.0/18'

[Baidu]
i[] = '61.135.190.1/32'		; CN...
i[] = '61.135.190.2/31'
i[] = '61.135.190.4/30'
i[] = '61.135.190.8/29'
i[] = '61.135.190.16/28'
i[] = '61.135.190.32/27'
i[] = '61.135.190.64/26'
i[] = '61.135.190.128/26'
i[] = '61.135.190.192/27'
i[] = '61.135.190.224/28'
i[] = '61.135.190.240/29'
i[] = '61.135.190.248/30'
i[] = '61.135.190.252/31'
i[] = '61.135.190.254/32'
i[] = '119.63.192.0/21'		; JP...
i[] = '119.63.192.128/26'
i[] = '119.63.192.192/27'
i[] = '119.63.192.224/28'
i[] = '119.63.192.240/29'
i[] = '119.63.192.248/30'
i[] = '119.63.192.252/31'
i[] = '119.63.192.254/32'
i[] = '119.63.193.0/24'
i[] = '119.63.196.1/32'
i[] = '119.63.196.2/31'
i[] = '119.63.196.4/30'
i[] = '119.63.196.8/29'
i[] = '119.63.196.16/28'
i[] = '119.63.196.32/27'
i[] = '119.63.196.64/26'
i[] = '119.63.198.0/24'
i[] = '119.63.199.103/32'
i[] = '123.125.64.0/18'		; CN...
i[] = '123.125.66.0/24'
i[] = '123.125.71.0/24'
i[] = '180.76.0.0/16'
i[] = '180.76.5.0/24'
i[] = '180.76.6.0/24'
i[] = '220.181.0.0/18'
i[] = '220.181.7.0/24'
i[] = '220.181.108.0/24'

[Yandex]
i[] = '77.88.0.0/18'
i[] = '77.88.22.0/23'
i[] = '77.88.24.0/21'
i[] = '77.88.24.0/22'
i[] = '77.88.28.0/22'
i[] = '77.88.36.0/23'
i[] = '77.88.42.0/23'
i[] = '77.88.44.0/24'
i[] = '77.88.50.0/23'
i[] = '87.250.224.0/19'
i[] = '87.250.230.0/23'
i[] = '87.250.252.0/22'
i[] = '93.158.128.0/18'
i[] = '93.158.137.0/24'
i[] = '93.158.144.0/21'
i[] = '93.158.144.0/23'
i[] = '93.158.146.0/23'
i[] = '93.158.148.0/22'
i[] = '95.108.128.0/17'
i[] = '95.108.128.0/24'
i[] = '95.108.152.0/22'
i[] = '95.108.216.0/23'
i[] = '95.108.240.0/21'
i[] = '95.108.248.0/23'
i[] = '178.154.128.0/17'
i[] = '178.154.160.0/22'
i[] = '178.154.164.0/23'
i[] = '199.36.240.0/22'
i[] = '213.180.192.0/19'
i[] = '213.180.204.0/24'
i[] = '213.180.206.0/23'
i[] = '213.180.209.0/24'
i[] = '213.180.218.0/23'
i[] = '213.180.220.0/23'
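
A rough sketch of how these sections could be checked, assuming IPv4 only and assuming the file is parsed with sections enabled. The two ranges are copied from the list above. inCidr() is a hypothetical standalone function written for this example, not the class method.

```php
<?php
$ini = <<<'INI'
[Google]
i[] = '66.249.64.0/19'

[Baidu]
i[] = '180.76.0.0/16'
INI;

// The second argument (true) keeps the [Section] names as array keys
$bots = parse_ini_string( $ini, true );

/**
 * Hypothetical helper: true if an IPv4 address falls inside a CIDR range
 */
function inCidr( $ip, $range ) {
	list( $subnet, $bits ) = explode( '/', $range );
	
	$ipLong  = ip2long( $ip );
	$subLong = ip2long( $subnet );
	
	// ip2long() returns false for anything that isn't a valid IPv4 address
	if ( false === $ipLong || false === $subLong ) {
		return false;
	}
	
	// Keep only the network part of both addresses, then compare
	$mask = -1 << ( 32 - (int) $bits );
	
	return ( $ipLong & $mask ) === ( $subLong & $mask );
}

var_dump( inCidr( '66.249.66.1', $bots['Google']['i'][0] ) ); // bool(true)
```

A user agent claiming to be one of these bots while coming from an address outside every listed range is what the firewall treats as a spoof.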

The ‘Bad URIs’

; URL fragments indicating possible SQL injection or 
; directory traversal attempts. Part of the matches from Bad Behavior
; 
; http://www.technicalinfo.net/papers/URLEmbeddedAttacks.html


u[] = '0x31303235343830303536'
u[] = '../'
u[] = '..\\'
u[] = '..%2F'
u[] = '..%u2216'
u[] = '?=PHP'				; Attempt to reveal PHP version
u[] = '%60information_schema%60'
u[] = ';DECLARE%20@'
u[] = '%7e'
u[] = '%3cscript%20'
u[] = '%27%3b%20'
u[] = '%22http%3a%2f%2f'
u[] = '%255c'
u[] = '%%35c'
u[] = '%25%35%63'
u[] = '%c0%af'
u[] = '%c1%9c'
u[] = '%c1%pc'
u[] = '%c0%qf'
u[] = '%c1%8s'
u[] = '%c1%1c'
u[] = '%c1%af'
u[] = '%e0%80%af'
u[] = '%u'
u[] = '+%2F*%21'
u[] = '%27--'
u[] = '%27 --'
u[] = '%27%23'
u[] = '%27 %23'
u[] = 'benchmark%28'
u[] = 'insert+into+'
u[] = 'r3dm0v3'
u[] = 'select+1+from'
u[] = 'union+all+select'
u[] = 'union+select'
u[] = 'waitfor+delay+'
u[] = 'w00tw00t'
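
One thing worth spelling out: the request URI PHP hands over in $_SERVER['REQUEST_URI'] is still percent-encoded, which is why so many entries above are written in encoded form. Below is a minimal sketch of the check on a few of those entries. uriIsSuspicious() is an invented name, and the sample URIs are made up.

```php
<?php
// A handful of fragments from the list above
$ini = <<<'INI'
u[] = '../'
u[] = '%27--'
u[] = 'union+select'
INI;

$parsed    = parse_ini_string( $ini );
$fragments = $parsed['u'];

/**
 * Hypothetical helper: true if the raw (still encoded) URI contains
 * any of the listed fragments, ignoring case.
 */
function uriIsSuspicious( $uri, array $fragments ) {
	foreach ( $fragments as $fragment ) {
		if ( false !== stripos( $uri, $fragment ) ) {
			return true;
		}
	}
	
	return false;
}

var_dump( uriIsSuspicious( '/index.php?page=../../etc/passwd', $fragments ) ); // bool(true)
var_dump( uriIsSuspicious( '/forum/topic/12', $fragments ) ); // bool(false)
```

Being a blacklist, this only catches the encodings someone thought to list, so it's no substitute for proper input handling further in.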

And, finally, a ‘FireEntry’ example model. This can show what variables would be saved to the db.

<?php


namespace Models;

class FireEntry extends base {
	
	/**
	 * @var string Assigned label (not UA, but what the firewall determined)
	 */
	public $label	= 'unknown';
	
	
	/**
	 * @var string Request method
	 */
	public $method	= '';
	
	
	/**
	 * @var string Accessed URI
	 */
	public $uri	= '';
	
	
	
	/**
	 * @var string Accessing IP
	 */
	public $ip	= '';
	
	
	
	/**
	 * @var string User Agent string
	 */
	public $ua	= '';
	
	
	
	/**
	 * @var string Complete header string
	 */
	public $headers	= '';
	
	
	
	/**
	 * @var string Requested server protocol
	 */
	public $protocol = '';
	
	
	
	/**
	 * @var string Firewall action (blocked, passed etc...)
	 */
	public $response = '';
	
	
	
	/**
	 * @var string Time the request was received
	 */
	public $reqtime = '';
	
	
	public function __construct( array $data = null ) {
		
		if ( empty( $data ) ) {
			return;
		}
		
		foreach ( $data as $field => $value ) {
			$this->$field = $value;
		}
	}
	
	
	public function save() {
		$time	= parent::_myTime( time() );
		$row	= 0;
		
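		$headers = '';
		if ( !empty( $this->headers ) ) {
			// Assumption: headers arrive as a name => value array 
			// and are stored as one "Name: value" line each
			if ( is_array( $this->headers ) ) {
				foreach ( $this->headers as $k => $v ) {
					$headers .= $k . ': ' . $v . "\n";
				}
			} else {
				$headers = $this->headers;
			}
		}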
		if ( empty( $this->reqtime ) ) {
			$this->reqtime = $time;
		} else {
			$this->reqtime = parent::_myTime( $this->reqtime );
		}
		
		$params = array(
			'label'		=> $this->label,
			'method'	=> $this->method,
			'uri'		=> $this->uri,
			'ip'		=> $this->ip,
			'ua'		=> $this->ua,
			'headers'	=> $headers,
			'protocol'	=> $this->protocol,
			'response'	=> $this->response,
			'reqtime'	=> $this->reqtime,
			'updated_at'	=> $time
		);
		
		var_dump( $params );
		//parent::put( 'firewall', $params );
	}
	
	
	
	public static function find( $filter = array() ) {
		// TODO: Filter
		
	}
	
	
	public static function gc( $exp ) {
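		$sql	= "DELETE FROM firewall WHERE ( created_at < :exp );";
		$param	= array( 'exp' => $exp );
		
		parent::init();
		
		// Assumes parent::$db is a PDO connection: prepare() returns 
		// the statement that execute() must be called on
		$stmt	= parent::$db->prepare( $sql );
		$stmt->execute( $param );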
	}
	
	
	private static function filterConfig( &$filter = array() ) {
		$filter['limit']	= isset( $filter['limit'] ) ? $filter['limit'] : 10;
		$filter['page']		= isset( $filter['page'] ) ? $filter['page'] : 1;
		$filter['search']	= isset( $filter['search'] ) ? 
						$filter['search'] : '';
		
		$filter['offset']	= parent::_offset( 
						$filter['page'] , 
						$filter['limit']
					);
	}
}
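
Since the base class isn't shown here, the following is a stripped-down stand-in with no database layer, just to illustrate what a saved row might look like. EntrySketch and toRow() are hypothetical names, the sample values are made up, and the headers are assumed to arrive as a name => value array.

```php
<?php
/**
 * Sketch only: a stand-in for the model with a few of its fields
 */
class EntrySketch {
	public $method   = '';
	public $uri      = '';
	public $ip       = '';
	public $headers  = array();
	public $response = '';
	
	public function __construct( array $data = array() ) {
		foreach ( $data as $field => $value ) {
			// Ignore keys that are not declared properties
			if ( property_exists( $this, $field ) ) {
				$this->$field = $value;
			}
		}
	}
	
	/**
	 * Flattens the entry into the array a database layer could store
	 */
	public function toRow() {
		$lines = array();
		foreach ( $this->headers as $name => $value ) {
			$lines[] = $name . ': ' . $value;
		}
		
		return array(
			'method'   => $this->method,
			'uri'      => $this->uri,
			'ip'       => $this->ip,
			'headers'  => implode( "\n", $lines ),
			'response' => $this->response
		);
	}
}

$entry = new EntrySketch( array(
	'method'   => 'get',
	'uri'      => '/index.php?page=../',
	'ip'       => '203.0.113.9',
	'headers'  => array( 'Accept' => '*/*', 'User-Agent' => 'ExampleAgent/1.0' ),
	'response' => 'Failed: URI check',
	'bogus'    => 'ignored'
) );

$row = $entry->toRow();
print_r( $row );
```

Skipping unknown keys in the constructor is a choice made for this sketch. The draft model above assigns whatever it is given.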
