Defensive web development

Whether the currency in question is dollars, Bitcoin, moral principles or infamy, a compromised site is just the end result of a business transaction. The purpose of this post is to consider the basic options for making this business unfavorable to an attacker, not to eliminate it altogether. There are circumstances in which the business of compromise will still take place, whether conditions are extremely unfavorable or, unforeseen to you, favorable. Although some of the examples are in PHP as implemented on a typical *nix environment, the ideas here should apply to most other development environments.

Broad premises

Reasons for compromise beyond “because they could” should be considered irrelevant.

You will not think of every conceivable approach to compromise, so plan for contingencies. Always keep current backups, leave customer data segregated and encrypted, and never test on a production machine or connect to a production environment during testing. Always turn off debugging info and error messages where they may be seen by clients. Never store passwords, keys to storage servers, authentication tokens etc. in your script files. If these must be used in some way by your code, try storing them in php.ini or in a per-user .ini file in a folder outside the web root, one that PHP has read access to but the HTTP server does not.
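As a minimal sketch of the last point, credentials can be read from an .ini file that lives outside the web root. The function name `loadSecrets` and the section and key names in the usage note are assumptions made up for illustration; only `parse_ini_file()` and the file checks are standard PHP.

```php
<?php
// Hypothetical helper: read credentials from an .ini file kept outside the web root.
function loadSecrets( string $path ): array {
	// Refuse paths that don't exist or can't be read by PHP
	if ( !is_file( $path ) || !is_readable( $path ) ) {
		throw new RuntimeException( 'Secrets file is missing or unreadable' );
	}

	// Parse with sections; INI_SCANNER_RAW keeps values as literal strings
	$secrets = parse_ini_file( $path, true, INI_SCANNER_RAW );
	if ( $secrets === false ) {
		throw new RuntimeException( 'Secrets file could not be parsed' );
	}
	return $secrets;
}
```

With a file containing a `[database]` section, `loadSecrets( $path )['database']['pass']` would return the password without it ever appearing in a script under the web root. Where the file actually lives and who may read it is still a matter of server configuration.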

What do you do when they come for you

Enable two-factor authentication for any critical services that offer it (especially your email). If you have login or administrator privileges for your project, never use HTML email. In fact, I’d recommend not using HTML in emails at all and filtering any clickable links into plain URLs that you can copy and paste if you need to visit them.

You won't always see it coming. Even if you do, you may not be able to avoid it.

Try to avoid “I’ve done everything I could” and “that’s probably OK” lines of thought, but do prioritize critical sections and continue to explore responses to undesirable inputs and conditions. For example, try to throw strings or whole files at fields where you were expecting an integer. The type of input, e.g. <select>, <input type="email"> etc., means nothing to someone who has the “action” URL of your form. Send ridiculously large text, cookies, binaries or otherwise malformed content and see how the server responds. Always validate and sanitize client data.

In the same vein, whitelists are preferable to blacklists when filtering. Only allowing inputs that follow a known set of acceptable criteria is simply a matter of practicality (and in most cases, feasibility, since you probably lack omniscience). An attacker need not succeed on every attempt at compromise, but a defender only gets to fail once. And that single failure could be catastrophic.
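A whitelist check can be quite small. The two helpers below are hypothetical names chosen for this sketch: one accepts a value only if it appears on a known list, the other accepts only a whole number within a range, using PHP’s `filter_var()` with `FILTER_VALIDATE_INT`.

```php
<?php
// Hypothetical helper: accept a value only if it is on a known list.
function pickAllowed( $value, array $allowed, $default = null ) {
	// Strict comparison so that 0, '' and null never match by accident
	return in_array( $value, $allowed, true ) ? $value : $default;
}

// Hypothetical helper: accept a whole number within a range, or null.
function intInRange( $value, int $min, int $max ): ?int {
	$result = filter_var( $value, FILTER_VALIDATE_INT, array(
		'options' => array( 'min_range' => $min, 'max_range' => $max )
	) );

	// filter_var() returns false for anything that isn't a valid integer
	return ( $result === false ) ? null : $result;
}
```

Anything not explicitly allowed falls through to the default or to null, so a string or an array sent where an integer was expected is simply rejected rather than guessed at.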

Always make sure your read/write/execute privileges are appropriate to minimize chances of accidental exposure. Never allow uploads to folders that have execute permissions and never allow write permissions on executable folders. Put script files outside your web root whenever possible and try to avoid applications and web hosts that limit these options. Consider putting file uploads outside the web root as well and let your scripting handle access to them by stripping out invalid path characters and specifying which directory to search. This creates some additional overhead, but it prevents the HTTP server from reading uploads directly, which could otherwise allow directory traversal if the server isn’t configured properly.
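One way to let a script mediate access to uploads is sketched below. The function name `resolveUpload` and the set of characters it allows are assumptions for illustration; the idea is to strip the client-supplied name down to a bare filename and then confirm that the resolved path still sits inside the upload folder.

```php
<?php
// Hypothetical helper: map a client-supplied name to a file inside
// $baseDir, or return null if anything looks wrong.
function resolveUpload( string $baseDir, string $name ): ?string {
	// Keep only the final path component, then drop unexpected characters
	$name = basename( str_replace( '\\', '/', $name ) );
	$name = preg_replace( '/[^A-Za-z0-9._-]/', '', $name );

	// Reject empty names and dotfiles such as .htaccess
	if ( $name === '' || $name[0] === '.' ) {
		return null;
	}

	// realpath() resolves symlinks and returns false for missing files
	$base = realpath( $baseDir );
	$full = realpath( $baseDir . DIRECTORY_SEPARATOR . $name );
	if ( $base === false || $full === false ) {
		return null;
	}

	// The resolved file must live directly inside the upload folder
	if ( dirname( $full ) !== $base || !is_file( $full ) ) {
		return null;
	}
	return $full;
}
```

A request for `../../etc/passwd` is reduced to `passwd`, which does not exist in the upload folder, so the function returns null instead of a path.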

Client requests

Stick to what you can actually digest

Read on GET, act on POST, do nothing special on HEAD, use PUT or PATCH with extreme caution, filter all and let the rest die();

The GET method is for retrieval, i.e. reading, and you should concentrate on that. Generally, we want to avoid writing to a database on GET unless it’s for statistics or analytics purposes (*).

* Analytics needs a major overhaul. You don’t need to record everything a visitor does on your page and almost everything you do record will be obsolete fairly quickly. So unless you run an ad company, keep analytics to an absolute minimum. Always remember, more “things” are more moving parts and moving parts tend to fail.

POST should be used for creating new content, e.g. pages, posts, comments etc. When the database auto-increments IDs or otherwise generates unique identifiers for you, POST is a great way to handle content creation. When using PUT or PATCH, you’re telling the server what the name of the resource is. This is not quite the same as a content post title which can double as a URL slug; the database still has an auto-generated ID unique to that post. The resource handler needs to account for name conflict resolution, and the fact that PUT is idempotent. That is, sending the same request several times leaves the resource in the same state as sending it once, so a client may repeat it. POST makes no such promise, which is why you often have to guard against content being submitted twice.

PATCH is a special case that gets abused often (almost as much as PUT) and it’s simply a set of instructions on how to modify a resource already present on the server. Learn more about these methods before implementing PUT or PATCH.

Never touch $_GET, $_POST or $_FILES directly throughout your application. Always use filters and sanitization to ensure you’re getting the type of content you expected. For $_GET, Regular Expressions will usually suffice since we’re not dealing with HTML. Never handle HTML content with regex. The following is a friendly URL router for a possible blog or similar application.

<?php

namespace Blog; //... Or something

class Router {
	
	/**
	 * @var array Methods, routes and callbacks
	 */
	private static $routes	= array();
	
	/**
	 * Router constructor
	 */
	public function __construct() {	}
	
	/**
	 * Add a request method with an accompanying route and callback
	 * 
	 * @param	string		$method Lowercase request method
	 * @param	string		$route Simple regex route path
	 * @param	callable	$callback Function call
	 */
	public function add( $method, $route, $callback ) {
		// Format the regex pattern
		$route = self::cleanRoute( $route );
		
		// First time we're adding a path to this method?
		if ( !isset( self::$routes[$method] ) ) {
			 self::$routes[$method] = array();
		}
		
		// Add a route to this method and set callback as value
		self::$routes[$method][$route] = $callback;
	}
	
	/**
	 * Sort all sent routes for the current request method, iterate 
	 * through them for a match and trigger the callback function
	 */
	public function route() {
		if ( empty( self::$routes ) ) { // No routes?
			$this->fourOhFour();
		}
		
		// Client request path, without the query string
		$path	= parse_url( $_SERVER['REQUEST_URI'], PHP_URL_PATH ) ?: '/';
		
		// Client request method
		$method = strtolower( $_SERVER['REQUEST_METHOD'] );
		
		// No routes for this method?
		if ( empty( self::$routes[$method] ) ) {
			$this->fourOhFour();
		}
		
		// Found flag
		$found	= false;
		
		// For each path in each method, iterate until match
		foreach( self::$routes[$method] as $route => $callback ) {
			
			// Found a match for this method on this path
			if ( preg_match( $route, $path, $params ) ) {
				
				$found = true; // Set found flag
				if ( count( $params ) > 0) {
					// Clean parameters
					array_shift( $params );
				}
				
				// Trigger callback
				return call_user_func_array( 
					$callback, $params 
				);
			}
		}
		
		// We didn't find a path 
		if ( !$found ) {
			$this->fourOhFour();
		}
	}
	
	/**
	 * Paths are sent in bare. Make them suitable for matching.
	 * 
	 * @param	string		$route URL path regex
	 */
	private static function cleanRoute( $route ) {
		// Escape literal dots, then anchor the pattern to the whole path
		$regex	= str_replace( '.', '\.', $route );
		return '@^/' . $regex . '/?$@i';
	}
	
	/**
	 * Possible 404 not found handler. 
	 * Something that looks nicer should be used in production.
	 */
	private function fourOhFour() {
		http_response_code( 404 ); // Send the matching status code
		die( "<em>Couldn't find the page you're looking for.</em>" );
	}
}

You can then use it as follows:

// Main index. 
function index( $page = 1 ) {
	// Do something with the given page number
}

function read( $id, $page = 1 ) {
	// Do something with $id and page number
}

// Now, you can create the router
$router		= new Blog\Router();

// Browsing index or homepage
$router->add( 'get', '', 'index' );
$router->add( 'get', '([1-9][0-9]*)', 'index' );

// Note: The regex requires the page number to start with 1-9,
// followed by any number of further digits

// Specific post (no leading slash; the router adds it)
$router->add( 'get', 'post/([1-9][0-9]*)', 'read' );
$router->add( 
	'get', 
	'post/([1-9][0-9]*)/([1-9][0-9]*)', // ID and pages start with 1-9
	'read' 
);

// Now we can route
$router->route();

When handling POST content, we have to be a little more careful. The following is an example of a content post filter which uses typical fields and PHP’s built-in content filtering:

function getPost() {
	$filter	= array(
		'csrf'	=> FILTER_SANITIZE_FULL_SPECIAL_CHARS,
		'id' 	=> FILTER_SANITIZE_NUMBER_INT,
		'parent'=> FILTER_SANITIZE_NUMBER_INT,
		'title' => FILTER_SANITIZE_FULL_SPECIAL_CHARS,
		'body' 	=> FILTER_SANITIZE_FULL_SPECIAL_CHARS
	);
	
	return filter_input_array( INPUT_POST, $filter );
}

You probably want to do some special formatting for filtering HTML, but this gets rid of the overwhelming majority of undesired inputs a client may send. The filter_input_array function is quite useful for building content with multiple fields at once. When the field has not been sent, the array value will be NULL. You’ll note the ‘csrf’ field. It’s important to ensure that content sent by the user was actually intended, and anti-cross-site request forgery tokens are very helpful in that regard.
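A rough sketch of issuing and checking an anti-CSRF token follows. The helper names are hypothetical, and `$session` stands in for `$_SESSION` so the functions can be exercised on their own; `random_bytes()` and `hash_equals()` are standard PHP 7+ functions.

```php
<?php
// Hypothetical helper: create (or reuse) the token for this session.
// $session stands in for $_SESSION.
function csrfToken( array &$session ): string {
	if ( empty( $session['csrf'] ) ) {
		// 32 random bytes, hex encoded for use in a hidden form field
		$session['csrf'] = bin2hex( random_bytes( 32 ) );
	}
	return $session['csrf'];
}

// Hypothetical helper: check the token that came back with the form.
function csrfValid( array $session, $sent ): bool {
	// A missing field arrives as NULL and must never pass
	if ( empty( $session['csrf'] ) || !is_string( $sent ) ) {
		return false;
	}

	// hash_equals() compares in constant time
	return hash_equals( $session['csrf'], $sent );
}
```

The token would be printed into a hidden ‘csrf’ field when the form is rendered and compared against the filtered ‘csrf’ value when the form is submitted. Because the token is plain hexadecimal, a special-characters filter leaves it unchanged.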

Authentication

Looks mighty suspicious!

The only way to keep communication between a user and the server secure is to use TLS for the connection. Even then, you should avoid storing the username or user ID in the cookie of a logged-in user, as that is sent on each request to the server. Instead, use an ‘auth’ field in your database table that holds a randomly generated hash as the identifier. When the logged-in user visits the site, the random hash is sent to the server and the server can use that to look up the user instead of an ID or username. The ‘auth’ token should be renewed after each successful login.
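The token scheme might look something like the sketch below. The function names and the `$users` array (a stand-in for the database table) are assumptions for illustration. Storing only a SHA-256 digest of the token on the server is an extra precaution added here, not something the scheme above requires.

```php
<?php
// Hypothetical helper: create a fresh token after a successful login.
function newAuthToken(): array {
	$token = bin2hex( random_bytes( 32 ) );
	return array(
		'cookie' => $token,                   // sent to the browser
		'stored' => hash( 'sha256', $token )  // saved in the 'auth' column
	);
}

// Hypothetical helper: find the user that owns a cookie value.
// $users stands in for rows from the database.
function findUserByCookie( array $users, $cookie ): ?array {
	// Only accept the exact shape we generated: 64 hex characters
	if ( !is_string( $cookie ) || !preg_match( '/^[a-f0-9]{64}$/', $cookie ) ) {
		return null;
	}

	$lookup = hash( 'sha256', $cookie );
	foreach ( $users as $user ) {
		if ( isset( $user['auth'] ) && hash_equals( $user['auth'], $lookup ) ) {
			return $user;
		}
	}
	return null;
}
```

In a real application the lookup would be a single indexed query on the ‘auth’ column rather than a loop, and the cookie itself should be set with the Secure and HttpOnly flags.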

As an additional benefit, using an auth hash makes it easy to force a user to log out simply by deleting the hash stored in the database. If you believe a user’s password has been compromised or if the user requests a password reset, it’s best to delete the auth token and send a separate link (which expires within the hour and is valid for a single use) to the user’s email so they can reset the password themselves, instead of generating a new one yourself.
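A single-use reset link that expires within the hour can be sketched as follows. The function names are hypothetical, `$store` stands in for a database table, and the current time is passed in as a parameter so the expiry logic is easy to test.

```php
<?php
// Hypothetical helper: create a reset token valid for one hour.
// $store stands in for a database table keyed by the token's digest.
function issueReset( array &$store, int $userId, int $now ): string {
	$token = bin2hex( random_bytes( 32 ) );
	$store[hash( 'sha256', $token )] = array(
		'user'    => $userId,
		'expires' => $now + 3600
	);
	return $token; // goes into the emailed link
}

// Hypothetical helper: redeem a token, returning the user ID or null.
function redeemReset( array &$store, $token, int $now ): ?int {
	if ( !is_string( $token ) ) {
		return null;
	}

	$key = hash( 'sha256', $token );
	if ( !isset( $store[$key] ) ) {
		return null;
	}

	// Remove the entry first so the link can never be used twice
	$entry = $store[$key];
	unset( $store[$key] );

	return ( $now <= $entry['expires'] ) ? $entry['user'] : null;
}
```

The token is deleted on the first attempt to redeem it whether or not it has expired, so an expired link cannot be retried later.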

If you want to add an additional bit of verification to the cookie, you can add a hash of the client’s request signature. This is not going to be unique at all, but it will make spoofing a tiny bit harder for someone who simply steals the cookie without making note of the browser characteristics of the victim user. Keep in mind that if the cookie was sniffed in clear text, this may not help much. Remember that nothing seen in “HTTP_” header variables is reliable.

function signature() {
	$out = '';
	foreach ( $_SERVER as $k => $v ) {
		switch( $k ) {
			case 'HTTP_ACCEPT_CHARSET':
			case 'HTTP_ACCEPT_ENCODING':
			case 'HTTP_ACCEPT_LANGUAGE':
			case 'HTTP_UA_CPU':
			case 'HTTP_USER_AGENT':
			case 'HTTP_VIA':
			case 'HTTP_CONNECTION':
				$out .= $v;
		}
	}
	return hash( 'tiger160,4', $out );
}

Note that I avoided using the client’s IP address which may change often and is sometimes shared with popular proxies. Storing the output of this hash with the cookie along with the auth token will help to avoid identifying the user by name or user ID using the cookie alone.

From the inside

The hardest position to defend against is when the attacker is on the inside. There’s a large swath of information out there about compartmentalization, decentralization and restricting access to information to those who need to know. Instead, I’ll leave you with this excerpt from The Godfather Part II.

Michael Corleone: There’s a lot I can’t tell you, Tom. Yeah, I know that’s upset you in the past. You felt it was because of a lack of trust or confidence, but it’s… it’s because I admire you and love you that I kept things secret from you. It’s why at this moment you’re the only one I can completely trust.

Fredo. Ah, he’s got a good heart. But he’s weak and he’s stupid. And this is life and death. Tom, you’re my brother.

Tom Hagen: I always wanted to be thought of as a brother by you, Mikey. A real brother.

Michael: You’re gonna take over. You’re gonna be the Don. If what I think has happened has happened, I’m gonna leave here tonight. I give you complete power, Tom. Over Fredo and his men. Rocco, Neri, everyone. I’m trusting you with the lives of my wife and my children. The future of this family.

Tom: If we ever catch these guys do you think we’ll find out who’s at the back of all this?

Michael: We’re not gonna catch ’em. Unless I’m very wrong, they’re dead already. Killed by somebody close to us. Inside. Very, very frightened they’ve botched it.

Tom: But your people, Rocco and Neri, you don’t think they had something to do with this.

Michael: You see, all our people are businessmen. Their loyalty is based on that. One thing I learned from pop, was to try to think as people around you think. Now on that basis, anything is possible.
