One of the largest discussions on the PHP internals list began last Friday with a lengthy proposal by Wietse Venema. At the time of writing, the proposal has generated over 100 responses. Everyone on the list had their say. Now that the two Jesters and the King of Spades (Andi Gutmans, Zeev Suraski and Rasmus Lerdorf) have given their opinions, it might be time for a recap. At this point, it’s almost certain that an implementation will be written, and much less certain that it will be incorporated.
So, what is taint mode all about? It’s about automatically marking data from external sources as unsafe and making sure it’s sanitized before being used. Here is a sample showing what it does in detail. For now, Beep! will indicate an error; a hundred responses were not enough to decide what to do with these.
$result = mysql_query( "SELECT * FROM foo
    WHERE bar = '{$_GET['baz']}'" ); // Beep!

$result = mysql_query( "SELECT * FROM foo
    WHERE bar = '" .
    mysql_real_escape_string($_GET['baz']) . "'" ); // OK

while( $row = mysql_fetch_row( $result ) )
{
    echo $row[0]; // Beep!
    echo htmlentities( $row[0] ); // OK
    echo (untainted) $row[0]; // OK, but no decision on syntax for this one
}

$stmt = $pdo_dbh->prepare( "INSERT INTO foo (bar, baz)
    VALUES(:a, :b)" );
$stmt->execute(
    array( ':a' => $_GET['baz'], ':b' => 123 ) ); // OK
Basically, taint mode only ensures that an appropriate escaping mechanism is used before information is sent to an external system. It does not replace input filtering, as the sample above still contains many flaws.
The opposition to taint mode, led by the Jack of Hearts (Ilia Alshanetsky), fears that taint mode will just become another safe_mode, causing more harm than good. In a worst-case scenario, ISPs would enable taint mode thinking it makes their servers safe from everything, which, as everyone agrees, is not true. Fooling taint mode is fairly easy.
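To make the idea more concrete, here is a rough userland sketch of single-bit tainting. Everything in it is hypothetical: the TaintedValue class and the safe_echo() function are names I made up, and the real proposal would track the bit inside the engine rather than in a wrapper class. The only point is to show the flow: external data starts out marked, an escaping function hands back a clean string, and a sink refuses anything still marked.

```php
<?php
// Userland sketch of single-bit taint tracking. All names are hypothetical;
// the actual proposal would keep the bit inside the engine, not in a class.

final class TaintedValue
{
    private $value;

    public function __construct($value)
    {
        $this->value = (string) $value;
    }

    // The raw string only leaves the wrapper through an escaping function,
    // so whatever comes out is a plain (untainted) string.
    public function escapeWith(callable $escaper)
    {
        return $escaper($this->value);
    }
}

// A "sink": refuses anything that is still wrapped, i.e. still tainted.
function safe_echo($data)
{
    if ($data instanceof TaintedValue) {
        throw new RuntimeException('Beep! tainted data reached the output');
    }
    return $data; // a real sink would echo; returning keeps the sketch testable
}

$input = new TaintedValue('<b>hi</b>');      // stands in for $_GET['baz']
$clean = $input->escapeWith('htmlentities'); // plain string, safe to print
```

Passing $clean to safe_echo() works, while passing $input throws. Note that the sketch has exactly the weakness discussed next: it has no idea whether the escaping function matches the sink.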
exec( mysql_real_escape_string( $_GET['command'] ) ); // OK
To prevent this, taint mode would need to be aware of the context in which a value is used, and tracking that takes more than 1 bit, as the current proposal indicates. A bulletproof solution would add too much overhead to be acceptable; early in the discussion, even the overhead of a single check was a concern.
Those supporting the idea (supporting the proposal is not quite the same thing) think it would be a good tool for security-conscious developers who want to improve the security of their applications. Those who want to ignore security will be able to do so anyway (PHP has always allowed anyone to shoot themselves in the foot). Even if taint mode does not catch every security issue, catching 90% is better than nothing. Plus, it could be a good educational tool for those willing to learn. Public perception of the new language feature matters, and careful communication will be required to avoid problems with ISPs.
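For illustration only, here is what context awareness could look like with one bit per kind of sink instead of a single taint bit. The constants, the ContextTainted class and both helper functions are my own hypothetical names, not anything from the proposal, and addslashes() merely stands in for mysql_real_escape_string() so the sketch runs without a database connection.

```php
<?php
// Hypothetical context flags: one bit per kind of sink.
const TAINT_SQL   = 1;
const TAINT_HTML  = 2;
const TAINT_SHELL = 4;
const TAINT_ALL   = 7; // external data starts out unsafe for every context

final class ContextTainted
{
    public $value;
    public $taint;

    public function __construct($value, $taint = TAINT_ALL)
    {
        $this->value = (string) $value;
        $this->taint = (int) $taint;
    }
}

// Escaping clears only the bit for the context the escaper is meant for.
function clear_taint(ContextTainted $data, callable $escaper, $context)
{
    return new ContextTainted($escaper($data->value), $data->taint & ~$context);
}

// A sink asks whether the bit for its own context has been cleared.
function is_safe_for(ContextTainted $data, $context)
{
    return ($data->taint & $context) === 0;
}

$command    = new ContextTainted('ls; rm -rf /'); // stands in for $_GET['command']
$sqlEscaped = clear_taint($command, 'addslashes', TAINT_SQL);
```

With this, is_safe_for($sqlEscaped, TAINT_SQL) is true but is_safe_for($sqlEscaped, TAINT_SHELL) is still false, so the exec() trick above would beep. The cost is the extra bookkeeping on every value, which is exactly the overhead people were worried about.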
Enabling, disabling and error reporting
One thing is certain: it will be disabled by default to keep backwards compatibility. Even the best applications out there would probably fail the taint mode test. Take this safe example:
$foo = 123; // tainted, from external source
if( is_numeric( $foo ) )
    echo $foo;
The example is perfectly safe, but taint mode will not understand this. Most applications will not run under taint mode without major modifications. This is why some suggested it should only be used for new development. Some even suggested that, just like error reporting, it should be disabled in production. It was proposed that the checks could be enabled or disabled at compile time to avoid any overhead in a production environment, but this idea did not get much attention. The other option is of course to make it a php.ini setting, which would also make things easier for Windows users who are not used to compiling PHP.
Three modes were proposed.
- Disabled.
- Audit mode. Application would still run, but problems would be logged to a file.
- Enforcement mode. Kill the script on taint error.
The last one, now known as mode 3, was loudly rejected by both sides for giving a false sense of security. It would encourage hosts to enable it and only allow taint mode compliant applications to run on the server. Passing the taint mode test does not mean the application is secure, and that is not the message anyone wants to send. Plus, it would encourage application writers to patch their applications blindly just to get them to run on shared hosts.
One proposed option is to add an E_TAINT error reporting level, which would make it easy to enable or disable the reporting and would take advantage of all the existing logging mechanisms.
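Here is a small sketch of how the three modes differ when a taint error is found. The TAINT_MODE_* constants and the report_taint() function are hypothetical names of mine, and E_TAINT itself does not exist in PHP today. To keep the sketch easy to run, audit mode appends to an array; a real implementation would presumably go through the normal error reporting and logging machinery instead.

```php
<?php
// Hypothetical constants mirroring the three proposed modes.
const TAINT_MODE_DISABLED = 0;
const TAINT_MODE_AUDIT    = 1;
const TAINT_MODE_ENFORCE  = 2;

// Handles one taint error according to the mode.
// Returns true when the script is allowed to keep running.
function report_taint($mode, $message, array &$log)
{
    switch ($mode) {
        case TAINT_MODE_DISABLED:
            return true;              // nothing is checked or recorded
        case TAINT_MODE_AUDIT:
            $log[] = $message;        // record the problem, keep running
            return true;
        default:
            throw new RuntimeException($message); // enforcement: kill the script
    }
}

$log = array();
report_taint(TAINT_MODE_DISABLED, 'echo of tainted $row[0]', $log);
report_taint(TAINT_MODE_AUDIT, 'echo of tainted $row[0]', $log);
```

After these two calls, $log holds a single entry from the audit call. Calling report_taint() with TAINT_MODE_ENFORCE throws, which is the behaviour both sides rejected.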
After all, this is the internals list
Of course, there were discussions about implementation details: how much source code would have to be modified and how taint tracking would be integrated into the zval. The conclusion was that it should not be too complicated, but a base implementation is needed before more details can be given.
No matter what, the workload can’t be worse than converting 3000 functions for Unicode compliance in PHP 6. As for when taint mode would be included: no word yet. Obviously PHP 6 if it’s accepted, but who knows, it might even make it into 5.3.
Some off topic issues
On the topic of perception by ISPs and good communication of the purpose of taint mode, it was proposed to rename php.ini-dist to php.ini-development and php.ini-recommended to php.ini-production in the distribution for clarity. The former would include E_ALL error reporting and probably taint mode, and the latter would disable error output and taint mode. Unlike most issues on the thread, most seemed to agree on the rename, while the content of the files is not likely to change.
A proposal was made to kill $_GET/$_POST/$_ENV/$_COOKIE/$_SESSION/$_REQUEST altogether. The point was that the filter extension provides a better solution and that those legacy superglobals now cause more harm than good. I think this one was rejected without any discussion. Can you imagine how many applications would break? But the point is still valid: ext/filter is a better option.
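For those who have not tried ext/filter yet, here is a minimal sketch of validating input with it. The $query array is an assumption of mine that stands in for $_GET so the example runs from the command line; in a real request, filter_input( INPUT_GET, 'id', FILTER_VALIDATE_INT ) reads the value directly.

```php
<?php
// $query stands in for $_GET so the sketch can run outside a web request.
$query = array( 'id' => '42', 'page' => 'abc', 'offset' => '-5' );

// A valid integer string comes back as a real integer.
$id = filter_var( $query['id'], FILTER_VALIDATE_INT );

// Anything that is not an integer comes back as false.
$page = filter_var( $query['page'], FILTER_VALIDATE_INT );

// Options can narrow the accepted range; -5 is below min_range, so false.
$offset = filter_var(
    $query['offset'],
    FILTER_VALIDATE_INT,
    array( 'options' => array( 'min_range' => 1 ) )
);
```

Here $id ends up as the integer 42, while $page and $offset are both false. Note the strict comparison you need when checking the result, since a valid 0 is also falsy.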
Where do I stand?
I think taint mode combined with the filter extension could drastically change the way PHP applications are written. Ever since PHP 5, PHP has only become more elegant as a language. This new proposal would actually enforce good practices and place a focus on security. I think taint mode is only a tool for developers. Some PHP developers out there don’t care about security. Fine, they don’t have to use it. But some do, and some make efforts to make their applications secure. Having taint mode would help these people catch that one place where they forgot to escape a value. While it won’t guarantee that the application is completely secure, it will at least give some confidence that most of it is safe.
