laziness, impatience, hubris
 
 



blog.josephhall.com


Friday, April 04, 2008

PHP Worst Practices

Doran left a particularly interesting comment on my post about Windows security. It's interesting that he should do so right about now, because his comment echoed some of my own thoughts about php.

Those who know me know that I don't like php. That's not the point of this post. It's just that so many php programmers seem to have grown up with a lot of bad habits, and lately it's been causing me some grief, even though I don't program in php. I'm not saying that Perl doesn't have issues of its own. But Perl has at least one style guide, maybe more. Having been through Damian Conway's class and book on Perl Best Practices, I'm thoroughly convinced that the majority of complaints that people have about Perl would disappear if Perl programmers as a whole took his teachings to heart.

Unfortunately, php programmers don't seem to have such a guide. And sadly, many (but not all, thankfully) php programmers seem to be little more than script kiddies. They choose php because it's a buzzword, it's easy to use, and they can see results more quickly. Unfortunately, it seems to me that their environments teach them a lot of bad habits. I'm going to pinpoint two things in particular that I've come across lately with php code.

short_open_tag

A friend of mine called me a couple of weeks ago. He hosts a number of websites on an IIS box (don't worry, he's already working on switching to Linux) and some of them use php. He upgraded to the latest version, tested a couple of apps, and decided he was good to go. Then a customer called him complaining that his site had stopped working. He did what he could, and then called upon me in desperation.

I tested a few things. It seemed that php was in fact working on all but that one site. I checked permissions, file associations, even monkeyed with a little of the site's internal code. While I was in there, I noticed something. All of the php code was encapsulated between <? and ?> markers. I opened his php.ini file, found the short_open_tag option and changed it from Off to On. Suddenly, the site worked.

As with many embedded languages, php places its code between markers so that the interpreter knows which part of the file is HTML and which part is server-side code. When php code is used, the opening tag should say <?php, so that there is no doubt about where a block of php code begins. If a programmer leaves it at <? without the php part, it's known as a short open tag, and php only recognizes it when the short_open_tag option is turned on. With the option off, everything between the markers is sent to the browser as plain text instead of being executed, even if the file ends with .php.
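A quick way to find out whether a code base depends on short open tags is to scan it before an upgrade. The following is only a sketch in Python, not part of php itself: the function name and the sample string are made up for illustration, and it deliberately skips <?= (the short echo tag, which is always available from PHP 5.4 on) and <?xml declarations.

```python
import re

# Matches "<?" unless it is followed by "php", "=" or "xml".
# Skipping "<?=" and "<?xml" is an assumption of this sketch.
SHORT_OPEN_TAG = re.compile(r"<\?(?!php\b|=|xml\b)", re.IGNORECASE)


def find_short_open_tags(source: str) -> list[int]:
    """Return the 1-based line numbers that contain a short open tag."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if SHORT_OPEN_TAG.search(line)
    ]


# Hypothetical file contents: only the third line uses a short open tag.
sample = "<?php echo 'ok'; ?>\n<p>hello</p>\n<? echo 'fragile'; ?>\n"
print(find_short_open_tags(sample))
```

Any line number it reports is a place where the site would break on a server with short_open_tag set to Off.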

Many sysadmins consider this to be a security hole, and for this reason, the php.ini allows the admin to turn off short open tags. In fact, this is now the default. The php.ini file itself says the following:

; NOTE: Using short tags should be avoided when developing applications or
; libraries that are meant for redistribution, or deployment on PHP
; servers which are not under your control, because short tags may not
; be supported on the target server. For portable, redistributable code,
; be sure not to use short tags.


There are other reasons not to use short open tags. If you google for "short_open_tag security" you'll notice, in the midst of pages that talk about security issues, myriad complaints pertaining to mixing XML and php code. As it turns out, XML declarations are also written between <?xml and ?> markers. If you turn on short_open_tag, then php will treat everything between <? and ?> markers as php code, including an <?xml declaration, which typically ends in a parse error.

Short open tags are sloppy, and the mark of a lazy programmer. If php programmers want to dispel the stereotypes about being bad coders, they need to take this to heart.

.htaccess

This isn't just a problem with php, but it does seem to be common in php programs, so I'm going to address it here. As all good admins know, Apache is a powerful web server platform. It's so complete on its own that a standard base install will often do everything that an admin needs. It's so extensible that programmers can hook into parts of the request life cycle and modify or even replace entire components. It even has the ability to delegate certain pieces of configuration to unprivileged users, allowing them their own flexibility. Unfortunately, this can be a serious problem both in terms of performance and security.

An Apache admin can use the AllowOverride directive to let users create their own per-directory configuration files, usually given the .htaccess name. A good deal of global configuration from the httpd.conf (or apache2.conf on some systems) can be replaced on the directory level by an .htaccess file. Traditionally, this is used to set up authentication for a single directory, and this has led to the thinking that this is the only place where directory-level authentication can be configured. This is not true. Any .htaccess configuration can, and should, be set in a <Directory> section in the httpd.conf file.

There are several problems with using the .htaccess file instead. First of all, using these files places server configuration in a publicly accessible directory. This should never be done. It also allows users to potentially implement configuration that could compromise server security. And the more of these files exist, the harder it is for the admin to keep control over his server configuration.

Security not a big enough reason for you? How about performance? Apache looks for its configuration files in a particular order. It allows admins to move configuration from the main httpd.conf into separate files which can be organized into directories and then included as if they were all in the same centralized location. Most admins follow this practice, at least to some degree. It's true that for every file that is included, Apache needs to use an additional disk access to load that file. This is fine, because under standard operation these configuration files are loaded once, when Apache starts. They do not need to be read again unless the admin restarts Apache or sends it a SIGHUP signal.

The .htaccess files work differently. Once the global configuration files are loaded, parts of them can be overridden by .htaccess files if the AllowOverride directive permits it. If this directive is set to anything other than None, then Apache will look for an .htaccess file every time a file is requested from a directory. If a file is requested from the /web/myserver/docs directory, Apache will attempt to read each of these files in order:

/.htaccess
/web/.htaccess
/web/myserver/.htaccess
/web/myserver/docs/.htaccess

...whether or not those files actually exist. If AllowOverride is enabled, then every time a page is requested from /web/myserver/docs, Apache will have to perform four file accesses, even if there are no .htaccess files on the system at all. If each of those directories has its own .htaccess file, then Apache will load them in turn, using each to override any previous configuration, until it finishes with the last one and finally loads the page.
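To make the lookup order concrete, here is a small Python sketch that builds the same list for any directory. It only mirrors the description above; it is not Apache's own code, and the function name is invented for this example.

```python
from pathlib import PurePosixPath


def htaccess_candidates(directory: str) -> list[str]:
    """List the .htaccess paths checked for a request served from `directory`."""
    path = PurePosixPath(directory)
    # `parents` runs from the nearest ancestor up to "/", so reverse it
    # to walk down from the root, then finish with the directory itself.
    ancestors = list(reversed(path.parents)) + [path]
    return [str(ancestor / ".htaccess") for ancestor in ancestors]


for candidate in htaccess_candidates("/web/myserver/docs"):
    print(candidate)
```

The list grows by one entry for every extra level of directory nesting, so deeply nested document roots pay the most.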

Do you really want that kind of server load happening for every single web request? Keep in mind that a single page request can actually cause multiple Apache processes to go through this sequence. A web page with a css file and 10 images means 12 separate hits, each one of which will result in a minimum of 5 file accesses (instead of just one). It may not be so bad for some little blog that a couple of family members read, but if you ever get Slashdotted, you'll be in trouble.
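The arithmetic behind those numbers is easy to check. This Python sketch assumes the /web/myserver/docs layout from the example (four .htaccess lookups per hit) and counts one more access for the requested file itself; the function name is made up for illustration.

```python
def file_accesses(hits: int, htaccess_lookups: int) -> int:
    """Total file accesses when every hit triggers the .htaccess lookups."""
    # Each hit costs the .htaccess lookups plus one access for the file itself.
    return hits * (htaccess_lookups + 1)


# One page, one css file and ten images.
hits = 1 + 1 + 10

with_overrides = file_accesses(hits, htaccess_lookups=4)
without_overrides = file_accesses(hits, htaccess_lookups=0)

print(hits, with_overrides, without_overrides)  # 12 60 12
```

So a single page view turns 12 file accesses into 60 before any .htaccess file has even been found.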

Let's face it, we have enough problems optimizing our servers to handle server load. We don't need to have .htaccess files making it that much worse. We also have enough security issues to worry about without letting users make things worse for us. Unless you have no choice but to delegate configuration to individual unprivileged users, you should set AllowOverride to None and move all .htaccess configuration to the httpd.conf file.
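Moving an .htaccess file into the main configuration is mostly a matter of wrapping its contents in a section for the directory it lived in. The Python sketch below shows the idea under some loud assumptions: the helper name, the sample directives and the password file path are all hypothetical, and the output should be reviewed by hand before it goes anywhere near a live httpd.conf.

```python
def htaccess_to_directory_block(directory: str, htaccess_text: str) -> str:
    """Wrap the contents of an .htaccess file in a <Directory> section."""
    # Drop blank lines and indent what remains; any nested sections are
    # flattened to one indent level, which is fine for a sketch.
    body = [
        f"    {line.strip()}"
        for line in htaccess_text.splitlines()
        if line.strip()
    ]
    return "\n".join([f'<Directory "{directory}">', *body, "</Directory>"])


# Hypothetical .htaccess contents for a password-protected directory.
sample = """AuthType Basic
AuthName "Members"
AuthUserFile /etc/apache2/passwords
Require valid-user
"""

print(htaccess_to_directory_block("/web/myserver/docs", sample))
```

Once the generated section is in place and Apache has been reloaded, the original .htaccess file can be deleted and AllowOverride set to None for that directory.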

I'm sure there are plenty of other bad habits that php programmers have, as with programmers in any language. These are just a couple that I've come across recently. Unfortunately for them, php programmers have earned themselves a bad reputation with programmers in "real" languages. To their credit, I do know some php programmers who have few, if any, of these habits. If other php programmers want to be taken seriously by these and other people, they need to get their act together and start unlearning their bad habits. Maybe a style guide wouldn't be a bad place to start.

1 Comment:

Anonymous said...

FWIW, an excellent "best practices" type book for PHP is "Advanced PHP Programming".

4/23/2009 3:48 AM

 
