Archive for February, 2007

Speeding up the Zend Framework

February 13th, 2007 Louis-Philippe Huberdeau

Like many frameworks, the Zend Framework is fairly heavy by itself. Today, I decided to profile some running code to see how that load was distributed and whether I could pin the problem on one particular location. I found a major one. Just to get things booted up and route the request to the appropriate method, over 40 files get included via require_once(). I/O is definitely a problem here. I didn’t keep the original cachegrind trace (from Xdebug), but over half of the processing time was lost in those calls. Even with APC or some other opcode cache, the calls still have to be made and the code has to be retrieved.

The basic idea was to replace all those includes with a single big file that could be cached properly, avoiding all those require_once() calls. The simplest way to do it was to write a quick and dirty PHP script that grabs all those files and aggregates them, removing the PHP tags and the require_once() calls inside the files along the way. Here it is:

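<?php
// Build zend_core.php by concatenating every file listed in files.txt.
$files = file( 'files.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES );

$fp = fopen( 'zend_core.php', 'w' );
fwrite( $fp, "<?php\n" );

foreach( $files as $file )
{
    $content = file_get_contents( trim($file) );
    if( $content === false )
        die( "Cannot read $file\n" );

    // Strip the opening tag and the trailing closing tag, if any.
    $content = preg_replace( '/^\s*<\?php/', '', $content, 1 );
    $content = preg_replace( '/\?>\s*$/', '', $content );
    // Drop require_once calls with a literal path; that code is now inline.
    $content = preg_replace( '/require_once\s*\(?\s*[\'"][^;]+;/', '', $content );
    fwrite( $fp, $content . "\n" );
}

fwrite( $fp, "?>" );
fclose( $fp );

?>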

Prior to running the script, I got the list of loaded files with a simple call to get_required_files(). After some path clean-up, these are the files that got loaded in my setup. The list will vary depending on which components of the framework you use. I got mine from a simple call to a very basic controller. I expect some files to be loaded on the fly as required for the rarely used components.

library/Zend.php
library/Zend/Exception.php
incubator/library/Zend/Translate.php
incubator/library/Zend/Translate/Exception.php
library/Zend/Locale.php
library/Zend/Locale/Data.php
library/Zend/Locale/Exception.php
library/Zend/Locale/Format.php
library/Zend/Locale/Math.php
library/Zend/Locale/Math/PhpMath.php
library/Zend/Locale/Math/Exception.php
incubator/library/Zend/Translate/Adapter/Array.php
incubator/library/Zend/Translate/Adapter.php
library/Zend/Controller/RewriteRouter.php
library/Zend/Controller/Router/Interface.php
library/Zend/Controller/Router/Exception.php
library/Zend/Controller/Exception.php
library/Zend/Controller/Request/Abstract.php
library/Zend/Controller/Request/Http.php
library/Zend/Controller/Request/Exception.php
library/Zend/Uri.php
library/Zend/Uri/Exception.php
library/Zend/Uri/Http.php
incubator/library/Zend/Filter.php
incubator/library/Zend/Filter/Interface.php
library/Zend/Uri/Mailto.php
library/Zend/Controller/Router/Route.php
library/Zend/Controller/Router/Route/Interface.php
library/Zend/Controller/Front.php
library/Zend/Controller/Plugin/Broker.php
library/Zend/Controller/Plugin/Abstract.php
library/Zend/Controller/Response/Abstract.php
library/Zend/Controller/Dispatcher/Interface.php
library/Zend/Controller/Response/Http.php
library/Zend/Controller/Dispatcher.php
library/Zend/Controller/Dispatcher/Exception.php
library/Zend/Controller/Action.php
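
A list like the one above could be produced with something along these lines. This is only a sketch: get_included_files() is the built-in that get_required_files() is an alias of, while the helper name relative_include_list() and the base directory are made up for illustration and are not taken from the original setup.

```php
<?php
// Hypothetical helper: keep only the files located under $baseDir and
// return their paths relative to it, preserving the order of inclusion.
function relative_include_list( array $files, $baseDir )
{
    $prefix = rtrim( $baseDir, '/' ) . '/';
    $result = array();

    foreach( $files as $file )
    {
        // Files outside the framework directory (the application's own
        // scripts, for instance) are skipped.
        if( strpos( $file, $prefix ) === 0 )
            $result[] = substr( $file, strlen( $prefix ) );
    }

    return $result;
}

// Possible use at the very end of a request (the path is an assumption):
// $list = relative_include_list( get_included_files(), '/path/to/ZendFramework' );
// file_put_contents( 'files.txt', implode( "\n", $list ) . "\n" );
```

The order matters because the aggregation script writes the files in the order they are listed.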

I stored the list in files.txt and the aggregation script built a 300 KB file. Including that file instead of the individual ones gave around a 30% speed improvement (benchmarked with ab -c10 -n1000) on a page that does not really represent normal usage. Still, quite an interesting gain. Of course, working this way would be completely inconvenient for the framework developers, but it’s nice for those using it. An interesting side effect is that after those changes, the traces provided by Xdebug were a lot more meaningful, although they are not very precise, as the slow parts would change from run to run.
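
In a bootstrap file, switching to the aggregated file could look roughly like this. The function name load_framework() and the fallback behaviour are assumptions for illustration; the post only says that the aggregated file was included in place of the others.

```php
<?php
// Hypothetical bootstrap helper: prefer the aggregated file when it exists,
// otherwise fall back to the regular entry point of the framework.
function load_framework( $aggregate, $fallback )
{
    if( is_readable( $aggregate ) )
    {
        require_once $aggregate;
        return 'aggregate';
    }

    require_once $fallback;
    return 'fallback';
}

// Possible use (both paths are assumptions):
// load_framework( 'zend_core.php', 'library/Zend.php' );
```

Classes defined by the included file are global even though require_once runs inside a function; only plain variables set at file level would stay local to it.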

So, is the Zend Framework slow and bloated? Well, it certainly is more so than a plain PHP script, but it does have benefits. At the moment, it’s only a preview. I expect quite a few changes to be made before the final release to improve the performance of some components. I have been using it quite extensively on a project for a little over a week now, and the way the controller components handle request dispatching is just too convenient not to use. Plus, there are plenty of hooks in the controller actions to handle all sorts of special cases, like page caching or authorization. I haven’t used that many components. I had a few bad experiences as well (would you really expect something as simple as a logging class not to work?), but overall, I think there is a lot of potential ahead.

Categories: General Tags:

Enjoy the road

I’ve been really busy recently, which explains the complete absence of posts in the last few weeks. Between school and work, I don’t have much time left to write. I think it was last week that I was sitting in my formal methods class, trying to figure out how OCL could be readable and watching demonstrations of various tools. At some point, I realized that all these tools were fun to watch, but they don’t bring that much value. Sitting down and scratching on a piece of paper while thinking about the constraints in the system resolves a lot more ambiguity than using the tools. To use those tools, a lot of effort has to go into syntax, working around problems caused by the tool itself, and incorporating elements that are not really part of the system just to make sure the simulation runs. You can guess I don’t really like resolving those issues.

Still, representing a system’s states as a diagram and writing down the invariants does raise some issues, but like most things, the time you spend working on the conceptual problems is worth a lot more than reaching the end. The same applies to many things in development. The time spent solving a complex problem is a lot more fun than watching the application run in the end (although that can feel good when you have spent too much time searching for the solution). The same also applies to estimation. You need to spend some time on it to reach those elements you would otherwise forget about. Just counting function points and running them through the model won’t do any good.

Typing code is probably the only thing that can be done faster without losing quality, but only if you live in a typo-free world.

Back when I was in San José, I think the middle part of the estimation is something I completely forgot about. Of course, with only 45 minutes, there are some elements you simply can’t cover. I realized very quickly that it was not as clear as I thought. The very first question I had to answer was about it, I had no material prepared, and it’s not really the kind of topic that can be covered in two and a half minutes. The problem is that estimation relies a lot on analysis skills. I don’t think I even know how I do it, and it probably depends on the application domain. It’s all about sitting down and scratching paper. There are certainly a few questions I often ask myself to find hidden function points, but I never built a checklist for them. Where does the information come from? Can it be modified? In the same way for all users?

Of course, it all comes down to creating use cases and making sure all dependencies are met, but I hate having to write them down in detail, so I just take pages of notes with relations I won’t understand the next day. Writing things down clearly is like formalizing a model in OCL for checking. It takes a lot more time than it’s really worth, especially when the goal is to perform an estimate. The goal is to know how long it will take before the project could have been finished. You’re better off spending more time thinking about the system than focusing on documentation. If the project is man-size, it’s probably not worth it to leave and maintain a paper trail. The only thing I keep is a list of all items with a function point count next to each. I don’t add much detail, just a few words so I can understand what they were all about. I store it either in a wiki or on paper. Nothing fancy. The only reason I keep it is to have a list of things I need to work on. This way, I don’t have to think about what I will do next. It has nothing to do with estimation.

Categories: General Tags: