Archive for March, 2010

Improving rendering speed

Speed is a matter of perception. We’d like to believe it’s all due to computational power or the execution speed of queries. There are barriers that should not be crossed, but in most cases, getting your application to behave correctly while the user is waiting will improve the perception. Improving the rendering speed is a good step, and tweaking a few settings will improve perception more than trimming milliseconds off an SQL query.

A now-classic example of the effects of perception is the progress bar. Bars that move forward at different rates give the impression of taking more or less time, even though the total time remains the same.

Fiddling with HTTP headers is actually very simple and will help lower the load on your server too. A hit you don’t get is so much faster to serve. Both Yahoo! and Google turned this optimization pain into a game by providing scores to increase. If you are not familiar with them, consider installing YSlow and Page Speed right away. Now, if you’ve never used them before, chances are that running them on your own website will produce terrible scores. Actually, running them on most of the websites out there produces terrible scores.

Both of them will complain about a few items:

  • Too many HTTP requests
  • Missing Expires headers
  • Uncompressed streams
  • Unminified CSS and JavaScript
  • No use of a CDN

Fewer files

Now, the excess HTTP requests are likely caused by the multiple JavaScript and CSS files you include. The JavaScript part is very simple. All you have to do is concatenate the scripts in the appropriate order, minify them and deliver the result as a single file. There are good tools out there to do it. Depending on how you deploy the application, some may be better than others. I’ve used a PHP implementation to do it just in time and cache the result as a static file, and a Java implementation as part of a build process. I find the latter to be the better option when it is possible.
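
As a rough illustration of the concatenation step only, here is a minimal sketch in Python. The function name and file names are made up for the example, and minification is deliberately left to a dedicated tool.

```python
from pathlib import Path


def build_bundle(sources, target):
    """Concatenate script files, in the given order, into a single file.

    `sources` and `target` are hypothetical paths chosen by the caller.
    """
    parts = []
    for source in sources:
        text = Path(source).read_text(encoding="utf-8")
        # A newline and ";" between files keep the last statement of one
        # file from running into the first statement of the next one.
        parts.append(text.rstrip() + "\n;\n")
    bundle = Path(target)
    bundle.write_text("".join(parts), encoding="utf-8")
    return bundle
```

The order of the list matters: a library has to come before the scripts that depend on it, exactly as the individual script tags were ordered.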

This is easy enough for production environments, but it really makes development a pain. Debugging a minified script is not pleasant. In Tikiwiki, this simply became another option. In a typical Zend Framework environment, APPLICATION_ENV is a good binding point for the behavior. Basically, you need to know the individual files that need to be served. In a development environment, serve them individually. In a production or staging environment, serve the compiled file (or build it just in time if building ahead is not an option).
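
The switch itself can be tiny. The sketch below is written in Python rather than PHP, and the environment names, function names and paths are assumptions made for the example, not anything Tikiwiki or Zend Framework provides.

```python
from html import escape


def script_urls(env, sources, bundle):
    """Return the script URLs a page should reference in this environment."""
    if env == "development":
        # One URL per file keeps file names and line numbers meaningful
        # while debugging.
        return list(sources)
    # Production and staging both get the single compiled file.
    return [bundle]


def script_tags(urls):
    """Render one script tag per URL."""
    return "\n".join(
        '<script src="%s"></script>' % escape(url, quote=True) for url in urls
    )
```

The important part is that the list of individual files lives in one place, so both branches are always built from the same information.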

Unless you live with an application that has been shielded from the real world for a decade, it’s very likely that most of the JavaScript you use was not written by you. It comes from a framework. You can skip serving those files altogether by not distributing them at all. Google provides a content delivery network (CDN) for them. Why is this faster? You don’t have to serve the files, and your users likely won’t have to download them. Because the files are referenced by multiple websites, it’s very likely that your users downloaded them in the past and still have them cached locally. Google also serves the standard CSS files for jQuery UI (see bottom right corner), although that’s not quite as well indicated (you should be able to find the pattern).

Both of the minify libraries mentioned above also handle CSS minification. However, this is a bit trickier, as you will need to worry about the relative paths to images and imports of other CSS files.
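
To show what the relative path problem looks like, here is a minimal sketch in Python that rewrites url(...) references when a stylesheet moves from one directory to another. The function name and directories are hypothetical, it only handles url(...) (not @import with a bare string), and a real minifier will cover more edge cases.

```python
import posixpath
import re

# Matches url(...) with optional single or double quotes around the reference.
URL_PATTERN = re.compile(r"url\(\s*(['\"]?)([^'\")]+)\1\s*\)")

# References that do not depend on where the stylesheet lives:
# anything with a scheme (http:, data:), root-relative paths, fragments.
LOCATION_INDEPENDENT = re.compile(r"^([a-z][a-z0-9+.-]*:|/|#)", re.IGNORECASE)


def rebase_css_urls(css, source_dir, target_dir):
    """Rewrite relative references for a stylesheet that moves from
    source_dir to target_dir (both URL paths such as "/css/theme")."""

    def rebase(match):
        quote, ref = match.group(1), match.group(2).strip()
        if LOCATION_INDEPENDENT.match(ref):
            return match.group(0)
        # Resolve against the old location, then express the result
        # relative to the new one.
        absolute = posixpath.normpath(posixpath.join(source_dir, ref))
        relative = posixpath.relpath(absolute, target_dir)
        return "url(%s%s%s)" % (quote, relative, quote)

    return URL_PATTERN.sub(rebase, css)
```

Each stylesheet has to be rebased from its own original directory before the results are concatenated, otherwise images referenced from different folders end up pointing at the wrong place.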

The final step is to make sure all the CSS is referenced in the head of the page and the JavaScript at the bottom.

Web server tuning

Now that the number of files is reduced and your scores have improved significantly, another class of issues will take over: compression, expiry dates and improper ETags. The easiest to set up is compression. You will need to make sure mod_gzip or mod_deflate is installed in Apache. It almost always is. Everything is done transparently. All you need to do is make sure the right content types are listed. It can be done in the .htaccess file. Here is an example for mod_deflate.
<IfModule deflate_module>
AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css application/javascript
</IfModule>

Use Firebug to see the content type of each file YSlow is still complaining about and add those types to the list.

Another easy target is the ETag declaration. In most installs, Apache will generate an ETag for static files. ETags are a good idea in principle. The browser remembers the last ETag it received for a given URI and sends it back with the next request, asking whether the content changed. The server compares it and either sends a 304 to indicate the content was not modified, or sends the new version. The problem is that your server still gets a hit. For static files, you’re better off not having them at all.
<FilesMatch "\.(js|png|gif|jpg|css)$">
FileETag None
</FilesMatch>
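
To make the cost of that validation round trip concrete, here is a minimal sketch in Python of what a server does with an ETag. It is an illustration only: the tag here is a hash of the content, which is an assumption made for the example and not how Apache builds its tags, and the function names are made up.

```python
import hashlib


def make_etag(body):
    """Build a quoted tag from the content (illustrative choice only)."""
    return '"%s"' % hashlib.sha256(body).hexdigest()


def respond(request_headers, body):
    """Return (status, headers, body) for a request that may be conditional."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # Nothing changed: no body is sent, but the request still
        # reached the server and had to be processed.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

Even in the 304 case the server was contacted, which is the hit the paragraph above wants to avoid for files that never change.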

Expiry headers are a bit trickier. For content generated by your scripts, you have to deal with them yourself. Setting an expiry date means accepting that your users might not see the most recent version of the content, because their browsers won’t query your server to check. These may not be easy decisions to make.

However, static files are much easier to handle. You will need mod_expires in Apache, which is not quite as common as the compression counterpart. The goal is just to set an arbitrary date in the future. Page Speed likes dates further than a month away. YSlow seems to settle for 2 weeks. The documentation uses 10 years. It should be far enough.
<FilesMatch "\.(js|png|gif|jpg|css|ico)$">
ExpiresActive on
ExpiresDefault "access plus 10 years"
</FilesMatch>
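
For files that go through a script instead of being served by Apache directly, the equivalent headers have to be computed by hand. The following is a minimal sketch in Python; the function name and the 3650-day default (roughly ten years) are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def far_future_headers(now, days=3650):
    """Return caching headers that expire `days` after `now`.

    `now` must be a timezone-aware datetime in UTC.
    """
    lifetime = timedelta(days=days)
    return {
        # HTTP dates are always expressed in GMT.
        "Expires": format_datetime(now + lifetime, usegmt=True),
        "Cache-Control": "max-age=%d" % int(lifetime.total_seconds()),
    }


if __name__ == "__main__":
    print(far_future_headers(datetime.now(timezone.utc)))
```

With dates this far away, the only way to push a change to users is to change the file name or URL, so this only suits files that are versioned in some way.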

Cookies

Your website most likely uses a cookie to track the session. Cookies are great for the PHP scripts that need them to track who’s visiting, but they also get sent with every request for a static file, because the browser does not know they make no difference there. Cookies alter the request and can confuse intermediate caches, especially whenever the cookies change, like when you change the session id to avoid session hijacking.

The easiest way to keep those cookies from being sent with requests for static files is to place the files on a different server. Luckily, browsers don’t really know how things are organized on the other end, so just using a different domain or sub-domain pointing to the exact same application will do the trick, as long as the cookie is not set for the whole parent domain. If you have more load, you might want to serve them with a different HTTP server altogether, but that requires more infrastructure. It should be easy to push JavaScript and CSS to the other domain. Reaching the images will depend on the structure of your application. You will thank those view helpers if you have any.
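
A view helper for this does not need to be more than a few lines. Here is a minimal sketch in Python; the helper name and the static.example.com host are placeholders, not part of any framework.

```python
def static_url(path, static_host="static.example.com", scheme="http"):
    """Build the URL of a static file on the cookie-free host.

    `static_host` is a placeholder; use whichever domain or sub-domain
    points at your static files.
    """
    return "%s://%s/%s" % (scheme, static_host, path.lstrip("/"))
```

If every template goes through a helper like this, moving the static files to a different host later becomes a one-line change.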

If you serve some semi-dynamic files through that domain, make sure PHP does not start the session for them. Otherwise, all of this is futile.

You can then configure YSlow’s CDN list to include that other domain and the Google CDN, and observe blazing scores. To modify the configuration, you need to edit the Firefox preferences. Type about:config in the URL bar, say you will be careful, search for yslow and modify the cdnHostnames property to contain a comma-separated list of domains.

One more

By default, PHP sends a ridiculous Cache-Control header as soon as a session is started. It basically asks the browser to check for a new version of the page on every request. When your user presses back, you get a new request, and they will likely lose local modifications in the form. Not really nice, and one hit too many on your server. Setting the header to something like max-age=3600, must-revalidate will resolve that issue and make navigation on your site feel so much faster.
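
In PHP, the value is sent with the header() function before any output. To show the idea in a form that can be run on its own, here is a minimal sketch as a Python WSGI application; the application and the page content are stand-ins made up for the example.

```python
PAGE = b"<html><body>Hello</body></html>"


def app(environ, start_response):
    """Minimal WSGI application that lets browsers reuse the page for an hour."""
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(PAGE))),
        # Cached copies may be reused for 3600 seconds; after that the
        # browser has to check with the server again.
        ("Cache-Control", "max-age=3600, must-revalidate"),
    ]
    start_response("200 OK", headers)
    return [PAGE]
```

One hour is only an example value. Pages showing data that changes quickly or that is specific to one user need a shorter lifetime, or none at all.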

These items should cover most of the frequent issues. Both tools will report a few minor issues, some of which may be easy to fix, some not so much. Make those verifications part of the release procedure. A new content type may get introduced in the application and cause less than optimal behavior due to the lack of a few headers. It may not be possible to get a perfect score on all pages of a site, but if you can cover the most important ones, your users may believe your site is fast, even though you use a heavy framework.

Categories: Programming Tags:

A bit too much

Well, Confoo is now over. That is quite a lot of stress off my shoulders. Overall, I think the conference was a great success and opened up nice opportunities for the future. Over the years, PHP Quebec had evolved to include more and more topics related to PHP and web development. This year was the natural extension: shifting the focus away from PHP and towards the web, including other programming languages such as Python and sharing common tracks for web standards, testing, project management and security. Most of the conference was still centered around PHP, and that was made very clear on Thursday morning during Rasmus Lerdorf’s presentation (which had to be moved to the ballroom with 250-300 attendees, including some speakers whose own sessions faced an empty room), but hopefully, the other user groups will be able to grow in the next year.

Having 8 tracks in parallel was a bit too much. It made session selection hard, especially since I always keep some time for hallway sessions. I feel that the hallway “track” lost quite a lot of participants this year compared to the previous ones.

For my own sessions, I learned a big lesson this year. I should not bite off more than I can chew. It turns out some topics are much, much harder to approach than others. A session on refactoring legacy software seemed like a great idea, until I actually had to piece together the content for it. I tried multiple ways to approach the topic and ended up with one that made some sense to me, but very little to the audience, it seems. I spent so much time distilling and organizing the content that I had very little time left to prepare the actual presentation. What came out was mostly a terrible performance on my part. I am truly sorry for that.

Lesson of the year: Never submit topics that involve abstract complexity.

The plan I ended up with was a little like this:

  • Explain why rewriting the software from scratch is not an option. Primarily because management will never accept it, but also because we don’t know what the application does in detail and the maintenance effort won’t stop during the rewrite.
  • Bringing a codebase back to life requires a break from the past. Developers must sit down and determine long-term objectives and directions to take, figure out which aspects of the software must be kept and which must change, and find a few concrete steps to take.
  • The effort is futile if the same practices that caused the degradation are kept. Unit testing should be part of the strategy and coding standards must be raised.
  • The rest of the presentation was meant to be a bit more practical, showing how to gradually improve code quality by removing duplication, breaking dependencies on APIs, improving readability and removing complexity by breaking down very large functions into more manageable units.

As I was presenting, my feeling was that I was, on one side, preaching to the converted, who had done this before and knew it worked, while the rest of the crowd did not want to hear that it would take a while and thought I was an idiot (a feeling emphasized by my poor performance, which I was aware of).

Another factor that came into the mix was that I actually had two presentations, neither of which I had given before, so both had to be prepared. Luckily, the second one, on unit testing, was a much easier topic and I find that one went better. It was in a smaller room with fewer people. Everyone was close by, so it felt a lot closer to a conversation. I accepted questions at any time. Surprisingly, they came in pretty much the same order I had prepared the content in. The objective of this session was to help people get started with unit testing. My intuition told me that the main thing preventing people from writing unit tests is that they never know where to start. My plan was:

  • Explain quickly how unit testing fits in the development cycle and why test-first is really more effective if you want to write tests. I went over it quickly because I know everyone has heard that sermon before. I placed the emphasis instead on getting started with easy problems first, as writing tests requires building some habits. It’s perfectly fine to get started with new developments before going back to older code and testing it.
  • Jump into a VM and go through the installation process for PHPUnit, setting up phpunit.xml and the bootstrap script, then writing a first test to show it runs and can generate code coverage. I did it using TDD, showing how you first write the test, see it fail, then do what’s required to get it to pass.
  • Keeping it hands-on, go through various assertions that help write more expressive tests, and use setUp, tearDown and data providers to shorten tests.
  • Move on to more advanced topics such as testing code that uses a database or another external dependency. I ran out of time on this one, so I could not show any live example of it.

I was quite satisfied with the type of interaction I had with the audience during the presentation and the feedback was quite positive too. It was a small room organized in a way that I was surrounded by the audience close by rather than in a long room barely seeing who I was speaking to. Although there were only 15 attendees, I am confident they got something they can work with.

I could have used a dry run right before the presentation. I had done one two weeks prior, but it wasn’t quite fresh in my mind, so the delivery was not quite fluid, although some of the searching was intentional, to show where to find the information.

During the other sessions I attended, I made two nice discoveries: Doctrine 2, which came up with a very nice structure that I find very compatible with the PHP way, and MongoDB, a document-based database with a very nice way to manipulate data and nice performance attributes for most web applications out there.

Categories: General Tags: