In and out of hot water

Some problems are just too obvious. Blatantly unacceptable. Yet, we live with them because we grow accustomed to them. Dozens of people work on a project and solve small issues in the code by patching things up. That’s just the way it’s done. We all do it without thinking there is another way. The code piles up, duplication creeps in and inconsistency spreads. It happens so gradually that no one notices. The only solution is gaining perspective. Recently, I got away from a project I had been with for years. The official time away was really only three months. I can’t say I have significantly more knowledge about software right now than I had three months ago. Yet, as I came back, I came across a tiny issue. It really was just another one-liner fix, one of those I had done countless times before. This time, however, it just felt unacceptable. What I saw was not the small issue to be fixed; it was the systemic problem affecting the whole. I can fix that once and for all.

Knowing the ins and outs of a project has some advantages. Changes can be made really quickly because you know exactly where to look. When you see a piece of code, you can tell from the style who wrote it and ask the right questions. You know the history of how it came to be and why it’s like that. Sadly, you also know why it won’t change anytime soon. Or can it? Given some amount of effort, anything about software can change. We’re only too busy with our daily work to realize it.

The same effect applies at different levels. People arriving on a new project have the gift of ignorance. They don’t see the reasons. They see what is wrong in their eyes and how it should be. Of course, this often comes out in a harsh way for those who spent a lot of effort making it as good as it is, with all the flaws it has. Anyone who has been working on a project for a significant amount of time and shows it knows that this reality can be hard. Still, for all the good advice newcomers may have, what they don’t have is the trust of the community and the knowledge required to know all the repercussions a change can have. Yet they do tend to see major issues that no one had any intention of solving.

At a much smaller level, it’s not much different from closing a coding session at the end of the day in the middle of trying to resolve an impossible bug. The next morning, the issue just magically appears. This can be attributed to being well rested and sharper, but it might also just be about letting time pass. Deciding to stop searching is a tough call. When debugging, you’re always 15 minutes away from finding the solution, until you realize the hypothesis was wrong. It takes a fair amount of self-confidence to know you can take the night off and still be able to fix the issue in time for a deadline. Sure, getting back into it will take some time. It requires rebuilding the mental model of the problem, but rebuilding might just be what’s needed when you are on the wrong track.

This is nothing new. It has been written many times that code reviews are more effective when done a few days later or by someone else entirely. In this case, Wiegers indicates that when the work is too fresh in your memory, you recite it based on what it should be rather than what it is, hence you don’t really review it. What if, when you search for a problem, whether a localized bug or a systemic issue that requires major work, there is no way you could find it outside of blind luck unless you have taken enough distance to clear your assumptions?

How not to fail miserably with system integration

In the last few weeks, I’ve had the opportunity to get my hands on an SOA system involving multiple components. I had prior experience, but not with that many evolving services at the same time. When I initially read the architecture’s documentation, I had serious doubts any of it was going to work. There were way too many components and each of them had to do a whole lot of work to get the simplest task accomplished. My role was to sit on the outside and call those services, so I did not really have to worry about the internals. Still, unless you’re working with Grade A services provided by reputed companies and used by tens of thousands of programmers, knowing the internals will save you a few pains.

Make sure you can work without them

I actually got to this one by accident, but it happened to be a good one. When I joined the team, there were plenty of services available. However, the entry point I needed in order to use any of them was not present at all. I had a vague idea of what the data would look like, so I started building static structures that would be sent off to the views to render the pages. I had a bit of a head start before the integrators joined the team and certainly did not want them to wait around for weeks until the service was delivered.

At some point, the said service actually entered someone’s iteration, which meant it would be delivered in the near future. Fairly quickly, a contract was made for the exact data format that would be provided. Although I was not wrong about the values that would be provided, the format was entirely different. My initial format then became an intermediate format for the view, providing the strict minimum required, and a layer was added in the system to translate the service’s format into our own. The service was not yet available, so it was simply stubbed out. The conversion could be unit tested based on the static data in the service stub. Plugging in the service when it arrived was a charm and, except for a few environment configurations, it was transparent to the integrators.
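In code, the arrangement looked roughly like the sketch below. All the names are made up for illustration; the real service and its formats were project-specific.

<?php
// Hypothetical names for illustration only.
interface ProfileService
{
    // Returns the raw data structure in the remote service's own format.
    public function getProfile($userId);
}

// Stub used until the real service was delivered, then kept around
// as a fallback and as fixture data for unit tests.
class StubProfileService implements ProfileService
{
    public function getProfile($userId)
    {
        // Static data matching the contract agreed upon with the service team.
        return array(
            'user' => array('id' => $userId, 'display-name' => 'John Doe'),
            'attributes' => array(array('key' => 'city', 'value' => 'Montreal')),
        );
    }
}

// Translation layer: converts the service's format into the minimal
// intermediate format the views were already built against.
class ProfileConverter
{
    public function toViewData(array $serviceData)
    {
        return array(
            'name' => $serviceData['user']['display-name'],
            'city' => $serviceData['attributes'][0]['value'],
        );
    }
}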

During the entire development, this fake data ended up being very useful.

  • Whenever the services would change, there was an easy way to compare the expectations to what we actually received, allowing us to update the stub data and the intermediate layer.
  • When something would go terribly wrong and services would just fail, it was always possible to revert to the fake data and keep working.
  • It allowed us to reach code paths that would not normally be reached, like error handling.

Expect them to fail

One of the holy grails of SOA is that you can replace services on the fly and adjust capacity when needed. This may be partially true, but it also means that while your component works fine, its neighbor may be completely unreachable during maintenance or while restarting. If you happen to need it, in many cases you might as well be down too. While one would hope services won’t crash in production, they happen to crash in development fairly often. To live with this, there is one simple rule: expect every single call to fail, and make a conscious decision about what to do about it. For one, it will make your system fault tolerant. If the only call that failed was fetching some information that is only used for rendering, it’s very likely that you don’t need to die with a 500 internal error. However, if you didn’t expect and handle the failure, that’s what will happen.

Adding this level of error handling does add a significant cost to the development. It’s a lot of code and a lot of thinking. Live with it or reconsider your SOA strategy.

Early on, adding the try/catch blocks wasn’t much of a reflex. After all, you can write code that works without them and PHP sure won’t tell you that you forgot one. When the first crashes occurred in development, we still had few services integrated. Service interruptions only became worse as we integrated more. What really pushed towards adding more granularity in catching exceptions was not so much how ungracefully they would break the system; it was the waste of time. With a few team members pushing features into a system, a 15-minute interruption may not seem like much, but it’s enough to break the flow, which is hard enough to get into in an office environment. Especially when the service that breaks has nothing to do with the task you’re on at the moment.

It does not take much; a rough sketch follows the list below.

  • When fetching information, have a fallback for missing information.
  • When processing information, make sure you give a proper notification when something goes wrong.
  • Log all failures and have at least a subtle indication for the end user that something went wrong and that the logs should be checked before bothering someone else.
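In PHP, the guard around each call does not need to be fancy. Here is a rough sketch with an invented function name; the fallback value and the logging destination are whatever makes sense for the page.

<?php
// Hypothetical example: every remote call is wrapped in a guard like this one.
function fetchRelatedArticles($service, $articleId)
{
    try {
        return $service->getRelated($articleId);
    } catch (Exception $e) {
        // Log the failure with enough context to trace it later.
        error_log('getRelated failed for ' . $articleId . ': ' . $e->getMessage());

        // Fall back to an empty list: the page renders without the
        // related-articles block instead of dying with a 500 error.
        return array();
    }
}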

Build for testability

Services live independently in their own memory space with their own state. They just won’t reset when your new unit test begins, making them a pain to test. However, that is far from being an excuse for not automating at least some tests. Every shortcut made will come back to haunt you, or someone else on the team. It’s very likely that you won’t be able to get the remote system into every possible state to test all combinations. Even if you could, the suite would most likely take very long to execute, leading to poor feedback. Mocks and stubs can get you a long way by just making sure your code makes the correct calls in sequence (when it matters), passes the right values and stops correctly when an error occurs. That alone should give some confidence.

To be able to check all the calls made, we ended up with an interface defining all possible remote calls, with the exact same parameters and return values as the remote systems. There was a lot of refactoring to get to that solution. Essentially, every single time an attempt was made to regroup some calls because they were called at the same time and shared parameters, or because it was too much data to stub out for just those two tiny values, it had to be redone. Either an error would happen with the real services because the very few lines of code that were not tested against them happened to contain errors, or something would come up and suddenly those calls were no longer regrouped.
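Concretely, it looked somewhat like the sketch below, with names invented for the example. A hand-written fake implementing the same interface records the calls, so a test can verify the sequence and the parameters without touching the network.

<?php
// Invented names; the real interface mirrored the remote system's calls exactly.
interface RemoteCatalog
{
    public function findProduct($sku);
    public function reserveStock($sku, $quantity);
}

// Test double that records every call made against it.
class FakeRemoteCatalog implements RemoteCatalog
{
    public $calls = array();

    public function findProduct($sku)
    {
        $this->calls[] = array('findProduct', $sku);
        return array('sku' => $sku, 'price' => 10.0);
    }

    public function reserveStock($sku, $quantity)
    {
        $this->calls[] = array('reserveStock', $sku, $quantity);
        return true;
    }
}

// In a PHPUnit test, assert on the recorded calls, for example:
// $fake = new FakeRemoteCatalog();
// $order->process($fake);
// $this->assertEquals(array('findProduct', 'ABC-123'), $fake->calls[0]);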

As far as calling the real services goes, smoke testing is about the only thing I could really do: making a basic call and checking if the output seems to be in the appropriate format. In the best of worlds, the service implementers would also provide a stub in which the internal state can be modified, and maintain that stub to reflect the contract made by the real service. It could have solved some issues with the fact that some services are simply impossible to run in a development environment. Sticking to the contract is the only thing that can really be done in an automated fashion during development. I first encountered that type of environment a few years back, where running a test actually implied walking to a room, possibly climbing a ladder, switching wires and getting back to the workstation to check.

Have an independent QA team

It might not be miserable, but the chances that you will fail are fairly high when a lot of components need to talk to each other and there is no way you can replicate all of it at once. A good QA team testing in an environment that maps to the production environment will find the most mind-boggling issues. In most cases, they are caused by a mismatch in the understanding of the interface between the implementer and the client. When you have a clear log pointing out the exact source of the problem and all your expectations documented in tests and stubs, it does not take a very long discussion to find the source of the issue. Fixing it then becomes a matter of adjusting the stubs and fixing the broken tests.

If you’re lucky enough, there might be no issues left when it goes to production. Seriously, don’t overdo SOA. It’s not as magic as the vendors or “enterprise architects” say it is.

Summer report

For some reason I never quite understood, I always tend to be extremely busy in the summer when I would much rather enjoy the fresh air and take it slow, and be less busy during the winter when heading out is less attractive. This summer was no exception. After the traveling, I started a new mandate with a new client, and that brought my busyness to a whole new level.

In my last post, I mentioned a lot of wiki-related events happening over the summer and that I would attend them all. It turns out it was an exhausting stretch. Too many interesting people to meet, not enough time, even in days that never seem to end in Poland. As always, I was in a constant dilemma between attending sessions, the open space or just creating spontaneous hallway discussions. There was plenty of space for discussion. Gdańsk not being so large, at least not the touristic area in which everyone stayed, entering just about any bar or restaurant at any time of the day would lead to sitting with another group of conference attendees. WikiMania did not end before the plane landed in Munich, which apparently was the connection city everyone used, at which point I had to run to catch my slightly tight connection to Barcelona.

I know, there are worse ways to spend part of the summer than having to go work in Barcelona.

I came to a few conclusions during WikiSym/WikiMania:

  • Sociotechnical is the word chosen by academics to discuss what the rest of us call the social web or Web 2.0.
  • Adding a graph does not make a presentation look any more researched. It most likely exposes the flaws.
  • Wikipedia is much larger than I knew, and they still have a lot of ambitions.
  • Some people behind the scenes really enjoy office politics, which most likely creates a barrier with the rest of us.
  • One would think open source and academic research have close objectives, but collaboration remains hard.
  • The analysis performed leads to fascinating results.
  • The community is very diverse, and Truth in Numbers is a very good demonstration of it for those who could not be there.

As I came back home, I had a few days to wrap up projects before getting to work for a new client, all of which had to happen while fighting jet lag. I did not yet find time to catch up with the people I met, but I still plan to.

One of the very nice surprises I had a few days ago is the recent formation of Montréal Ouvert (the site is also partially available in English), which held its first meeting last week. The meeting appeared like a success to me. I’m very bad at counting crowds, but there seemed to be somewhere between 40 and 50 people attending. Participants were from various professions and included some city representatives, which is very promising. However, the next steps are still a little fuzzy and how one may get involved is unclear. The organizers seemed to have matters well in hand. There will likely be some sort of hack fest in the coming weeks or months to build prototypes and show the case for open data. I don’t know how related this was to Make Web Not War a few months prior. It may just be one of those ideas whose time has come.

I also got to spend a little time in Ottawa to meet with the BigBlueButton team and discuss further integration with Tiki. At this time, the integration is minimal because very few features are fully exposed. Discussions were fruitful and a lot more should be possible with version 0.8, now in development. Discussing the various use cases showed that we did not approach the integration using the same metaphor, partially because it is not quite explicit in the API. The integration in Tiki is based on the concept of rooms as permanent entities that you can reserve through alternate mechanisms, which maps quite closely to how meeting rooms work in physical spaces. The intended integration was mostly built around the concept of meetings happening at a specific moment in time. Detailed documentation cannot always explain the larger picture.

Upcoming events

This summer, I will have my largest event line-up around a single theme. None of which will be technical! It will begin on June 25th with RecentChangesCamp (RoCoCo, to give it a French flavor) in Montreal. I first attended that event the last time it was in Montreal and again last year in Portland. It’s the gathering of wiki enthusiasts, developers, and pretty much anyone who cares to attend (it’s free). The entire event is based around the concept of Open Space, which means you cannot really know what to expect. Both times I attended, it had a strong local feel, even though the event moves around.

Next in line is WikiSym, which will be held in Gdańsk (Poland) on July 7-9th. I have also attended it twice (Montreal in 2007, Porto in 2008); I missed last year’s in Orlando due to a schedule conflict. WikiSym is an ACM conference, making it the most expensive wiki conference in the world (still fair, by other standards). Unlike the other ones, which are more community-driven, this one comes from the academic world (you know it when they refer to you as a practitioner). Most of the presentations are actually paper presentations. Because of that, attending the actual presentations is not so valuable, as the entire content is handed to you as you get there. It’s much better to spend time chatting with everyone in the now-traditional Open Space. It really is a once-a-year opportunity to gather, from all over the world, everyone who has spent years studying various topics around wikis. The local audience is almost absent, except that the event tends to go to places where there is at least some scientific wiki community.

The final stop will be WikiMania, at the exact same location as the previous event, until July 11th. I really don’t know what to expect there, as I have never attended the official Wikimedia conference. However, it has a fantastic website with tons of relevant information for attendees. It probably has something to do with it being an open wiki attended by Wikipedia contributors.

I will next head toward Barcelona for a mandatory TikiFest. However, I don’t really consider this to be in the line-up as it’s mostly about meeting with friends.

That is three events on wikis and collaboration. Wikis being the simplest database that could possibly work, what could require 8-9 days on a single topic? It turns out the technology does not really matter. Just like software, writing is not hard. Getting many people to do it together is a much bigger challenge. Organizing the content alone to suit the needs of a community is challenging. Because the structure is so simple, it puts a lot of pressure on humans to link it all together, navigate the content and find the information they are looking for.

More information overload please

A few months back, Microsoft made a presentation on OData at PHP Quebec. While I found the format interesting at first, with the way you can easily navigate and explore the dataset, I must admit I was a bit skeptical. After all, public organizations handing out their data to Microsoft does sound like a terrible idea. While a lot of that data will be hosted on Microsoft technologies, the format remains open, and it appears to be picking up.

I guess what was missing originally for me was a real use case for it. The example at the presentation used a sample database with products and inventory. Completely boring stuff. Today, at Make Web Not War, I had a conversation with Cory Fowler (with Jenna Hoffman sitting close by), who has been promoting OData for the city of Guelph. I got convinced right away that this was the right way to go. Not necessarily OData as a format, but opening up information for citizens to explore and improve the city. If OData can emerge as a widespread standard to do it, that’s fine by me. The objective is far away from technology. In fact, when I look at it, I barely see it. It’s about providing open access to information for anyone to use. How they will use it is up to them.
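Part of the appeal is that exploring a feed takes almost no code. A minimal sketch, assuming a hypothetical endpoint and field name, and the JSON envelope used by OData producers of that era:

<?php
// Hypothetical endpoint; any OData producer exposing a collection would do.
$url = 'http://example.org/odata/ParkLocations?$top=5&$format=json';

$data = json_decode(file_get_contents($url), true);

// OData v2 producers wrap results in a "d" envelope; adjust if yours differs.
foreach ($data['d']['results'] as $entry) {
    echo $entry['Name'], "\n";
}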

The conference had a competition attached to it. Two projects among the finalists were using OData. One of them created a driving game with checkpoints in the city of Vancouver. They simply used the map data to build the streets and position buildings. That is a fairly ludicrous use of publicly available information, but still impressive that a small team could build a reasonable game environment in a short amount of time. The other project used data from Edmonton to rank houses based on the availability of nearby services, basically helping people seeking new properties to evaluate the neighborhood without actually getting away from their computer.

This is only the tip of the iceberg. The data made available at this time is mostly geographical. Cities expose the location of the various services they offer. The uses you can make out of it are quite limited. I’ve seen other applications helping you locate nearby parks or libraries. Sure, knowing there is a police station nearby is good, but there could be so much more. What we need is just more data: crime locations, incident reports, power usage, water consumption. Once relevant information is out there, small organizations and even businesses will be able to use it to find useful information and track it over time. At this time, a lot of the data is collected but only accessible by a few people. Effort duplication occurs when others attempt to collect it. Waste. Decisions are made based on poor evidence.

So there is information out there for Vancouver, Edmonton, even Guelph. Nothing about Montreal. Nothing in the province of Quebec that I could find. I think this is just sad.

Actually, if there is anything out there, it might be very hard to find. Even if there are these great data sources available openly, it remains hard to find them. There is no central index at this time. Even if there were, the question remains of what should go in it. Official sources? Collaborative sources? Not that there is anything like that, but consider people flagging potholes on streets with their mobile phones as they walk around. Of course, accuracy would vary, but it would serve as a great tool for the city employees to figure out which areas should become a priority. There are so many opportunities and so many challenges related to open data access. I don’t think we are fully prepared for the shift yet.

Only the words change

Amazon has brought me back to 1975 and the Mythical Man-Month. It had been on my reading list for quite a while, but at some point around two years ago, it became unavailable. After that, it sat on a shelf for a few months until the stack got down to it. I must say, skip a few technical details and this book could very well have been written last year. After all, in 1975, structured programming (that is, conditions and loops) was a recent concept and not widely adopted. Surprisingly, Brooks knew a whole lot about software development, testing and management. I have the feeling we have learned nothing since it was written. Concepts were only refined, renamed and spread out. As far as I can tell, just a few paragraphs in chapter 13 lay out the founding concepts of TDD.

Build plenty of scaffolding. By scaffolding, I mean all programs and data built for debugging purposes but never intended to be in the final product. It is not unreasonable for there to be half as much code in scaffolding as there is in product.

One form of scaffolding is the dummy component, which consists only of interfaces and perhaps some faked data or some small test cases. For example, a system may include a sort program which isn’t finished yet. Its neighbors can be tested by using a dummy program that merely reads and tests the format of input data, and spews out a set of well-formatted meaningless but ordered data.

Another form is the miniature file. A very common form of system bug is misunderstanding of formats for tape and disk files. So it is worthwhile to build some little files that have only a few typical records, but all the descriptions, pointers, etc.

[…]

Yet another form of scaffolding is auxiliary programs. Generators for test data, special analysis printouts, cross-reference table analyzers, are all examples of the special-purpose jigs and fixtures one may want to build.

[…]

Add one component at a time. This precept, too, is obvious, but optimism and laziness tempt us to violate it. To do it requires dummies and other scaffolding, and that takes work. And after all, perhaps all that work won’t be needed? Perhaps there are no bugs?

No! Resist the temptation! That is what systematic system testing is all about. One must assume that there will be lots of bugs, and plan an orderly procedure for snaking them out.

Note that one must have thorough test cases, testing the partial systems after each new piece is added. And the old ones, run successfully on the last partial sum, must be rerun on the new one to test for system regression.

Does it sound familiar? I see test cases, test data, mock objects, fuzzing and quite a lot of things we hear about these days. Certainly, it was different. They had different constraints at the time, like having to schedule access to a batch-processing machine. There is some discussion about interactive programming and how it would speed up the code and test cycles.

I find it impressive given that they had so little to work with. I wasn’t even born when they figured that out.

Because the experience is based on system programming for an operating system to be run on a machine built in parallel, there is a strong emphasis on top-down design (called the most important new programming formalization of the decade) and on requirements. To me, the word requirement is a scary one. I don’t do system programming, and for what I do, prototyping and communication do a much better job. However, I found the take interesting.

Designing the Bugs Out

Bug-proofing the definition. The most pernicious and subtle bugs are system bugs arising from mismatched assumptions made by the authors of various components. The approach to conceptual integrity discussed above in Chapters 4, 5 and 6 addresses these problems directly. In short, conceptual integrity of the product not only makes it easier to use, it also makes it easier to build and less subject to bugs.

So does the detailed, painstaking architectural effort implied by that approach. V. A. Vyssotsky, of Bell Telephone Laboratories’ Safeguard Project, says, “The crucial task is to get the product defined. Many, many failures concern exactly those aspects that were never quite specified.” Careful function definition, careful specification, and the disciplined exorcism of frills of function and flights of technique all reduce the number of system bugs that have to be found.

Testing the specification

Long before any code exists, the specification must be handed to an outside testing group to be scrutinized for completeness and clarity. As Vyssotsky says, the developers themselves cannot do this: “They won’t tell you they don’t understand it; they will happily invent their way through the gaps and obscurities.”

Beyond the punch line, this does call for very detailed specifications. It felt to me that those were in-retrospect comments. I don’t think the specifications were ever detailed enough for bugs to be driven out. If they had been, you would end up with an issue introduced in the previous chapter: “There are those who would argue that the OS/360 six-foot shelf of manuals represents verbal diarrhea, that the very voluminosity introduces a new kind of incomprehensibility. And there is some truth in that.”

Very detailed specifications, just like exhaustive documentation, reach a point where they no longer bring value, because no one can get through all of it in a reasonable time frame. Attempting to find inconsistencies in a few pages of requirements is possible. Not in thousands of pages. The effort required is just surreal. Sadly, following the advice of strong and detailed requirements, the world of software development sank into waterfall in the years that followed. Of all the great insights in the book that we only collectively realized decades later, this one got too influential.

Imagine where software development would be today if, instead, the industry had taken smaller and more productive teams whenever possible, higher-level programming languages and extensive scaffolding as the primary advice of the book.

Branching, the cost is still too high

Everyone’s motivation to move to distributed version control systems (DVCS) was that the cost of branching was too high with Subversion. Part of that is true, but even with a DVCS, I find the cost of branching to be too high for my taste. I can create feature branches for features of a decent size, but I think traceability needs even more granularity.

Let’s begin by listing my typical process to handle feature branches these days.

  1. Branch trunk from the repository to my local copy
  2. Copy configuration files from another branch
  3. Make minor changes
  4. Run scripts to initialize the environment.
  5. Develop, commit, pull, merge – all of this is great
  6. Push to trunk

My problem is that dealing with those configuration files takes too much time and remains troublesome. However, there is no real way around it. The application needs to connect to MySQL, Gearman, Sphinx and Memcached. On development setups, they are all on the same machine. Still, because I am way too lazy to create new database instances and I often don’t change my prefixes as much as I should, I end up with multiple branches sitting there with only one really usable at any time. Of course, it would all be solved if I were more disciplined, but if it annoys me, it prevents me from doing it right. Just having to do the configuration part encourages me to reuse branches.

The goal of fine-grained branches is to represent the decision-making process as part of the revision control. The way I see it, top-level branches represent a goal. It could be implementing a new feature, enhancing a piece of the user interface or anything else. However, to reach those top-level objectives, it may be required to perform some refactoring or upgrade a library. If those changes are made atomically through a branch and merged as a single commit, there would be ways to look at the hierarchy of commits to understand the flow of intentions. Bazaar can generate graphs from forks and merges. I can imagine tools to help traceability if the decision-making is organized in the branch structure.

Why traceability, you might ask? For many things that don’t seem to make sense in code, there is a good historical reason (unless it’s due to accidental complexity). Even in my own code written a few months prior, I find places that need refactoring. Most of the time, it’s simply because I was trying to look too far ahead at the time. I was anticipating the final shape of the software, but by the time it got there, new and better ways to achieve the same result had been implemented, leaving legacy behind. When this happens in my own code, I can think about the process that led to it, figure out what the original intention was and decide how the design should be adapted to the new reality. When the code was written by someone else, the original intention can only be guessed. I hope a hierarchy of branches can provide an outline of the thought process that would explain the decisions made.

My Subversion reflexes pointed me towards bzr switch. It brings a change to the way I got used to working with a DVCS. When I transitioned, I traded the concept of the working copy for branching; check-outs simply had no use. I was wrong. They can actually fix my issue of configuration burden. If I keep a single check-out of the code that is configured for my local environment, I can then switch it from one branch to another. Because we are in the distributed world, those other branches can be kept locally, just not in the working copy. The process then changes.

  1. Create a new branch locally
  2. Switch the check-out to the new branch
  3. Develop, commit, pull, merge
  4. Switch check-out to parent branch
  5. Merge local branch

Of course, if changes happen in the configuration files outside of what was locally configured or the schema changes, this has to be dealt with, but I expect this to be much less frequent.

The next step will be to rebuild my development environment in a smarter way. Right now, I have way too many services running locally. I want to move all of those to a virtual machine, which I will fire up when I need them. For this step, I am waiting for the final release of Ubuntu 10.04, and probably a few more weeks. In the past, I had terrible experiences with pre-release OSes and learned to stay away, no matter how fun and attractive the new features are. It also means reinstalling my entire machine, so I am not looking forward to that part. It should be easier now that almost everything is web-based, as long as I don’t lose those precious passwords.

Using virtual machines to keep your primary host clean of any excess is nothing new. I guess I did not do it before because I thought my disk space was more limited than it is. My laptop has a 64 GB SSD drive. It was a downscale from my previous laptop’s drive, which was continuously getting full. Too many check-outs, database dumps and log files. They just keep piling up over the years. It turns out the overhead of having an extra operating system isn’t that bad after all.

The good thing about virtual machines is that they are completely disposable. You can build one with the software you need, take a snapshot and move on from there. Simply reverting to the snapshot will clean up all the mess created. Only one detail to keep in mind: no permanent data can be stored in there. I will keep my local branches on the main host and the check-out in the virtual machine. Having a shell on a virtual machine won’t make much of a difference compared to a shell locally.

Improving rendering speed

Speed is a matter of perception. We’d like to believe it’s all due to computational power or the execution speed of queries. There are barriers that should not be crossed, but in most cases, getting your application to behave correctly while the user is waiting will improve the perception. Improving the rendering speed is a good step, and tweaking a few settings will improve perception more than trimming milliseconds off an SQL query.

A now classic example of the effect of perception is the progress bar. Moving forward at different rates, even though the total time remains the same, will give the impression of being shorter or longer.

Fiddling with HTTP headers is actually very simple and will help lower the load on your server too. A hit you don’t get is so much faster to serve. Both Yahoo! and Google turned this optimization pain into a game by providing scores to increase. If you are not familiar with them, consider installing YSlow and Page Speed right away. Now, if you’ve never used them before, chances are running them on your own website will produce terrible scores. Actually, running them on most of the websites out there produces terrible scores.

Both of them will complain about a few items:

  • Too many HTTP requests
  • Missing expires headers
  • Uncompressed streams
  • Unminified CSS and JavaScript
  • No use of a CDN

Fewer files

Now, the excess HTTP requests are likely caused by those multiple JavaScript and CSS files you include. The JavaScript part is very simple. All you have to do is concatenate the scripts in the appropriate order, minify them and deliver it all as a single file. There are good tools out there to do it. Depending on how you deploy the application, some may be better than others. I’ve used a PHP implementation to do it just in time and cache the result as a static file, and a Java implementation as part of a build process. I find the latter to be the better option when it is possible.

This is easy enough for production environments, but it really makes development a pain. Debugging a minified script is not quite pleasant. In Tikiwiki, this simply became another option. In a typical Zend Framework environment, APPLICATION_ENV is a good binding point for the behavior. Basically, you need to know the individual files that need to be served. In a development environment, serve them individually. In a production or staging environment, serve the compiled file (or build it just in time if a build step is not an option).
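A minimal sketch of the switch; the helper name and file list are made up, and the actual concatenation and minification are delegated to whichever tool you picked:

<?php
// Hypothetical helper: emit individual scripts in development,
// a single pre-built (or cached) file everywhere else.
function scriptTags(array $files, $compiledFile)
{
    if (getenv('APPLICATION_ENV') === 'development') {
        $tags = '';
        foreach ($files as $file) {
            $tags .= '<script type="text/javascript" src="' . $file . '"></script>' . "\n";
        }
        return $tags;
    }

    // In production or staging, the build process (or a just-in-time step)
    // has already concatenated and minified everything into one file.
    return '<script type="text/javascript" src="' . $compiledFile . '"></script>' . "\n";
}

// Usage in a layout:
// echo scriptTags(array('/js/jquery.js', '/js/app.js'), '/js/compiled.min.js');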

Unless you live with an application that has been shielded from the real world for a decade, it’s very likely that most of the JavaScript you use was not written by you. It comes from a framework. You can skip those files altogether by not distributing them at all. Google provides a content delivery network (CDN) for them. Why is this faster? You don’t have to serve the files, and your users likely won’t have to download them. Because the files are referenced by multiple websites, it’s very likely that users have already downloaded and cached them in the past. Google also serves the standard CSS files for jQuery UI (see the bottom right corner), although that’s not quite as well indicated (you should be able to find the pattern).

Both of the minify libraries mentioned above also handle CSS minification. However, this is a bit trickier, as you will need to worry about the relative paths to images and imports of other CSS files.

The final step is to make sure all the CSS is in the header and the JavaScript at the bottom of the page.

Web server tuning

Now that the number of files is reduced and your scores have already improved significantly, another class of issues will take over: compression, expiry dates and improper ETags. The easiest to set up is compression. You will need to make sure mod_gzip or mod_deflate is installed in Apache. It almost always is. Everything is done transparently; all you need to do is make sure the right types are set. It can be done in the .htaccess file. Here is an example for mod_deflate.
<IfModule deflate_module>
AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css application/javascript
</IfModule>

Use Firebug to see the content type of any files YSlow is still complaining about and add them to the list.

Another easy target is the ETag declaration. In most installs, Apache will generate an ETag for static files. ETags are a good idea: the browser remembers the last ETag it received for a given URI and sends it back, asking if the resource changed. The server compares it and either sends a 304 to indicate it was not modified, or sends the new version. The problem is that your server still gets a hit. For static files, you’re better off not having them at all.
<FilesMatch "\.(js|png|gif|jpg|css)$">
FileETag None
</FilesMatch>

Expiry headers are a bit trickier. When those need to be sent from your scripts, you have to deal with them yourself. Setting an expiry date means accepting that your users might not see the most recent version of the content, because they won’t query your server to check. These may not be easy decisions to make.

However, static files are much easier to handle. You will need mod_expires in Apache, which is not quite as common as its compression counterpart. The goal is just to set an arbitrary date in the future. Page Speed likes dates more than a month away. YSlow seems to settle for two weeks. The documentation uses 10 years. That should be far enough.
<FilesMatch "\.(js|png|gif|jpg|css|ico)$">
ExpiresActive on
ExpiresDefault "access plus 10 years"
</FilesMatch>

Cookies

Your website most likely uses a cookie to track the session. Cookies are great for the PHP scripts that need them to track who’s visiting, but they also happen to be sent with requests for static files, because the browser does not know it makes no difference there. Cookies alter the request and confuse intermediate caches, especially whenever the cookies change, like when you regenerate the session ID to avoid session hijacking.

The easiest way to avoid sending those cookies to the static files is to place the files on a different server. Luckily, browsers don’t really know how things are organized on the other end, so just using a different domain or sub-domain pointing to the exact same application will do the trick. If you have more load, you might want to serve them with a different HTTP server altogether, but that requires more infrastructure. It should be easy to push JavaScript and CSS to the other domain. Reaching the images will depend on the structure of your application. You will thank those view helpers if you have any.

If you serve some semi-dynamic files through that domain, make sure PHP does not start the session, otherwise, all this was futile.
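A view helper of a few lines is enough to point assets to the cookie-less domain. A minimal sketch, with static.example.com as a placeholder domain and staticUrl as an invented name:

<?php
// Placeholder domain; point it at the same application or a separate server.
define('STATIC_BASE_URL', 'http://static.example.com');

// Tiny view helper: every image, script and stylesheet URL goes through it,
// so moving assets to another domain later is a one-line change.
function staticUrl($path)
{
    return STATIC_BASE_URL . '/' . ltrim($path, '/');
}

// In a template, echo staticUrl('images/logo.png') inside the src attribute.
// Remember not to call session_start() for anything served from that domain.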

You can then configure YSlow’s CDN list to include that other domain and the Google CDN, and observe blazing scores. To modify the configuration, you need to edit Firefox preferences: type about:config in the URL bar, say you will be careful, search for yslow and modify the cdnHostnames property to contain a comma-separated list of domains.

One more

By default, PHP sends a ridiculous Cache-Control header. It basically asks the browser to check for a new version of the script on every request. When your user presses back, you get a new request, and they will likely lose local modifications in the form. Not really nice, and one hit too many on your server. Setting the header to something like max-age=3600, must-revalidate will resolve that issue and make navigation on your site feel so much faster.
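Overriding it only takes a line or two. A sketch, where the one-hour max-age is just a value that happened to suit that application:

<?php
// Stop PHP's session handling from sending its default "no-cache" headers.
// session_cache_limiter() must be called before session_start().
session_cache_limiter('private');
session_start();

// Or set the header explicitly on pages that can tolerate being a bit stale.
header('Cache-Control: max-age=3600, must-revalidate');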

These items should cover most of the frequent issues. Both tools will report a few minor issues, some of which may be easy to fix, some not so much. Make those verifications part of the release procedure. A new content type may get introduced in the application and cause less than optimal behavior due to the lack of a few headers. It may not be possible to get a perfect score on all pages of a site, but if you can cover the most important ones, your users may believe your site is fast, even though you use a heavy framework.

A bit too much

Well, Confoo is now over. That is quite a lot of stress off my shoulders. Overall, I think the conference was a large success and opened up nice opportunities for the future. Over the years, PHP Quebec had evolved to include more and more topics related to PHP and web development. This year was the natural extension: shifting the focus away from PHP and towards the web, including other programming languages such as Python, and sharing common tracks for web standards, testing, project management and security. Most of the conference was still centered around PHP, and that was made very clear on Thursday morning during Rasmus Lerdorf’s presentation (which had to be moved to the ballroom, with 250-300 attendees, including some speakers who faced an empty audience), but hopefully the other user groups will be able to grow in the next year.

Having 8 tracks in parallel was a bit too much. It made session selection hard, especially since I always keep some time for hallway sessions. I feel that “track” lost quite a lot of participants this year compared to the previous ones.

For my own sessions, I learned a big lesson this year: I should not bite off more than I can chew. It turns out some topics are much, much harder to approach than others. A session on refactoring legacy software seemed like a great idea, until I actually had to piece together the content for it. I had to attempt multiple ways to approach the topic and ended up with one that made some sense to me, but very little to the audience, it seems. I spent so much time distilling and organizing the content that I had very little time to prepare the actual presentation. What came out was mostly a terrible performance on my part. I am truly sorry for that.

Lesson of the year: Never submit topics that involve abstract complexity.

The plan I ended up with was a little like this:

  • Explain why rewriting the software from scratch is not an option. Primarily because management will never accept it, but also because we don’t know in detail what the application does and the maintenance effort won’t stop during the rewrite.
  • Bringing a codebase back to life requires a break from the past. Developers must sit down and determine long-term objectives and directions to take, figure out which aspects of the software must be kept and which must change, and find a few concrete steps to be taken.
  • The effort is futile if the same practices that caused the degradation are kept. Unit testing should be part of the strategy and coding standards must be raised.
  • The rest of the presentation was meant to be a bit more practical on how to gradually improve code quality by removing duplication, breaking dependencies on APIs, improving readability and removing complexity by breaking down very large functions into more manageable units.

As I was presenting, my feeling was that I was preaching on one side to converts who had done this before and knew it worked, and on the other side to the rest of the crowd, who did not want to hear that it would take a while and thought I was an idiot (a feeling emphasized by my poor performance, which I was aware of).

Another factor that came into the mix was that I actually had two presentations, neither of which I had ever given before, so both had to be prepared. Luckily, the second one, on unit testing, was a much easier topic and I find that it went better. It was in a smaller room with fewer people. Everyone was close by, so it was a lot closer to a conversation. I accepted questions at any time. Surprisingly, they came for the most part in pretty much the same order I had prepared the content in. The objective of this session was to bootstrap people with unit testing. My intuition told me that the main thing preventing people from writing unit tests was that they never knew where to start. My plan was:

  • Explain quickly how unit testing fits in the development cycle and why test-first is really more effective if you want to write tests. I went over it quickly because I know everyone has had that sermon before. I placed the emphasis instead on getting started with easy problems first, as writing tests requires building some habits. It’s perfectly fine to start with new developments before going back to older code and testing it.
  • Jump into a VM and go through the installation process for PHPUnit, setting up phpunit.xml and the bootstrap script, writing a first test to show it runs and can generate code coverage (a minimal sketch of such a first test follows this list). I did it using TDD, showing how you first write the test, see it fail, then do what’s required to get it to pass.
  • Keeping it hands-on, go through various assertions that help write more expressive tests, using setUp, tearDown and data providers to shorten tests.
  • Move on to more advanced topics such as testing code that uses a database or other external dependency. I ran out of time on this one, so I could not show any live example of it.
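For reference, a first test of the kind I demoed looks roughly like this, following the PHPUnit 3.x conventions of the time; the Calculator class is only a placeholder:

<?php
// Placeholder class under test.
class Calculator
{
    public function add($a, $b)
    {
        return $a + $b;
    }
}

class CalculatorTest extends PHPUnit_Framework_TestCase
{
    public function testAddingTwoNumbers()
    {
        $calculator = new Calculator();
        $this->assertEquals(5, $calculator->add(2, 3));
    }

    /**
     * @dataProvider additionProvider
     */
    public function testAddingFromProvider($a, $b, $expected)
    {
        $calculator = new Calculator();
        $this->assertEquals($expected, $calculator->add($a, $b));
    }

    public function additionProvider()
    {
        return array(
            array(0, 0, 0),
            array(1, 2, 3),
            array(-1, 1, 0),
        );
    }
}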

I was quite satisfied with the type of interaction I had with the audience during the presentation, and the feedback was quite positive too. It was a small room organized in such a way that the audience surrounded me closely, rather than a long room where I could barely see who I was speaking to. Although there were only 15 attendees, I am confident they got something they can work with.

I could have used a dry run before the presentation. I had done one two weeks prior, but that wasn’t quite fresh in my mind, so the delivery was not quite fluid; some of that was intentional, though, to show where to find the information.

During the other sessions I attended, I made two nice discoveries: Doctrine 2, which came up with a very nice structure that I find very compatible with the PHP way, and MongoDB, a document-based database with a very nice way to manipulate data and good performance characteristics for most web applications out there.

Bad numbers

The most frequently quoted numbers in software engineering are probably those of the Standish Chaos reports, starting in 1995. To cut a long story short, they are the ones that claim that only 16% of software projects succeed. Personally, I never really believed that. If it were that bad, I wouldn’t be making a living from writing software today, 15 years later. The industry would have been shut down long before they even published the report. As I was reading Practical Software Estimation, which had a mandatory citation of the above, I came to ask myself why there was such a difference. I don’t have numbers on this, but I would be very surprised to hear any vendor claim less than a 90% success rate on projects. Seriously, with 16% odds of winning, you’re better off betting your company’s future on casino tables than on software projects.

The big question here is: what is a successful project? On what basis do they claim such a high failure rate?

Well, I probably don’t have the same definition of success and failure. I don’t think a project is a failure even if it’s a little behind schedule or a little over budget. From an economic standpoint, as long as it’s profitable and above the opportunity cost, it was worth doing. Sometimes, projects are canceled due to external factors. Even though we’d like to see the project development cycle as a black box in which you throw a fixed amount of money and get a product in the end, that’s not the way it is. The world keeps moving, and if something comes out and makes your project obsolete, canceling it and changing direction is the right thing to do. Starting the project was also the right thing to do based on the information available at the time. Sure, some money was lost, but if there are now cheaper ways to achieve the objective, you’re still winning. As Kent Beck reminds us, sunk costs are a trap.

When a project gets canceled, the executives might be really happy because they see the course change. The people who spent evenings on the project might not. Perspective matters when evaluating success or failure. However, when numbers after the fact are your only measure of success, that may be forgotten. Looking back, it won’t matter whether a hockey team put up a good fight in the third period. On the scoreboard, they lost, and the scoreboard is what everyone will look back to.

Luckily, I wasn’t the only one to question what Standish was smoking when they drew those alarming figures. In the latest issue of IEEE Software, I came across a very interesting title: The Rise and Fall of the Chaos Report Figures. In the article, J. Laurenz Eveleens and Chris Verhoef explain how the analysis performed inevitably leads to this kind of number. The data used by Standish is not publicly available, so analyzing it directly is not possible (how many times do we have to ask for open data?). However, they were able to tap into other sources of project data to perform the analysis and come up with a very clear explanation of the results.

First off, my question was answered. The definition of success for Standish is pretty much arriving under budget based on the initial estimate of the project with all requirements met. Failure is being canceled. Everything else goes into the Challenged bucket, including projects completing slightly over budget and projects with changed scope. Considering that bucket contains half of the projects, questioning the numbers is fairly valid.

I remember a presentation by a QA director (I am not certain of the exact title) from Tata Consulting at the Montreal Software Process Improvement Network a few years ago, in which they explained their quality process. They presented a graph of data collected from projects during post-mortems, where teams were asked to explain the causes of slippage or other types of failure (it was a long time ago, my memory is not that great), and there was a big column labeled Miscellaneous. At the time, I did not notice anything wrong with it. All survey graphs I had ever seen contained a big miscellaneous section. However, the presenter highlighted the fact that this was unacceptable, as it provided them with absolutely no information. In the next editions of the project survey, they replaced Miscellaneous with Oversight, a word no manager in their right mind would use to describe the cause of their failures. It turns out the following results identified the causes more accurately. When information is unclear, you can’t just accept that it is unclear. You need to dig deeper and ask why.

The authors then explain how they used Barry Boehm’s long-understood Cone of Uncertainty and Tom DeMarco’s Estimation Quality Factor (published in 1982, long before the reports) to identify organizational bias in the estimates, and how aggregating data without considering that bias leads to absolutely nothing of value. As an example, they point out a graph containing hundreds of estimates, made at various moments in the projects of one organization, plotted as the ratio of forecast over actual. The graph is striking, as apparently no dots exist above the 1.0 line (nearly none; there are a few very close by). All the dots indicate that the company only occasionally overestimates a project. However, the cone is very visible on the graph and there is a very significant correlation. They asked the right questions and asked why that was. The company simply, and deliberately, used the lowest possible outcome as their estimate, leading them to a 6% success rate based on the Standish definition.

I would be interested to see numbers on how many companies go for this approach rather than providing realistic (50-50) estimates.

Now, imagine companies that buffer their estimates being added to the data collection. You get random correlations at best, because you’re comparing apples to oranges.

Actually, the article presents one of those cases of abusive (fraudulent?) padding. The company had used the Standish definition internally to judge performance. Those estimates were padded so badly, they barely reflected reality. Even at 80% completion some of the estimates were off by two orders of magnitude. Yes, that means 10000%. How can any sane decision be made out of those numbers? I have no idea. In fact, with such random numbers, you’re probably better off not wasting time on estimation at all. If anything, this is a great example of dysfunction.

The article concludes like this:

We communicated our findings to the Standish Group, and Chairman Johnson replied: “All data and information in the Chaos reports and all Standish reports should be considered Standish opinion and the reader bears all risk in the use of this opinion.”

We fully support this disclaimer, which to our knowledge was never stated in the Chaos reports.

It covers the general tone of the article. Beyond the entertainment value (yes, it’s the first time I have ever associated entertainment with scientific reading) brought by tearing apart the Chaos report, what I found most interesting was the accessible, well-explained use of theory to analyze the data. I highly recommend reading it if you are a subscriber to the magazine or have access to the IEEE Digital Library. However, scientific publications remain restricted in access.