The saddest thing in the software industry is to see a project run flawlessly and then completely fail at deployment time. Of course, there is the worst-case scenario of a team that has never seen a well-run deployment and has no idea it can be done without a sleepless night or two. For a deployment or release to work well, it has to be considered from day zero and be part of the development process.
Version control is now used by everyone serious in the software world. However, people are quick to check software configuration management (SCM) off the list of software engineering best practices as soon as they see a Subversion repository in place. SCM is not called version control for a reason: it goes well beyond it.
Deployment should be a simple task. Just putting the new production code in place is unlikely to be enough on a non-academic project. The typical PHP application will connect to a database and require some directories to be created and assigned the appropriate permissions. New dependencies may have been introduced, whether shared libraries or extensions to the language. Covering all these bases is also part of SCM.
The big problem is that a programmer in the middle of a brand new feature will modify the database schema, create folders, change permissions, and get everything working on their local copy. Then they lose track of it. In most cases, all developers will share a common database because it's way too hard to handle those merges. When the others update their code base, they will hit the permission problem (if they even bother using the new feature), fix it, then forget about it.
This kind of behavior is what causes deployments to be hard.
The environment in which the application runs has to be managed. The first step is to make developers hit these problems themselves during development. Force everyone onto their own copy of the database; soon enough, good database schema practices will be in place. Make sure there is a script containing all the changes to the environment. Ideally, updating the code base should amount to running a single script that resolves all dependencies without anyone having to think about them.
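As an illustration, here is a minimal PHP sketch of what such a single entry point could look like. Every name in it is hypothetical, and the individual steps are covered in more detail below.

```php
<?php
// update.php: a minimal sketch of a single entry point that brings a
// development copy up to date. Function names and checks are examples only.

function checkDependencies() {
    // Refuse to go further if the environment cannot support the new code.
    if (!extension_loaded('pdo_mysql')) {
        exit("Missing PHP extension: pdo_mysql\n");
    }
}

function updateFilesystem() {
    // Create the writable directories the application expects (example paths).
    foreach (array('temp/cache', 'files/uploads') as $dir) {
        if (!is_dir($dir)) {
            mkdir($dir, 0775, true);
        }
    }
}

function updateDatabase(PDO $db) {
    // Apply missing schema patches; see the database schema section below.
}

checkDependencies();
updateFilesystem();
updateDatabase(new PDO('mysql:host=localhost;dbname=app', 'app', 'secret'));
echo "Environment is up to date.\n";
```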
If you can easily create, replicate, and update development environments, you are well on your way to an easy deployment. When the time comes for a production upgrade, a sanity check can be made by taking a production backup, restoring it on a similar (ideally identical) environment, and trying out the update. Then you have enough confidence to do it in production within a few minutes, with everyone on the team keeping their hair and their smiles.
The rule is simple: never perform by hand an action you could automate in a script. When it comes to deployment, you want to keep human intervention to a minimum. Humans make errors; machines rarely do. It's just a matter of fine-tuning the scripts to handle all the exception cases. Over time, they become rock solid.
Database schema
I know two sane ways to handle database schema updates and use both on different projects. The first one is simple and guaranteed to work in all cases; the second one is a little more complex and can potentially have problems, but it can handle much more complex cases too.
The naive way is to simply version the schema. You need a place to store the current schema version in the database and to write a version increment script every time you need a modification to the schema. The update process amounts to looking at the version currently in the database, looking at the version expected by the code, and running all the increment scripts in between. Because the updates are always executed in the same order, you know the final schema will always be the same.
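A minimal sketch of this approach could look like the following, assuming a single-row schema_info table and increment scripts named update_1.sql, update_2.sql, and so on (all names are hypothetical):

```php
<?php
// Bring the schema from the version recorded in the database up to the
// version expected by the code, one increment script at a time.
$db = new PDO('mysql:host=localhost;dbname=app', 'app', 'secret');

$current = (int) $db->query('SELECT version FROM schema_info')->fetchColumn();
$target  = 42;  // version expected by this code base (hypothetical)

for ($version = $current + 1; $version <= $target; $version++) {
    // Assumes one statement per script, or a driver that accepts several.
    $db->exec(file_get_contents("db/update_$version.sql"));
    $db->exec("UPDATE schema_info SET version = $version");
}
```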
However, this strategy will cause a problem if your development cycle includes branches. Two branches cannot increment the version number at the same time or it will break. It's unlikely to cause problems in production, as branches will have been merged and conflicts resolved by then, but it can waste time in development.
This is why there is a second way to handle database updates, which I read on a blog a long time ago and, with my excellent source management practices, promptly lost. The idea is to store the list of installed patches rather than a simple version number. The update process becomes fetching the list of installed patches, comparing it with those contained in the code base, and installing the missing ones. Of course, problems could be introduced if patches are not installed in the same order, such as columns not ending up in the same order. I mitigate this risk by including a timestamp in the patch name and ordering the install queue by it, but with branching, the problem remains.
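Here is a rough sketch of that approach, assuming an installed_patches table and patch files whose names start with a timestamp so that lexical order follows creation order (again, all names are made up):

```php
<?php
// Install every patch shipped with the code base that the database has
// not recorded yet, in timestamp order.
$db = new PDO('mysql:host=localhost;dbname=app', 'app', 'secret');

$installed = $db->query('SELECT patch FROM installed_patches')
                ->fetchAll(PDO::FETCH_COLUMN);

$available = array_map('basename', glob('db/*.sql'));
sort($available);  // the timestamp prefix keeps the install queue ordered

foreach (array_diff($available, $installed) as $patch) {
    // Assumes one statement per file, or a driver that accepts several.
    $db->exec(file_get_contents("db/$patch"));

    $record = $db->prepare('INSERT INTO installed_patches (patch) VALUES (?)');
    $record->execute(array($patch));
}
```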
I implemented this technique for TikiWiki at the end of last summer, and it has worked great ever since. The code has a few peculiarities to handle legacy installations and respect previous environments, but it is available under the LGPL, so feel free to rip it out.
I always plan for arbitrary code execution before and after each patch, beyond plain SQL. This is mostly to handle the theoretical cases where data conversion is required and cannot be done in plain SQL. The example I always have in mind is converting a poorly implemented hierarchy into nested sets. Just having the option to refactor the database safely, without having to worry about data conversion after the patch, is a nice thing.
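One possible way to wire in those hooks, with hypothetical file naming (a .pre.php and a .post.php script sitting next to the SQL patch):

```php
<?php
// Run optional PHP hooks before and after a SQL patch, for data
// conversions that cannot be expressed in plain SQL.
function applyPatch(PDO $db, $patch) {
    $base = 'db/' . preg_replace('/\.sql$/', '', $patch);

    if (file_exists("$base.pre.php")) {
        include "$base.pre.php";   // arbitrary PHP, sees $db
    }

    $db->exec(file_get_contents("db/$patch"));

    if (file_exists("$base.post.php")) {
        include "$base.post.php";  // e.g. compute nested-set boundaries
    }
}
```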
Some propose having uninstall patches to undo each change. While I can see the value in them, especially in testing when trying to identify where a problem was introduced using binary search on the repository history, I tend to think they are just one more place to introduce problems. If rebuilding the entire database from scratch for every test is not too expensive, I don't think they are needed. Of course, the rules of the game change when data quantities increase.
External dependencies
The easiest way to resolve the external dependency problem for libraries is to bundle them. If they get updated with the code base, it’s one less problem to deal with. However, this may not always be possible due to licensing constraints. In this case, they have to be handled externally, which means higher privileges are likely to be required.
You can either write your script so that it requires superuser privileges, running sudo and prompting for a password once in a while, or simply refuse to install if the environment is incorrect: perform a set of checks before running any part of the update and list the steps to take before moving along with it. Both techniques are fine, but the choice is unlikely to be a technical one. It's mostly a matter of how draconian your sysadmins are.
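The "refuse and report" variant can be as simple as the following sketch; the checks themselves are only examples:

```php
<?php
// Run every environment check up front and list what must be fixed
// before any part of the update is allowed to run.
$problems = array();

if (version_compare(PHP_VERSION, '5.2.0', '<')) {
    $problems[] = 'PHP 5.2 or later is required.';
}
if (!extension_loaded('pdo_mysql')) {
    $problems[] = 'The pdo_mysql extension must be installed.';
}
if (!is_writable('temp')) {
    $problems[] = 'The temp/ directory must be writable by the web server.';
}

if ($problems) {
    echo "Cannot update. Fix the following first:\n";
    foreach ($problems as $problem) {
        echo " - $problem\n";
    }
    exit(1);
}
```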
Physical environment
The two techniques used for database schema upgrades can certainly be applied to managing the physical environment. Replace the version stored in the database with a file containing the version, or log all the installed patches in a file. The important part is that what you compare against lives on the same medium as what it describes. If your database server crashes and you need to restore a backup dating from before the update, it should be able to update itself even if the environment is already up to date. On the other hand, if the entire server crashes, it should be able to rebuild the environment even if the database is up to date. If you store the environment version information in the database, that just won't happen.
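A sketch of the patch-list variant applied to the environment, with the log kept in a plain file next to the code rather than in the database (paths are hypothetical):

```php
<?php
// Replay environment patches (directory creation, permissions, ...) that
// are not yet recorded in a log file living on the same file system.
$logFile   = 'env/installed.txt';
$installed = file_exists($logFile)
    ? array_filter(array_map('trim', file($logFile)))
    : array();

$available = array_map('basename', glob('env/*.php'));
sort($available);

foreach (array_diff($available, $installed) as $patch) {
    include "env/$patch";   // creates directories, fixes permissions, ...
    file_put_contents($logFile, $patch . "\n", FILE_APPEND);
}
```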
Generally, using the same mechanisms in development as those used for migrations is the key to successful deployment. Scripts that are rarely used tend to rot; if the development team uses them often, they will work. Consider having a policy of getting everyone to work from a fresh copy of the production database every week to make sure the updates always work. Sure, there are some privacy concerns, but anonymizing data can be automated too. If the data quantities are extremely large and loading a copy of the production database takes hours, better tools may be needed, but even that can run automatically over the weekend while the development servers are idle.
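Anonymization can be as mundane as a few UPDATE statements run right after loading the dump; the table and column names here are invented:

```php
<?php
// Strip personal data from a freshly loaded development copy.
$db = new PDO('mysql:host=localhost;dbname=app_dev', 'app', 'secret');

$db->exec("UPDATE users SET
    email     = CONCAT('user', user_id, '@example.com'),
    real_name = CONCAT('User ', user_id),
    phone     = NULL");
```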