Bad numbers

The most frequently quoted numbers in software engineering are probably those of the Standish Chaos reports, published starting in 1995. To cut the story short, they are the ones claiming that only 16% of software projects succeed. Personally, I never really believed that. If it were that bad, I wouldn’t be making a living from writing software today, 15 years later. The industry would have been shut down long before they even published the report. As I was reading Practical Software Estimation, which had a mandatory citation of the above, I came to ask myself why there was such a difference. I don’t have numbers on this, but I would be very surprised to hear any vendor claiming less than a 90% success rate on projects. Seriously, with 16% odds of winning, you’re better off betting your company’s future on casino tables than on software projects.

The big question here is: what is a successful project? On what basis do they claim such a high failure rate?

Well, I probably don’t have the same definition of success and failure. I don’t think a project is a failure even if it’s a little behind schedule or a little over budget. From an economic standpoint, as long as it’s profitable and above the opportunity cost, it was worth doing. Sometimes, projects are canceled due to external factors. Even though we’d like to see the project development cycle as a black box into which you throw a fixed amount of money and get a product in the end, that’s not the way it is. The world keeps moving, and if something comes out that makes your project obsolete, canceling it and changing direction is the right thing to do. Starting the project was also the right thing to do based on the information available at the time. Sure, some money was lost, but if there are now cheaper ways to achieve the objective, you’re still winning. As Kent Beck reminds us, sunk costs are a trap.

When a project gets canceled, the executives might be really happy because they see the course change. The people who spent evenings on the project might not. Perspective matters when evaluating success or failure. However, when after-the-fact numbers are your only measure of success, that may be forgotten. Looking back, it won’t matter if a hockey team put up a good fight in the third period. On the scoreboard, they lost, and the scoreboard is what everyone will look back to.

Luckily, I wasn’t the only one to question what Standish was smoking when they drew those alarming figures. In the latest issue of IEEE Software, I came across a very interesting title: The Rise and Fall of the Chaos Report Figures. In the article, J. Laurenz Eveleens and Chris Verhoef explain how the way the analysis was performed inevitably leads to this kind of number. The data used by Standish is not publicly available, so analyzing it directly is not possible (how many times do we have to ask for open data?). However, they were able to tap into other sources of project data to perform the analysis and come up with a very clear explanation of the results.

First off, my question was answered. Standish’s definition of success is pretty much coming in under budget relative to the project’s initial estimate, with all requirements met. Failure is being canceled. Everything else goes into the Challenged bucket, including projects completing slightly over budget and projects with changed scope. Considering that bucket contains half of the projects, questioning the numbers is fairly valid.

I remember a presentation by a QA director (not certain of the exact title) from Tata Consulting at the Montreal Software Process Improvement Network a few years ago in which they explained their quality process. They presented a graph of data collected during project post-mortems, where teams were asked to explain the causes of slippage or other types of failures (it was a long time ago, my memory is not that great), and it contained a big column labeled Miscellaneous. At the time, I did not notice anything wrong with it. All survey graphs I had ever seen contained a big miscellaneous section. However, the presenter highlighted that this was unacceptable, as it provided them with absolutely no information. In the next editions of the project survey, they replaced Miscellaneous with Oversight, a word no manager in their right mind would use to describe the cause of their failures. It turns out the following results identified the causes much more accurately. When information is unclear, you can’t just accept that it is unclear. You need to dig deeper and ask why.

The authors then explain how they used Barry Boehm’s long-understood Cone of Uncertainty and Tom DeMarco’s Estimation Quality Factor (published in 1982, long before the reports) to identify organizational bias in the estimates, and how aggregating data without considering that bias leads to absolutely nothing of value. As an example, they point out a graph of hundreds of estimates made at various moments in the projects of one organization, plotted as the ratio of forecast over actual. The graphic is striking: almost no dots exist above the 1.0 line (there are a few very close to it). All the dots indicate that the company only very occasionally overestimates a project. However, the cone is clearly visible on the graph and the correlation is very significant. They asked the right question: why was that? The company simply, and deliberately, used the lowest possible outcome as its estimate, leading it to a 6% success rate by the Standish definition.
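
The underlying check is easy to picture. With unbiased (50-50) estimates, the forecast/actual ratio should land above and below 1.0 about equally often; a one-sided scatter betrays the bias. A toy sketch with made-up numbers:

```python
def ratio_stats(estimates):
    # estimates is a list of (forecast, actual) pairs
    ratios = [forecast / actual for forecast, actual in estimates]
    over = sum(1 for r in ratios if r > 1.0)   # overestimated projects
    under = sum(1 for r in ratios if r < 1.0)  # underestimated projects
    return over, under

# Fabricated sample: a shop that systematically quotes the lowest
# possible outcome will almost never produce a ratio above 1.0.
sample = [(100, 140), (80, 95), (120, 118), (60, 90)]
over, under = ratio_stats(sample)
```

Run on real portfolio data, a count that is lopsided toward underestimates is exactly the pattern the authors found.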

I would be interested to see numbers on how many companies go for this approach rather than providing realistic (50-50) estimates.

Now, add companies that pad their estimates to the same data collection. You get random correlations at best, because you’re comparing apples to oranges.

Actually, the article presents one of those cases of abusive (fraudulent?) padding. The company had used the Standish definition internally to judge performance. The estimates were padded so badly that they barely reflected reality. Even at 80% completion, some of the estimates were off by two orders of magnitude. Yes, that means 10,000%. How can any sane decision be made from those numbers? I have no idea. In fact, with numbers that random, you’re probably better off not wasting time on estimation at all. If anything, this is a great example of dysfunction.

The article concludes like this:

We communicated our findings to the Standish Group, and Chairman Johnson replied: “All data and information in the Chaos reports and all Standish reports should be considered Standish opinion and the reader bears all risk in the use of this opinion.”

We fully support this disclaimer, which to our knowledge was never stated in the Chaos reports.

It covers the general tone of the article. Beyond the entertainment value (yes, this is the first time I have ever associated entertainment with scientific reading) of tearing apart the Chaos report, what I found most interesting was the accessible use of theory to analyze the data. I highly recommend reading it if you are a subscriber to the magazine or have access to the IEEE Digital Library. Unfortunately, scientific publications remain restricted in access.

Always more to do

It’s fascinating how little time passes between the moment code is written and the moment it gets mentally flagged as needing improvement, how procrastination then kicks in, and finally how things get worse because they are left untouched. Of course, there are higher priorities and hundreds of reasons why it was left behind. The end result is the same. It will have to be refactored at some point, and those hours will be incredibly boring. Recently, I have been working on a fairly large project. After several hundred hours, I came to the conclusion that I had a decent proof of concept, but it was far from usable. I made a list of the issues and found several places that needed improvement. It turns out I had known about the root causes for a long time. I simply had not attended to them.

So began the refactoring process, filled with incredibly dull tasks a monkey could do, if only it wouldn’t take more time to explain them than to do them myself.

Certainly, those issues would have been much faster to resolve if less had been built on them, meaning if I had attended to them earlier. However, I strongly doubt that the solution I would have found back then would have lasted any longer. In fact, what I’m implementing now follows patterns that have been deployed in other areas in the meantime. It builds on top of recent ideas. Dull refactoring may just be unavoidable. It will have to be done again in the future.

Constant refactoring in pursuit of perfection is a trap. I’ve learned about ROI and have almost turned going for the highest-value objective into an instinct. With limited resources, no time can be wasted on gold-plating an unfinished product. Refactoring, as a means to improve user interaction and speed up development in this case, just happened to land at the top of the priority list, and I now have to suffer through brain-dead activities. Luckily, refactoring still beats pixel alignment and fixing browser issues any day.

Trying to avoid falling asleep, I have been keeping Chad Fowler’s new book close by, which turns out to be really good for my condition. Today, I came across this passage.

For most techies, the boring work is boring for two primary reasons. The work we love lets us flex our creative muscles. Software development is a creative act, and many of us are drawn to it for this reason. The work we don’t like is seldom work that we consider to be creative in nature. Think about it for a moment. Think about what you have on your to-do list for the next week at work. The tasks that you’d love to let slip are probably not tasks that leave much to the imagination. They’re just-do-’em tasks that you wish you could just get someone else to do.

It goes on to recommend diverting the mind to a different challenge while performing the task, for example by keeping a 100% code coverage target when writing unit tests. I’ve been doing a lot of that in the project. It influenced the design a lot. Ironically, what makes refactoring so boring is that all the tests now have to be updated. The code update itself takes just a few minutes. Updating the dozens of failing tests because the interface changed takes hours, however. The tests are quite a good guarantee that nothing broke, but they do increase my daily caffeine intake to unsafe levels.

Adding collaboration and durability to code reviews

The idea first came to me over the summer while I was in Strasbourg discussing the future changes to TikiWiki. It all started because we decided to release more often. A lot more often. As it was not something done frequently before, the tools to do it were hard and tedious to use. Some manual steps were long and annoying. Packaging the releases was one of them. All those scripts were rewritten to be easier to use and to regroup multiple others. That part is good now. One of the other issues was the changelog. TikiWiki is a very active project. Having over 1000 commits per month is not uncommon. Many of them are really small, but that’s still 1000 entries in the changelog. Ideally, we’d like the changelog to be meaningful to people updating, but we just had no way to go through the whole changelog before the release.

One solution that came up was to use a wiki page to hold the changelog, get new items appended to it, and then people could go through it, translate the commit messages from developer English to user English, filter out the irrelevant ones, and come up with something useful. Well, we had no direct way to append the commit messages to the end of a page, so this was not done over the six-month period, and we’re now getting dangerously close to the next release. Looks like the next changelog will be in developer English.

Now, what does this have to do with code review? Not so much. I was reading “Best Kept Secrets of Peer Code Review” (which they mail you for free because it’s publicity for their software, but which is still worth reading because it contains valid information) and figured TikiWiki could be a good platform to do code reviews on, if only we could link it with Subversion in some way. After all, TikiWiki is already a good platform for collaborating over software development, as it was developed for its own internal purposes for so long. When you dogfood a CMS for a long time, it tends to become good for software development (it also tends to grow complex UIs intuitive only to developers, though). It contains all you need to write quality documentation, track issues, and much more, just by enabling features and setting them up.

Moreover, code review is already done by the community over the Subversion mailing list. The only issue is that we don’t really know what was reviewed and what was not. I personally think a mail client is far from the best environment to review changes in. Too often, I’d like to see just a couple of lines more above the change to fully check it, or verify something in another related file. The alternative at this time is to use Subversion commands afterwards and open up files in vi. I wish it required fewer steps.

Wiki pages are great because they allow you to do what you need and solve unanticipated issues on the spot. Specialized tools, on the other hand, are great at doing what they were meant to do, but they are often draconian in their ways, and forcing someone to do something they don’t want to do never really works. I’ve made those errors in the past when designing applications, and I felt Code Collaborator made the same ones. The book mentioned above contains a chapter on a very large case study where no code could be committed unless reviewed first. The result: a few pages were spent explaining how they had to filter out the cases where the code was not really reviewed and people only validated it within seconds. I’d rather know it was not reviewed than get false positives.

Anyway, I started thinking of the many ways TikiWiki could be used to perform code reviews. The simplest layout I could think of was this one:

  • One wiki page per commit, grouping all the changes together. Reviewers can then simply edit the page to add their comments (or use the page comments feature, but I think adding to the page is more flexible) and add links to other relevant information to help others in the review.
  • A custom plugin to create links to other files at a given revision, just to make them easier to type. This could actually be a plugin alias to something else. No need for additional code.
  • A custom plugin to display a diff, with options to display the full diff instead and to link to the full file versions on both sides.
  • An on-commit plugin for Subversion to make the link.

With commits linked to other related commits, to related documentation, and to discussions, other side features like the wiki mind map and semantic links will surely prove insightful.

Then I went a little further and figured trackers could be used to log issues and run statistics on in the future. Checklists could be maintained in the wiki as well and displayed as modules on the side, always visible during the review. If issues are also maintained in a tracker, they could be closed as part of the commit process by analyzing the commit message. However, this is mostly an extra, as I feel there is enough value in just having the review information publicly available. The great part of using a vast system is that all the features are already there. The solution can be adapted and improved as required without requiring completely new developments.
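
Closing tracker items from commit messages could be as simple as scanning them for the usual keywords. A minimal sketch (the keyword set is my own assumption, and the actual tracker update is left out):

```python
import re

# Matches phrases like "fixes #123" or "closed #45" in a commit message.
CLOSE_PATTERN = re.compile(r"\b(?:close[sd]?|fix(?:e[sd])?)\s+#(\d+)",
                           re.IGNORECASE)

def issues_to_close(commit_message):
    # return the tracker item ids the commit claims to resolve
    return [int(n) for n in CLOSE_PATTERN.findall(commit_message)]
```

The on-commit hook would feed each message through this and close the matching tracker items.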

Now, the only real showstopper was that there is no direct way of creating all this from a Subversion plugin. TikiWiki does not have a webservice-accessible API to create these things and is unlikely to have one any time soon. The script could load the internal libraries and call them if they were on the same server, but that’s unlikely to be the case. A custom script could be written to receive the call, but then it would not be generic, and thus hard to include in the core. As we don’t like to maintain things outside the core (part of the fundamental philosophy making the project what it is), that’s not a good solution. Between this and the changelog idea before, I felt there was still a need for something like this. I’m a software engineer, so I think in terms of the tools I use, but I’m certain there are other cases out there that could use a back door to create various things.

To keep the story short, I started finding too many resemblances to the profiles feature. Basically, profiles are YAML files containing descriptions of items to create. They are used from the administration panel to install configurations hosted in remote repositories, to configure the application faster based on common usage profiles. Most of the time, they are only used to create trackers, set preferences, and such. However, they contain a few handles to create pages, tracker items, and some other elements, in order to create sample data and instruction pages. If they could be run multiple times, be called by non-administrator users, and contain a few more options to handle data a little better (like being able to append to pages), they could pretty much do anything that’s required here, and a lot more.
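
To make that concrete, a profile describing the changelog append could look something like this (a hypothetical sketch; the keys and object types are illustrative, not the actual TikiWiki profile schema):

```yaml
# Hypothetical profile: append a commit message to a wiki page.
objects:
  - type: wiki_page
    data:
      name: ReleaseChangelog
      mode: append            # the missing "append to pages" option
      content: "[rev {$revision}] {$message}"
```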

Another problem was that profile repositories are also TikiWiki instances. They require quite a few special configurations, like opening up some export features, using categories, and such. I wouldn’t want all this just to receive a commit notification, and I wouldn’t want to execute a remote configuration without supervision from the administrator. More changes were required to better handle access to local pages.

Still, those were minor changes. A few hours later, Data Channels were born. What are they? They’re so simple it’s almost stupid: a name, a profile URI, and the list of user groups who can execute it from an HTTP call.
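
From the caller’s side, executing a channel is then just an HTTP request. A minimal sketch of what the Subversion hook could send (the endpoint path, channel name, and parameter names are all assumptions for illustration, not TikiWiki’s actual API):

```python
import urllib.parse

def channel_url(base, channel, **params):
    # build the HTTP call a post-commit hook would make to the wiki;
    # sorting keeps the query string deterministic
    params["channel"] = channel
    return base + "?" + urllib.parse.urlencode(sorted(params.items()))

# Example call a hook might build after a commit (all names hypothetical).
url = channel_url("https://example.org/tiki-channel.php", "append_changelog",
                  page="ReleaseChangelog", content="[rev 123] Fix crash")
```

The wiki side would check the calling user’s group against the channel’s list before running the profile.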

Next steps toward using TikiWiki as a code review tool:

  • Discuss this with the community
  • Write a profile to create the base setup and the base data channel profiles
  • Write a subversion plugin to call the channels
  • Write the plugins to display code differences

Interested in something like this for your project or company? Just drop me a line.

He's right, again

In some theoretical sense, developing the complete infrastructure before implementing any visible functionality might be efficient, but in a practical sense, managers, customers, and developers begin to get nervous when too much time goes by before they can actually see the software work. Infrastructure development has the potential to become a research project in creating a perfect theoretical framework[…]

Sound like a familiar problem? It’s taken from Software Project Survival Guide. I can’t say it’s my favorite from McConnell, but it’s still right on.

Who reads code samples?

Recently I have been reading Programming Collective Intelligence by Toby Segaran. I love the subject. It’s all about handling large data sets and extracting useful information from them. Finally, an algorithm book that covers useful algorithms. I don’t read code-centric books very often because I find them boring, but this one has a great variety of examples that keep it interesting as the chapters advance. There are also real-world examples using web services to fetch realistic data.

My only problem with the book is that there are way too many code samples. It may just be my training, but in some situations just writing the formula would have been a lot better. Code is good, but when there is a strong mathematical foundation, the formula should be provided. Unlike computer languages, mathematics as a language has been developed for hundreds of years, and it provides a concise, unambiguous syntax. I like the author’s effort to write the code as a proof of concept, but I think it belongs in an appendix or on the web rather than between paragraphs.

Which one do you prefer?

import math

def rbf(v1,v2,gamma=20):
    dv=[v1[i]-v2[i] for i in range(len(v1))]
    l=sum(d*d for d in dv)  # squared length of the difference vector
    return math.e**(-gamma*l)

(Option #2 is the formula, roughly: rbf(v1, v2) = e^(−γ‖v1−v2‖²).)

For that kind of code, I vote #2 any time. I’m not a Python programmer. I can read it without any problem, but that vector subtraction did seem a little arcane at first and took me a few seconds to figure out, and I’m almost certain that even a seasoned Python programmer would have paused on it. It’s not that it takes very long to figure out, but it keeps you away from what is really important about the function: points far away from each other should get a lower score than those close by. Anyone who has done math could figure that out from the formula, because it’s a common pattern. From the code, would you even bother to read it?

This is a very short code sample. In fact, it’s small enough that every single detail of it can fit into your short-term memory. Here is an example that probably does not. In fact, I made your life a lot easier here, because this code was scattered across four different pages in the book.

import math

def euclidean(p,q):
    sumSq=0.0
    for i in range(len(p)):
        sumSq+=(p[i]-q[i])**2
    return (sumSq**0.5)

def getdistances(data,vec1):
    distancelist=[]
    for i in range(len(data)):
        vec2=data[i]['input']
        distancelist.append((euclidean(vec1,vec2),i))
    distancelist.sort()
    return distancelist

def gaussian(dist,sigma=10.0):
    exp=math.e**(-dist**2/(2*sigma**2))
    return (1/(sigma*(2*math.pi)**0.5))*exp

def weightedknn(data,vec1,k=5,weightf=gaussian):
    # average the results of the k nearest items,
    # weighted by a function of their distance
    dlist=getdistances(data,vec1)
    avg=0.0
    totalweight=0.0
    for i in range(k):
        dist=dlist[i][0]
        idx=dlist[i][1]
        weight=weightf(dist)
        avg+=weight*data[idx]['result']
        totalweight+=weight
    avg=avg/totalweight
    return avg


The formula is insanely shorter, and the notation could certainly be improved. What’s the trick? It relies on well-documented language features like vector operations and trims out all the Python-specific code. I actually wrote more than I had to, because the Gaussian itself is well defined in math. Because all the operations used are well defined, whichever language you use will probably support them, and you can use the best possible tool for your platform. The odds that I will be using Python when I get to use those algorithms are low, so why should I have to bother with the language specifics?
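
For comparison, here is the earlier rbf function leaning on a library’s vector operations instead of index loops (a sketch using NumPy, assuming l in the book’s version is the squared distance):

```python
import numpy as np

def rbf(v1, v2, gamma=20):
    # the loop over indices disappears; the code now reads like the formula
    v1, v2 = np.asarray(v1, dtype=float), np.asarray(v2, dtype=float)
    return float(np.exp(-gamma * np.sum((v1 - v2) ** 2)))
```

The point stands for any language with decent vector support: once the operations match the math, the translation is mechanical.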

The author actually included the formula for some function in the appendix. I just think it should be the other way around.

New page on this blog

I finally took some time to put the list of books in my library online. This is something I had been meaning to do for quite a while, but I never felt like putting any effort into it. The good thing is that producing the list did not require any effort at all, except for the part where I actually had to search for the books’ ISBN numbers. I could have picked up the books and found them there, but I did not feel like typing them in.

I just wrote a small script to pull the information from Amazon using the ISBN code. Using the Zend_Service_Amazon component, it was a matter of minutes. Until now, I didn’t think these service classes were really useful or needed to be part of the base framework distribution. That was because I had never had to use them. Now I wish there were even more.

Right now, the list is not very useful, but I intend on adding comments on all of the books. Feel free to ask for comments on any book listed, I will give them a higher priority.

The C++ Programming Language

Book Cover

I learned C++ a few years ago and, honestly, I haven’t written a single line in over a year. We used Visual C++ 6 in college. For some reason, I felt like learning more about the language’s details and what it’s really supposed to be. What could be better than a book written by Bjarne Stroustrup, the language’s original author? C++ is a very impressive language. It can be efficient at very low levels while allowing extreme abstraction. Generic programming and operator overloading are really missing in Java. I would say the downside of the language is that, compared to full development platforms, it offers very few high-level facilities, and the standard library, while very well designed and robust, remains very low level and does not really provide added value in real applications. Of course, there are multiple libraries available around the web to achieve most tasks. Still, searching for a library is additional work.

The thousand-page monster did cover the entire language, as I expected it to. The book is divided into five main sections: basic facilities, abstraction mechanisms, the standard library, design using C++, and the appendices (which are really part of the book). The book tends to make C++ look very complex, as a lot of pages are used to explain the ambiguities, special cases, and obscure syntax. To avoid that complexity, my recommendation is to keep code clean and not abuse those ambiguities, special cases, and obscure syntaxes.

Multiple analogies to real-world concepts are used to explain technical details and design decisions. Some of them are quite hilarious. Here is one of my favorites, comparing static and dynamic type checking to electric plugs:

The most obvious thing about these plugs is that they are specifically designed to make it impossible to plug two gadgets together unless the gadgets were designed to be plugged together, and then they can be connected only in the right way. Had you been able to, you would have ended up with a fried shaver or a fried shavee.

Continue reading “The C++ Programming Language”

The Humane Interface

Book Cover

The Humane Interface is an essay by Jef Raskin, creator of the Macintosh, aiming to improve human-machine interaction. The roughly 200-page book covers conventional graphical user interfaces and hardware devices. The book has a very nice format and, for some reason, feels very comfortable to read. It contains a color insert with some of the illustrations the author considered the most important. Each chapter and section begins with a quote representing the topic. It does set the tone, but I’m pretty sure quite a few of the quoted authors would be mad to see their words used so far from their original purpose. (Sorry, I actually had to write a few positive aspects.)

We are oppressed by our electronic servants.
This book is dedicated to our liberation.

[ … Yeah … Right … ]

While the author is very respectable, definitely knows what he is writing about, and explains his concepts and theories very well, the book leaves a bad feeling. It is divided into eight chapters, including the introduction and conclusion. The first three chapters all felt like an introduction, explaining basic concepts in a very theoretical way. Chapter four gives the basics of UI evaluation and quantification. Chapters five and six explain theories about ideal interfaces. Chapter seven seems to be everything that couldn’t fit elsewhere. There are also two appendices, which I haven’t read.

The rest of the review will detail the content of the book and explain why I feel the book will forever stay on a shelf, gathering dust.

Continue reading “The Humane Interface”

Design Patterns: Elements of Reusable Object-Oriented Software

Book Cover

Design Patterns: Elements of Reusable Object-Oriented Software is the one reference when it comes to design patterns. First published in 1994, the book is now at its 30th printing. The nearly 400 finely printed pages of this masterpiece have a retail price of $85.99 CAD, but you can get your copy for $43 CAD. The four authors of the book are often referred to as the Gang of Four, and their standing was already high back when the book was released. The last paragraph from the back cover should be descriptive enough:

The authors are internationally recognized experts in the object-oriented software field. Dr. Erich Gamma is a technical director at the Software Technology Center of Object Technology International in Zurich, Switzerland. Dr. Richard Helm is a member of the Object Technology Practice Group in the IBM Consulting Group in Sydney, Australia. Dr. Ralph Johnson is a faculty member at the University of Illinois at Urbana-Champaign’s Computer Science Department. Dr. John Vlissides conducts his research at IBM’s Thomas J. Watson Research Center in Hawthorne, New York.

The rest of the review contains a description of the book’s content, which is divided into two large sections: an example case study and the pattern catalog. Of course, I will also give my opinion, since that’s what this review (and blog) is all about.

Continue reading “Design Patterns: Elements of Reusable Object-Oriented Software”

Waltzing with Bears: Managing Risk on Software Projects

Cover of the book

I stumbled upon this book while browsing Amazon. The title sounded original, the topic was interesting, and the reviews were positive. I decided to buy it, and it was on my doorstep 36 hours later. Waltzing with Bears is a very quick read (a little under 200 pages), but it contains an incredible amount of information. Tom DeMarco and Timothy Lister wrote an excellent book.

General risk management is covered by about half of the book. In the early chapters, generic examples are used. As the chapters advance, the emphasis on software development increases, but parallels to reality and non-abstract examples are always made. Some of the examples are humorous or cynical, which makes the read even more interesting. My favorite was the one where they evaluate, based on average weights, the damages to prepare for in case you hit a child.

Continue reading “Waltzing with Bears: Managing Risk on Software Projects”