Archive for March, 2009

Improvement of the regression tool

Today, I completed and deployed a minor release of TaskEstimation. Attendees at my express session on estimation at PHP Quebec earlier this month got to see the very first addition since the original release: the graphical representation. It seems that for most people without much statistics in their background, and for most who have forgotten it, R² is not very meaningful. However, a visual representation of the dots and the regression line makes everything more obvious. The dots simply have to be close to the line.
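For readers who want to see what sits behind the number, here is a minimal sketch of a least-squares fit and its R². It is not the code TaskEstimation runs; the function names and the sample data (task size against hours spent) are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variation in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical history: estimated size of past tasks and hours actually spent.
sizes = [1, 2, 3, 4, 5]
hours = [2.2, 3.9, 6.1, 8.0, 9.8]

intercept, slope = fit_line(sizes, hours)
print(intercept, slope, r_squared(sizes, hours, intercept, slope))
```

An R² close to 1 says the same thing as dots hugging the line: the residuals are small compared to the overall spread of the data.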

I made this addition during the conference the day before I had to present. I must say I am really impressed by the possible output of eZ Components Graph. Hacking around in the code to produce a scatter plot was not too much trouble, but it did remind me how hard generating good graphs is. I have done it before and going back to my own code did not seem so attractive. Looking at the internals, I could see the same problems were faced: rendering graphics to multiple output formats without tangling the code is just hard.

I then remembered a voice in the back of my head. It’s Steve McConnell repeating that “simplistic single-point estimates are meaningless because they don’t include any indication of the probability associated with the single point.” I knew I had to do it eventually, but I just had to wrap my head around the maths involved and do it. In addition to the forecast, the regression tool now displays the confidence range based on the desired probability. The range also gets displayed on the graphic.

It turns out Wikipedia was the most accurate source of information I had available at the time. It can be hard to find a basic formula when you don’t have a textbook nearby.

Looking scary? It’s really not that bad, except that a few details are missing and I suspect there is an error in the first one. I’d have to check more sources, but alpha alone does not make sense and using the alpha with the hat does look much more accurate graphically.

Generated graphic

The major missing detail was how to calculate the t_{α/2, n-2} term. I wasn’t the only one searching for it: someone linked to it on Wikipedia while I was reading. After searching a while longer for the formula, because I don’t like raw numbers, I settled for including the table itself. Those numbers are quite hard to obtain otherwise, and that much accuracy is not required anyway.
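Including the table can be as simple as a dictionary lookup. The sketch below is an illustration only, not the tool's actual code: it covers a single probability (two-sided 95%), the names are invented, and the values are the ones printed in standard Student's t tables.

```python
# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Values as printed in standard statistical tables.
T_TABLE_95 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    15: 2.131, 20: 2.086, 30: 2.042, 60: 2.000, 120: 1.980,
}


def t_critical_95(df):
    """Return a tabulated t value for the given degrees of freedom.

    When df is not in the table, fall back to the closest smaller entry.
    That value is slightly too large, so the range errs on the wide side.
    """
    if df < 1:
        raise ValueError("at least one degree of freedom is required")
    usable = [k for k in T_TABLE_95 if k <= df]
    return T_TABLE_95[max(usable)]


# With n data points, simple linear regression has n - 2 degrees of freedom.
print(t_critical_95(10 - 2))
```

Rounding down to a smaller table entry is a design choice: it trades a little precision for never understating the uncertainty.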

TaskEstimation now provides nice confidence ranges to accompany the estimate, making the expected accuracy more obvious and taking into account that extrapolating your data beyond known values has more uncertainty.
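To show why the range widens when extrapolating, here is a rough sketch using the standard textbook prediction interval for simple linear regression. It is an assumption that this matches what the tool computes; the function name, the sample data and the t value passed in (3.182, the tabulated two-sided 95% value for 3 degrees of freedom) are all illustrative.

```python
import math


def prediction_interval(xs, ys, x0, t_value):
    """Return (low, forecast, high) for a new observation at x0.

    Uses the textbook formula:
    forecast +/- t * s * sqrt(1 + 1/n + (x0 - mean_x)^2 / Sxx)
    where s is the residual standard error with n - 2 degrees of freedom.
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three points to estimate the error")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ss_res / (n - 2))
    forecast = intercept + slope * x0
    half_width = t_value * s * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
    return forecast - half_width, forecast, forecast + half_width


# Hypothetical history: task sizes and hours spent.
sizes = [1, 2, 3, 4, 5]
hours = [2.2, 3.9, 6.1, 8.0, 9.8]

print(prediction_interval(sizes, hours, 3, 3.182))   # inside the known data
print(prediction_interval(sizes, hours, 10, 3.182))  # extrapolation
```

The (x0 - mean_x)² term is what makes the second range wider: the further the new task is from the average of the historical data, the less the line can be trusted.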

Categories: News

Convergence in the field

During the PHP Quebec Conference, I had one of those weird feelings. It was just like my profession changed overnight, in a good way. It happened while I was talking to Eric David Benari close to the coffee machines at a time we both should have been sitting in another room listening to a talk. Earlier that day, he had given an introductory talk about project management. He was surprised by how smart the audience was after one of his barometer questions. The question went a little like this:

Jeff is a hero in the company. He often pulls all-nighters, works on weekends and always saves the day before releases. Rita works 9-to-5. Considering they both do the same amount of work, who should be getting the largest bonus at the end of the year?

I will leave it to you to determine what the answer was. Eric David was surprised because, for the first time, he saw the audience completely divided instead of being skewed towards one end.

I myself saw a difference after my presentation on software estimation, which I got scheduled to do only a few hours before the conference to fill up a canceled slot. I first gave that presentation in 2006. At the time, most people thought the idea was weird. While I hope some decided to try it out, I doubt that was the case for the entire audience. This time, I wanted to make it more interactive, and most figured out I wasn’t so prepared, but a few came to talk to me after the presentation to tell me how they handled empirical data and to get some advice on some problems. For the first time, I was not alone in the room using valuable estimation practices.

Sure, I have not been alone in this world to do it. I mostly learned from others’ experience. The surprising part is that we are no longer isolated. Have we reached a critical mass? Are there enough software engineering practitioners out there to really make a difference and apply the principles? For a long time, the field has been founded and promoted by a few very bright individuals struggling to get their message through. Today, I see their message as finally spreading.

Every PHP conference I attend has a few sessions on either agile or best practices in some way. These sessions are not the least popular. Even technical crowds are moving their focus away from code. It really wasn’t the case back in 2003. Even some apparently technical talks are in fact process talks in disguise. Sebastian Bergmann’s classics on PHPUnit are good examples.

How did it happen? Joel Spolsky mentioned that it was a bit useless to write about best practices because the programmers who need it are the least likely to read it. It seems that, while they don’t go out to read books and articles, there is only so much that can be ignored. Blogs made the best practices omnipresent. Podcasts brought them to the lazy ones. They reformulated the tried and true techniques in a way others can understand. Even though not everyone is reading or listening, I have the feeling communication got to a point where every company out there can now have an evangelist.

Could time alone simply have made a difference? I’m part of a brand new generation of software developers. I was born after all the problems had been recognized. My training was focused on others’ mistakes so I don’t have to make them over again. Every year, a few hundred more software engineers graduate with better training, knowing the best practices and development processes, not only the coding part of the equation.

Could it be the different crowd? Technical conferences attract younger people in general. People who are usually down in the trenches. Managers are rarely around. Most of what I read these days seems to have a common theme. The theme was present even in books written over 15 years ago, but it’s now getting louder. No matter what you try to do, only one thing will make a difference in the end: commitment to quality work. No methodology will work unless the team agrees with it and embraces it. No estimation technique will work unless those who estimate take it seriously. TDD will fail if developers barely attempt to reach code coverage standards. Robert D. Austin explained this phenomenon a long time ago. Developers know this instinctively because they can see the difference between real and fake. How long will it take until organizations realize it?

The good part of this is that if you have a good team, almost anything will work. Even if you do it all wrong according to the books. There are tons of methodologies out there. Agile alone has a dozen, and there are even more unpublished variants. They all worked at some point; most of them probably failed to repeat with a different team. Formal methods are being laughed at these days, but I have no doubt they did work at the time for the context they were created for. However, simply taking them as a set of rules and enforcing them on other people is bound to fail.

Categories: General

Pushing it out the door

Parkinson’s law indicates that any task will take the amount of time allocated to it. Too often, this is abused by managers to squeeze developers into a short and unrealistic timeline for the project. While they often abuse it, the foundation is still right. Given too much time and no pressure, some developers will create great abstractions and attempt to solve all the problems of the universe. Objectives are always good to have. Allocating a limited, but reasonable, amount of time for a project is a good way to ensure no gold plating is done, while still allowing for sufficient quality in the context.

A reasonable amount of time may be hard to define. It requires a new skill for developers: estimation. Done on an individual basis, estimation can be used by developers as a personal objective to attain, but can also be part of a greater plan towards self improvement. The practice of up front estimation has huge benefits in the long term, even if the estimates are far off the target. Once a task is completed with a huge variation, it triggers awareness. What went wrong? Constantly answering this question and making an effort to resolve the issues will lead to a better process, higher quality estimates and less stress in accomplishing tasks.

A long time ago, after reading Watts Humphrey’s Personal Software Process (PSP), I became convinced of the value of estimation as part of my work. In Dreaming in Code, Scott Rosenberg reflects on Humphrey’s technique:

Humphrey’s success stood on two principles: Plans were mandatory. And plans had to be realistic. They had to be “bottom-up”, derived from the experience and knowledge of the programmers who would commit to meeting them, rather than “top-down”, imposed by executive fiat or marketing wish.

A few initial attempts in 2006 gave me confidence that high precision estimates were possible and not so hard to attain. However, when my work situation changed, I realized that the different projects I was working on did not have the same quality constraints. This led to splitting up my Excel sheets in multiple ways. The task of estimating became so tedious I eventually dropped all tools. Not because I was not satisfied with the results I obtained, but because of the time it took me to get to them. I reverted to paper estimates and my gut feeling of scale. Still, the simple fact of performing analysis, design and rough measurements gave me significant precision. Not everything was lost.

Estimation Interface

However, one thing I did lose was traceability. Paper gets buried under more paper, or lost altogether. Personal notes are not always clear enough to be understood in the future. I no longer had access to my historical data. I wanted my spreadsheet back, but couldn’t bear having to organize it. Over a year ago, searching for a reason to try out new things, I started a personal project to build a tool that would satisfy my needs for organization and simplicity. It required a few features crucial to me.

  1. It had to make it easy to filter data to find the relevant parts to the task at hand
  2. It had to be flexible enough to allow me to try out new estimation techniques
  3. It had to be quick and fun to use, otherwise it would just be another spreadsheet

I achieved a first usable version over the last summer, working on it in my spare time, and gave it a test run in the following months. It was not good enough. Too linear. Too static. It did not accomplish what I needed and I found myself reverting to paper all over again. What a terrible failure.

Spacial Editor

A few months later, I figured I had to make it paper-like and gave it a little more effort. After a dozen hours sitting in the airport over the last two weeks, I think I finally documented my work enough for others to understand. Sadly, even if the application is somewhat intuitive, the prerequisite skills required to perform estimation are not.

Today, I announce the first public beta release of TaskEstimation.com, a tool aimed at developers to estimate their work on a per-task basis and work towards self improvement. Don’t be confused, this is not built for project management. While it probably is flexible enough for it, any project manager using it should have their own self improvement in mind. Feedback is welcome on both the application and the documentation. I expect the latter to be lacking details, so feel free to ask questions.

Categories: General, News