<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>L-P Huberdeau</title>
	<atom:link href="http://blog.lphuberdeau.com/wordpress/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.lphuberdeau.com/wordpress</link>
	<description>Some days, I feel like writing</description>
	<lastBuildDate>Tue, 16 Feb 2010 00:12:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Bad numbers</title>
		<link>http://blog.lphuberdeau.com/wordpress/2010/02/15/bad-numbers/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2010/02/15/bad-numbers/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 00:12:46 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=317</guid>
		<description><![CDATA[The most frequently quoted numbers in software engineering are probably those of the Standish Chaos report starting in 1995. To cut the story short, they are the ones that claim that only 16% of software projects succeed. Personally, I never really believed that. If it were that bad, I wouldn&#8217;t be making a living from [...]]]></description>
			<content:encoded><![CDATA[<p>The most frequently quoted numbers in software engineering are probably those of the Standish Chaos report starting in 1995. To cut the story short, they are the ones that claim that only 16% of software projects succeed. Personally, I never really believed that. If it were that bad, I wouldn&#8217;t be making a living from writing software today, 15 years later. The industry would have been shut down long before they even published the report. As I was reading <a href="http://www.amazon.ca/Practical-Software-Estimation-Insourced-Outsourced/dp/0321439104">Practical Software Estimation</a>, which had a mandatory citation of the above, I came to ask myself why there was such a difference. I don&#8217;t have numbers on this, but I would be very surprised to hear any vendor claiming less than a 90% success rate on projects. Seriously, with 16% odds of winning, you&#8217;re better off betting your company&#8217;s future on casino tables than on software projects.</p>
<p>The big question here is: what is a successful project? On what basis do they claim such a high failure rate?</p>
<p>Well, I probably don&#8217;t have the same definition of success and failure. I don&#8217;t think a project is a failure even if it&#8217;s a little behind schedule or a little over budget. From an economic standpoint, as long as it&#8217;s profitable, and above the <a href="http://en.wikipedia.org/wiki/Opportunity_cost">opportunity cost</a>, it was worth doing. Sometimes, projects are canceled for external factors. Even though we&#8217;d like to see the project development cycle as a black box in which you throw a fixed amount of money and get a product in the end, that&#8217;s not the way it is. The world keeps moving and if something comes out and makes your project obsolete, canceling it and changing direction is the right thing to do. Starting the project was also the right thing to do based on the information available at the time. Sure some money was lost, but if there are now cheaper ways to achieve the objective, you&#8217;re still winning. As Kent Beck reminds us, <a href="http://www.threeriversinstitute.org/blog/?p=438">sunk costs are a trap</a>.</p>
<p>When a project gets canceled, the executives might be really happy because they see the course change. The people that spent evenings on the project might not. Perspective matters when evaluating success or failure. However, when numbers after the fact are your only measure for success, that may be forgotten. Looking back, it won&#8217;t matter if a hockey team gave a good fight in the third period. On the scoreboard, they lost, and the scoreboard is what everyone will look back to.</p>
<p>Luckily, I wasn&#8217;t the only one to question what Standish was smoking when they drew those alarming figures. In the latest issue of IEEE Software, I came across a very interesting title: <a href="http://www.computer.org/portal/web/csdl/doi/10.1109/MS.2009.154">The Rise and Fall of the Chaos Report Figures</a>. In the article, J. Laurenz Eveleens and Chris Verhoef explain how the analysis made inevitably leads to this kind of number. The data used by Standish is not publicly available, so analyzing it correctly is not possible (how many times do we have to ask for open data?). However, they were able to tap into other sources of project data to perform the analysis and come up with a very clear explanation of their results.</p>
<p>First off, my question was answered. The definition of success for Standish is pretty much arriving under budget based on the initial estimate of the project with all requirements met. Failure is being canceled. Everything else goes into the <em>Challenged</em> bucket, including projects completing slightly over budget and projects with changed scope. Considering that bucket contains half of the projects, questioning the numbers is fairly valid.</p>
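<p>To make the three buckets concrete, here is a minimal sketch of the classification as paraphrased above. The <em>Project</em> record and its field names are hypothetical, and the rule is only my reading of the definitions, not Standish&#8217;s actual method.</p>

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical project record; field names are illustrative only."""
    canceled: bool
    initial_estimate: float   # budget forecast made at project start
    actual_cost: float        # what the project ended up costing
    requirements_met: bool    # delivered with the full original scope


def classify(project: Project) -> str:
    """Bucket a project using the definitions paraphrased in the post."""
    if project.canceled:
        return "failed"
    on_budget = project.actual_cost <= project.initial_estimate
    if on_budget and project.requirements_met:
        return "successful"
    return "challenged"


projects = [
    Project(False, 100.0, 95.0, True),    # under budget, full scope
    Project(False, 100.0, 103.0, True),   # 3% over budget
    Project(False, 100.0, 90.0, False),   # scope changed along the way
    Project(True, 100.0, 40.0, False),    # canceled
]
labels = [classify(p) for p in projects]
print(labels)
```

<p>Note how a project that is 3% over its initial estimate lands in the same bucket as one with a changed scope, which is why the size of that bucket matters so much.</p>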
<p>I remember a presentation by a QA director (not certain of the exact title) from Tata Consulting at the Montreal Software Process Improvement Network a few years ago in which they were explaining their quality process. They presented a graph built from data collected during project post-mortems, where people were asked to explain the causes of slippage or other types of failures (it was a long time ago, my memory is not that great), and there was a big column labeled <em>Miscellaneous</em>. At the time, I did not notice anything wrong with it. All survey graphs I had ever seen contained a big miscellaneous section. However, the presenter highlighted the fact that this was unacceptable as it provided them with absolutely no information. In the next editions of the project survey, they replaced <em>Miscellaneous</em> with <em>Oversight</em>, a word which no manager in their right mind would use to describe the cause for their failures. Turns out the following results were more accurate for the causes. When information is unclear, you can&#8217;t just accept that it is unclear. You need to dig deeper and ask why.</p>
<p>The authors then explain how they used <a href="http://en.wikipedia.org/wiki/Barry_Boehm">Barry Boehm</a>&#8217;s long understood <a href="http://www.construx.com/Page.aspx?hid=1648">Cone of Uncertainty</a> and <a href="http://en.wikipedia.org/wiki/Tom_DeMarco">Tom DeMarco</a>&#8217;s <a href="http://www.stickyminds.com/sitewide.asp?ObjectId=3392&amp;Function=DETAILBROWSE&amp;ObjectType=ART">Estimation Quality Factor</a> (published in 1982, long before the reports) to identify organizational bias in the estimates and explain how aggregating data without considering it leads to absolutely nothing of value. As an example, they point out a graph plotting the forecast over actual ratio for hundreds of estimates made at various moments in projects within one organization. The graph is striking as apparently no dots exist above the 1.0 line (nearly none, there are a few very close by). All the dots indicate that the company only occasionally overestimates a project. However, the cone is very visible on the graph and there is a very significant correlation. They asked the right questions and asked why that was. The company simply, and deliberately, used the lowest possible outcome as their estimate, leading them to a 6% success rate based on the Standish definition.</p>
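<p>The forecast over actual ratio is easy to play with. The sketch below uses made-up numbers, not data from the article, and the function names are my own. It only shows how an estimation policy, rather than project quality, drives the share of forecasts that end up at or above 1.0.</p>

```python
def forecast_ratios(forecasts, actual):
    """Return forecast/actual for each forecast made during a project."""
    return [f / actual for f in forecasts]


def share_at_or_above_one(ratios):
    """Fraction of forecasts that did not underestimate the actual value."""
    return sum(1 for r in ratios if r >= 1.0) / len(ratios)


# Made-up numbers: an organization that always quotes the lowest possible outcome.
lowball = forecast_ratios([40, 55, 70, 85, 95], actual=100)
# Made-up numbers: an organization aiming for a 50-50 estimate.
median = forecast_ratios([80, 120, 95, 105, 100], actual=100)

print(lowball)
print(share_at_or_above_one(lowball))
print(share_at_or_above_one(median))
```

<p>With these invented figures, the first organization never reaches the 1.0 line even though its forecasts converge nicely toward the actual value, which is the pattern described above.</p>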
<p>I would be interested to see numbers on how many companies go for this approach rather than providing realistic (50-50) estimates.</p>
<p>Now, imagine companies buffering their estimates added to the data collection. You get random correlations at best because you&#8217;re comparing apples to oranges.</p>
<p>Actually, the article presents one of those cases of abusive (fraudulent?) padding. The company had used the Standish definition internally to judge performance. Those estimates were padded so badly, they barely reflected reality. Even at 80% completion some of the estimates were off by two orders of magnitude. Yes, that means 10000%. How can any sane decision be made out of those numbers? I have no idea. In fact, with such random numbers, you&#8217;re probably better off not wasting time on estimation at all. If anything, this is a great example of <a href="http://www.systemsguild.com/GuildSite/DandL/AustinForeword.html">dysfunction</a>.</p>
<p>The article concludes like this:</p>
<blockquote><p>We communicated our findings to the Standish Group, and Chairman Johnson replied: &#8220;All data and information in the Chaos reports and all Standish reports should be considered Standish opinion and the reader bears all risk in the use of this opinion.&#8221;</p>
<p>We fully support this disclaimer, which to our knowledge was never stated in the Chaos reports.</p></blockquote>
<p>It covers the general tone of the article. Beyond the entertainment value (yes, it&#8217;s the first time I have ever associated entertainment with scientific reading) brought by tearing apart the Chaos report, what I found the most interesting was the accessible use of theory to analyze the data. I highly recommend reading it if you are a subscriber to the magazine or have access to the IEEE Digital Library. However, scientific publications remain restricted in access.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2010/02/15/bad-numbers/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>CUSEC 2010</title>
		<link>http://blog.lphuberdeau.com/wordpress/2010/01/23/cusec-2010/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2010/01/23/cusec-2010/#comments</comments>
		<pubDate>Sun, 24 Jan 2010 02:58:45 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=314</guid>
		<description><![CDATA[This year, I attended CUSEC for the 4th time, two of which I was an organizer. Even though the target audience is students and I graduated what feels a really long time ago now, I still wait avidly for the event every year. The conference isn&#8217;t really ever technically in depth, I see it as [...]]]></description>
			<content:encoded><![CDATA[<p>This year, I attended <a href="http://cusec.net/">CUSEC</a> for the 4th time, two of which I was an organizer. Even though the target audience is students and I graduated what feels a really long time ago now, I still wait avidly for the event every year. The conference isn&#8217;t really ever technically in depth, I see it as an opportunity to see some trends. Every single time, it seems to have a perfect mix. It tends to hook onto the new technologies and hypes. After all, the program <em>is</em> made up by students.</p>
<p>This year, I think everyone would agree that the highlight was <a href="http://pyre.third-bit.com/blog/">Greg Wilson</a> with a very strong invitation to raise the bar and ask for higher standards from software research. Very few quoted studies are actually statistically relevant in any way. I had seen the session before at Dev Days in Toronto (it was around 90% identical). I would see it again. Perhaps, there would be even fewer <em>FIXME</em> notes in the slides. Greg is currently in the process of publishing a book on evidence in software engineering practice to be published as part of the O&#8217;Reilly <em>Beautiful *</em> series. The book does not yet have a name, so I can&#8217;t pre-order on amazon and that&#8217;s truly disappointing.</p>
<p>One of the lower visibility sessions I found interesting was IBM&#8217;s David Turek on Blue Gene and scientific processing. Many discarded the session because it was given by a VP. Now, I don&#8217;t really care about scientific calculations. I believe it&#8217;s important, but it&#8217;s not where my interests lie. I am almost certain I will never use Blue Gene. However, what I found interesting was to see how they tackle extremely large problems. Basically, the objective is to have supercomputers with 1000 times the computational capacity we have today by the end of the decade. Using current technologies, you would need a nuclear power plant to provide it.</p>
<p>Finally, Thomas Ptacek&#8217;s session on security was mostly entertaining. It was one of those 3-hour sessions compressed into one hour. I don&#8217;t think I could catch everything, but he went over common developer flaws and how simple omissions can take down an entire security strategy. He concluded with a very useful decision making process: if your encryption strategy involves anything other than GPG and SSL, refactor. It&#8217;s one of the problems I always had with cryptography APIs. There are too many options, many of which are plain wrong and irresponsible to use. On the other hand, he was quite a pessimist during the question period, saying there is no hope to create secure software using the current tools and technologies. All software ever made eventually had flaws found in it.</p>
<p>One of the most troubling moments of the conference for me was to see how disconnected some people can be. I actually came across a software engineering student (not a freshman) who did not know what Twitter was. Not only did he not know, he had <em>never</em> heard of it. How is that even possible? I don&#8217;t use Twitter. I use <a href="http://status.net/">an open alternative</a>, and I&#8217;m not that much into microblogging. However, I do believe it somewhat reached mainstream. You can hear the word while watching news on TV. I really need to lower my assumptions about what people know.</p>
<p>Next conference for me will be <a href="http://confoo.ca/en">Confoo.ca</a>, where I will be presenting two sessions and struggling to choose which of 8 sessions to attend every hour for 3 days.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2010/01/23/cusec-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Decision making</title>
		<link>http://blog.lphuberdeau.com/wordpress/2010/01/10/decision-making/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2010/01/10/decision-making/#comments</comments>
		<pubDate>Mon, 11 Jan 2010 00:48:28 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=309</guid>
		<description><![CDATA[As part of the day to day work of a software developer, decisions have to be taken every single day. Some have a minor impact and can be reverted at nearly no cost. Others have a significant impact on the project and reverting it would be a fundamental change. I have found that, in most [...]]]></description>
			<content:encoded><![CDATA[<p>As part of the day to day work of a software developer, decisions have to be taken every single day. Some have a minor impact and can be reverted at nearly no cost. Others have a significant impact on the project and reverting it would be a fundamental change. I have found that, in most cases, not making a decision at all is a much better solution. A lot of time is wasted evaluating technology. Out of all the options available out there, there is a natural tendency to do everything possible to pick the best of the crop, the one that will offer the most to the project and provides the largest amount of features for future developments. While the reasoning sounds valid, it&#8217;s an attempt to predict the future and will most likely be wrong.</p>
<p>Of course, the project you are working on is great and you truly believe it will be revolutionary. However, you&#8217;re not alone. Every day, thousands of other teams work on their own projects. Chances are they are not competitors, but most likely a complement to yours and will likely make the package you spent so much time selecting completely obsolete before you&#8217;ve used all those advanced features.</p>
<p>Too often, I see a failure to classify the type of decision that has to be made in projects. They are not all equal. Some deserve more time. In the end, it&#8217;s all about managing risks and contingencies. The very first step is to identify the real need and what the boundaries are with your system. No one needs Sphinx. People need to search for content in their system. Sphinx is one option. You could also use Lucene or even external engines if your data is public. What matters when integrating in this case is how the data will be indexed and how searching will be made. When trying out new technology, most will begin with a prototype, which then evolves into production code. At that point, a critical error was made. Your application became dependent on the API.</p>
<p>If you begin by making clear that the objective is to index the content in your system, you can design boundaries that isolate the engine-specific aspects and leave a cohesive &#8212; neutral &#8212; language in your application.</p>
<p>Effectively, having such a division allows you not to choose between Sphinx or Lucene or something else. You can implement one that makes sense for you today and be certain that required changes to move to something else will be localized. With your application logic to extract the data to be indexed and the logic for fetching results and displaying them left independent, the decision-making step becomes irrelevant.</p>
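<p>As a rough illustration of such a boundary, here is a minimal sketch. The names <em>SearchIndex</em>, <em>InMemoryIndex</em> and <em>publish</em> are hypothetical, and the in-memory adapter is a stand-in for the simplest thing that could work. It does not use any real Sphinx or Lucene API.</p>

```python
from typing import Protocol


class SearchIndex(Protocol):
    """Neutral boundary: speaks the application's language, not an engine's."""

    def index(self, doc_id: int, title: str, body: str) -> None: ...

    def search(self, words: str) -> list[int]: ...


class InMemoryIndex:
    """Simplest adapter that could work. An adapter for a real engine would
    implement the same two methods."""

    def __init__(self) -> None:
        self._docs: dict[int, str] = {}

    def index(self, doc_id: int, title: str, body: str) -> None:
        self._docs[doc_id] = f"{title} {body}".lower()

    def search(self, words: str) -> list[int]:
        terms = words.lower().split()
        return sorted(
            doc_id
            for doc_id, text in self._docs.items()
            if all(term in text.split() for term in terms)
        )


def publish(engine: SearchIndex, doc_id: int, title: str, body: str) -> None:
    """Application code depends only on the SearchIndex boundary."""
    engine.index(doc_id, title, body)


engine = InMemoryIndex()
publish(engine, 1, "Bad numbers", "Standish Chaos report figures")
publish(engine, 2, "New favorite toy", "Sphinx search engine")
print(engine.search("sphinx"))
```

<p>Swapping the engine then means writing one more class with the same two methods, while <em>publish</em> and the rest of the application stay untouched.</p>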
<p>Certainly, there is some overhead. You need to convert the data to a neutral format rather than simply fetching what the API wants and then convert it to the appropriate format. Some look at the additional layer and see a performance loss. In most cases when integrating with other systems, the <a href="http://c2.com/cgi/wiki?OneMoreLevelOfIndirection">additional level of indirection</a> really does not matter. You are about to call a remote system performing a complex operation over a network stack. If that wasn&#8217;t complex, you would have written it yourself.</p>
<p>A common pitfall is to create an abstraction that is too closely bound to the implementation rather than the real needs of the system. The abstraction must speak your system&#8217;s language and completely hide the implementation, otherwise, the layer serves no purpose. It&#8217;s a good idea to look at multiple packages and see how they work conceptually when designing the abstraction. While you&#8217;re not going to implement all of them, looking at other options gives a different perspective and helps in adjusting the level of abstraction.</p>
<p>Once the abstraction is in place, the integration problem is out of the critical path. You can implement the simplest solution, knowing that it won&#8217;t scale to the appropriate level down the road, but the simplest solution now will allow you to focus on more important aspects until the limit is reached. When it is, you will be able to reassess the situation and select a better option knowing that changes will be localized to an adapter.</p>
<p>Abstracting away is a good design practice and it can be applied to almost any situation. It allows your code to remain clean, breaks dependencies to external systems that would otherwise make it hard to set-up the environment and decrease testability. Because the code is isolated, it leaves room for experimentation with a safety net. If the chosen technology proves to be unstable or a poor performer, you can always switch to something else.</p>
<p>While it works in most cases, it certainly does not work for some fundamental decisions, like the implementation language, unless you plan on <a href="http://www.codinghorror.com/blog/archives/000679.html">writing your own language that would compile in other languages</a>. Some abstractions just don&#8217;t make sense.</p>
<p>When you can&#8217;t defer decision making, stick with what you know. Sure you might want to try out this new framework in the cool new language. The core of your project, if you expect it to live, is no place to experiment. I have been using PHP for nearly a decade now. I&#8217;ve learned all the subtleties of the language. It is a better choice for me. I&#8217;ve used the Zend Framework on a few projects and know my way around it well enough. It&#8217;s a good solution for me. Both together are a much safer path than Python/Django or any alternative, no matter what Slashdot may say.</p>
<p>It might not sound like a good thing to say, but experimenting as part of projects is important. You can&#8217;t test a technology well enough unless it&#8217;s done as part of a real project and a project is unlikely to be real unless it&#8217;s part of your job. It&#8217;s just important to isolate experiments to less critical aspects. It&#8217;s the responsible thing to do.</p>
<p>It&#8217;s all about risk management. Make sure all decisions you make are either irrelevant because they can be reverted at a low cost or use technologies you trust based on past experience and you will avoid bad surprises.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2010/01/10/decision-making/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New favorite toy</title>
		<link>http://blog.lphuberdeau.com/wordpress/2009/12/03/new-favorite-toy/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2009/12/03/new-favorite-toy/#comments</comments>
		<pubDate>Thu, 03 Dec 2009 20:31:59 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=298</guid>
		<description><![CDATA[It certainly ain&#8217;t cutting edge, but I recently started using Sphinx. I heard presentations about it, read some, but never had the occasion to use it. It&#8217;s very impressive as a search engine. The main downside that kept me away from it for so long is that it pretty much requires a dedicated server to [...]]]></description>
			<content:encoded><![CDATA[<p>It certainly ain&#8217;t cutting edge, but I recently started using <a href="http://www.sphinxsearch.com/">Sphinx</a>. I heard presentations about it, read some, but never had the occasion to use it. It&#8217;s very impressive as a search engine. The main downside that kept me away from it for so long is that it pretty much requires a dedicated server to run it. As I primarily work on open source software where I can make no assumption about the environment, sphinx never was an option. For those environments, the PHP implementation of Lucene in the Zend Framework is a better candidate.</p>
<p>In most cases, I tend to stick with what I know. When I need to deliver software, I much prefer avoiding new problems and sticking with what I know is good enough. Granted the option, a few details made me go for Sphinx rather than Lucene (always referring to the PHP project, not the Java one).</p>
<ul>
<li>No built-in stemmer, and I could only find one for English. If you&#8217;ve never tried, not having a stemmer in a search engine is a good way to only occasionally have search results and it makes everything very frustrating.</li>
<li>Pagination has to be handled manually. Because it runs in PHP and all memory is gone by the end of the execution, the only way it can handle pagination decently is to let you handle it yourself.</li>
</ul>
<p>However, it&#8217;s a matter of trade-offs. Sphinx has a few drawbacks.</p>
<ul>
<li>Runs as a separate server application and requires an additional <a href="http://pecl.php.net/package/sphinx">PHP extension</a> to connect to it (although recent versions support the <a href="http://www.sphinxsearch.com/docs/current.html#sphinxql">MySQL protocol</a> and let you query it with SQL).</li>
<li>No incremental update of the index. The indexer runs from command line and can only build indexes efficiently from scratch. Different configurations can be used to lower the importance of this issue. Some delay on the search index update has to be tolerated.</li>
</ul>
<p>If you can get past those issues, Sphinx really shines.</p>
<ul>
<li>It handles pagination for you. Running as a daemon, it can keep buffers open, hold the data internally and manage its memory properly. In fact, you don&#8217;t need to know and that&#8217;s just perfect.</li>
<li>It can store additional attributes and filter on them, including multi-valued fields.</li>
<li>It&#8217;s distributed, so you can scale the search capacity independently. Doing so requires modifying the configuration file, but it is entirely transparent to your application.</li>
<li>Result sorting options, including one based on time segments, giving higher ranking for recent results depending on which segment they are part of (hour, day, week, month). Ideal when searching for recent changes.</li>
</ul>
<p>Within just a few hours, it allowed me to solve one of the long-standing issues in all CMS software I&#8217;ve come across: respecting access rights in search results efficiently. Typically, whichever search technique you use will provide you with a raw list of results. You then need to filter those results to hide those that cannot be seen by the user. If you can accept that not all pages have the same number of results (or none at all), this can work pretty efficiently. Otherwise, it adds either a lot of complexity or a lot of inefficiency.</p>
<p>Another option is to just display the results anyway to preserve the aesthetic and let the user be faced with a 403-type error later on. It may be an acceptable solution in some cases. However, you need to hope the excerpt generated or the title does not contain sensitive information, like <em>Should we fire John Doe?</em>. This can also happen if the page contains portions of restricted information.</p>
<p>First, the pagination issue. I could solve this one by adding an attribute to each indexed document and a filter in all queries. The attribute contains the list of all roles who can view the page. The filter contains the list of all roles the user has. Magically, Sphinx paginates the results with only the pages that are visible to the current user.</p>
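<p>The idea can be sketched outside of any search engine. In the snippet below, plain Python stands in for what Sphinx does internally. The function name and the data are hypothetical, and the <em>roles</em> list plays the part of the multi-valued attribute.</p>

```python
def visible_page(results, user_roles, page, per_page):
    """Filter by role overlap first, then paginate the visible results."""
    allowed = set(user_roles)
    visible = [doc for doc in results if allowed & set(doc["roles"])]
    start = (page - 1) * per_page
    return visible[start:start + per_page]


# Hypothetical ranked results; "roles" lists the roles allowed to view each page.
results = [
    {"id": 1, "roles": [1, 2]},
    {"id": 2, "roles": [3]},
    {"id": 3, "roles": [2]},
    {"id": 4, "roles": [3, 4]},
    {"id": 5, "roles": [1]},
]

page_one = visible_page(results, user_roles=[1, 2], page=1, per_page=2)
page_two = visible_page(results, user_roles=[1, 2], page=2, per_page=2)
print([doc["id"] for doc in page_one])
print([doc["id"] for doc in page_two])
```

<p>Because the filter is applied before the results are cut into pages, every page except the last one is full, which is the property that is hard to get when filtering after the fact.</p>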
<p>Of course, this required a bit of up-front design. The permission system allows me to obtain the list of all roles having an impact on the permissions. Visibility can then be verified for each role without having to scan for every (potentially hundreds or thousands) role in the system.</p>
<p>Sphinx can build the index either directly from the database by providing it with the credentials and a query or through an XML pipe. Because a lot happens from the logic in the code, I chose the second approach, providing me with a lot more flexibility. All you have to do is write a PHP script that (ideally using XMLWriter) gathers the data to be indexed and writes it to the output buffer.</p>
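<p>I did this in PHP, but the shape of such a script is the same in any language. Here is a rough Python sketch using the standard library&#8217;s streaming XML writer. The element names follow the xmlpipe2 format as I remember it, and the comma-separated encoding of the roles is an assumption, so check both against the documentation of your Sphinx version. The field and attribute names are made up.</p>

```python
import io
from xml.sax.saxutils import XMLGenerator


def write_docset(documents, out):
    """Stream documents as XML; element names are assumed from xmlpipe2."""
    xml = XMLGenerator(out, encoding="utf-8")
    xml.startDocument()
    xml.startElement("sphinx:docset", {})

    # Declare one full-text field and one multi-valued attribute for roles.
    xml.startElement("sphinx:schema", {})
    xml.startElement("sphinx:field", {"name": "content"})
    xml.endElement("sphinx:field")
    xml.startElement("sphinx:attr", {"name": "roles", "type": "multi"})
    xml.endElement("sphinx:attr")
    xml.endElement("sphinx:schema")

    for doc in documents:
        xml.startElement("sphinx:document", {"id": str(doc["id"])})
        xml.startElement("content", {})
        xml.characters(doc["content"])   # special characters are escaped here
        xml.endElement("content")
        xml.startElement("roles", {})
        # Assumption: multi-valued attributes are sent as a comma-separated list.
        xml.characters(",".join(str(r) for r in doc["roles"]))
        xml.endElement("roles")
        xml.endElement("sphinx:document")

    xml.endElement("sphinx:docset")
    xml.endDocument()


buffer = io.StringIO()
write_docset([{"id": 1, "content": "Budget & schedule", "roles": [1, 2]}], buffer)
output = buffer.getvalue()
print(output)
```

<p>In a real indexing script the output would go to standard output rather than a buffer, and the documents would come from the application logic instead of a hard-coded list.</p>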
<p>The second part of the problem, about exposing sensitive information in the result, was resolved as a side effect. The system allows granting or denying access to portions of the content. When building the index, absolutely all content is indexed. However, Sphinx does not generate the excerpts automatically when generating the search results. One reason is that you may not need them, but the main reason is more likely to be that it does not preserve the original text. It only indexes it. Doing so avoids having to keep yet another copy of your data. Your database already contains it.</p>
<p>To generate the excerpt, you need to get the content and throw it back to the server with the search words. The trick here is that you don&#8217;t really need to send back the same content. While I send the full content during the indexing phase, I only send the filtered content when time comes to generate the excerpt.</p>
<p>Sure, there may be false positives. Someone may see the search result and get nothing meaningful to them. John Doe might find out that the page mentions his name, but the content will not be exposed in any case. Quite a good compromise.</p>
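<p>The trick is simple enough to sketch without Sphinx at all. The excerpt function below is a naive, hypothetical stand-in for the one the search server provides, and the page content is invented. What matters is which text gets passed to it.</p>

```python
def build_excerpt(text, words, radius=30):
    """Return a snippet around the first matching word, or the text's start."""
    lowered = text.lower()
    for word in words:
        position = lowered.find(word.lower())
        if position != -1:
            start = max(0, position - radius)
            return text[start:position + len(word) + radius].strip()
    return text[:2 * radius].strip()


# Hypothetical page with a restricted section.
public_part = "Agenda for the quarterly planning meeting."
restricted_part = "Should we fire John Doe?"

indexed_text = f"{public_part} {restricted_part}"   # sent at indexing time
filtered_text = public_part                         # what this user may read

# The query matched because of the restricted section, but the excerpt is
# built from the filtered text only.
excerpt = build_excerpt(filtered_text, ["fire"])
print(excerpt)
```

<p>The page still shows up in the results, which is the false positive mentioned above, but the restricted sentence never leaves the server.</p>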
<p>So many possibilities. What is your favorite feature?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2009/12/03/new-favorite-toy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Software education</title>
		<link>http://blog.lphuberdeau.com/wordpress/2009/11/07/software-education/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2009/11/07/software-education/#comments</comments>
		<pubDate>Sat, 07 Nov 2009 22:45:43 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=290</guid>
		<description><![CDATA[There is a strong tendency these days to push for a reform of software education. For one, this is quite strange as I don&#8217;t consider it has ever settled and has too many variants out there to say they are all wrong, but there is still a consensus that the current state is pretty bad. [...]]]></description>
			<content:encoded><![CDATA[<p>There is a strong tendency these days to push for a reform of software education. For one, this is quite strange as I don&#8217;t consider it has ever settled and has too many variants out there to say they are all wrong, but there is still a consensus that the current state is pretty bad. Looking at the failure rates of software projects, this may well be true. Ever since the birth of the profession, a lot of things were learned about software development. Yet, the average curriculum of programming courses takes very little of it into account. Most of the ads you may see focus on programming languages and environments, pretending that you will know all there is to know at the end of whatever number of years, months or weeks they decided to advertise.</p>
<p>There is a disconnection. Languages and platforms have very little to do with the actual skills relevant to software development and the failure rates of projects. This is why there is a call for a reform. I have observed multiple opinion groups.</p>
<h3>Apprenticeship</h3>
<p>There has always been a debate to compare software development to other disciplines, trying to find a metaphor to explain what we should be doing. Many of those metaphors lead to apprenticeship and the idea that we should be learning from the work of masters. The basic idea is that we should learn to read before we write. In fact, it does feel absurd that most of us wrote our first lines of code without understanding them. The act of writing the code and running it explained to us what it did. Honestly, I think that works pretty well. Code at that level has very little meaning outside execution.</p>
<p>Fortunately, this group actually aims for the higher level benefits. We should be reading significant pieces of code to learn from the design decisions that were made. Learn about the trade-offs. The entire series of <em>Beautiful [Code|Architecture|...]</em> books fit in this perspective. The main issue with this right now is that we don&#8217;t have a significant corpus of code we could get people to read reasonably. So few people read code that we can barely begin to identify the good parts. Good being very subjective to start with.</p>
<p>Imagine a first semester course in college where students never touch a computer. They are provided with code that they must read and understand. Exams? Write essays about the code to explain the author&#8217;s intent and decisions. Find flaws and boundaries. It certainly looks like a literature course, but it&#8217;s the kind of thinking a programmer needs to have when jumping into new code. They must understand the design decisions that were made. Software maintenance is a mess right now because very few people know how to read code. They don&#8217;t know how to spot the limitations and they just hack their way through it, corrupting the initial design until it becomes a faint memory from the past.</p>
<p>Reading code requires a special skill called pattern recognition. Unless you can abstract away from the individual lines of code, any attempt to understand a code listing longer than a hundred lines is bound to fail. Chess players don&#8217;t see their game as a series of individual movements. They have patterns that they recognize and use to anticipate the opponent&#8217;s moves. Although we don&#8217;t have opponents (or shouldn&#8217;t have), knowing the patterns in code allows us to anticipate the attributes that come along with them, including extension points, possibilities and limitations.</p>
<p>Design patterns, which go well beyond the Gang of Four&#8217;s initial list, are a way to build a collective intelligence related to patterns. It&#8217;s great to know them to communicate with others, but we should also get into the habit of building our own mental models. Maybe someday we will write up a pattern ourselves, or just recognize that it already has a name when we hear of it.</p>
<h3>Do it yourself</h3>
<p>Almost completely on the opposite side, the DIY group encourages programmers to write their code from scratch. Write their own libraries. Their own frameworks. Clone existing applications. All of this for the sake of learning. You can spend years reading about others&#8217; mistakes and you may be able to remember some of them and avoid them when you encounter the situation. However, a mistake you make yourself is one you will remember forever.</p>
<p>In writing your own code, you learn a lot about the underlying platform. Once you&#8217;ve written a framework or two, you will be much more apt to understand other frameworks as you encounter them. You will understand why some things are done in a certain way, because you will have encountered these issues yourself. You will be able to notice details in the design that no one else paid any attention to, like a really clever technique to handle data filtering.</p>
<p>The DIY approach leads to specialization. It focuses on learning things in depth. Most open source projects started this way. The most talented programmers you might encounter probably learned this way. However, this is impractical in more traditional education. It takes a lot of time. It requires passion, and that can&#8217;t be forced onto someone. Within a few hours that would make for a reasonable assignment, very small libraries or focused tasks can be performed. However, what you can learn from scratching the surface is very limited. Going deep enough to really learn something would take months or years.</p>
<p>Arguably, most of the education today does something very similar to this. Teachers give assignments and students perform them. They are built in a way that the students will hopefully learn something specific related to the course along the way. Longer assignments would lead to more valuable learning, but they would also make evaluation much harder and the scope much wider.</p>
<p>An interesting aspect around the web in this area is code katas: small exercises that professionals can do to practice and keep learning. I think that DIY is really important, but it&#8217;s really part of an ongoing personal development rather than something to be done in schools. Let&#8217;s face it. At the speed at which technology evolves in all directions, being outdated is a permanent state. The best we can do is make sure we&#8217;re not left in the stone age.</p>
<h3>Process</h3>
<p>The process school of thought is morphing as we speak. There used to be a time when there was this idea that if you pinned down requirements early on, made a good design and implemented it properly, nothing could go wrong. That&#8217;s what I was taught in college in the only course related to project management. These days, it may be XP or some other agile method. It&#8217;s still the same school of thought. Given a good process, you will obtain good results, so education should focus on teaching good processes and you will obtain good software developers.</p>
<p>Processes are great, but they are mainly a management issue. A process well adapted to the project will allow the team to perform at their best. However, there is a catch. A mediocre team performing at their best is still mediocre. Process management is an optimization issue. It&#8217;s about making sure that the ball is not being dropped in a terrible way that could have been prevented because the team had the skills to avoid it. Dealing with optimization before being able to solve the issue is a ridiculous idea. You might as well tell students to wear a suit at work to hide the fact that they don&#8217;t actually do anything.</p>
<p>The reason there are so many books on methodologies (luckily, the rate at which they are written has decreased) is that <em>they all work</em> because they were written by people working with great teams. They could have followed no process at all and they probably still would have succeeded.</p>
<p>Being able to follow a process is important for a developer. As they join a team in the real world, they will have to follow the rules to begin with. Basic understanding of the reasons behind them will allow them to perform better and stop them from questioning everything. However, processes can in no way be central to the education, unless you are training managers.</p>
<h3>Best practices</h3>
<p>These days, this group is very much into enrolling students in unit testing from the start. Teach them TDD from day one so that they will never think about working in any other way. Once again, I feel this is a lot of what schools have been doing forever. A best practice is very much dependent on time. In college, we started directly in object oriented programming because that was a hot word at the time. Sadly, just writing classes does not really provide a good understanding of what object oriented programming really is all about. There is a balance required.</p>
<p>I would agree that TDD is a better stepping stone than just object oriented or whichever buzzword we had in the past because testing is unlikely to disappear in the future, but very few projects are written test first these days. Given that it has a huge impact on the design of software, I fear that it would make those students completely incapable in a typical work place, having to deal with legacy code.</p>
<p>I think that teaching best practices is certainly one of the roles of education, but it should focus on those that overlap technology and have greater odds of surviving. The buzzword of the day is unlikely to be the best choice. Perhaps the buzzword of the last decade that is still around is a more suitable choice: the practices the industry no longer speaks of because they have become part of the definition of software engineering. I have lost count of the surprised faces I&#8217;ve seen after telling people that we don&#8217;t learn to do unit tests in college. That we don&#8217;t learn to use version control. We don&#8217;t get to package and release applications, or deploy them for that matter.</p>
<p>Using version control, releasing, deploying. Those are skills that are expected in the enterprise. They have been around forever. CVS wasn&#8217;t the first version control software out there, and people speak of it as a long extinct dinosaur. Yet, outside those who contributed to open source software in their spare time, I know of very few graduates who know how to use Subversion or any other tool. A large company I&#8217;ve worked at had a one day course for all new hires which focused almost exclusively on version control. That means that the absence of the skill is frequent enough and important enough for them to have full time resources assigned to training. Why is that not taught in colleges? I don&#8217;t know.</p>
<p>Sure, unit testing, test-first and TDD are great. But perhaps there are more fundamental things to go through before.</p>
<p>But really, is software education all that bad and does it matter anyway? There is some pretty great software out there, and there are excellent programmers who learned without ever going to school. There is certainly a need to shorten the training period from 20 years to 3, but experience will always have value. I&#8217;ve learned programming by myself. I must have been 12 or 13 when I wrote my first line of code and probably would have done it younger if I had access to programming tools or internet before that. I could navigate a computer with a DOS prompt before grade school, so I really don&#8217;t see why I wouldn&#8217;t have started programming earlier given the opportunity. Learning to program is not that hard. I went for a technical degree in college (3 years) and then for software engineering in university (4 years) afterwards and that taught me a whole lot, but I still could program a web forum and knew basic SQL before entering college and what pays my rent is what I learned from contributing to open source.</p>
<p>One can easily learn to program by himself, especially now with an unlimited amount of resources available. If there is one thing school can do, it&#8217;s distill all the information to help focus. The real question that should be asked is what education is trying to achieve. I really don&#8217;t think that education has anything to do with being a good programmer or professional. That&#8217;s a question of passion and attitude. Most book authors I respect probably studied something unrelated because there was no such thing as software education in their time. No matter what you put in the educational system, passionate people will end up on top. Don&#8217;t even think that can be influenced or that they can be manufactured.</p>
<p>If the goal is to provide workers (which it probably is, from the government&#8217;s standpoint), schools should provide training that allows graduates to hit the ground running in a workplace. That means providing them with techniques to understand large code bases and work in them efficiently, and teaching them to use the tools people really use. Then it&#8217;s all about learning to find the information you need, from valid sources, when you really need it. Most code out there does not use complex shortest-path algorithms. Sure, knowing those and all the concepts of complexity is useful, but not crucial. When you need it, you have to go back and search for it anyway.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2009/11/07/software-education/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Go for changes that matter</title>
		<link>http://blog.lphuberdeau.com/wordpress/2009/11/04/go-for-changes-that-matter/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2009/11/04/go-for-changes-that-matter/#comments</comments>
		<pubDate>Wed, 04 Nov 2009 17:27:28 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=281</guid>
		<description><![CDATA[I don&#8217;t have any stats to support this. But I&#8217;m pretty certain that every second, a developer somewhere complains about legacy code. Most of the time, no one person can be blamed for it. Other than a few classics demonstrating complete lack of understanding, most bad code out there was not written by any single [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t have any stats to support this. But I&#8217;m pretty certain that every second, a developer somewhere complains about legacy code. Most of the time, no one person can be blamed for it. Other than a few classics demonstrating complete lack of understanding, most bad code out there was not written by any single person. It just grew organically until just looking at it makes it fall apart. Most of the time, it begins with the good intention of keeping the design simple for a simple task. Augment this with a lot of vision by an inspired third party and the will to keep the original design unchanged by the next implementor and you made a step in the wrong direction. At some point, the code becomes so bad that people just put in their little fix and avoid looking at the larger picture.</p>
<p>No matter how you got there, knowing it won&#8217;t really help you. You&#8217;re stuck with a large pile of code you have no interest in maintaining and whose functionality is a complete mystery. The initial reaction is to vote for a complete rewrite from ground up using more modern technologies. Well, the past has demonstrated this to be a failure over and over again. Starting a new project from scratch is a good way to implement a brand new idea. If the objectives are completely different, it deserves to be a different project. However, a new project to do something the old one did is hardly a good idea. Development will be so long that you will either have to sacrifice your entire user base, who will move away due to neglect of their current solution, or to keep maintaining the current version for a while, which will kill the new initiative due to lack of resources.</p>
<p>The real solution is to devote time to making things better. Take all this time wasted on complaining and actually try to make things better. While the code smell is often generalized, very few parts are usually rotten. Cleaning up those areas can transform the project without so much effort. The one thing you don&#8217;t want to do is begin with the first file and clean up all the parts you don&#8217;t like about it, or polish some feature because it could be improved and you understand it enough to do it.</p>
<p>Improving code quality is just like improving performance. Unless you target the areas that really matter, there will be no significant impact. If you spend an hour to optimize a query and gain 50% improvement on it, you can be happy with it, but if that query accounted for 1% of the total execution time, your impact really is 0.5%. Sadly, software quality does not have so many direct numbers that can be observed. There are metrics, but the impact will be seen on the longer term, mixed up with dozens of other issues, making it nearly impossible to measure. It also affects these weird factors like team morale.</p>
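<p>The arithmetic behind that 0.5% is easy to sketch. The function name and the second, larger hot spot below are made up for illustration; only the 1% and 50% figures come from the example above.</p>

```python
def overall_gain(share_of_total, local_improvement):
    """Fraction of total execution time saved.

    share_of_total: fraction of total run time spent in the optimized part.
    local_improvement: fraction by which that part got faster.
    """
    return share_of_total * local_improvement


# The query from the text: 1% of total time, improved by 50%.
small = overall_gain(0.01, 0.50)
# A hypothetical hot spot: 40% of total time, improved by only 20%.
large = overall_gain(0.40, 0.20)

print(f"{small:.1%} vs {large:.1%}")  # prints "0.5% vs 8.0%"
```

<p>A modest gain on the part that matters beats a spectacular gain on the part that doesn&#8217;t, which is the same reasoning that applies when picking what to refactor.</p>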
<p>To me, the main attributes refactoring candidates have are:</p>
<ul>
<li>Obstructive</li>
<li>Untrustworthy</li>
<li>Inconsistent</li>
</ul>
<p>Obstructive issues hamper your ability to grow. They are road blocks. If you drew a directed graph of all the issues and feature requests as nodes and dependencies as edges, those issues would stand in the middle. They cause problems everywhere for no obvious reasons and always prevent you from going up to the next level. In TikiWiki, the permission system was one of those. For a long time, and it still is, the high granularity of the permission system was one of the key features of the CMS. There are currently no less than 200 permissions that can be attributed. However, the naive implementation caused so many problems that a word was created to identify it in bug reports. It also prevented us from building the feature most demanded by large enterprise customers: project workspaces.</p>
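<p>As a rough sketch of that graph idea, one could count how many other items transitively depend on each issue and look at whatever comes out on top. The backlog below is entirely hypothetical (it is not taken from the TikiWiki tracker), and <code>blocked_counts</code> is just a name picked for this illustration.</p>

```python
from collections import defaultdict

# Hypothetical backlog: each item maps to the items it depends on.
depends_on = {
    "project workspaces": ["permission rewrite"],
    "category permissions bug": ["permission rewrite"],
    "object-level sharing": ["project workspaces"],
    "new theme": [],
    "permission rewrite": [],
}


def blocked_counts(depends_on):
    """Count, for each item, how many other items transitively depend on it."""
    # Reverse the edges: blocker -> items that directly wait on it.
    dependents = defaultdict(set)
    for item, blockers in depends_on.items():
        for blocker in blockers:
            dependents[blocker].add(item)

    def reach(node, seen):
        # Depth-first walk over everything downstream of node.
        for child in dependents[node]:
            if child not in seen:
                seen.add(child)
                reach(child, seen)
        return seen

    return {item: len(reach(item, set())) for item in depends_on}


counts = blocked_counts(depends_on)
most_obstructive = max(counts, key=counts.get)
print(most_obstructive, counts[most_obstructive])  # prints "permission rewrite 3"
```

<p>The count is a crude measure, but it makes the point: the item sitting in the middle of the graph is the one holding everything else back.</p>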
<p>Obstructiveness may also apply in terms of development. Every time you have to perform a simple task in a given area, you find yourself juggling with complexity. For historical reasons, just getting close to a piece of code requires a complete ritual dance. So much that you just attempt to work around the issue. It&#8217;s likely that the API does not provide the functionality that is required. A lot of copy-pasting is needed and, as a result, a lot of time is wasted.</p>
<p>Untrustworthy code often looks innocent. A function call that looks simple and that you would expect to work. However, for some reason, every time you use it somewhere, you close your eyes before execution. For some reason, you&#8217;re not convinced it will act as it should. There are also multiple bugs filed related to the feature under corner conditions and they are always fixed by adding a line or a condition. Typically, it will be a very long function with disparate branching. Overgrown by feature requests over time. It&#8217;s not rare to see multiple different ways to do the same thing with different parameters. It was so complicated that someone made a request for something that already existed, and the developer did not even notice. The only way out of it is to map out what it does and what it&#8217;s supposed to do, and begin writing tests for it.</p>
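<p>One way to start mapping out what such a function does is to record its current behaviour before touching it, sometimes called a characterization test. The sketch below is only an illustration: <code>legacy_price</code> is a made-up stand-in for an overgrown function, and the input cases are arbitrary.</p>

```python
def legacy_price(quantity, unit_price, member=False, coupon=None):
    """Stand-in for an overgrown function (hypothetical, not from a real project)."""
    total = quantity * unit_price
    if member:
        total = total * 0.9
    if coupon == "TEN":
        total = total - 10
    return round(max(total, 0.0), 2)


# Step 1: record what the function does today for a grid of inputs.
cases = [
    (1, 5.0, False, None),
    (1, 5.0, False, "TEN"),
    (3, 20.0, True, None),
    (3, 20.0, True, "TEN"),
]
observed = {case: legacy_price(*case) for case in cases}


# Step 2: any rewrite must reproduce the recorded behaviour.
def check_unchanged(func):
    """Return the cases where func disagrees with the recorded results."""
    return [case for case in cases if func(*case) != observed[case]]


print(check_unchanged(legacy_price))  # prints "[]"
```

<p>Comparing <code>observed</code> against what the function is supposed to do is where the real bugs tend to show up. Once the differences are settled, the recorded values become the safety net for cleaning up the branches.</p>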
<p>Inconsistency is a different kind of smell. There is nothing wrong, except that you always find yourself looking up how to use components. Different conventions are used. Sometimes you need to send an array, other times an object. For no apparent reason, things are done differently from one place to the next. Most of the time, these are easy to fix. Find out which way is the right one and deploy it all over. Don&#8217;t let the wrong way be used again. Most of the time, they just spread because someone looked up for an example and took the wrong one. Fixing those issues does not have such a large impact by itself, but it will often reduce the clutter in the code. With less code remaining, it will be easier to see the other problems.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2009/11/04/go-for-changes-that-matter/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Always more to do</title>
		<link>http://blog.lphuberdeau.com/wordpress/2009/10/12/always-more-to-do/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2009/10/12/always-more-to-do/#comments</comments>
		<pubDate>Mon, 12 Oct 2009 21:54:36 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=277</guid>
		<description><![CDATA[It&#8217;s fascinating to see how little time it takes between the moment code is written and to be mentally flagged as to be improved, how procrastination then kicks in and finally how things get worse because left untouched. Of course, there are higher priorities and hundreds of reasons why it was left behind. The end [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s fascinating to see how little time it takes between the moment code is written and to be mentally flagged as to be improved, how procrastination then kicks in and finally how things get worse because left untouched. Of course, there are higher priorities and hundreds of reasons why it was left behind. The end result is the same. It will have to be refactored at some point, and those hours will be incredibly boring. Recently, I have been working on a fairly large project. After several hundreds of hours, I came to the conclusion I had a decent proof of concept, but it was far from being usable. I made a list of the issues and found several places that needed improvement. Turns out I had known about the root causes for a long time. I simply did not attend to them.</p>
<p>So began the refactoring process. Filled with incredibly dull tasks a monkey could do, if only I wouldn&#8217;t need to spend more time explaining them than it actually takes me to do them.</p>
<p>Certainly, those issues would have been much faster to resolve if less had been built on them, meaning if I had attended to them earlier. However, I strongly doubt that the solution I would have found back then would have lasted any longer. In fact, what I implement now follows patterns that have been deployed in other areas in the meantime. It builds on top of recent ideas. Dull refactoring may just be unavoidable. It will have to be done again in the future.</p>
<p>Constant refactoring and trying to reach for perfection is a trap. I&#8217;ve learned about ROI and almost turned going for the highest value objective into an instinct. With limited resources, no time can be wasted on gold-plating an unfinished product. Refactoring, as a means to improve user interaction and speed up development in this case, just happened to reach the top of the priority list, and I now have to suffer through brain-dead activities. Luckily, refactoring still beats pixel alignments and fixing browser issues any day.</p>
<p>Trying to avoid falling asleep, I have been keeping <a title="The Passionate Programmer" href="http://www.amazon.ca/Passionate-Programmer-Creating-Remarkable-Development/dp/1934356344/">Chad Fowler&#8217;s new book</a> close by, which turns out to be really good for my condition. Today, I came across this passage.</p>
<blockquote><p>For most techies, the boring work is boring for two primary reasons. The work we love lets us flex our creative muscles. Software development is a creative act, and many of us are drawn to it for this reason. The work we <em>don&#8217;t</em> like is seldom work that we consider to be creative in nature. Think about it for a moment. Think about what you have on your to-do list for the next week at work. The tasks that you&#8217;d love to let slip are probably not tasks that leave much to the imagination. They&#8217;re just-do-&#8217;em tasks that you wish you could just get someone else to do.</p></blockquote>
<p>It goes on to recommend diverting the mind to a different challenge while performing the task, for example keeping a 100% code coverage target when writing unit tests. I&#8217;ve been doing a lot of that in the project. It influenced the design a lot. Ironically, what makes refactoring so boring is that all the tests now have to be updated. The code update itself is just a few minutes. Updating the dozens of failing tests because the interface changed takes hours, however. They are quite a good guarantee nothing broke, but they do increase my daily caffeine intake to unsafe levels.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2009/10/12/always-more-to-do/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The OPUS failure</title>
		<link>http://blog.lphuberdeau.com/wordpress/2009/08/30/the-opus-failure/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2009/08/30/the-opus-failure/#comments</comments>
		<pubDate>Sun, 30 Aug 2009 21:41:20 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=272</guid>
		<description><![CDATA[Beginning last year, the Montreal metropolitan area began the deployment of OPUS cards throughout the public transit system. The process was a long and painful transition during which half the gates to the metro stations were using the new system and half were still on the old one, resulting in longer than average queues. At [...]]]></description>
			<content:encoded><![CDATA[<p>Beginning last year, the Montreal metropolitan area began the deployment of OPUS cards throughout the public transit system. The process was a long and painful transition during which half the gates to the metro stations were using the new system and half were still on the old one, resulting in longer than average queues. At the end of June, they finally completed the transition and stopped selling the old tickets. I should say almost completed, because you can still come across some of the old gates.</p>
<p>The concept of the OPUS card is fairly simple. Nothing the world has not seen before. An RFID card that can be used in the various public transit systems of the metropolitan area. The promise was interesting. Carrying a single card for all public transit needs. Unified, automated machines to purchase tickets, removing the need for interaction with other human beings.</p>
<p>It all started fine. Acting like a laggard, I waited until January to get a card. I could then get my monthly pass for the Montreal island, which served on a day to day basis. I then got train tickets to visit my family once in a while. Everything worked fine thus far. A few months later, they modified the train schedules and they were no longer optimal for my needs. I figured out the good old bus would do a better job. I then tried the machine to get the tickets for the bus. It wouldn&#8217;t let me purchase them. It felt counter-intuitive, but I got in line to speak to a human being. The woman did not know what was going on either and searched for a good 10 minutes to figure it out. I could feel a queue of people behind me growing impatient.</p>
<p>She finally gave up and gave me a “solo” card, which is a single-use RFID card containing the tickets. The 1-card dream was over. I was already stuck with two. Worst part is, they both have conflicting signals, so I now had to pull out the card from my wallet to use it. So much for using RFID.</p>
<p>Time went by and this mixed success held for a few months, that is, until I moved to a more central area. For the first month, I still had my monthly pass, but quickly figured out I was only occasionally using public transit because most of the places I need to go are within walking distance. Having traveled for the major part of August, getting a monthly pass was certainly not worth it anyway.</p>
<p>So I went out to get tickets instead, which are sold in packs of 10. Went to the machine. Once again, the option was not available. Once again, I resorted to getting in the queue to talk to someone. The lady told me there was a problem with my card and it had to be “programmed” (one expression that makes me cringe when coming from a non-programmer) and that I had to go to the Berri-UQAM station to get that done. Quite off my route for the day, so I kept that for later.</p>
<p>A few days later, I did stop by the station and went with my card, asking for 10 tickets. I did not mention the previous story with the card having to be “programmed”. That did not feel right. It turns out I got a different response. There was a conflict between the types of tickets. Here is how it goes. Public transit networks are mainly municipal. STM handles Montreal, STL handles Laval, RTL handles Longueuil and a few other smaller ones handle smaller city agglomerations. The AMT is the umbrella organization handling infrastructure developments and operating above ground trains. All networks sell tickets or passes valid on their territory and the AMT sells tickets valid in zones.</p>
<p>It turns out the train tickets I previously bought were not only train tickets, but were zone 3 tickets. What prevented me from buying tickets all along was that those zone 3 tickets would have been valid tickets in the metro and most buses. The only problem is that those tickets cost about twice the price, so I&#8217;m not really interested in using them for anything else than the train.</p>
<p>Well, it&#8217;s a system limitation. Apologizing, the man at the Berri-UQAM station gave me a second OPUS card (from a single card promise, I now have two bulky cards and a smaller one). He also told me that to avoid the problem, I should purchase the zone tickets on solo cards instead of getting them on my regular OPUS cards if I don&#8217;t use them on a regular basis. Here is the problem with this. Solo cards are not distributed by automated machines, and the non-terminal train stations do not have human operators.</p>
<p>My big question is this: Is my use case so complicated that it was not even considered? How is the use case of a young man living in the city and visiting family in the suburbs around once a month a complex use case?</p>
<p>How did this happen? Well, it obviously is a problem of poor engineering work and misunderstanding of technology. They introduced a new technology to “simplify” the interconnection between the transit networks, but did it in a way that prevents any major changes in the way they operate. Make-up on a monkey. Nothing more. They did not simplify the process. They unified it around a piece of plastic and made it harder for everyone to understand what is going on. Even though carrying a few stacks of tickets was annoying, every time I used one, I knew what was going on. Guessing which one the machine will use for me was out of the equation.</p>
<p>The way this is handled in most cities is that they divide it into zones and you pay based on how many zones you cross during transit. Someone figured out that using the card going in and going out of transit like it&#8217;s done everywhere else in the world was too complicated for the citizens of Montreal, even though they had smiling people standing around the stations for months during the transition to help people. They probably would have made a better investment by actually training the permanent staff to understand how it works.</p>
<p>Deep down, there is a technical issue. Without input on where you are going when getting in the transit, or a way to know where you are getting out at, the system is stuck blindly picking a ticket. The real solution is to get additional input to understand the route and bill appropriately. It did not happen, or it was completely ignored by administrators who did not understand the technical requirements implied by basic logic. To avoid this terrible randomness issue, the engineering solution was to prevent the purchase of conflicting titles on the same card altogether, and do it silently.</p>
<p>The AMT would have been a good solution: use the zones as a means for billing. Unify the currency, not just the payment method. Get rid of the accidental complexity imposed by the various networks without an overall vision. Especially the parts imposed by administrators in the suburbs who never actually used public transit and truly think it&#8217;s only for the poor rather than a valid means of urban transportation.</p>
<p>There was never enough political pressure to give AMT the power they needed to unify the metropolitan networks. Worker unions in the various cities kept on fighting to preserve the jobs, or really just getting them re-organized. They did not bother finding a common financial model and kept on working the way they used to. All of it resulting in a system with more flaws than features. The political issue between the various networks ends up affecting us all.</p>
<p>Attempting to touch as little as possible of the status quo, all of the directors ended up agreeing on a common shopping list for the solution supplier. I will skip the open bidding process that typically comes with such projects, but here is a list of the requirements I would expect on such a list based on what I have encountered:</p>
<ul>
<li>Each network must be able to sell tickets and monthly passes valid on their territory.</li>
<li>STM also has weekly passes, daily passes and 3-day passes.</li>
<li>AMT sells zone tickets or monthly passes. Each zone may comprise one network or part of a network.</li>
<li>CIT Laurentides and CIT Lanaudière need to have tickets or monthly passes that are valid in both networks, which can be for the entire networks or only the south portions.</li>
<li>Monthly passes for a single bus may be available.</li>
</ul>
<p>At some point, someone should have realized that these are conflicting and that just asking for the card at the beginning of the trip could not possibly work. Instead of standing up and preventing a disaster, the supplier who could not afford to lose the lucrative contract simply accepted all demands. I don&#8217;t know the exact details about this contract, but I wouldn&#8217;t be surprised to hear that costs were well overrun.</p>
<p>There is a very simple analysis technique called the 5-Why. For any demand, ask why to expose the underlying demand. Repeat 5 times. I&#8217;m pretty certain most of the complexity from the above list came because citizens complained that they had to carry too much change around and had to pay for every transit. If you have a unified card to pay with, change is no longer an issue.</p>
<p>Getting non-technical people to write down their wishlist is a terrible idea in any system design. It prevents the supplier from being creative and proposing a better, more efficient, solution. Instead, they must build poor systems to meet the illogical needs of people without the depth of understanding required to design a solution.</p>
<p>A bit of traveling around to see what is done in the other metropolises of this world might have been useful as well. On my recent trip to London, I encountered the Oyster card. The concept is very similar to our OPUS, except that it was done a bit more efficiently.</p>
<ul>
<li>The card knows about pounds. Not tickets. You just load money on it and whichever transit you take will take what it needs.</li>
<li>It knows about your route and charges accordingly. If you take a bus, then the tube, it will know it&#8217;s the same trip and not charge twice. I think it even caps to the price of a “daily pass” at some point.</li>
<li>Pricing on the tube is based on distance: the card is required both to get in and to get out. Short distances are less expensive.</li>
<li>You can attach your credit card to it online and configure it to fill up when you go below a certain threshold, so you never have to worry about loading more money on the card.</li>
</ul>
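<p>To make the idea of a card that knows about money rather than tickets more concrete, here is a minimal sketch in Python. It is not how Oyster is actually implemented. The class name, the fares and the cap value are all made up for illustration, and amounts are in pence. It only shows that capping at the price of a daily pass is a small rule once the card stores a balance.</p>

```python
from dataclasses import dataclass


@dataclass
class StoredValueCard:
    """Hypothetical pay-as-you-go card holding money rather than tickets."""

    balance: int          # money on the card, in pence
    daily_cap: int        # assumed price of a daily pass, in pence
    spent_today: int = 0  # total charged so far today

    def charge(self, fare):
        """Deduct a fare without ever exceeding the daily cap in total."""
        remaining_before_cap = max(0, self.daily_cap - self.spent_today)
        amount = min(fare, remaining_before_cap)
        if amount > self.balance:
            raise ValueError("insufficient balance")
        self.balance -= amount
        self.spent_today += amount
        return amount


# Four trips at an assumed fare of 200 with an assumed cap of 500.
card = StoredValueCard(balance=2000, daily_cap=500)
charged = [card.charge(200) for _ in range(4)]
print(charged)       # [200, 200, 100, 0]
print(card.balance)  # 1500
```

<p>Once the cap is reached, further trips cost nothing for the rest of the day, which is the behavior a daily pass would have given without the rider having to choose it in advance.</p>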
<p>Of course, this might not all have been possible in Canada. Our privacy concerns are typically higher than those of the citizens of London. London has so many cameras, they wouldn&#8217;t really need you to have a card to know your travel&#8217;s itinerary and bill accordingly.</p>
<p>I don&#8217;t know exactly how it used to be before, but the system seems a lot better to me. At least, they made it simple to use and abstracted away the need for tickets.</p>
<p>Political issues just keep bleeding through the deployment of OPUS. Some details are not related to technical issues. They are just incomprehensible unless taken with some historical perspective. Coming back from visiting my family, I always had to pay for the metro in Laval even though I had a pass for Montreal and would spend most of the trip there. I could understand that. Even though the metro line connects, it&#8217;s actually a different network and I only paid for Montreal. They want to make sure the people living in the suburbs pay a bit more because their infrastructure ends up being more expensive.</p>
<p>Now that I had tickets, I figured they would be good from Laval too. The tickets are the exact same price. That did not work as expected. The tickets valid on the same line are different depending on where you purchase them. I did not try, but from prior experience, I would believe I cannot purchase both at the same time. Once again, this is because the two cities could not agree on who would pay the bill for the extension of the network, which was estimated under 200M and ended up costing 750M.</p>
<p>Don&#8217;t these administrators think about the madness they put citizens through with all their internal battles? Typically, when you invest a few million to improve something, it&#8217;s a good time to clean up the multiple hacks imposed by the previously incapable system you are trying to replace. They took the hacks and turned them into specs. Now we&#8217;ll have to live with them for the 20 years to come.</p>
<p>Within weeks of completing the deployment of OPUS, thousands of Bixi bikes were placed around the city at hundreds of stations. A complete revolution in urban transit, they claimed. The creators did their homework and verified how similar projects failed in other cities and managed to avoid errors from the past. There are enough stations scattered around the city (about every two blocks) to make it useful and they attempt to balance the bikes between the stations (with moderate success). However, the company seems to be enjoying success with deployments in a few smaller cities and projects for London and Boston. There is only one issue. Those bikes can&#8217;t be used with the OPUS card. So much for standardizing.</p>
<p>Let&#8217;s see what the future will bring to us with the re-introduction of the tramway downtown, the extension of the metro by 20km and the arrival of a train to the airport.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2009/08/30/the-opus-failure/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>From junior to senior</title>
		<link>http://blog.lphuberdeau.com/wordpress/2009/08/15/from-junior-to-senior/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2009/08/15/from-junior-to-senior/#comments</comments>
		<pubDate>Sat, 15 Aug 2009 10:56:59 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=263</guid>
		<description><![CDATA[It came to me as a realization in the last few days. There is really a huge gap between juniors and seniors as programmers. It&#8217;s commonly said that there is a 1:20 variation in our profession, but how much is really associated to experience? That alone can account for an order of magnitude. It&#8217;s not [...]]]></description>
			<content:encoded><![CDATA[<p>It came to me as a realization in the last few days. There is really a huge gap between juniors and seniors as programmers. It&#8217;s commonly said that there is a 1:20 variation in our profession, but how much is really associated to experience? That alone can account for an order of magnitude. It&#8217;s not a matter of not being smart. Juniors just waste time and don&#8217;t realize it. There are multiple reasons for it, but most of them are related to not asking for help.</p>
<p><strong>The specs are unclear, but they don&#8217;t ask questions.</strong> Specs are always unclear. Just discussing the issue with anyone can help clarify what has to be done. A junior will either sit idle and wait until he figures it out on his own or pick a random track and waste hours on it. In most cases, there is no right answer. It&#8217;s just a judgment call that can only be made by someone who knows the application thoroughly, understands the domain of application and the use cases. This takes years to learn, so just ask someone who knows.</p>
<p><strong>They don&#8217;t have the knowledge necessary to get over a hump.</strong> The natural response when they don&#8217;t understand a piece of code is to search Google for an entire day, even though someone a shout away could have provided the answer within 45 seconds. Sure, interruptions are bad and learning by yourself is a good skill to have, but wasting that amount of time is never a good deal. It takes a while to become comfortable with all the technologies involved in a non-trivial project.</p>
<p><strong>They spend hours on a problem while they could have solved more important ones in less time. </strong>Prioritizing work is probably the most important aspect of being productive. Especially when you work in an old application that has countless problems, at the end of the day, you still need to get your objectives done. At some point as a programmer, you need to trust &#8220;management&#8221; or &#8220;technical direction&#8221; that the tasks that were assigned are probably the ones that bring the most value to the project, regardless of what you stumble across along the way.</p>
<p>All of this can be solved fairly easily. Before you begin a task, figure out what the real objectives are, how much time you think it&#8217;s going to take and how much time those who assigned it think it&#8217;s going to take. Unless they are technical, most managers have no clue how long something is going to take, but aligning the expectations is a key to successful projects. If you think it&#8217;s a one week effort and they thought you would only need to spend 2 hours on it, it&#8217;s probably better to set the clocks straight and not begin at all.</p>
<p>Even when money is not involved, whether you are working for a governmental organization or for an open source project as a volunteer, time remains valuable. Not all bugs are equal. Not everything is worth spending time on and what should really be judged is the impact of the effort.</p>
<p>Even though most HR departments turned the concept of a time sheet into a joke by forcing all employees to report 40 hours of work per week, a real, detailed time sheet with tasks and how long they took to perform is a great tool for developers to improve their efficiency. Was all that time really worth spending?</p>
<p>At the end of each day, it&#8217;s a good thing to look back at what you worked on and ask yourself if it was really what you set out to do.</p>
<p>Once you&#8217;re <strong>half way</strong> through the allocated time, ask yourself if you&#8217;re still working on the real objectives. If you&#8217;re not, the solution is obvious: get back to it. If you&#8217;re still on the objective, but feel you are circling around, how about asking for help?</p>
<p>Once you&#8217;re <strong>past</strong> the allocated time, consider cutting your losses. Ask around. Maybe the request was just a nice-to-have that is really not worth spending too much time on. It may be more economical to assign it to someone else. Just keep people informed about progress and expectations. It allows direction to re-align and offer support. There is nothing wrong with admitting failure. It just saves time and money in most cases.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2009/08/15/from-junior-to-senior/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>State your intent</title>
		<link>http://blog.lphuberdeau.com/wordpress/2009/07/19/state-your-intent/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2009/07/19/state-your-intent/#comments</comments>
		<pubDate>Sun, 19 Jul 2009 19:18:47 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=260</guid>
		<description><![CDATA[It&#8217;s quite surprising how simply stating the intent allows for improvements. In the last few days, I have been rewriting the permission system for TikiWiki. One of the goals was to add consistency in the way it works with category permissions in regards to object and global permissions, the other was to improve performance.
Typically, listing [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s quite surprising how simply stating the intent allows for improvements. In the last few days, I have been rewriting the permission system for TikiWiki. One of the goals was to add consistency in the way it works with category permissions in regards to object and global permissions, the other was to improve performance.</p>
<p>Typically, code listing elements would contain a condition for each element to verify if the user is allowed to view the object. This seems perfectly reasonable and innocent. However, the implementation of that function grew to a few hundred lines over time, including function calls that all seemed innocent. They were not. The number of database queries performed to filter the list was unreasonable.</p>
<p>Was it lack of design, careless implementation or the desire to keep things simple? Certainly, a lot of re-use was made. The function to verify permission was so convenient that it could be used anywhere in the code. Everywhere a list had to be filtered, two lines were added in the loop to filter the data. Simple, innocent. Why would you bother placing that in a function? It&#8217;s just two lines of code. Plus! Placing it in a function would harm performance because it would require an additional loop.</p>
<p>Wrong.</p>
<p>The cost of the loop is irrelevant. Stating the intent allows for something much more powerful than just simplifying those loops all around. It allows for creating a better way to filter the list when that is what needs to be done: bulk loading the information required to resolve the permissions, reducing the number of queries required by an order of magnitude or two, and offering a sane point to cache results and avoid queries altogether.</p>
<p>Even a naive implementation with the terrible overhead of a loop would have been useful. Certainly, writing code to load large amounts of information in a single batch is harder. It requires understanding the relationships and the entire flow. However, if the right abstraction exists, stating the appropriate intent, optimizing the filtering routine can be done in a single place rather than requiring all the code to be updated.</p>
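<p>As a rough illustration, here is a minimal sketch in Python rather than the actual TikiWiki code, which is PHP. Everything in it is hypothetical: <code>FakeDatabase</code>, <code>can_view</code>, <code>viewable_ids</code> and <code>filter_viewable</code> are names invented for the example, and the query counter stands in for real database round trips. It compares the two-line check repeated in every loop with a single function that states the intent and is free to bulk load.</p>

```python
class FakeDatabase:
    """Stand-in for a real database that counts the queries it receives."""

    def __init__(self, viewable):
        self.viewable = set(viewable)  # (user, object_id) pairs allowed to view
        self.query_count = 0

    def can_view(self, user, object_id):
        self.query_count += 1          # one query per object
        return (user, object_id) in self.viewable

    def viewable_ids(self, user, object_ids):
        self.query_count += 1          # one query for the whole batch
        wanted = set(object_ids)
        return {oid for (u, oid) in self.viewable if u == user and oid in wanted}


def filter_viewable(db, user, object_ids):
    """State the intent: keep only the objects the user may view."""
    allowed = db.viewable_ids(user, object_ids)
    return [oid for oid in object_ids if oid in allowed]


db = FakeDatabase({("alice", 1), ("alice", 3), ("bob", 2)})
ids = [1, 2, 3, 4]

# Inline check repeated in every listing loop: one query per element.
inline = [oid for oid in ids if db.can_view("alice", oid)]
inline_queries = db.query_count

# Same result through the intent-revealing function: one query in total.
db.query_count = 0
bulk = filter_viewable(db, "alice", ids)
bulk_queries = db.query_count

print(inline, inline_queries)  # [1, 3] 4
print(bulk, bulk_queries)      # [1, 3] 1
```

<p>Callers only ever see <code>filter_viewable</code>, so caching or a smarter query can later be added inside it without touching any of the listing code.</p>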
<p>This is one of the pitfalls of object oriented programming, and even procedural programming for that matter. It&#8217;s very easy to create elegant abstractions that hide the implementation details. They are very easy to use, but when used in the wrong context, they create slow and bloated applications. The abstractions need to reflect the tasks to be performed, not be regrouped around what holds the tasks together.</p>
<p>Focus is too often placed on the implementation details and local optimizations, while the big differences are made at a much higher level by correctly sequencing the operations to be made. Once the right abstractions are in place, when the interfaces are defined, the actual implementation is irrelevant. If it turns out to be slow due to the database design, it can be changed without affecting the rest of the application. The intent of the code remains unchanged.</p>
<p>It also makes the code more readable. It becomes like reading an executive summary. It tells you the outline and the important pieces to remember, but it does not bury you with details. It provides a high level view that, in most cases, is what people need to understand. Sure, understanding abstractions is a different kind of gymnastics for the brain when you need to debug a piece of code, but most of the time, you can just read on and ignore the details.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2009/07/19/state-your-intent/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
