<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>L-P Huberdeau&#039;s blog</title>
	<atom:link href="http://blog.lphuberdeau.com/wordpress/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.lphuberdeau.com/wordpress</link>
	<description>Software engineering and anthropology, anecdotes, and more.</description>
	<lastBuildDate>Fri, 27 Jan 2012 15:52:44 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Experiment in the small</title>
		<link>http://blog.lphuberdeau.com/wordpress/2012/01/experiment-in-the-small/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2012/01/experiment-in-the-small/#comments</comments>
		<pubDate>Fri, 27 Jan 2012 15:52:44 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=406</guid>
		<description><![CDATA[Technology moves fast. Every week there are new frameworks and libraries. In the past years, it seems like data stores have been appearing at an even faster rate. Each of them claims to be a revolution. Those who have been around for a while know that revolutions don&#8217;t happen that often. Those claims set expectations [...]]]></description>
			<content:encoded><![CDATA[<p>Technology moves fast. Every week there are new frameworks and libraries. In the past years, it seems like data stores have been appearing at an even faster rate. Each of them claims to be a revolution. Those who have been around for a while know that revolutions don&#8217;t happen that often. Those claims set expectations very high.</p>
<p>Have you ever been in a situation where a new hire is given a walk-through of the projects and the structures, and the mentor can&#8217;t really explain how it all became such a mess? Where the mentor mentions technologies that were once promising and revolutionary, only to be left behind as shameful legacy?</p>
<p>Only time can test new technologies. What may look promising based on demos and samples may simply not scale to larger applications, or may cause maintenance burdens in the long term. I grew to be conservative when it comes to technologies. I still use PHP daily after all. I know it has flaws, but I also know it won&#8217;t fail me. Starting a new green field project is challenging. There are tons of decisions to be made. Tons of new and exciting toys to play with. However, trying to be too innovative hurts most of the time. Bleeding edge is a very well coined term. New technologies mean new problems to solve, which can be fun early on, but when you need to deliver and you start to hit limitations you were not aware of, the wasted time starts eating away at the benefits.</p>
<p>Immature products do not come with a huge body of knowledge and clear guidelines. You can use a great technology in a wrong way and create horrors. We have all seen some.</p>
<p>Of course, new technologies need to be adopted. In the long run, stable becomes obsolete. While I believe relational databases are not going away any time soon, those new data stores that used to be in the academic world will eventually mature and become mainstream. It started with all of the start-ups in the world using MongoDB or Cassandra or CouchDB, but this is not mainstream. Early adopters at best. I do try out new frameworks and databases on a regular basis, and enjoy it. However, I keep them outside of the critical path until I am confident that I understand the technology well enough.</p>
<p>There are plenty of places to experiment around projects. Perhaps a report needs to be built and SQL is not too suitable for it. A prototype for a new feature can be a good place to experiment as well. If it is to go straight into the main application, I take extra precautions. I prepare a contingency plan. I make sure there are good abstractions in place that allow me to replace it if anything goes wrong. If the technology does not allow me to abstract it away, its design is probably not elegant enough for me to use anyway. I always place maintainability above my desire to try new things, which can be hard.</p>
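<p>As a rough illustration of what such an abstraction could look like, here is a minimal Python sketch. Every name in it (<em>ReportStore</em>, <em>InMemoryStore</em>, <em>ExperimentalStore</em>, <em>monthly_total</em>) is hypothetical and made up for this example. The point is only that the application code depends on a small interface, so the experimental store can be swapped out if it disappoints.</p>

```python
from abc import ABC, abstractmethod


class ReportStore(ABC):
    """Hypothetical abstraction: the only surface the application depends on."""

    @abstractmethod
    def save(self, key, row):
        """Record one row under the given key."""

    @abstractmethod
    def find(self, key):
        """Return the list of rows recorded under the given key."""


class InMemoryStore(ReportStore):
    """Stand-in for the proven technology already in use."""

    def __init__(self):
        self._rows = {}

    def save(self, key, row):
        self._rows.setdefault(key, []).append(row)

    def find(self, key):
        return list(self._rows.get(key, []))


class ExperimentalStore(ReportStore):
    """Stand-in for the new data store being evaluated."""

    def __init__(self):
        self._log = []  # append-only log of (key, row) pairs

    def save(self, key, row):
        self._log.append((key, row))

    def find(self, key):
        return [row for k, row in self._log if k == key]


def monthly_total(store, month):
    """Application code: it only knows about the ReportStore interface."""
    return sum(row["amount"] for row in store.find(month))
```

<p>Because <em>monthly_total</em> never names a concrete store, replacing <em>ExperimentalStore</em> with <em>InMemoryStore</em> is a one-line change at the place where the store is created. That is the contingency plan.</p>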
<p>Experiments are supposed to fail once in a while. If you end up in a situation where everything you try is wonderful and you end up using it, there is something wrong with the evaluation process. Even more so if you experiment on the bleeding edge, with technologies out for a couple of weeks. Failures are not a bad thing. Most of the time, new technologies come as a whole package that you are supposed to either take or discard. However, most of the time, they are based on ideas that are simply less common. Ideas that you can take away and use to influence your designs.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2012/01/experiment-in-the-small/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Rushing to track</title>
		<link>http://blog.lphuberdeau.com/wordpress/2011/12/rushing-to-track/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2011/12/rushing-to-track/#comments</comments>
		<pubDate>Sat, 17 Dec 2011 16:25:45 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=403</guid>
		<description><![CDATA[Working on larger projects comes with the annoyances of project management. Every stakeholder sends in someone to check on the project status and make sure it is on track. It&#8217;s not really possible to blame them for keeping an eye on where their money is going, but it can be very counterproductive. Software projects [...]]]></description>
			<content:encoded><![CDATA[<p>Working on larger projects comes with the annoyances of project management. Every stakeholder sends in someone to check on the project status and make sure it is on track. It&#8217;s not really possible to blame them for keeping an eye on where their money is going, but it can be very counterproductive. Software projects are not like every other project. It&#8217;s not a production line and it is very hard to measure. Complexity is hidden, and so can progress be. A well run project will tackle the bigger risks first to avoid uncertainty near the end. This means that early on, progress may just be invisible. Outsiders like to see visible progress, but focusing on visual details just builds pretty prototypes. It looks complete, but it&#8217;s not.</p>
<p>Trying to please project managers rather than fulfilling the actual needs is a bad engineering practice, even though it creates successful projects on paper. Just as the vast majority will not remember who finished second, a pile of reports on projects finished on schedule is a manager&#8217;s pride. Reports don&#8217;t often state that the definition of <em>complete</em> was distorted. Trying to track the invisible causes dysfunction. At some point, you need to trust the people working on the project.</p>
<p>Estimates are hardly ever accurate on day 0. Until some work is done, it&#8217;s just impossible to know how long something is going to take. You can make a wild guess. Having worked on similar problems before helps in assessing the risks and adjusting the estimate, but when it&#8217;s brand new work integrating with unknown systems, starting to code is the only way to feel the resistance the system will offer. The risks need to be assessed. Unstable or undocumented APIs, incoherent data, undefined business logic and dozens of other factors can affect how long something will take. If you have committed to a tight deadline and the developers know it, knowing precisely how long tasks are going to take won&#8217;t save you. Letting them work might.</p>
<p>When the primary risks are tackled and something is in place, management becomes possible. Making a list of changes that deviate from the current state, the baseline, to reach a target is quite simple. With the primary risks out of the way, it should be possible to break down the list of tasks into fairly even sizes and manage by tracking velocity. Managing becomes what it should be: looking at the time available and prioritizing. Trying to manage by tracking velocity during inception, which ends when the primary risks are tackled, is just a waste of time. It leads to panic because velocity is not high enough early on, until, as if by magic, it sharply increases at some point in the project, which is when the actual construction starts. That is, if panic decisions did not screw up the project.</p>
<p>You can plan all you want, but software has its own agenda. The problem space defines how long inception and elaboration will be, not a schedule on the wall.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2011/12/rushing-to-track/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Chicken and eggs</title>
		<link>http://blog.lphuberdeau.com/wordpress/2011/08/chicken-and-eggs/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2011/08/chicken-and-eggs/#comments</comments>
		<pubDate>Wed, 24 Aug 2011 13:15:47 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=400</guid>
		<description><![CDATA[Developing new features in the open source world is a long process. Not because coding takes time, but because the maturation cycle is much longer. In a normal business development cycle, the specifications are usually quite clear and they will be validated before a release by QA. In most cases I encounter, the initial need [...]]]></description>
			<content:encoded><![CDATA[<p>Developing new features in the open source world is a long process. Not because coding takes time, but because the maturation cycle is much longer. In a normal business development cycle, the specifications are usually quite clear and they will be validated before a release by QA. In most cases I encounter, the initial need is driven by a specific case, but due to the open nature, the implementation must eventually cover broader cases, driven by feature requests or stories from other users.</p>
<p>The main issue is that those additional cases cannot be validated right away. Even if you contact people directly, it&#8217;s unlikely that they will get a development version to test with. Validation will have to wait until the software is released, and they might not even test as soon as the software comes out. With a 6-month release schedule for major changes, that means that the use case validation will take 6 to 12 months.</p>
<p>When the feedback finally arrives, changes are often needed. They are usually not very large changes: small changes to the user interface to include existing capabilities, minor bug fixes or other issues that take less than 2 hours to resolve. Some say that figuring out the problem is half of the job. In this case, finding the issue consumes 99% of the schedule. However, fixing it is not the end of it. For re-validation, a release still needs to happen. It might be in a minor release depending on the moment of the fix, which may be a month away in the best cases. Still, the story is not over, as yet more issues may be found.</p>
<p>The reason it takes so long is that development is made for preemptive needs rather than immediate needs. They are nice-to-have features, but not having them is not a show stopper or they would not be using the software. Alternatively, it may be a show stopper, in which case they are not using the software at all and use something else in the meantime.</p>
<p>This is still in the best of cases, as some people will just try it and declare it broken, stick to their old ways and never signal an issue. In their minds, the feature remains broken forever and they will stay away from it. They might come back much later once the feature has matured. Because they have a work-around, they won&#8217;t ever feel the urge to transition, and the longer it takes, the harder it will be as the work-around probably uses some techniques that are not as clean and slightly corrupt the data structure.</p>
<p>Assuming the feature is useful enough for a critical mass to try it and report issues, it can easily take 2 years for a feature to go from functional to mature and broadly usable. It is a long time. This is for a feature that really worked from the start, had known behaviors, documentation, unit testing and all of what you would expect from production-ready code. It still takes years.</p>
<p>The only way to speed up the process is to find some other users with critical needs that will have a detailed case to resolve. Most of the time, they will not even know they can hook into some existing functionality. Getting a handful of those users who will be brave enough to install a development version and actively test for their use case can cut the maturation process in half. Every time an issue is resolved (in a good way, not a dirty hack), it unlocks many more use cases and allows for more improvements. That&#8217;s when the feature becomes first-class.</p>
<p>Faster iteration is the key. If your organization uses a waterfall process or has a distant QA team that does not work closely with the developers, the same issue is likely to hit you. If you can&#8217;t live with the long maturation process the way an open source project can, you need to plan for it and manage it as a risk. Don&#8217;t wait until the week before the release to tie up loose ends. Make sure the code is more than just a proof of concept early.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2011/08/chicken-and-eggs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Contributing organizations</title>
		<link>http://blog.lphuberdeau.com/wordpress/2011/04/contributing-organizations/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2011/04/contributing-organizations/#comments</comments>
		<pubDate>Thu, 21 Apr 2011 20:25:52 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=305</guid>
		<description><![CDATA[Originally written in 2009, but never published. Conclusion was reworked. The open source model is widely different from the typical business plan. There used to be a time when contributors were volunteers, working for passion and love of the project in their free time. These days, I feel most of the contributors to projects make [...]]]></description>
			<content:encoded><![CDATA[<p><em>Originally written in 2009, but never published. Conclusion was reworked.</em></p>
<p>The open source model is widely different from the typical business plan. There used to be a time when contributors were volunteers, working for passion and love of the project in their free time. These days, I feel most of the contributors to projects make a living out of it. I do it. There is nothing wrong with that. In fact, it&#8217;s a good thing. It&#8217;s a lot more sustainable. Everyone needs to make a living, so the odds of losing a contributor because the company he works for just got bought out (or is about to be), and he has to work 70-hour weeks and burns out, are much lower. Contribution can become a top priority for a reasonable number of work hours per week. The Linux kernel was probably one of the first examples where all the top contributors ended up being paid by companies to do it.</p>
<p>As an individual, I find it makes a lot of sense to focus on open source software. There is nothing that I hate doing more than writing the same code twice. Building from an open platform, and contributing back, allows me to avoid writing the same thing twice and prevents me from <em>ever</em> writing some code. I&#8217;m mostly a developer. I don&#8217;t do much of the applying for contracts and filling specific needs. Tried it before. Not a happy place for me, even if basing it on open solutions. I&#8217;d rather leave the pressure to deliver on someone else. Turns out many companies out there see open source as a business opportunity. They can build on an existing product to deliver more value fast. I get hired to take them further while they handle the day to day problems. It&#8217;s a good deal for both sides.</p>
<p>There is, however, a major difference between a company serving real tangible clients and the open source world. After working for several clients, I can see a major difference between the successful ones and those that barely stay above water. It turns out it probably differentiates successful from unsuccessful in any field. It&#8217;s called vision. The good companies see ahead. They anticipate problems and make sure they are resolved before the client meets them, at least the fundamental ones. Unsuccessful companies just keep fighting fires and push down the problems and hope for someone to resolve them right away.</p>
<p>A while back, I read one of <a href="http://www.inc.com/magazine/20091201/when-and-how-to-micromanage.html">Joel&#8217;s articles</a> on micromanagement. Apart from the conference organizer&#8217;s inside jokes about terrible WIFI access in conference centers (and the great advice to make sure they give it for free if it does not handle the load &#8212; which is always), the following passage made me smile:</p>
<blockquote><p>At the top of every company, there&#8217;s at least one person who really cares and really wants the product and the customer experience to be great. That&#8217;s you, and me, and Ryan. Below that person, there are layers of people, many of whom are equally dedicated and equally talented.</p>
<p>But at some point as you work your way through an organization, you find pockets of people who don&#8217;t care that much.</p></blockquote>
<p>Having spent most of my time as either an employee (or technically an intern, as I never really had a full time job otherwise) or a consultant working in fairly small organizations, I have been kept closer to developers than to the management types. I can say without hesitation that the people down the ladder mostly blame high level management for getting in their way and preventing them from doing their job. Both are probably right. However, I still find Joel&#8217;s wording a bit harsh.</p>
<p>To function correctly with open source and make the relationship efficient, you need to embrace it. Companies trying to make it a one way relationship ended up failing. In open source, the project extends beyond the company. Every line of code you don&#8217;t contribute back is one line you will have to maintain yourself. At some point, it will simply break. Upgrading to obtain the later versions will become harder. At which point, you had better hope you have killer traceability to go back to the original issue, because you will have to implement it over again. With a high turnover, it might just kill the company, and the community (and the consultants who are part of it) might not be motivated to help you out if you did not contribute back when times were good.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2011/04/contributing-organizations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Refactoring sprint</title>
		<link>http://blog.lphuberdeau.com/wordpress/2011/03/refactoring-sprint/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2011/03/refactoring-sprint/#comments</comments>
		<pubDate>Mon, 21 Mar 2011 14:34:34 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=396</guid>
		<description><![CDATA[I spent the last week in Boston to tackle the refactoring of Tiki trackers with other developers. The code was getting old and had evolved in ways no one would recommend. The original author himself had qualified them as a hack. Yet, hundreds of people use them extensively and the interface had been polished over [...]]]></description>
			<content:encoded><![CDATA[<p>I spent the last week in Boston to tackle the refactoring of Tiki trackers with other developers. The code was getting old and had evolved in ways no one would recommend. The original author himself had qualified them as a hack. Yet, hundreds of people use them extensively and the interface had been polished over the years. The main issue is that the cruft to value ratio was reaching a tipping point. The collapse had been predicted for a long time but did not happen. Modifications only took longer to perform, leaving more cruft to remove each time. Worse, no one dared to get close to that code. Few had enough courage to modify it.</p>
<p>Before leaving for Boston to tackle the issue, I had gone through the code and cleaned up parts of it, making conditions more explicit and code slightly more expressive. The objective was not as much to improve the code as to understand the raw design underneath it. I think all software has a design; only too often, it is unintentional and hidden. In those cases, refactoring is initially about making the current design explicit. Once that is done, it can be refactored further and improved to match new requirements. When initially setting the goals for the week, saying we&#8217;re not going to fix any issues, add new features or otherwise solve everyone&#8217;s own favorite issue is a tough sell.</p>
<p><strong>A successful refactoring sprint is about discipline.</strong> The group must stay away from distractions and concentrate on the task. Our task was to extract the field rendering and input logic into cohesive units. The initial input was composed of a few files between 1KLOC and 6KLOC, containing around 40 field types being handled, all mashed together. Some lines were between 500 and 700 characters long. Some parts were duplicated in multiple locations, with the mandatory differences that make them hard to reconcile. Removing that duplication was one of the primary objectives. It&#8217;s challenging. I don&#8217;t think anyone thought it was possible when we began, but it had to be done.</p>
<p>Figuring out where to begin is not easy. Initially, you can&#8217;t even get everyone working. Even if the problem had a natural separation with the field types, the code would fight back when too many people worked on it. At first, an initial interface had to be defined as the target to reach. Then it had to be plugged in. Essentially, it comes down to <em>if we have a handler to do it, use it, otherwise, fall back to whatever was there before</em>. Those kinds of hooks had to be deployed in many places, but we began with one, working on a few handlers to see how it worked out and learning about the design.</p>
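<p>To make the shape of such a hook concrete, here is a minimal sketch in Python. It is not the actual Tiki code, which is PHP, and the names (<em>legacy_render</em>, <em>TextHandler</em>, <em>HANDLERS</em>, <em>render_field</em>) are invented for the illustration. It only shows the rule above: use a handler when one exists, otherwise run the old code path untouched.</p>

```python
def legacy_render(field):
    """Stand-in for the old monolithic rendering code (hypothetical)."""
    return "legacy:%s=%s" % (field["type"], field["value"])


class TextHandler:
    """One extracted, cohesive unit for a single field type."""

    def render(self, field):
        return "text:" + field["value"]


# Field types that have been migrated so far. It starts small and grows
# as more handlers get written during the sprint.
HANDLERS = {"text": TextHandler()}


def render_field(field):
    """The hook: prefer a handler, fall back to what was there before."""
    handler = HANDLERS.get(field["type"])
    if handler is not None:
        return handler.render(field)
    return legacy_render(field)
```

<p>With a hook like this, each field type can be migrated on its own, and the application keeps working after every step because anything not yet migrated still goes through the old code.</p>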
<p>As more handlers got created, more hooks were added in other places, leading to revisiting the previous ones. It&#8217;s a highly iterative process. I made the first iteration alone and others were introduced gradually. Everyone&#8217;s first handler took a whole day. It was much more than my most pessimistic estimates. There was a lot to learn. However, as each of us understood the design of the code and the patterns to be found, the pace accelerated. We could see those huge files melting. Each step of the way, it became easier. Anyway, that was the feeling.</p>
<p>Then someone asked how far along we were. I pulled out a white board and made a list of the field types that were still to be done. The initial list came as a disappointment. The list was still long. We were only half way and way past the week&#8217;s mid-mark. However, past the initial disappointment, having the list visible ended up being a motivator, because each one that was completed made the list shorter. It encouraged us to fully complete the handlers rather than leaving dangling issues.</p>
<p>We ended up completing on the last evening. This was a one week burn. The last few hours were hard for everyone. After spending a week working long hours on challenging code, I don&#8217;t think we could have accomplished more than we did. However, there was great satisfaction. The refactoring process is not completed. One of the issues was tackled, but there are other areas of the code that need to be worked on. However, the bulk of the job was done as a team effort, and now there are stronger grounds to build from. No one could have done it alone.</p>
<p>It should be noted that the week was not only hard work. It was also a social event where non-coding contributors of the community and users were welcome to stop by and chat. There were late night discussions around beers, leading to even less sleep, and the whole week was a great team building experience. While we were shuffling thousands of lines of code around, the documentation team also re-organized the structure of the documentation.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2011/03/refactoring-sprint/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Where ugly code lies</title>
		<link>http://blog.lphuberdeau.com/wordpress/2011/01/where-ugly-code-lies/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2011/01/where-ugly-code-lies/#comments</comments>
		<pubDate>Sat, 22 Jan 2011 21:14:51 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=392</guid>
		<description><![CDATA[There are multiple definitions of what software architecture is, notwithstanding that in some areas, the term cannot legally be used. Definitions vary from high level code design to organizational issues. James O. Coplien and Gertrud Bjørnvig came up with a good summary in Lean Architecture. Architecture is more about form than structure. Implementation details are [...]]]></description>
			<content:encoded><![CDATA[<p>There are multiple definitions of what software architecture is, notwithstanding that in some areas, the term cannot legally be used. Definitions vary from high level code design to organizational issues. James O. Coplien and Gertrud Bjørnvig came up with a good summary in <em>Lean Architecture</em>.</p>
<ul>
<li><em>Architecture is more about form than structure.</em> Implementation details are not in the scope. Interfaces, classes and design paradigms are not even considered. Only the general form of the final solution is.</li>
<li><em>Architecture is more about compression than abstraction.</em> Abstractions are in code. They are detailed. The architecture part of the work is about larger scale partitioning into well named patterns, which may have multiple implementations.</li>
<li><em>Much architecture is not about solving user problems.</em> While I don&#8217;t fully agree with this one, it&#8217;s true that most users will not see the changes right away.</li>
</ul>
<p>These are high level concepts that have a huge impact on the code. The partitioning that results determines how additions are, and will be, made to the code. There is a direct relationship between the software design and the API available to implementers.</p>
<p>When I look at code in software designed using good techniques, there is typically a clear distinction between some core managing the general process and the extensions following interfaces that are called at the appropriate time. When you look at code inside the core, it really does not seem to do much. There are usually a few strange incantations to call the extension points efficiently and massage the information sent through. The code is not really pretty, but the structure it represents is clean. The glue has to go somewhere.</p>
<p>The leaves, or extension points, are typically a complete jungle. Contortions must be made to fit the interface as mapping is done to an external system. Some pieces were written quickly to serve a pressing matter, fell into a technical debt backlog and eventually out of sight. Code is duplicated around, taken from the older generations to the new ones, evolving over time, except that the ancestors stay around and never get the improvements from the new generations. Quality varies widely, as do the implementers&#8217; abilities and experience, but all of the components are isolated and do not cause trouble&#8230; most of the time.</p>
<p>Seeing how code rots in controlled environments, I&#8217;m always a bit scared when I see a developer searching the web and grabbing random plugins for an open source platform and including them in the code. Disregarding the license issues that are almost never studied, that practice is plain dangerous. There are <a href="http://www.wikisym.org/ws2008/proceedings/research%20papers/18500009.pdf">security implications</a>. Most developers publishing plugins are not malicious, they are simply ignorant of the flaws they introduce.</p>
<p>jQuery is probably the flagship in the category of quality core containing arcane incantations and the jungle of plugins. Surely, having 50,000 plugins may seem like a selling point, but consider that most of them are lesser duplicates of other ones. Code quality is appallingly low. In most cases, it takes less than 30 seconds to realize they were written by people (self-proclaimed experts) who knew nothing of jQuery, just enough JavaScript to smash together a piece of functionality, and who branded it as a jQuery plugin for popularity&#8217;s sake while following a tutorial. Never use a plugin without auditing the code first.</p>
<p>Even if good care is taken to control the leaves, ugly code will appear all over. There is no other solution than to go back and add the missing abstractions. Provide the additional tools needed to handle the frequent problems that were duplicated all around. No amount of planning will predict the detailed needs of those extension points. What allows architecture to work is <em>compression</em>, to be able to skip details so the system can be understood as a whole. The job is not done when the core is in place. Some time must be allocated to watching the emergence of patterns and to responding to them, either by modifying the core or providing use-at-will components. It can be done in multiple steps too.</p>
<p>Recently, I was asked to do a lot of high level refactoring in Tiki. Major components had systemic flaws known for a while and many determined they had to be attacked after the release of the long term support version. High level work has several impacts, but sometimes, just providing low level tools can improve the platform significantly. Cleaner code will make the high level changes easier to perform. It only takes a few hours to run through several thousand lines of code and identify commonalities that could be extracted. Extract them, deploy them around. Iterate. Automated tests to support those changes would be nice, but most of the time, those changes are so low level that they are almost impossible to get wrong.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2011/01/where-ugly-code-lies/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Sensible defaults</title>
		<link>http://blog.lphuberdeau.com/wordpress/2010/12/sensible-defaults/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2010/12/sensible-defaults/#comments</comments>
		<pubDate>Sun, 19 Dec 2010 19:32:12 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=390</guid>
		<description><![CDATA[I have not written much C++ in my life. Most of it goes back to college and university, and that short period of time I was at Autodesk. However, I always considered the STL to be very influential. A few years back, I read Bjarne Stroustrup&#8217;s book and something hit me. For those not familiar [...]]]></description>
			<content:encoded><![CDATA[<p>I have not written much C++ in my life. Most of it goes back to college and university, and to the short period I spent at Autodesk. However, I have always considered the STL to be very influential. A few years back, I read <a href="http://blog.lphuberdeau.com/wordpress/library/#0201700735">Bjarne Stroustrup&#8217;s book</a> and something hit me. For those not familiar with the STL, it&#8217;s a very template-intensive library that makes little use of the traditional interfaces you typically see in object-oriented libraries. Instead, everything is based on duck typing. If the argument you pass provides the right set of operators or methods, the compiler will do the job and go ahead with it. If it does not, the compiler will print out a few dozen lines of garbage with angle brackets all around, which do not relate much to your code. Still, one of the core concepts in the library is that if an operation is efficient, it will have an operator to do it. If it&#8217;s not, it will have a function name that&#8217;s rather long, which encourages the use of efficient operations. How does this work in reality? A vector provides direct value access through square brackets, pretending to be an array, and a linked list does not, because that would be O(n) and that&#8217;s not something they want to encourage.</p>
<p>The <a href="http://www.sgi.com/tech/stl/queue.html#3">documentation</a> also contains hints like this one:</p>
<blockquote><p>[3] One might wonder why <tt>pop()</tt> returns <tt>void</tt>, instead of <tt>value_type</tt>.  That is, why must one use <tt>front()</tt> and <tt>pop()</tt> to examine and remove the element at the front of the <tt>queue</tt>, instead of combining the two in a single member function?  In fact, there is a good reason for this design.  If <tt>pop()</tt> returned the front element, it would have to return by value rather than by reference: return by reference would create a dangling pointer.  Return by value, however, is inefficient: it involves at least one redundant copy constructor call.  Since it is impossible for <tt>pop()</tt> to return a value in such a way as to be both efficient and correct, it is more sensible for it to return no value at all and to require clients to use <tt>front()</tt> to inspect the value at the front of the <tt>queue</tt>.</p></blockquote>
<p>Not all libraries in the world are so careful. Take this snippet from the Zend Search Lucene documentation:</p>
<blockquote><p>Java Lucene uses the &#8216;contents&#8217; field as a default field to search.             Zend_Search_Lucene searches through all fields by default, but             the behavior is configurable. See the <a href="http://framework.zend.com/manual/en/zend.search.lucene.query-language.html#zend.search.lucene.query-language.fields">&#8220;Default search field&#8221;</a> chapter for details.</p></blockquote>
<p>When I came across this while reading the documentation to build a fairly complex index, I thought searching all fields was a reasonable default. It&#8217;s very convenient. I could store my content in the fields where it belongs, make sure everything is searchable by default and still allow for finer-grained search when required. Fantastic. I went ahead with it, and wrote the code to collect the information and index it properly, along with the query abstraction, in good old TDD fashion. Everything worked. Of course, I was testing on very small data sets.</p>
<p>I then went ahead to test with larger data sets. I had an old back-up of a site with around 2000 documents in it. It felt like a decent test. I expected the indexing to take around half an hour for that type of data based on what I had read online. The search component had not been selected for speed, it had been selected for portability. Speed was one of those sacrificed attributes as long as it was not too terrible. Of course, the initial indexing took longer than expected, but only by a factor of 2, and I knew some places were not fully optimized yet (it&#8217;s now down to 20-25 minutes).</p>
<p>The really big surprise came as I attempted to make a simple search in the index. It timed out. After 60 seconds. Initial attempts at profiling failed as it was getting late on a Friday afternoon. I closed up shop and had a bit of trouble getting it out of my head that night.</p>
<p>When I got back to it, I took out the time limit, started a profiling session and enjoyed my coffee for a little while. The results indicated that the search spent pretty much all of its time optimizing the search query. It was making tens of thousands of calls to a few functions that eventually read from disk. There was not much more in the report to help me. I started adding some var_dumps in the code to see what was going on. Well, it turns out that &#8220;search all fields by default&#8221; was not such a great idea. It expanded the query so that every term was searched in every field. Because of how I interpreted the API and documentation, I had built my index to be quite expanded and it contained a few dozen fields, not all of which existed for all documents. It was a mistake. There was one valid reason why the Java implementation did not behave that way: it was not possible to do it efficiently.</p>
<p>I ended up modifying the index generation to put all content in a <em>contents</em> field, duplicating the content you would actually want to search independently in its own fields, and searching <em>contents</em> by default. Indexing time wasn&#8217;t altered by much, the code changes were actually very minor and easy thanks to the array of tests available, and search speed went up dramatically. It&#8217;s not as fast as Sphinx for sure, but it does offer decent performance and can run on pretty much any kind of cheap hosting, which is a good feature for an open source CMS. It still needs to be investigated, but it&#8217;s also likely to be a smooth upgrade path to using Solr for larger installations. Abstractions around the indexing and searching will also allow a quick move to other index engines as needed.</p>
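<p>To make the cost concrete, here is a toy model of the difference, written in Python rather than the PHP of the real code. It is only a sketch: the function names are made up for illustration and it does not reproduce how Zend_Search_Lucene rewrites queries internally. It simply counts how many field and term pairs have to be resolved in each approach, and shows the duplication into a catch-all field.</p>

```python
# Toy model only: none of these names come from Zend_Search_Lucene.

def expand_all_fields(terms, fields):
    """Rewrite a field-less query into one clause per (field, term) pair."""
    return [(field, term) for term in terms for field in fields]


def contents_only(terms):
    """Search a single catch-all field, as Java Lucene does by default."""
    return [("contents", term) for term in terms]


def build_document(values):
    """Copy every field value into 'contents' while keeping the originals."""
    doc = dict(values)
    doc["contents"] = " ".join(values.values())
    return doc


fields = ["field_%d" % i for i in range(36)]   # stands in for "a few dozen" fields
terms = ["wiki", "search", "index"]

print(len(expand_all_fields(terms, fields)))   # 108 clauses to resolve
print(len(contents_only(terms)))               # 3 clauses to resolve
print(build_document({"title": "Sensible defaults", "body": "STL and Lucene"}))
```

<p>The number of clauses grows with the number of fields in the first case and stays flat in the second, which is the whole point of the catch-all field.</p>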
<p>Asymptotic analysis is one really boring part of the computer science curriculum, but it&#8217;s really something to consider when building libraries that are going to be used with large numbers of documents. The API needs to reflect the limitations and the documentation must explain them clearly.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2010/12/sensible-defaults/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>In and out of hot water</title>
		<link>http://blog.lphuberdeau.com/wordpress/2010/10/in-and-out-of-hot-water/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2010/10/in-and-out-of-hot-water/#comments</comments>
		<pubDate>Wed, 27 Oct 2010 15:55:37 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=388</guid>
		<description><![CDATA[Some problems are just too obvious. Blatantly unacceptable. Yet, we live with them because we grow accustomed to them. Dozens of people work on a project and solve small issues in the code by patching things up. That&#8217;s just the way it&#8217;s done. We all do it not thinking there is an other way. The [...]]]></description>
			<content:encoded><![CDATA[<p>Some problems are just too obvious. Blatantly unacceptable. Yet, we live with them because we grow accustomed to them. Dozens of people work on a project and solve small issues in the code by patching things up. That&#8217;s just the way it&#8217;s done. We all do it, not thinking there is another way. The code piles up, duplication creeps in and inconsistency spreads. It happens so gradually that no one notices. The only solution is to gain perspective. Recently, I got away from a project I had been with for years. The official time away was really only 3 months. I can&#8217;t say I have significantly more knowledge about software right now than I had 3 months ago. Yet, as I came back, I came across a tiny issue. It really was just another one-liner fix. One of those I had done countless times before. However, this time, it just felt unacceptable. What I saw was not the small issue to be fixed, it was the systemic problem affecting the whole. I can fix that once and for all.</p>
<p>Knowing the ins and outs of a project has some advantages. Changes can be made really quickly because you know exactly where to look. When you see a piece of code, you can tell from the style who wrote it and ask the right questions. You know the history of how it came to be and why it&#8217;s like that. Sadly, you also know why it won&#8217;t change anytime soon. Or can it? Given some amount of effort, anything about software can change. We&#8217;re only too busy with our daily work to realize it.</p>
<p>The same effect applies at different levels. People arriving in a new project have the gift of ignorance. They don&#8217;t see the reasons. They see what is wrong to them and how it should be for them. Of course, this often comes out in a harsh way for those who spent a lot of effort making it as good as it is, with all the flaws it has. Anyone who has worked on a project for a significant amount of time and then demonstrated it to others knows that reality can be hard. For all the good advice newcomers can have, what they don&#8217;t have is the trust of the community and the knowledge required to foresee all the repercussions a change can have. Even so, they do tend to see major issues that no one had any intention of solving.</p>
<p>At a much smaller level, it&#8217;s not much different than closing a coding session at the end of the day in the middle of trying to resolve an impossible bug. The next morning, the cause just magically appears. This can be attributed to being well rested and sharper, but it might also just be about letting time pass. Deciding to stop searching is a tough call. You&#8217;re always 15 minutes away from finding the solution when debugging, until you realize the hypothesis was wrong. It takes a fair amount of self-confidence to know you can take the night off and that you will still be able to fix the issue in time for a deadline. Sure, getting back into it will take some time. It requires rebuilding the mental model of the problem, but rebuilding might just be what&#8217;s needed when you are on the wrong track.</p>
<p>This is nothing new. It has been written many times that code reviews are more effective when done a few days later or by someone else entirely. Wiegers indicates that when the work is too fresh in your memory, you recite it based on what it should be rather than what it is, hence you don&#8217;t really review it. What if the same holds when you search for a problem, whether a localized bug or a systemic issue that requires major work? Unless you have taken enough distance to clear your assumptions, there may be no way to find it short of blind luck.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2010/10/in-and-out-of-hot-water/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How not to fail miserably with system integration</title>
		<link>http://blog.lphuberdeau.com/wordpress/2010/09/how-not-to-fail-miserably-with-system-integration/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2010/09/how-not-to-fail-miserably-with-system-integration/#comments</comments>
		<pubDate>Tue, 28 Sep 2010 23:35:20 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=385</guid>
		<description><![CDATA[In the last few weeks, I&#8217;ve had the opportunity to get my hands in an SOA system involving multiple components. I had prior experience, but not with that many evolving services at the same time. When I initially read the architecture&#8217;s documentation, I had serious doubts any of it was going to work. There were [...]]]></description>
			<content:encoded><![CDATA[<p>In the last few weeks, I&#8217;ve had the opportunity to get my hands on an SOA system involving multiple components. I had prior experience, but not with that many evolving services at the same time. When I initially read the architecture&#8217;s documentation, I had serious doubts any of it was going to work. There were way too many components and each of them had to do a whole lot of work to get the simplest task accomplished. My role was to sit on the outside and call those services, so I did not really have to worry about the internals. Still, unless you&#8217;re working with Grade A services provided by reputable companies and used by tens of thousands of programmers, knowing the internals will save a few pains.</p>
<h3>Make sure you can work without them</h3>
<p>I actually got to this one by accident, but it happened to be a good one. When I joined the team, there were plenty of services available. However, the entry point I needed in order to use any of them was not present at all. I had a vague idea of what the data would look like, so I started building static structures that would be sent off to the views to render the pages. I had a bit of a head start before the integrators joined the team and certainly did not want them to wait around for weeks until the service was delivered.</p>
<p>At some point, the said service actually entered someone&#8217;s iteration, which meant it would be delivered in the near future. Fairly quickly, a contract was made for the exact data format that would be provided. Although I was not wrong about the values that would be provided, the format was entirely different. My initial format then became an intermediate format for the view, providing the strict minimum required, and a layer was added in the system to translate the service&#8217;s format to our own. The service was not yet available, so it was just stubbed out. The conversion could be unit tested based on the static data in the service stub. Plugging in the service when it arrived went like a charm and, except for a few environment configurations, it was transparent to integrators.</p>
<p>During the entire development, this fake data ended up being very useful.</p>
<ul>
<li>Whenever the services changed, there was an easy way to compare the expectations to what we actually received, allowing us to update the stub data and the intermediate layer.</li>
<li>When something went terribly wrong and services just failed, it was always possible to revert to the fake data and keep working.</li>
<li>It allowed us to reach code paths that would not normally be reached, like error handling.</li>
</ul>
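<p>A minimal sketch of that arrangement, in Python for brevity. Every name and field here is hypothetical, since the real contract is not part of this post: a stub returns static data in the service&#8217;s format, and a small translation function converts it to the intermediate format the views consume.</p>

```python
# Hypothetical names throughout; the real service contract is not shown in the post.

class StubProfileService:
    """Stands in for the remote service and returns static data in its format."""

    def fetch(self, user_id):
        return {"Id": user_id, "Names": {"First": "Ada", "Last": "Lovelace"}}


def to_view_format(payload):
    """Translate the service format into the minimal structure the views need."""
    names = payload["Names"]
    return {"id": payload["Id"], "display_name": names["First"] + " " + names["Last"]}


def load_profile(service, user_id):
    """Views only ever see the intermediate format, never the raw payload."""
    return to_view_format(service.fetch(user_id))


print(load_profile(StubProfileService(), 42))
```

<p>Swapping the stub for a real client that exposes the same <tt>fetch</tt> method leaves the views untouched, and <tt>to_view_format</tt> can be unit tested against the static data alone.</p>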
<h3>Expect them to fail</h3>
<p>One of the holy grails with SOA is that you can replace services on the fly and adjust capacity when needed. This may be partially true, but it also means that while your component works fine, the neighbor may be completely unreachable during maintenance or while restarting. If you happen to need it, you might as well be down in many cases. While one would hope services won&#8217;t crash in production, they happen to crash in development fairly often. To live with this, there is one simple rule: <em>expect every single call to fail, and make a conscious decision about what to do about it.</em> For one, it will make your system fault tolerant. If the only call that failed was fetching some information that is merely displayed on the page, it&#8217;s very likely that you don&#8217;t need to die with a 500 internal error. However, if you didn&#8217;t expect and handle the failure, that&#8217;s what will happen.</p>
<p>Adding this level of error handling does add a significant cost to the development. It takes a lot of code and a lot of thought. Live with it or reconsider your SOA strategy.</p>
<p>Early on, adding the try/catch blocks wasn&#8217;t much of a reflex. After all, you can write code that works without them and PHP sure won&#8217;t tell you that you forgot one. When the first crashes occurred in development, we still had few services integrated. Service interruptions just became worse as we integrated more. What really pushed us towards more granularity in catching exceptions was not so much the ungraceful way in which they would break the system as the waste of time. With a few team members pushing features in a system, a 15-minute interruption may not seem like much, but it&#8217;s enough to break the flow, which is hard enough to get into in an office environment. Especially when the service that breaks has nothing to do with the task you&#8217;re on at the moment.</p>
<p>It does not take much.</p>
<ul>
<li>When fetching information, have a fall-back for missing information.</li>
<li>When processing information, make sure you give a proper notification when something goes wrong.</li>
<li>Log all failures and have at least a subtle indication for the end-user that something went wrong and that logs should be verified before bothering someone else.</li>
</ul>
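<p>The three points above can be sketched as follows. The project itself was in PHP with try/catch blocks; this is a Python illustration under assumptions, and <tt>ServiceUnavailable</tt>, <tt>fetch_with_fallback</tt> and the notice text are invented for the example.</p>

```python
import logging

logger = logging.getLogger("services")


class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) client when a remote call fails."""


def fetch_with_fallback(call, fallback, label):
    """Run a remote call; on failure, log it and return the fallback value."""
    try:
        return call(), True
    except ServiceUnavailable:
        # Log the full traceback so the cause can be found without asking around.
        logger.exception("remote call failed: %s", label)
        return fallback, False


def broken_call():
    """Simulates a service that is down."""
    raise ServiceUnavailable("connection refused")


related, ok = fetch_with_fallback(broken_call, [], "related articles")
# A subtle hint for the end-user instead of a 500 error page.
notice = "" if ok else "Some information could not be loaded; check the logs."
print(related, notice)
```

<p>Catching only the specific exception is a deliberate choice in this sketch: programming errors still surface loudly, while an unreachable neighbor degrades the page instead of killing it.</p>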
<h3>Build for testability</h3>
<p>Services live independently in their own memory space with their own states. They just won&#8217;t reset when your new unit test begins, making them a pain to test. However, that is far from being an excuse for not automating at least some tests. Every shortcut made will come back to haunt you, or someone else on the team. It&#8217;s very likely that you won&#8217;t be able to get the remote system in every possible state to test all combinations. Even if you could, the suite would most likely take very long to execute, leading to poor feedback. Mocks and stubs can get you a long way just making sure your code makes the correct calls in sequence (when it matters), passing the right values and stopping correctly when an error occurs. That alone should give some confidence.</p>
<p>To be able to check all calls made, we ended up with an interface defining all possible remote calls, with the exact same parameters and return values as the remote systems. There was a lot of refactoring to get to the solution. Essentially, every single time an attempt was made to group some calls because they were called at the same time and shared parameters, or because it was too much data to stub out for just those two tiny values, it had to be redone. Some error would happen with the real services because the very few lines of code that were not tested with the real service happened to contain errors, or something would come up and suddenly those calls were no longer grouped.</p>
<p>As far as calling the real services goes, smoke testing is about the only thing I could really do: making a basic call and checking if the output seems to be in the appropriate format. In the best of worlds, the service implementers would also provide a stub in which the internal state can be modified, and maintain the stub to reflect the contract made by the real service. It could have solved some issues with the fact that some services are simply impossible to run in a development environment. Sticking to the contract is the only thing that can really be done in an automated fashion for development. I first encountered that type of environment a few years back, where running a test actually implied walking to a room, possibly climbing a ladder, switching wires and getting back to the workstation to check.</p>
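<p>As an illustration of checking calls against such an interface, here is a small Python sketch using the standard <tt>unittest.mock</tt> module. The interface and the <tt>place_order</tt> function are hypothetical and much simpler than what the project actually had; the point is only that one method per remote call makes the sequence and the arguments easy to verify.</p>

```python
from unittest import mock


class RemoteCalls:
    """Hypothetical interface: one method per remote call, same parameters."""

    def reserve_item(self, item_id):
        raise NotImplementedError

    def charge(self, account_id, amount):
        raise NotImplementedError


def place_order(remote, account_id, item_id, amount):
    """Reserve first, then charge; stop at the first failing call."""
    if not remote.reserve_item(item_id):
        return False
    remote.charge(account_id, amount)
    return True


# The mock only accepts attributes that exist on the interface.
remote = mock.Mock(spec=RemoteCalls)
remote.reserve_item.return_value = True
place_order(remote, "acct-1", "item-9", 20)
# Verifies both the order of the calls and the values passed.
assert remote.mock_calls == [mock.call.reserve_item("item-9"), mock.call.charge("acct-1", 20)]

# When the reservation fails, the charge must never be attempted.
remote.reset_mock()
remote.reserve_item.return_value = False
place_order(remote, "acct-1", "item-9", 20)
remote.charge.assert_not_called()
```

<p>Because the mock is built from the interface, a call to a method that does not exist on <tt>RemoteCalls</tt> fails immediately rather than passing silently.</p>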
<h3>Have an independent QA team</h3>
<p>It might not be miserably, but the chances that you fail are fairly high when a lot of components need to talk to each other and there is no way you can replicate all of it at once. A good QA team testing in an environment that maps to the production environment will find the most mind-boggling issues. In most cases, they are caused by a mismatch between the implementor&#8217;s and the client&#8217;s understanding of the interface. When you have a clear log pointing out the exact source of the problem and all your expectations documented in tests and stubs, it does not take a very long discussion to find the source of the issue. Fixing it just becomes a matter of adjusting the stubs and fixing the broken tests.</p>
<p>If you&#8217;re lucky enough, there might not be issues left when it goes to production. Seriously, don&#8217;t over-do SOA. It&#8217;s not as magic as the vendors or &#8220;enterprise architects&#8221; say it is.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2010/09/how-not-to-fail-miserably-with-system-integration/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Summer report</title>
		<link>http://blog.lphuberdeau.com/wordpress/2010/08/summer-report/</link>
		<comments>http://blog.lphuberdeau.com/wordpress/2010/08/summer-report/#comments</comments>
		<pubDate>Tue, 31 Aug 2010 22:44:03 +0000</pubDate>
		<dc:creator>Louis-Philippe Huberdeau</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Wiki]]></category>

		<guid isPermaLink="false">http://blog.lphuberdeau.com/wordpress/?p=384</guid>
		<description><![CDATA[For some reason I never quite understood, I always tend to be extremely busy in the summer when I would much rather enjoy the fresh air and take it slow, and be less busy during the winter when heading out is less attractive. This summer was no exception. After the traveling, I started a new [...]]]></description>
			<content:encoded><![CDATA[<p>For some reason I never quite understood, I always tend to be extremely busy in the summer when I would much rather enjoy the fresh air and take it slow, and be less busy during the winter when heading out is less attractive. This summer was no exception. After the traveling, I started a new mandate with a new client, and that brought my busyness to a whole new level.</p>
<p>In <a href="http://blog.lphuberdeau.com/wordpress/2010/06/upcoming-events/">my last post</a>, I mentioned a lot of wiki-related events happening over the summer and that I would attend them all. It turns out it was an exhausting stretch. Too many interesting people to meet, not enough time, even in days that never seem to end in Poland. As always, I was in a constant dilemma between attending sessions, the open space or just creating spontaneous hallway discussions. There was plenty of space for discussion. The city of Gdansk being not so large, at least not the touristic area in which everyone stayed, entering just about any bar or restaurant, at any time of the day, would lead to sitting with another group of conference attendees. WikiMania did not end before the plane landed in Munich, which apparently was the connection city everyone used, at which point I had to run to catch my slightly tight connection to Barcelona.</p>
<p>I know, there are worse ways to spend part of the summer than having to go work in Barcelona.</p>
<p>I came to a few conclusions during WikiSym/WikiMania:</p>
<ul>
<li><em>Sociotechnical</em> is the chosen word by academics to discuss what the rest of us call the social web or web 2.0.</li>
<li>Adding a graph does not make a presentation look any more researched. It most likely exposes the flaws.</li>
<li>Wikipedia is much larger than I knew, and they still have a lot of ambitions.</li>
<li>Some people behind the scenes really enjoy office politics, which most likely creates a barrier with the rest of us.</li>
<li>One would think open source and academic research have close objectives, but collaboration remains hard.</li>
<li>The analysis performed leads to fascinating results.</li>
<li>The community is very diverse, and <a href="http://wikidocumentary.org">Truth in Numbers</a> is a very good demonstration of it for those who could not be there.</li>
</ul>
<p>As I came back home, I had a few days to wrap up projects before getting to work for a new client. All of which had to happen while fighting jet lag. I have not yet had time to catch up with the people I met, but I still plan to.</p>
<p>One of the very nice surprises I had a few days ago is the recent formation of <a href="http://montrealouvert.net/">Montréal Ouvert</a> (the site is also partially available in English), which held its <a href="http://montrealouvert.net/2010/08/11/1e-reunion-de-montreal-ouvert-ouvert-a-tous/?lang=en">first meeting last week</a>. The meeting seemed like a success to me. I&#8217;m very bad at counting crowds, but it seemed to be somewhere between 40 and 50 people attending. Participants were from various professions and included some city representatives, which is very promising. However, the next steps are still a little fuzzy and how one may get involved is unclear. The organizers seemed to have matters well in hand. There will likely be some sort of hack fest in the coming weeks or months to build prototypes and make the case for open data. I don&#8217;t know how related this was to <a href="http://port25.ca/archive/2010/06/29/war-is-over-if-you-want-it.aspx">Make Web Not War</a> a few <a href="http://blog.lphuberdeau.com/wordpress/2010/05/what-could-we-do/">months prior</a>. It may just be one of those ideas whose time has come.</p>
<p>I also got to spend a little time in Ottawa to meet with the <a href="http://bigbluebutton.org/">BigBlueButton</a> team and discuss further integration with <a href="http://tiki.org">Tiki</a>. At this time, the integration is minimal because very few features are fully exposed. Discussions were fruitful and a lot more should be possible with version 0.8, which is now in development. Discussing the various use cases indicated that we did not approach the integration using the same metaphor, partially because it is not quite explicit in the API. The integration in Tiki is based on the concept of rooms as a permanent entity that you can reserve through alternate mechanisms, which maps quite closely to how meeting rooms work in physical spaces. The intended integration was mostly built around the concept of meetings happening at a specific moment in time. Detailed documentation cannot always explain the larger picture.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lphuberdeau.com/wordpress/2010/08/summer-report/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

