I don’t have any stats to support this, but I’m pretty certain that every second, a developer somewhere complains about legacy code. Most of the time, no one person can be blamed for it. Other than a few classics demonstrating a complete lack of understanding, most bad code out there was not written by any single person. It just grew organically until merely looking at it makes it fall apart. Most of the time, it begins with the good intention of keeping the design simple for a simple task. Add a lot of vision from an inspired third party, plus the next implementer’s determination to keep the original design unchanged, and you have taken a step in the wrong direction. At some point, the code becomes so bad that people just put in their little fix and avoid looking at the larger picture.
No matter how you got there, knowing it won’t really help you. You’re stuck with a large pile of code you have no interest in maintaining and whose functionality is a complete mystery. The initial reaction is to vote for a complete rewrite from the ground up using more modern technologies. Well, the past has shown this to fail over and over again. Starting a new project from scratch is a good way to implement a brand new idea. If the objectives are completely different, it deserves to be a different project. However, a new project to do something the old one already did is hardly a good idea. Development will take so long that you will either have to sacrifice your entire user base, who will move away because their current solution is neglected, or keep maintaining the current version for a while, which will starve the new initiative of resources.
The real solution is to devote time to making things better. Take all the time wasted on complaining and actually use it to improve the code. While the code smell often seems to be everywhere, very few parts are usually rotten. Cleaning up those areas can transform the project without that much effort. The one thing you don’t want to do is begin with the first file and clean up all the parts you don’t like about it, or polish some feature just because it could be improved and you understand it well enough to do it.
Improving code quality is just like improving performance. Unless you target the areas that really matter, there will be no significant impact. If you spend an hour optimizing a query and gain a 50% improvement on it, you can be happy with it, but if that query accounted for 1% of the total execution time, your real impact is 0.5%. Sadly, software quality does not offer as many directly observable numbers. There are metrics, but the impact only shows over the longer term, mixed up with dozens of other issues, making it nearly impossible to measure. It also affects harder-to-quantify factors like team morale.
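The arithmetic behind that 0.5% can be written down in a few lines. This is only a sketch: the function name is made up, and the second case (a hotspot taking 40% of the time, improved by 20%) is an invented comparison, not a measurement from any real project.

```python
def overall_gain(share_of_total: float, local_improvement: float) -> float:
    """Fraction of total execution time saved by improving one part."""
    return share_of_total * local_improvement


# The query from the text: 1% of total time, made 50% faster.
small = overall_gain(0.01, 0.50)

# Hypothetical hotspot for comparison: 40% of total time, made 20% faster.
large = overall_gain(0.40, 0.20)

print(f"minor query: {small:.1%} of total time saved")  # 0.5%
print(f"hotspot:     {large:.1%} of total time saved")  # 8.0%
```

A smaller local improvement in the right place beats a large one in the wrong place, and the same reasoning applies to where cleanup effort goes.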
To me, the main attributes refactoring candidates have are that they are obstructive, untrustworthy, or inconsistent.
Obstructive issues hamper your ability to grow. They are roadblocks. If you drew a directed graph with all the issues and feature requests as nodes and their dependencies as edges, those issues would stand in the middle. They cause problems everywhere for no obvious reason and always prevent you from getting to the next level. In TikiWiki, the permission system was one of those. The high granularity of the permission system has long been, and still is, one of the key features of the CMS. There are currently no fewer than 200 permissions that can be assigned. However, the naive implementation caused so many problems that a word was coined to identify it in bug reports. It also stood in the way of the feature most requested by large enterprise customers: project workspaces.
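If the tracker records dependencies, the graph idea can be approximated mechanically. The sketch below assumes a plain dictionary mapping each issue to the issues it depends on; the function name and the sample issues (other than the permission system and project workspaces) are invented for illustration. It counts how many issues are blocked, directly or transitively, by each node.

```python
from collections import defaultdict


def blocked_counts(depends_on: dict[str, list[str]]) -> dict[str, int]:
    """Count, for each issue, how many others depend on it directly or transitively."""
    # Invert the edges: blocker -> issues that depend on it directly.
    dependents = defaultdict(set)
    for issue, blockers in depends_on.items():
        for blocker in blockers:
            dependents[blocker].add(issue)

    counts = {}
    for node in set(depends_on) | set(dependents):
        seen = set()
        stack = [node]
        while stack:  # depth-first walk over everything blocked by `node`
            current = stack.pop()
            for nxt in dependents.get(current, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        seen.discard(node)  # do not count the node itself if there is a cycle
        counts[node] = len(seen)
    return counts


# Illustrative data only, not taken from a real tracker.
issues = {
    "project workspaces": ["permission system"],
    "category permissions bug": ["permission system"],
    "workspace templates": ["project workspaces"],
    "rss feed tweak": [],
}

counts = blocked_counts(issues)
ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked[0])  # ('permission system', 3)
```

The nodes at the top of that ranking are the ones standing in the middle of everything else.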
Obstructiveness may also apply in terms of development. Every time you have to perform a simple task in a given area, you find yourself juggling with complexity. For historical reasons, just getting close to a piece of code requires a complete ritual dance. So much that you just attempt to work around the issue. It’s likely that the API does not provide the functionality that is required. A lot of copy-pasting is needed and, as a result, a lot of time is wasted.
Untrustworthy code often looks innocent. It’s a function call that looks simple and that you would expect to work. However, every time you use it somewhere, you close your eyes before execution. For some reason, you’re not convinced it will act as it should. There are also multiple bugs filed against the feature under corner conditions, and they are always fixed by adding a line or a condition. Typically, it will be a very long function with disparate branching, overgrown by feature requests over time. It’s not rare to see multiple different ways to do the same thing with different parameters: the code was so complicated that someone requested something that already existed, and the developer did not even notice. The only way out is to map out what it does and what it’s supposed to do, and begin writing tests for it.
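One way to start that mapping is to record what the function does today and pin it down before changing anything. The sketch below is an assumption-laden illustration: `legacy_discount` is a made-up stand-in for an overgrown function, including the kind of duplicate option described above.

```python
def legacy_discount(total, code=None, vip=False):
    """Hypothetical overgrown function with two ways to ask for half price."""
    if code == "HALF":
        total = total / 2
    if vip and total > 100:
        total = total - 10
    if code == "half":  # added later by someone who missed "HALF"
        total = total * 0.5
    return round(total, 2)


# Step 1: map out what it does for representative inputs.
CASES = [
    ((200,), {}),
    ((200,), {"code": "HALF"}),
    ((200,), {"code": "half"}),
    ((200,), {"vip": True}),
    ((200,), {"code": "HALF", "vip": True}),
    ((200,), {"code": "half", "vip": True}),
]


def characterize(func, cases):
    """Run every case and return the observed results in order."""
    return [func(*args, **kwargs) for args, kwargs in cases]


observed = characterize(legacy_discount, CASES)

# Step 2: pin the current behavior so any refactoring that changes it fails loudly.
EXPECTED = [200, 100.0, 100.0, 190, 100.0, 95.0]
assert observed == EXPECTED
```

The last two cases show why the mapping matters: the two spellings of the same option give different results for VIP customers. Deciding which one is intended is the "what it's supposed to do" half of the work.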
Inconsistency is a different kind of smell. Nothing is wrong as such, except that you always find yourself looking up how to use components. Different conventions are used. Sometimes you need to pass an array, other times an object. For no apparent reason, things are done differently from one place to the next. Most of the time, these are easy to fix. Find out which way is the right one and apply it everywhere. Don’t let the wrong way be used again. Usually, inconsistencies spread simply because someone looked for an example and picked the wrong one. Fixing those issues does not have a large impact by itself, but it will often reduce the clutter in the code. With less code remaining, it will be easier to see the other problems.
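While the cleanup is in progress, a small shim can accept the old convention but complain about it, so the wrong way stops spreading. This is a sketch with invented names (`render_menu`, `_as_options`), assuming the dictionary form was chosen as the right one and a list of pairs is the form being retired.

```python
import warnings


def _as_options(options):
    """Normalize to the chosen convention (a dict), warning on the old one."""
    if isinstance(options, dict):
        return options
    warnings.warn(
        "passing a list of pairs is deprecated; pass a dict instead",
        DeprecationWarning,
        stacklevel=3,  # point the warning at the caller of render_menu
    )
    return dict(options)


def render_menu(options):
    """Hypothetical component that used to accept either shape."""
    options = _as_options(options)
    return ", ".join(f"{key}={value}" for key, value in sorted(options.items()))


print(render_menu({"home": 1, "about": 2}))  # about=2, home=1
```

Once no caller triggers the warning anymore, the shim and the old code path can be deleted.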