A Simple Process
Code refactoring is the process of restructuring existing code (changing its factoring, or composition) without altering its external behaviour.
At first you might think that refactoring is redundant if the code is written well in the first place. In principle you would be right, but unfortunately that is rarely the case in real-world situations. You might be picking up code from another developer rather than having written it yourself, or it might be legacy code written before some of the more modern techniques and standards were in place.
Sometimes developers under time pressure quickly hack in fixes that work well enough for launch day, but further down the line these bits of clunky, unmanageable code can become a burden to work around.
Refactoring might be performed for a myriad of reasons, but in my experience it is usually as simple as having learnt newer and more efficient techniques than those used when the original source code was written.
Why Do It?
There are many reasons for refactoring code and the benefits vary from system to system. Refactoring can improve the readability of code by making it less complex and breaking it down into smaller, more manageable pieces. In large teams, refactoring old code bases can be important for improving maintainability and helping everyone to quickly understand what the code does.
A code smell is a symptom in source code that may indicate the presence of a much deeper and more serious problem. In general a code smell is distinct from a bug: it does not prevent the program from functioning correctly, but it may indicate a weakness or oddity in the workings of the system that leaves it vulnerable to failure, bugs or development trouble in the future.
Detecting a code smell can be a sign that the system is in need of refactoring. Refactoring in small chunks can eventually reveal the root cause of a code smell and may solve the issue in the process.
Common code smells include large methods and classes, duplicate code, functions that return surplus data, and classes that make excessive use of another class's methods.
The kind of refactoring needed can be determined by the type of smell. For example, large classes and methods can be fixed using techniques known as class and method extraction where they are split into smaller, more manageable pieces.
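To make method extraction a little more concrete, here is a minimal sketch in Python. The invoice example and every function name in it are made up for illustration; the point is only that the "after" version does the same job as the "before" version, split into smaller named steps.

```python
# Before: one function computes the subtotal, applies a discount
# and formats the result, all in a single block.
def invoice_summary_before(items, discount_rate):
    subtotal = 0
    for price, quantity in items:
        subtotal += price * quantity
    discount = subtotal * discount_rate
    total = subtotal - discount
    return f"Subtotal: {subtotal:.2f}, Discount: {discount:.2f}, Total: {total:.2f}"


# After: each step has been extracted into its own small function.
def calculate_subtotal(items):
    return sum(price * quantity for price, quantity in items)


def calculate_discount(subtotal, discount_rate):
    return subtotal * discount_rate


def format_summary(subtotal, discount, total):
    return f"Subtotal: {subtotal:.2f}, Discount: {discount:.2f}, Total: {total:.2f}"


def invoice_summary_after(items, discount_rate):
    subtotal = calculate_subtotal(items)
    discount = calculate_discount(subtotal, discount_rate)
    return format_summary(subtotal, discount, subtotal - discount)
```

Both versions should return the same string for the same input, which is exactly what makes this a refactoring rather than a rewrite.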
Code refactoring has a close relationship with technical debt. For teams that follow an agile methodology, the absence of refactoring (or of good refactoring) can lead to the accumulation of what is known as technical debt, which can be thought of as work that has been put off and will need to be done later to finish what was started or to implement changes.
When a fair amount of technical debt has accumulated it is a good idea to repay it by refactoring the code base.
Causes of Technical Debt
Technical debt can occur for a variety of reasons. Some of them cannot be avoided, so the debt must simply be managed and paid off with refactoring when there is time for it. Some of the most common causes include:
- Agile development - In an agile development environment new features and functionality are added at a very fast pace and are usually a little rough around the edges, especially in the earlier stages of a project.
- Lack of knowledge - Sometimes less experienced developers (or experienced developers trying something new) simply don’t know how to write the most elegant solution right off the bat. This is one of the most common ways technical debt is introduced, and it can be solved with refactoring later on when the developer has more experience and may have learnt a new technique or two.
- Parallel development - When several developers work on different branches for long periods of time, technical debt starts to accrue because of the mounting complexity of a future merge operation. The longer developers make changes to their branches in isolation, the more debt builds up and complicates the inevitable merge.
Software Rot aka Bit Rot
Software rot is the deterioration of software performance or responsiveness over time, which leads to the software becoming ‘legacy’ and in need of an upgrade (or refactoring!). The problem usually arises not with the code itself but with changes to the environment within which it exists.
There are two main categories of software rot - dormant and active rot.
The term dormant rot is used to describe a scenario in which software that is not currently being used or developed gradually becomes unstable as the surrounding application/system changes and is updated. Other external changes, such as changing user requirements, can also contribute to rot and render the software obsolete.
Active rot is when software that is being used and updated begins to lose its integrity over time if mitigating techniques (such as refactoring) are not used. Due to the nature of developing constantly changing software, features are usually prioritised over elegance and good documentation, especially under time pressure. The absence of good documentation, however, can lead to confusion as to how the software works and make it harder for developers to refactor and clean up code further down the line.
Another common cause of software rot is when developers make modifications to open source software libraries which later receive important updates (security patches etc.) that cannot be easily applied to the developers’ now heavily modified version of the code. This is one of the reasons why it is considered bad practice to modify vendor libraries.
Sometimes software rot is not curable. When a system is upgraded certain parts of the software may simply be completely incompatible with the new architecture and need to be re-written from scratch.
Before going ahead and simply changing the structure of the source code it is a good idea and standard practice to develop a set of unit tests to make sure that the functionality of the code is not affected.
Refactoring should be carried out in small stages so that the functionality can be repeatedly tested and verified.
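As a rough illustration of writing tests before touching the structure, the sketch below pins down the current behaviour of a small function using Python's built-in unittest module. The slugify function and its test cases are hypothetical stand-ins for whatever legacy code you are about to refactor.

```python
import unittest


def slugify(title):
    # Hypothetical legacy function that we intend to refactor later.
    result = ""
    for ch in title.strip().lower():
        if ch.isalnum():
            result += ch
        elif ch in " -_" and not result.endswith("-"):
            result += "-"
    return result.strip("-")


class SlugifyBehaviourTest(unittest.TestCase):
    # These tests record what the code does today, so any refactoring
    # step that changes the behaviour will fail straight away.
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_punctuation_is_dropped(self):
        self.assertEqual(slugify("  Refactoring: Why?  "), "refactoring-why")

    def test_repeated_separators_collapse(self):
        self.assertEqual(slugify("a -- b"), "a-b")
```

Saved in a file, the tests can be run with python -m unittest after every small change, which fits the "small stages" approach described above.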
There are several different types of techniques attributed to code refactoring. Some techniques follow the principles of abstraction and decomposition, while others are as simple as changing the names of variables and methods to make them more self-explanatory.
Below is a small list of techniques commonly used to refactor code. I won’t go into them in detail today, but we might cover them individually in future blog posts.
- Class extraction
- Method extraction
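Without going into detail, class extraction might look something like the sketch below. The Customer and Address classes are invented purely for illustration: address handling is pulled out of a class that was doing two jobs and given a class of its own.

```python
from dataclasses import dataclass


# Before: the customer class also knows how to format an address.
class CustomerBefore:
    def __init__(self, name, street, city, postcode):
        self.name = name
        self.street = street
        self.city = city
        self.postcode = postcode

    def address_label(self):
        return f"{self.street}\n{self.city}\n{self.postcode}"


# After: the address fields and their formatting live in their own class.
@dataclass
class Address:
    street: str
    city: str
    postcode: str

    def label(self):
        return f"{self.street}\n{self.city}\n{self.postcode}"


@dataclass
class Customer:
    name: str
    address: Address

    def address_label(self):
        # Delegates to the extracted class, so callers see no difference.
        return self.address.label()
```

Keeping address_label on Customer means existing callers carry on working while the new Address class can be reused elsewhere.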
Name and Location Techniques
- Making variable and method names more self-explanatory
- Moving things around into a more logical order
- Removal of redundant and duplicate code
- Editing comments to make them more concise and clear
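Here is a small, made-up Python example of two of the techniques listed above working together: clearer names and the removal of duplicate code. The function names and the 20% rate are assumptions chosen only to have something to rename.

```python
# Before: vague names and the same calculation written out twice.
def calc(a, b):
    x = a * b
    x = x + x * 0.2
    return x


def calc2(a, b, c):
    x = a * b
    x = x + x * 0.2
    return x + c


# After: descriptive names, a named constant and one shared calculation.
VAT_RATE = 0.2  # assumed rate, for illustration only


def price_with_vat(unit_price, quantity):
    net = unit_price * quantity
    return net + net * VAT_RATE


def price_with_vat_and_shipping(unit_price, quantity, shipping):
    # Reuses price_with_vat instead of repeating the arithmetic.
    return price_with_vat(unit_price, quantity) + shipping
```

The arithmetic is unchanged, so both versions give the same results; only the names and the amount of repetition differ.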
Refactoring.com - Lots of examples of refactoring code.