"This stress! We don't have time and still need to deliver so many features by the end of the month. It's just too stressful to develop unit tests right now." – Anonymous? Or way too popular?
If you’ve heard words to this effect, you are not alone. This is a typical killer argument used to prevent the development of unit tests. The background is simple to explain: if we only consider the time spent on creating "productive" code, we may be faster without these annoying tests. However, higher speed on a short stretch does not necessarily mean we reach our goal sooner. On the contrary, the motto in software development is steady gliding, not short-term rushing. Nevertheless, this killer phrase often works, and it is just one example among many. I hear it frequently in many companies and projects. Its purpose is to dismiss all counterarguments without objective reasoning and to shut down any further discussion.
Where is the potential for savings with unit testing?
Unit tests are created by the developer during implementation. Errors found and resolved at this stage are generally much cheaper than errors found later. Studies support this, and there is also a simple explanation for it. Let's consider how the time to correct an error is distributed.
Figure 1: The process for finding and resolving errors
I call the first part, which begins once the error has occurred, the "error analysis". Usually, several steps are necessary to find the error and understand its cause:
- Understanding the error description
- Setting up the test environment and development environment
- Reproducing the error
- Understanding and reading the code
- Modifying the code to better understand the problem, e.g. adding more logging
The second part is the actual "error resolution". Interestingly, regardless of when I correct an error, once I understand the problem, the time needed to resolve it is almost constant.
The older the code, the longer the error analysis. Why?
The duration of error analysis depends heavily on how much time has passed since the developer wrote the code (= the interval between implementation and error analysis). The worst case occurs when I correct an error in code that I did not write myself: understanding such code takes me the most time.
Why is that? Consider this: if I were to read an essay today that I wrote during my school days, I would, first, hardly remember that I had written it and, second, have to think about what I wanted to convey at the time.
Developing code is very similar to the process of writing. The formalism and abstraction possibilities of programming languages make the resulting work harder to understand. As a developer, I create my own mental model for converting requirements into code. Understanding the code, therefore, means first reading the code and then understanding the developer's mental model. I have to do this even with my own code. Although I am convinced that I write readable code, I notice that I need more time to understand my own code after just one or two weeks. Forgetting sets in rapidly, and the further back the creation date lies, the more time I need to invest in understanding the code and retrieving the information from my memory.
Figure 2: Forgetting curve by Hermann Ebbinghaus
This effect is described by the forgetting curve, discovered by the German psychologist Hermann Ebbinghaus. The curve depends on many factors, such as the difficulty of the learned material. Its values cannot be applied directly to programming, but the underlying model can. The time for error analysis can therefore be represented as a function of forgetting.
Figure 3: Relationship between Forgetting Curve and Analysis time
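The relationship can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the exponential decay is a common simplification of the forgetting curve, not Ebbinghaus's measured values, and the function names and all constants (stability_days, max_analysis_hours, resolution_hours) are invented for this example.

```python
import math


def retention(days_since_coding: float, stability_days: float = 7.0) -> float:
    """Share of the original knowledge still remembered (1.0 = everything).

    Assumption: simple exponential decay; stability_days is a made-up constant.
    """
    if days_since_coding < 0:
        raise ValueError("days_since_coding must not be negative")
    return math.exp(-days_since_coding / stability_days)


def analysis_hours(days_since_coding: float, max_analysis_hours: float = 8.0) -> float:
    """Hypothetical analysis effort: it grows as knowledge fades.

    With full memory the effort is zero; with no memory left it
    approaches max_analysis_hours (an assumed upper bound).
    """
    return max_analysis_hours * (1.0 - retention(days_since_coding))


def total_fix_hours(days_since_coding: float, resolution_hours: float = 1.0) -> float:
    """Analysis time plus the roughly constant time to resolve the error."""
    return analysis_hours(days_since_coding) + resolution_hours


if __name__ == "__main__":
    for days in (0, 1, 14, 90):
        print(f"{days:>3} days after coding: {total_fix_hours(days):.2f} h")
```

In this model an error caught on the day of implementation costs only the resolution time, while the analysis share grows quickly in the first days and then levels off.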
By the way, forgetting applies to all the other tasks necessary for error analysis as well. Have you ever had to recreate test data in order to reproduce an old error? Or even rebuild the development environment because the error could only be reproduced in an older version of the product? All these activities take significantly longer the further back the software's creation date lies.
Are unit tests a cost-effective and profitable solution?
Unit tests are created by the developer during implementation. There is no cheaper time to find and fix errors. Why? Because the knowledge of the developed solution is fresh in the developer's memory. The developer tests their code at a time when they are actively engaged with the requirements and the forgetting curve has only just begun to drop. This reduces the effort for error detection to practically zero! Of course, creating test cases takes time, but that time is small compared to the effort otherwise needed for error detection. Furthermore, creating test cases can be learned, and the effort decreases with experience.
Figure 4: Error-handling process with unit testing
If we consider only the aspect of error detection, the Test-Driven Development (TDD) approach is the most efficient and economical one. In the first step, the developer designs and implements a test case that must fail at that point. The second step is fixing that failure, which is the implementation itself. Then, refactoring is done to improve the solution. This cycle is repeated until all the requirements have been implemented through tests. The effort for error detection is reduced to practically zero, and implementation and correction merge into one activity.
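One pass through this cycle might look like the following minimal sketch, written with Python's standard unittest module. The function apply_discount and its rules are hypothetical and serve only as an illustration; the tests were conceptually written first and failed until the function existed.

```python
import unittest


# Step 2: the simplest implementation that makes the tests below pass.
# Before this function was written, the tests failed (step 1).
# Step 3 (refactoring) can now change the body freely while the tests stay green.
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price in cents after a percentage discount (hypothetical rule)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100


# Step 1: each test expresses one requirement.
class ApplyDiscountTest(unittest.TestCase):
    def test_reduces_price_by_percentage(self):
        self.assertEqual(apply_discount(2000, 25), 1500)

    def test_zero_percent_keeps_price(self):
        self.assertEqual(apply_discount(2000, 0), 2000)

    def test_rejects_percentage_above_100(self):
        with self.assertRaises(ValueError):
            apply_discount(2000, 101)
```

Saved in a file (for example test_discount.py, a name chosen here for illustration), the tests can be run with "python -m unittest test_discount.py". Each new requirement starts the cycle again with another failing test.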
Additionally, a good unit test suite helps you to pinpoint an error to a specific feature in the code. This supports the developer in the error analysis phase by avoiding time and effort wasted on looking in the wrong places, especially when new features affect existing code, and during regression testing. This is indirectly related to the forgetting curve, too: the unit tests save us from having to keep all the details in our memory, and this helps us find the error faster.
From the perspective of effort alone, early creation of unit tests is cost-effective. It really surprises me that this agile practice is not applied in more projects. I am convinced that, overall, developing software with TDD and unit tests is faster and safer than without them. Through early testing, I avoid wasting time on lengthy error detection.
In addition to the effort aspect of error detection, there are other reasons to use TDD. Foremost for me is the focus on the actual requirements. To implement test cases, requirements must be clearly formulated. This makes it less likely that unnecessary functionality gets implemented. Furthermore, unit tests support the developer in finding better solutions and designs.