Should We Write a Unit Test or an End-to-End Test?
The debate over whether to write a unit test or an end-to-end test for an aspect of a software system is something I've encountered a number of times. It most often arises as a philosophical discussion along the lines of: If we can only write one test for this feature, should we write a unit test or an end-to-end test? In essence, time and resources are limited, so what type of test would be most effective?
I think managers (development leads, product managers, project managers) generally favour end-to-end tests, as these tests most directly support their responsibilities. End-to-end tests answer the question: Does the system ultimately deliver what it needs to deliver? I've also encountered situations, however, where people are hostile to unit testing and view it as a waste of time. They advocate using end-to-end tests exclusively and view unit tests either as restrictive to evolving the system, requiring too much time and effort to refactor, or as redundant, given that the overall behaviours of the system are verified by end-to-end tests.
In this article, I will provide my opinion on this question. I should qualify that my experience has been in building software infrastructure for industrial applications, specifically streaming data systems for near-real-time data. For someone who has worked in another domain, where deploying and testing the entire software system is easier, or where the operational environment is more forgiving of error, I can understand how their experience might be different. I've worked on hosted services as well as infrastructure that is installed on-premises and operated by the customer. These systems are composed of a number of different elements, must perform consistently and reliably, and must meet crosscutting requirements in terms of security, scalability and performance. These systems need to evolve to include new functionality or bug fixes, without introducing regressions in existing functionality or behaviour. Testing these systems end-to-end is always a challenge, as there are a number of dependencies that must be in place in order to test even a small part of the overall system. Reproducing the diversity of issues encountered in operational settings is also challenging.
An end-to-end test is a test that exercises an aspect of the system within the context of the system as a whole. An end-to-end test is generally either a functional test that verifies a specific aspect of the system behaves as expected, or an acceptance test that not only verifies that the aspect functions correctly, but also validates that the broader system continues to meet requirements in terms of performance, scalability, security, maintainability, and so on. End-to-end tests usually require deploying the system and its dependencies, which can take a significant amount of time and resources. The tests can take a long time to run and are subject to variability, given the number of moving parts.
As an engineer, a product manager, a development lead, or even an engineering director, I would always want every critical aspect of the system to be tested as part of an end-to-end test. I just wouldn't live without this. This test not only verifies that the feature works, but it also verifies that it works under typical operational conditions. The end-to-end test is essential. I want to know that the system I'm delivering meets the requirements and will continue to meet the requirements as the system evolves in the future.
Another important consideration with regard to end-to-end tests is that the most complex and subtle bugs in software systems cannot be observed in isolation and are only encountered when the software is exercised as part of an integrated system. I've written previously about a set of challenging bugs that I encountered that could not be observed with unit tests, or even with a functional test exercising one aspect in isolation. End-to-end tests provide the opportunity to study the application under the conditions where these bugs arise and are invaluable for building robust and reliable software systems.
A unit test is a test that independently exercises a small unit of source code, like a method or a class. A unit test typically has no dependencies, can be executed in milliseconds, and is perfectly reliable. Given that I wouldn't live without an end-to-end test, for me the question becomes: Do I write a unit test?
End-to-end tests serve a critical purpose, but when a system is tested predominantly or exclusively with end-to-end tests, problems arise. The Google Testing Blog article "Just Say No to More End-to-End Tests" does a great job of characterizing the problems of relying on too many end-to-end tests. Given the inevitable complexity of end-to-end tests, it can take hours or days to get feedback. When an end-to-end test fails, it is often very difficult to identify the component that caused the failure. Even then, the failure can often be a result of the variability introduced by the test infrastructure itself. In contrast, unit tests are fast and reliable, and they isolate failures to a specific unit of code. The Google article suggests that tests are most effectively arranged as a pyramid, where the largest number of tests are unit tests, followed by a moderate number of integration tests, then a smaller number of end-to-end tests.
I worked for a number of years on a software system that was tested almost exclusively with end-to-end tests. An inverted pyramid at best, but probably closer to just the tip of the pyramid. It would take a day or more to run these tests. There were frequent test failures as a result of the test infrastructure itself. When a test failed, it was often very hard to determine why. The team spent a great deal of time characterizing test failures, an expensive activity that became accepted as part of the daily routine. This usually involved poring over log files, often adding instrumentation and waiting for the next day's test run. Rinse and repeat.
I think unit tests serve a different purpose and are complementary to end-to-end tests. Unit tests are more about developer productivity and creativity than about verifying that the system is functioning correctly. For a developer, the feedback cycle derived from end-to-end tests is just too long. If the activation energy required for experimentation and feedback becomes too high, it becomes a barrier to exploration, learning, and making progress. Unit tests provide almost instant feedback. They are supportive of experimentation, as there is no need to deploy the system to run the tests. Unit tests can easily be executed on the developer's workstation and integrate seamlessly with most IDEs.
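As a minimal sketch of the kind of test I have in mind, assuming Python and its standard unittest module (my choice for illustration; the argument is language-agnostic), here is a small function and its tests. The clamp function and the test names are hypothetical. Nothing is deployed and there are no dependencies, so the tests run in-process on a workstation.

```python
import unittest


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed range [low, high] (hypothetical unit under test)."""
    return max(low, min(value, high))


class ClampTest(unittest.TestCase):
    # Each test method checks exactly one behaviour of the unit.

    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5.0, 0.0, 10.0), 5.0)

    def test_value_below_range_is_raised_to_low(self):
        self.assertEqual(clamp(-3.0, 0.0, 10.0), 0.0)

    def test_value_above_range_is_lowered_to_high(self):
        self.assertEqual(clamp(42.0, 0.0, 10.0), 10.0)
```

Saved in a module such as test_clamp.py (a hypothetical name), these can be run with `python -m unittest test_clamp`, or directly from the test runner of most IDEs.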
Unit tests, unlike end-to-end tests, can easily be executed when code is committed to source control. If a unit test fails, the commit can be rejected. This means that the code under source control can be maintained in a state where it is always functional. This has great benefits, particularly when dependencies are shared across multiple teams or projects.
Another complementary aspect of unit tests is that they support thinking about how the code is factored. This is not something end-to-end tests generally encourage. Test-Driven Development (TDD) has been a popular practice for helping developers write better code. While TDD doesn't necessarily prescribe unit tests, they are usually central to it. I am not a purist in terms of TDD. I rarely write formal unit tests upfront. As I'm working, I generally write tests to support experimentation, preferring to formalize my tests once I have settled on a path forward. Once I started unit testing, however, it certainly helped me develop better interfaces and simpler implementations than I would have otherwise. Perhaps an expert programmer with lots of experience can effectively structure each unit for reuse, composition, and evolution, and rely strictly on end-to-end tests, but for the rest of us mortals, I think unit testing tends to encourage better design and more effective organization.
In the introduction, I mentioned two specific concerns regarding writing unit tests in addition to end-to-end tests. The first was the overlap between unit tests and end-to-end tests. There will not necessarily be a one-to-one relationship between a unit test and an end-to-end test, but there will often be duplication. I think it is effective to test the same functionality under different conditions. I actually embrace this duplication. I'd rather something be tested twice than not at all, and I think considering the same test from different perspectives, at different times, by different people, ends up improving the overall system. The value in this investment is not always apparent. Often it is not the test artifacts themselves but the act of testing that ends up improving the overall system and the skills of the people who work on it.
The second concern was that unit tests make the system hard to evolve. I find that simple unit tests that focus on testing one thing per test method are rarely difficult to refactor, if refactoring is even necessary. I've never found unit tests to be a burden in terms of evolving a system. In fact, I think a system tested largely with end-to-end tests ultimately becomes harder to evolve, as people become afraid of making changes when they cannot easily characterize the consequences. I can understand that when unit tests involve a lot of mocking, it can feel nearly impossible to evolve the system. But by employing a lot of mocking, one inevitably ends up tying the tests directly to the implementation. I rarely, if ever, use mocks. When a test requires mocking, I usually reconsider the design and modify my approach so that mocking becomes unnecessary, or I reevaluate whether the feature would be more effectively tested with just an end-to-end test.
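To illustrate what reconsidering the design can look like, here is a minimal sketch in Python, assuming the standard unittest module. Everything in it is hypothetical: the (timestamp, value) sample shape and the drop_out_of_order function are invented for illustration and are not taken from any system I have worked on. The idea is to keep the logic in a function that accepts any iterable of samples, rather than in a class that reads from a message source directly. The tests can then pass in plain lists, and no mock of the source is needed.

```python
import unittest
from typing import Iterable, Iterator, Tuple

# Hypothetical sample shape: (timestamp, value).
Sample = Tuple[int, float]


def drop_out_of_order(samples: Iterable[Sample]) -> Iterator[Sample]:
    """Yield only samples whose timestamp is strictly newer than the last one yielded."""
    latest = None
    for timestamp, value in samples:
        if latest is None or timestamp > latest:
            latest = timestamp
            yield (timestamp, value)


class DropOutOfOrderTest(unittest.TestCase):
    # One behaviour per test method; the inputs are plain lists, not mocks.

    def test_keeps_samples_that_arrive_in_order(self):
        samples = [(1, 0.5), (2, 0.7)]
        self.assertEqual(list(drop_out_of_order(samples)), samples)

    def test_drops_sample_older_than_latest(self):
        samples = [(2, 0.7), (1, 0.5)]
        self.assertEqual(list(drop_out_of_order(samples)), [(2, 0.7)])

    def test_drops_sample_with_duplicate_timestamp(self):
        samples = [(1, 0.5), (1, 0.9)]
        self.assertEqual(list(drop_out_of_order(samples)), [(1, 0.5)])
```

The code that actually reads from the live source stays thin and is left to the end-to-end test, which is where I would want it verified anyway.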
Whenever possible, I write both a unit test and an end-to-end test. I view unit tests as complementary to end-to-end tests. End-to-end tests verify the behaviour of the system as a whole, while unit tests support developer productivity and creativity. I embrace the diversity of testing the same aspect from multiple perspectives. I like how unit tests inform software design and organization and keep the code base healthy when they must pass in order to commit code.