Is the Scientific Method Valuable in Engineering?

Jun 23, 2024

In the novel Zen and the Art of Motorcycle Maintenance, Robert M. Pirsig states: “The purpose of scientific method is to select a single truth from among many hypothetical truths.” He summarizes the scientific method as follows:

(1) statement of the problem, (2) hypotheses as to the cause of the problem, (3) experiments designed to test each hypothesis, (4) predicted results of the experiments, (5) observed results of the experiments, and (6) conclusions from the results of the experiments.

The scientific method is held in high regard as a reliable process for acquiring and refining knowledge. It is arguably valuable in science, so people accept the scientific method as a universally good thing, and even aspire to it, without questioning its value. In engineering, fixating on a hypothesis, and working to confirm or falsify it, can lead us astray.

Question-Driven Engineering

In Bryan Cantrill’s excellent talk Things I Learned The Hard Way, he talks about effective debugging.^[1] Provocatively, he explains how people say they debug using the scientific method, but asserts this comes from an inferiority complex of software engineering not being a real engineering discipline—we try and sound scientific to desperately up-level what we are doing. He questions the value of the scientific method for debugging:

Do we use the scientific method? Does science use the scientific method? I would argue, no. We form a hypothesis, then conduct an experiment to attempt to contradict the hypothesis—bull shit, I don’t debug that way!
—Bryan Cantrill

He goes on to describe how a hypothesis is a guess and debugging with hypothesis-centric engineers drives him crazy: “I know this! It’s this! Oh, it wasn’t that, it’s this!”^[2] Instead of guesses, he encourages asking questions of the system:

What question do you want to ask of the system? Even if there is no way to ask it. Then get creative about how you ask it.
—Bryan Cantrill

He concludes by explaining how guess-driven debugging focuses on the things we already know, whereas question-driven debugging invites creativity.^[3] Perhaps the most valuable part: even when existing systems cannot answer the questions we want to ask, we can adapt them to leave artifacts to answer these questions, like adding a new metric, log message, or tool. Essentially, using engineering to answer those questions and improve the system over time.
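As a minimal sketch of leaving an artifact behind to answer a question, the snippet below adds a timing metric around one step of a program. The names `timings`, `timed`, `load_orders`, and `summarize` are hypothetical and chosen only for illustration; a real system would more likely send these measurements to its existing metrics or logging pipeline.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical in-process metric store: metric name -> list of durations (seconds).
timings = defaultdict(list)


@contextmanager
def timed(name):
    """Record how long the wrapped block takes under the given metric name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name].append(time.perf_counter() - start)


def load_orders():
    """Hypothetical step whose cost we want to ask about."""
    with timed("load_orders"):
        return [{"id": i} for i in range(1000)]


def summarize():
    """Answer the question: how often did each step run, and for how long in total?"""
    return {name: (len(values), sum(values)) for name, values in timings.items()}
```

Once the measurement point exists, the question "how long does loading orders take?" can be asked again after every change, rather than guessed at.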

Similar to debugging, the same thing can happen when we are trying to optimize the performance of a software program. It is not fruitful to guess where the CPU is consumed: “It must be serialization and deserialization! No, it’s the spin lock! Wait, maybe it’s inefficient memory management!” Instead, we can simply ask: “How is the CPU time being used?” From there, we can use a CPU profiler to measure how the CPU is being used and visualize the results, for example as a flame graph. The path to improving application performance may be obvious, or we may need to control certain variables, modify the software, or use our intuition gained from experience to narrow down the possibilities. But there is no need to guess.^[4] We can experiment and measure, following a sound engineering methodology. Even once we have found an area for improvement, making it more efficient won’t necessarily be obvious. For example, serialization and deserialization of data can be expensive, but is often unavoidably necessary. Actually making a performance improvement may require even more questions and creativity.
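To make the question “How is the CPU time being used?” concrete, here is a small sketch using Python’s standard `cProfile` and `pstats` modules. The workload functions (`serialize_round_trip`, `sum_values`, `handle_request`) and the helper `profile_report` are hypothetical stand-ins, not code from any system discussed here.

```python
import cProfile
import io
import json
import pstats


def serialize_round_trip(records):
    """Hypothetical workload: encode and decode each record as JSON."""
    return [json.loads(json.dumps(r)) for r in records]


def sum_values(records):
    """Hypothetical workload: a cheap aggregation over the records."""
    return sum(r["value"] for r in records)


def handle_request(records):
    """Hypothetical request handler that combines both workloads."""
    decoded = serialize_round_trip(records)
    return sum_values(decoded)


def profile_report(func, *args, top=10):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    records = [{"id": i, "value": i * 2} for i in range(20_000)]
    total, report = profile_report(handle_request, records)
    print(total)
    print(report)
```

The report shows where the time went without anyone having to guess in advance. Whether the expensive part can actually be made cheaper is a separate question.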

Another situation in which engineers ask questions rather than form hypotheses is the discipline of continuous improvement. Continuous improvement is a technique popularized in the process industries and manufacturing to understand how to operate a process more efficiently by manipulating control variables using designed experiments while keeping the product within specification. For instance, operating a chemical reactor at a slightly lower inlet temperature might produce a product of higher quality, or a product of equal quality more cheaply. Again, no one is forming a hypothesis; they are simply manipulating process variables around control setpoints to better understand the process. They are asking a question of the system through controlled experiments, such as “What happens if I decrease the inlet temperature by one degree Celsius?”, and observing what happens.^[5]

Modern Software Engineering

I recently read the book Modern Software Engineering by Dave Farley. He explores whether software engineering is a real engineering discipline. He argues that it is, provided we approach software development and operations using engineering methods that produce superior outcomes. Farley argues that managing complexity is one of the most important parts of a software engineer’s job. We do this by applying established techniques, including automated testing, and continuous integration and continuous delivery (CI/CD), that encourage modularity, cohesion, separation of concerns, and rapid feedback. I think it is an excellent book and an important contribution to our industry.

Farley defines engineering as follows:

Engineering is the practical application of an empirical, scientific approach to finding efficient, economic solutions to practical problems.
— Dave Farley

Farley quotes Richard Feynman on the scientific method and notes the importance of forming a hypothesis and performing experiments. On the surface, it might seem Farley has a different viewpoint than Cantrill, but after a more careful reading, Farley is arguing for a scientific approach—controlling variables, testing assumptions, and performing experiments—to iteratively solve problems and improve processes, not a rigorous application of the scientific method.^[6]

In addition, Farley puts a lot of emphasis on the importance of repeatable, automated testing to encourage modularity, cohesion, and a separation of concerns. Using testing as a feedback mechanism is further enhanced by using techniques for CI/CD, shortening feedback cycles and improving repeatability. But these tests are not being used to disprove a hypothesis or discover a fundamental truth of the system; they are being used to test the invariants of the system and provide the guardrails for continued experimentation, discovery, refinement, and iteration.^[7] Consistent with Cantrill’s statement on improving systems by making them receptive to the questions we want to ask, Farley notes that in order to write tests, we must control variables and establish measurement points where the results are visible and measurable. This encourages modular design, and incorporating these tests into CI/CD establishes a repeatable process which allows for comparisons and improvements over time.
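As an illustration of tests that check invariants while controlling variables, the sketch below uses Python’s standard `unittest` module. The function `normalize_weights` and its invariants are hypothetical examples, not taken from Farley’s book; the fixed random seed is the controlled variable that makes each run see identical inputs.

```python
import math
import random
import unittest


def normalize_weights(weights):
    """Hypothetical function under test: scale weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


class NormalizeWeightsInvariants(unittest.TestCase):
    def setUp(self):
        # Control a variable: a fixed seed makes every run see the same inputs.
        self.rng = random.Random(42)

    def test_output_sums_to_one(self):
        for _ in range(100):
            weights = [self.rng.uniform(0.1, 10.0) for _ in range(5)]
            self.assertTrue(math.isclose(sum(normalize_weights(weights)), 1.0))

    def test_relative_order_is_preserved(self):
        weights = [3.0, 1.0, 2.0]
        result = normalize_weights(weights)
        self.assertEqual(
            sorted(range(3), key=weights.__getitem__),
            sorted(range(3), key=result.__getitem__),
        )

    def test_rejects_non_positive_total(self):
        with self.assertRaises(ValueError):
            normalize_weights([0.0, 0.0])
```

Saved in a module, these tests can be run with `python -m unittest` locally or in a CI pipeline. They do not establish a truth about the system; they only guard properties that should keep holding as the code changes.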

Hillel Wayne also explored the question of whether software engineering is a real engineering discipline by interviewing people who have worked as software developers and in traditional engineering fields, like civil, electrical, mechanical, and chemical engineering. A summary of this work is presented in the talk Is Software Engineering Real Engineering? Wayne concludes software engineering is an engineering discipline, sharing similarities with traditional engineering disciplines, like being iterative, involving uncertainty, and using a mix of formal and informal methods. Software engineering is different from traditional engineering disciplines in that we can iterate much faster due to the flexibility of software and the minimal capital investment required for most experiments.^[8] Another advantage is deterministic code will always return exactly the same result, not just a result within certain tolerances. Farley also notes these two unique advantages of software engineering. Wayne reports the one thing people consistently said traditional engineering could learn from software engineering is the practice of version control. Farley would probably go even further and say version control in concert with CI/CD.^[9]

Most comparisons of software engineering to traditional engineering focus on production engineering, like the building of bridges, buildings, or widgets. More common to each engineering discipline are processes, not products. These can be processes refined in order to build products like bridges, buildings, and widgets, or processes in themselves, like in the production and management of chemicals, heat exchange, or electricity. Furthermore, engineering is often applied to incrementally improve an existing process rather than invent something entirely new. This highlights the importance of establishing a repeatable process, because without it, there is nothing to engineer. In software engineering, we can establish processes around version control, change management, testing, refactoring, modularity, scaling, and so on, and we can even combine aspects of these into a more comprehensive process we call CI/CD. Once we have a repeatable process, we can apply engineering to make it safer, cheaper, more scalable, more secure, and so on.

It is easy to write software to solve a problem. It is much harder to write software that can be adapted over time to accommodate new requirements, scale to meet growing needs, satisfy industry compliance and regulation, and be maintained and modified by someone else, or to retire the old thing and not just introduce the new thing.^[10] It is valuable to draw the distinction between a software developer and a software engineer. Anyone can write software; therefore, anyone can be a software developer. In contrast, a software engineer applies established engineering processes that result in superior outcomes over time.^[11] Software engineers apply a scientific approach, through controlling variables and performing experiments, and by applying mathematics and statistics, but they don’t use the scientific method.^[12]

The Way of Zen

Although The Way of Zen, by Alan Watts, is not a book about science or engineering, it has a lot to say about both, including subjects like rational decision making and intuition. It also describes the limitations of the scientific method:

The rigorously scientific method of predicting the future can be applied only in special cases—where prompt action is not urgent, where the factors involved are largely mechanical, or in circumstances so restricted as to be trivial.
—Alan Watts, The Way of Zen

In science, contradicting a hypothesis can be hard because experimentation is challenging and because it can be difficult to control variables. In addition, a lot of effort often goes into explaining the errors in measurements.^[13] Engineering is different in that we are not working to prove or disprove a hypothesis, but rather to iteratively and empirically improve a process by applying proven methods.^[14] Software engineering is unique in that a lot of our experiments are deterministic and perfectly reproducible, as well as cheap, fast, and easy to run.

We feel that we design rationally because we base our decisions on collecting relevant data about the matter at hand. Yet we might ask whether we really know what information is relevant, since our plans are constantly upset by utterly unforeseen incidents. We might ask how we know when we have collected enough information upon which to decide.
—Alan Watts, The Way of Zen

An important difference between engineering and the scientific method is that in engineering we need to have a sense of quality—we need to be pragmatic, we need to understand what is good enough. Science is about seeking the truth and the search for the truth can go on a long time and without valuable outcomes. Engineering is about solving a problem or making something better, sometimes from first principles and sometimes empirically, but without necessarily understanding a fundamental truth. Engineering is about delivering a certain level of quality—making something faster, cheaper, cleaner, or more reliable—but doing so within constraints like cost, time, resources, physical constraints, or product specifications. In science, we form a hypothesis and design experiments to falsify that hypothesis.^[15] In engineering, we establish a process, and apply methods rooted in science, mathematics, and statistics, to ask questions of the system, run experiments, and improve the process.

I began with a quote from Zen and the Art of Motorcycle Maintenance describing the scientific method.^[16] Zen and the Art of Motorcycle Maintenance is ultimately a book about quality, not the application of the scientific method, and I will return to it in conclusion:

The difference between a good mechanic and a bad one, like the difference between a good mathematician and a bad one, is precisely this ability to select the good facts from the bad ones on the basis of quality...This is an ability about which formal traditional scientific method has nothing to say.
—Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance

[1] Bryan mentions as an aside that if he lives long enough he will write a book on debugging methodology. I hope Bryan writes this book because I think it would be an invaluable and lasting contribution to our industry.
[2] Guess-driven debugging reminds me of the Try It Now! skit by The Kids in the Hall. The car won’t start and they come up with a number of ridiculous hypotheses, including washing the windshield and painting the car a different colour. Amusingly, the Wikipedia page for the Scientific Method notes: “Failure to develop an interesting hypothesis may lead a scientist to re-define the subject under consideration.” I wrote more about this Kids in the Hall skit in my essay Engineering as Sketch Comedy.
[3] Zen and the Art of Motorcycle Maintenance describes how it is often obvious what is wrong: “You don’t need any scientific experiments to find out what’s wrong. It’s obvious what’s wrong. What you need is an hypothesis...and the scientific method doesn’t provide any of these hypotheses. It operates only after they’re around.” Echoing Cantrill, it also describes how we focus on what we know: “You need some ideas, some hypotheses. Traditional scientific method, unfortunately, has never quite gotten around to say exactly where to pick up more of these hypotheses. Traditional scientific method has always been, at the very best, 20-20 hindsight. It’s good for seeing where you’ve been. It’s good for testing the truth of what you think you know, but it can’t tell you where you ought to go, unless where you ought to go is a continuation of where you were going in the past. Creativity, originality, inventiveness, intuition, imagination—'unstuckness,' in other words—are completely outside its domain.”
[4] Years ago, I recall needing to profile L1 and L2 CPU cache misses using the Intel VTune profiler in order to understand a particularly difficult performance problem in C++ code.
[5] Techniques for continuous improvement haven’t been popular in software, but I expect this to change as more and more of the software we currently write moves down into the infrastructure. For more on continuous improvement see my article Observations on Observability.
[6] My calculus professor in my first year of university, Dr. Boyd, used to say, “Now I will prove this the rigorous way. I usually like to avoid rigor, because with rigor comes mortis.”
[7] It could be argued that the fundamental truth is evaluating if the software is always in a releasable state. But tests never perfectly replicate production environments and we need to continue to use empirical engineering techniques to evaluate software once deployed to production systems.
[8] I studied process control engineering, a combination of applied mathematics and chemical engineering. I think what attracted me to software engineering was the ability to work in systems but experiment and iterate rapidly.
[9] Jason Cox has done excellent work to bring software engineering practices to the software and control systems that run the rides at Disney theme parks, even while concrete is being poured and steel is being assembled. This industry didn’t traditionally use these practices, but the discipline of CI/CD has significantly reduced the time it takes to develop or update a ride, and the ability to iterate quickly has allowed people to be more creative. The key was establishing a reliable and repeatable process. Jason briefly mentions some of these outcomes in the talk Disney DevOps: Creating Digital Magic.
[10] I’m reminded of Joel Spolsky’s article The Iceberg Secret, Revealed and my essay The Iceberg Secret Is Just the Tip of the Iceberg.
[11] If a team ignores or is resistant to automated testing, refactoring, evolving systems, or retiring old systems, it is a sure sign they are developing software, but not practicing software engineering. Farley warns: “Where there is a temptation to have long-running tests, or manual tests, these are often symptoms of an inappropriate lack of controlling variables. We often don’t take this idea sufficiently seriously.” Farley is clear that software engineers do not ask for time to write tests or refactor code; it is just part of doing the work. Not surprisingly, the research in the book Accelerate: Building and Scaling High Performing Technology Organizations showed that there is no trade-off between quality and speed.
[12] A measure of the quality of a software system is how adaptable it is to change. In the software classic The Mythical Man Month, Fred Brooks provides advice on planning the system for change and planning the organization for change and agrees with Farley on the importance of modularity: “The degree of this modularity determines the adaptability and changeability of the program.”
[13] I’m grateful to Percy Link for giving me the book The Way of Zen. As an aside, Percy’s PhD thesis is a study in the challenges of controlling variables and accounting for measurement error when applying the scientific method in the real world.
[14] Engineering at the university I went to was formally called the Faculty of Applied Science.
[15] Being facetious, maybe forming a hypothesis and experiments for disproving it is only part of the scientific method because of the need to apply for research grants. “I want to change a few variables and just see what happens!” is, perhaps, not convincing enough for a grant proposal.
[16] Unlike The Way of Zen, which is a book on orthodox Zen Buddhism, Pirsig warns in the introduction to Zen and the Art of Motorcycle Maintenance: “Although much has been changed for rhetorical purposes, it must be regarded in its essence as fact. However, it should in no way be associated with the great body of factual information relating to orthodox Zen Buddhist practice. It’s not very factual on motorcycles either.”