Deleting the Wrong Part or Process: Concrete Thinking in Software Organizations
Elon Musk described his design process during an interview while touring SpaceX: 1) make your requirements less dumb, 2) delete the part or the process, 3) simplify or optimize, 4) accelerate cycle time, 5) automate.[1] He emphasized that if you are not regularly adding back ten percent or more of what you delete, you probably haven’t deleted enough.[2]
For hardware, this process is extremely effective. You apply the steps continually, deleting requirements, parts, and processes, and adding some back when you have deleted too much, until you have refined the product and it is ready to ship to customers.
Similarly, for software, unnecessary requirements, components, and processes only add complexity, costs, and risks, and, therefore, must be ruthlessly deleted. However, if you are too focused on the immediate and the concrete, you may fail to grasp software abstractions and delete the wrong parts or processes, ultimately robbing software of its future reliability and extensibility.
With software, flexibility is essential. Unlike hardware, which you refine up to the point you ship it, software is how you adapt your product after it has shipped: iterating on the product based on customer feedback or changing market conditions, working around hardware defects or limitations, or adding new features or integrations. If not, why use software at all? Why not just include a fixed set of capabilities in hardware?[3]
One of the best measures of software quality is its receptiveness to change—how easily and reliably can it be modified?[4] Dave Farley, in his book Modern Software Engineering, argues for an incremental approach to software development, achieved by managing complexity and supported by automated testing that evaluates the quality of the software on an ongoing basis:
You can think of this as a fairly defensive approach, and it is, but the aim is to keep our freedom of choice open. That is one of the significant benefits of working to manage complexity. As we learn more, we can change our code on an ongoing basis to reflect that learning. I think a better adjective than “defensive” is “incremental”.
—Dave Farley, Modern Software Engineering
A story will help illustrate.
We Only Need To...
Imagine you have a software service that needs to store environment-specific configuration parameters. Someone suggests storing the configuration in a database, like SQLite, but this is rejected because of the “complexity” it would introduce: “We only need to store a file.”
A few months into production, you get an urgent escalation. The service will not start because it cannot read the file. The file is corrupted: the last time it was saved, the process terminated abruptly when the file was only half-written. It turns out you need to write the file atomically. No problem, you think. Renaming a file is an atomic operation, at least on POSIX file systems, so you modify the program to write a temporary file and, if this succeeds, rename it into place. While you are at it, you add some additional code for concurrency control. A bit more code, a few more tests, and a little more documentation, but problem solved.
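In practice, the workaround looks something like the following sketch in Python. The function name is hypothetical, the concurrency control (for example, file locking) is omitted for brevity, and it assumes POSIX rename semantics:

```python
import os
import tempfile

def write_config_atomically(path: str, data: bytes) -> None:
    # Write to a temporary file in the same directory (rename is only
    # atomic within a single file system), flush it to disk, then
    # rename it over the destination so readers never observe a
    # half-written file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ensure the bytes reach the disk
        os.replace(tmp_path, path)  # the atomic rename
    except BaseException:
        os.unlink(tmp_path)  # clean up the temporary file on failure
        raise
```

Every one of these details is easy to get subtly wrong, which is exactly the point of the story.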
A few months later, you are debugging another difficult issue. You realize the current configuration is insufficient: you wish you also had the previous configuration, to understand what has changed over time. You implement a write-ahead log that stores each configuration file. You also write a basic query language that can return the latest configuration file, the last n configuration files, or the configuration file that was in effect at a certain time. Finally, you add a command-line tool. More code, more tests, and more documentation, but you are enjoying this intellectual playground.
Later, you realize the history of changes is still insufficient. You also need to know who made the changes. You need to store an identity, add a foreign-key relationship (or at least the appearance of one), add a “where” filter to the custom query language, and perform a schema migration from the old schema to the new one that includes the identity. Even more code, even more tests, and even more documentation.
Congratulations, you are incrementally writing a database, a database that looks a lot like SQLite. What happened to just storing a file?
Software development should be incremental, but this is the wrong kind of incremental. Good software design should reduce the cost of change, not increase it. Each of these changes is a significant undertaking, involving custom code, custom tests, and fundamental changes to the original design. And only you deeply understand this code, because instead of adopting a widely used, tested, documented, and deployed industry standard like SQLite, you built your own.[5]
If you had started with SQLite, you could have used it to just store a file, and it would have done so with transactional guarantees, such that you would rarely, if ever, worry about corrupted files.[6] In addition, SQLite already supports tables, foreign keys and foreign-key constraints, a SQL query language, functions and operators for JSON that might be useful, and much more. Instead of choosing a technology to make your software both more reliable and easier to modify, you deleted the wrong part or process in the name of reducing requirements and complexity. “We don’t need a database! We only need to store a file!” You can imagine a similar scenario playing out when choosing other software components, software interfaces, programming languages, platforms, or strategies for testing, deployment, and monitoring.
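To make the contrast concrete, here is a minimal sketch of what the entire story above collapses to using Python’s built-in sqlite3 module. The schema and names are hypothetical, a sketch rather than a recommendation:

```python
import json
import sqlite3
import time

conn = sqlite3.connect("config.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS config (
        id INTEGER PRIMARY KEY,
        saved_at REAL NOT NULL,
        saved_by TEXT NOT NULL,
        body TEXT NOT NULL  -- the configuration document, as JSON
    )
""")

# Writes are transactional: a crash mid-write can never leave behind
# a half-written, corrupted configuration.
with conn:
    conn.execute(
        "INSERT INTO config (saved_at, saved_by, body) VALUES (?, ?, ?)",
        (time.time(), "alice", json.dumps({"log_level": "debug"})),
    )

# The latest configuration, the last n, a point-in-time view, and a
# "who changed what" filter are each a one-line SQL query, not a
# custom query language.
row = conn.execute(
    "SELECT body FROM config ORDER BY saved_at DESC LIMIT 1"
).fetchone()
print(json.loads(row[0]))
```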
Personality Types, Concrete Thinking, and Abstractions
Part of the problem in the example above is a failure to appreciate the fundamental abstractions. When data are stored in computing systems, it is pretty much inevitable that writes will require transactional guarantees (even if they are implicit), that changes will need to be tracked over time, and that someone, at some point, will need to understand who did what, and when. The initial requirements may be as simple as storing a file, but they will inevitably grow more complex.
Drawing abstractions is difficult, and we are not always going to get them right, but an imperfect abstraction is often better than no abstraction. In Kubernetes, it is not that YAML or JSON are perfect formats for defining the configuration of resources like deployments, secrets, or persistent volumes; both formats have plenty of trade-offs. But they are good enough, and because they are supported by the platform and everyone uses them, they come with standard tooling and well-defined expectations and system behaviors that every service can build on top of.[7] If you have a platform where every product or service needs to decide how to store, update, query, and audit configuration files, with all of the challenges of the example above, you have failed to appreciate basic abstractions, and you have failed to create a platform others can effectively build on, test, update, integrate, operate, or extend.
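To illustrate why a good-enough shared format matters: because every Kubernetes resource keeps the same small set of promises in its envelope (apiVersion, kind, metadata), generic tooling can be written once and applied to any resource. A minimal, hypothetical sketch:

```python
import json

# A hypothetical Deployment manifest, as it might be read from a file
# or from the Kubernetes API.
manifest = json.loads("""
{
  "apiVersion": "apps/v1",
  "kind": "Deployment",
  "metadata": {"name": "web", "labels": {"team": "platform"}},
  "spec": {"replicas": 3}
}
""")

def summarize(resource: dict) -> str:
    # Works unchanged for deployments, secrets, persistent volumes, or
    # any other resource, because they all share the same envelope.
    meta = resource.get("metadata", {})
    return f'{resource["kind"]}/{meta.get("name", "?")} labels={meta.get("labels", {})}'

print(summarize(manifest))  # Deployment/web labels={'team': 'platform'}
```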
Drawing effective abstractions is the essence of software development and systems design. Abstractions are essential to the non-functional qualities of software, like testability, coupling, and cohesion,[8] and abstractions determine how effectively software can be evolved over time.[9] Some abstractions, like security, privacy, redundancy, replication, durability, portability, or scalability, are very difficult to add after the fact because they change many fundamental assumptions about the software system. It doesn’t mean you need to implement each one right away, or ever, but you have to anticipate them and design for their possibility. If not, adding them later may involve a complete redesign, throwing everything out and starting from scratch. It is foolish to make assumptions like “We can add redundancy later!” in an effort to delete requirements or scope. I’ve seen organizations struggle for years to implement features like this on top of software that was developed without these abstractions in mind.[10]
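One inexpensive way to keep these possibilities open is to depend on an interface rather than on a concrete mechanism, in the spirit of Dave Farley’s Ports & Adapters guidance. A minimal sketch in Python, with hypothetical names:

```python
from typing import Protocol

class ConfigStore(Protocol):
    # A hypothetical "port": the rest of the service depends only on
    # this interface, never on a file path or a database connection.
    def save(self, body: str, saved_by: str) -> None: ...
    def latest(self) -> str: ...

class InMemoryConfigStore:
    # One adapter. A SQLite-backed or remote store could be swapped in
    # later without redesigning the callers.
    def __init__(self) -> None:
        self._history: list = []

    def save(self, body: str, saved_by: str) -> None:
        self._history.append((saved_by, body))

    def latest(self) -> str:
        return self._history[-1][1]

def start_service(store: ConfigStore) -> None:
    store.save('{"log_level": "info"}', saved_by="alice")
    print("booting with", store.latest())

start_service(InMemoryConfigStore())
```

Swapping the in-memory adapter for a durable one later is then a local change, not a redesign.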
Intuitive thinkers are speculative, theoretical, philosophical, and conceptual.[11] They ponder possibilities, not just the current status or circumstances. Less than 30 percent of the general population has a preference for intuitive thinking, and intuitive systems thinkers are an even smaller population.[12] The majority of the population is sensing, attending concretely to tangible, observable facts. Hardware is very concrete, especially after it is manufactured and ships to a customer. Software is far more abstract, especially when designing software for extensibility and change. Intuitive systems thinkers tend to annoy others, being perceived as too focused on the future, the ideal, or the theoretical, especially when the dominant thinking style is more concrete and intuitive thinking is not well understood.[13] Concrete thinkers prefer to focus on more immediate objectives, on what can be measured, and on a concrete plan. But if you don’t trust the intuitive systems thinkers in your organization to help draw abstractions and design software systems, you are likely deleting the wrong things when it comes to simplifying software design.[14]
Deleting the Wrong Part or Process
Fred Brooks, in his famous book on software development and project management based on his experiences developing the operating system for IBM’s System/360 mainframe computer, observed:
Of course the technological base on which one builds is always advancing. As soon as one freezes a design, it becomes obsolete in terms of its concepts.
— Fred Brooks, The Mythical Man-Month
This is true of hardware and software. For hardware, eventually one must stop iterating and freeze the design, at least for that revision of the product, in order to manufacture it and ship it to customers within the specifications. With hardware, deleting requirements, parts, and processes means eliminating what needs to be procured, manufactured, certified, tested, or replaced when it fails. This improves quality and reliability, saves time, and reduces costs.
Software is different. The whole purpose of software is to be able to change it after you ship it, evolving it without changing the hardware.[15] When you simplify software by deleting requirements, or by deleting parts or processes, you need to ensure you are not also deleting the abstractions that will later provide flexibility and reliability. This is a difficult balance to strike, because just like hardware, adding the wrong components only increases development and operational complexity, and the cost, time, and risks of change. Appreciating the difference can be difficult for concrete thinkers or for people without years of experience in software development and operations.[16]
For software, trust the technologies, patterns, practices, and people that have demonstrated success in adapting to change, and be extra careful not to delete the wrong part or process. Simple is always better, but less isn’t always more. It is never as simple or as concrete as, “We only need to store a file.”[17]
Elon describes this process at 13:25 in the video Starbase Tour with Elon Musk. ↩︎
While this engineering process is effective for deleting requirements and processes, very few organizations seem to be able to apply it to administrative bureaucracy, as I wrote in Administrivia: Reconsidering the Engineering and Management Tracks. ↩︎
For example, relying on application-specific integrated circuits (ASICs). The AWS Nitro System uses ASICs for dedicated, off-loaded processing for storage, networking, Remote Direct Memory Access (RDMA), and more, as described by James Hamilton in AWS Nitro System. ↩︎
The DORA metrics are measures of the velocity and the stability of software changes. ↩︎
LLMs are experts in SQLite, but not your custom database. ↩︎
The SQLite documentation recommends it as an excellent choice for application files and as a replacement for ad hoc disk files: Appropriate Uses For SQLite. Using SQLite can be faster than using the file system: 35% Faster Than The Filesystem. ↩︎
I encourage you to study Promise Theory developed by Mark Burgess. If we can understand the system in terms of the underlying promises it can keep, it enables many others to build on top of this foundation, including building other software, writing documentation, drafting and executing test plans, and answering Requests for Proposal (RFPs) in a sales process. If the promises are unclear or constantly changing, all of this becomes very difficult. Mark describes the challenge of systems development in his book Thinking in Promises: Designing Systems for Cooperation: “Figuring out how systemic promises emerge from the real promises of individual agents is the task of the modern engineer. We should not make light of the challenge. System designers think more like town planners than bricklayers, and need to know how to think around corners, not merely in straight lines.” ↩︎
In Modern Software Engineering, Dave Farley defines coupling as: “B must change behavior only because A changed”. He contrasts this with cohesion: “When a change to A allows B to change so that both add new value”. In a large organization, it is critical to minimize coupling and maximize cohesion. ↩︎
In Modern Software Engineering, Dave Farley recommends a layer of abstraction when interfacing with software from a different scope: “As a default stance, or a guideline, I recommend that you always add Ports & Adapters where the code that you talk to is in a different scope of evaluation, such as a different repo or a different deployment pipeline. Taking a more defensive stance in these situations will make your code more testable, yet again, and more robust in the face of change.” ↩︎
Another example I’ve seen is avoiding abstractions for naming telemetry, concatenating this information into tag names instead. You end up with tag names like Site1_Building4_Line4_BackupPump_Inlet_m3s_new_v2. This approach is certainly concrete and helps you move fast, at least initially, but it eventually breaks down when people name things inconsistently, no one agrees on the asset definitions and relationships, parts are replaced or processes are modified, and organizations change through mergers and acquisitions. You can’t even rename things because you are never sure what will break. One of the important innovations OSIsoft brought to the PI Server was to store telemetry with an identifier that was simply a foreign key to an asset database that stored rich and extensible information describing relationships, like hierarchy, location, proximity, device type, and direction, that could be used for grouping, search, visualization, or structured analytics. For more on developing abstractions for telemetry, see my essay From a Time-Series Database to a Key Operational Technology for the Enterprise: Part II and my keynote talk From Fast-Data to a Key Operational Technology for the Enterprise. ↩︎

I am using the terms intuitive/intuition and sensing/sensation from analytical psychology and the research of Carl Jung. Most people are familiar with these terms from personality tests for the Myers–Briggs Type Indicator, which is derived from Jung’s work but different from Jung’s model of typology. For more on Jung’s typology, our dominant and less developed ways of functioning, and how these can change over time, listen to Daryl Sharp on episode #5 of the Speaking of Jung podcast: Jung’s Model of Typology. To understand more about typology, I recommend reading Please Understand Me II by David Keirsey and Personality Types: Jung’s Model of Typology by Daryl Sharp. ↩︎
The fact that intuitive thinkers are less than 30 percent of the general population is widely quoted. I believe the study in MBTI® Manual for the Global Step I™ and Step II™ Assessments 4th Ed is the source of this data, but it is also observed by organizations that administer various personality tests. The large difference in preference between sensing and intuitive types is much wider than the other two functions, thinking versus feeling, which is closer to 50 percent. One way of functioning, intuition versus sensing, or thinking versus feeling, isn’t better than the other—they are complementary. It is just that usually we have a preference for one because it has served us well, while the complementary one remains underdeveloped. ↩︎
One of the things that annoys people about intuitive systems thinkers is that they tend to repeatedly diverge their thinking when everyone else is trying to converge, asking questions like, “Are we solving the right problem?” or “Do we have the right solution?” This is described as the “Double-Diamond Model of Design” in the book The Design of Everyday Things, by Don Norman. ↩︎
We have all been involved in a project where we added too much complexity anticipating needs that never materialized. There is often a small difference between adding components or features that only add complexity versus adding abstractions that make software reliable and flexible. Software teams erode trust when they promise the right abstractions but then build software much more concretely. The software ends up overly complex, hard to test, narrow in scope, and difficult to change. Projects end up late or failed. The next project becomes more difficult, not easier. In my opinion, this is often the result of sensing types masquerading as intuitive thinkers when it is not their natural thinking style. It is easy to write software all day long, so leadership needs to be exceptionally careful that software teams are capable of building effective abstractions. I recommend leaders focus less on measures of software quality that are divorced from how quality software is created, like the number of bug escapes, and more on measuring practices that achieve quality software, like the time it takes to run tests, the quality of test coverage, or the DORA metrics. While these measures are more abstractly related to quality than the concrete things you are concerned with—the number of bug escapes, the customer impact—as Will Larson says in the essay How to create software quality, “Feedback across iterations tends to measure quality, which informs future quality, but does not necessarily create it.” This is consistent with Dave Farley’s focus on improving the software development process through practices like automated testing and Continuous Delivery (CD), as described in Modern Software Engineering. This is also why Amazon focuses on input metrics rather than output metrics—controllable, leading indicators rather than lagging indicators—as described in Working Backwards: Insights, Stories, and Secrets from Inside Amazon, by Colin Bryar and Bill Carr. ↩︎
Manufacturing probably falls somewhere between hardware and software. Manufacturing is concrete and less abstract than software, but it is also a process and it can be refined over time, as the Japanese made famous with methods for continuous improvement. ↩︎
While Elon puts a lot of emphasis on concrete first-principles thinking, I believe he often relies on his intuition when simplifying problems. I am an intuitive thinker. This can be challenging when I just have a feeling something will work, or won’t work, but I can’t yet articulate why from first principles. I appreciate when colleagues respect my intuition for a problem, but also create space to return to it in a few days once my intuition has had time to develop into more concrete arguments or examples. Naturally, my intuition is not always right. Writing is one of the best ways to go from intuitive thinking to more concrete arguments. It refines and improves the quality of thinking, and allows intuitive thinkers to engage concretely with sensing types. This is one of the reasons Amazon has seen such value in a culture of writing, as described by Jeff Bezos on Lex Fridman’s podcast. ↩︎
Just so none of the examples I’ve provided here are misunderstood, they came from observing colleagues of mine over a decade ago, intuitive thinkers regularly misunderstood by the concrete thinkers they worked with. If the examples resonate, that’s because I think they are universal, and it is largely why I wrote this essay. ↩︎