Rethinking Streaming Workloads with Akka Streams: Part III

This series focuses on recomposing workloads that are not traditionally expressed as streaming workloads. Employing a streaming approach can make a problem simpler, more elegant, and more natural. The first article explored streaming tools for flow control, bounded resource constraints, and error handling, and demonstrated how these tools can be…

On Eliminating Error in Distributed Software Systems

This article expands on my talk What Lies Between: The Challenges of Operationalising Microservices from QCon London 2019. In designing, developing, operating, and evolving distributed software systems, failure must be considered intrinsic to the system. In any sufficiently large or complex system, there will always be something that has failed.…

Engineering as Sketch Comedy

I do not watch a lot of comedy, but there have been a few moments that have struck my fancy and remained with me. I want to recall three sketches and relate them to engineering. Anyone who knows me will have heard me relate one or more of these sketches…

Rethinking Streaming Workloads with Akka Streams: Part II

In the first installment of this series, I demonstrated how materializing an Akka Stream is relatively inexpensive and I explored how to express a number of workloads—workloads that you may not initially think of as streaming workloads—using the Akka Streams API to provide concurrency control, throttling, circuit breaking,…

Observations on Observability

This article expands on one section of my talk What Lies Between: The Challenges of Operationalising Microservices from QCon London 2019. Over the past few years, observability has become a prominent topic in distributed computing. Observability means different things to different people and the use of the term is still…

Rethinking Streaming Workloads with Akka Streams: Part I

The Akka Streams API is one of my favourite tools for building reactive, distributed applications. If you are not familiar with it, I published an article on the motivations for using the Akka Streams API, as well as an article on how its powerful semantics address common patterns when streaming…

Licensing Software for Mutual Success

Most enterprises make extensive use of open-source software. Many are attracted by the price: free. Free from the perspective of some people, anyway. Most enterprises also purchase a lot of commercial software and services. I want to explore how licensing terms often discourage using the software most effectively—encouraging suboptimal…

Kubernetes Liveness and Readiness Probes: Looking for More Feet

Kubernetes liveness and readiness probes are mechanisms to improve service reliability and availability. For example, if a container is unresponsive, restarting the container can make the application more available, despite the defect. I have written two articles on how these mechanisms, designed to improve system reliability and availability, can make…

Kubernetes Liveness and Readiness Probes Revisited: How to Avoid Shooting Yourself in the Other Foot

Previously, I wrote an essay describing how Kubernetes liveness and readiness probes can unintentionally reduce service availability, or result in prolonged outages. In the conclusion of that essay, I highlighted Lorin's Conjecture: Once a system reaches a certain level of reliability, most major incidents will involve: A manual intervention that…

Essential Software Tools for Developing Operational Technologies

This article expands on a keynote that I gave at Reactive Summit 2018. I also discussed these topics on the Real-World Architecture Panel at QCon San Francisco 2018. An operational technology combines hardware and software to monitor and/or control the physical state of a system. Examples include systems used…

Kubernetes Liveness and Readiness Probes: How to Avoid Shooting Yourself in the Foot

Kubernetes liveness and readiness probes can make a service more robust and more resilient by reducing operational issues and improving the quality of service. However, if these probes are not implemented carefully, they can severely degrade the overall operation of a service, to a point where you…