Will the Real Continuous Delivery Please Stand Up?
Summary:
Are you actually doing continuous delivery—or just checking the boxes? In this insightful talk from Prodacity 2025, Edward Hieatt, Chief Customer Officer at Mechanical Orchard and an original pioneer of Agile, breaks down what real continuous delivery looks like, why most teams fail at it, and the key metric that actually measures success.
Hieatt challenges common Agile vanity metrics, explains why most organizations still operate like it’s the 1990s, and introduces a practical framework for reducing volatility, increasing predictability, and making continuous delivery a reality.
🔹 Key Topics Covered:
- The real goal of continuous delivery (hint: it’s not just deploying faster)
- Why most organizations are still stuck in outdated software delivery models
- The problem with measuring Agile success using vanity metrics
- The volatility metric: The best way to measure predictable software delivery
- How disciplined feedback loops drive real agility
- Why software teams need less chaos, more discipline, and better measurement
🕒 Key Highlights & Timestamps:
[00:03] - Introduction: The origins of Agile & the evolution of continuous delivery
[02:45] - Why most teams focus on the wrong Agile metrics
[06:04] - The traditional software delivery model—and why it fails
[09:36] - How modern product teams must adapt to ever-changing requirements
[12:35] - The shift from big upfront design to continuous improvement
[14:58] - The volatility metric: Measuring true agility & predictable output
[18:38] - The disciplines that drive continuous delivery success
[22:20] - Why software culture should follow business goals—not the other way around
[24:19] - Final thoughts: CD isn’t a toolset—it’s a mindset
🔗 Stay Connected:
Learn more about Prodacity: https://www.rise8.us/prodacity
Follow us on X: https://x.com/Rise8_Inc
Connect with us on LinkedIn: https://www.linkedin.com/company/rise8/
👍 Like, Subscribe, and Share:
If you’re serious about delivering high-quality software predictably, reducing volatility, and building a truly Agile organization, give this video a thumbs up, subscribe to our channel, and share it with your network. Agility isn’t about speed—it’s about consistency.
#Prodacity2025 #EdwardHieatt #ContinuousDelivery #Agile #SoftwareDevelopment #PredictableDelivery #DevOps #LeanSoftware
Transcript:
Edward Hieatt (00:03):
Well, thank you Bryon for that kind introduction. It's great to be here. So many familiar faces. Looking forward to this talk. Bryon mentioned that he wanted me to come because I was one of the OGs of Agile, which I like. I like being an original gangster despite my accent. I think I can pull it off. It's a lot better than being called old school, because "old" doesn't sound good. But it's true that I've been working on software development and how to help large organizations adopt modern practices for some quarter of a century now, if you can believe it, and that includes the last decade or so focused in particular on the federal government, the DOD, and government in general approaching software in a different way. Some of you in the audience have been part of that, so thanks for that, and it's great to see you all still leading the charge in this very important area of our country's security and defense.
(01:01):
Now, Bryon wants me to talk about continuous delivery, which we'll attempt to define in a minute, but let's just say for now it means doing software right these days, kind of the aggregation of all the stuff we're supposed to be doing and all the good practices and all the lean theory and the XP and the Agile and the UCD and all that. How do you do that well, all the time, or something? That's what continuous delivery means. And so I expect you're in fear of me just giving you a list of practices that you should be doing and wagging my finger and telling you you're not doing them well enough, and do more testing and more pairing or something. I'm not going to do that. I mean, I may do that at the end, but I'm not going to do that to begin with, because in my experience talking to large organizations trying to adopt modern software development practices, they tend to be too caught up in the details of the individual practices.
(01:56):
Are you doing this? Are you doing that? Will they pair program? Will they write tests? Are they doing CI? And the kind of corresponding vanity metrics, if I can call them that, are perhaps useful in isolation to monitor some particular part of the puzzle, but frankly tend to make you lose the wood for the trees. The big picture being: what's the point of adopting continuous delivery? How do you know whether you're doing it well or not? Where did it come from, et cetera? So I want to step back today, and perhaps some of you will find this stuff obvious, but I think it's worth going through from first principles again what the point of continuous delivery is and how you know you're doing it well. I've even got a metric for you to consider that is not widely adopted, but that I'm pretty passionate about.
(02:45):
That's kind of a high banner metric for how you're doing. So I'll do it in four parts. First of all, what was wrong with the old way of doing software development? This CD thing sounds kind of hard to implement. Why was the old way bad? What was the old way, by the way? What are we doing? Are we doing it right or not? This can also be couched as: what does a good software development team mean these days? It's kind of a big question. I want to expose the harsh reality of how hard it is to do CD well, but encourage you to embrace it, embrace the red pill, and improve how you're doing with continuous delivery. So it's a big conversation, and I'm happy to talk in more detail afterwards, but I want to get back to the fundamentals and give you a good motivation in your organization for the why and for how well you are doing. So what's the point of CD? Where did it come from? It seems like it's pretty hard to do. Are we actually finding value in it? Et cetera, et cetera.
(03:50):
Humor me for a minute while I go back to the old way, which was the way when I was coming along as an engineer in the nineties, and frankly, if I may say so, when I go to most organizations it's still ultimately the default way, because it makes a lot of sense to the industrial revolution mindset, to the managerial mindset, to the divide and conquer mindset, to the analogies to civil engineering, electrical engineering, et cetera. You all know this, but how software used to be built was one big sequence, the famous waterfall. What are you going to build? Product people write down requirements. How are you going to build it? Architects design important things that tell you how you're going to architect the system. What's it going to look like? Designers make it all pretty in a PowerPoint or in a deck, and you end up with perhaps a thing this thick, hundreds of pages, that says we're going to build this: a product requirements document. You then build it, hopefully cheap and fast, per the spec, right? You then make sure it works right, and then you hit the deadline, and then you give it to the users and you won, right? That's what the old way looked like. In other words, success means being done on time. When it comes to delivering the software, it's really that engineering and QA phase.
(05:15):
There you go. That's software development as an analogy to civil engineering. You're going to build a bridge. You'd better get that bridge figured out just right: where it's going to start, where it's going to end, what materials you're going to use, how much weight it should carry, how many cars at once, how it's going to be a beautiful, beautiful bridge. Sounds like somebody I know, a big beautiful bridge. And then how you're going to get talent to go build that bridge, and then make sure it can carry the weight you expected or whatever. You open it with a ribbon cutting and you're done. Maybe you've got to support it afterwards, but you're basically done. So what's wrong with the old way? We'll come back to that after this beautiful graph. How do we graph our delivery in a very simple, high-level way? Well, I want to first give you a loose definition of what I mean by delivery here.
(06:04):
I'm not going to get into the whole what's a point, what's velocity, what's business value, what's a chore. But basically, let's define delivery for the purposes of this talk to be the production, meaning it's ready for production, of some functionality in software that's useful for some business purpose or end-user purpose, some mission purpose. In other words, someone who's going to use it wants this, or their proxy does. We'll put time down here, and we'll put this notion of output, business output, up the Y axis. That Y axis means actually valuable stuff getting into the hands of end users. It doesn't mean I worked really hard fixing a bug or writing architectural documents or refactoring something or setting up my CI environment or getting my cloud ready or setting up my cloud database. It means somebody got some stuff in their hands they could use.
(06:54):
That's what that means. So if you work for three days on architecture, you get zero down there, and the old way gives you zero for quite a while, right? You get your requirements done: very smart people, a long process, lots of thinking, lots of what we'd now call hypotheses but at the time were decisions. Architecture done: very smart people writing lots of blueprints. Design done: lots of UI flow work, et cetera. Very important, take your time, expensive, maybe years. And then you build it and you QA it, right? Now what does that look like in the old way?
(07:32):
I think it usually looks like this, and when I explain what I mean by this, I usually get people with experience in this kind of nodding their heads. It can be that you're off to the races, people are building really fast, they can't wait to build the spec. Then they realize, wait a minute, what do you mean by that again? And they go back to the product person, who's in a different division now and has kind of lost track of what they're doing and didn't really understand the question. And then you realize you've made a mistake and there's a bug, and then you realize, et cetera, et cetera. You're going to get these peaks and valleys. Maybe the boss comes and says, we're going to do it, and you go up again, and they buy you pizza and you stay all night, and then near the end there's kind of a burst of activity. Maybe you hit the deadline, and then you fall over exhausted. And why the ups and downs? I won't read them out, but for these kinds of reasons. Lots of very, very good reasons. Didn't get the requirement. That guy's code didn't work. I had a bug I couldn't figure out. The architecture didn't actually make sense in this case; sorry, architect, and he or she isn't even available anymore. Integrations didn't exist, et cetera, right? All this stuff means you get peaks and valleys. But what was really wrong with that?
(08:43):
Well, I don't think anything is wrong with that, actually. If you're satisfied that what you decided to build to begin with was the right thing to build, and you decide that building it on time means success, that can work. You may end up exhausted with burned-out teams who want to leave, but if you only care about the end result, I've seen it work, right? You can make that work. That sounds controversial these days, but I've seen it work. The thing is, then something happened around the 2000s, and I still think it happened because big internet companies began to make amazing software, and then they made it on phones, and we all got very spoiled in our consumer home lives and then decided, why is our work software so sucky, right? So someone changed the rules around that time, and it's kind of bled into enterprise and government over the last couple of decades.
(09:36):
Slowly but surely, or something, you tell me. The new rules are that the proxy for the user says: I don't quite know exactly what to build, but I know what I want to have happen for the end user to be more successful, or for the business or for the mission. I know what I want to have happen, my goals. I think I know the kind of thing that we have to build to make that thing happen. So let's try out some of those early ideas and see if it works, right? That's what I want to do now, and if it doesn't work, we'll change direction or we'll go faster. We'll go that way. I'll go that way. We will pivot, and as a result, I want you as a delivery team to be able to handle that. You can handle that, right? Changing requirements, changing direction.
(10:19):
You can't be telling me it's slower. You can't slow down. They don't slow down in internet land, in Google and Facebook and so on. And also, we're never going to be done, by the way. We'll always keep going and making it better and better. Now imagine that story being told to a team used to doing it the old way. I can tell you what happened. They threw up all over it and said, you must be crazy. You can't change the requirements. That's a sacrosanct set of documentation, and we can't go fast enough if you want to do that, and we can't change it all the time. What are you talking about? But suddenly that became the expectation, right? In other words, we are trying to tell this delivery team: we're going to change direction on you, and the architecture can't always stay the same, and you might change the design or you might change the direction of the product, but I want you to keep going fast and also be predictable too.
(11:11):
If you think of it that way, that is a big thing to ask of a delivery team, and that is what I would define as continuous delivery: how do you predictably make progress on business output despite all those changes happening? So back to this graph. That's what it used to look like, right? There's a deadline at the end and you have this big long cycle to get there. But now the business or the proxy user may say, you know what? I want to deliver what you've built so far because I feel like it right now, or they need me out in the field. And then, you know what, that didn't work so well, change direction and deliver this now, and now build this and deliver it now. In other words, arbitrarily, at the whim of the proxy user, the business or the PM or whatever, they can just tell you when to deliver this thing.
(11:52):
That's the expectation. And if you're familiar with lean theory, of course, you have this notion of de-risking at every phase, the sawtooth diagram, et cetera, right? This is the way you do lean development. The problem is that along the bottom there, that old sequence of events does not work, because that sequence expects one deliverable at the very end, and you can kind of mess around in the middle and screw up and go up and down in that wavy diagram and no one cares. But now it does matter, because you're going to deliver at an arbitrary time. These are deliver-now, deliver-tomorrow, deliver-today, try-this-out kinds of deadlines. So that won't work. You've got to get rid of that thinking about the sequence of development entirely and do a different sequence all the time, like every day, every few hours.
(12:35):
And that sequence looks more like this. The requirements will change or be given, and the design will happen at the same time, usually for a very small amount of functionality. And then it turns out that the way to go fast cannot be that the engineers build it and then QA tests it. That's too slow. Just the handoff itself is too slow; it might take a day, and that's too slow for that handoff. So something else has to happen. And it turns out, and we'll get back to it, that if you define the requirements in some kind of a test suite and then build against them, you actually can pull off a vastly compressed iteration. And then, very weirdly, architecture happens at the end. You extract the generalities as you go and, with discipline, refactor the code to support the requirements that have been given. A very different loop, happening many times per day.
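A minimal sketch of that test-first loop, in Python. The feature, the function name, and the pytest-style tests here are hypothetical illustrations, not anything from the talk: the requirement is written down as failing tests first, just enough code is written to make them pass, and refactoring happens afterwards with the tests as a safety net.

```python
# A requirement arrives from the backlog: "discounts over 50% need manager approval."
# Step 1: encode the expectation as tests BEFORE the implementation exists,
# and watch them fail. Step 2: write just enough code to make them pass.
# Step 3: refactor freely; the tests catch any deviation from expectation.

def requires_approval(discount_pct: float) -> bool:
    """Return True when a discount is large enough to need manager sign-off."""
    return discount_pct > 50.0

def test_small_discount_needs_no_approval():
    assert not requires_approval(10.0)

def test_large_discount_needs_approval():
    assert requires_approval(75.0)

def test_boundary_is_not_flagged():
    # The requirement says "over 50%", so exactly 50% needs no approval.
    assert not requires_approval(50.0)
```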
(13:22):
That's the implementation of CD that turns out to work the best against this new expectation. And if teams don't know that and they're trying to do it the old way, that's when things go horribly wrong. Now, how are we doing with CD? How do we know how well we're doing? How do we avoid getting caught up in the vanity metrics, the DORA metrics, the DevOps metrics, the SRE metrics, the Agile metrics? There are lots of good metrics, but at the highest level, are we doing well? And this is where I don't see a lot of organizations being honest or really seeing the wood for the trees. What would really well look like? If CD's goal is predictable delivery of business output in the face of radically changing requirements and learnings, it would look like that: constant, very high output despite all the change. Now, that's ridiculous. That never happens. No one ever gets that. It's not possible to keep being really fast as a team indefinitely despite all the changing requirements and setups and tooling and architecture changes and so on. You're not going to get that. What does pretty good actually look like? Something like this.
(14:35):
I think a pretty good team is delivering in a way that's up and down but on average is fairly high, and never has the really big troughs or the really big peaks. There isn't a product person in the world who would tell you they'd much rather have sometimes really high velocity but also unpredictably low velocity than a pretty high, more stable velocity. Everyone wants a pretty high, stable velocity, not one that's sometimes very high but unpredictably low as well. So how do we measure it? Well, this is a metric that I have preached about for a while: volatility. As a summary metric it can't tell you the details of what's breaking down, but it does measure how consistently productive a team is and how little the level of velocity shifts over time. In other words, how the team can react to changing requirements, how they can react by refactoring their code to make the architectural shifts those requirements demand, how they can fix bugs, how they can deploy, how they can, et cetera, et cetera.
(15:40):
How does all of that become just noise, stable, right? You want as low a velocity volatility as possible. So what is this volatility? If you'll forgive the middle school, or maybe it's high school, math, let's define it to be the deviation around the velocity. We all know velocity probably means something like this: the mean, the trailing mean, of the last few iterations of your output. So that's about the mean of that line; that's velocity in most people's parlance. The deviation around that, basically how much it differs, the standard deviation, gives you some measurement. You can even divide that by the mean and end up with a unitless percentage. So independent of how many points you've measured things in or how a team calibrates its estimates, you end up with a percentage volatility, meaning, independent of your absolute mean, how much is it changing as a percentage? And that, I propose, is a very nice summary metric to indicate, all told, everything factored in over the long term, how consistently the team is producing business functionality for the PM or the proxy or the end user.
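To make the math concrete, here is a rough sketch of that calculation in Python, my own illustration of what he describes (a coefficient of variation of velocity over a trailing window), not code from the talk. The function name and the window size are assumptions.

```python
from statistics import mean, pstdev

def volatility(points_per_iteration: list[float], window: int = 8) -> float:
    """Standard deviation of velocity divided by its mean, over a trailing window.

    Returns a unitless ratio (multiply by 100 for a percentage), so it is
    independent of how a team calibrates its estimates or sizes its points.
    Assumes at least one recorded iteration.
    """
    recent = points_per_iteration[-window:]  # the last few iterations
    velocity = mean(recent)                  # "velocity" in most people's parlance
    deviation = pstdev(recent)               # how much output swings around that mean
    return deviation / velocity if velocity else 0.0
```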
(16:58):
So that's what I recommend you look into for your own organization, and this gets us to the red pill. This is a moment of truth for many people when they first look at their volatility, because many teams have worked very hard on their practices, they're doing well, things look good, and then you measure this metric. Ultimately, you'd love to get to a volatility near zero, right? Not much change, which is obviously very difficult. And once you've realized how volatile you are, it's very hard to unsee it, and you start to realize the chaos that's going on no matter how much you think you're in control in real life. At one point we took a sample of a hundred thousand projects in a project management tool that many of you have probably heard of, anonymized of course, and these were teams that were using an agile project management tool when that was still kind of a new thing.
(17:49):
So they were really trying to do agile development if they used this tool. Of these hundred thousand, about a hundred were my fault, my problem. But out of a hundred thousand, the average volatility, the average standard deviation about the mean as a ratio, was 92%. In other words, every iteration, every week or whatever it was, looked literally like a sine wave, up and down by an amount equal to the mean. Every week. People who were trying to do agile really well had that result. Thankfully, our projects had a 50% one, so I could tell our customers we're better. But still, 50% doesn't sound very good, and it's not very good, except that it's really hard to do CD well when you think of it in these terms.
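A hypothetical comparison using the volatility sketch above, with made-up numbers: the first team's output swings almost as wide as its mean, like the sine wave he describes; the second averages the same output with a much steadier cadence.

```python
# Two teams with the same average output (~10 points per iteration).
swingy = [1, 19, 2, 18, 1, 19, 2, 18]   # peaks and valleys nearly the size of the mean
steady = [9, 11, 10, 9, 11, 10, 9, 11]  # a small wobble around the mean

print(f"swingy: {volatility(swingy):.0%}")  # ~85%, in the neighborhood of that 92% average
print(f"steady: {volatility(steady):.0%}")  # ~9%, the "pretty good" picture
```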
(18:38):
So finally, how is it that we can try to control this entropy, this chaos, and get the volatility down? What chance do we have as humans dealing with other humans and computers and systems and entropy and integrations and the rest of it, and of course people changing their minds about what they want to build, and users saying, I don't like this, I do like that, a lot of fickle people as well as computers and developers in the mix? What chance do we have? There is an antidote, and if you know me, I love this word: discipline. There is an antidote, and I'm not going to go through these all in very great detail given the time, but maybe a minute or two on each, and this sounds like some kind of a COVID slide: how do you flatten the curve of velocity? But the antidote, the one chance we have, is to adopt very strict disciplines, all of which are designed to notice, in the microcosm, deviation from expectation, and which build in a self-healing facility.
(19:41):
If you do that and ratchet progress forward, you have a chance. If you don't, entropy will win and your volatility will be up in the nineties or whatever. So that's where these practices come in. In my mind, and this is at the end of my presentation, these practices are an attempt, at first a Hail Mary (these sound crazy, let's just try this, nothing else seems to work), an attempt to control the chaos, and they're all about feedback loops, as I'm sure you know. So, what are we building? Not a big old spec that's very static and can't change very easily. Let's have a visible backlog that can be changed by the product team or the development team at any given time, broken into chunks small enough to be estimatable by a developer and doable in a few days. It's a moving target, it's changing, and it's clearly prioritized. Ideally, it's a strict linear priority.
(20:35):
There are no multiple equal priorities, et cetera. You can see it wherever you're sitting. You can talk about it, you can challenge it, you can reorder it, you can remove things, you can add things. It's living, but it's visible. Number two, back to the QA idea: have the developers write tests that document their expectations, such that if they don't match, you immediately know the code is not what you expected. So there's no moment where you're saying, I think it's working fine because I'm smart. That's a horrible thing to say. Don't say that. Write the test first. Prove that it's working or not. Continuous integration: code that you write, code that that person writes, is it working together? Find out right away. Refactoring, re-architecting: that architecture didn't actually work for the use case we actually have. No one knew that, and it wasn't the architect's fault, but now we realize that we should change the code, refactor the code, deduplicate it, et cetera, right?
(21:33):
Feedback on the duplication in the code or on the poor performance of the code, for example. Then the famous pair programming, which I never put first because it tends to take all the energy out of the room. Pairing is designed for what? Not for consultants to make money, not for people to say, I don't like that, not for geeks to be uncomfortable, but to force ideas out into the sunlight and be exposed. Clearly, what are we trying to build? What's the test we're going to write? I don't understand you. I don't understand that. Back and forth, back and forth. Have the argument in real time, near the work, in the microcosm. Then write the code. Why are you doing that? Why are you doing that? The forced verbalization of pairing surfaces disagreement, surfaces unreadable code problems, surfaces poor tests, near the work and quickly. And of course retros, where the whole team is feeding back on itself.
(22:20):
What's working? What's not working? There's a longer list than this; these are the most famous ones from the original OG Agile. All of these things are jargony. They sound like a consultant. They sound like going to the gym on Monday morning. They sound like people wagging their fingers about disciplines, and I've found that to be a major problem if you lead with these practices. However, if you lead with: we want to produce predictable business output that's not volatile and serves the whims of the business, and we're going to measure how well we're doing at that and see if these kinds of things help make it less volatile, then the whole mentality of the team, the whole rebellious, resistant instinct of the developer, tends to melt into "we'll see what works," right? And what longevity the consultancy that Bryon mentioned I was part of had came not from preaching about these practices, but rather from the commitment to seeing which practices actually have a good effect on the volatility, basically on a simple metric, and proving whether that worked or not.
(23:26):
These things came out of desperation. If you read the original XP books, it's like: nothing's working, we suck in a big organization, what if we turned the obvious stuff that everyone agrees on up to 11 out of 10, right? You've probably read about this. If QA is a good idea, let's QA all the time. Let's do it before we write the code. Crazy. If code reviews are a good idea, let's code review all the time: pair programming. Crazy, et cetera. No one thought it would be successful. It's crazy talk, but it turned out it was. So when you innovate on practices, you've got to be proving that they do or don't work through data, and I recommend that volatility metric. And ultimately, then, what we would say is that CD is a goal, a north star, a way of thinking about shipping valuable software on demand, consistently, predictably, et cetera, right?
(24:19):
That's what it is. It doesn't say much about how fast you're doing it in an absolute sense, but rather that it's continuous, it's predictable, it's ongoing. And to that end, you've got to think of it as a competency, a culture, et cetera. That's where those words come from, but do it this way around: business goal, then metric, then practices and culture. Don't do culture first. Don't do practices first. You'll get resistance, right? Think of it this way around, and then suddenly people are more open. And I think I'm out of time right now, so thank you all.