Agile and DevOps are on the rise. Hardly anyone these days defends up-front design approaches or project-based over product-based change. Many seem to see ‘agile’ as a free pass for unlimited change. This is an illusion. Working ‘agile’ can in fact decrease your agility.
Suppose you have a hypothetical (I have to stress this example is hypothetical, but it is based on stories I have heard from people in my network) large company that is working hard on getting the following done:
- Moving all IT change initiatives to Agile/DevOps, e.g. by adopting a framework like SAFe. The company has been working with Agile approaches for a while and is maturing to DevOps. The idea is to improve above all agility and speed.
The following new initiatives are already being done in an Agile/DevOps way:
- Creating new core infrastructure that is automated (“infrastructure as code”) and cloud-ready. The goal is to be faster, less error-prone and cheaper (fewer infrastructure operators). Many medium and large organisations are on this route, as it is often a prerequisite for being able to profit from the cloud.
- Moving its ESB and the services employed on it from BizTalk to Mule. The BizTalk contract was almost at an end and Mule won the RfP.
- As a result of the DevOps initiative: setting up new continuous integration and deployment pipelines, based on new deployment-supporting software, with automatic testing and all that.
- Combining all kinds of logging and messaging from operations and applications, managing specific access to this, creating predictive analytics on it, building regulatory compliance on top of it, etc.
While these are clearly transformations that are supportive of business goals, the actual business that makes the money is still changing as well, with a great number of large and small projects in the primary line of business. A new trading application here, a new reporting environment there, data lakes and analytics all over the place, you get the picture. Any organisation is naturally a seething cauldron of change, even if there are no big transformation programs in place. Oh, and don’t forget that all those existing setups often need to adapt to these new initiatives as well. If your software engineering currently uses many different versions of frameworks and platforms, the automated infrastructure/platform will not support them all, just as the cloud PaaS offerings don’t.
Now, a massive change portfolio like the one pictured above is already rife with risk. As Steve McConnell wrote: “Seymour Cray, the designer of the Cray supercomputers, says that he does not attempt to exceed engineering limits in more than two areas at a time because the risk of failure is too high. Many software projects could learn a lesson from Cray. If your project strains the limits of computer science by requiring the creation of new algorithms or new computing practices, you’re not doing software development, you’re doing software research.”
Cray was talking about technological innovation from the perspective of computing machinery, but from the enterprise perspective that kind of parallel change is exactly what is happening in the above scenario. Next to an entirely new way of working, four major new techniques are put in place: the IAM platform, the infrastructure automation platform, the ESB and services platform, and the continuous integration software. So our hypothetical company is not doing enterprise development, it is effectively doing a massive research effort with lots and lots of nasty unknowns in the techniques they are starting to use. And they combine this with business cases and more or less strict deadlines, of course.
It seems that these days many think that, by using Agile/DevOps methods, having ‘too many change vectors’ isn’t a problem anymore. After all, ‘agile’ does not depend on certainties; it enables you to react to change. It is thus no problem that ‘everything moves’ (πάντα ῥεῖ (panta rhei), the famous statement attributed to the Greek philosopher Heraclitus), as everything can also react to unexpected changes in an agile way. Who needs certainty and stability? If we are truly agile, we can refactor as we go.
But in practice, agility is not that simple. Suppose our ESB/service team requires infrastructure for their platform. They go to the infrastructure DevOps team, which is not only in the middle of a ‘program increment’ of 3 months, but also reacts to the requirements of the ESB/service team with a somewhat vague reply along the lines of: “Well, this is the first time that your kind of requirements pop up, so we have to think about what to do. We might do it this way or that, we should probably first have a ‘spike’ to find out how to support your needs.” The ESB/service team is now stuck. What are they going to do: have a beer and pizza session without doing any real work? No. Instead, part of the work of the infrastructure team will be to start thinking right away about how to solve the problem, and to make a provisional design choice so the ESB team can proceed, even if it will take some time before implementation ends up on their backlog. So, they huddle together with the ESB/service team to understand what is needed and come up with a possible way to support it. All discussions take place in an atmosphere of uncertainty, as both teams need to move forward concurrently while depending on each other. Many meetings take place with the outcome that more things need to be researched.
And while they are doing this, they run into another question: with all this automation, how do we create infrastructure for the ESB/service platform in such a way that what we build links to the new enterprise-wide logging initiative? They should align with the logging team on this. So repeat: the infrastructure team goes to the logging team and gets the following answer: “Well, this is the first time that your kind of requirements pop up, so we have to think about what to do. We might do it this way or that, we should probably first have a ‘spike’ to find out how to support your needs.” Again, management wants results before a certain deadline, so we start another huddle. Oh, and by the way, our logging team needs infrastructure too and vice versa.
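The circular waiting in this scenario becomes visible as soon as you write down who waits for whom. Below is a minimal sketch in Python, assuming hypothetical team names and a made-up `waits_on` mapping derived from the story (the ESB/service team needs infrastructure, infrastructure needs logging, logging needs infrastructure). The `find_cycle` helper is purely illustrative and not part of any agile framework or tool.

```python
from typing import Dict, List, Optional

# Hypothetical "who waits on whom" map, based on the scenario in the text.
waits_on: Dict[str, List[str]] = {
    "esb": ["infrastructure"],
    "infrastructure": ["logging"],
    "logging": ["infrastructure"],
}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one circular chain of waiting teams, or None if there is none."""
    done = set()  # teams whose dependencies have been fully explored

    def visit(team: str, path: List[str]) -> Optional[List[str]]:
        if team in path:
            # We came back to a team already on the current chain: a cycle.
            return path[path.index(team):] + [team]
        if team in done:
            return None
        for other in graph.get(team, []):
            found = visit(other, path + [team])
            if found:
                return found
        done.add(team)
        return None

    for start in graph:
        found = visit(start, [])
        if found:
            return found
    return None


cycle = find_cycle(waits_on)
print(cycle)
```

With the mapping above, the helper reports the loop between the infrastructure and logging teams. In a real organisation the map would be far larger and would change every sprint, which is exactly why these loops tend to be discovered in meetings rather than up front.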
Some will say: “Sure, but your description is unfair. We have frameworks like SAFe (Scaled Agile Framework) to manage agility at larger scales.” We’ll get to that.
YAGNI and MVP
What is making matters far worse is the mantra of YAGNI (“You Ain’t Gonna Need It”, from Extreme Programming; see the side story YAGNI and YGNIL: Discussion between an ‘agile’ and an ‘upfront design’ person) and the idea of the MVP (“Minimum Viable Product”) that rules much of Agile work. The idea is that you produce actual valuable results (features) as quickly as you can so you actually profit from all the development work done as early as possible. We all hate those years and years of development with nothing to show for it in production. In the most idealistic implementation of agile, every two weeks you have new production releases of your products, where ‘in production’ means that you start to reap the benefits. And we prioritise based on “weighted shortest job first”, so we maximise value.
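For readers who have not come across it: in SAFe, “weighted shortest job first” (WSJF) ranks jobs by their cost of delay divided by their size. The sketch below is a minimal illustration in Python, assuming invented jobs and invented relative scores. SAFe builds the cost of delay from several components; here it is collapsed into a single number for brevity.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cost_of_delay: float  # relative score; illustrative numbers only
    job_size: float       # relative effort; illustrative numbers only

    @property
    def wsjf(self) -> float:
        # WSJF = cost of delay / job size; higher means "do this first".
        return self.cost_of_delay / self.job_size


# Hypothetical backlog, loosely inspired by the examples in the text.
backlog = [
    Job("new trading feature", cost_of_delay=13, job_size=5),
    Job("refactor foundation", cost_of_delay=8, job_size=8),
    Job("regulatory report", cost_of_delay=20, job_size=4),
]

ranked = sorted(backlog, key=lambda job: job.wsjf, reverse=True)
for job in ranked:
    print(f"{job.name}: {job.wsjf:.2f}")
```

With these made-up numbers the foundation refactor ends up at the bottom of the list. That is not a law of nature, but it does illustrate how foundational work can lose out when mostly the directly visible value is scored.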
YAGNI and YGNIL: Discussion between an ‘agile’ and an ‘upfront design’ person
Upfront: “You Agile people are silly. Why do you think you don’t have to think upfront? Suppose you want to build a bridge. You think your MVP can be a bridge for cyclists. But what if the bridge eventually must be able to hold heavy lorries? The foundation you have laid for the cyclist bridge cannot hold the lorry bridge. But if you had created a foundation for a lorry bridge, you could have first built a cyclist bridge on top of it, and then expanded it to a lorry bridge. You can’t build an Eiffel tower by first laying the foundation for an outhouse.”
Agile: “Well, you are quite right if what we build is a physical bridge. But the problem with the analogy is that in bridge building, you cannot change a foundation when a bridge is already there. But we build software and software is much more malleable. In fact, it is infinitely malleable. So, in software, I can build a weak foundation, then build a tiny cyclist bridge, then — when I need a lorry bridge — first rebuild the foundation while the cyclist bridge stays in place (in agile terms: we ‘refactor’ the foundation) and when that is in place we build the lorry bridge next to the cyclist bridge.”
Upfront: “Would it not be more effective to build the good foundation in the first place? Now you have built and rebuilt a foundation.”
Agile: “Nah. In practice, the world constantly changes and suddenly they do not need a lorry bridge at all anymore as someone else has created a tunnel. The future cannot be predicted. We wait until someone wants a lorry bridge and then the refactor of the foundation becomes part of the work for the lorry bridge. We call this ‘YAGNI’, which stands for ‘You Ain’t Gonna Need It’. It is kind of a lean mantra that says you don’t build things unless you are certain they are needed.”
Upfront: “I see. But what if you cannot rebuild the foundation without having to redo the cyclist bridge? Don’t you run the risk of having to build the cyclist bridge twice?”
Agile: “Yes, you have a point there. If you build the foundation for the cyclist bridge you must take into account you might have to refactor it later to support a lorry bridge as well. So, the first foundation you build might not be able to support a lorry bridge, but it must be built in such a way that you are ready for a future refactor to a foundation for a lorry bridge as well. And of course, if it is dead certain that you will need the lorry bridge in any case, then it is less wasteful to build the right foundation in the first place. But with a caveat: technology changes and building the foundation now might not get you the best solution for when the lorry bridge is finally required.”
Upfront: “I see another problem. If the customer comes along for a lorry bridge and you have to refactor the foundation first, the time between request and result grows and your schedule will also become less predictable. They will neither like nor understand that.”
Agile: “Right again. So in the decision to follow YAGNI you need to estimate YGNIL (You’re Gonna Need It Later) and you need to estimate how fast you need to be able to deliver features like a lorry bridge later. Build the right foundation now and you are faster later.”
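The trade-off the two end up with can be put in rough numbers. The sketch below uses a deliberately simple expected-cost model that is an assumption of this example, not something taken from the discussion itself: `p_needed` is the estimated chance that the lorry bridge will be needed (the YGNIL estimate), and all costs are invented. The model ignores the extra delivery time and lost predictability that Upfront raises, which in practice would push the balance further towards building the right foundation.

```python
def expected_cost_yagni(cheap_now: float, refactor_later: float, p_needed: float) -> float:
    """Build the cheap foundation now; pay for the refactor only if it turns out to be needed."""
    return cheap_now + p_needed * refactor_later


def expected_cost_ygnil(strong_now: float) -> float:
    """Build the strong foundation up front, whether or not it is needed later."""
    return strong_now


def prefer_upfront(cheap_now: float, refactor_later: float,
                   strong_now: float, p_needed: float) -> bool:
    """True if building the strong foundation now has the lower expected cost."""
    return expected_cost_ygnil(strong_now) < expected_cost_yagni(
        cheap_now, refactor_later, p_needed
    )


# Invented numbers: the lorry bridge is very likely vs. rather unlikely.
print(prefer_upfront(cheap_now=10, refactor_later=40, strong_now=30, p_needed=0.8))
print(prefer_upfront(cheap_now=10, refactor_later=40, strong_now=30, p_needed=0.2))
```

With a high chance of needing the lorry bridge the strong foundation wins, and with a low chance YAGNI wins. The hard part is of course the estimate itself, not the arithmetic.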
But in the situation we have described above, everyone has multiple interdependent uncertainties and progress slows down to a crawl; or we may even see something that looks like a deadlock, not because we lack agility but because we have too much of it. Everyone waiting for everyone else to think of solutions or to implement the necessary refactors kills progress. The problem with YAGNI and MVP is that you kill the IT version of ‘just in time delivery’, and because IT is so malleable, people create suboptimal solutions to be able to progress anyway. YAGNI and MVP do not provide a free lunch. What you need is a bit of YGNIL (“You’re Gonna Need It Later”) to prevent suboptimal solutions and wastes of time. Even Agile needs architecture and preparation; architecture doesn’t emerge magically, even if the authors of the Agile Manifesto thought it would.
A method like SAFe tries to manage this with higher level constructs like “Agile Release Trains” and “Value Streams”. At these levels, agile frameworks look at aligning all different substreams so that the time and content dependencies are managed within each independent ‘huddle’. Not only does this look a lot like the ‘upfront design’ based management of before (because you need to know what is needed and how, in other words some design, before you can prioritise), it is also based on a fundamental assumption that is not always met.
The idea behind these higher level constructs is that they constitute a tree-like structure of dependencies. Within your ‘Agile Release Train’ or Value Stream you can align everyone in both time and content (architecture, design decisions). And it is this idea of a tree-like structure of dependencies that is an illusion. Organisations with their Business-IT landscapes are not tree-like at all, they are a web. Many dependencies exist all over the place: between the business products and between those and the underlying platforms and infrastructure, between the service bus and all applications, between infrastructure and logging, etc. An enterprise is a huge chess board with many players concurrently making moves all over the place, all interacting.
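Whether a set of dependencies is a tree or a web can be checked mechanically. The sketch below is a minimal illustration in Python, assuming invented dependencies between hypothetical units. It relies only on the textbook property that a connected undirected graph with n nodes is a tree exactly when it has n - 1 edges.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def is_tree(edges: List[Tuple[str, str]]) -> bool:
    """True if the undirected graph formed by `edges` is connected and has no cycles.

    Assumes no duplicate edges and no self-loops.
    """
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    nodes = set(neighbours)
    # A tree on n nodes has exactly n - 1 edges.
    if len(edges) != len(nodes) - 1:
        return False
    # Walk from an arbitrary node; in a tree this reaches every node.
    seen: Set[str] = set()
    stack = [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(neighbours[node] - seen)
    return seen == nodes


# Hypothetical view from inside one value stream: a neat tree.
release_train_view = [
    ("value stream", "trading app"),
    ("value stream", "reporting"),
    ("value stream", "esb"),
]

# Hypothetical actual landscape: the same units plus cross-dependencies.
actual_landscape = release_train_view + [
    ("trading app", "esb"),
    ("reporting", "esb"),
    ("esb", "infrastructure"),
    ("infrastructure", "logging"),
    ("trading app", "logging"),
]

print(is_tree(release_train_view))
print(is_tree(actual_landscape))
```

The first structure passes the check and the second does not: a handful of cross-dependencies is enough to turn the tidy tree into a web.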
Managing those dependencies is very difficult. It becomes even more difficult as the stakeholders have been told they will get a constant stream of real business value (e.g. fire people, increase sales) because of the agile approach. Agile, they have been effectively told, is the silver bullet that slays the werewolf of dependencies. Under pressure to deliver visible value, what happens when a program is confronted with necessary refactors? Will they easily accept the need for a refactor? Of course not. What happens is that there is great pressure to produce the business value without ‘doing it right’, and the result is a constant change in direction. If the pressure is really high, a program will jump all over the place, producing even more half-finished work that hasn’t produced real business value.
And then we haven’t considered what the uncertainties from the outside world do to this: do you think a regulator will wait for you to refactor to get things right? Nope. So, even setting up a very strong orthodox architecture approach doesn’t help.
What does this mean? It means we will see a lot of large transformations using agile methods that will take a lot more time than originally expected. Just recently one of my contacts told me that he had been working for a company that had worked for years and years on getting to the point where they were able to move applications to the cloud and manage their infrastructure/platforms automatically (you know: using Puppet etc.), including moving them from Amazon to Azure and back. I was impressed. But when I asked how many applications had moved to the cloud, the answer was: “three”, and a few more were being investigated.
Overexpectation and underdelivery are not new, of course. But ‘agile fundamentalists’ believe that ‘agile’ does away with the problem of managing dependencies, because agile means you can react to change easily. But you can’t and thus agile doesn’t. Hence: Dear CIO: unless you explicitly manage (read: limit) the amount of change you try to push through your organisation (let alone what you actually do to your people when you push too much too hard), taking into account Seymour Cray’s guideline, chances are slim that you will truly succeed, as you are doing research on your (hapless) organisation, not developing it. Your progress will slow the change you do concurrently. Agile or no agile. Paraphrasing Steve McConnell: As Thomas Hobbes observed in the 17th century, life under mob rule is solitary, poor, nasty, brutish, and short. Life in a situation of ‘too much agile’ is solitary, poor, nasty, brutish, and hardly ever short enough.
Image By Donato Bramante – Public Domain, https://commons.wikimedia.org/w/index.php?curid=6509784