What should we do to handle the constantly changing domain and technical requirements? Patrick Kua talks about some interesting aspects on the architectural level.
It is not uncommon that an existing piece of non-trivial software gets re-written from scratch. Personally, I find this preference for a greenfield approach interesting, because it is not something I share. In fact, I believe that it is a fundamentally flawed approach in many cases, so why are people still doing this today?
Probably because they just like it more and also it is perceived, at least when things get started, as more glorious. But when you dig deeper the picture changes dramatically. Disclaimer: Of course there are situations when re-starting is the only option. But it typically involves aspects like required special hardware not being available any more.
When I started writing this post with making some notes, it all started with technical arguments. They are of course relevant, but the business side is much more critical. Just re-phrase the initial statement about re-writing to something like
Instead of gradually improving the existing software and learn along the way, we spend an enormous amount of money on something that delivers no additional value compared to what we have now. For this period, which we currently estimate to be 2 years (it is very likely to be longer), apart from very minor additions, the business will not get anything new, even if the market requires it; so long-term we risk the existence of the organization. And yes, we may actually loose some key members of staff along the way. Also, it is not certain that the new software ever works as expected. But should it really do, the market has changed so much, that the software is not very useful for doing business anyway and we can start all over again.
Is there anyone who still thinks that re-writing is generally a good idea?
Let us now change perspective and look at it from a software vendor’s point of view. Because the scenario above was written with an in-house application in mind. What is different, when we look at a company that develops and sells enterprise software? For this text the main properties of enterprise software are that it is used in large organizations to support critical business processes. How keen will customers be to bet their own future on something new, i.e. not tested? But even if they waited long enough for things to stabilize, there would be the migration effort. And if that effort comes towards them anyway, they may just as well look at the market for alternatives. So you would actively encourage your customer base to turn to the competition. Brilliant idea, right?
What becomes clear looking at things like that, is what the core value proposition of enterprise software is: investment protection. That is why we will, for the foreseeable future, continue to have mainframes with decades-old software running on them. Yes, these machines are expensive. But the alternatives are more expensive and in addition pose enormous risk.
In the context of commercial software vendors one argument for a re-write is that of additional revenue. It is often seen as easier to get a given amount of money for a new product than an improved version of something already out there. But that is the one-off view. What you typically want as a software vendor is a happy customer that pays maintenance every year and, whenever they need something new, first turns to you rather than the competition for options. Also, such a happy customer is by far the best marketing you can get. It may not look as sexy as getting new customers all the time, but it certainly drives the financial bottom line.
Switching over to the technical side, there are a few arguments that are typically made in favor of a restart. My favorite is the better architecture of the new solution, which will be the basis for future flexibility, growth, etc. I believe that most people are sincere here and think they can come up with something better. But the truth is that unless someone has done something similar successfully in the past, there is a big chance that the effort is hugely underestimated. Yes, technology has advanced and we have better languages and frameworks. But on the other hand the requirements have also grown dramatically. Think about high availability, scalability, performance and all the others. Even more important, though, is the business side. With something brand new people will have greatly increased expectations. So giving them something like-for-like will probably not be seen as success.
The not-invented-here syndrome is also relevant in this context and particularly with more junior teams. I have seen a case when an established framework used in more than 9,000 business-critical (i.e. direct impact on revenue) installations was dismissed in favor of something the team wanted to develop themselves. And I can tell you that the latter was a really bad implementation. Whether it was a misguided sense of pride or a fundamental lack of knowledge I cannot say. But while certainly being the most extreme incarnation I have seen so far, it was certainly not the only one.
So far my thoughts on the subject. Please let me know what you think about this topic.
Martin talks about various algorithms for lock-free data processing. Worth watching, as always 🙂
Here is another interesting presentation from Martin given at the Code Mesh Conference 2015 in London.
Quite recently I heard a statement similar to
“The application works, so there is no need to consider changing the architecture.”
I was a bit surprised and must admit that in this situation had no proper response for someone who obviously had a view so different from everything I believe in. But when you think about it, there is obviously a number of reasons why this statement was a bit premature. Let’s have a look at this in more detail.
There are several assumptions and implicit connotations, which in our case did not hold true. The very first is that the application actually works, and at the time that was not entirely clear. We had just gone through a rather bumpy go-live and there had not yet been a single work item processed by the system from start to finish, let alone all the edge cases covered. (We had done a sanity test with a limited set of data, but that had been executed by folks long on the project and not real end users.) So with all the issues that had surfaced during the project, nobody really knew how well the application would work in the real world.
The second assumption is that the chosen architecture is a good fit for the requirements. From a communication theory point of view this actually means “a good fit for what I understand the requirements to be”. So you could turn the statement in question around and say “You have not learned anything new about the requirements since you started the implementation?”. Because that is what it really means: I never look back and challenge my own thoughts or decisions. Rather dumb, isn’t it?
Interestingly, the statement was made in the context of a discussion about additional requirements. So there is a new situation and of course I should re-evaluate my options. It might indeed be tempting to just continue “the old way” until you really hit a wall. But if that happens you have consciously increased sunk costs. And even if you can “avoid the wall”, there is still a chance that a fresh look at things could have fostered a better result. So apart from the saved effort (and that is only the analysis, not a code change yet) you can only loose.
The next reason are difficulties with the original approach and of that there had been plenty in our case. Of course people are happy that things finally sort-of work. But the more difficulties there have been along the way, the bigger the risk that the current implementation is either fragile or still has some hidden issues.
And last but not least there are new tools that have become available in the meantime. Whether they have an architectural impact obviously depends on the specific circumstances. And it is a fine line, because there is always temptation to go for the new, cool thing. But does it provide enough added value to accept the risks that come with such a switch? Moving from a relational database to one that is graph-based, is one example that lends itself quite well to this discussion. When your use-case is about “objects” and their relationships with one another (social networks are the standard example here), the change away from a relational database is probably a serious option. If you deal with financial transactions, things look a bit different.
So in a nutshell here are the situations when you should explicitly re-evaluate your application’s architecture:
- Improved understanding of the original requirements (e.g. after the first release has gone live)
- New requirements
- Difficulties faced with the initial approach
- New alternatives available
So even if you are not such a big fan of re-factoring in the context of architecture, I could hopefully show you some reasons why it is usually the way to go.
A good talk on EDA (event-driven architecture).
Here is a rather interesting video from Martin Kleppmann where he talks about dealing with concurrent changes to data. While the title may sound theoretical to some, it is a topic that probably every developer has come across. And here is also the link to the paper with the algorithm presented. If you are interested in an implementation, check this Github project.
In the closing statement of my post Architects Should Code I said that for me code and architecture are just two ways to look at the same thing. It seems that I am not alone in that perception 🙂 and can very much recommend the video linked in below. I found its start a bit boring, but am very happy, in hindsight, to have not switched away.
There is a widespread notion, that developers at some point in their career evolve into something “better”, called architect. This term has the connotation of mastery on the subject, which I think is ok. What is not ok for me, is that in many cases there is the expectation that an architect’s time is too valuable for something as mundane as coding. Instead architects should develop the overall architecture of the solution.
Ok, so what is the architecture? Most people believe that some documentation (often a mixture of prose and high-level pictures) is the architecture. I would argue that this not the architecture but a more or less accurate abstraction from it. When enough work has gone into things, it may may even become the to-be architecture (but certainly not the as-is one). However, outside of regulated industries I have never seen a document that was thorough enough. A friend of mine used to work on embedded systems for car breaks where lives are at stake; and he told me some interesting stories about the testing and documentation efforts these guys take.
In my view the architecture is, by definition, in the code itself. Everything else is, I repeat myself, something that has some relationship with it. So how can an architect not be coding? You could argue that instead of doing the work him- or herself, mentoring and guiding less experienced developers is a good use of the architect’s time. For me that works fine up to a certain level of complexity. But if we talk about an STP (straight-through processing) solution, is that really something you want to give to a mid-level developer? (It would probably be an ideal piece of work for pair-programming, though.)
I do certainly not want to demean people who call themselves architects. It is a not-so-common capability to look at an IT landscape and see the big picture. Instead many people will be lost in the details. So we definitely need this this perspective! But it is a different kind of architecture, the so-called Enterprise Architecture (EA). I know folks who started as (really good) developers and are now very successful at Enterprise Architecture.
So, in closing, my core point is that the architecture of a solution and its code are basically two sides of the same coin. Everybody involved on the technical level should understand both aspects. And if the level of detail varies, depending on the official role, that is probably ok.
Here is yet another interesting video. The title is chosen badly, though, as the content is not really about the future but the history of programming. But on the other hand you need to understand the past, if you want to avoid repeating its failures.