Here is a rather interesting video from Martin Kleppmann in which he talks about dealing with concurrent changes to data. While the title may sound theoretical to some, it is a topic that probably every developer has come across. Here is also the link to the paper presenting the algorithm. If you are interested in an implementation, check out this GitHub project.
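The paper and project linked above cover the real algorithms. Just to give a flavour of the general idea, here is a toy grow-only counter (G-Counter), a classic textbook CRDT where concurrent increments on different replicas merge without conflicts; this is an illustration of the field, not the algorithm from the paper:

```java
import java.util.HashMap;
import java.util.Map;

// Toy grow-only counter (G-Counter): each replica increments only its
// own slot, and concurrent states merge by taking the per-replica maximum.
public class GCounter {
    private final String replicaId;
    private final Map<String, Long> counts = new HashMap<>();

    public GCounter(String replicaId) {
        this.replicaId = replicaId;
    }

    public void increment() {
        counts.merge(replicaId, 1L, Long::sum);
    }

    public long value() {
        return counts.values().stream().mapToLong(Long::longValue).sum();
    }

    // Merging is commutative, associative and idempotent, so all replicas
    // converge regardless of the order in which updates arrive.
    public void merge(GCounter other) {
        other.counts.forEach((id, n) -> counts.merge(id, n, Long::max));
    }

    public static void main(String[] args) {
        GCounter a = new GCounter("a");
        GCounter b = new GCounter("b");
        a.increment();
        a.increment();
        b.increment();
        a.merge(b);
        b.merge(a);
        System.out.println(a.value() + " " + b.value()); // both replicas see all 3 increments
    }
}
```

The interesting property is that no coordination is needed while the replicas diverge; the merge function alone guarantees convergence.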
In the closing statement of my post Architects Should Code I said that for me code and architecture are just two ways to look at the same thing. It seems that I am not alone in that perception 🙂 and I can very much recommend the video linked below. I found its start a bit boring, but am very happy, in hindsight, that I did not switch away.
There is a widespread notion that developers at some point in their career evolve into something “better”, called an architect. This term has the connotation of mastery of the subject, which I think is ok. What is not ok for me is that in many cases there is the expectation that an architect’s time is too valuable for something as mundane as coding. Instead, architects are supposed to develop the overall architecture of the solution.
Ok, so what is the architecture? Most people believe that some documentation (often a mixture of prose and high-level pictures) is the architecture. I would argue that this is not the architecture but a more or less accurate abstraction of it. When enough work has gone into it, it may even become the to-be architecture (but certainly not the as-is one). However, outside of regulated industries I have never seen a document that was thorough enough. A friend of mine used to work on embedded systems for car brakes, where lives are at stake; and he told me some interesting stories about the testing and documentation efforts these teams undertake.
In my view the architecture is, by definition, in the code itself. Everything else is, I repeat myself, something that has some relationship with it. So how can an architect not be coding? You could argue that instead of doing the work him- or herself, mentoring and guiding less experienced developers is a good use of the architect’s time. For me that works fine up to a certain level of complexity. But if we talk about an STP (straight-through processing) solution, is that really something you want to give to a mid-level developer? (It would probably be an ideal piece of work for pair-programming, though.)
I certainly do not want to demean people who call themselves architects. It is a not-so-common capability to look at an IT landscape and see the big picture; many people would instead be lost in the details. So we definitely need this perspective! But it is a different kind of architecture, the so-called Enterprise Architecture (EA). I know folks who started as (really good) developers and are now very successful at Enterprise Architecture.
So, in closing, my core point is that the architecture of a solution and its code are basically two sides of the same coin. Everybody involved on the technical level should understand both aspects. And if the level of detail varies, depending on the official role, that is probably ok.
Here is yet another interesting video. The title is chosen badly, though, as the content is not really about the future but the history of programming. But then again, you need to understand the past if you want to avoid repeating its failures.
I recently started a new hobby project (it is still in stealth mode, so no details yet) and went through the exercise to really carefully think about what technology to use for it. On a very high level the requirements are fairly standard: Web UI, persistence layer, API focus, cross-platform, cloud-ready, continuous delivery, test automation, logging, user and role management, and all the other things.
Initially I was wondering about the programming language, but quickly settled for Java. I have reasonable experience with other languages, but Java is definitely where most of my knowledge lies these days. So much for the easy part, because the next question proved to be “slightly” more difficult to answer.
Looking at my requirements it was obvious that developing everything from the ground up would be nonsense. The world does not need yet another persistence framework, and I would not see any tangible result for years to come, thus losing interest too soon. So I started looking around and first went to Spring. There is a plethora of tutorials out there and they show impressive results really quickly. Java EE was not really on my radar then, probably because I still hear some former colleagues complain about J2EE 1.4 in the back of my mind. More importantly, though, I favored agility (Spring) over standards (Java EE). My perception of too many Java standards is that they never outgrow infancy, simply because they lack adoption in the real world. Spring, on the other hand, was created to solve real-world problems in the first place.
But then, when answering a colleague’s question about something totally different, I made the following statement:
I tend to avoid convenience layers, unless I am 100% certain that they can cope with all future requirements.
All too often I have seen first quick results being paid for later, when the framework proved not to be flexible enough (I call this the 4GL trap). This gave me pause, and I more or less went back to the drawing board: What are the driving questions for technology selection?
- Requirements: At the beginning of any non-trivial software project the requirements are never understood in detail. So unless your project falls into a specific category for which there is a proven standard set of technologies, you must keep your options open.
- Future proof: This is a bit like crystal ball gazing, but you can limit the risks. The chances that a tier-3 Apache project dies are far greater than those of an established (!) Java standard disappearing. This also means that any somewhat new and fancy piece must undergo extreme scrutiny before you select it; and you had better have a migration strategy, just in case.
- Body of knowledge: Sooner or later you will need help, because the documentation (you had checked what is available, right?) does not cover it. Having a wealth of information available, typically by means of your favorite search engine, will make all the difference. Of course proper commercial support from a vendor is also critical for non-hobby projects.
- Environment: Related to the last aspect is what the “landscape” surrounding your project looks like. This entails technology, but even more importantly the organization that has evolved around that technology. The synergies of staying with what is established will often outweigh the benefits that something new might have when looked at in isolation.
On a strategic level these are the critical questions in my opinion. Yes, there are quite a few others, but they are more concerned with specifics.
Every once in a while someone rolls their eyes when I, again, insist on a well-chosen name for a piece of software or an architectural component. The same also goes for the text of log messages, by the way; but let’s stick with the software example for now.
Well, my experience with many customers has been the following, which is why I think names are important: As soon as the name “has left your mouth”, the customer will immediately and subconsciously form an idea in his mind of what is behind it. This only takes a second or two, so it is finished before I even start to explain what the piece of software does.
Assuming that my name was chosen poorly, and hence his idea about the software’s purpose is wrong, he will then desperately try to match my explanation with his mental picture. Obviously this will not be successful, and after some time (hopefully just a few minutes) he will interrupt me, say that he doesn’t understand, and ask whether the software shouldn’t actually be doing this and that.
It makes the conversation longer than necessary and, more importantly, creates some friction; the latter is hopefully not too big, but especially at the beginning of a project, when there is no good personal relationship yet, it’s something you want to avoid. Also, think about all the people who just read the name in a document or presentation and don’t have a chance to talk with you. They will run around and spread the (wrong) word. I have been on several projects where bad names created some really big problems for the aforementioned reasons.
I can honestly say that when I wrote my post about ALM and middleware, I hadn’t heard about the Open Services for Lifecycle Collaboration initiative. But it is exactly the kind of thing I had in mind. These folks are working on the definition of a minimum (but expandable) set of features and functions that allows easy integration between the various tools that can usually be found in an organization. To my knowledge no products exist yet, but I really like the idea and approach.
For a while now I have been contemplating various aspects of logging and related areas. Some of them have found their way into this post. I look purely at application logging, leaving out the underlying layers. In particular, those comprise the operating system and other software that, from the application’s perspective, can be considered infrastructure (e.g. databases, middleware).
A number of different groups are affected by the logging aspects of an application. They all have specific requirements, some of which will be looked at now.
- Developer: Yes, also the developer has some demands towards logging. For me two things are particularly relevant: During the creation of code I don’t want to be distracted from writing the actual logic. And later I don’t want to wade through lots of boiler-plate code.
- Application administrator: This person knows the internals of the application quite well and helps users when they run into problems with the application; often he or she also serves as a liaison with the system’s administrator. They must be able to quickly find out if a quirk comes from a real problem within the application, is a result of some external problem, or perhaps stems from “wrong” usage.
- Operations: These guys have to ensure that the entire IT landscape is running smoothly. All too often stuff is thrown upon them that has been designed with very little thought on daily usage. They must be given the possibility to quickly see whether everything is ok or things need to be escalated with application support. In particular this requires integration into system management tools. Those are usually working with JMX, SNMP and log file monitoring.
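To illustrate the JMX point from the operations bullet above, here is a minimal sketch using the management data the JDK exposes out of the box; the 90% threshold is an arbitrary example, not a recommendation:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapWatch {
    // A monitor might escalate once heap usage crosses a threshold
    // fraction of the maximum; the policy itself is up to operations.
    static boolean shouldEscalate(long used, long max, double threshold) {
        return max > 0 && used > threshold * max;
    }

    public static void main(String[] args) {
        // The JVM exposes management data via JMX without any extra code;
        // system management tools can poll the same attributes remotely.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long used = memory.getHeapMemoryUsage().getUsed();
        long max = memory.getHeapMemoryUsage().getMax();
        System.out.println("heap used: " + used + " of max " + max);
        System.out.println("escalate: " + shouldEscalate(used, max, 0.9));
    }
}
```

A real application would additionally register its own MBeans for application-specific attributes, so that operations can watch them with the same tooling.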
There is certainly a lot more to say here, but this should give a first overview and make clear that a variety of requirements needs to be fulfilled.
Logging vs. Monitoring vs. Management
One way to look at the heading above is that it describes a hierarchy of aspects that build upon each other: A set of log entries allows me to monitor my application; and monitoring is the basis for management. So logging is indeed a very important step towards a smooth, efficient and compliant operation of my organization. The more you move towards the higher-level facets, the more important it is to abstract from the single, raw “event” and see the bigger picture. What also becomes important is the correlation of events. Perhaps my application becomes less stable whenever database response times exceed a certain threshold. And Cloud Computing will certainly add something here, too.
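The database example can be sketched as a toy correlation over two event streams; the record shapes, thresholds and windows here are entirely made up for illustration:

```java
import java.util.List;

// Toy event correlation across two log streams: flag application errors
// that occurred shortly after a slow database response.
public class LogCorrelator {
    record DbSample(long timestampMs, long responseMs) {}
    record AppError(long timestampMs, String message) {}

    // An app error "correlates" with a DB slowdown if it happened within
    // windowMs after a sample whose response time exceeded thresholdMs.
    static long correlated(List<DbSample> db, List<AppError> errors,
                           long thresholdMs, long windowMs) {
        return errors.stream()
            .filter(e -> db.stream().anyMatch(s ->
                s.responseMs() > thresholdMs
                && e.timestampMs() >= s.timestampMs()
                && e.timestampMs() - s.timestampMs() <= windowMs))
            .count();
    }

    public static void main(String[] args) {
        List<DbSample> db = List.of(new DbSample(0, 20), new DbSample(1000, 450));
        List<AppError> errors = List.of(
            new AppError(1200, "order service timeout"),
            new AppError(5000, "unrelated glitch"));
        // Only the first error falls within 2s of a sample slower than 300ms.
        System.out.println(correlated(db, errors, 300, 2000)); // prints 1
    }
}
```

Real monitoring tools do this at scale and with far smarter heuristics, but the principle is the same: the insight lives in the combination of events, not in any single log line.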
Most developers think of logging as an unloved necessity. But why is that? In my post about Asynchronous Communication I made the point that poor tooling does not make a pattern bad, it is just poor tooling. Likewise, I think many people simply don’t have the proper tools for logging. Last year, while developing a small system management component, I conducted a small experiment on myself. Instead of hard-coding log messages I went the extra mile and wrote my own message catalog. The result is that my (implicit) workflow has changed. Whenever an additional log statement is needed, I now only have to do the following:
- Check the message catalog for a message I can re-use; if yes, I’m done already (and perhaps just have to wire in some parameters).
- If a new message is needed, I have to decide on a new key. This should be done with some confidence that I won’t need to change it later (although changing a key would already be much easier than changing plain text).
- A message text and log level also need to be chosen for the key. Both can be done in a pretty quick-and-dirty fashion, since changing them later in the catalog file is easy.
At first this may sound more complex than just putting in a plain text statement. But when putting everything directly into the code, all these things must be chosen with careful consideration, because changing them later is much more effort. This distracts me a great deal, since coming up with a good log message and the appropriate log level is often far from trivial. While my initial rationale for the message catalog had mainly been automated log file watching, this ease of development proved to be the real “killer” for me.
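To make the workflow above concrete, here is a minimal sketch of the message-catalog idea. The keys, message texts and the CatalogLogger API are invented for illustration; my real catalog lives in a separate file, not in a static initializer:

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Log statements in the code reference a stable key; the human-readable
// text and the log level live in the catalog and can be changed later
// without touching the code.
public class CatalogLogger {
    private record Entry(Level level, String template) {}

    private static final Map<String, Entry> CATALOG = new HashMap<>();
    static {
        // In a real setup this would be loaded from a properties/XML file.
        CATALOG.put("DB-001", new Entry(Level.SEVERE, "Connection to database {0} lost"));
        CATALOG.put("CFG-002", new Entry(Level.WARNING, "Config key {0} missing, using default {1}"));
    }

    private final Logger logger;

    public CatalogLogger(Logger logger) { this.logger = logger; }

    public String log(String key, Object... params) {
        Entry e = CATALOG.get(key);
        if (e == null) {
            throw new IllegalArgumentException("Unknown message key: " + key);
        }
        // Prefixing the stable key makes automated log file watching easy:
        // monitors match on the key, not on wording that may change.
        String message = key + ": " + MessageFormat.format(e.template(), params);
        logger.log(e.level(), message);
        return message;
    }

    public static void main(String[] args) {
        CatalogLogger log = new CatalogLogger(Logger.getLogger("app"));
        System.out.println(log.log("DB-001", "orders"));
    }
}
```

In the code itself a log statement then shrinks to `log.log("DB-001", dbName)`, which is exactly the low-distraction workflow described above.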
Recently, a few colleagues and I had a very interesting discussion about what should go into a Version Control System (VCS) and what should not. In particular we were arguing as to whether things like documents or project plans should go in. Here are a few things that I came up with in that context.
I guess the usage of VCS (and other repositories) somehow comes down to a few general desires (aka use-cases):
- Single source of truth
- History/time machine
- Automation of builds etc.
In today’s world with its many different repositories you can either go for a mix (best-of-breed) or the lowest common denominator, which is usually the VCS. So what’s stopping people from doing it properly (i.e. best-of-breed)?
- Lack of conceptual understanding:
- Most people involved in these kinds of discussions usually come from a (Java) development background. So there is a “natural” tendency to think VCS. What this leaves out is that other repositories, which are often DB-based, offer additional capabilities. In particular there are all sorts of cross checks and other constraints being enforced. Also, given their underlying architecture, they are usually easier to integrate with in terms of process-driven approaches.
- Non-technical folks are mostly used to versioning-by-filename and require education to see the need for more.
- Lack of repository integration: Interdependent artefacts spread over multiple repositories require interaction, especially synchronisation. Unless some kind of standard emerges, it is a tedious task to do custom development for these kinds of interfaces. Interestingly, this goes back to my post about ALM needing middleware.
- Different repositories come with clients that work fundamentally differently, both in terms of UI and underlying workflow (the latter is less obvious but far-reaching in its consequences). Trying to understand all this is really hard. BTW: this already starts with different VCSes! As an example, just compare SVN, TFS and Git (complexity increasing in that order, too) and have “fun”.
- Lack of process: Multiple repositories asking for interaction between them also means that there is, at least implicitly, a process behind all this. Admittedly, there is also a process behind a VCS-only approach, but it’s less obvious and its evolution is often ad hoc in nature. With multiple repositories a more coordinated approach to process development is required, also because this often means crossing organisational boundaries.
Overall, this means that there is considerable work to be done in this area. I will continue to post my ideas here and look forward to your comments!
The idea to write this post was triggered when I read an article called “Choosing Agile-true application lifecycle management (ALM)” and in particular by its saying that many ALM tools come as a package that covers multiple processes in the lifecycle of an application. Although strictly speaking this is not a “we-cover-everything” approach, it still strongly reminds me of the approach that ERP software took initially. Its promise, put simply, was that an entire organization with all its activities (to avoid the term process here) could be represented without the need to develop custom software. This was a huge step forward, and some companies made and still make a lot of money with it.
Of course, the reality is a bit more complex and so organizations that embrace ERP software have to choose between two options: either change the software to fit the organization, or change the organization to fit the software. This is not meant to be ERP bashing, it simply underlines the point that the one-size-fits-all approach has limits. And a direct consequence of this is that although the implementation effort can be reduced considerably, there is still a lot of work to be done. This is usually referred to as customizing. The more effort needs to go there, the more the ERP software is changing into something individual. So the distinction between a COTS (commercial off-the-shelf) software, the ERP, and something developed individually gets blurred. This can reduce the advantages of ERP, and especially the cost advantage, to an extent.
And another aspect is crucial here, too. An ERP system, pretty much by definition, is a commodity in the sense that the activity it supports is nothing that gives the organization a competitive advantage. These days some of the key influencing factors for the latter are time-to-market and, related to that, agility and flexibility. ERP systems usually have multiple, tightly integrated components and a complex data model to support all this. So every change needs careful analysis to make sure it doesn’t break something else. No agility, no flexibility, no short time-to-market. In addition, all organizations I have come across so far need things that their ERP does not provide. So there is always a strong requirement to integrate the ERP world with the rest, be it other systems (incl. mainframe) or trading partners. Middleware vendors have addressed this need for many years.
And now I am finally coming back to my initial point. In my view ALM tools do usually cover one or several aspects of the whole thing but never everything. And if they do, nobody these days starts on a green field. So also here we need to embrace reality and accept that something like ALM middleware is required.