Offshoring and Knowledge Depletion

I have recently watched a very interesting presentation by Gunter Dueck (former CTO of IBM in Germany), where he talks about the future of work. When looking at how many consulting organizations structure their business, you can see a trend that seemingly high-level parts, usually the interaction with the customer and where business domain knowledge is required, stay in your own country. And those aspects that can, allegedly, be performed by less qualified people are moved to countries with lower salaries.

This may seem attractive from a cost-management point of view for a while. And it certainly works right now – to the extent possible. Because the organizations still have people locally who had been “allowed” to start with relatively simple coding tasks and grow into the more complex areas of the business. But what if those folks retire or leave? How do you get people to the level of qualification that allows them to understand the customer’s business on the one hand, and at the same time be able to have an intelligent conversation with the offshore folks who do the coding? Because for the latter you must have done coding yourself extensively.

Trying to Help

Quite recently I came across a number of texts (e.g. “Is Giving the Secret to Getting Ahead?”), which suggest that trying to be helpful to others is not only altruistic, but also helps the giving person. This is certainly true for me, so let me tell you my story.

My employer is running a number of internal mailing lists that cover all our products and are used by folks to discuss all sorts of (mainly technical) stuff. I have been subscribed to many of those lists for close to ten years now and they helped me to learn an awful lot about our products and their real-world application.

Most of this information cannot be found in official documentation or training, simply because it is relevant only in a very specific context. What I realized, though, is that it is exactly this context-specificity that helped me understand the products better. Because if you discuss, along with your own use-case, the border conditions, you develop a “feeling” for how the software works internally. And this, in turn, allows you to analyze totally new use-cases.

So on average I spend about 30-60 minutes per day on those mailing lists (you typically get 150-200 mails per day in total). Some of them I just scan through rather quickly. But others I follow much more closely. And of course after some time I had identified various folks that stood out in terms of amount and quality of their contributions. Many of them I have never met personally in all those years because they live on another continent. But still we have a relationship.

After some time, when I had learned enough from following the lists as well as my actual work with some of the tools, I started answering questions myself. Soon enough people knew my areas of expertise and also added my personal email address next to the mailing lists when sending their questions, to make sure I would take a look.

So apart from my ego, how did helping others on the mailing lists help me? The obvious thing is that I am known to be an expert on various topics (e.g. ALM, performance, HA) for our products. But more importantly, when I have a question, people are willing to spend time and return the favor. Because, on the other hand, we all know the lone wolf character who only asks, but never helps others. And guess how many replies they usually get …

The same approach also works on public mailing lists, by the way. In the mid-1990s I had basically the same experience in the Novell NetWare group of the German FidoNet (basically a proprietary version of UseNet). I was a university student then and also had my own small company. Following the group was critical to building up my NetWare expertise, which I then used to charge a nice (at the time) hourly rate.

In conclusion, I cannot overstate the role that helping others played in my professional success. Plus, it feels really good :-).

Configuration Management – Part 6: The Secrets

Every non-trivial application needs to deal with configuration data that require special protection. In most cases they are passwords or something similar. Putting those items into configuration files in clear text is a pretty bad idea. Especially so because these configuration files are almost always stored in a VCS (version control system) that many people have access to.

Other systems automatically replace clear-text passwords found in files with an encrypted version. (You may have seen cryptic values starting with something like {AES} in the past.) But apart from the conceptual issue that parts of the files are changed outside the developer’s control, this is also not exactly an easy thing to implement. How do you tell the system which values to encrypt? What about those time periods during which passwords exist in clear text on disk, especially on production systems?

My approach was to leverage the built-in password manager facility of webMethods Integration Server instead. This is an encrypted data store that can be secured on multiple levels, up to HSMs (hardware security modules). You can look at it as an associative array (in Java usually referred to as a map) where a handle is used to retrieve the actual secret value. With a special syntax (e.g. secretValue=[[encrypted:handleToSecretValue]]) you declare the encrypted value. Once you have done that, this “pointer” will of course return no value, because you still need to actually define it in the password manager. This can be done via web UI, a service, or by importing a flat file. The flat file import, by the way, works really well with general purpose configuration management systems like Chef, Puppet, Ansible etc.
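To make the handle idea a bit more tangible, here is a minimal sketch in Python. It is not how Integration Server or WxConfig implement this. A plain dictionary stands in for the encrypted store, and the names password_store, resolve_secret, otherSecret and hostName are made up for illustration. Only the [[encrypted:...]] syntax is taken from the text above.

```python
import re

# Hypothetical stand-in for the encrypted password store: handle -> secret.
# A real store is encrypted; a plain dict is used here only for illustration.
password_store = {"handleToSecretValue": "s3cr3t"}

# Matches the handle syntax from the text, e.g. [[encrypted:handleToSecretValue]]
HANDLE_PATTERN = re.compile(r"^\[\[encrypted:([^\]]+)\]\]$")


def resolve_secret(raw_value, store):
    """Return the secret behind a handle, None for a handle that has not been
    defined yet, or the raw value unchanged if it is not a handle at all."""
    match = HANDLE_PATTERN.match(raw_value)
    if match is None:
        return raw_value
    return store.get(match.group(1))


config = {
    "secretValue": "[[encrypted:handleToSecretValue]]",
    "otherSecret": "[[encrypted:notYetDefined]]",  # declared, but not defined in the store
    "hostName": "db.example.com",                  # ordinary value, passed through
}
resolved = {key: resolve_secret(value, password_store) for key, value in config.items()}
```

The configuration only ever contains the handle. Swapping the contents of the store changes the secret without touching the configuration file.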

A nice side-effect of storing the actual value outside the regular configuration file is that within your configuration files you do not need to bother with the various environments (add that aspect to the complexity when looking at in-file encryption from the second paragraph). Because the part that is environment-specific is the actual value; the handle can, and in fact should, be the same across all environments. And since you define the specific value directly within each system, you are already done.

Too Qualified to Code

There is an “interesting” perception of software development in that quite a few people think it is an activity suitable for graduates but not highly qualified professionals. I am in violent disagreement with this. But since it is a widespread belief, just dismissing it as dumb does not help very much. So how come so many people, who have a somewhat limited knowledge about the subject, make such a judgement and act according to it?

From a psychological point of view there seems to be a mechanism in place that makes those people believe that they actually understand the subject enough to make an informed decision. Or they do not see it as important enough to invest more time in the whole process and just go for the easy way. Or something in the middle? I will look at some thoughts I have had on the topic over the last couple of years and hope they make sense to you.

If you look at how most companies handle careers, you will soon realize that many talk about career paths for managers and professionals. And they are usually eager to say that both are valued equally in their organization. Well, I have yet to see a place where this statement, typically coming from HR, survives the reality check. If the entire organization is run such that managers from day one experience that the “techies” (sometimes lovingly referred to as code monkeys) are just there to serve management without questioning its wisdom, guess what happens.

Most professionals will simply do as they are told and not start a discussion about why something is possibly not as easy or brilliant as it looks from the business side. Managers, and that is absolutely the techies’ fault, will then start to believe that they know enough about technology to decide on details. And the next time, when they do so and the techie just nods and goes away to deliver, they expect a good result. If the result is not so good though, because many details were not taken into account, it is indeed the techie’s fault for not having brought them up. And, voila, there you have the vicious circle.

I must say that I have been fortunate enough so far and not run into that situation. But that not only requires experience and the “standing” in the organization, it also depends on your financial independence. I live in Germany and we have laws against my employer just firing me, because I am not obedient enough but quite a nuisance instead. So in that respect I am just lucky. But your standing with management is something you can and should control. If you are too quiet and never give them a chance to learn how thoughtful and interested in the company’s progress you are, how should they know?

What has worked nicely for me is not trying to be someone else but finding opportunities where I was on my home turf and could still talk about something they were interested in. And in a nutshell the message I got across was: I understand and want to support what you need to achieve; and for the technical details you can rely on my experience and let me decide on the nitty-gritty stuff, which you are not interested in anyway.

And with this you are not talking about coding any more but supporting the business. Yes, actually performing this support means coding. But once people see that they are much better off with you doing this, rather than someone less experienced, they will happily accept it.

So how can you prove that your experience is relevant to the business? By keeping your promises and delivering on time, quality and budget. Because you have gone through the learning curve and know all the potential pitfalls that can happen for a given task. By taking care of them upfront and not letting them surprise anyone, you keep things under control. Just a few weeks ago one of my projects went live and my counterpart from the business side was thrilled that there was only one bug discovered, and fixed immediately, so far.

Fixing things quickly and effectively is, by the way, one of the most powerful tools in your arsenal. Nobody expects bug-free software, especially if it is custom development. But what people want, and rightly so, is that you can fix things fast. So invest time and have a system in place that allows you to ship fixes in no time and without interrupting normal operations. If you need to tell the business that a two-hour downtime is required for you to bring a fix into production, you are really not worth your salary.

That’s it for now. I know that I barely scratched the surface of a very diverse topic. But everything needs to start somewhere …

Configuration Management – Part 5: The External World

A standard requirement in configuration management is using values that already exist somewhere else. The most obvious places are environment variables (defined by the operating system) and Java system properties. Others are existing (!) databases, e.g. from an ERP system, or files.

The reason for referencing values directly from their original sources instead of duplicating them in a copy-paste fashion is the Don’t-Repeat-Yourself (DRY) principle. Although the latter is typically discussed in the context of code, it applies to configuration at least as well. We all know those “great” applications, which require manual updates of the same value in different places. And if you miss only one, all hell breaks loose.

For re-use of values within a file, the standard approach is variable interpolation. A well-known syntax for that comes from the build tool Apache Ant, where ${variableName} can be placed anywhere into a value assignment. Apache Commons Configuration, and therefore also WxConfig, support this syntax. In WxConfig you can even reference values from other files that belong to the same Integration Server package.
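As a rough illustration of what variable interpolation within one file does, here is a small Python sketch. It is not the Apache Commons Configuration or WxConfig implementation. The function interpolate and the keys baseDir, tmpDir and workerFile are purely illustrative, and the behavior for unknown or circular references is an assumption made for this example.

```python
import re

# Matches ${variableName} anywhere inside a value.
VARIABLE = re.compile(r"\$\{([^}]+)\}")


def interpolate(value, values, seen=()):
    """Replace each ${name} in value with the (recursively resolved) value of name."""

    def replace(match):
        name = match.group(1)
        if name in seen:
            raise ValueError(f"circular reference: {name}")
        if name not in values:
            return match.group(0)  # assumption: unknown references are left untouched
        return interpolate(values[name], values, seen + (name,))

    return VARIABLE.sub(replace, value)


settings = {
    "baseDir": "/opt/app",
    "tmpDir": "${baseDir}/tmp",
    "workerFile": "${tmpDir}/appFile.dat",
}
worker_file = interpolate(settings["workerFile"], settings)
```

The base directory is written down exactly once. Every value that depends on it picks up a change automatically, which is the DRY principle applied to configuration.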

But what to do for other sources? Well, for the aforementioned environment variables and Java system properties, the respective interpolators from Apache Commons Configuration can also be used in WxConfig. But the latter also defines several interpolators of its own.

  • Cross package: Assuming you have an Integration Server package that holds some general values (often referred to as “global”), those can be referenced. (There are multiple ways to deal with truly global values and the specifics really determine which one is used best.)
  • Current date/time: Gets the current date and/or time, which is admittedly a bit of an edge-case. One could argue that this is not really a configuration value, and that is correct. However, there are scenarios when the ability to quite easily have such a value comes in very handy. Think about files that get created during processing of data. Instead of manually concatenating the path, the base filename, the date/time stamp and the file suffix in the code, you could just have something like this in your configuration file: workerFile=${tmpDir}/appFile_${date:yyyyMMdd-HHmmssSSS}.dat Doesn’t that make things a lot cleaner to read?
  • File content: Similar to date/time this was added primarily for convenience and clarity reasons. The typical use-case is a template, where a file (possibly maintained by another application) is just read directly into a configuration value.
  • Code invocation: When all else fails, you can have an arbitrary piece of logic be executed and the result be placed into the configuration value.
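The list above boils down to one pattern: a prefix selects a source, and the rest of the expression is handed to that source. The Python sketch below shows that pattern for two of the sources. It is an illustration under assumptions, not WxConfig code: the names make_interpolators and resolve are invented, and the date pattern uses Python's strftime notation rather than the Java-style pattern shown in the bullet above.

```python
import os
import re
from datetime import datetime

# Matches ${prefix:argument}, e.g. ${env:HOME} or ${date:%Y%m%d}.
PREFIXED = re.compile(r"\$\{(\w+):([^}]*)\}")


def make_interpolators(now=datetime.now):
    """Map a prefix to a function that turns the text after the colon into a value."""
    return {
        "env": lambda name: os.environ.get(name, ""),
        "date": lambda pattern: now().strftime(pattern),
    }


def resolve(value, interpolators):
    def replace(match):
        prefix, argument = match.group(1), match.group(2)
        handler = interpolators.get(prefix)
        # assumption: expressions with an unknown prefix are left as they are
        return handler(argument) if handler else match.group(0)

    return PREFIXED.sub(replace, value)


# A fixed clock keeps the example reproducible.
fixed_clock = lambda: datetime(2024, 1, 31, 14, 5, 9)
interpolators = make_interpolators(now=fixed_clock)
worker_file = resolve("/tmp/appFile_${date:%Y%m%d-%H%M%S}.dat", interpolators)
```

Adding another source, such as file content, would only mean registering one more function under a new prefix.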

It is worth noting that all interpolators resolve at invocation. So you always get current results. In the case of the file content interpolator, this has the downside of file I/O; but if that really becomes a bottleneck, you are still not worse off than when having the read somewhere else in the code. And the file system caching is also still there …
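What "resolve at invocation" means in practice can be shown with a few lines of Python. This is only a sketch: the class LazyConfig and the counter interpolator are hypothetical and exist solely to make the effect visible.

```python
import re

PREFIXED = re.compile(r"\$\{(\w+):([^}]*)\}")


class LazyConfig:
    """Keeps the raw values and runs the interpolators on every read."""

    def __init__(self, raw_values, interpolators):
        self._raw = dict(raw_values)
        self._interpolators = interpolators

    def get(self, key):
        def replace(match):
            handler = self._interpolators[match.group(1)]
            return handler(match.group(2))

        # Resolution happens here, at read time, not when the config is loaded.
        return PREFIXED.sub(replace, self._raw[key])


# Hypothetical interpolator whose result changes with every call.
calls = []


def counter(_argument):
    calls.append(1)
    return str(len(calls))


config = LazyConfig({"runId": "run-${counter:next}"}, {"counter": counter})
first = config.get("runId")
second = config.get("runId")
```

Two reads of the same key give two different results, because nothing is cached between them. The same mechanism is what causes the repeated file I/O mentioned above for the file content interpolator.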

With this set of options, pretty much all requirements for a redundancy-free configuration management can be met.

Mac OS X Mavericks: Mouse Right-Click Stopped Working

Not sure how many people still use Mac OS X Mavericks, but anyway. I just had the issue that the right-click had stopped working, on the mouse as well as the trackpad. There are many posts on the subject, so it seems to be a not too uncommon problem. The one that helped me suggested simply turning the mouse off and on. Strangely enough, this fixed it for both mouse and trackpad. Thanks!

Martin Kleppmann – Conflict Resolution for Eventual Consistency

Here is a rather interesting video from Martin Kleppmann where he talks about dealing with concurrent changes to data. While the title may sound theoretical to some, it is a topic that probably every developer has come across. And here is also the link to the paper with the algorithm presented. If you are interested in an implementation, check this GitHub project.

Configuration Management – Part 4: The Differences

The capability to deal with multiple environments is another strength of the WxConfig solution. It basically uses a late-binding approach and, upon loading, checks the environment type of the current system (Docker-compatible, by the way). That way, only values valid for the current environment type will be loaded.

Before diving into the details, I want to share a few more conceptual thoughts here. It is worth noting that WxConfig primarily operates on files for storing configuration values, while many projects I have seen use a database for that. A database offers functionality that I would consider “convenience” in the context of configuration management, but it also has some drawbacks (for this use-case).

Primarily, access to files’ content is always easier than to a database. Yes, there is probably no reasonably conceivable scenario that could not be solved with a database. However, it is easier to e.g. view a file in a terminal session than to start the database command line client and run a select statement. And especially so from a support perspective, when you don’t need additional complexity.

But even more important are things like scripted operations against such files and version control. Why scripting, you may ask. Well, because I see it as a “last line of defense” to achieve a certain behavior. There will always be use-cases out there, that I have not anticipated. And for those it is possible to augment files and achieve what is needed outside of my solution. Also possible with a database – somehow. But do you really prefer that over an Ant script run from your Continuous Integration (CI) job?

And for version control the advantage of files is probably even more obvious. Yes, you can work with something like Liquibase and effectively have a file representation of your database’s structure and contents. But is that so desirable when you could just have plain files?

But, finally, back to different environments and associated aspects. The approach I have chosen is to dynamically load only those files at run-time that are suitable for the current system. So for all config files there is a check whether they are relevant or not. Those that are, will be loaded and the contents of all files added to memory for use by custom code. That is the whole secret.

To rephrase: Once the initialization is finished all the relevant configuration values are available to the application. Their source (i.e. which files were loaded) is completely abstracted away. It is this separation that makes WxConfig so powerful. And in the meantime that basic approach was also applied to other conditions. Because at its core the environment type is just that, a condition.

So we have regular (=unconditional) file loading and conditional loading. Other conditions currently supported are the type of operating system, the version of the run-time container, a timeframe, and also the hostname. And they can be combined with a logical AND. So you can easily develop a solution on Windows that will later run on Linux. All the specifics can be handled by configuration.
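The loading logic described above can be sketched in a few lines of Python. How WxConfig actually attaches conditions to files is not covered here, so the data structure below (a pair of conditions and values per file), the fact names envType, os and hostname, the environment names DEV and PROD, and the rule that later files override earlier ones are all assumptions made for this example.

```python
import platform
import socket


def current_facts(environment_type):
    """Collect the facts of the running system that conditions are checked against."""
    return {
        "envType": environment_type,
        "os": platform.system(),
        "hostname": socket.gethostname(),
    }


def is_relevant(conditions, facts):
    """No conditions means unconditional loading; otherwise all must match (logical AND)."""
    return all(facts.get(name) == expected for name, expected in conditions.items())


def load(files, facts):
    """Merge the values of all relevant files (assumption: later files win)."""
    merged = {}
    for conditions, values in files:
        if is_relevant(conditions, facts):
            merged.update(values)
    return merged


# Each entry stands for one configuration file: (conditions, values).
files = [
    ({}, {"appName": "demo", "tmpDir": "/tmp"}),
    ({"envType": "DEV"}, {"dbHost": "localhost"}),
    ({"envType": "PROD"}, {"dbHost": "db.internal.example"}),
    ({"envType": "DEV", "os": "Windows"}, {"tmpDir": "C:/Temp"}),
]

# Hard-coded facts keep the example reproducible; current_facts("DEV") would
# describe the machine the code actually runs on.
facts = {"envType": "DEV", "os": "Linux", "hostname": "devbox"}
config = load(files, facts)
```

The calling code only sees the merged result and never needs to know which files contributed to it.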

When I came up with the idea more than nine years ago, it was a bit like the obvious approach to take. Since then there have been numerous moments when this decision was re-affirmed by the flexibility I got from it.

Size of Development Teams

What is a good size for a development team? Of course there is no truly universal answer to this question. Software projects are just too diverse. But still my gut feeling, as well as several articles I read on the subject, tells me that in many cases two to five people is probably a good initial estimate.

Obviously this excludes things above a certain size, like operating or ERP systems in their entirety. But change the focus to a module of the aforementioned and all of a sudden the team with five people is not so ridiculously small any more. So, like in many other cases, it all comes down to scope.

My general line of thinking is this: When something is too big for five people, it comes with an inherent complexity that will increase the likelihood of failure considerably. Of course, you can throw more resources at the problem. But you will get all the by-products of that. Think about communication overhead between developers, additional management layers, and the increased risk of low-performers among all the people that are pouring in.

Interestingly, you can also approach things from the opposite direction: I was once told a story about a CEO who reduced the size of a development team to get things done better. At the time I thought this was a ridiculous approach. But I have long changed my mind since then and wholeheartedly agree. It is a bit like the inverse of what is described in “The Mythical Man-Month”, where Frederick P. Brooks, Jr. describes the naive idea that adding developers will shorten the time needed to accomplish a certain goal.

So in my view the goal should absolutely be to have development teams no larger than five people. If the overall project is too big for this, you need to slice it into parts that can be delivered by small-enough teams. This will also help to enforce clean interfaces/contracts between the various parts.

All of the above assumes that we are talking about people with considerable experience (5+ years) in the relevant area. Given that the average experience of software developers is about five years, it will often be challenging to staff such a team. (And the five-year average is for their entire career, not the relevant subject area.) But dealing with this is the topic for a different post.