Category Archives: ALM

webMethods Integration Server: Continuous Deployment

For more than nine years I have been working on a package for webMethods Integration Server. With the experience gained there, I want to discuss a number of aspects about Continuous Deployment.

Versioning

I recommend the use of semantic versioning, which at its core is about the following (for a lot of additional details, just follow the link):

  • The version number consists of three parts: Major, minor, and patch (example v1.4.2).
  • An increase in the major version indicates a non-backwards-compatible change.
  • An increase in the minor version indicates a backwards-compatible change.
  • A increase in the patch version indicates bug fixes only, no functional changes at all.

It is a well-known approach and makes it very easy for everyone to derive the relevant aspects from just looking at the version number. If the release in question contains bug fixes for something you have in use, it is probably a good idea to have a close look and check if a bug relevant to you was fixed. If it is a minor update and thus contains improvements while being backwards compatible, you may want to start thinking about a good time to make the switch. And if it is major update that (potentially) breaks things, a deeper look is needed.

Each Integration Server package has two attributes for holding information about its “version”. One is indeed the version itself and the other is the build. The latter is by default populated with in auto-increase number, which I find not very helpful. Yes, it gives me a unique identifier, but one that does not hold any context. What I put into this field instead is a combination of a date-time-stamp and the change set identifier from the Version Control System (VCS). This allows me to see at a glance when this package was built and what it contains.

Build

Conceptually the work on a single package, as opposed to a set of multiple ones that comprise one application, is a bit different in that you simply deal with only one artifact that gets released. In many projects I have seen, people take a slightly different approach and see the entirety of the project as their to-be-released artifact. This approach is supported by how the related tools (Asset Build Environment and Deployer) work: You simply throw in the source code for several packages, create an archive with metadata (esp. dependencies), and deploy it. Of course you could do this on a per-package basis. But it is easier to have just one big project for all of them.

Like almost always in live, nothing comes for free. What it practically means is that for every change in only one of the potentially many packages of the application, all of them need a re-deployment. Suppose you have an update in a maintenance module that is somewhat unrelated to normal daily operation. If you deploy everything in one big archive, this will effectively cause an outage for your application. So you just introduced a massive hindrance for Continuous Deployment. Of course this can be mitigated with blue-green deployments and you are well advised to have that in place anyway. But in reality few customers are there. What I recommend instead is an approach where you “cut” your packages in such a way that they each of them performs a clearly defined job. And then you have discrete CI job for each of them, of course with the dependencies taken into account.

Artifact Storage

Once your build has been created, it must reside somewhere. In a plain webMethods environment this is normally the file system, where the build was performed by Asset Build Environment (ABE). From there it would be picked up by Deployer and moved to the defined target environment(s). While this has the advantage of being a quite simple setup, it also has the downside that you loose the history of your builds. What you should do instead, is follow the same approach that has been hugely successful for Maven: use a binary repository like Artifactory, Nexus, or one of the others (a good comparison can be found here). I create a ZIP archive of the ABE result and have Jenkins upload it to Artifactory using the respective plugin.

To have the full history and at the same time a fixed download location, I perform this upload twice. The first contains a date-time-stamp and the change set identifier from the Version Control System (VCS), exactly like for the package’s build information. This is used for audit purposes only and gives me the full history of everything that has ever been built. But it is never used for actually performing the deployment. For that purpose I upload the ZIP archive a second time, but in this case without any changing parts in the URL. So it effectively makes it behave a bit like a permalink and I have a nice source for download. And since the packages themselves also contain the date-time stamp and change set identifier, I still know where they came from.

Deployment

Depending on your overall IT landscape there are two possible approaches for handling deployments. The recommended way is to use a general-purpose configuration management tool like Chef, Puppet, Ansible, Salt, etc. This should then also be the master of your webMethods deployments. Just point your script to your “permalink” in the binary reposiory and take it from there. I use Chef and its remote file mechanism. The latter nicely detects if the archive has changed on Artifactory and only then executes the download and deployment.

You can also develop your own scripts to do the download etc., and it may appear to be easier at a first glance. But there is a reason that configuration management tools like Chef et al. have had such success over the last couple years, compared to home-grown scripts. In my opinion it simply does not make sense to spend the time to re-invent the wheel here. So you should invest some time (there are many good videos on YouTube about this topic) and figure out which system is best for your needs. And if you still think that you will be faster with your own script, chances are that you overlooked some requirements. The usual suspects are logging, error handling, security, user management, documentation, etc.

With either approach this makes deployment a completely local operation and that has a number of benefits. In particular you can easily perform any preparatory work like e.g. adjusting content of files, create needed directories, etc.

Summary

All in all, this approach has worked extremely well for me. While it was first developed for an “isolated” utility package, it has proven to be even more useful for entire applications, comprised of multiple packages; in other words, it scales well.

Another big advantages is separation of concerns. It is always clear which activity is done by what component. The CI server performs the checkout from VCS and orchestrates the build and upload to the binary repository. The binary repository holds the deployable artifact and also maintains an audit trail of everything that has ever been built. The general-purpose configuration management tool orchestrates the download from the binary repository and the actual deployment.

With this split of the overall process into discrete steps, it is easier to extend and especially to debug. You can “inject” additional logic (think user exits) and especially implement things like blue-green deployments for a zero-downtime architecture. The latter will require some upfront thinking about shared state, but this is a conceptual problem and not specific to Integration Server.

One more word about scalability. If you have a big-enough farm of Integration Servers running (and some customers have hundreds of them), the local execution of deployments also is much faster than doing it from a central place.

I hope you find this information useful and would love to get your thoughts on it.

Configuration Management – Part 6: The Secrets

Every non-trivial application needs to deal with configuration data that require special protection. In most cases they are password or something similar. Putting those items into configuration files in clear is a pretty bad idea. Especially so, because these configuration files are almost always stored in a VCS (version control system), that many people have access to.

Other systems replace clear-text passwords found in files automatically with an encrypted version. (You may have seen cryptic values starting with something like {AES} in the past.) But apart from the conceptual issue that parts of the files are changed outside the developer’s control, this also not exactly an easy thing to implement. How do you tell the system, which values to encrypt? What about those time periods that passwords exist in clear text on disk, especially on production systems?

My approach was to leverage the built-in password manager facility of webMethods Integration Server instead. This is an encrypted data store that can be secured on multiple levels, up to HSMs (hardware security module). You can look at it as an associative array (in Java usually referred to as map) where a handle is used to retrieve the actual secret value. With a special syntax (e.g. secretValue=[[encrypted:handleToSecretValue]]) you declare the encrypted value. Once you have done that, this “pointer”will of course return no value, because you still need to actually define it in the password manager. This can be done via web UI, a service, or by importing a flat file. The flat file import, by the way, works really well with general purpose configuration management systems like Chef, Puppet, Ansible etc.

A nice side-effect of storing the actual value outside the regular configuration file is that within your configuration files you do not need to bother with the various environments (add that aspect to the complexity when looking at in-file encryption from the second paragraph). Because the part that is environment-specific is the actual value; the handle can, and in fact should, be the same across all environments. And since you define the specific value directly within in each system, you are already done.

Configuration Management – Part 5: The External World

A standard requirement in configuration management is using values that already exist somewhere else. The most obvious places are environment variables (defined by the operating system) and Java system properties. Other are existing (!) databases, e.g. from an ERP system, or files.

The reason for referencing values directly from their original sources instead of duplicating them in a copy-paste fashion is the Don’t-Repeat-Yourself (DRY) principle. Although the latter is typically discussed in the context of code, it applies to configuration at least as well. We all know those “great” applications, which require manual updates of the same value in different places. And if you miss only one, all hell breaks loose.

For re-use of values within a file, the standard approach is variable interpolation. A well-known syntax for that comes from the build tool Apache Ant, where ${variableName} can be placed anywhere into a value assignment. Apache Commons Configuration, and therefore also WxConfig, support this syntax. In WxConfig you can even reference values from other files that belong to the same Integration Server package.

But what to do for other sources? Well, for the aforementioned environment variables and Java system properties, the respective interpolators from Apache Commons Configuration can also be used in WxConfig. But the latter also defines several interpolators of its own.

  • Cross package: Assuming you have an Integration Server package that holds some general values (often referred to as “global”), those can be referenced. (There are multiple ways to deal with truly global values and the specifics really determine which one is used best.)
  • Current date/time: Gets the current date and/or time, which is admittedly a bit of an edge-case. One could argue that this is not really a configuration value, and that is correct. However, there are scenarios when the ability to quite easily have such a value comes in very handy. Think about files that get created during processing of data. Instead of manually concatenating the path, the base filename, the date/time stamp and the file suffix in the code, you could just have something like this in your configuration file: workerFile=${tmpDir}/appFile_${date:yyyyMMdd-HHmmssSSS}.dat Doesn’t that make things a lot cleaner to read?
  • File content: Similar to date/time this was added primarily for convenience and clarity reasons. The typical use-case are templates, where a file (possibly maintained by another application) is just read directly into a configuration value.
  • Code invocation: When all else fails, you can have an arbitrary piece of logic be executed and the result be placed into the configuration value.

It is worth noting that all interpolators resolve at invocation. So you always get current results. In the case of the file content interpolator, this has the downside of file I/O; but if that really becomes a bottleneck, you are still not worse off than when having the read somewhere else in the code. And the file system caching is also still there …

With this set of options, pretty much all requirements for a redundancy-free configuration management can be met.

Configuration Management – Part 4: The Differences

The capability to deal with multiple environments is another strength of the WxConfig solution. It basically uses a late-binding approach and, upon loading, checks the environment type of the current system (Docker-compatible, by the way). That way, only values valid for the current environment type will be loaded.

Before diving into the details, I want to share a few more conceptional thoughts here. It is worth noting that WxConfig primarily operates on files for storing configuration values, while many projects I have seen use a database for that. A database offers functionality that I would consider “convenience” in the context of configuration management, but it also has some drawbacks (for this use-case).

Primarily, access to files’ content is always easier than to a database. Yes, there is probably no reasonably conceivable scenario that could not be solved with a database. However, it is easier to e.g. view a file in a terminal session, than starting the database command line client and run a select statement. And especially so from a support perspective, when you don’t need additional complexity.

But even more important are things like scripted operations against such files and version control. Why scripting, you may ask. Well, because I see it as a “last line of defense” to achieve a certain behavior. There will always be use-cases out there, that I have not anticipated. And for those it is possible to augment files and achieve what is needed outside of my solution. Also possible with a database – somehow. But do you really prefer that over an Ant script run from your Continuous Integration (CI) job?

And for version control the advantage of files is probably even more obvious. Yes, you can work with something like Liquibase and effectively have a file representation of your database’s structure and contents. But is that so desirable when you could just have plain files?

But, finally, back to different environments and associated aspects. The approach I have chosen is to dynamically load only those files at run-time that are suitable for the current system. So for all config files there is a check whether they are relevant or not. Those that are, will be loaded and the contents of all files added to memory for use by custom code. That is the whole secret.

To rephrase: Once the initialization is finished all the relevant configuration values are available to the application. Their source (i.e. which files were loaded) is completely abstracted away. It is this separation that makes WxConfig so powerful. And in the meantime that basic approach was also applied to other conditions. Because at its core the environment type is just that, a condition.

So we have regular (=unconditional) file loading and conditional loading. Other conditions currently supported are the type of operating system, the version of the run-time container, a timeframe, and also the hostname. And they can be combined with a logical AND. So you can easily develop a solution on Windows that will later run on Linux. All the specifics can be handled by configuration.

When I came up with the idea more than nine years ago, it was a bit like the obvious approach to take. Since then there have been numerous moments when this decision was re-affirmed by the flexibility I got from it.

Configuration Management – Part 3: The Journey Begins

Today I want to tell you about a configuration management solution I have developed. It is built for Software AG‘s ESB (the webMethods Integration Server). The latter did, for a very long time, not have built-in capabilities for dealing with configuration values. So what many projects did, was to hack together a quick-and-dirty “solution” that was basically a wrapper around Java properties.

At the time (early 2008) I was working in a “SWAT” team in presales, where we dealt with large prospects that typically looked for a strategic solution to their integration requirements. Of course configuration management, although not as prevalent as today, came up more and more often. So instead of trying to dodge the bullet, I started working on a solution. And since webMethods Integration Server is built on an open architecture, you can easily add things.

So let’s dive into some of the details of my configuration management solution, called
WxConfig. Its overall design goal was to have packages (roughly the platform’s equivalent to a web application on servlet containers) that could be moved around without any additional manual setup or configuration. The latter is a requirement stemming from the fact that we are in an ESB/integration context. By definition most packages therefore have some “relationship” to the outside world, and are not an application running more or less in isolation. Typical example would be a connection to a database, or an ERP system.

Traditionally there were two approaches for dealing with these external environment dependencies. On development (DEV) environments, there was some kind of document, telling the developer what setup/configuration they would have to apply to their DEV environment. For all “higher” stages like test (TEST) or production (PROD) this was hopefully handled by some kind automation; although back in 2008 that was rarely case and more often than not also there a manual checklist was used.

From a presales perspective (that is where it all started) the DEV part was the relevant one. My idea was to come up with something that could transform the document with the setup instructions into some kind of codification. So I basically devised an XML format (or rather a convention) that did just that. It started out quite small, with only schedulers being supported. (In that specific case we needed a periodic execution of logic for a demo.)

Since then a lot of other things were added. The approach was so successful that  many customers have moved their entire environment setup (well, at least those parts that are supported) to the WxConfig solution. And there are, interestingly enough, quite a few customers who use WxConfig solely for environment configuration but not for configuring their actual applications.

You may now ask: Hang on, what about the other environments, like TEST and PROD? Very good question! This is one of the core strengths of the WxConfig solution and therefore warrants a separate post. Stay tuned …

Configuration Management – Part 2: Splendid Isolation

After the first post on configuration management, today I want to write about what I consider one of the biggest flaws in all systems I am aware about so far. They all live pretty much in “splendid isolation” and ignore whatever surrounds them.

My professional background is integration software and I attribute it to this fact, that for me it is a rather obvious real-world requirement to not ignore what is around you, but connect with it. In the past this was typically applied to applications on the business level. So having the ERP system talk with the CRM and SCM applications is well-established. But of course only your own imagination stops you from doing the same for other software. For an example, please check my post on integration of ALM software.

So what specifically makes integration an interesting aspect of configuration management? There are many different “layers” of configuration management out there. It starts with the infrastructure where you typically find a CMDB (Configuration Management Database), that contains information about computers and other components (esp. network). Then there are tools like Chef, Puppet, Ansible etc. that manage the operating system. (I know that is quite a simplification, because you can obviously handle other aspects that way as well; and I have done that extensively.)

A whole “new” level comes into play with containers like Docker. They come, again, with their own tools to manage configuration and operations. Then there is infrastructure-level software like databases, middleware, application servers, etc.

And last but not least, you finally hit the actual application. This is, by the way, the only, if any, level that the business will be interested in. They do not care about the vast complexity underneath, the interdependencies, etc. But of course they expect us to deliver value fast, without interruptions or failures, and for as little money as possible.

If there is no well-managed link between all those layers (and of course the number of layers needs to be multiplied by the number of environment types), how could you be able to handle things, let alone handle them well?

This is why I firmly believe, you need integration, governance, workflow capabilities etc. around configuration management.

Configuration Management – Part 1: The Horrors

Why is Configuration Management (CM) so important? The simple answer is: Because the correct configuration is one of the three critical parts for running a non-trivial piece of software successfully. (The other two are the actual code and the data.)

I still remember my first project when I was fresh out of university. While the team was great, everything else was a bloody nightmare. In particular the software, for which we developed, did not support configuration management in any way. Staging was basically done by exporting from the development system (DEV), manually finding and changing values in the export (at least it was plain ASCII), and importing it on the test system (TEST), and so on.

As if this wasn’t bad enough, there actually was no TEST environment. The customer had decided to save the money, so we had a combined DEV+TEST system. So more than once we ended up with the production system (PROD) accessing the DEV+TEST database. You may ask why we did not simply write a script to change the various settings automatically. Well, we were not allowed to, because it was not budgeted for.

And it gets even worse, because I actually lied to you when describing the staging process before. The problem was that while the export from one environment worked fine, the import into another did not after some time. It was basically a limitation of the maximum size the import file could could have. Our solution (after consultation with the vendor!) was to shut down both environments (DEV+TEST and PROD) and copy the binary storage files that held the development from one to the other. Then all the relevant values had to be adjusted manually on PROD, and then a test to be performed.

Of course that test failed a couple of times until all the changes had been done and recognized by the system (caching can be a dangerous thing). And because we were interfacing with at least five other systems (all PROD!) such a test run was not so easy. To make things more complicated, every set of test data (to be manually prepared on all surrounding systems) could only be used once. So we made a lot of “friends” with the folks operating the other systems.

We pulled a few all-nighters and somehow got things up and running (there were many other nice things with the software, of course). But it was frustrating and very inefficient. Since then configuration has been something that I deeply care about. And you probably understand why now.

ESP8266 Management Platform

There is an awful lot of people doing hobby projects with the ESP8266. I did not want to come up with the 47th incarnation of something, so the question came up, what an interesting project could look like. In the end I decided to develop a solution for managing a multitude of ESP8266-based devices. From a high-level perspective this will include the following components:

  • CMDB: holds information about all devices
  • Bootstrapper: prepares the raw module for all further work
  • Security Manager
  • Lifecycle Manager: Reference processes “from cradle to grave”

In my professional life I have been doing a lot of work on SDLC (Software Development Lifecycle) and configuration management, so the above project choice seemed a natural fit. It also happens that I am deeply interested in both topics and strongly think that poor application of them is often responsible for IT project failures.