ESXi 6.5: Intel NICs Not Found

While playing around with an ESXi 6.5 test system, I accidentally killed all network connectivity by setting the NICs to pass-through. This post gives a bit of background and the solution that worked for me.

The system is home-built with a Fujitsu D3410-B2 motherboard and an Intel dual-port Gigabit NIC (HP OEM). The motherboard has a Realtek RTL8111G chip for its NIC, which allegedly works with community drivers, but not out of the box. One of the things I want to run on this box is a pfSense router. So when I discovered that the Realtek NIC was available for pass-through, I enabled it. I also enabled one(!) of the two ports of my Intel dual-port NIC. At least, that is what I had intended to do.

Because what really happened was that all three NICs were set to pass-through, which of course meant that ESXi itself had no NIC available any more. The issue showed up after the next reboot, when the console told me that no supported NICs had been found in the system. Perhaps not wrong in the strictest sense, but certainly a bit misleading when you are not very experienced with ESXi.

Searching the net did not provide a real answer. But after a couple of minutes I realized that my pass-through change might be the culprit. The relevant file where these settings are stored is /etc/vmware/esx.conf. I searched for lines looking like this

/device/000:01.0/owner = "passthru"

and replaced them with

/device/000:01.0/owner = "vmkernel"

After that I just had to reboot and things were fine again.
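For reference, the edit can also be scripted. The snippet below is a minimal sketch that runs against a sample file rather than the live /etc/vmware/esx.conf, and the device IDs are just examples; on a real host they will differ, so check the grep output before replacing anything.

```shell
# Sketch of the fix, demonstrated on a sample file instead of the live
# /etc/vmware/esx.conf. Device IDs are examples and will differ per host.
conf=$(mktemp)
cat > "$conf" <<'EOF'
/device/000:01.0/owner = "passthru"
/device/000:02.0/owner = "passthru"
/device/000:03.0/owner = "vmkernel"
EOF

# List the devices currently configured for pass-through
grep 'owner = "passthru"' "$conf"

# Hand them back to the VMkernel so ESXi can use them again
sed -i 's/owner = "passthru"/owner = "vmkernel"/' "$conf"

grep 'owner' "$conf"
```

On the real system, point the script at /etc/vmware/esx.conf instead of the temp file and reboot afterwards.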


Re-Writing Software from Scratch

It is not uncommon that an existing piece of non-trivial software gets re-written from scratch. Personally, I find this preference for a greenfield approach interesting, because it is not something I share. In fact, I believe that it is a fundamentally flawed approach in many cases, so why are people still doing this today?

Probably because they just like it more, and because it is perceived, at least when things get started, as more glorious. But when you dig deeper, the picture changes dramatically. Disclaimer: Of course there are situations when re-starting is the only option. But they typically involve aspects like required special hardware not being available any more.

When I started making notes for this post, it was all technical arguments. They are of course relevant, but the business side is much more critical. Just re-phrase the initial statement about re-writing to something like

Instead of gradually improving the existing software and learning along the way, we spend an enormous amount of money on something that delivers no additional value compared to what we have now. For this period, which we currently estimate to be two years (it is very likely to be longer), apart from very minor additions, the business will not get anything new, even if the market requires it; so long-term we risk the existence of the organization. And yes, we may actually lose some key members of staff along the way. Also, it is not certain that the new software will ever work as expected. But should it really do so, the market will have changed so much that the software is not very useful for doing business anyway, and we can start all over again.

Is there anyone who still thinks that re-writing is generally a good idea?

Let us now change perspective and look at it from a software vendor’s point of view. Because the scenario above was written with an in-house application in mind. What is different, when we look at a company that develops and sells enterprise software? For this text the main properties of enterprise software are that it is used in large organizations to support critical business processes. How keen will customers be to bet their own future on something new, i.e. not tested? But even if they waited long enough for things to stabilize, there would be the migration effort. And if that effort comes towards them anyway, they may just as well look at the market for alternatives. So you would actively encourage your customer base to turn to the competition. Brilliant idea, right?

What becomes clear looking at things like that, is what the core value proposition of enterprise software is: investment protection. That is why we will, for the foreseeable future, continue to have mainframes with decades-old software running on them. Yes, these machines are expensive. But the alternatives are more expensive and in addition pose enormous risk.

In the context of commercial software vendors one argument for a re-write is that of additional revenue. It is often seen as easier to get a given amount of money for a new product than for an improved version of something already out there. But that is the one-off view. What you typically want as a software vendor is a happy customer that pays maintenance every year and, whenever they need something new, first turns to you rather than the competition for options. Also, such a happy customer is by far the best marketing you can get. It may not look as sexy as getting new customers all the time, but it certainly drives the financial bottom line.

Switching over to the technical side, there are a few arguments that are typically made in favor of a restart. My favorite is the better architecture of the new solution, which will be the basis for future flexibility, growth, etc. I believe that most people are sincere here and think they can come up with something better. But the truth is that unless someone has done something similar successfully in the past, there is a big chance that the effort is hugely underestimated. Yes, technology has advanced and we have better languages and frameworks. But on the other hand the requirements have also grown dramatically. Think about high availability, scalability, performance and all the others. Even more important, though, is the business side. With something brand new people will have greatly increased expectations. So giving them something like-for-like will probably not be seen as success.

The not-invented-here syndrome is also relevant in this context, particularly with more junior teams. I have seen a case where an established framework used in more than 9,000 business-critical (i.e. direct impact on revenue) installations was dismissed in favor of something the team wanted to develop themselves. And I can tell you that the latter was a really bad implementation. Whether it was a misguided sense of pride or a fundamental lack of knowledge I cannot say. But while it was the most extreme incarnation I have seen so far, it was certainly not the only one.

So far my thoughts on the subject. Please let me know what you think about this topic.

Theory of Constraints and “The Goal”

Two years after reading Eliyahu M. Goldratt’s famous book “The Goal” for the first time, I had a second go at it. It proved, again, to be an entertaining and at the same time enlightening read. I will not come up with yet another summary; you will find plenty of those already. Instead I want to point out a few interesting links to other areas.

Let me start with an interesting connection to my article about personal objectives. One of the statements I had made there was that a global optimum cannot be reached by local optima everywhere; some local deficiencies need to be accepted to reach the overall goal. (Apart from the common-sense aspect, this is also formally proven in systems theory.) Well, exactly the same point is made in “The Goal” when Dr. Goldratt writes that at the end of the day, only the amount of shipped goods, i.e. goods sold, counts. Any intermediate over-achievement (“We beat the robot”) is completely irrelevant.

Another interesting link is one with the book “Turn the Ship Around” by David Marquet (see this for a quick summary). Mr Marquet makes the point that in his experience a deciding factor for an organization’s performance (in his case the crew of a nuclear attack submarine) is whether its members work to avoid individual mistakes or to achieve a common goal. Similarly, in “The Goal” there is a paragraph about the success of various changes in a manufacturing context. It is basically about the different intent (and Mr Marquet calls his style intent-based leadership, by the way). Instead of trying to reduce costs, the manufacturing team looked at things from a revenue generation point of view.

In the addendum “Standing on the Shoulders of Giants” a short introduction is given to the Toyota Production System (TPS). One interesting comment in this section is that other car manufacturers which have implemented a comparable system have not achieved the same level of improvement. This is attributed to the guiding principle: Toyota focused on throughput (and by that ultimately on revenue), whereas all the others looked at cost reduction. An analogy to that is a study (source unknown as of this writing) about customer satisfaction vs. profit optimization. Allegedly, those companies that focus on customers not only win on this front, but in the mid- to long-term they also consistently outperform those that focus on financials.

Revisiting Software Architecture

Quite recently I heard a statement similar to

“The application works, so there is no need to consider changing the architecture.”

I was a bit surprised and must admit that in this situation I had no proper response for someone who obviously held a view so different from everything I believe in. But when you think about it, there are obviously a number of reasons why this statement was a bit premature. Let’s have a look at this in more detail.

There are several assumptions and implicit connotations, which in our case did not hold true. The very first is that the application actually works, and at the time that was not entirely clear. We had just gone through a rather bumpy go-live and there had not yet been a single work item processed by the system from start to finish, let alone all the edge cases covered. (We had done a sanity test with a limited set of data, but that had been executed by folks who had been on the project for a long time, not real end users.) So with all the issues that had surfaced during the project, nobody really knew how well the application would work in the real world.

The second assumption is that the chosen architecture is a good fit for the requirements. From a communication theory point of view this actually means “a good fit for what I understand the requirements to be”. So you could turn the statement in question around and say “You have not learned anything new about the requirements since you started the implementation?”. Because that is what it really means: I never look back and challenge my own thoughts or decisions. Rather dumb, isn’t it?

Interestingly, the statement was made in the context of a discussion about additional requirements. So there is a new situation and of course I should re-evaluate my options. It might indeed be tempting to just continue “the old way” until you really hit a wall. But if that happens you have consciously increased the sunk costs. And even if you can “avoid the wall”, there is still a chance that a fresh look at things could have fostered a better result. So apart from the saved effort (and that is only the analysis, not a code change yet) you can only lose.

The next reason is difficulties with the original approach, and of those there had been plenty in our case. Of course people are happy that things finally sort of work. But the more difficulties there have been along the way, the bigger the risk that the current implementation is either fragile or still has some hidden issues.

And last but not least there are new tools that have become available in the meantime. Whether they have an architectural impact obviously depends on the specific circumstances. And it is a fine line, because there is always temptation to go for the new, cool thing. But does it provide enough added value to accept the risks that come with such a switch? Moving from a relational database to one that is graph-based, is one example that lends itself quite well to this discussion. When your use-case is about “objects” and their relationships with one another (social networks are the standard example here), the change away from a relational database is probably a serious option. If you deal with financial transactions, things look a bit different.

So in a nutshell here are the situations when you should explicitly re-evaluate your application’s architecture:

  • Improved understanding of the original requirements (e.g. after the first release has gone live)
  • New requirements
  • Difficulties faced with the initial approach
  • New alternatives available

So even if you are not such a big fan of re-factoring in the context of architecture, I have hopefully shown you some reasons why it is usually the way to go.

My WiFi Setup with Ubiquiti Networks UAP-AC-PRO

After almost nine months it is time for a verdict on my “new” WiFi access, the Ubiquiti Networks UAP-AC-PRO. I can honestly say that it works extremely well here, and all the WiFi issues I have had for years have simply gone. The device is comparatively expensive (I paid about 140 Euros at Amazon), and had my existing solution not caused so many issues, I probably would not have spent the money. But for me it was definitely worth it.

I was initially made aware of the Ubiquiti Networks UAP-AC-PRO by an article on a German website that covers Apple-related topics. The guys there were quite enthusiastic about it, especially its graceful and uninterrupted handover of connections from one access point to the other. The latter had been a particularly nasty issue for me, with a Fritz!Box Fon WLAN 7390 covering the ground floor and a FRITZ!WLAN Repeater 450E, configured as a pure access point, covering the first floor. There simply was no handover, so I effectively had to configure two completely separate WiFi networks. In addition, the FRITZ!WLAN Repeater 450E needed a regular power-cycle because it would stop working every couple of days for no apparent reason. Its predecessor, a FRITZ! 300E WLAN Repeater, was much better in that respect, but it had died after a bit more than two years.

So all in all the situation was not too great on the WiFi front. This changed dramatically when I replaced both the Fritz!Box and the FRITZ!WLAN Repeater with just a single UAP-AC-PRO. Or in other words: Just one UAP-AC-PRO gave me better WiFi than both Fritz components combined. Impressive! As a consequence, the seamless handover of connections from one access point to another was not relevant any more at all. So instead of buying a second UAP-AC-PRO, I just have one and all is well. The flip-side is that I cannot play around with this feature ;-).

To sum things up, I am extremely satisfied with the UAP-AC-PRO. For administration I run the Unifi program on a Raspberry Pi 3 (model 2 worked just as well for me) and I will write another post on some of the setup aspects of that later. If you search on the Internet or look at YouTube, you will also find a lot of additional information.

Configuration Management – Part 8: The Maintenance

How do I maintain my configuration data? It is one thing to have it stored somewhere and be able to maintain it if you are a developer or a technical person in general. In that case you will be fine with a plain text editor, sometimes even something like vi (I am an Emacs guy 😉 ). But what if you want business users to be able to do this by themselves? In fact, quite often these folks will also tell you that they do not want to be dependent on you for the job.

When trying to identify the requirements for a maintenance tool suitable for business users, I came up with the following points (in that order of priority):

  • Familiarity: People do not want to learn an entirely new tool just for maintaining a few configuration values. So exposing them to the complete functionality of a configuration management system will not work. And even if the latter has something like role-based configuration or user profiles, it will typically just hide those parts that “normal” users are not supposed to see. But it will still, tacitly, require a lot of understanding of the overall system.
  • Safety: In many cases business people will tell you that while they want to make changes by themselves, they still expect you (or the system you develop) to check what they entered. In a number of cases the argument went so far that people, from my perspective, were trying to avoid responsibility for their actions altogether. Whether this was just an attempt to play the blame game or rather an expression of uncertainty, I cannot say. But you definitely need to add some automated checks.
  • Auditing: In many cases you must maintain records of changes for legal reasons (e.g. the Sarbanes–Oxley Act). But even if that is not the case, you as the technically responsible person absolutely want records of what configuration was in the system at what point in time.
  • Extensibility: Especially today, when most of us try to work in an agile manner, this should not need mentioning. But unfortunately the reality I mostly witness is quite different. Anyway, what we as developers need to do here is scrutinize what the business folks have brought up as their requirement. It is not their job to tell us about it, but the other way around.
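To make the safety point a bit more concrete, here is a minimal sketch of what such an automated check could look like, assuming the configuration arrives as a CSV export; the column layout (key,value) and the upper limit are invented for illustration.

```shell
# Hypothetical sanity check run before an uploaded configuration is
# accepted; the CSV layout (key,value) and the limit are invented.
check_config() {
    awk -F, 'NR > 1 {
        if ($2 !~ /^[0-9]+$/) { print "line " NR ": value is not numeric"; bad = 1 }
        else if ($2 > 3600)   { print "line " NR ": value exceeds limit"; bad = 1 }
    } END { exit bad }' "$1"
}

f=$(mktemp)
printf 'key,value\ntimeout,30\nretries,5\n' > "$f"
check_config "$f" && echo "config OK"
```

The point is not the specific rules but that the check runs automatically, so the business user gets immediate feedback and cannot accidentally break the system.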

So what is the solution? I see basically two options here: a custom-developed web UI or Excel. The Excel approach may surprise you, but it really has a number of benefits, also from a technical perspective. Remember that it is file-based, so you can simply leverage your normal VCS for auditing. How the business users “upload” the updates may vary. But you can start with something simple like a shell script/batch file that acts as a VCS wrapper, and many people will already be happy. More sophisticated stuff like a web upload and built-in verification is nicer, of course.
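The VCS-wrapper idea can be sketched as follows, assuming Git as the VCS and a CSV export of the Excel sheet; the repository location, file name, and the helper name upload_config are all made up for this example.

```shell
# Minimal sketch of a VCS wrapper for business users, assuming Git and a
# CSV export of the Excel sheet. Paths and names are invented examples.
set -e
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email "config.upload@example.com"
git -C "$repo" config user.name "Config Upload"

upload_config() {
    # $1 = path of the updated file, $2 = name of the person uploading
    cp "$1" "$repo/config.csv"
    git -C "$repo" add config.csv
    # Each upload becomes a commit, which gives us the audit trail for free
    git -C "$repo" commit -q -m "Config update by $2"
}

new=$(mktemp)
printf 'key,value\ntimeout,30\n' > "$new"
upload_config "$new" "jane.doe"
git -C "$repo" log --oneline
```

Because every upload is a commit, the auditing requirement from above falls out of the design with no extra work.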

A web UI has the benefit that you can provide feedback on rule violations more easily than with the Excel approach. But what is often overlooked is the need for mass updates. It is a big pain to click through 50 records, whereas in Excel that would simply be a copy-paste operation. And the latter is less susceptible to errors, by the way.

Sometimes a Business Rules Management System (BRMS), and the web UI it usually comes with, can be an alternative. But if you are not already using one on the project, the overhead of bringing in an entire additional system will typically outweigh the benefits. If you still want to explore that route, pay particular attention to how changes made in the web UI can be synced back into the artifacts in the VCS that the developers work with.

How Useful Are Personal Objectives?

At least here in Germany there has been a recent shift away from personal objectives in compensation. Some very big companies like Daimler, Bosch and Infineon have announced that they will abandon personal objectives in this context and only look at the overall result of the organization. (I am pretty certain that sales is exempt from that, though.) They will, of course, keep personal objectives as a tool for individual and career development, but there will be no more financial impact.

Personally, I think this is a move in the right direction for various reasons, and I would like to reflect on a few of them (in descending order of importance).

  • Local optimization: This is the aspect of individual, financially relevant objectives that has by far the biggest impact. You have all seen it at work when different parts of the organization seem to fight each other. With the odd exception, they do not do that out of actual dislike but because it is necessary for them to achieve their goals. One might think now that in this case the goals just need to be adjusted. Well, not really. This is a major research topic in the field of systems theory, and the boiled-down answer is that for a global optimum it is required that at least some sub-systems run below their local optimum. (For details see e.g. Systems theory and complexity: Part 3 by Kurt A. Richardson.) Besides, who in business would have the time to develop such an elaborate system?
  • Employee demotivation: I have yet to meet someone who thinks that personal goals which impact their salary are a good and motivating thing. (There is also the fundamental point that motivation is by definition intrinsic, but that is another discussion.) The best you can hope for is a neutral judgement. However, the normal approach is that objectives trickle down through the hierarchy, and the manager more or less gives the direct the same goals he or she got. I see two main problems here. First, if I get pretty much the same goal as my boss, how much influence do I still have to really control the result and thereby my payment? Second, all of a sudden I have to deal with abstraction levels that are in a different “hemisphere” from my actual work. How does that fit?
  • Time spent: If you as a manager are at least somewhat serious about giving your directs proper objectives and balancing your own financial outcome with their needs, you spend an enormous amount of time on this. You have to be careful how to sell them their goal (which you first had to devise), discuss the details and make adjustments. And the real fun comes with “judgment day”. Be prepared for discussions that will likely scar your relationship with the direct. On top come all the administrative details, although I am least concerned about those.

By now you probably know my position on the subject. I consider myself a highly motivated employee and love most of my work. So how can anyone assume that I do not throw all my energy behind what I am doing? And even if I did not, would an objective linked to money change anything? No, I would just play the system.

Let’s see if the companies mentioned at the beginning of this post are just the tip of the iceberg.