In an episode of Gilmore Girls, a character called Jackson gets hold of an industrial deep fryer. He wants to use it to deep fry a Thanksgiving turkey, but after that he and his drunk buddies start looking for other things to deep fry (“Deep fried cake!” “Deep fried biscotti!” “Deep fried shoe!”). I feel like that about microservices these days.
Look, here’s the thing. There is a whole section on this blog dedicated to microservice articles. I am as guilty as anybody of blowing that particular horn. As an architecture, microservices represent a great solution for a particular problem. They are just not a solution for EVERY problem. I feel we are entering deep fried biscotti territory if we think otherwise.
Every microservice presentation or article drives home the message “monoliths bad, microservices good”. As always, that is overly simplistic. Monoliths are not bad because of some inherent fatal flaw. Monoliths were the norm in the days of on-premises software delivery, where a single admin was in charge of installing a large product behind the firewall for internal company use.
Once the Cloud gained strength, developers realized that putting a large monolith in a container and operating it in the Cloud is very hard. Monoliths were hard to scale, hard to evolve and incrementally re-deploy, hard to configure for efficient Cloud-scale operation, yada yada yada – you have heard this a zillion times.
The key point here is the alignment of several factors that made microservices explode:
- Product is fully managed in the Cloud – there are no customer admins in the picture
- Product is multi-tenanted, serving a large number of active users – this means careful tuning and scaling is needed to avoid the product melting into the ground
- Individual teams operate the software – “you build it, you run it”. They know what their microservice is doing, why it is malfunctioning when it does so, and how to address any particular problem.
The great increase in complexity of a microservice system compared to the comparable monolith was not a big problem because each team saw only their own slice. Operational complexity was mitigated by intimate knowledge of the teams operating their own microservices – like having a designer of your car’s engine keeping your car in top shape for you.
Depending on when you jumped on the whole microservice bandwagon, it may or may not be possible to explain to you that microservices do not require Kubernetes as the container orchestration platform. From the days of 12-factor apps and Heroku, to Cloud Foundry, there were various ways to deploy a microservice system before Kubernetes. Kubernetes just made it easier, and became the default way of running a microservice system of any size. Yes, I know Cloud Foundry now runs on Kubernetes. Sheesh.
But again, as long as we are talking about a fully managed product in the Cloud, your users could not care less about what you are using to run it. A couple of years ago we ported a product with hundreds of microservices from Cloud Foundry to Kubernetes without our customers ever noticing or caring. I know you are super excited about Kubernetes, but your customers are not, trust me.
Kubernetes + microservices on-premises
By now we have established that microservices do not solve customer problems directly; they solve DevOps problems, and customers benefit from that. Teams can develop and operate microservices in relative autonomy, on their own schedule. Each microservice can be scaled precisely, leaving the rest of the system unaffected. Each microservice can use a different stack if you so desire (not that it is a particularly good idea – there is such a thing as ‘too much polyglotism’). But again, these are all solutions to developer and Cloud service operator problems. Customers don’t know or care how many microservices you are running, or if you are running any for that matter, or which stack they are using, as long as your Cloud service is fit for purpose, sufficiently fast and has the experience and features they need.
All this changes once you try to sell this Cloud product to customers to install by themselves behind their firewalls. They can tell you are giving them a microservice system to install because they need to install a Kubernetes distribution first, followed by dozens upon dozens of microservice images. They are acutely aware of all of your implementation details because they can see them running in the Kubernetes dashboard. Customers who in the monolith days got to install one product now get an Ikea-style flat-packed glass-door bookcase with two bags of hardware to assemble. Unlike the Ikea variety, your assembled k8s+microservices bookcase will occasionally make weird noises, doors will sometimes fail to open, screws will fall out on their own and it may also crash to the floor with a big thud, and you will need to rebuild it.
Your on-premises administrators are now a replacement for the sum of all the teams that doubled as Cloud ops, but with none of the deep understanding of how the microservices work, why they misbehave in a particular way and how to make them happy again.
There is no escaping it – a microservice-based product delivered for on-premises installation is really a monolith as far as customer admins are concerned. It’s as if they used to get computers in black boxes, and now they are getting gaming PCs in acrylic window cases where they can see the components inside. You look in the Kubernetes dashboard, and can see all the pods, replicas, ingress, storage, all the moving parts. Trouble is, as a customer admin you have little idea what they do and why some are misbehaving.
Why microservices on-premises are problematic
While writing this I am assuming that you are trying to ship a product originally built for the Cloud as installable software. These days there is an assumption that Kubernetes is so ubiquitous that admins will have no problem managing such a system. That assumption is wrong. Let me count the ways:
- If your company has its own public Cloud or cheap virtual or real infrastructure, you have very little idea how costly and resource-hungry your microservice system is. This realization comes to a head when you spec out how many nodes a k8s system needs to have to run your product, and how much CPU/RAM/disk space these nodes need to have. Companies that manage their own internal infrastructure may not have easy and cheap access to as many VMs or bare metal servers as you demand.
- If you are a particularly religious believer in microservices, you have way too many of them, which is now a problem when somebody else and not you needs to keep them running.
- “You build it, you run it” works great in the Cloud. On-premises, you build it, customer admins run it. They have no idea where to start and how to do it properly, or what to do when the microservices start failing. Chances are, you didn’t write everything that is in your teams’ collective heads down as Ops documentation.
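To make the first point concrete, here is a back-of-the-envelope sketch in Python of how per-replica reservations add up to a node count. Everything in it is an assumption made up for illustration: the `Service` class, the `nodes_needed` helper, the 40 services with 3 replicas each, the node size and the 25% headroom. It also ignores how pods actually pack onto nodes, so treat the result as a rough lower bound, not a sizing guide.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """One microservice's reservation; every figure here is hypothetical."""
    name: str
    replicas: int
    cpu_millicores: int  # CPU reserved per replica
    memory_mib: int      # memory reserved per replica


def total_reservation(services):
    """Sum reserved CPU (millicores) and memory (MiB) across all replicas."""
    cpu = sum(s.replicas * s.cpu_millicores for s in services)
    mem = sum(s.replicas * s.memory_mib for s in services)
    return cpu, mem


def nodes_needed(services, node_cpu_millicores, node_memory_mib, headroom=0.25):
    """Rough lower bound on node count, keeping `headroom` of each node free.

    Ignores bin-packing: real scheduling can need more nodes than this.
    """
    cpu, mem = total_reservation(services)
    usable_cpu = node_cpu_millicores * (1 - headroom)
    usable_mem = node_memory_mib * (1 - headroom)
    return max(math.ceil(cpu / usable_cpu), math.ceil(mem / usable_mem))


# Hypothetical product: 40 small services, 3 replicas each for availability.
services = [
    Service(f"svc-{i}", replicas=3, cpu_millicores=250, memory_mib=512)
    for i in range(40)
]
cpu, mem = total_reservation(services)
# Assumed node size: 4 CPU cores (4000 millicores) and 16 GiB of memory.
print(cpu, mem, nodes_needed(services, 4000, 16384))  # prints: 30000 61440 10
```

Even with these modest made-up reservations, the product asks for ten four-core machines before it has served a single request. That is the number a customer with a tight internal infrastructure budget will see first.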
Why we keep doing it anyway
There are a few reasons why. If you are developing Cloud-first, you don’t have a choice – microservices are built for complex Cloud systems. Meanwhile, many customers today are still wary of the public Cloud and want to be able to access their own data behind the firewall. You cannot afford to write Cloud and on-prem versions of your product – writing one code base makes a lot of sense from the development cost point of view. Plus, unless you are a ‘born in the Cloud’ startup, your Sales team is much more used to selling software than services (Cloud services are bought and billed as-you-go, which kind of removes the need for traditional Sales).
Having a uniform way of delivering units of functionality, or of delivering new versions/fixes/upgrades, is another reason Kubernetes is attractive for on-premises as well. While ‘service windows’ are still a thing behind the firewall, there is a lot of pressure for Cloud-like ‘zero downtime’ updates. Kubernetes’ rolling deploys are a relatively easy and standard way to achieve that requirement if you use microservices and each of the microservices can start in a few seconds.
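The idea behind a rolling deploy can be sketched with a small Python simulation. This is a simplified model, not how the Kubernetes controller is implemented: the `rolling_update` function is a made-up name, and it assumes every new pod becomes ready immediately. The two parameters mirror the `maxSurge` and `maxUnavailable` settings of a Deployment's rolling update strategy.

```python
def rolling_update(replicas, max_surge=1, max_unavailable=0):
    """Yield (old_ready, new_ready) after each step of a simplified rollout.

    Assumption: a started pod is ready right away. Real pods take time to
    start, which is why fast startup matters for a quick rollout.
    """
    if max_surge == 0 and max_unavailable == 0:
        # With no room to add or remove a pod the rollout could never progress.
        raise ValueError("max_surge and max_unavailable cannot both be 0")
    old, new = replicas, 0
    yield old, new
    while old > 0 or new < replicas:
        # Start new pods without exceeding replicas + max_surge pods in total.
        can_start = min(replicas - new, replicas + max_surge - (old + new))
        new += can_start
        # Stop old pods while keeping replicas - max_unavailable pods ready.
        can_stop = min(old, (old + new) - (replicas - max_unavailable))
        old -= can_stop
        yield old, new


for old, new in rolling_update(3, max_surge=1, max_unavailable=0):
    print(f"old={old} new={new} ready={old + new}")
```

With one extra pod allowed and none allowed to be unavailable, the number of ready pods in this model never drops below three, which is the ‘zero downtime’ property admins are asking for.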
Finally, large software products are built by many teams, and benefit from some kind of component model. In one of my past projects we used to deliver an on-premises Web server as a monolith, but it was assembled out of many OSGi components in the build phase. Kubernetes allows teams to put together a similar system using containers as components. The problem is that while this is great for development teams, it has little benefit for customer admins as they are managing such a system.
How to do it (relatively) well
OK, so delivering microservices on Kubernetes for on-premises software is not going away. Can we do it better then?
We can start this way:
- Review your microservices – in the Cloud, the primary reason to have microservices is so that teams can evolve and deploy them independently and continuously. Unfortunately, microservices have overhead – they need to be monitored, scaled with multiple replicas, they consume resources even when idling (minimal CPU/RAM/disk reservations), and each new microservice is a new moving part that can fail. Make each microservice earn a reason to exist, and if it fails that test, consider combining multiple microservices into a larger one.
- Optimize microservice performance – knowing your product will run on-premises in less than ideal circumstances, minimize network traffic, database chatter, and expensive calls as much as possible. Cut down on waste, remove unused code and files, switch to the new versions of dependencies with fixes and performance improvements. Chances are, your microservices could have used that tuning in the Cloud as well, but running in a more generous environment masked the pressing need for it.
- Create run books – transfer your team’s collective know-how required to successfully operate your microservices into detailed documentation. Chances are you already have run books, if only for Cloud first responders on duty that react to pagers when you are not around. Write two versions – one for your Cloud operations, another for operating your services in on-premises deployments.
- Automate everything – automating installation, upgrades and most routine tasks allows you to capture your team’s knowledge in repeatable scripts. It will also make support calls easier if scripts customer admins run can generate diagnostic information you need to figure out what is wrong in their particular installation.
- Consider writing a custom admin interface – no, the Kubernetes dashboard is not going to cut it. Kubernetes dashboards are by necessity generic – they are designed for any kind of system running there. You need a layer customized for YOUR product, the next level of semantic abstraction that allows customer admins to understand your product in a way that makes unique sense for it. That admin interface will of course turn around and call generic Kubernetes commands and APIs. The key is to talk to admins in a way that is appropriate for your product, not through low level Kubernetes implementation details.
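As a rough illustration of that last point, here is a minimal Python sketch of the kind of translation a product-specific admin layer could do: it rolls pod readiness up into the health of product features an admin actually recognizes. The feature names, service names and the `feature_health` function are all invented for this example. A real tool would fetch pod status from the Kubernetes API; here it is passed in as plain tuples to keep the sketch self-contained.

```python
from collections import defaultdict

# Hypothetical mapping from product features to the microservices behind them.
FEATURES = {
    "Search": ["indexer", "query-api"],
    "Reports": ["report-builder", "pdf-renderer"],
}


def feature_health(pods, features=FEATURES):
    """Summarize pod readiness per product feature.

    `pods` is an iterable of (service_name, is_ready) pairs, one per pod.
    A feature is "down" if any of its services has no ready pod (including
    services with no pods at all), "degraded" if some pods are not ready,
    and "healthy" otherwise.
    """
    ready = defaultdict(int)
    total = defaultdict(int)
    for service, is_ready in pods:
        total[service] += 1
        ready[service] += int(is_ready)

    report = {}
    for feature, services in features.items():
        if any(ready[s] == 0 for s in services):
            report[feature] = "down"
        elif any(ready[s] < total[s] for s in services):
            report[feature] = "degraded"
        else:
            report[feature] = "healthy"
    return report


# Made-up snapshot of pod status for the example.
pods = [
    ("indexer", True), ("indexer", True),
    ("query-api", True), ("query-api", False),
    ("report-builder", True),
    ("pdf-renderer", False),
]
print(feature_health(pods))  # prints: {'Search': 'degraded', 'Reports': 'down'}
```

“Reports are down” is something a customer admin can act on or put in a support ticket. “pdf-renderer has 0 of 1 replicas ready” is the same fact, but only your own team knows what it means.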
Avoid the ‘M’ word on-premises
One thing not to do is assume that microservice architecture is a selling point for on-premises software. It is not. Microservices are designed to address the problem of managing massive Cloud systems with a number of teams working in parallel, using Continuous Deployment tools. You CAN deliver microservices on-premises as well, but consider it an implementation detail. The closer on-prem delivery is to classic monolithic products, the more successful you are going to be.
In other words, consider microservices as the most modern component model for delivering complex on-premises software. Customers didn’t care about J2EE application servers, they didn’t care about OSGi components, and there is no reason to believe they care about microservices when it comes to operating large on-premises products. What counts is good documentation, well behaved and not too power-hungry systems, semantically correct diagnostics, automated operations, and well behaved install and update mechanisms.
Even with all that in place, there is a distinct possibility customers will rethink their approach after managing your microservice products on-premises for a while. If they also exist as fully managed services in the Cloud, a sizeable number of them may just give up on operating your software and switch to the Cloud. If that happens, you will have achieved a somewhat roundabout, but effective demonstration of the value of the Cloud. Good job!
© Dejan Glozic, 2021