Micro-Services – Fad, FUD, Fo, Fum

The Triumph of Death, Pieter Bruegel, 1562

But band leader Johnny Rotten didn’t sell out in the way that you’re thinking. He didn’t start as a rebellious kid and tone it down so he could cash in. No, Rotten did the exact opposite — his punk rebelliousness was selling out.

[…] And he definitely wasn’t interested in defining an entire movement.

From 5 artistic geniuses who only became great after selling out

We all intrinsically know that the internet cycle is shortening, but what is happening these days with micro-services is bordering on the ridiculous. But let’s start with another cycle that is now circling the mortal coil on an ever shorter leash – the Agile movement. Let’s hear it from @pragdave (one of the founding fathers) himself:

Ignoring the typo in the tweet (Dave Thomas obviously meant ‘word’, not ‘work’), this is a familiar story that starts with the enthusiasm and fervor of a few ‘founding fathers’ (you can see them artistically blurred in comfortable pants). But before you know it (and in this case ‘before you know it’ takes a span of 13 years), the idea is distorted, commercialized, turned into professional certifications, turned into a rigid religion, and essentially morphed into an abomination barely recognizable to the said founding fathers. The Agile Manifesto defined good practices ‘in the left column’ to combat bad practices ‘in the right column’, yet Dave finds reality steadily creating ‘agile’ practices that are hard to distinguish from the very things the original movement was rising against.

He who is bitten by a snake fears a lizard.

East African proverb

If you have read this far, you are still waiting to see the connection to micro-services. There seems to be a micro-service bubble in formation, inflated by a steady stream of articles in the last couple of months. I have written about it recently, but it bears repeating.

The term itself entered my consciousness during NodeDay 2014, thanks to great presentations by Ian Livingstone and Richard Roger (both the slides and videos are now available). It was followed by an epic 7-episode miniseries by Martin Fowler, only to be challenged to a post-off with an alternative article.

Not to be outdone, Adrian Rossouw published the first article in a series (what is it with micro-services that makes people think they write copy for HBO?), and he brought the parable of monkeys into the mix. He also expressed concerns that micro-services may not be the best solution in some contexts (client side – duh, I don’t even know how to make browsers run multiple processes even if I wanted to; but also REST APIs, with which I completely disagree – in fact, they map perfectly into a system API, where each micro-service can handle a group of API endpoints, nicely routed to by a proxy such as NGINX).

Finally, Russ Miles collapsed 13 years of Agile journey into a Janus-like post that declares a tongue-in-cheek Micro-Services Manifesto. In addition to the real-fake manifesto with signatures (something Nicolas Cage could search for in his next movie), he proposed an official mascot for the movement – Chlamydia bacteria (not a virus, Russ, and you don’t want to contract it):

Chlamydia bacteria – a microbial mascot for micro-services

The moral of this timeline? It definitely took us way less than 13 years to get from excited to jaded – about three months, I think. We should all back off a bit and see why micro-services came into being before declaring another ‘Triumph of Death’.

On Premise Systems – Driver Analogy

Before the cloud, most enterprise software was meant to be installed on premise, i.e. on customers’ computers, managed by their own administrators. A good analogy here is creating a car for consumers. While a modern car has hundreds of systems and moving parts, this complexity needs to be abstracted out for the average driver. Drivers just want to enter the car, turn the key and drive away (North American drivers don’t even want to be bothered to shift gears, to the horror and loathing of their European brethren).

By this analogy, monolithic on premise software is good – it is easy to install, start and stop (I said ‘easy’, not ‘fast’). When a new version arrives, it is stopped, updated and started again in a previously announced ‘maintenance window’. The car analogy holds – nobody replaces a catalytic converter while hurtling down the highway at 120km/h (this is Canada, eh – we hurtle down the highway in the metric system). This operation requires service ‘downtime’.

Cloud Systems – Passenger Analogy

With cloud systems, complexity is not a problem because somebody else is driving. Therefore, it does not matter if you cannot do it as long as your driver knows how to operate the velocitator and deceleratrix. What you do expect is that the driver is always available – you expect no downtime, even for service and repairs. That means that the driver needs to plan for maintenance while you are in a meeting or sleeping or otherwise not needing the car. He may need to involve his twin brother to drive you in an identical car while the other one is in service. You don’t care – as far as you are concerned, the system is ‘always on’.

What does this mean for our cloud system? It means that a monolithic application is actually hurting our ‘always on’ goal. It is a bear to cluster, scale, and keep on all the time because it takes so long to build, deploy, start and stop. In addition, you need to stop everything to tweak a single word in a single page.

Enter micro-services, or whatever practitioners called them before we gave them that name. Cutting the monolith into smaller chunks is just good common sense. It adds complexity and more moving parts, but with redundant systems and controlled network latency, the overhead is worth it. Moreover, in a passenger analogy the added complexity is ‘invisible’ because it is the driver’s problem, not yours. It remains the implementation detail of a hosted system.

With micro-services, we can surgically update a feature without affecting the rest of the system. We can beef up (cluster) a feature because it is popular without needlessly clustering all the other features that do not receive as much traffic. We can rewrite a service without committing to rewrite the entire monolith (which rarely happens). And thanks to DevOps and all the automation, we can deploy a hundred times a day, and then monitor all the services in real time, through various monitoring and analytics software.

Before we called it ‘micro-services’, Amazon called it services built by two pizza teams. We also had SOA, although it was more about reusable services put into catalogs than about breaking down a hosted system into manageable chunks without grand aspirations of reusability.

The thing is – I don’t care what we call it. I don’t even care if we come up with a set of rules to live by (in fact, some may hurt – what if I cannot fit a micro-service into 100 lines of code; did I fail, will I be kicked out of the fraternity?). There is a need for decomposing a large hosted system into smaller bits that are easier to manage, cluster and independently evolve. We probably should not name it after a nasty STD or monkeys as suggested, but in the end, the need will survive the movement and the inevitable corruption and sell out.

If you are writing a cloud-based app these days, and through trial and error discovered that you are better off breaking it down into several independent services with a strong contract, communicating through some kind of a pub/sub messaging system, you just backed into micro-services without even knowing.
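To make the ‘strong contract over pub/sub’ idea a bit more tangible, here is a minimal sketch. It uses Node’s built-in EventEmitter as an in-process stand-in for a real message broker, so it only shows the shape of the interaction, not a production setup. The topic name ‘order.created’, the message fields and the two toy services are all assumptions made up for this illustration.

```javascript
var EventEmitter = require('events').EventEmitter;

// In-process stand-in for a message broker. A real system would use an
// external pub/sub service; EventEmitter is used here only to show the shape.
var bus = new EventEmitter();

// 'Billing' service: knows nothing about the publisher, only the contract
// (topic name and message fields, both hypothetical).
var invoices = [];
bus.on('order.created', function (msg) {
  invoices.push({ orderId: msg.orderId, amount: msg.total });
});

// 'Orders' service: publishes an event and moves on.
function createOrder(orderId, total) {
  bus.emit('order.created', { orderId: orderId, total: total });
}

createOrder('A-1', 25);
console.log(invoices);
```

The point of the sketch is that the two services share only the topic and the message format. Either one can be rewritten, redeployed or clustered without the other noticing, as long as that contract is upheld.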

In the end, while ‘Agile’ was declared dead by a founding father, ‘Agility’ lives on and is as needed as ever. Here is hoping that if we manage to kill the micro-service movement somehow (and in a much shorter time), the need for composable hosted systems where parts independently evolve will live on.

That’s the kind of afterlife I can believe in. That and the one by Arcade Fire.

© Dejan Glozic, 2014

In The Beginning, Node Created os.cpus().length Workers


As reported in the previous two posts, we are boldly going where many have already gone before – into the brave new world of Node.js development. Node has this wonderful aura that makes you feel unique, even though the fact that you can find answers to your Node.js questions in the first two pages of Google search results should tell you something. This reminds me of Stuff White People Like, where the question ‘Why do white people love Apple products so much’ is answered thusly:

Apple products tell the world you are creative and unique. They are an exclusive product line only used by every white college student, designer, writer, English teacher, and hipster on the planet.

And so it is with Node.js, where I fear that if I hesitate too much, somebody else will write an app in Node.js and totally deploy it to production, beating me to it.

Jesting aside, Node.js definitely has the ‘new shoes’ feel compared to stuff that was around much longer. Now that we graduated from ‘Hello, World’ and want to do some serious work in it, there is a list of best practices we need to quickly absorb. The topic of this post is fitting the square peg of Node.js single-threaded nature into the multi-core hole.

For many skeptics, the fact that Node.js is single-threaded is a non-starter. Any crusty Java server can seamlessly spread to all the cores on the machine, making Node look primitive in comparison. In reality, this is not so clear-cut – it all depends on how I/O intensive your code is. If it is mostly I/O bound, those extra cores will not help you that much, while the non-blocking nature of Node.js will be a great advantage. However, if your code needs to do a little bit of sequential work (and even mostly I/O bound code has a bit of blocking work here and there), you will definitely benefit from doubling up. There is also that pesky problem of uncaught exceptions – they can terminate your server process. How do you address these problems? As always, it depends.

A simple use case (and one that multi-threading server proponents like to use as a Node.js anti-pattern) is a Node app installed on a multi-core machine (i.e. all machines these days). In a naive implementation, the app will use only one core for all the incoming requests, under-utilizing the hardware. In theory, a Java or a RoR server app will scale better by spreading over all the CPUs.

Of course, this being year 5 of Node.js existence, using all the cores is entirely possible using the core ‘cluster’ module (pun not intended). Starting from the example from two posts ago, all we need to do is bring in the ‘cluster’ module and fork the workers:

var cluster = require('cluster');

if (cluster.isMaster) {
   var numCPUs = require('os').cpus().length;
   //Fork the workers, one per CPU
   for (var i=0; i< numCPUs; i++) {
      cluster.fork();
   }
   cluster.on('exit', function(deadWorker, code, signal) {
      // The worker is dead. He's a stiff, bereft of life,
      // he rests in peace.
      // He's pushing up the daisies. He expired and went
      // to meet his maker.
      // He's bleedin' demised. This worker is no more.
      // He's an ex-worker.

      // Oh, look, a shiny new one.
      // Norwegian Blue - beautiful plumage.
      var worker = cluster.fork();

      var newPID = worker.process.pid;
      var oldPID = deadWorker.process.pid;

      console.log('worker '+oldPID+' died.');
      console.log('worker '+newPID+' born.');
   });
}
else {
   //The normal part of our app.
}

The code above does two things – it forks the workers (one per CPU core), and it replaces a dead worker with a spiffy new one. This illustrates the ethos of disposability as described in 12factors. An app that can be started and stopped quickly can also be replaced without a hitch if it crashes. Of course, you can analyze logs and try to figure out why a worker crashed, but you can do it on your own time, while the app continues to handle requests.

It can help to modify the server creation code by printing out the process ID (‘process’ is a global variable implicitly defined – no need to require a module for it):

http.createServer(app).listen(app.get('port'), function() {
   console.log('Express server '+process.pid+
                ' listening on port ' + app.get('port'));
});

The sweet thing is that even though we are using multiple processes, they are all bound to the same port (3000 in this case). This is done by virtue of the master process being the only one actually bound to that port, and a bit of white Node magic.

We can now modify our controller to pass the PID to the simple page and render it using Dust:

exports.simple = function(req, res) {
  res.render('simple', { title: 'Simple',
                         active: 'simple',
                         pid: process.pid });
};

This line in simple.dust file will render the process ID on the page:

This page is served by server pid={pid}.

When I try this code on my quad-core ThinkPad laptop running Windows 7, I get 8 workers:

Express server 7668 listening on port 3000
Express server 8428 listening on port 3000
Express server 8580 listening on port 3000
Express server 9764 listening on port 3000
Express server 7284 listening on port 3000
Express server 5412 listening on port 3000
Express server 6304 listening on port 3000
Express server 8316 listening on port 3000
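A likely explanation for 8 workers on a quad-core machine is that os.cpus() lists logical cores, so a hyper-threaded CPU counts twice (this is an assumption about the laptop, not something the log output proves). If one worker per logical core is more than you want, the count can be capped. Here is a minimal sketch; the workerCount function and its ‘requested’ override are hypothetical names, not part of the app described here.

```javascript
var os = require('os');

// Decide how many workers to fork. 'requested' is a hypothetical override
// (for example, read from configuration); it is not part of the original app.
function workerCount(requested) {
  // os.cpus() lists logical cores, so hyper-threaded CPUs count twice.
  var logical = Math.max(1, os.cpus().length);
  if (typeof requested === 'number' && requested >= 1) {
    // Never fork more workers than there are logical cores.
    return Math.min(Math.floor(requested), logical);
  }
  return logical;
}

console.log('logical cores: ' + os.cpus().length);
console.log('workers to fork: ' + workerCount());
```

Calling workerCount(4) on the laptop above would fork four workers instead of eight, while calling it with no argument keeps the one-per-logical-core behaviour.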

If you reload the browser fast enough when rendering the page, you can see different process IDs reported on the page.

This sounds easy enough, as most things in Node.js do. But as usual, real life is a tad messier. After testing the clustering on various machines and platforms, the Node.js team noticed that some machines tend to favor only a couple of workers from the entire pool. It is a sad fact of life that for college assignments, a couple of nerds end up doing all the work while the slackers party. But few of us want to tolerate such behavior when it comes to responding to our Web traffic.

As a result, starting from the upcoming Node version 0.12, workers will be assigned in a ’round-robin’ fashion. This policy will be the default on most machines, although you can defeat it by adding this line before creating workers:

    // Set this before calling other cluster functions.
    cluster.schedulingPolicy = cluster.SCHED_NONE;

You can read more about it in this StrongLoop blog post.
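If you would rather pick the policy at deploy time than hard-code it, something along these lines could work. This is a sketch under assumptions: cluster.SCHED_RR and cluster.SCHED_NONE are the real constants, and Node itself reads a NODE_CLUSTER_SCHED_POLICY environment variable with the values ‘rr’ and ‘none’, but the policyFromName helper and the SCHED_POLICY variable name below are made up for illustration.

```javascript
var cluster = require('cluster');

// Map a short name to the matching cluster constant.
// 'rr' and 'none' mirror the values accepted by NODE_CLUSTER_SCHED_POLICY.
// Anything other than 'none' falls back to round-robin.
function policyFromName(name) {
  return name === 'none' ? cluster.SCHED_NONE : cluster.SCHED_RR;
}

// SCHED_POLICY is a hypothetical variable name used only in this sketch.
var chosen = policyFromName(process.env.SCHED_POLICY);

// Must be assigned before the first cluster.fork() call to take effect.
cluster.schedulingPolicy = chosen;
console.log('round-robin: ' + (chosen === cluster.SCHED_RR));
```

In many cases setting NODE_CLUSTER_SCHED_POLICY directly is simpler; the explicit assignment is mostly useful when the decision depends on something your own code computes.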

An interesting twist to clustering is when you deploy this app to the Cloud, using IaaS such as SoftLayer, Amazon EC2 or anything based on VMware. Since you can provision VMs with a desired number of virtual cores, you have two dimensions to scale your Node application:

  1. You can ramp up the number of virtual cores allocated for your app. Your code as described above will stretch to create more workers and take advantage of this increase, but all the child processes will still be using shared RAM and virtual file system. If a rogue worker fills up the file system writing logs like mad, it will spoil the party for all. This approach is good if you have some CPU bottlenecks in your app.
  2. You can add more VMs, fully isolating your app instances. This approach will give you more RAM and disk space. For JVM-based apps, this would definitely matter because JVMs are RAM-intensive. However, Node apps are much more frugal when it comes to resources, so you may not need as many full VMs for Node.

Between the two approaches, ramping up cores is definitely the cheaper option, and should be attempted first – it may be all you need. Of course, if you deploy your app to a PaaS like CloudFoundry or Heroku, all bets are off. It is possible that the code I have listed above is not even needed if you intend to host your app on a PaaS, because the platform will provide this behaviour out of the box. However, in some configurations this code will still be useful.

Example: Heroku gives you a single CPU dyno (virtualized unit of server power) with 512MB RAM for free. If you stay on one instance but pick a 2-core dyno with 1GB RAM (I know, still peanuts), that will cost you $34.50 at the time of writing (don’t quote me on the numbers, check them directly at the Heroku pricing page). Using two single core dynos will cost you the same. Between the two, JVM would probably benefit from the 2x dyno (with more RAM), while a single threaded Node app would benefit from two single core instances. However, our code gives you the freedom to use one 2X dyno and still use both cores. I don’t know if availability is the responsibility of the PaaS or yourself – drop me a line if you know the details.

It goes without saying that workers are separate processes, sharing nothing (SN). In reality, the workers will probably share storage via the attached resource, and storage itself can be clustered (or sharded) for horizontal scaling. It is debatable if sharing storage (even as attached resources) disqualifies this architecture from being called ‘SN’, but ignoring storage for now, your worker should be written to not cache anything in memory that cannot be easily recreated from a data source outside the worker itself. This includes auth or session data – you should rely on authentication schemes where the client sends you some kind of a token you can exchange for the user data with an external authentication authority. This makes your worker not unlike Dory from Pixar’s ‘Finding Nemo’, suffering from short term memory loss and introducing itself for each request. The flip side is that a new worker spawned after a worker death can be ready for duty, missing nothing from the previous interactions with the client.
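Here is a minimal sketch of what such a forgetful worker could look like in an Express-style app. Everything specific in it is an assumption: lookupToken stands in for a call to an external authentication authority, the ‘x-auth-token’ header name is invented, and requireUser is a hypothetical helper, not an Express API.

```javascript
// Stand-in for an external authentication authority. In a real deployment
// this would be a network call; the function and its data are hypothetical.
function lookupToken(token, callback) {
  var users = { 'token-123': { id: 7, name: 'dory' } };
  callback(null, users[token] || null);
}

// Express-style middleware factory: resolves the user on every request and
// keeps nothing in worker memory between requests.
function requireUser(lookup) {
  return function (req, res, next) {
    var token = req.headers['x-auth-token']; // header name is an assumption
    if (!token) { return next(new Error('missing token')); }
    lookup(token, function (err, user) {
      if (err || !user) { return next(err || new Error('unknown token')); }
      req.user = user; // lives only for the duration of this request
      next();
    });
  };
}
```

Because the user is looked up again on each request, it does not matter which worker handles it, or whether that worker was born five seconds ago to replace a dead one.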

In a sense, using clustering from the start builds character – you can never leave clustering as an afterthought, as something you will add later when your site becomes wildly popular and you need to scale. You may discover that you are caching too much in memory and need to devise schemes to share that information between nodes. It is better to get used to the SN mindset before you start writing clever code that will bite you later.

Of course, this being Node, there is always more than one way to skin any particular cat. There is a history of clustering with Node, and also of keeping Node alive (an uncaught exception can terminate your process, which is a bummer if only one process is serving all your traffic). In the olden days (i.e. a couple of years ago), people had good experience with forever. It is simple and comes with a friendly license (MIT). Note though that forever only keeps your app alive; it does not cluster it. More recently, PM2 emerged as a more sophisticated solution, adding clustering and monitoring to the mix. Unfortunately, PM2 comes with an AGPL license, which makes it much harder to ship it with your commercial product (which means little if you are just having fun, but actually matters if you are a company of any size with actual paying customers installing your product on premise). Of course, if your whole business is hosted and you are not shipping anything to customers, you should be fine.

What I like about the ‘cluster’ module is that it is part of the Node.js core library. We will likely add our own monitoring or look for ‘add-on’ monitoring that plays nicely with this module, rather than use a complete replacement like PM2. Regardless of what we do about monitoring, the clustering boilerplate will be a normal part of all our Node.js apps from now on.

© Dejan Glozic, 2014

The Gryphon Dilemma


In my introductory post The Turtleneck and the Hoodie I kind of lied a bit when I said that I stopped doing everything I did in my youth. In fact, I am playing music, recording and producing more than I have in a while. I realized I can do things in the comfort of my home that I could only have dreamed of in my youth. My gateway drug was Apple GarageBand, but I recently graduated to the real deal – Logic Pro X. As I was happily mixing a song in its beautifully redesigned user interface, I needed some elaborate delay so I reached for the Delay Designer plug-in. What popped up required some time to get used to:


This plug-in (and a few more in Logic Pro) clearly marches to a different drummer. My hunch is that it is a carry-over from the previous version, probably sub-contracted, and the authors of the plug-in didn’t get to updating it to the latest L&F. Nevertheless, they shipped it this way because it is very powerful and it performs its primary task well, albeit in its own quirky way.

This experience reminded me of a dilemma we face today in any distributed system composed of a number of moving parts (let’s call them ‘apps’). A number of apps running in a cloud platform can serve Web pages, and you may want to hook them up together in a distributed ‘site of sites’. Clearly the loose nature of this system is great from the point of view of flexibility. You can individually evolve each app as long as the contract that glues them together is upheld. One app can be stable and move slowly, while you can rev the other one like mad. This whole architecture works great for non-visual services. The problem is when you try to pull any kind of coherent end user experience out of this unruly bunch.

A ‘web site’ is an illusion, inasmuch as ‘movies’ are ‘moving’ – they are really a collection of still images switched at 24fps or faster. Web browsers are always showing one page at a time. If a user clicks on an external link, the browser will unceremoniously dump all of the page’s belongings on the curb and load a new one. If it wasn’t for the user session and content caching, browsers would be like Alzheimer’s patients, having a first impression of the same page over and over. What binds pages together are common areas that these pages share with their kin, making them a part of the whole.

In order to ensure this illusion, web pages have always shared common areas that navigationally bind them to other pages. For the illusion to be complete, these common areas need to be identical from page to page (modulo selection highlights). Browsers have become so good at detecting shared content on pages they are switching in and out, that you can only spot a flash or flicker if the page is slow or there is another kind of anomaly. Normally the common areas are included using the View part of the MVC framework – including page fragments is 101 of the view templates. Most of the time it appears as if only the unique part of the page is actually changing.

Now, imagine what happens when you attempt to build a distributed system of apps where some of the apps are providing pages and others are supplying common areas. When all the apps are version 1.0, all is well – everything fits together and it is impossible to tell your pages are really put together like words on ransom notes. After a while, the nature of independently moving parts takes over. We have two situations to contend with:

  1. An app that supplies common areas is upgraded to v2.0 while the other ones stay at v1.0
  2. An app that provides some of the pages is upgraded to v2.0 while common areas stay at 1.0
Evolution of composite pages with parts evolving at a different pace.

These are just two sides of the same coin – in both cases, you have a potential for end-results that turn into what I call ‘a Gryphon UX’ – a user experience where it is obvious different parts of the page have diverged.

Of course, this is not a new situation. Operating system UIs go through these changes all the time with more or less controversy (hello, Windows 8 and iOS7). When that happens, all the clients using their services get the free face lift, willy-nilly. However, since native apps (either desktop or mobile) normally use native widgets, there are times when even an unassisted upgrade turns out without a glitch (your app just looks more current), and in real world cases, you only need to do some minor tweaking to make it fit the new L&F.

On the Web however, the Web site design runs much deeper, affecting everything on each page. A full-scale site redesign is a sweeping undertaking that is seldom attempted without full coordination of components. Evolving only parts of a page is plainly obvious and results in it not only being put together like a ransom note but actually looking like one.

There is a way out of this conundrum (sort of). In a situation where a common area can change on you any time, app developers can sacrifice inter-page consistency for intra-page consistency. There is no question that a common set of links is what makes a site, but these links can be shared as data, not finished page fragments. If apps agree on the navigational data interchange format, they can render the common areas themselves and ensure gryphons do not visit them. This is like reducing your embassy status to a consulate – clearly a downturn in relationships.
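To illustrate what sharing links as data might look like, here is a minimal sketch. The interchange format (a list of links with id, label and href) and the renderNav function are assumptions invented for this example; any format the apps agree on would do.

```javascript
// Hypothetical navigation interchange format shared by all apps.
var navData = {
  links: [
    { id: 'home',    label: 'Home',    href: '/' },
    { id: 'reports', label: 'Reports', href: '/reports' }
  ]
};

// Minimal HTML escaping for text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Each app renders the shared links with its own markup and styling.
function renderNav(data, activeId) {
  var items = data.links.map(function (link) {
    var cls = link.id === activeId ? ' class="active"' : '';
    return '<li' + cls + '><a href="' + escapeHtml(link.href) + '">' +
           escapeHtml(link.label) + '</a></li>';
  });
  return '<ul>' + items.join('') + '</ul>';
}

console.log(renderNav(navData, 'reports'));
```

The data is the only thing the apps have in common. The markup, the styles and the selection highlight are each app’s own business, which is exactly why the common areas can drift apart from page to page.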

Let’s apply this to the scenario above. With version evolution, individual pages will maintain their consistency, but as users navigate between them, common areas will change – the illusion will be broken and it will be very obvious (and jarring) that each page is on its own. In effect, what was before a federation of pages is more like a confederation, a looser union bound by common navigation but not the common user experience (a ‘page ring’ of sorts).

Evolution of composite page where each page is fully controlled by its provider.

It appears that this is not so much a way out of the problem as a ‘pick your poison’ situation. I already warned you that the Gryphon dilemma is more of a philosophical problem than a technical one. I would say that in all likelihood, apps that coordinate and work closely together (possibly written by the same team) will opt to share common areas fully. Apps that are more remote will prefer to maintain their own consistency at the expense of inter-page experience.

I also think it all depends on how active the app authors are. In a world of continuous development, perhaps short periods of Gryphon UX can be tolerated knowing that a new stable state is only a few deploys away. Apps that have not been visited for a while may prefer to not be turned into mythological creatures without their own consent.

And to think that Sheldon Cooper wanted to actually clone a gryphon – he could have just written a distributed Web site with his three friends and let nature take its course.

© Dejan Glozic, 2013

Making Gourmet Pizza in the Cloud

A while ago I had lunch with my wife in a Toronto midtown restaurant called “Grazie”. As I was perusing the menu, a small rectangle at the bottom attracted my attention. It read:

We strongly discourage substitutions. The various ingredients have been selected to complement each other. Substitutions will undermine the desired effect of the dish. Substitutions will slow down our service.

When I read it first, I filed it under ‘opinionated chef’ (as if there is any other kind) and also remembered another chef named Poppie from a Seinfeld episode, who opined on this topic thusly:

POPPIE: No, no. You can’t put cucumbers on a pizza.

KRAMER: Well, why not? I like cucumbers.

POPPIE: That’s not a pizza. It’ll taste terrible.

KRAMER: But that’s the idea, you make your own pie.

POPPIE: Yes, but we cannot give the people the right to choose any topping they want! Now on this issue there can be no debate!

Fast forward to 2013. This whole topic came back to my volatile memory as I was ruminating about what the cloud platform model means for web app developers. In the years before the cloud, we would carefully consider our storage solution, our server and client technology stack and other elements with an eye on how the application is going to be deployed. You wanted to minimize support and maintenance cost, administration headaches and also wanted to achieve economy of scale. A large web application is composed of components, and the economy of scale can be achieved if one architectural style is followed throughout, one database type, one web framework that supports components etc.

In this architecture, it is just not feasible to allow freedom for components to have a say in this matter. Maintaining three different databases or forcing three Web frameworks to cooperate is just not practical, even though these choices may be perfect for the components in question. The ‘good enough’ fit was a good compromise for consistency and ease of support and maintenance.

In contrast, a modern web application is now becoming a loose collection of smaller apps running in a cloud platform and communicating via agreed upon protocols. One of the key selling points of the cloud platform is that it is easy to write and publish an app and have it running within minutes. I am sure early successes are important for learning, and many apps will be tactical and time-boxed – up for a couple of months, then scrapped. But a much more important aspect of the new architecture to me is the opportunity to carefully curate the components of your federated application, the way an opinionated chef selects ingredients for a gourmet pizza.

Let me explain. A modern cloud platform such as EC2, Heroku or CloudFoundry provides for you a veritable buffet of databases (MySQL, Postgres, MongoDB, SQL Server), caching solutions (Memcache, Redis), message queues (RabbitMQ), language environments (Java, Scala, Python, Ruby, JavaScript) and frameworks (Django, Spring, Node/Express, Play, Rails). You can then package a carefully curated collection of client-side JavaScript frameworks to include in your app. You have the freedom to choose, and not only in one way:

  1. You have the freedom to select the database, server and client side that are the best fit for the problem that a single app is trying to solve. You can write a computationally intensive app in Java or Scala, write a front end app with a lot of I/O and back-and-forth chatter in Node.js, and another in Django or Ruby on Rails because you simply like it and it makes you more productive. You don’t have to use whatever has been chosen by the entire team even though it does not fit your exact needs.
  2. You then have the freedom to compose your federated application out of individual apps cooperating as services, each one optimized for its particular task. As I said earlier, this would have been prohibitively expensive and impractical in the past, but is now attainable for smaller and smaller providers. The cloud platform has subsumed the role of the economy of scale vehicle that was previously required from large applications deployed on premise.

The vision of the polyglot future has already been declared for programming languages by Dave Fecak et al and also for storage solutions by Martin Fowler. A common thread in both is ‘best fit for the task’. In the former, an additional angle was the employment prospects of polyglot developers, while in the latter there was a caveat that polyglot persistence comes at the expense of additional complexity.

I think that my point of view is subtly different: in the cloud platforms, complexity of managing many different systems is mitigated by the fact that the platform is responsible for most of it. On the other hand, while CFOs are salivating over the prospects of slashed IT expenses in hosted solutions, I am more intrigued by another unexpected side-effect: using the right tool for the right job and using multiple tools within the same project is now attainable for smaller players. All of a sudden, an application running multiple servers, each using a different database type is trivially possible for even the smallest startups.

We will never again be asked to put cucumber on our pizza just because the team as a whole has decided so, even though we hate cucumber. Conversely, we CAN put cucumber just on our pizza even though the rest of the team doesn’t like it, if it works wonderfully in our unique combination. In the new world of cloud platforms, we can all make gourmet pizza-style applications, each carefully put together, each uniquely delicious, and each surprisingly affordable. The opinionated chef from Grazie now seems almost prophetic: in the cloud, we DO want to select various ingredients to complement each other, because substitutions will undermine the desired effect and slow down our service(s).

You can almost pass off the restaurant’s menu as a white paper.

© Dejan Glozic, 2013