Celestial Bodies and Micro-Front End Reuse

[Image: Astronomy for the use of schools and academies (1882). Source: Wikimedia Commons]

If you are writing a modern production system of any complexity, chances are you are using a microservice architecture for it. If that system serves the Web browser, sooner or later you will discover that you must break up the UI into parts served by multiple microservices (or micro front-ends).

The promise of microservices is that they can be built by separate teams, allowing teams relative independence and the ability to move at their own speed, without the need for intergalactic coordination. However, sooner or later executives (God bless them) will notice that some of the user interfaces contain elements that look similar across the system, and the dreaded word “Reuse” will raise its head.

It can sound downright heretical to say anything against reuse in the software industry. That code reuse saves time and improves quality feels so innately right that it takes gumption to rail against it. What could be more logical than to create reusable components for the things that look similar and make all micro front-ends use them?

And yet.

Movement on a planetary scale

To illustrate the problem we can hit with indiscriminate reuse, let’s look at the movement of planets in a solar system. As planets revolve around the star, they follow their own orbits that place them at various distances from each other. Even if we simplify the geometry and make all orbits perfect circles in the same plane, the planets will still have different orbital speeds, dictated by various factors (yes, ‘Gravity’ violated many of them). This motion will place them at various positions relative to each other, with two extremes:

  1. Conjunction – when two planets are lined up on the same side relative to the star and closest to each other
  2. Opposition – when two planets are lined up on opposite sides relative to the star

As planets are in constant motion, at any given moment they will be somewhere between these two extremes.

Back to the topic

OK, so let’s bring it back to the microservice system. The reason UI component reuse is being discussed in your meetings is that a design pass was made, the first iteration of the whole system’s user experience has been put together, and some repetition has been observed. This page in micro front-end A looks very similar to the page in micro front-end B. That picker is the same in three pages. Let’s solve it once and for all for everybody. Reuse, people! Sensible leadership all around.

It is easy to fall into this trap because reuse works great for, say, libraries. Of course you will not write that set of mathematical transformations over and over everywhere. In our Node.js microservice system, the list of modules each app depends on is getting longer and longer.

I just typed npm list > deps in one of our largest microservices (I am tempted to call it Gustav for its girth). The flattened list of all dependencies is 3689 lines long. We should call it “miniservice” or just “service” by now. I am not writing 3000 modules by hand any time soon. Reuse is fine and dandy here.

Similarly, all our micro front-ends are using the same platform API layer. The APIs are ensuring the state of the microservice system is consistent. Again, I am not saving the same state in two APIs and keeping it in sync somehow – that would be crazy.

All our micro front-ends are using the same style guide and the same set of React components written to follow that guide with a set of basic visual building blocks (no, we are not using Bootstrap; yes, we know 15% of all web sites use it – good for them). Without it, our UI will look like a ransom note:

[Image: ransom-note style text generated with ransomizer.com]

Say NO to casual UI reuse

This brings us to reusing actual parts of pages between micro front-ends. If you follow the reasoning above, it is the logical next step. This is where our astrophysics comes into play.

The pages that look similar today are like planets in conjunction. They are at their closest position in the microservice planetary system. The danger is that we are forgetting the constant movement. It may be hard to believe right now, but these pages will move apart. Why? Complex systems rarely evolve in perfect coordination. You cut your hair and a month later the cut starts losing shape, because the hair refuses to grow exactly the same everywhere. Diverging priorities, architectural changes, business pressures and reorgs all conspire against your microservices keeping their relative distance from each other.

The whole point of a microservice system is to insulate against this corporate chaos by allowing self-sufficient teams to cut through the haze and continue to make progress. Wanton UI reuse will tie the microservices together and restrict their freedom of movement. This is an elaborate way to write a distributed monolith, the worst kind. In our planetary analogy, you are applying supernatural gravitational forces between planets that completely mess up their trajectories and may result in the fiery collapse of the entire system. Or ‘Gravity 2’.

Doing it right

The heretical thought presented in this article is that some duplication and bespoke UI code is good if the outcome is the preserved agility and independence of the microservice teams. Of course, thousands of lines of duplicated code is its own kind of waste, agility or not. This is where good practices are essential, such as using a backend for frontend (BFF) to push most of the business logic to the server, leaving mostly presentation logic on the client. The business logic can then reuse libraries and modules across microservices (within reason – you need to avoid the danger of increased coupling, so don’t go overboard).
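To make this concrete, here is a minimal sketch of the backend-for-frontend idea in Node.js/Express. The route, the platform API URL and the payload shape are made up for illustration, not taken from the actual system; the point is that the BFF aggregates platform API calls and applies business rules on the server, so the micro front-end only has to render what it receives.

// A minimal BFF sketch (Express + node-fetch); names and URLs are illustrative.
const express = require('express');
const fetch = require('node-fetch');

const app = express();
const PLATFORM_API = 'https://platform.example.com';

app.get('/api/dashboard', async (req, res) => {
  try {
    // Aggregate the platform APIs on the server...
    const [user, tasks] = await Promise.all([
      fetch(`${PLATFORM_API}/users/${req.query.userId}`).then(r => r.json()),
      fetch(`${PLATFORM_API}/tasks?owner=${req.query.userId}`).then(r => r.json())
    ]);
    // ...apply the business rules here, and hand the client a
    // presentation-ready payload so it only needs to render it.
    res.json({
      name: user.displayName,
      openTasks: tasks.filter(t => !t.done).length
    });
  } catch (err) {
    res.status(502).json({ error: 'platform API unavailable' });
  }
});

app.listen(3000);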

If you do it right, you will already use common style, common APIs, shared libraries and common widgets. The rest is just composition, where you can create a bespoke UI that perfectly fits your page and nobody will break you by pushing a change without enough testing at 2am.

Any exceptions?

Yes, there are exceptions. Some UI components are just too tricky to implement even once, let alone multiple times. You will know those when you see them – the microservice teams will come together and agree that one of them will build it for all of the teams because life is too short to fix that horrible bug 5 times.

The threshold to clear will be:

  1. Is this truly a single component, or multiple components merged together, selected with a maze of flow control and configuration parameters? If the latter, don’t.
  2. Is the component simple and single-purpose enough that it is likely to survive the inevitable zigs and zags of business requirements and design changes?
  3. Is the team writing the shared component up to the task and ready to write exhaustive functional and UI tests to guard against regressions and breakages?
  4. Is the problem complex enough to overcome the overhead of reuse?

If you hold yourself to these conditions, reusable components will be few and far between. If a once useful shared component becomes a sore point in a year, fork it and take back control.

You will reap the consequences

In complex microservice systems, some of the truisms of software development need to be put to the test and re-examined. The true test is not if applying them works at the current moment (at the planetary conjunction), but if the system will survive the opposing forces of changing business requirements, architecture evolution, aging code and organizational changes.

As engineers, we must remember that business defines requirements, design defines the UX, but we are responsible for turning the requirements and the design into clickable reality. It is our responsibility to choose that reality not only based on what our system looks like today, but what it may look like in 6, 12 or 18 months. Resist the urge to reach for quick shortcuts through casual reuse, and plan for the situation that reuse will likely put you in after a few architectural and organizational changes.

Like planets, your microservices will never stand still.

© Dejan Glozic, 2017


Components Are (Still) Hard on the Web

[Image: Matryoshka dolls]

Here’s Johnny! I know, I know. It’s been a while since I posted something. In my defence, I was busy gathering real world experience – the kind that allows me to forget the imposter syndrome for a while. We have been busy shipping a complex microservice system, allowing me to test a number of my blog posts in real life. Most are holding up pretty well, thank you very much. However, one thing that continues to bother me is that writing reusable components is still maddeningly hard on the Web in 2016.

Framing the problem

First, let’s define the problem. In a Web app of sufficient complexity, there will be a number of components that you would want to reuse. You already know we are a large Node.js shop – reusing modules via NPM is second nature to us. It would be so nice to be able to npm install a Web component and just put it in your app. In fact, we do exactly that with React components. Alas, it is much harder once you leave the React world.

First of all, let’s define what ‘reusing a Web component’ would mean:

  1. You can put a component written by somebody else somewhere on your page.
  2. The component will work well and will not do something nasty.
  3. The component will not look out of place in your particular design.
  4. You will know what is happening inside the component, and will be able to react to its lifecycle.

First pick a buffet

Component (or widget) reuse was for the longest time a staple of desktop UI development. You bought into the component model by using a particular widget toolkit. That is not a big deal on Windows or macOS – you have no choice if you want to make native applications. The same applies to native mobile development. However, on the Web there is no single component model. Beyond the components that are an intrinsic part of HTML, in order to create custom components you need to buy into one of the popular Web frameworks. You need to pick the proverbial buffet before you can sample from it.

In 2016, the key battle is between abstracting away the Web platform and embracing it. You can abstract the platform (HTML, CSS, DOM) using JavaScript. This is the approach used by React (and my team by extension). Alternatively, you can embrace the platform and use HTML as your base (what Web Components, Polymer and perhaps Angular 2 propose). You cannot mix and match – you need to pick your approach first.

OK, I lied. You CAN mix and match, but it becomes awkward and heavy. React abstracts out HTML, but if you use a custom element instead of a built-in HTML one, React will still work fine. All the React machinery (diffing two successive iterations of the virtual DOM, then applying the difference to the actual DOM) works for custom elements as well. Therefore, it is fine to slip a Web Component into a React app.
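As a hedged illustration (the <fancy-picker> element and its attributes are invented, in React 15-era createClass style), a custom element can be dropped straight into a React render, with its DOM events wired up in the lifecycle methods:

// Embedding a hypothetical <fancy-picker> Web Component in a React component.
var React = require('react');

var PickerPanel = React.createClass({
  componentDidMount: function () {
    // Custom elements fire their own DOM events; listen on the node directly.
    this.refs.picker.addEventListener('change', this.props.onPick);
  },
  componentWillUnmount: function () {
    this.refs.picker.removeEventListener('change', this.props.onPick);
  },
  render: function () {
    // Unknown dashed tags pass through JSX; React diffs them like regular DOM.
    return <fancy-picker ref="picker" mode={this.props.mode}></fancy-picker>;
  }
});

module.exports = PickerPanel;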

The opposite does not really work well – consuming a React component in Angular or Polymer is awkward and rarely worth it. Not that the original direction is necessarily worth it either – you need to load the Web Components JavaScript AND the React JavaScript.

Don’t eat the poison mushroom

One of the ways people loaded components in their pages in the past was by using good old iframes. Think what you will about them, but you could really lock the components down that way. If you load a component into your own DOM, you need to really trust it. The same-origin policy and CORS are supposed to help you prevent a component from leaking data from your page to the mother ship. Nevertheless, particularly when it comes to more complex components, it pays to know what they are doing, go through the source code etc. This is where open source really helps – don’t load a black box component into your DOM.

The shoes don’t match the belt

One of the most complex problems to deal with when consuming a Web component of any type is the design. When you are working in a native SDK, the look and feel of the component is defined by the underlying toolkit. All iOS components have the ‘right’ look and feel out of the box when you consume them. However, Web apps have their own themes, which creates a combinatorial explosion. A reusable component needs to do one of the following things:

  1. Be configurable and themeable, so that you can either set a few parameters to better blend it into your style guide, or provide an entire template to really dial it in
  2. Be generic and inoffensive enough to be equidistant from any parent theme
  3. Be instantly recognizable (think youtube player) in a way that makes it OK that it has its own look and feel.

A very complex reusable component with a number of elements can be very hard for consumers to dial in visually. In a corporate setting, the number of themes that actually matter may be small, so a large component may take it upon itself to support the two or three widely used design style guides. Then all the consumer needs to do is provide a single parameter (the style guide name) to make the component use the right styles across the board.
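A small sketch of what that single-parameter contract could look like – the guide names and CSS classes below are invented for illustration:

// A reusable widget that maps one styleGuide parameter onto a supported CSS set.
function createBigWidget(element, options) {
  var classFor = { guideA: 'bw-guide-a', guideB: 'bw-guide-b' };
  element.classList.add('big-widget', classFor[options.styleGuide] || 'bw-neutral');
  // ...the rest of the widget construction goes here
}

// The consumer dials the component into their design system with one parameter:
createBigWidget(document.getElementById('report'), { styleGuide: 'guideA' });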

What is going on inside?

Adding a component into your page is not only a matter of placement and visual style. Virtually all reusable components are interactive. A component can be self-contained (for example, all activity in a youtube player is confined to its bounding box), or expected to interact with the parent. If the component must interact with the parent, you need to consider the abstraction chain. Consider the simple countdown timer as a reusable component. Here is how the abstraction chain works:

[Diagram: the countdown timer abstraction chain]

The timer itself uses two low-level components – ‘Start’ and ‘Stop’ buttons. Inside the timer, the code will add click listeners for both buttons. The listeners will add semantic meaning to the buttons by doing things according to their role – starting and stopping the timer.

Finally, when this component is consumed by your page, only one listener is available – ‘onTimerCountdown()’. Users will interact with the timer, and when the timer counts down to 0, the listener you registered will be notified. You should be able to expect events at the right semantic level from all reusable components, from the simplest calendars to large complex components.
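Here is a plain JavaScript sketch of that abstraction chain – the markup classes and wiring are illustrative, but the shape matches the diagram: low-level click listeners inside, a single semantic onTimerCountdown listener outside.

// Countdown timer component: wires low-level 'Start'/'Stop' buttons internally,
// exposes a single semantic event (onTimerCountdown) to the consuming page.
function createCountdownTimer(root, seconds, onTimerCountdown) {
  var remaining = seconds;
  var handle = null;

  function tick() {
    remaining--;
    root.querySelector('.display').textContent = remaining;
    if (remaining <= 0) {
      stop();
      onTimerCountdown();            // the only event the page ever sees
    }
  }
  function start() {
    if (!handle) handle = setInterval(tick, 1000);
  }
  function stop() {
    clearInterval(handle);
    handle = null;
  }

  // Low-level components get their semantic meaning here.
  root.querySelector('.start').addEventListener('click', start);
  root.querySelector('.stop').addEventListener('click', stop);
}

// The page-level consumer only registers the high-level listener:
createCountdownTimer(document.getElementById('timer'), 60, function () {
  console.log('Countdown finished');
});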

If a component can be made part of a larger document, the two things you will care about most are serialization and dirty state. When users interact with the component and make a modification, you want to be told that the component has changed. This should trigger the dirty state of the parent. When the user clicks ‘Save’, you should be able to serialize the component and store this state in the larger document. Conversely, on bootstrap you should be able to pass the serialized state to the component so it can initialize itself.

Note that the actual technology used does not matter here – even the components embedded using iframes can use window.postMessage to send events up to the parent (and accept messages from the parent). While components living in your DOM will resize automatically, iframe-ed components will need to also send resizing events via window.postMessage to allow the parent to set the new size of the iframe.
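A hedged sketch of that window.postMessage contract for an iframe-ed component – the message shapes, the 'todo-widget' source tag and the frame id are made up; a real component would document its own protocol:

// Inside the iframe-ed component: report dirty state and size changes upward.
function notifyParent(type, payload) {
  window.parent.postMessage({ source: 'todo-widget', type: type, payload: payload }, '*');
}
// e.g. notifyParent('dirty', {});
//      notifyParent('resize', { height: document.body.scrollHeight });

// In the parent page: translate the messages into the page's own lifecycle.
window.addEventListener('message', function (event) {
  var msg = event.data;                 // in production, also check event.origin
  if (!msg || msg.source !== 'todo-widget') return;
  if (msg.type === 'dirty') {
    // flip the parent document's dirty flag here
  }
  if (msg.type === 'resize') {
    document.getElementById('todo-frame').style.height = msg.payload.height + 'px';
  }
});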

The long tail

More complex reusable components don’t only have a client-side presence. They also need to call back to the server and fetch the data they require. You can configure such a component in several ways:

  1. You can fetch the data the component requires yourself. In that case, the component is fully dependent on the container, and it is the container’s responsibility to perform all the XHR calls to fetch the data and pass it to the component (see the sketch after this list). This approach may be best for pages that want full control of the network calls. As an added bonus, you can fit such a component into a data flow such as Flux, where some of the data may be coming from Web Socket driven server-side push, not just XHR requests.
  2. You can proxy the requests that the component is performing. This approach is also acceptable because it allows the proxy to control which third-party servers are going to be whitelisted.
  3. You can configure CORS so that the component can make direct calls on its own. This needs to be done carefully to avoid the component siphoning data from the page to servers you don’t approve.
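For the first option, a tiny sketch of the container owning the network call – the URL and renderOrderTable are placeholders for your own endpoint and the reusable component’s entry point:

// The container performs the XHR; the component only receives the data.
$.getJSON('/api/orders?status=open', function (orders) {
  // Because the component never talks to the server itself, the page keeps
  // full control of network calls, and the same data could just as easily
  // arrive over a Web Socket push in a Flux-style data flow.
  renderOrderTable(document.getElementById('orders'), orders);
});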

In all of these cases, you may still want to be told about what is happening inside the Web component, using component events as discussed above.

Frameworks are just the beginning

So there you go – all the problems you need to wrestle with when trying to reuse components in a larger project. Chances are the component is written in the ‘wrong’ framework, but trying to make the component load in your page is only the beginning. Fitting the component into the page visually, figuring out what is happening in it events-wise, and feeding it data from the server is the real battle. Unless you are trying to load a calendar widget, this is where you will spend most of your time.

© Dejan Glozic, 2016

Isomorphic Apps Part 1: Node, Dust, and Socket.io

Two-headed turquoise serpent. Mixtec-Aztec, 1400-1521. Held at British Museum. Credits: Wikimedia Commons.

On the heels of last week’s post on the futility of the server vs. client side debate, here comes an example. I have wanted to do this for a long time, and now the progress of the Bluemix project I am working on has made it important to figure out isomorphic apps in earnest.

For those who have not read the often cited blog post by the Airbnb team, isomorphic apps blur the line between the client and the server by sharing code and templates. For this to work, it helps that both sides of the divide are similar in nature (read: can run JavaScript). Not that it is impossible to do in other stacks, but using Node.js on the server is the most direct path to isomorphism. JavaScript libraries are now written with the implicit assumption that people will want to run them in a Node app as well as in the browser.

Do I need isomorphic?

Why would we want to render on the server to begin with?

  1. We want to send HTML to the browser on the first request, showing some content to the user immediately. This helps perceived performance, since browsers are amazing at quickly rendering raw HTML and we don’t have to wait for all the client side JavaScript to load before we can see anything.
  2. This HTML will also give the search engine crawlers something to chew on, giving you decent SEO without a lot of effort.
  3. Your app will fit nicely into the Web as it was designed (a collection of linked pages).

Why would you want to do something on the client then?

  1. You want to provide a nice interactive experience to the user – static documents (even those dynamically rendered on the server) are not a lot of fun beyond the actual content.
  2. You want your page to respond to changes on the server (other users making changes that affect the content of your page) using Web Sockets.
  3. You want to provide features that involve a number of panels that need to flow like a native app, and don’t want to reload full pages for that.

How to skin this particular cat

Today we are not hurting for choices when it comes to libraries and frameworks for Web development. As a result, I decided to write a multi-part article covering some of those options, and allow you to choose what works in your particular situation.

We will start with the simplest way to go isomorphic – by simply exploiting the fact that many JavaScript templating libraries run on both sides of the network divide. In our current projects in IBM, dustjs-linkedin is our trusted choice – a solid library used by many companies and a pleasure to work with. It can be used for rendering views of Node/Express applications, but if you compile the template down to JavaScript, you can load it and render partials on the client as well.

The app

For this exercise, we will write a rudimentary Todo app, which is really just a collection of records we want to keep. There is already a proverbial Todo MVC app designed to test all the client side MVC frameworks known to man, but we want our app to store data on the server, and render the initial Todo list using Node.js, Express and Dust.js. Once the list arrives at the client, we want to be able to react to changes on the server, and to add new Todos by entering them on the client. In both cases, we want to render the new entries on the client using the same templates we used to render the initial list.

Since we will want to use the REST API (folded into the same app for simplicity) as the single source of truth, we will use the Socket.io library to build an MVC-CV app (full MVC on the server, only the controller and the view on the client). The lack of a client model means that when we make changes to the server model through the REST API, we will rely on Socket.io to communicate with the client side controller and update the client view. With a full client side MV*, the client side model would be updated immediately, followed by asynchronous reconciliation with the server. That approach provides immediacy and makes the application feel snappy, at the expense of the possibility that a seemingly successful operation eventually fails on the server. Mobile app developers prefer this tradeoff.

In order to make Todos a bit more fun, we will toss in Facebook-based authentication so that we can have a user profile and store Todos for each user separately. We will use the Passport module for this. For now, we will use jQuery and Bootstrap to round out the app. In future instalments, we will get progressively fancier with the choices.

Less talking, more coding

We will start by creating a Dust.js partial to render a single Todo card:

<div class="todo">
   <div class="todo-image">
      <img src="{imageUrl}">
   </div>
   <div class="todo-content">
      <div class="todo-first-row">
         <span class="todo-user">{userName}</span>
         <span class="todo-when" data-when="{when}">{whenText}</span>
      </div>
      <div class="todo-second-row">
         <span class="todo-text">{text}</span>
      </div>
   </div>
</div>

As long as the variables it needs are passed in as a dictionary, the Dust core library can render this template in Node.js or the browser. In order to be able to load it in the browser, we need to compile it down to JavaScript and place it in the ‘public/js’ directory:

#!/bin/bash          
dustc -name=todo views/todo.dust public/js/todo.js
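As an aside, here is a minimal sketch of rendering that same partial directly in Node.js with the Dust core API (the actual app goes through the Express view plumbing instead; the sample context values are made up):

// Compile and render the todo partial with dustjs-linkedin on the server.
var dust = require('dustjs-linkedin');
var fs = require('fs');

var source = fs.readFileSync('views/todo.dust', 'utf8');
dust.loadSource(dust.compile(source, 'todo'));   // register under the name 'todo'

dust.render('todo', {
  imageUrl: '/images/someone.png',
  userName: 'Dejan',
  when: Date.now(),
  whenText: 'a minute ago',
  text: 'Write part 2 of this post'
}, function (err, html) {
  console.log(html);   // the same markup the browser-side render produces
});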

We can now create a page where todos are rendered on the server as a list, with a text area to enter a new todo, and a ‘Delete’ button to delete them all:

<h1>Using Dust.js for View</h1>
      
<h2>Todos</h2>
<div class="new">
  <textarea id="new-todo-text" placeholder="New todo">
  </textarea>
</div>
<div class="delete">
  <button type="button" id="delete-all" class="btn btn-primary">Delete All</button> 
</div>
<div class="todos">
  {#todos}
    {>todo todo=./}
  {/todos}
</div>

You will notice in the snippet above that we are now including the partial we defined before (todo). The collection ‘todos’ is passed to the view by the server side controller, which obtained it from the server side model.

The key for interactivity of this code lies in the JavaScript for this page:

<script src="http://cdn.jsdelivr.net/dustjs/2.4.0/dust-core.js"></script>
<script src="/socket.io/socket.io.js"></script>
<script src="/js/todo.js"></script>

<script>
  var socket = io.connect('/');
  socket.on('todos', function (message) {
    if (message.type=='add') {
      dust.render("todo", message.state, function(err, out) {
        $(".todos").prepend(out);
      });
    }
    else if (message.type=='delete') {
      $('.todos').empty();
    }
  });
      
  $('#delete-all').on('click', function(e) {
    $.ajax({url: "/todos", type: "DELETE"});
  });
 
  $('#new-todo-text').keyup(function (e) {
    var code = (e.keyCode ? e.keyCode : e.which);
    if (code == 13) {
      e.preventDefault();
      $.post("/todos", { text: $('#new-todo-text').val() });
      $('#new-todo-text').val('');
    }
  });
</script> 

New todos are created by capturing the Enter key in the text area and posting the todo using the POST /todos endpoint. Similarly, deleting all todos is done by executing a DELETE /todos Ajax call.

Notice how we don’t do anything else here. We let the REST endpoint execute the operation on the server and send an event using Web Sockets. When we receive the message on the client, we update the view. This is the CV part of MVC-CV architecture that we just executed. The message sent via Web Sockets contains the state of the todo object that is passed to the Dust renderer. The output of the todo card rendering process is simply prepended to the todo list in the DOM.

REST endpoint and model

On the server, our REST endpoint is responsible for handling requests from the client. Since we are using Passport for authentication, the requests arrive at the endpoint with the user object attached, allowing us to execute the endpoints on behalf of the user (in fact, we will return a 401 if there is no user info).

var model = require('../models/todos');

module.exports.get = function(req, res) {
  model.list(req.user, function (err, todos) {
    // send the status and the JSON body in one go
    res.status(200).json(todos);
  });
};

module.exports.post = function(req, res) {
  var body = req.body;
   
  model.add(req.user, body.text, function (err, todo) {
    res.sendStatus(201);
    res.end();
    _pushEvent("add", req.user, todo);
  });
};

module.exports.delete = function(req, res) {
  model.deleteAll(req.user, function(err) {
    res.sendStatus(204);
    res.end();
    _pushEvent("delete", req.user, {});
  });
};

function _pushEvent(type, user, object) {
  var restrictedUser = {
    id: user.id,
    name: user.displayName
  };
  var message = {
    type: type,
    state: object,
    user: restrictedUser
  };
  exports.io.sockets.emit("todos", message);
}

We are more or less delegating the operations to the model object, and firing events for the verbs that change data (POST and DELETE). The model is very simple – it uses lru-cache to store data (configured to handle 50,000 users, with a TTL of 1 hour before entries are evicted). This is good enough for a test – in the real world you would hook up a database here.

var LRU = require("lru-cache")
, options = { max: 50000
            , length: function (n) { return 1 }
            , maxAge: 1000 * 60 * 60 }
, cache = LRU(options)
;

module.exports.add = function (user, text, callback) {
  var todo = {
    text: text,
    imageUrl: "https://graph.facebook.com/"+user.id+"/picture?type=square",
    userName: user.displayName,
    when: Date.now()
  };
  var model = cache.get(user.id);
  if (!model) {
    model = { todos: [todo] };
  }
  else {
    model = JSON.parse(model);
    model.todos.splice(0, 0, todo);
  }
  cache.set(user.id, JSON.stringify(model));
  callback(null, todo);
};

module.exports.list = function(user, callback) {
  var model = cache.get(user.id);
  if (model)
    model = JSON.parse(model);
  var todos = model?model.todos:[];
  callback(null, todos);
};

module.exports.deleteAll = function(user, callback) {
  cache.del(user.id);
  callback(null);
};

The entire example is available as a public project on IBM DevOps Services. You can clone the Git repository and play on your machine, or just click on Code and inspect it in the Web IDE directly.

The app is currently running on Bluemix – log in using your Facebook account and give it a spin.

Commentary and next steps

This was the simplest way to achieve isomorphism. It has its downsides, among them the lack of immediacy caused by the missing client side model, but it is blessed by complete freedom from client side frameworks (jQuery and Bootstrap notwithstanding). In part 2 of the post, I will insert Backbone on the client. Since it has support for models, collections and views, it is a particularly good choice for gradually evolving our application (AngularJS would require a complete rewrite, whereas Backbone can reuse our Dust.js template for the View). Also, as frameworks go, it is tiny (~9K minified and gzipped).

Finally, in part 3, we will swap Dust.js for React.js in the Backbone View implementation, just to see what all the fuss is about. Now you realize why I need to do this in three parts – so many frameworks, so little time.

© Dejan Glozic, 2015

HA All The Things

[Image: ‘HA all the things’ meme]

I hate HA (High Availability). Today everything has to be highly available. All of a sudden, SA (Standard Availability) isn’t cutting it any more. Case in point: I used to listen to music on my way to work. Not any more – my morning meeting schedule intrudes into my ride, forcing me to participate in meetings while driving, Bluetooth and all. My 8-speaker, surround sound Acura ELS system hates me – built for high resolution multichannel reproduction, it is reduced to ‘Hi, who just joined?’ in glorious mono telephony. But I digress.

You know that I have written many articles on microservices, because they are an ongoing concern as we slowly evolve our topology away from monolithic systems and towards microservices. I have already written about my thoughts on how to scale and provide HA for Node.js services. We have also solved our problem of handling messaging in a cluster using AMQP worker queues.

However, we are not done with HA. The message broker itself needs to be HA, and we only have one node. We are currently using RabbitMQ, and so far it has been rock solid, but we know that in a real-world system it is not a matter of ‘if’ but ‘when’ it will suffer a problem, bringing down all the messaging capabilities of the system with it. Or we will mess around with the firewall rules and block access to it by accident. Hey, contractors rupture gas pipes and power cables by accident all the time. Don’t judge.

Luckily RabbitMQ can be clustered. RabbitMQ documentation is fairly extensive on clustering and HA. In short, you need to:

  1. Stand up multiple RabbitMQ instances (nodes)
  2. Make sure all the instances use the same Erlang cookie, which allows them to talk to each other (yes, RabbitMQ is written in Erlang; you learn that on day one, when you need to install the Erlang environment before you install Rabbit)
  3. Cluster nodes by running rabbitmqctl join_cluster --ram rabbit@<firstnode> on the second server
  4. Start the nodes and connect to any of them

RabbitMQ has an interesting feature in that nodes in the cluster can join in RAM mode or in disc mode. RAM nodes will replicate state only in memory, while disc nodes will also write it to disc. While in theory it is enough to have only one of the nodes in the cluster use disc mode, the performance gain of using RAM mode is not worth the risk (the gain is restricted to declaring queues and exchanges, not publishing messages anyway).

Not so fast

OK, we cluster the nodes and we are done, right? Not really. Here is the problem: if we configure the clients to connect to the first node and that node goes down, messaging is still lost. Why? Because the RabbitMQ guys chose not to implement the load balancing part of clustering. The problem is that clients communicate with the broker using the TCP protocol, and the Swiss army knives of proxying/caching/balancing/floor waxing such as Apache or Nginx only reverse-proxy HTTP/S.

After I wrote that, I Googled just in case and found an Nginx TCP proxy module on GitHub. Perhaps you can get away with just Nginx if you use it already. If you use Apache, I could not find a TCP proxy module for it. If it exists, let me know.

What I DID find is that a more frequently used solution for this kind of a problem is HAProxy. This super solid and widely used proxy can be configured for Layer 4 (transport proxy), and works flawlessly with TCP. It is fairly easy to configure too: for TCP, you will need to configure the ‘defaults’, ‘frontend’ and ‘backend’ sections, or join both and just configure the ‘listen’ section (works great for TCP proxies).

I don’t want to go into the details of configuring HAProxy for TCP – there are good blog posts on that topic. Suffice to say that you can configure a virtual broker address that all the clients can connect to as usual, and it will proxy to all the MQ nodes in the cluster. It is customary to add the ‘check’ instruction to the configuration to ensure HAProxy will check that nodes are alive before sending traffic to them. If one of the brokers goes down, all the message traffic will be routed to the surviving nodes.

Do I really need HAProxy?

If you truly want to HA all the things, you now need to worry that you have made HAProxy itself a single point of failure. I told you, it never ends. The usual suggestion is to set up two instances, one primary and one backup for failover.

Can we get away with something simpler? It depends on how you define ‘simpler’. The vast majority of systems RabbitMQ runs on are some variant of Linux, and it turns out there is something called LVS (Linux Virtual Server). LVS seems to be perfect for our needs, being a low-level Layer 4 switch – it just passes TCP packets to the servers it is load-balancing. Except that in section 2.15 of the documentation I found this:

This is not a utility where you run ../configure && make && make check && make install, put a few values in a *.conf file and you’re done. LVS rearranges the way IP works so that a router and server (here called director and realserver), reply to a client’s IP packets as if they were one machine. You will spend many days, weeks, months figuring out how it works. LVS is a lifestyle, not a utility.

OK, so maybe not as perfect a fit as I thought. I don’t think I am ready for a LVS lifestyle.

How about no proxy at all?

Wouldn’t it be nice if we didn’t need the proxy at all? It turns out, we can pull that off, but it really depends on the protocol and client you are using.

It turns out not all clients for all languages are the same. If you are using AMQP, you are in luck. The standard Java client provided by RabbitMQ can accept a server address array, going through the list of servers when connecting or reconnecting until one responds. This means that in the event of node failure, the client will reconnect to another node.

We are using AMQP for our worker queue with Node.js, not Java, but the Node.js module we are using supports a similar feature. It can accept an array for the ‘host’ property (same port, user and password though). It will work with normal clustered installations, but the bummer is that you cannot install two instances on localhost to try the failure recovery out – you will need to use remote servers.
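For illustration, this is roughly what the connection options look like with a Node.js AMQP client that accepts an array of hosts (the option names follow the classic node-amqp module; the hostnames and credentials are made up, and your module’s exact options may differ):

// Connect to whichever broker node answers; on failure the client walks the list.
var amqp = require('amqp');

var connection = amqp.createConnection({
  host: ['rabbit1.example.com', 'rabbit2.example.com', 'rabbit3.example.com'],
  port: 5672,             // same port, user and password for all nodes
  login: 'worker',
  password: 'secret'
});

connection.on('ready', function () {
  console.log('Connected to one of the cluster nodes');
});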

On the MQTT side, Eclipse Paho Java client supports multiple server URLs as well. Unfortunately, our standard Node.js MQTT module currently only supports one server. I was assured code contributions will not be turned away.

This solution is fairly attractive because it does not add any more moving parts to install and configure. The downside is that the clients become fully aware of all the broker nodes – we cannot just transparently add another node as we could in the case of the TCP load balancer. All the clients must add it to their list of nodes to connect to for the addition to work. In effect, our code becomes aware of our infrastructure choices more than it should.

All this may be unnecessary for you if you use AWS, since a quick Google search suggests AWS Elastic Load Balancing can serve as a TCP proxy. Not a solution for us IBMers of course, but it may work for you.

Give me PaaS or give me death

This is getting pretty tiring – I wish we did all this in a PaaS like our own Bluemix so that it is all taken care of. IaaS gives you freedom that can at times be very useful and allows you to do powerful customizations, but at other times it makes you wish to get out of the infrastructure business altogether.

I told you I hate HA. Now if you excuse me, I need to join another call.

© Dejan Glozic, 2014