Components Are (Still) Hard on the Web


Here’s Johnny! I know, I know. It’s been a while since I posted something. In my defence, I was busy gathering real-world experience – the kind that allows me to forget the imposter syndrome for a while. We have been busy shipping a complex microservice system, which let me test a number of my blog posts in real life. Most are holding up pretty well, thank you very much. However, one thing that continues to bother me is that writing reusable components is still maddeningly hard on the Web in 2016.

Framing the problem

First, let’s define the problem. In a Web app of sufficient complexity, there will be a number of components you will want to reuse. You already know we are a large Node.js shop – reusing modules via NPM is second nature to us. It would be so nice to be able to npm install a Web component and just drop it into your app. In fact, we do exactly that with React components. Alas, it is much harder once you leave the React world.

First of all, let’s define what ‘reusing a Web component’ would mean:

  1. You can put a component written by somebody else somewhere on your page.
  2. The component will work well and will not do something nasty.
  3. The component will not look out of place in your particular design.
  4. You will know what is happening inside the component, and will be able to react to its lifecycle.

First pick a buffet

Component (or widget) reuse was for the longest time a staple of desktop UI development. You bought into the component model by using a particular widget toolkit. That is not a big deal on Windows or macOS – you have no choice if you want to write native applications. The same applies to native mobile development. However, on the Web there is no single component model. Beyond components that are an intrinsic part of HTML, in order to create custom components you need to buy into one of the popular Web frameworks. You need to pick the proverbial buffet before you can sample from it.

In 2016, the key battle is between abstracting out and embracing the Web platform. You can abstract the platform (HTML, CSS, DOM) away using JavaScript. This is the approach used by React (and my team by extension). Alternatively, you can embrace the platform and use HTML as your base (what Web Components, Polymer and perhaps Angular 2 propose). You cannot mix and match – you need to pick your approach first.

OK, I lied. You CAN mix and match, but it becomes awkward and heavy. React abstracts out HTML, but if you use a custom element instead of built-in HTML, React will still work fine. All the React machinery (diff-ing two incremental iterations of the virtual DOM, then applying the difference to the actual DOM) works for custom components as well. Therefore, it is fine to slip a Web Component into a React app.

The opposite does not really work well – consuming a React component in Angular or Polymer is awkward and not really worth it. Not that the original direction is necessarily worth it either – you need to load the Web Components JavaScript AND the React JavaScript.

Don’t eat the poison mushroom

One of the ways people loaded components into their pages in the past was by using good old iframes. Think what you will about them, but you could really lock components down that way. If you load a component into your own DOM, you need to really trust it. The same-origin policy and CORS are supposed to help you prevent a component from leaking data from your page to the mother ship. Nevertheless, particularly when it comes to more complex components, it pays to know what they are doing, go through the source code and so on. This is where open source really helps – don’t load a black box component into your DOM.

The shoes don’t match the belt

One of the most complex problems to deal with when consuming a Web component of any type is design. When you are working in a native SDK, the look and feel of a component is defined by the underlying toolkit. All iOS components have the ‘right’ look and feel out of the box when you consume them. However, Web apps have their own themes, which creates a combinatorial explosion. A reusable component needs to do one of the following things:

  1. Be configurable and themeable so that you can either set a few parameters to better blend it into your style guide, or provide an entire template to really dial it in
  2. Be generic and inoffensive enough to be equidistant from any parent theme
  3. Be instantly recognizable (think YouTube player) in a way that makes it OK that it has its own look and feel.

A very complex reusable component with a number of elements can be very hard for consumers to dial in visually. In a corporation, this may reduce the number of themes you want to support. A large component may take it upon itself to support two or three widely used design style guides. Then all you need to do is provide a single parameter (the style guide name) to make the component use the right styles across the board.

What is going on inside?

Adding a component into your page is not only a matter of placement and visual style. Virtually all reusable components are interactive. A component can be self-contained (for example, all activity in a YouTube player is confined to its bounding box), or expected to interact with the parent. If the component must interact with the parent, you need to consider the abstraction chain. Consider a simple countdown timer as a reusable component. Here is how the abstraction chain works:

[Figure: the countdown timer abstraction chain]

The timer itself uses two low-level components – ‘Start’ and ‘Stop’ buttons. Inside the timer, the code will add click listeners for both buttons. The listeners will add semantic meaning to the buttons by doing things according to their role – starting and stopping the timer.

Finally, when this component is consumed by your page, only one listener is available – ‘onTimerCountdown()’. Users will interact with the timer, and when the timer counts down to 0, the listener you registered will be notified. You should be able to expect events at the right semantic level from all reusable components, from the simplest calendars to large complex components.
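Here is a minimal sketch of that idea in plain JavaScript. The element id, the ‘timer-countdown’ event name and the overall shape are made up for illustration, not taken from any real component:

// The timer wires up its own low-level 'Start'/'Stop' buttons internally
// and only surfaces one semantic event to the page that consumes it.
function createTimer(container, seconds) {
  container.innerHTML =
    '<span class="time">' + seconds + '</span> ' +
    '<button class="start">Start</button> ' +
    '<button class="stop">Stop</button>';

  var remaining = seconds;
  var interval;

  container.querySelector('.start').addEventListener('click', function () {
    interval = setInterval(function () {
      remaining--;
      container.querySelector('.time').textContent = remaining;
      if (remaining === 0) {
        clearInterval(interval);
        // the only event the consuming page needs to know about
        container.dispatchEvent(new CustomEvent('timer-countdown', { bubbles: true }));
      }
    }, 1000);
  });

  container.querySelector('.stop').addEventListener('click', function () {
    clearInterval(interval);
  });
}

// the consuming page listens at the semantic level only
var timerElement = document.querySelector('#timer');
timerElement.addEventListener('timer-countdown', function () {
  console.log('Countdown reached zero');
});
createTimer(timerElement, 10);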

If a component can be made part of a larger document, the two things you will care about most are serialization and dirty state. When users interact with the component and make a modification, you want to be told that the component has changed. This should trigger the dirty state of the parent. When the user clicks ‘Save’, you should be able to serialize the component and store its state in the larger document. Conversely, on bootstrap you should be able to pass the serialized state to the component so that it can initialize itself.

Note that the actual technology used does not matter here – even components embedded using iframes can use window.postMessage to send events up to the parent (and accept messages from the parent). While components living in your DOM will resize automatically, iframe-ed components will also need to send resizing events via window.postMessage so that the parent can set the new size of the iframe.
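For the iframe case, a minimal sketch of that message traffic could look like the following. The message ‘type’ values and the markDocumentDirty() helper are made up for illustration, since there is no standard shape for these messages:

// Inside the iframe-ed component: report changes and size to the parent page.
function notifyParent(type, payload) {
  // use the parent's real origin instead of '*' in production
  window.parent.postMessage({ type: type, payload: payload }, '*');
}

// mark the parent document dirty whenever the user edits something
document.addEventListener('input', function () {
  notifyParent('dirty', {});
});

// let the parent know how tall the content is so it can resize the iframe
notifyParent('resize', { height: document.body.scrollHeight });

// In the parent page: react to the component's messages.
window.addEventListener('message', function (event) {
  // check event.origin against the component's origin before trusting the message
  var message = event.data;
  if (message.type === 'dirty') {
    markDocumentDirty(); // hypothetical parent-side dirty-state handler
  } else if (message.type === 'resize') {
    document.querySelector('#component-frame').style.height = message.payload.height + 'px';
  }
});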

The long tail

More complex reusable components don’t only have a client-side presence. They also need to call back to the server and fetch the data they need. You can hook up such a component in one of three ways:

  1. You can fetch the data the component requires yourself. In that case, the component is fully dependent on the container, and it is the container’s responsibility to perform all the XHR calls to fetch the data and pass it to the component (as sketched after this list). This approach may be best for pages that want full control of the network calls. As an added bonus, you can fit such a component into a data flow such as Flux, where some of the data may be coming from Web Socket driven server-side push, not just XHR requests.
  2. You can proxy the requests that the component is performing. This approach is also acceptable because it allows the proxy to control which third-party servers are going to be whitelisted.
  3. You can configure CORS so that the component can make direct calls on its own. This needs to be done carefully to avoid the component siphoning data from the page to servers you don’t approve.

In all of these cases, you may still want to be told about what is going on inside the component using component events, as discussed above.
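For illustration, the first approach could look something like this. The endpoint URL, the element id and the ‘data’ property are hypothetical:

// The container owns the network call and hands the result to the component,
// which never touches the network itself.
var xhr = new XMLHttpRequest();
xhr.open('GET', '/api/dashboard-data');
xhr.onload = function () {
  if (xhr.status === 200) {
    var data = JSON.parse(xhr.responseText);
    // the component only renders what it is given
    document.querySelector('#dashboard-widget').data = data;
  }
};
xhr.send();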

Frameworks are just the beginning

So there you go – all the problems you need to wrestle with when trying to reuse components in a larger project. Chances are the component is written in the ‘wrong’ framework, but getting the component to load in your page is only the beginning. Fitting the component into the page visually, figuring out what is happening inside it events-wise, and feeding it data from the server is the real battle. Unless you are just loading a calendar widget, this is where you will spend most of your time.

© Dejan Glozic, 2016

Isomorphic Apps Part 1: Node, Dust, and Socket.io

Two-headed turquoise serpent. Mixtec-Aztec, 1400-1521. Held at British Museum. Credits: Wikimedia Commons.

On the heels of last week’s post on the futility of the server vs. client side debate, here comes an example. I have wanted to do this for a long time, and now the progress of the Bluemix project I am working on has made it important to figure out isomorphic apps in earnest.

For those who have not read the often cited blog post by the Airbnb team, isomorphic apps blur the line between the client and the server by sharing code and templates. For this to work, it helps that both sides of the divide are similar in nature (read: can run JavaScript). Not that it is impossible to do in other stacks, but using Node.js on the server is the most direct path to isomorphism. JavaScript libraries are now written with the implicit assumption that people will want to run them in a Node app as well as in the browser.

Do I need isomorphic?

Why would we want to render on the server to begin with?

  1. We want to send HTML to the browser on first request, showing some content to the user immediately. This will help the perceived performance, since browsers are amazing at quickly rendering raw HTML, and we don’t have to wait for all the client side JavaScript to load before we can see anything.
  2. This HTML will also give something to the search engine crawlers to chew on and give you decent SEO without a lot of effort.
  3. Your app will fit nicely into the Web as it was designed (a collection of linked pages).

Why would you want to do something on the client then?

  1. You want to provide a nice interactive experience to the user – static documents (even those dynamically rendered on the server) are not a lot of fun beyond the actual content.
  2. You want your page to respond to changes on the server (other users making changes that affect the content of your page) using Web Sockets.
  3. You want to provide features that involve a number of panels that need to flow like a native app, and don’t want to reload full pages for that.

How to skin this particular cat

Today we are not hurting for choices when it comes to libraries and frameworks for Web development. As a result, I decided to write a multi-part article covering some of those options, and allow you to choose what works in your particular situation.

We will start with the simplest way to go isomorphic – by simply exploiting the fact that many JavaScript templating libraries run on both sides of the network divide. In our current projects in IBM, dustjs-linkedin is our trusted choice – a solid library used by many companies and a pleasure to work with. It can be used for rendering views of Node/Express applications, but if you compile the template down to JavaScript, you can load it and render partials on the client as well.

The app

For this exercise, we will write a rudimentary Todo app, which is really just a collection of records we want to keep. There is already the proverbial TodoMVC app designed to test all the client side MVC frameworks known to man, but we want our app to store data on the server, and render the initial Todo list using Node.js, Express and Dust.js. Once the list arrives at the client, we want to be able to react to changes on the server, and to add new Todos by entering them on the client. In both cases, we want to render the new entries on the client using the same templates we used to render the initial list.

Since we will want to use the REST API (folded into the same app for simplicity) as the single source of truth, we will use the Socket.io library to build an MVC-CV app (full MVC on the server, only the controller and the view on the client). The lack of a client model means that when we make changes to the server model through the REST API, we will rely on Socket.io to communicate with the client side controller and update the client view. With a full client side MV*, the client side model would be updated immediately, followed by asynchronous reconciliation with the server. That approach provides immediacy and makes the application feel snappy, at the expense of the possibility that a seemingly successful operation eventually fails on the server. Mobile app developers prefer this tradeoff.

In order to make Todos a bit more fun, we will toss in Facebook-based authentication so that we can have a user profile and store Todos for each user separately. We will use the Passport module for this. For now, we will use jQuery and Bootstrap to round out the app. In future instalments, we will get progressively fancier with the choices.
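As a point of reference, a minimal sketch of the Passport wiring could look like this. The app id, secret and callback URL are placeholders, and a real app would probably look the user up in a store instead of passing the raw profile through:

var passport = require('passport');
var FacebookStrategy = require('passport-facebook').Strategy;

passport.use(new FacebookStrategy({
    clientID: process.env.FACEBOOK_APP_ID,
    clientSecret: process.env.FACEBOOK_APP_SECRET,
    callbackURL: 'http://localhost:3000/auth/facebook/callback'
  },
  function (accessToken, refreshToken, profile, done) {
    // the Facebook profile becomes req.user for subsequent requests
    done(null, profile);
  }
));

// sessions need the user to be serialized in and out of the session store
passport.serializeUser(function (user, done) { done(null, user); });
passport.deserializeUser(function (user, done) { done(null, user); });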

Less talking, more coding

We will start by creating a Dust.js partial to render a single Todo card:

<div class="todo">
   <div class="todo-image">
      <img src="{imageUrl}">
   </div>
   <div class="todo-content">
      <div class="todo-first-row">
         <span class="todo-user">{userName}</span>
         <span class="todo-when" data-when="{when}">{whenText}</span>
      </div>
      <div class="todo-second-row">
         <span class="todo-text">{text}</span>
      </div>
   </div>
</div>

As long as the variables it needs are passed in as a dictionary, the Dust core library can render this template in Node.js or the browser. In order to be able to load it, we need to compile it down to JavaScript and place it in the ‘public/js’ directory:

#!/bin/bash          
dustc -name=todo views/todo.dust public/js/todo.js

We can now create a page where todos are rendered on the server as a list, with a text area to enter a new todo, and a ‘Delete’ button to delete them all:

<h1>Using Dust.js for View</h1>
      
<h2>Todos</h2>
<div class="new">
  <textarea id="new-todo-text" placeholder="New todo">
  </textarea>
</div>
<div class="delete">
  <button type="button" id="delete-all" class="btn btn-primary">Delete All</button> 
</div>
<div class="todos">
  {#todos}
    {>todo todo=./}
  {/todos}
</div>

You will notice in the snippet above that we are including the partial we defined before (todo). The collection ‘todos’ is passed to the view by the server side controller, which obtained it from the server side model.

The key for interactivity of this code lies in the JavaScript for this page:

<script src="http://cdn.jsdelivr.net/dustjs/2.4.0/dust-core.js"></script>  
<script src="/js/todo.js"></script>   

<script>
  var socket = io.connect('/');
  socket.on('todos', function (message) {
    if (message.type=='add') {
      dust.render("todo", message.state, function(err, out) {
        $(".todos").prepend(out);
      });
    }
    else if (message.type=='delete') {
      $('.todos').empty();
    }
  });
      
  $('#delete-all').on('click', function(e) {
    $.ajax({url: "/todos", type: "DELETE"});
  });
 
  $('#new-todo-text').keyup(function (e) {
    var code = (e.keyCode ? e.keyCode : e.which);
    if (code == 13) {
      e.preventDefault();
      $.post("/todos", { text: $('#new-todo-text').val() });
      $('#new-todo-text').val('');
    }
  });
</script> 

New todos are created by capturing the Enter key in the text area and posting the todo using the POST /todos endpoint. Similarly, deleting all todos is done by executing a DELETE /todos Ajax call.

Notice how we don’t do anything else here. We let the REST endpoint execute the operation on the server and send an event using Web Sockets. When we receive the message on the client, we update the view. This is the CV part of MVC-CV architecture that we just executed. The message sent via Web Sockets contains the state of the todo object that is passed to the Dust renderer. The output of the todo card rendering process is simply prepended to the todo list in the DOM.

REST endpoint and model

On the server, our REST endpoint is responsible for handling requests from the client. Since we are using Passport for authentication, the requests arrive at the endpoint with the user object attached, allowing us to execute the endpoints on behalf of the user (in fact, we will return a 401 if there is no user info).
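A minimal sketch of that guard and the route wiring, using Passport’s req.isAuthenticated() (the module path and the Express app variable are assumptions):

// reject unauthenticated requests before they reach the endpoint handlers
function ensureAuthenticated(req, res, next) {
  if (req.isAuthenticated()) {
    return next();
  }
  res.sendStatus(401);
}

var todos = require('./routes/todos'); // path is illustrative

app.get('/todos', ensureAuthenticated, todos.get);
app.post('/todos', ensureAuthenticated, todos.post);
app.delete('/todos', ensureAuthenticated, todos.delete);

With that in place, the endpoint handlers themselves look like this: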

var model = require('../models/todos');

// GET /todos - return the current user's todos as JSON
module.exports.get = function(req, res) {
  model.list(req.user, function (err, todos) {
    res.status(200).json(todos);
  });
};

// POST /todos - add a new todo, then notify connected clients
module.exports.post = function(req, res) {
  var body = req.body;

  model.add(req.user, body.text, function (err, todo) {
    res.sendStatus(201);
    _pushEvent("add", req.user, todo);
  });
};

// DELETE /todos - remove all of the user's todos, then notify clients
module.exports.delete = function(req, res) {
  model.deleteAll(req.user, function(err) {
    res.sendStatus(204);
    _pushEvent("delete", req.user, {});
  });
};

function _pushEvent(type, user, object) {
  var restrictedUser = {
    id: user.id,
    name: user.displayName
  };
  var message = {
    type: type,
    state: object,
    user: restrictedUser
  };
  exports.io.sockets.emit("todos", message);
}
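The endpoint emits events through exports.io, which implies the server assigns the Socket.io instance to the routes module at startup. A minimal sketch of that wiring (file names are illustrative):

// app.js (illustrative) - attach Socket.io to the HTTP server and hand
// the io instance to the routes module so that _pushEvent can emit on it
var express = require('express');
var http = require('http');

var app = express();
var server = http.createServer(app);
var io = require('socket.io')(server);

var todos = require('./routes/todos');
todos.io = io; // this is what the endpoint code above sees as exports.io

server.listen(process.env.PORT || 3000);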

We are more or less delegating the operations to the model object, and firing events for the verbs that change the data (POST and DELETE). The model is very simple – it uses lru-cache to store data (configured to handle 50,000 users, with a TTL of 1 hour before entries are evicted). This is good enough for a test – in the real world you would hook up a database here.

var LRU = require("lru-cache")
, options = { max: 50000
            , length: function (n) { return 1 }
            , maxAge: 1000 * 60 * 60 }
, cache = LRU(options)
;

module.exports.add = function (user, text, callback) {
  var todo = {
    text: text,
    imageUrl: "https://graph.facebook.com/"+user.id+"/picture?type=square",
    userName: user.displayName,
    when: Date.now()
  };
  var model = cache.get(user.id);
  if (!model) {
    model = { todos: [todo] };
  }
  else {
    model = JSON.parse(model);
    model.todos.splice(0, 0, todo);
  }
  cache.set(user.id, JSON.stringify(model));
  callback(null, todo);
};

module.exports.list = function(user, callback) {
  var model = cache.get(user.id);
  if (model)
    model = JSON.parse(model);
  var todos = model?model.todos:[];
  callback(null, todos);
};

module.exports.deleteAll = function(user, callback) {
  cache.del(user.id);
  callback(null);
};

The entire example is available as a public project on IBM DevOps Services. You can clone the Git repository and play on your machine, or just click on Code and inspect it in the Web IDE directly.

The app is currently running on Bluemix – log in using your Facebook account and give it a spin.

Commentary and next steps

This was the simplest way to achieve isomorphism. It has its downsides, among them the lack of immediacy caused by the missing client side model, but it is blessed by complete freedom from client side frameworks (jQuery and Bootstrap notwithstanding). In part 2 of the post, I will insert Backbone on the client. Since it has support for models, collections and views, it is a particularly good choice for gradually evolving our application (AngularJS would require a complete rewrite, whereas Backbone can reuse our Dust.js template for the View). Also, as frameworks go, it is tiny (~9K minified and gzipped).

Finally, in part 3, we will swap Dust.js for React.js in the Backbone View implementation, just to see what all the fuss is about. Now you see why I need to do this in three parts – so many frameworks, so little time.

© Dejan Glozic, 2015

HA All The Things


I hate HA (High Availability). Today everything has to be highly available. All of a sudden, SA (Standard Availability) isn’t cutting it any more. Case in point: I used to listen to music on my way to work. Not any more – my morning meeting schedule intrudes into my ride, forcing me to participate in meetings while driving, Bluetooth and all. My 8-speaker, surround-sound Acura ELS system hates me – built for high-resolution multichannel reproduction, it is reduced to ‘Hi, who just joined?’ in glorious mono telephony. But I digress.

You know that I have written many articles on micro-services because they are an ongoing concern as we slowly evolve our topology away from monolithic systems and towards micro-services. I have already written about my thoughts on how to scale and provide HA for Node.js services. We have also solved the problem of handling messaging in a cluster using AMQP worker queues.

However, we are not done with HA. The message broker itself needs to be HA, and we only have one node. We are currently using RabbitMQ, and so far it has been rock solid, but we know that in a real-world system it is not a matter of ‘if’ but ‘when’ it will suffer a problem, bringing down all the messaging capabilities of the system with it. Or we will mess around with the firewall rules and block access to it by accident. Hey, contractors rupture gas pipes and power cables by accident all the time. Don’t judge.

Luckily RabbitMQ can be clustered. RabbitMQ documentation is fairly extensive on clustering and HA. In short, you need to:

  1. Stand up multiple RabbitMQ instances (nodes)
  2. Make sure all the instances use the same Erlang cookie, which allows them to talk to each other (yes, RabbitMQ is written in Erlang; you learn this on the first day, when you need to install the Erlang environment before you install Rabbit)
  3. Cluster nodes by running rabbitmqctl join_cluster --ram rabbit@<firstnode> on the second server
  4. Start the nodes and connect to any of them

RabbitMQ has an interesting feature in that nodes in the cluster can join in RAM mode or in disc mode. RAM nodes will replicate state only in memory, while disc nodes will also write it to disc. While in theory it is enough to have only one node in the cluster use disc mode, the performance gain of RAM mode is not worth the risk (the gain is restricted to joining queues and exchanges, not publishing messages, anyway).

Not so fast

OK, we cluster the nodes and we are done, right? Not really. Here is the problem: if we configure the clients to connect to the first node and that node goes down, messaging is still lost. Why? Because the RabbitMQ guys chose not to implement the load balancing part of clustering. The problem is that clients communicate with the broker using the TCP protocol, and the Swiss army knives of proxying/caching/balancing/floor waxing such as Apache or Nginx only reverse-proxy HTTP/S.

After I wrote that, I Googled just in case and found an Nginx TCP proxy module on GitHub. Perhaps you can get away with just Nginx if you use it already. If you use Apache, I could not find a TCP proxy module for it. If it exists, let me know.

What I DID find is that a more frequently used solution for this kind of problem is HAProxy. This super solid and widely used proxy can be configured for Layer 4 (transport) proxying, and works flawlessly with TCP. It is fairly easy to configure too: for TCP, you will need to configure the ‘defaults’, ‘frontend’ and ‘backend’ sections, or join them and just configure the ‘listen’ section (which works great for TCP proxies).

I don’t want to go into the details of configuring HAProxy for TCP – there are good blog posts on that topic. Suffice to say that you can configure a virtual broker address that all the clients can connect to as usual, and it will proxy to all the MQ nodes in the cluster. It is customary to add the ‘check’ instruction to the configuration to ensure HAProxy will check that nodes are alive before sending traffic to them. If one of the brokers goes down, all the message traffic will be routed to the surviving nodes.
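For illustration, a minimal ‘listen’ section along those lines could look like this (the section name, port and server addresses are placeholders):

listen rabbitmq_cluster
  bind *:5672
  mode tcp
  balance roundrobin
  server rabbit1 10.0.0.11:5672 check
  server rabbit2 10.0.0.12:5672 check

Clients then connect to the HAProxy host on port 5672 as if it were the broker itself.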

Do I really need HAProxy?

If you truly want to HA all the things, you now need to worry that you have made HAProxy a single point of failure. I told you, it never ends. The usual suggestion is to set up two instances, one primary and the other as a backup for fail-over.

Can we get away with something simpler? It depends on how you define ‘simpler’. The vast majority of systems RabbitMQ runs on are some variant of Linux, and it appears there is something called LVS (Linux Virtual Server). LVS seems perfect for our needs, being a low-level Layer 4 switch – it just passes TCP packets to the servers it is load-balancing. Except that in section 2.15 of the documentation I found this:

This is not a utility where you run ../configure && make && make check && make install, put a few values in a *.conf file and you’re done. LVS rearranges the way IP works so that a router and server (here called director and realserver), reply to a client’s IP packets as if they were one machine. You will spend many days, weeks, months figuring out how it works. LVS is a lifestyle, not a utility.

OK, so maybe not as perfect a fit as I thought. I don’t think I am ready for the LVS lifestyle.

How about no proxy at all?

Wouldn’t it be nice if we didn’t need the proxy at all? It turns out, we can pull that off, but it really depends on the protocol and client you are using.

It turns out not all clients for all languages are the same. If you are using AMQP, you are in luck. The standard Java client provided by RabbitMQ can accept an array of server addresses, going through the list when connecting or reconnecting until one responds. This means that in the event of a node failure, the client will reconnect to another node.

We are using AMQP for our worker queue with Node.js, not Java, but the Node.js module we are using supports a similar feature. It can accept an array for the ‘host’ property (with the same port, user and password for all of them, though). It will work with normal clustered installations, but the bummer is that you cannot install two instances on localhost to try the failure recovery out – you will need to use remote servers.
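A minimal sketch of what that looks like, assuming the node-amqp module and placeholder host names (adjust for whichever AMQP client you actually use):

var amqp = require('amqp');

// 'host' accepts an array - the client walks the list on connect/reconnect
// until one of the brokers responds
var connection = amqp.createConnection({
  host: ['mq-node-1.internal', 'mq-node-2.internal'],
  port: 5672,
  login: 'guest',
  password: 'guest'
});

connection.on('ready', function () {
  // declare queues and exchanges as usual; if the current node fails,
  // the client reconnects using the next host in the array
});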

On the MQTT side, the Eclipse Paho Java client supports multiple server URLs as well. Unfortunately, our standard Node.js MQTT module currently only supports one server. I was assured code contributions will not be turned away.

This solution is fairly attractive because it does not add any more moving parts to install and configure. The downside is that the clients become fully aware of all the broker nodes – we cannot just transparently add another node as we could in the case of the TCP load balancer. Every client must add the new node to its list for the addition to work. In effect, our code becomes more aware of our infrastructure choices than it should be.

All this may be unnecessary for you if you use AWS, since Google tells me AWS Elastic Load Balancing can serve as a TCP proxy. Not a solution for us IBMers of course, but it may work for you.

Give me PaaS or give me death

This is getting pretty tiring – I wish we could do all this in a PaaS like our own Bluemix so that it is all taken care of for us. IaaS gives you freedom that can at times be very useful and allow for powerful customizations, but at other times it makes you wish you could get out of the infrastructure business altogether.

I told you I hate HA. Now if you excuse me, I need to join another call.

© Dejan Glozic, 2014