Socket.io and the Business of Open Source

Gerard van Honthorst: The Matchmaker (1625)

In one of my previous posts on the topic of risk, I mentioned studies showing that investor tolerance for stock market risk is much higher when the market is rising than when it is falling like a knife. It turns out that risk feels abstract until you actually start losing real money, at which point it becomes painfully real and appetite-suppressing. I was reminded of this phenomenon while researching material for this blog post, when I unintentionally ran into the risk of depending on Open Source projects.

The post was supposed to be another one in a series about our journey into full-on Node.js development. I wanted to demonstrate one of the key selling points of Node – the ability to handle a lot of I/O in a non-blocking fashion, allowing the Node server to push simultaneous messages to a large number of clients using one of the push technologies. As with everything Node, there are multiple libraries for this task but it is hard to miss that Socket.io has been immensely popular among the choices (here is a nice summary of the server push choices for Node.js).

All right then, let’s install socket.io via NPM and start coding. Not so fast – which version? And herein starts a tale full of twists and suspense. If you go to the Socket.io Web site, it is all about version 0.9. Aha, but if you google Socket.io, you will find presentations of the type “Why Socket.io 1.0”, “What is New in Socket.io 1.0”, “How Socket.io 1.0 Will Be Unbelievably Awesome”, all from around mid-2012. Then in 2013 – silence. Everybody assumed it was a matter of days until 1.0 appeared, yet we are still waiting.

It looks like version 1.0 has been a long time coming, which is fine by itself – I have noticed that the Node community favors features and quality over schedules (the same comments apply equally to Node version 0.12). However, there is precious little about the upcoming version on the project’s GitHub home page. This has caused a lot of discussion in the project itself, as well as a panicky Google Groups thread titled ‘Is Socket.io Dying?’

Well, it appears that rumors of Socket.io’s death have been greatly exaggerated – Mr. @rauchg himself appeared to assuage palpitating followers and assure them that all test suites for v1.0 are passing and that good news is coming very soon. That was in December 2013. It is now February 2014 and we are still waiting. The lower-level Engine.io module is supposed to power Socket.io, and apparently that is where most of the action is these days.

The frustrating wait for a long-anticipated product is something I have experienced recently with Apple Logic Pro X. Apple’s replacement for Logic Pro 9 was several years in coming, and the rumor mill was crazy. People latched onto every whisper on the InterWeb and every Mac rumor they could find, publicly renounced Apple, and threatened to switch to Cubase or Pro Tools. The parallels are almost uncanny: uncertainty about the next version, tight-lipped product owners leaving a lot of room for speculation, FUD, rumors of near death, reassurances by the product owners that they are ‘hard at work on a new version’.

In the end the new version did arrive, it was unbelievably awesome, and everybody was too busy playing with it and loving Apple again to remember how crazy they were only a short while ago. Most likely this will be the Socket.io story as well – @rauchg and the team will be forgiven as soon as NPM starts serving v1.0 modules.

What did we learn from this, if anything? Google Groups comments contained reminders that “Socket.io is Open Source, after all”, that “we should be grateful for free software”, that “they don’t owe you anything, man”, etc. All true, but I take issue with “free software”. Open Source software is not very different from paid software – there is always a reason why somebody writes it, and the fact that I am not asked to pay for the software upfront does not mean that a business model does not exist and that a transaction is not happening. It is just not as obvious as with payware.

Open Source is free, but it is not a hobby for most people. In the old days, releasing code into Open Source was a way to undercut a competitor by commoditizing a function or a product. More recently, contributing to Open Source became a way to grow your personal brand and be highly employable in the social economy. One of these days somebody will make the equivalent of the Klout number that measures your footprint in Open Source projects (you can call it ‘Codex’, but if you do, I want some stock options if you build a startup around this idea). Therefore, while Guillermo Rauch wrote Socket.io ‘for free’, social advertisers can calculate to a dollar the value of the tagline “Creator of socket.io” in his Twitter profile. Otherwise, how on Earth could Snapchat, which makes zero revenue, be valued at $3B by Facebook – and then turn down that offer? Not seeing the monetization path right away does not mean it does not exist.

Apart from individual contributors and their public brand, whole companies build up their street cred by contributing to Open Source. In a recent article by ReadWrite (thanks for passing along, @cra!), companies such as Google, Twitter and LinkedIn were open about it. In fact, LinkedIn went as far as to claim that ‘Open Source is part of their recruiting strategy’, because their engineering blog and the code discussed there are essentially a replacement for the HR lady (unless you consider Veena Basavaraj a member of LinkedIn HR – it definitely seemed so when I researched their fork of Dust.js: I felt the urge to work for LinkedIn, and I wasn’t even looking).

In a sense, GitHub is becoming the software industry’s equivalent of the matchmaker lady in the painting above, matching companies on the lookout for talent with developers putting their best lines of code forward. Your ‘free’ GitHub project may say more about you than your LinkedIn resume. For some developers, Open Source GitHub projects ARE their resumes, cutting the glib ‘team player’ and ‘fast learner’ claptrap of traditional resumes that nobody reads and going straight to the code, where the rubber meets the road. It is all great that you work well in response to a challenge, but why don’t you fix those 150 issues that are gathering dust in your project? Will you be similarly neglectful if we hire you?

In that light, I don’t think it is too much to ask to get Socket.io v1.0 out the door already, particularly after so much PR about it. Every time we tweet about it, use it, write articles and blog posts, or talk to our friends about it, we build up the Socket.io team’s reputation, which they can take all the way to the bank, so I think the transaction is fair and we are even. Along that line (and to tie back to my reference to risk tolerance, in this case the risk of building production code on top of Open Source software), I don’t think we can just wave our hand and say ‘it comes with the territory of using software you didn’t pay for’. As I tried to prove here, I DID pay for it and continue to pay, just in a more convoluted and delayed transaction.

Less pontificating, more coding. Next week I will return from this tangent to write about pushing events from a Node.js server using whichever Socket.io version is available at that time. Maybe this blog post will code-shame them into releasing v1.0.

Meanwhile, and talking about the risk of depending on Open Source code, I am not sure what to do with node-optimist, besides recommending against using modules with pirates on their home pages.

© Dejan Glozic, 2014

In The Beginning, Node Created os.cpus().length Workers

Julius Schnorr von Carolsfeld: Bibel in Bildern (1860)

As reported in the previous two posts, we are boldly going where many have already gone before – into the brave new world of Node.js development. Node has this wonderful aura that makes you feel unique, even though the fact that you can find answers to your Node.js questions in the first two pages of Google search results should tell you something. This reminds me of Stuff White People Like, where the question ‘Why do white people love Apple products so much’ is answered thusly:

Apple products tell the world you are creative and unique. They are an exclusive product line only used by every white college student, designer, writer, English teacher, and hipster on the planet.

And so it is with Node.js, where I fear that if I hesitate too much, somebody else will write an app in Node.js and totally deploy it to production, beating me to it.

Jesting aside, Node.js definitely has the ‘new shoes’ feel compared to stuff that has been around much longer. Now that we have graduated from ‘Hello, World’ and want to do some serious work in it, there is a list of best practices we need to quickly absorb. The topic of this post is fitting the square peg of Node.js’ single-threaded nature into the multi-core hole.

For many skeptics, the fact that Node.js is single-threaded is a non-starter. Any crusty Java server can seamlessly spread to all the cores on the machine, making Node look primitive in comparison. In reality, this is not so clear-cut – it all depends on how I/O intensive your code is. If it is mostly I/O bound, those extra cores will not help you that much, while the non-blocking nature of Node.js will be a great advantage. However, if your code needs to do a little bit of sequential work (and even mostly I/O-bound code has a bit of blocking work here and there), you will definitely benefit from doubling up. There is also that pesky problem of uncaught exceptions – they can terminate your server process. How do you address these problems? As always, it depends.
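One common mitigation (a sketch, not a prescription): treat an uncaught exception as fatal for the worker, log it, and let the cluster master shown below fork a replacement:

process.on('uncaughtException', function(err) {
   // Log the failure and exit with a non-zero code, trusting the
   // master's 'exit' handler (see below) to respawn a worker.
   console.error('Uncaught exception in worker ' + process.pid + ': ' + err.stack);
   process.exit(1);
});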

A simple use case (and one that multi-threading server proponents like to use as a Node.js anti-pattern) is a Node app installed on a multi-core machine (i.e. all machines these days). In a naive implementation, the app will use only one core for all the incoming requests, under-utilizing the hardware. In theory, a Java or a RoR server app will scale better by spreading over all the CPUs.

Of course, this being year five of Node.js’ existence, using all the cores is entirely possible with the core ‘cluster’ module (pun not intended). Starting from the example from two posts ago, all we need to do is bring in the ‘cluster’ module and fork the workers:

var cluster = require('cluster');

if (cluster.isMaster) {
   var numCPUs = require('os').cpus().length;
   // Fork the workers, one per CPU
   for (var i = 0; i < numCPUs; i++) {
      cluster.fork();
   }
   cluster.on('exit', function(deadWorker, code, signal) {
      // The worker is dead. He's a stiff, bereft of life,
      // he rests in peace.
      // He's pushing up the daisies. He expired and went
      // to meet his maker.
      // He's bleedin' demised. This worker is no more.
      // He's an ex-worker.

      // Oh, look, a shiny new one.
      // Norwegian Blue - beautiful plumage.
      var worker = cluster.fork();

      var newPID = worker.process.pid;
      var oldPID = deadWorker.process.pid;

      console.log('worker ' + oldPID + ' died.');
      console.log('worker ' + newPID + ' born.');
   });
}
else {
   // The normal part of our app: set up Express and create the server here.
}
The code above does two things – it forks the workers (one per CPU core), and it replaces a dead worker with a spiffy new one. This illustrates the ethos of disposability described in The Twelve-Factor App: an app that can be started and stopped quickly can also be replaced without a hitch if it crashes. Of course, you can analyze the logs and try to figure out why a worker crashed, but you can do that on your own time, while the app continues to handle requests.

It can help to modify the server creation code to print out the process ID (‘process’ is an implicitly defined global variable – no need to require a module for it):

http.createServer(app).listen(app.get('port'), function() {
   console.log('Express server '+process.pid+
                ' listening on port ' + app.get('port'));
});

The sweet thing is that even though we are using multiple processes, they are all bound to the same port (3000 in this case). This works because the master process is the only one actually bound to that port – it either shares the listening socket handle with the workers, or (with the round-robin policy described below) accepts connections itself and hands them off to the workers.

We can now modify our controller to pass the PID to the ‘simple’ page and render it using Dust:

exports.simple = function(req, res) {
  res.render('simple', { title: 'Simple',
                         active: 'simple',
                         pid: process.pid });
};

This line in the simple.dust file will render the process ID on the page:


This page is served by server pid={pid}.

When I try this code on my quad-core ThinkPad laptop running Windows 7, I get 8 workers (os.cpus().length counts logical cores, and Hyper-Threading doubles the physical count):

Express server 7668 listening on port 3000
Express server 8428 listening on port 3000
Express server 8580 listening on port 3000
Express server 9764 listening on port 3000
Express server 7284 listening on port 3000
Express server 5412 listening on port 3000
Express server 6304 listening on port 3000
Express server 8316 listening on port 3000

If you reload the browser fast enough when rendering the page, you can see different process IDs reported on the page.
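If clicking reload gets old, a few lines of Node can do the same check (a quick sketch; it assumes the page renders the ‘pid=’ line shown above):

var http = require('http');

// Fire a handful of concurrent requests and log which worker answered each.
for (var i = 0; i < 8; i++) {
   http.get('http://localhost:3000/simple', function(res) {
      var body = '';
      res.on('data', function(chunk) { body += chunk; });
      res.on('end', function() {
         var match = body.match(/pid=(\d+)/);
         console.log('served by worker ' + (match ? match[1] : '(pid not found)'));
      });
   });
}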

This sounds easy enough, as most things in Node.js do. But as usual, real life is a tad messier. After testing the clustering on various machines and platforms, the Node.js team noticed that some machines tend to favor only a couple of workers from the entire pool. It is a sad fact of life that in college assignments a couple of nerds end up doing all the work while the slackers party, but few of us want to tolerate such behavior when it comes to responding to our Web traffic.

As a result, starting with the upcoming Node version 0.12, new connections will be handed to workers in a ’round-robin’ fashion. This policy will be the default on most platforms (although you can defeat it by adding this line before creating the workers):

    // Set this before calling other cluster functions.
    cluster.schedulingPolicy = cluster.SCHED_NONE;
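Conversely, my reading of the docs is that you can opt into the new policy explicitly (treat the environment variable as an assumption worth verifying):

    // Opt into round-robin scheduling explicitly;
    // NODE_CLUSTER_SCHED_POLICY=rr in the environment should do the same.
    cluster.schedulingPolicy = cluster.SCHED_RR;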

You can read more about it in this StrongLoop blog post.

An interesting twist to clustering is when you deploy this app to the Cloud, using IaaS such as SoftLayer, Amazon EC2 or anything based on VMware. Since you can provision VMs with a desired number of virtual cores, you have two dimensions to scale your Node application:

  1. You can ramp up the number of virtual cores allocated for your app. Your code as described above will stretch to create more workers and take advantage of this increase, but all the child processes will still be using shared RAM and virtual file system. If a rogue worker fills up the file system writing logs like mad, it will spoil the party for all. This approach is good if you have some CPU bottlenecks in your app.
  2. You can add more VMs, fully isolating your app instances. This approach will give you more RAM and disk space. For JVM-based apps, this would definitely matter because JVMs are RAM-intensive. However, Node apps are much more frugal when it comes to resources, so you may not need as many full VMs for Node.

Between the two approaches, ramping up cores is definitely the cheaper option and should be attempted first – it may be all you need. Of course, if you deploy your app to a PaaS like CloudFoundry or Heroku, all bets are off. It is possible that the code I have listed above is not even needed if you intend to host your app on a PaaS, because the platform may provide this behavior out of the box. However, in some configurations this code will still be useful.

Example: Heroku gives you a single CPU dyno (virtualized unit of server power) with 512MB RAM for free. If you stay on one instance but pick a 2-core dyno with 1GB RAM (I know, still peanuts), that will cost you $34.50 at the time of writing (don’t quote me on the numbers, check them directly at the Heroku pricing page). Using two single core dynos will cost you the same. Between the two, JVM would probably benefit from the 2x dyno (with more RAM), while a single threaded Node app would benefit from two single core instances. However, our code gives you the freedom to use one 2X dyno and still use both cores. I don’t know if availability is the responsibility of the PaaS or yourself – drop me a line if you know the details.

It goes without saying that workers are separate processes, sharing nothing (SN). In reality, the workers will probably share storage via an attached resource, and storage itself can be clustered (or sharded) for horizontal scaling. It is debatable whether sharing storage (even as attached resources) disqualifies this architecture from being called ‘SN’, but ignoring storage for now, your worker should be written to not cache anything in memory that cannot be easily recreated from a data source outside the worker itself. This includes auth or session data – you should rely on authentication schemes where the client sends you some kind of token that you can exchange for the user data with an external authentication authority. This makes your worker not unlike Dory from Pixar’s ‘Finding Nemo’, suffering from short-term memory loss and introducing itself for each request. The flip side is that a new worker spawned after a worker death can be ready for duty, missing nothing from the previous interactions with the client.
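To make this concrete, here is a sketch of externalizing session state with Express 3 and the connect-redis module (the store options and the secret are placeholders – check the module docs before lifting this):

var express = require('express');
var RedisStore = require('connect-redis')(express); // Express 3 style binding

var app = express();
app.use(express.cookieParser());
app.use(express.session({
   secret: 'keyboard cat', // placeholder - use a real secret
   store: new RedisStore({ host: '127.0.0.1', port: 6379 })
}));

// Any worker, including a freshly respawned one, can now find the session,
// because no state lives in the worker's own memory.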

In a sense, using clustering from the start builds character – you can never leave clustering as an afterthought, as something you will add later when your site becomes wildly popular and you need to scale. You may discover that you are caching too much in memory and need to devise schemes to share that information between nodes. It is better to get used to the SN mindset before you start writing clever code that will bite you later.

Of course, this being Node, there is always more than one way to skin any particular cat. There is a history of clustering with Node, and also of keeping Node alive (an uncaught exception can terminate your process, which is a bummer if only one process is serving all your traffic). In the olden days (i.e. a couple of years ago), people had good experience with forever. It is simple and comes with a friendly license (MIT). Note though that forever only keeps your app alive; it does not cluster it. More recently, PM2 emerged as a more sophisticated solution, adding clustering and monitoring to the mix. Unfortunately, PM2 comes with an AGPL license, which makes it much harder to ship with your commercial product (which means little if you are just having fun, but actually matters if you are a company of any size with actual paying customers installing your product on premises). Of course, if your whole business is hosted and you are not shipping anything to customers, you should be fine.

What I like about the ‘cluster’ module is that it is part of the Node.js core library. We will likely add our own monitoring, or look for ‘add-on’ monitoring that plays nicely with this module, rather than use a complete replacement like PM2. Regardless of what we do about monitoring, the clustering boilerplate will be a normal part of all our Node.js apps from now on.

© Dejan Glozic, 2014

Dust.js: Such Templating


Last week I started playing with Node.js and LinkedIn’s fork of Dust.js for server side templating. I think I am beginning to see the appeal that made LinkedIn choose Dust.js over a number of alternatives. Back when LinkedIn had a templating throwdown and chose Dust.js, they classified it in the ‘logic-less’ group, in contrast to the ’embedded JavaScript’ group that allows you to write arbitrary JavaScript in your templates. Here is the list of reasons I like Dust.js over the alternatives:

  1. Less logic instead of logic-less. Unlike Mustache, it allows you to have some sensible view logic in your templates (and add more via helpers).
  2. Single instead of double braces. For some reason Mustache and Handlebars decided that if one set of curly braces is good, two sets must be better. It is puzzling that people insisting on DRY see no irony in needing two sets of delimiters for template commands and variables. Typical Dust.js templates look cleaner as a result.
  3. DRY but not to the extreme (I am looking at you, Jade). I can fully see the HTML markup that will be rendered. Life is too short to have to constantly map the shorthand notation of Jade to what it will eventually produce (and bang your head on the table when it barfs at you or spits out wrong things).
  4. Partials and partial injection. Like Mustache or Handlebars, Dust.js has partials. Like Jade, Dust.js has injection of partials, where the partial reserves a slot for the content that the caller will define. This makes it insanely easy to create ‘skeleton’ pages with common areas, then inject payload. One thing I did was set up three injection areas: in HEAD, in BODY where the content should go and at the end of BODY for injections of scripts.
  5. Helpers. In addition to the very minimal logic of the core syntax, helpers add more view logic if needed (the ‘dustjs-helpers’ module provides some useful helpers that I took advantage of; you can write your own helpers easily – see the sketch after this list).
  6. Stuff I didn’t try yet. I didn’t try streaming and asynchronous rendering, or complex paths to the data objects, but I intend to study how I can take advantage of them. They seem like something that can come in handy once I get to more advanced use cases. That, and the fact that JSPs support streaming, means we are not giving anything up by moving to Node/Dust.
  7. Client side templating. This also falls under ‘stuff I didn’t try yet’, but I am singling it out because it requires a different approach. So far our goal has been to replace our servlet/JSP server side with the Node/Dust pair. Having the alternative to render on the client opens up a whole new avenue of use cases. We have been burnt in the past with too much JavaScript on the client, so I want to approach this carefully, but a lot of people have made a case for client side rendering. One thing I would like to try is adding logic in the controller to sniff the user agent and render the same template on the server or on the client (say, render on the server for lesser browsers or search engine bots). We will definitely try this out.
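Speaking of helpers (point 5 above), writing your own is a small affair. A sketch, with a made-up ‘shout’ helper (my understanding is that requiring ‘dustjs-helpers’ pulls in dustjs-linkedin and registers the standard helpers on the same object):

var dust = require('dustjs-helpers');

// In a template: {@shout value="hello"/} renders HELLO.
// Helper signature: chunk (output), context, bodies, params.
dust.helpers.shout = function(chunk, context, bodies, params) {
   return chunk.write(String(params.value).toUpperCase());
};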

In our current project we use a number of cooperating Java apps, with plain servlets for controllers and JSPs for views. We were very disciplined about not using direct Java expressions in JSPs, only JSTL/EL (Tag Library and Expression Language). JSPs took a lot of flak in the dark days of spaghetti JSPs with a lot of Java code embedded in views, essentially driving the current aversion to logic in templates to absurd levels in Mustache. It is somewhat ironic that you can easily create similar spaghetti Jade templates with liberal use of embedded JavaScript, so that monster is alive and well.

Because of our discipline, porting our example app to Node.js – with Dust.js for views and Express.js for middleware and routing – was easy. Our usual client stack (jQuery, Require.js, Bootstrap) was perfectly usable; we just copied the client code over.

Here is the shared ‘layout.dust’ file that is used by each page in the example app:

<!DOCTYPE html>
<html>
  <head>
    <title>{title}</title>
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="initial-scale=1.0, maximum-scale=1.0, user-scalable=0" />
    <link rel="stylesheet" href="/stylesheets/base.css">
    {+head/}
  </head>

  <body class="jp-page">
    {>navbar/}
    <div class="jp-page-content">
      {+content/}
    </div>

    <script src="/js/base.min.js"></script>
    <script>
      requirejs.config({
        baseUrl: "/js"
      });
    </script>
    {+script/}
  </body>
</html>

Note the {+head/}, {+content/} and {+script/} sections – they are placeholders for content that will be injected from templates that include this partial. This ensures the styles, meta properties, content and script are injected in proper places in the template. One thing to note is that you don’t have to define empty placeholders – you can place content between the opening and closing tag of the section, but we didn’t have any default content to provide here. You can view an empty tag as an ‘injection point’ (this is where the stuff will go), whereas a placeholder with some default content will be more like ‘overriding point’ (the stuff in the caller template will override this).
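For illustration, a placeholder carrying default content would look something like this (a hypothetical default – ours were empty):

{+head}
  <!-- default content, rendered when the caller injects nothing -->
  <meta charset="utf-8">
{/head}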

The header is a partial pulled into the shared template. It has been put together quickly – I can easily see the links for the header being passed in as an array of objects, which would make the partial even cleaner (a sketch of that variant follows the markup below). Note the use of the helper for controlling the selected highlight in the header. It simply compares the value of the active link to the static values and adds the CSS class ‘active’ when they match:

<div class="navbar navbar-inverse navbar-fixed-top jp-navbar" role="navigation">
  <div class="navbar-header">
    <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-ex1-collapse">
      <span class="sr-only">Toggle Navigation</span>
      <span class="icon-bar"></span>
      <span class="icon-bar"></span>
      <span class="icon-bar"></span>
    </button>
    <a class="navbar-brand" href="/">Examples<div class="jp-jazz-logo"></div></a>
  </div>
  <div class="navbar-collapse navbar-ex1-collapse collapse">
    <ul class="nav navbar-nav">
      <li {@eq key=active value="simple"}class="active"{/eq}><a href="/simple">Simple</a></li>
      <li {@eq key=active value="i18n"}class="active"{/eq}><a href="/i18n">I18N</a></li>
      <li {@eq key=active value="paging"}class="active"{/eq}><a href="/paging">Paging</a></li>
      <li {@eq key=active value="tags"}class="active"{/eq}><a href="/tags">Tags</a></li>
      <li {@eq key=active value="widgets"}class="active"{/eq}><a href="/widgets">Widgets</a></li>
      <li {@eq key=active value="opensocial"}class="active"{/eq}><a href="/opensocial">OpenSocial</a></li>
    </ul>
    <div class="navbar-right">
      <ul class="nav navbar-nav">
        <li><a href="#">Not logged in</a></li>
      </ul>
    </div>
  </div>
</div>
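And here is the data-driven variant mentioned above – a sketch only, since ‘links’ and its fields (id, url, label) are made up and would be passed from the controller as an array of objects:

<ul class="nav navbar-nav">
  {#links}
  <li {@eq key=active value="{id}"}class="active"{/eq}><a href="{url}">{label}</a></li>
  {/links}
</ul>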

Finally, here is the sample page using these partials:

{>layout/}
{<content}
  <h2>Simple Page</h2>
    <p>
      This is a simple HTML page generated using Node.js and Dust.js, which loads CSS and JavaScript.<br/>
      <a id="myLink" href="#">Click here to say hello</a> 
    </p>
{/content}
{<script}
   <script src="/js/simple/simple-page.js"></script>
{/script}

In this page, we include the shared partial ‘layout.dust’, then inject the content area into the ‘content’ placeholder and some script into the ‘script’ placeholder.

The Express router for this page is very short – all it does is render the template. Note how we are passing the title of the page and also the value of the ‘active’ property to ensure the proper link is highlighted in the header partial:

exports.simple = function(req, res){
  res.render('simple', { title: 'Simple', active: 'simple' });
};

Running the Node app gives you the following in the browser:

[Screenshot: the simple page rendered in the browser]

Since we are using Bootstrap, we also get responsive design thrown in for free:

[Screenshot: the same page rendered at a narrow viewport]

There you go. Sometimes it pays to follow in the footsteps of those who did the hard work before you – LinkedIn’s fork of Dust.js is definitely a very comfortable and capable templating engine and a great companion to Node.js and Express.js. We feel confident using it in our own projects. In fact, we have decided to write one of the apps in our project using this exact stack. As usual, you will be the first to know what we learn as we go through it.

© Dejan Glozic, 2014

On the LinkedIn’s Dusty Trail


There is an old New Yorker cartoon (from 1993) where a dog working on the computer tells another dog: “On the Internet, nobody knows you are a dog”. That is how I occasionally feel about GitHub projects – one could be a solid, multi-contributor powerhouse churning out release after release, while another is a great idea stalled when its single contributor got distracted by something new and shiny. The thing is that you need to inspect all the project artifacts carefully to tell which is which – single contributors can mount a powerful and amplified presence on GitHub (until they leave, that is).

This issue comes with the territory in Open Source (considering how much you pay for all the hard work of others), but nowhere was it more acute than in LinkedIn’s choice of templating library in 2011. In an often quoted blog post, the LinkedIn engineering team embarked on a quest to standardize on one templating library that could run both on the server and on the client, and also check a number of other boxes they put forward. Out of several contenders they finally settled on Dust.js. This library uses curly braces for pattern substitution, which puts it in a hipster-friendly category alongside Mustache and Handlebars. But here is the rub: while the library ticks most of LinkedIn’s self-imposed requirements, its community support left much to be desired. The sole contributor seemed unresponsive.

Now, if it were me, I would have moved on, but LinkedIn’s team decided they would not be deterred. In fact, it looks like they rather liked the fact that they could evolve the library as they learned by doing. The problem was that committer rights work by bootstrapping – only the original committer could accept LinkedIn’s changes, which apparently didn’t happen with sufficient snap. Long story short, behold the ‘LinkedIn Fork of Dust.js’, or ‘dustjs-linkedin’ as it is known to NPM.

I followed this story with mild amusement, shaking my head, until the end of 2013, when PayPal saw the Node.js light. As part of their Node.js conversion, they picked LinkedIn’s fork of Dust.js for their templating needs. This reminded me of how penguins jump into water – they all wait until one of them jumps, then they all follow in quick succession. Channeling my own inner penguin, I decided the water was fine and started playing with dustjs-linkedin myself.

This is not my first foray into the world of Node.js, but in my first attempt I used Jade, which is just too DRY for my taste. Being a long-time Eclipse user, I just could not revert to the command line, so I resorted to a collection of Web development tools, then added Nodeclipse, mostly for the project creation wizard and the launcher. Eclipse is very important to me because it answers one of the key issues plaguing Node.js developers who go beyond ‘Hello, World’ – how do I control and structure all the JavaScript files (incidentally one of the hard questions that Zef Hemel posed in his blog post on blank canvas projects).

Then again, Nodeclipse is not perfect, and dustjs-linkedin is not one of the rendering engines they cover in the project wizard. I had to create an Express project configured for Jade, turn around and delete Jade from the project, and use NPM to install dustjs-linkedin locally (i.e. in the project tree under ‘node_modules’), like so:

[Screenshot: the Nodeclipse project tree, with dustjs-linkedin under node_modules]

After working with Nodeclipse for a while, and not being able to use their Node launcher (I had to configure my own external tool launcher), I am now questioning its value, but at least it got the initial structure set up for me. Now that I have a good handle on the overall structure, I could create new Node projects myself, so general purpose Web tooling with HTML, JSON and JavaScript editors and helpers may be all you need (of course, you also need to install Node.js and NPM, but you would need to do that in all scenarios).

Hooking up dustjs-linkedin also requires consolidate.js, in a way that is a bit puzzling to me, but it seems to work well, so I didn’t question it (the author is TJ of Express fame, and exploring the code I noticed that dustjs-linkedin is actually listed as one of the recognized engines). The changes required: pull in dustjs-linkedin and consolidate, declare dust as a new engine, and map it as the view engine for the app:

var express = require('express')
  , routes = require('./routes')
  , dust = require('dustjs-linkedin')
  , cons = require('consolidate')
  , user = require('./routes/user')
  , http = require('http')
  , path = require('path');

var app = express();

// all environments
app.set('port', process.env.PORT || 3000);
app.set('views', __dirname + '/views');
app.engine('dust', cons.dust);
app.set('view engine', 'dust');

That was pretty painless so far. We have configured our views to be in the /views directory, so Dust files placed there can be directly found by Express. Since Express is an MVC framework (although much lighter than what you are normally used to in the JEE world), the C part is handled by routers, placed in the /routes directory. Our small example will have just a landing page rendered by /views/index.dust and the controller will be /routes/index.js. However, to add a bit of interest, I will toss in block partials, showing how to create a base template, then override the ‘payload’ in the child template.

We will start by defining the base template in ‘layout.dust’:

<!DOCTYPE html>
<html>
  <head>
    <title>{title}</title>
    <link rel='stylesheet' href='/stylesheets/style.css' />
  </head>
  <body>
    <h1>{title}</h1>
    {+content}
    This is the base content.
    {/content}
  </body>
</html>

We can now include this template in ‘index.dust’ and define the content section:

{>layout/}
{<content}
<p>
This is loaded from a partial.
</p>
<p>
Another paragraph.
</p>
{/content}

We now need to define the index.js controller in /routes, because the controller invokes the view for rendering:

/*
 * GET home page.
 */
exports.index = function(req, res) {
   res.render('index',
           { title: '30% Turtleneck, 70% Hoodie' });
};

In the code above we are sending the result of rendering the view using Dust to the response. We specify the collection of key/value pairs that will be used by Dust for variable substitution. The only part left is to hook up our controller to the site path (our root) in app.js:

app.get('/', routes.index);
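For completeness, the generated app.js typically ends by creating the HTTP server (the same pattern used in the clustering example elsewhere in this blog):

http.createServer(app).listen(app.get('port'), function() {
   console.log('Express server listening on port ' + app.get('port'));
});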

And that is it. Running our Node.js app using the Nodeclipse launcher will make it start listening on localhost port 3000:

[Screenshot: the rendered index page at localhost:3000]

So far so good, and it is probably best to stop while we are ahead. We have a Node.js project in Eclipse configured to use Express and LinkedIn’s fork of Dust.js for templating, and everything is working. In my next installment, I will dive into Dust.js in earnest. This is going to be fun!

© Dejan Glozic, 2014

Book Review: RESS Essentials

[Book cover: RESS Essentials, Packt Publishing]

It is with mixed emotions that I have decided to start reviewing books as part of my blog activity. Our industry is so fast-moving, so ephemeral, that books seem to fight a losing battle to stay relevant the day they are published, never mind 6 months or a year down the road. Books on concrete products are very often little more than the missing manuals (so much so that O’Reilly actually has a ‘Missing Manuals‘ series, with a ‘The book that should have been in the box’ tagline – burn!). Just check the new titles – ‘Windows 8.1’, ‘HTML5’, ‘iPad’!!

A somewhat better fate is reserved for books that describe an approach or technique rather than a particular platform, language or library. Sound principles tend to outlive the actual building blocks used to implement them. The book I am covering in this review falls into that category, since it addresses a fairly new approach to wrestling with the explosion of Web clients.

The book’s title is ‘RESS Essentials’ by Joanna Krenz-Kurowska and Jerzy Kurowski, brought to us by Packt Publishing. I already wrote about RESS in RESS to the Rescue. In a nutshell, responsive Web design (or RWD) is used to make a Web application adapt to the various aspects of the client, reformatting the experience to best fit each client in question (be it desktop, tablet or phone). RESS (REsponsive design + Server Side) goes further by involving the server. The rationale is that CSS can hide or reformat content depending on the client, but everything is still sent by the server, whether it will be needed or not. RESS addresses the performance side of responsive design by choosing what to send before any formatting decisions are made on the client.

Even in Internet years, RESS is a young technique, and a very good illustration of that is that the authors felt the need to devote most of chapter 1 to controversies surrounding RESS. To put it simply, not everybody is convinced that RESS is a good idea. Nevertheless, enough people are (including the godfather of RESS Luke Wroblewski, incidentally another Pole – in a bit of Polish trivia, my photographer told me that his last name means ‘sparrow-like’). As such, the rest of the book assumes you have been sufficiently convinced RESS is a good and useful thing and you want to get on with it.

Chapter 1 also brought one of the unexpected insights about RESS and responsive design: it can actually go to the other side of the client spectrum. When it comes to RWD, most of the focus is on how your site reacts to non-desktop clients. However, for eons desktop design assumed a certain maximum width and happily ignored the side bands, making browser window maximization futile. On high quality, large monitors these side bands seem like a waste of real estate (compared to how desktop applications or games use it). RESS and responsive Web design can unlock the full potential of your high end monitor – there is no reason why we cannot add rules to keep adding more content as you maximize your browser on a 27” or 30” monitor.

In chapter 2, the authors pay homage to the client side by diving into a real world example using Gridpak for the responsive grid and Twitter’s Bootstrap for everything else.

The real action starts in chapter 3, where the focus moves to the server – the key value-add of the RESS technique. In order to determine what to send to the client, server-side device detection is used. The two most popular solutions discussed in the chapter are WURFL and DeviceAtlas. There is a problem – these services are not available for free, or if they are, the free (or even cheap Cloud-based) version is not a viable option for anything but a test site. For completeness, the authors cover some Open Source alternatives such as YABFDL (also known as Dave Olsen’s Detector) – more modest in its database but still usable, free, and permissively licensed (with coverage that keeps up with new devices through the use of the modernizr.js library).

Chapter 4 puts server side detection into action. Different markups for different clients (to optimize for ability, support for modern features and JavaScript), different media sizes based on screen width, and manual selection of page versions. This chapter mostly deals with the first of the three goals, with plenty of code snippets using both WURFL and Dave Olsen’s Detector.

Chapter 5 moves on to dealing with image resolution on the server. Images comprise a large portion of the overall site payload, and sending images that are too large or of unnecessarily high resolution is costly both performance-wise and in eating into the monthly bandwidth allowance of mobile users. The RESS solution for handling image sizes and resolution is in line with the overall approach and has some interesting properties compared to the known alternatives.

As noted, all this effort around sending right-sized images to the client is at least partially driven by performance considerations. In chapter 6, the authors note that client screen size is not the only parameter to take into account – connection bandwidth is just as important. Unfortunately, as of today there are no media queries for connection quality and bandwidth that could be used to tailor image size and compression. In their absence, the authors offer a number of techniques to optimize images or reduce the need to use them in the first place.

One thing I really liked in this chapter was a graph showing that with the reduction of screen size (and with the RESS solution for images enabled), the size of CSS and JavaScript files started to overtake images, to the point where images were only a third of the CSS/JavaScript payload at iPhone portrait resolution (320px). This was a good reminder that RESS needs to be combined with other good practices if performance is a concern. I particularly liked the tidbit that the average webpage reached 1585 KB on September 15, 2013. That is a lot of KBs to send to any client, particularly mobile phones, and it underlines the need for techniques such as RESS to improve Web performance.

Changing gears, chapter 7 switches back to the client and focuses on using and extending jQuery and Bootstrap to make your site responsive when dealing with elements such as tables and carousels. jQuery has essentially become the equivalent of gravity for JavaScript libraries (i.e. you cannot avoid it and must play nice with it), and Bootstrap is insanely popular in its third incarnation. While not necessarily limited to RESS (the code in this chapter could be easily used in a client-only RWD solution), we should not forget that RESS is a combination of server and client side techniques for responsive design, and the examples in this chapter are definitely useful for Web developers seeking responsive design solutions.

I am a bit puzzled by chapter 8, which introduces REST and combines it with RWD. It is not strictly RESS, in that it simply uses Ajax to call REST services. There is nothing wrong with what is proposed in this chapter, but REST is another kind of ‘gravity’ for many teams, including mine, and tossing it here on the pile with RESS and RWD seems odd, particularly because it involves PHP in the code examples. Since this was the last chapter in the book, I was left wanting a chapter more fit for wrapping it all up and reinforcing the key messages of the book (an epilogue, as it were).

In the end, where does that leave me, the reader and the reviewer? I would say that RESS Essentials is part of the new breed of ‘mini-books’, considering its size (120 pages including the index). It is definitely not the first of its kind – Responsive Web Design by Ethan Marcotte is 143 pages, and Mobile First by Luke Wroblewski is 123 pages. This class of books covers areas and topics that are too complex to be covered in one blog post or article, but perhaps not complex enough to require a book that can double as a weapon and will consume years of the author’s life. The format definitely allows hot topics to be covered in a time frame that at least gives them a fighting chance of having some decent use before the inevitable progress renders them obsolete.

I think the authors did a good job covering the ground on the topic of RESS, and I definitely learned a lot by reading the book. It is part of the ‘Community Experience Distilled’ series, and it fits the description – the authors provide a number of examples and code snippets that look like something they arrived at through personal experience on real world projects. Whether RESS itself has staying power remains to be seen. I am concerned about the limited choices when it comes to server side agent detection (at least in the Open Source world), so Dave Olsen’s Detector effort should be commended.

When it comes to code snippets, I was more interested in the general concepts than in the server side code examples, because the latter were universally in PHP. There is nothing wrong with PHP, but I would prefer examples in Java or, even better, in JavaScript using Node.js. Using Java or Node.js would also work better with the final chapter about combining REST with RESS.
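To show what I mean, here is a back-of-the-envelope sketch of the RESS idea in Express – entirely mine, not from the book, with a deliberately naive user agent test standing in for a real detection library:

// Classify the client on the server, then pick a template variant.
app.use(function(req, res, next) {
   var ua = req.headers['user-agent'] || '';
   res.locals.deviceClass = /Mobi|Android/i.test(ua) ? 'mobile' : 'desktop';
   next();
});

app.get('/', function(req, res) {
   // Renders views/index-mobile.dust or views/index-desktop.dust
   res.render('index-' + res.locals.deviceClass, { title: 'RESS sketch' });
});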

As far as the format and language are concerned, I have two minor nits to pick. While the book is generally well written, Joanna’s and Jerzy’s Polish background occasionally creeps into their sentence structure. Not being a native speaker myself, I cannot quite put my finger on how I was able to notice – I just felt it as I was reading. Another mild nit is that the PDF version of the eBook results in too small a font, and when I tap to zoom in, the text is single-spaced, making it less than perfect for reading on an iPad. On the flip side, single spacing made for faster vertical scrolling. YMMV if you use a different reader or format.

That’s it for my first review – next on the docket is Smashing Magazine Book #4 – now that’s a book you can use for self defense if need be!

© Dejan Glozic, 2014

Rocket Science vs System Integration

Integration, by certified su from the Atherton Tablelands, Queensland, Australia (CC-BY-2.0), via Wikimedia Commons

Note: In case you missed it, there is still a chance to get a free eBook of RESS Essentials from Packt Publishing. All you need to do is check the book out at the Packt Publishing web site, drop a line about the book in the comments section of my article with your email address, and enter a draw for one of the three free e-books. And now, back to regular programming.

When I was in the army, my mother told me that she kept noticing soldiers on the streets. I mean, they didn’t suddenly appear; they were always there, but she only started noticing them because of me. Once I returned from my regular 12-month service, the soldiers around her receded into oblivion again.

This is a neat trick that you can try yourself. Have you noticed that you start spotting other cars like your own on the road as soon as you buy one, even though you could not have cared less about them before (unless you really, really wanted that car, in which case you were spotting it as an object of desire, anticipating the purchase)? If you care to learn another useless phrase, the scholarly term for this phenomenon is perceptual vigilance.

When we were building web apps in the 00’s, a number of technologies were considered a ‘solved problem’. You get yourself an Apache server, install it on a Linux box, strap on Tomcat for servlets (all right, you can use PHP instead) and add MySQL for storage. Of course, there were commercial versions that offered, well, ‘bigger’, but the basic principles were the same. Nobody was excited about serving pages, or storing data, or doing anything commoditized that way. The action was in solving the actual end-user problem, and the sooner you could assemble a system out of off-the-shelf components and get to the problem solving part, the better.

Well, once you need to serve millions upon millions of users, strike the ‘solved’ part. You start hitting the limits of your components, and you need to start paying attention to the things you stopped noticing years ago. That’s what is now happening across the software development world. The ‘solved’ problems are being solved again, from the very bottom. Web servers, data storage, caching, page templates, message queues – everything is being revisited and reopened. It is actually fascinating to watch – like a total gutting and rebuilding of a house whose owners grew tired of the basement flooding every time it rains. It is as if millions of developers suddenly noticed the components that they were taking for granted all these years and said ‘let’s rebuild everything from scratch’.

When you look at a Node.js ‘Hello, World’ server example, it is fascinating in its simplicity – until you consider how much stuff you need to add to reach the level of functionality that is actually needed in production. You are really looking at several dozen JS modules pulled in by NPM. Using the house paradigm again, these modules are replacing all the stuff you had in your house before you reduced it to a hole in the ground. And in this frenzy of solving the old problems in the new world, I see two kinds of approaches – those of idealists and those of pragmatists.

There is a feeding frenzy on GitHub right now to be the famous author of a great new library or framework. To stay in the Node.js neighborhood, the basement is taken by @ryah, most of the walls were solidly put up by @tjholowaychuk (although some new walls are being added), and so on. Idealists fear that if they do not hurry, the house will be rebuilt and there will be no glory left in the component business – that they will need to go back to being pragmatists, assemblers of parts.

For some people that is almost an insult. Consider the following comment by Bhaskar Ghosh, LinkedIn’s senior director of data infrastructure engineering, in a recent article by Gigaom:

Ghosh, well, he shot down the idea of relying too heavily on commercial technologies or existing open source projects almost as soon as he suggests it’s a possibility. “We think very carefully about where we should do rocket science,” he told me, before quickly adding, “[but] you don’t want to become a systems integration shop.”

Of course, Bhaskar is exaggerating to make the point – LinkedIn is both a heavy provider and a heavy consumer of Open Source projects. Consider this (incomplete) list, for example – they are not exactly shy about using off-the-shelf code where it makes sense. Twitter is no stranger to Open Source either – they share the use of Netty with LinkedIn, along with many other (albeit heavily modified) open source projects. Moreover, Twitter shares the desire to give back to the Open Source world – my team, together with many others, is using their awesome Bootstrap toolkit, and their modifications are shared with the community. You can read all about it at the Twitter Open Source page, run by our good friend @cra.

Recently, PayPal switched a lot of their Java production servers to Node.js and documented the process. The keyboards are still hot from the effort, and they are already open sourcing many of the libraries they wrote along the way. They even joined forces with LinkedIn on the fork of the Dust.js templating library that the original owner abandoned (kudos to Richard Ragan from PayPal for writing a great tutorial).

There are many other examples that paint a picture of a continuum. Idealists see this as a great time to have their name attached to a popular Open Source project and cement their place in the source code Hall of Fame – it’s 1997 all over again, baby! On the other hand, startups have no problem being pragmatists – it is very hard to build something from scratch even with off-the-shelf Open Source software; who has time to write new libraries when existing ones are good enough to drive your startup into the ground 10 times over?

In the space between the two extremes, companies with something to lose and with skin in the game stride the cautious middle road – use what works, modify what ‘almost’ works, and write your own when no existing software fits the bill, fully aware of the immediate and ongoing cost. And the grateful among them give back to the Open Source community as a way of saying ‘thanks’ for getting a boost when they themselves were but a fledgling startup.

© Dejan Glozic, 2014

Of Mice and Men


The best laid schemes of mice and men
Often go awry.

Robert Burns, 1785.

Dear readers of this blog, you may have noticed a lapse in the rhythm that I faithfully followed from the very beginning (a new post every Tuesday, more recently switching to Monday). You may have inferred that I am ‘recharging’, am away, or am simply taking a holiday break.

Far from it. I had a plan to look back at the blog and the topics I have covered over the last six months. For example, looking back at the post on databases, I have second thoughts about using an SQL database in a way that is a better fit for something like MongoDB. Our practical experience is that schema changes in a rapidly evolving environment are a real drag, and being able to evolve the data on demand in a schema-less database would be really beneficial to us.

Other posts are holding up. We are as averse to Extreme AJAX as ever, but we are now seeing use cases where some client-side structure (possibly using Backbone.js) would be beneficial, particularly when using Web Sockets to push updates to the client. We are still allergic to frameworks and their aggressive takeover of your life and control over your code. And we are still unhappy with the horrible hacks required to implement client side templating. One day, when Web Components are supported by all the modern browsers, we will remember this time as an ugly intermezzo when we had to resort to a lot of JavaScript in lieu of the first class Web composition that Web Components provide (you can bookmark this claim and come back to laugh at me if Web Components do not deliver on their promise).

As I said, I planned to cover all that, and had also lined up a holiday project. We had recently implemented nice dashboards for our needs in Jazz Platform and Jazz Hub, and I was curious how hard it would be to re-implement the server side using Node.js and MongoDB (instead of JEE and an SQL DB). The client side is already fine, using jQuery, Bootstrap and Require.js, and does not need to change. I also wanted to see how hard it would be to configure Express.js to use the LinkedIn fork of the Dust.js templating library. The PayPal team had a lot of fun recently moving their production applications to exactly that stack, and I was inspired by Bill Scott’s account of the effort.

The plan was to write the blog post on Sunday the 22nd, and play with Node.js over the next two weeks. On Saturday the 21st, eastern US and Canada were hit by a nasty ice storm. The storm deposited huge amounts of ice on the electrical wires and trees in midtown Toronto, where I live. Ice is very heavy, and after a while huge mature trees that are otherwise great to have around started giving up, losing limbs and knocking down power lines in the process. By 3am on Sunday the 22nd we had lost power, along with about a million other people in Toronto at the peak.

Losing power in a low rise means no heat, no Internet or cable, no cooking and no hot water (add ‘no water’ for a high rise). We conserved our phones, following the progress of Toronto Hydro power restoration via the #darkTO hashtag on Twitter (everything else was useless). The Toronto Hydro outage map crashed under load, resulting in Toronto Hydro posting the map via Twitter picture updates (kudos to Twitter’s robust infrastructure – more than I can say for Toronto Hydro’s servers). After a while, we drained our phones. I charged my phone from a fully charged MacBook Pro (an operation I was able to repeat twice before the MacBook Pro lost its juice). We had four laptops to use as door stops/phone chargers. I could have read some eBooks on a fully charged iPad, but somehow was not in the mood. Dinner was cold sandwiches by candle light. Not as romantic in a cold room.

By Sunday night, the temperature in the apartment had dropped to 19.5C (that’s 67F for my American friends). We slept fully clothed. On Monday morning we packed up and went to the IBM building in Markham, which had power, to shower, get some core temperature back, eat a warm meal and charge all the devices. We also used the opportunity to book a hotel room in downtown Toronto (no big trees to knock down power lines – yay for the big soul-less downtown buildings). When we went back home to pack, the room temperature had dropped to 18C. The temperature outside fell to a bitterly cold -10C, heading to -14C overnight.

Overnight on Monday, the power was restored. We returned on Tuesday morning, only to find Internet and cable inoperative. Estimated time for repair – 22 hours. In addition, our building is somewhat old and its hot water furnace does not like going from zero to full blast quickly, resulting in a temperature 2-3 degrees lower than usual. It was only a matter of time until I succumbed to the common cold.

Internet and cable were restored on Wednesday, four days after the outage started. Over the last couple of days the winter outside let up a bit, allowing the ice on the trees to melt and the furnace to bring the temperature back to normal levels. My cold is on the downswing, enough for me to write this blog post. I will still need to wait for the cold-induced watery eyes to clear before taking the planned photos for my 2014 Internet profiles.

Why am I writing all this? Just to show you that the real takeaway message for 2013 is not the horrible first world problems we grapple with daily (should I use Node.js or not, should I use MongoDB or CouchDB or just stay with RDBs), but that most of us are privileged to live with the trappings of civilization that let us not worry about warm water, heat, food and clean clothing. On Wednesday, when my daughter was seriously unhappy that the Internet was not back yet, I felt a bit like ‘the most ungrateful generation’ from the Louis CK bit (“Everything is amazing and nobody is happy”). As I am writing this, there are still fellow Torontonians without power. Their houses are probably icicles by now. My heart goes out to them, and I hope they get power back as soon as possible.

As for myself, I can tell you that the prospect of finding yourself in callback hell when writing Node.js code didn’t mean much while I was sitting in a cold room, lit by candle light, frantically refreshing the #darkTO thread as my battery slowly drained. A lesson in humility, and a reason to count your blessings, delivered at the time of year when we normally see re-runs of It’s a Wonderful Life on TV (if you still watch TV, that is).

Therefore, all of you reading this, have a great 2014! If you are in a warm room, have clean clothes and a warm meal, and can read this (i.e. your wifi is operational), your life is amazing and you are better off than many of your fellow human beings. Now go back to the most pressing topics of your lives, like this one:

© Dejan Glozic, 2013