Socket.io and the Business of Open Source

Gerard van Honthorst: The Matchmaker (1625)
Gerard van Honthorst: The Matchmaker (1625)

In one of my previous posts on the topic of risk, I mentioned studies that show that investor tolerance for the stock market risk is much higher when the market is rising than when it is falling like a knife. It turned out that the risk is felt as something abstract until you actually start losing real money, at which point it becomes painfully real and risk appetite-suppressing. I was reminded of this phenomenon as I was researching material for this blog post, and unintentionally encountered the risk of depending on Open Source projects.

The post was supposed to be another one in a series about our journey into full-on Node.js development. I wanted to demonstrate one of the key selling points of Node – the ability to handle a lot of I/O in a non-blocking fashion, allowing the Node server to push simultaneous messages to a large number of clients using one of the push technologies. As with everything Node, there are multiple libraries for this task but it is hard to miss that Socket.io has been immensely popular among the choices (here is a nice summary of the server push choices for Node.js).

Allright then, let’s install socket.io via NPM and start coding. Not so fast – which version? And herein starts a tale full of twists and suspense. If you go to Socket.io Web site, it is all about version 0.9. Aha, but if you google Socket.io, you will find presentations of the type “Why Socket.io 1.0”, “What is new in Socket.io 1.0”, “How Socket.io 1.0 Will Be Unbelievably Awesome”, all around mid 2012 or so. Then in 2013 – silence. Everybody assumed it was a matter of days until 1.0 appeared, yet we are still waiting.

It looks like version 1.0 has been a long time coming, which is fine by itself – I have noticed that Node community favors features and quality over schedules (all these comments equally apply to Node version 0.12). However, there is precious little about the upcoming version at the project’s GitHub home page. This has caused a lot of discussion in the project itself, as well as a panicky Google Groups thread Is Socket.io Dying?

Well, it appears that rumors of Socket.io’s death have been greatly exaggerated – Mr. @rauchg himself appeared to assuage palpitating followers and assure them that all test suites for v1.0 are passing and the good news are coming very soon. This was in December 2013. We are in February 2014 and still waiting. The lower level Engine.io module is supposed to power Socket.io, and apparently that’s where most of the action is these days.

The frustrating wait for a long anticipated product is something I have experienced recently with Apple Logic Pro X. Apple’s replacement for the Logic Pro 9 was several years in coming, and the rumor mill was crazy. People were latching to every whisper on the InterWeb, every Mac Rumor they can find, publicly renounced Apple, threatened to switch to Cubase or Pro Tools. The parallels are almost uncanny: uncertainty about the next version, tight-lipped product owners leaving a lot of room for speculation, FUD, rumors of near death, reassurances by the product owners that they are ‘hard at work on a new version’.

In the end the new version did arrive, it was unbelievably awesome and everybody was too busy playing with it and loving Apple again to remember how crazy they were only a short while ago. Most likely this will be Socket.io story – @rauchg and the team will be forgiven as soon as NPM starts serving v1.0 modules.

What did we learn from this, if anything? Google Groups comments contained reminders that “Socket.io is Open Source, after all”, and that “we should be grateful for free software” and “they don’t owe you anything, man” etc. All true, but I take an issue with “free software”. Open Source software is not very different from paid software – there is always a reason why somebody writes it, and the fact that I am not asked to pay for the software upfront does not mean that a business model does not exist and that a transaction is not happening. It is just not as obvious as with payware.

Open Source is free, but it is not a hobby for most people. In the old days, releasing code into Open Source was a way to undercut a competitor by commoditizing a function or a product. More recently, contributing to Open Source became a way to grow your personal brand and be highly employable in the social economy. One of these days somebody will make the equivalent of the Klout number that measures your footprint in Open Source projects (you can call it ‘Codex’ but if you do, I want some stock options if you build a startup around this idea). Therefore, while Guillermo Rauch wrote Socket.io ‘for free’, social advertisers can calculate to a dollar the value of the tagline “Creator of socket.io” in his Twitter profile. Otherwise, how on Earth can Snapchat that makes zero revenue be valuated at $3B by Facebook, and then turn down that offer? Not seeing the monetization path right away does not mean it does not exist.

Apart from individual contributors and their public brand, whole companies build up their street cred by contributing to Open Source. In a recent article by ReadWrite (thanks for passing along, @cra!), companies such as Google, Twitter and LinkedIn were open about it. In fact, LinkedIn went as far as to claim that ‘Open Source is part of their recruiting strategy’ because their engineering blog and the code that is discussed there is essentially a replacement for the HR lady (unless you consider Veena Basavaraj a member of LinkedIn HR – it definitely seemed so when I researched their fork of Dust.js: I felt the urge to work for LinkedIn, and I wasn’t even looking).

In a sense, GitHub is becoming the software industry equivalent of that matchmaker lady in the painting above, matching companies on a lookout for talent and developers putting their best lines of code forward. Your ‘free’ GitHub project may speak about you more than your LinkedIn resume. For some developers, Open Source GitHub projects ARE their resumes, cutting the glib ‘team player’ and ‘fast learner’ claptrap of traditional resumes that nobody reads and going straight to the code, where the tire meets the road. It is all great that you work well in response to challenge, but why don’t you fix those 150 issues that are gathering dust in your project? Will you be similarly neglectful if we hire you?

In that light, I don’t think it is too much to ask to get the Socket.io v1.0 out the door already, particularly after so much PR about it. Every time we tweet about it, use it, write articles and blog posts, talk to our friends about it, we build up the Socket.io team’s reputation they can take all the way to the bank, so I think the transaction is fair and we are even. Along that line (and to tie back to my reference to risk tolerance, in this case risk of building production code on top of Open Source software), I don’t think we can just wave our hand and say ‘it comes with the territory of using software you didn’t pay for’. As I tried to prove here, I DID pay for it and continue to pay, just in a more convoluted and delayed transaction.

Less pontificating, more coding. Next week I will go back from this tangent to write about pushing events from a Node.js server using whichever Socket.io version is available at that time. Maybe this blog post will code-shame them into releasing v1.0.

Meanwhile, and talking about the risk of depending on Open Source code, not sure what to do with node-optimist, beside recommending against using modules with pirates on their home pages:

© Dejan Glozic, 2014

Rocket Science vs System Integration

By certified su from The Atherton Tablelands, Queensland , Australia (Integration) [CC-BY-2.0 (http://creativecommons.org/licenses/by/2.0) or CC-BY-2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

Note: In case you missed it, there is still a chance to get a free eBook of RESS Essentials from Packt Publishing. All you need to do is check the book out at Packt Publishing web site, drop a line about the book back at the comments section of my article with your email address and enter a draw for one of the three free e-books. And now, back to regular programming.

When I was in the army, my mother told me that she kept noticing soldiers on the streets. I mean, they didn’t suddenly appear, they were always there, but she only started noticing them because of me. Once I returned from my regular 12 month service, soldiers around her receded into oblivion again.

This is a neat trick that you can try yourself. Have you noticed that you start spotting other cars like your own on the road as soon as you buy it, even though you didn’t care less about them before (unless you really, really wanted it, in which case you were spotting it as an object of desire, anticipating the purchase)? If you care to learn another useless phrase, the scholarly term for this phenomenon is perceptual vigilance.

When we were building web apps in the 00’s, a number of technologies were considered a ‘solved problem’. You get yourself an Apache server, install it on a Linux box, strap on Tomcat for servlets (all right, you can use PHP instead) and add MySQL for storage. Of course, there were commercial versions that offered, well, ‘bigger’ but the basic principles were the same. Nobody was excited about serving pages, or storing data, or doing anything commoditized that way. The action was in solving the actual end-user problem, and the sooner you can assemble a system out of the off-the-shelf components and get to the problem solving part, the better.

Well, once you need to serve millions upon millions of users, strike the ‘solved’ part. You start hitting the limits of your components, and you need to start paying attention to the things you stopped noticing years ago. That’s what is now happening across the software development world. The ‘solved’ problems are being solved again, from the very bottom. Web servers, data storage, caching, page templates, message queues – everything is being revisited and reopened. It is actually fascinating to watch – like a total gutting and rebuilding of a house whose owners grew tired of the basement flooding every time it rains. It is as if millions of developers suddenly noticed the components that they were taking for granted all these years and said ‘let’s rebuild everything from scratch’.

When you look at a Node.js ‘Hello, World’ server example, it is fascinating in its simplicity until you consider how much stuff you need to add to get to the level of functionality that is actually needed in production. You are really looking at several dozens of JS modules pulled in by NPM. Using the house paradigm again, these modules are replacing all the stuff you had in your house before you reduced it to a hole in the ground. And in this frenzy of solving the old problems in the new world I see two kinds of approaches – those of idealists and pragmatists.

There is a feeding frenzy on GitHub right now to be the famous author of a great new library or framework. To stay in the Node.js neighborhood, the basement is taken by @ryah, most of the walls are solidly put by @tjholowaychuk (although there are some new walls being added), and so on. Idealists fear that if they do not hurry, the house will be rebuilt and there will be no glory left in the component business – that they will need to go back to be pragmatists, assemblers of parts.

For some people that is almost an insult. Consider the following comment by Bhaskar Ghosh, LinkedIn’s senior director of data infrastructure engineering in a recent article by Gigaom:

Ghosh, well, he shot down the idea of relying too heavily on commercial technologies or existing open source projects almost as soon as he suggests it’s a possibility. “We think very carefully about where we should do rocket science,” he told me, before quickly adding, “[but] you don’t want to become a systems integration shop.”

Of course, Bhaskar is exaggerating to prove the point – LinkedIn is both a heavy provider and a consumer of Open Source projects. Consider this (incomplete) list for example – they are not exactly shy using off the shelf code where it makes sense. Twitter is no stranger to Open Source as well – they share the use of Netty with LinkedIn and many other (albeit heavilly modified) open source projects. Nevertheless, Twitter shares the desire to give back to the Open Source world – my team together with many others are using their awesome Bootstrap toolkit, and their modifications are shared with the community. You can read it all at the Twitter Open Source page, run by our good friend @cra.

Recently, PayPal has switched a lot of their Java production servers to Node.js and documented the process. The keyboards are still hot from the effort and they are already open sourcing many of the libraries they wrote for the effort. They even joined forces with LinkedIn on the fork of the Dust.js templating library that the owner abandoned (kudos to Richard Ragan from PayPal from writing a great tutorial).

There are many other examples that paint a picture of a continuum. Idealists see this as a great time to have their name attached to a popular Open Source project and cement their place in the source code Hall of Fame – it’s 1997 all over again, baby!. On the other hand, startups have no problem being pragmatists – it is very hard to build something from scratch even with off the shelf Open Source software – who has time to write new libraries when existing ones are good enough to drive your startup into the ground 10 times over.

In the space between the two extremes, companies with something to lose and with skin in the game stride the cautious middle road – use what works, modify what ‘almost’ works, write their own when no existing software fits the bill, fully aware of the immediate and ongoing cost. And the grateful among them give back to the Open Source community as a way of saying ‘thanks’ for getting a boost when they themselves where but a fledgling startup.

© Dejan Glozic, 2014