We Don’t Have MongoDB – is CouchDB OK?

First Choice Colas, Wikimedia Commons, 2007, Erik1980

‘It can’t be helped,’ the saddlebum said. ‘What’s happened is, you’ve overloaded your analogizing faculty, thereby blowing a fuse. Accordingly, your perceptions have taken up the task of experimental normalization. This state is known as “metaphoric deformation”.’

 

Robert Sheckley, “Mind Swap”, 1966.

One of the Sci-Fi books that left a deep impression on my young mind was Mind Swap by Robert Sheckley (no, I didn’t read it in 1966, in case you were wondering – I am not that old). It introduced a number of hilarious concepts that were surprisingly applicable to the real world, and not just the crazy ones he created in the book. One of them was the aforementioned ‘metaphoric deformation’. According to Robert, it is a disease that often befalls interstellar travellers, where a mind overwhelmed by stimulation it cannot process and make sense of starts generating more bearable hallucinations.

I think about metaphoric deformation a lot these days, when the world of software development is changing at an unprecedented pace, and in multiple dimensions at once. In our development workflow, we create a set of roles for libraries, databases, languages and platforms. We understand that things have changed and are willing to replace A with B, but we are desperately trying to keep the old habits in place and just replace the players.

Case in point – the MEAN stack. The open source world was happy with the LAMP stack (Linux, Apache, MySQL, PHP). Facebook now sits on the world’s biggest mountain of PHP and stores precious big data in racks upon racks of MySQL because that’s what Mark reached for in his Harvard dorm room to create thefacebook.com (OK, I know they bolted Hack on top of PHP but still). Now that the world has changed unrecognizably, developers are searching for another four-letter stack lest they slip into a metaphoric deformation of their own.

And yet, that’s just a crutch. Let’s analyze what MEAN stands for:

  • M is for MongoDB – a go-to NoSQL database for MySQL withdrawal, it recommends itself mostly because of the SQL-like query language that makes porting manageable.
  • E is for Express.js – a middleware framework for Node.js that allows you to create the familiar world of server side MVC.
  • A is for Angular.js – because if MVC on the server is good, MVC on the client must be great, and the good people of Google have thought about everything so that you don’t have to (and if you do, it’s kind of hard to wrestle the control back anyway).
  • N is for Node.js – because awesome.

See what we have done? We just recreated our old world using the new tech. Very human, but you will soon realize that the old world is gone for good and we will have to do better.

Fearful for the fragile minds of their fellow citizens, the first engineers to experiment with cars simply set out to create ‘horseless carriages’. Replace a horse with an engine, keep everything else as it has always been since the dawn of time. Now, look around you – does that 10-air-bag, GPS-connecting, direct-fuel-injecting, microprocessor-controlled car that just whizzed by bear any resemblance to the horseless carriage?

Nowhere is this inclination to replace the parts but keep the structure more bizarre than in the current software world. I am not the first to notice – Forrest Norvell from New Relic has observed it in his presentation Beyond the MEAN Stack.

Forrest noticed that Node.js is all about modularity and freedom of choice. For every letter in the MEAN stack (except N, of course, let’s not get crazy here) there is an alternative. Instead of MongoDB, there are other NoSQL databases. Instead of Express.js, there is Hapi.js (created by Eran Hammer, a person I deeply admire ever since he left the OAuth 2.0 working group with a bang). Even the author of Express, the prolific TJ Holowaychuk, moved on to the next generation framework Koa that uses ES6 generators and looks very promising. Instead of Angular, there is Ember.js, or Backbone.js, or Knockout.js, or none at all (why do you think you absolutely need MVC on the client if some jQuery or even just vanilla JS can suffice?).

Adrian Rossouw dared to ask the right question – why the crazy stampede to MongoDB? There are many situations where another NoSQL database is a better fit, so why instinctively reach for the ‘MySQL of NoSQLs’, other than to keep yourself from slipping into metaphoric deformation?

Due to some legal issues (AGPL license of MongoDB server is giving our lawyers heartburn), we were slow to make that instinctive choice ourselves. Then IBM bought Cloudant, and it was an easy decision – use our very own NoSQL DB, and let somebody else manage it to boot. True master-master replication can’t hurt either.

Not so fast. In case you are not familiar, Cloudant is a hosted CouchDB fork with *Apache Lucene on top. Just looking at CouchDB itself, the goal was to do the key things well, not to help developers transition to the new world by dragging in the old habits. We were staring at map/reduce views feeling the anxiety of the New, looking for ways to make dynamic queries.

Turns out you can’t. That’s what Apache Lucene was there for – if you need to create ad hoc queries based on some computed constraints, you use Apache Lucene. Map/reduce views are for predefined queries – things you can craft in advance.
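To make ‘things you can craft in advance’ a bit more concrete, here is a toy, in-memory imitation of a view, written in TypeScript. It is not CouchDB code and not what our service runs: the document shape (`owner`, `label`) and every function name are made up for illustration. The point it tries to show is that the map function fixes the lookup key up front, so the only question you can ask later is the one you planned for.

```typescript
// Toy, in-memory imitation of a CouchDB-style view. Illustration only: the
// real database keeps the map function in a design document and maintains
// the index for you.

interface FilterDoc {
  _id: string;
  owner: string; // hypothetical field
  label: string; // hypothetical field
}

interface ViewRow {
  key: string;
  id: string;
  value: string;
}

function compareText(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// The map function is written ahead of time. It alone decides which key
// each document can later be looked up by (here: the owner).
function mapByOwner(
  doc: FilterDoc,
  emit: (key: string, value: string) => void
): void {
  emit(doc.owner, doc.label);
}

// Build the index by running the map function over every document,
// then sorting the emitted rows by key (ties broken by document id).
function buildView(docs: FilterDoc[]): ViewRow[] {
  const rows: ViewRow[] = [];
  for (const doc of docs) {
    mapByOwner(doc, (key, value) => {
      rows.push({ key, id: doc._id, value });
    });
  }
  return rows.sort(
    (a, b) => compareText(a.key, b.key) || compareText(a.id, b.id)
  );
}

// Querying is a lookup by the key chosen in the map function.
function queryByKey(view: ViewRow[], key: string): string[] {
  return view.filter((row) => row.key === key).map((row) => row.value);
}

// A reduce step in the spirit of a row count: number of rows per key.
function countPerKey(view: ViewRow[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const row of view) {
    counts.set(row.key, (counts.get(row.key) || 0) + 1);
  }
  return counts;
}

// Made-up sample data.
const sampleFilters: FilterDoc[] = [
  { _id: "f1", owner: "paul", label: "open bugs" },
  { _id: "f2", owner: "dejan", label: "my reviews" },
  { _id: "f3", owner: "paul", label: "builds" },
];

const filtersByOwner = buildView(sampleFilters);
console.log(queryByKey(filtersByOwner, "paul"));
console.log(countPerKey(filtersByOwner).get("dejan"));
```

Asking this index for ‘all filters whose label contains some word’ is not possible without writing another map function, which is exactly the wall we were staring at.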

Paul Klicnik, a developer from our team, spent a lot of time banging his head against the Cloudant wall, looking wistfully over into MongoDB land where he could have just formed a dynamic query using JavaScript. However, once he realized he could use Apache Lucene for that, things started looking up. We ended up with a very fast, solid implementation of a Node.js micro-service that uses the nano module to persist data in Cloudant.
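For a flavour of what an ad hoc query looks like on the Lucene side, here is a rough TypeScript sketch that turns constraints known only at run time into a Lucene query string and a Cloudant Search request path. This is not Paul’s code. The `Constraint` type, the function names, and the database, design document, index and field names are all hypothetical, and the sketch assumes a search index covering those fields has already been defined in the design document.

```typescript
// Sketch: build a Lucene query string from constraints computed at run time.
// All names here are hypothetical; nothing is taken from our actual service.

type Constraint =
  | { field: string; equals: string }
  | { field: string; min: number; max: number };

// Wrap a value in double quotes, escaping embedded quotes and backslashes.
function quotePhrase(text: string): string {
  return '"' + text.replace(/(["\\])/g, "\\$1") + '"';
}

// Each constraint becomes one clause; clauses are combined with AND.
// With no constraints, fall back to the match-everything query.
function toLuceneQuery(constraints: Constraint[]): string {
  if (constraints.length === 0) {
    return "*:*";
  }
  return constraints
    .map((c) =>
      "equals" in c
        ? `${c.field}:${quotePhrase(c.equals)}`
        : `${c.field}:[${c.min} TO ${c.max}]`
    )
    .join(" AND ");
}

// Path of a search request against an index that is assumed to exist
// already in the given design document.
function searchPath(
  db: string,
  designDoc: string,
  indexName: string,
  constraints: Constraint[]
): string {
  const q = encodeURIComponent(toLuceneQuery(constraints));
  return (
    `/${encodeURIComponent(db)}/_design/${encodeURIComponent(designDoc)}` +
    `/_search/${encodeURIComponent(indexName)}?q=${q}`
  );
}

// Made-up usage: the constraints could come from a user's filter form.
console.log(
  searchPath("tasks", "app", "byField", [
    { field: "owner", equals: "paul" },
    { field: "priority", min: 1, max: 3 },
  ])
);
```

The resulting path would be handed to whatever HTTP client or driver you use. The contrast with a view is that the combination of fields is decided per request, not baked into a map function beforehand.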

In Paul’s own words:

My experience is that map/reduce is a little tricky to get the hang of, but it’s very powerful. Now that I’ve experienced it for myself, I can say I like it a lot. I’m actually using a regular view with map/reduce for retrieving user filters, and it works great. There is something very simple and elegant about Couch. It’s focused on being very good in certain scenarios, and not trying to be a general purpose database which can solve every problem. I also like that the API is RESTful – I know it’s a small detail, but REST is already pretty natural for us, and it makes it easy to work with.

Apparently, Paul is firmly in the new world now – using CouchDB the way it was intended, not pretending it is an SQL database where you don’t have to create tables any more.

Of course, not everything is perfect:

Without the Lucene functionality, we would not be able to use CouchDB for the type of querying we need to do. I have mixed feelings about this. So far, I have had zero problems with Lucene and it performs great, but I’m uneasy about the fact that someone essentially shoe-horned in a major feature. I view Cloudant as a two headed monster – one head being normal CouchDB that behaves as expected, the other a weird appendage that seems to be behaving normally but you’re never quite sure if it’ll bite your head off. I’m being a bit dramatic, but I’m not fond of solutions where someone alters the functionality and attempts to make a product do something that it was never intended to do, ultimately compromising the system on the way. I realize that this is a purist view; maybe under the covers everything is technically sound.

I have added Paul’s second comment in case you interpreted this post as an ad for Cloudant. We are just finding our way in the new world and trying to use it as intended, and sometimes it gets a little weird. I guess that’s half the fun.

As I always say, “si fueris Rōmae, Rōmānō vīvitō mōre” (rough translation from Latin: “when in Rome, for the love of God stop eating at McDonald’s – there are like a million trattorias serving actual Italian pizza and stuff”). Stop searching for new stacks to recreate your familiar world and instead open your mind to new ways of looking at things.

Or even better – don’t create a single stack. For each element of the old stack, go through the several awesome choices available today, and take your pick based on suitability for the task at hand. Contractors don’t have one hammer, one saw and one drill set – why should you?

Oh, and one more thing – stop drinking caffeinated sugary drinks. Both of them.

 

*The original version of the article claimed Cloudant uses ‘Lucene Elastic Search’. This was a mistake, although Apache Lucene (definitely not Elastic, or Lucene Search) is involved, as per the company website.

© Dejan Glozic, 2014


16 thoughts on “We Don’t Have MongoDB – is CouchDB OK?”

  1. Hi Dejan-

    Great post, I’m glad you guys have found a way to get Cloudant to work for you. I agree that a simple drop-in replacement of the LAMP stack is not the right way to think about application development in this rapidly-changing tech environment.

    At Cloudant we’ve integrated a great deal of functionality under a single RESTful API. Describing Cloudant as a hosted CouchDB service and Cloudant Search as a “weird appendage” represents a fundamental misunderstanding of the Cloudant Data Layer.

    We began building our search product in late 2009 because it was (and still is) a popular customer feature request and is a natural fit for indexing and dynamically querying the JSON docs stored in Cloudant. In no way was Lucene “shoe horned” into Cloudant. We’ve built the Lucene system deep into the core database such that it is “equal” to the map/reduce system in Cloudant’s eyes. I promise you, our Lucene-based search will behave just as you expect and will not bite your head off.
    Best

    Alan Hoffman, Cloudant Co-founder

    1. Alan,

      Thanks for taking the time to reply. Admittedly Paul’s comments are not supported by any in-depth analysis, just a gut feeling, and gut feelings are notoriously unreliable :-). Your comments are reassuring and support Paul’s hunch that the way Lucene is hooked up is clean and solid. Anyway, it is working great for us and we have no complaints.

      All the best, Dejan

      1. “Cloudant Search is powered by Apache Lucene, the most popular open-source search library. ” is accurate and clear, no? 🙂

      2. the words ‘Lucene Search’ appear adjacently only in contexts that clearly don’t refer to a product name.

      3. could you say ‘Apache Lucene’ instead of ‘Lucene Search’, the former is the actual product name. thanks.

  2. Hah, I just found this post, and I was even quoted in it.

    Great article, and it is good to see other people questioning the logic.

    My personal stack is actually couchdb + elasticsearch, because the rivers in ES make this a no-brainer. I probably like this combination for many of the same reasons that mongo users like their query language.

    http://daemon.co.za/2012/05/elasticsearch-5-minutes
    http://daemon.co.za/2012/05/replacing-couchdb-views-with-elasticsearch

    1. Yeah, the personal choices are very often dictated by the bigger context. We found that couchdb on its own lacks the dynamic query feature, hence the pairing with elasticsearch in your case, or Apache Lucene in our case (through Cloudant – we get it for free – what’s not to like :-).

      I saw your articles – in fact I think that after reading them I had ‘Elastic Search’ stuck in my head long enough to make a mistake and claim Cloudant uses it as well, instead of Apache Lucene :-). Had to have some of Cloudant’s own people call me on it :-).

  3. Reblogged this on Oliver Cox and commented:
    A great post highlighting our natural tendency to think of new things in terms of old things.

    I guess the lesson here is to embrace new technology fully and understand what it’s good at before attempting to implement.

