
Now, my story is not as interesting as it is long.
Abe Simpson
One of the privileges (or curses) of experience is that you amass a growing number of cautionary tales with which you can bore your younger audience to death. On the other hand, knowing the history of slavery came in handy for Captain Picard to recognize that what the Federation was trying to do by studying Data was to create an army of Datas, not necessarily to benefit humanity. So experience can come in handy once in a while.
So let’s see if my past experience can inform a topic du jour.
AWT, Swing and SWT
The first Java windowing system (AWT) was based on whatever the underlying OS had to offer. The original decision was to ensure that the same Java program runs anywhere, necessitating a ‘least common denominator’ approach. This translated to UIs that sucked equally on all platforms, not exactly something to get excited about. Nevertheless, AWT embraced the OS, inheriting its shortcomings but also receiving its automatic improvements.
The subsequent Swing library took a radically different approach, essentially taking on the responsibility of rendering everything. It was ‘fighting the OS’, or at least side-stepping it by creating and controlling its own reality. In the process, it also became responsible for keeping up with the OS. The Eclipse project learned that fighting the OS is trench warfare that is never really ‘won’. Using an alternative system (SWT) that accepted the windowing system of the underlying OS turned out to be a good strategic decision, both in terms of the elusive ‘look and feel’, and for riding the OS version waves as they sweep in.
The 80/20 of custom widgets
When I was working on the Eclipse project, I had my own moment of ‘sidestepping’ the OS by implementing Eclipse Forms. Since browsers were not ready yet, I wrote a rudimentary engine that gave me text reflows, hyperlinks and images. This widget was very useful when mixed with other normal OS widgets inside the Eclipse UI. As you could predict, I got the basic behavior fairly quickly (the ’80’ part). Then I spent a couple of years (with help from younger colleagues) doing the ‘last mile’ (the ’20’) – keyboard support, accessibility, BIDI. It was never ‘finished’, never quite as good as a ‘real’ browser, and not nearly as powerful.
One of the elements of that particular custom widget was managing the layout of its components. In essence, the container managed a collection of components, but the layout of those components was delegated to a layout manager that could be set on the container. This is an important characteristic that will come in handy later in the article. I remember the layout class as one of the trickiest to get right and fully debug. After it was ‘sort of’ working correctly, everybody dreaded touching it, and consequently forgot how it worked.
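The container/layout-manager split described above can be sketched in a few lines of JavaScript. This is an illustrative sketch only – the names (Container, RowLayout, doLayout) are hypothetical, not the actual Eclipse Forms or SWT API:

```javascript
// The layout algorithm lives in its own class (the 'strategy'),
// separate from the container that merely holds the components.
class RowLayout {
  // Position children left to right; 'width' is unused by this
  // trivial strategy but a wrapping layout would consume it.
  layout(container, width) {
    let x = 0;
    for (const child of container.children) {
      child.x = x;
      child.y = 0;
      x += child.width;
    }
  }
}

class Container {
  constructor(layoutManager) {
    this.children = [];
    this.layoutManager = layoutManager; // swappable at runtime
  }
  add(child) { this.children.push(child); }
  doLayout(width) { this.layoutManager.layout(this, width); }
}

const c = new Container(new RowLayout());
c.add({ width: 50 });
c.add({ width: 30 });
c.doLayout(200); // children now sit at x = 0 and x = 50
```

Swapping in a different layout manager changes how the same children are arranged without touching the container – which is exactly why the layout class, not the container, becomes the hardest part to debug.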
DOM is awesome
I gave you this Abe Simpson moment of reflection to set the stage for a battle that is raging today between people who want to work with the browser’s DOM, and people who think it is the root of all evil and should be worked around. As is often the case these days, both points of view came across my Twitter feed from different directions.
In the ’embrace the DOM’ corner, we have the Web Components crowd, who think that the DOM is just fine. In fact, they want us to expand it to turn it into a universal component model (instead of buying into the ‘bolt-on’ component models of widget libraries). I cannot wait for it: I always hated the barrier to entry for Web libraries. In order to start reusing components today, you first need to buy into the bolt-on component model (not unlike needing to buy another set-top box in order to start enjoying programming from a new content provider).
‘Embracing the DOM’ means a lot of things, and in a widely retweeted article about React.js, Reto Schläpfer argued that the current MV* client-side frameworks treat the DOM as the view, managing data event flow ‘outside the DOM’. Reto highlights the React.js library as an alternative, where the DOM that already manages the layout of your view can be pressed into double duty as the ‘nervous system’.
This is not entirely new, and has been used successfully elsewhere. I wrote previously about the DOM event bubbling used in Bootstrap, which we used successfully in our own code. Our realization that, with it, we didn’t feel the need for MVC is now echoed by React.js. In both cases, layout and application events (as opposed to data events) are fused – the layout hierarchy is used as scaffolding for the event paths to flow, using the built-in DOM behavior.
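The fusion of layout hierarchy and event paths can be illustrated with a tiny, browser-free simulation of DOM-style event bubbling. The helper names here (makeNode, on, dispatch) are hypothetical; in a real page, the built-in addEventListener and bubbling behavior play these roles:

```javascript
// Minimal simulation of DOM-style event bubbling: an event dispatched
// on a node walks up the parent chain, so the layout hierarchy doubles
// as the event-routing path.
function makeNode(name, parent = null) {
  return { name, parent, handlers: {} };
}

function on(node, type, handler) {
  (node.handlers[type] = node.handlers[type] || []).push(handler);
}

function dispatch(node, type, detail) {
  // Bubble: invoke handlers on the node itself, then on each ancestor.
  for (let n = node; n; n = n.parent) {
    for (const h of n.handlers[type] || []) h(detail, n);
  }
}

const root = makeNode('root');
const panel = makeNode('panel', root);
const button = makeNode('button', panel);

const log = [];
// One listener at the top handles clicks from any descendant –
// the delegation pattern Bootstrap's data-api relies on.
on(root, 'click', () => log.push('root saw click'));

dispatch(button, 'click', {});
```

Because a single listener at the root sees events from the entire subtree, the hierarchy itself carries the application events – no separate controller wiring required.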
For completeness, not all people would go as far as to claim that React.js obviates the need for client-side MVC – for example, Backbone.js has been shown to play nicely with React.js.
DOM is awful
In the other corner are those who believe that the DOM (or at least the layout part of it) is broken beyond repair and should be sidestepped. My micro-service camarade de guerre Adrian Rossouw seems to be quite smitten with the Famo.us framework. This being Adrian, he approached it in his usual comprehensive way, collecting all the relevant articles using wayfinder.co (I am becoming increasingly spoiled/addicted to this way of capturing Internet wisdom on a particular topic).
Famo.us is something of a red herring here – while its goal is to let you build beautiful apps using JavaScript, transformations and animation, the element relevant to this discussion is that it sidesteps the DOM as the layout engine. You create trees and use transforms, which Famo.us uses to manage the DOM as an implementation detail, mostly as a flat list of nodes. Now recall my Abe Simpson story about containers and components – doesn’t it ring a bell? A flat list of components, with a layout manager on top of it controlling the layout as a manifestation of the strategy pattern.
Here is what Famo.us has to say about their approach to the DOM for layouts and events:
If you inspect a website running Famo.us, you’ll notice the DOM is very flat: most elements are siblings of one another. Inspect any other website, and you’ll see the DOM is highly nested. Famo.us takes a radically different approach to HTML from a conventional website. We keep the structure of HTML in JavaScript, and to us, HTML is more like a list of things to draw to the screen than the source of truth of a website.
Developers are used to nesting HTML elements because that’s the way to get relative positioning, event bubbling, and semantic structure. However, there is a cost to each of these: relative positioning causes slow page reflows on animating content; event bubbling is expensive when event propagation is not carefully managed; and semantic structure is not well separated from visual rendering in HTML.
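A rough sketch of that idea, assuming nothing about the actual Famo.us API: keep the scene graph as a plain JavaScript tree, then flatten it into a list of sibling nodes whose absolute positions are baked into CSS transform strings, so the DOM itself never performs layout:

```javascript
// Walk a JavaScript scene tree and emit a flat list of nodes, each
// carrying its absolute position as a CSS transform. The tree lives
// in JS; the DOM would only receive the flat siblings.
function flatten(node, parentX = 0, parentY = 0, out = []) {
  const x = parentX + node.x;
  const y = parentY + node.y;
  out.push({ id: node.id, transform: `translate(${x}px, ${y}px)` });
  for (const child of node.children || []) flatten(child, x, y, out);
  return out;
}

// Hypothetical scene: a header and a body containing a card.
const scene = {
  id: 'root', x: 0, y: 0,
  children: [
    { id: 'header', x: 0, y: 0 },
    { id: 'body', x: 0, y: 40, children: [{ id: 'card', x: 10, y: 10 }] },
  ],
};

// Every node ends up as a sibling in 'nodes' – a flat list – with its
// nesting expressed only through the accumulated transform.
const nodes = flatten(scene);
```

Note how this mirrors the container/layout-manager split: the tree-walking function is the layout strategy, and the DOM is reduced to a dumb list of things to draw.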
They are not the only ones with the ‘DOM is broken’ message. Steven Wittens, in his Shadow DOM blog post, argues a similar position:
Unfortunately HTML is crufty, CSS is annoying and the DOM’s unwieldy. Hence we now have libraries like React. It creates its own virtual DOM just to be able to manipulate the real one—the Agile Bureaucracy design pattern.
The more we can avoid the DOM, the better. But why? And can we fix it?
[…]
CSS should be limited to style and typography. We can define a real layout system next to it rather than on top of it. The two can combine in something that still includes semantic HTML fragments, but wraps layout as a first class citizen. We shouldn’t be afraid to embrace a modular web page made of isolated sections, connected by reference instead of hierarchy.
Beware what you are signing up for
I would have liked to have a verdict for you by the end of the article, but I don’t. I feel the pain of both camps, and can see the merits of both approaches. I am sure the ‘sidestep the DOM’ camp can make their libraries work today, and demonstrate how they are successfully addressing the problems plaguing the DOM in the current browser implementations.
But based on my prior experience with the sidestepping approach, I call for caution. I will also draw on my experience as a father of two. When a young couple goes through the first pregnancy, they focus on the first nine months, culminating in the delivery. This focus is so sharp and short-sighted that many couples are genuinely bewildered when the hospital hands them their baby and kicks them out to the hospital entrance, with the newly purchased car seat safely secured in the back. It only dawns on them at that point that the baby is forever – that the bundle of joy is now their responsibility for life.
With that metaphor in mind, I worry about taking over the DOM’s responsibility for layout. Not necessarily for what it means today, but for a couple of years down the road, when both the standards and the browser implementations inevitably evolve. Will it turn into trench warfare that cannot be won, a war of attrition that drains resources and results in abandoned libraries and frameworks?
Maybe I can figure that one out after a nap.
© Dejan Glozic, 2014
Web Components is one approach to standardizing these ideas, but it leaves us permanently crippled in the long term because it’s heavily tied to XML, with getters and setters that will likely always be string- and children-based.
The approach we take with Famo.us may be the newest, but it permits the greatest possibility and paves a path for the web to actually be able to compete head to head with native indefinitely.
The DOM is a tree that is inextricably linked to the abstraction of a document. This is a fundamentally inflexible abstraction for all the things we display on computers. I agree with your word of caution at the end, but I would contend that it applies equally to Web Components. There are serious concerns with the approach taken by Web Components as well. The W3C has processes for standardization, and it would be irresponsible to press forward with either approach without trying both of them out in userland and exploring the possibilities. Only by actually spending time building lots of things in userland will we be able to fully understand the problem we’re trying to solve.
What we can’t have happen is what the Chrome team attempted to do with the Web Components working group, where it tried to force standardizing whatever Chrome shipped:
http://lists.w3.org/Archives/Public/www-style/2014Feb/0103.html
https://news.ycombinator.com/item?id=7184912
I think it would be instructive to get as much feedback as possible from native-land developers on all platforms about what features they feel the web should have or support in order to make it as capable in the long term. Web Components, at the end of the day, are more of the same, and come from a position of great awareness of the DOM and a position of ignorance of the value of scene graphs. It’s very likely that the web could evolve to support a very capable scene graph approach that can not only display semantically rich documents but even support semantics inexpressible under a document metaphor. Supporting a scene graph approach also does not impoverish the Document Object Model at all. It can continue to persist, providing the value it was designed for and providing ways to describe relationships between documents (hyperlinks).
I would further posit that a lot of the really interesting long-term vision ideas that the XHTML2 working group tried to bring to fruition could actually be explored without depriving people of the app-like features that the WHATWG eventually succeeded in pushing forward. XHTML2 failed because it ignored very real use cases people demanded. Web Components don’t go far enough to support everything people want to be able to build. Retained-mode scene graphs provide a path that can evolve for much longer, into things beyond our capacity to consider.
Lastly, where do Web Components fit in a world where technologies like the Oculus Rift become the norm? Are Web Components, based on the DOM, capable of keeping up there, or will we end up with another use case perpetually out of reach of the web?
(disclaimer: I work for famo.us)
Andrew,
All good points, and I hope I was clear that at this point I don’t have enough forward visibility to declare which approach makes more sense. Both camps have valid points, and I agree with you that the Web Components work in particular is a bit too Chrome- and Google-slanted for my taste. I will definitely keep an eye on the progress of famo.us – I am sure @AdrianRossouw will not allow me otherwise :-).
Dejan
I agree with Andrew that Web Components are their own special kind of hell… because it’s essentially using divs and spans as the basic building blocks for something that requires much less of what is there, and much more of what isn’t.
I gained some insight into this when I approached the spec writers to talk about “seamless iframes”, which shares many ideas with Shadow DOM. Having spent several years doing Facebook apps, I was intimately acquainted with all the ways iframes were terrible in practice, and the seamless iframe spec did not do much to address this. Being “seamless” has all sorts of implications for autosizing, CSS inheritance, link targets, etc. Basically, there were at least three different program behaviors grafted onto it. When I pointed out the use cases where this would fail, the spec writer seemed surprised and asked whether he could use my examples (i.e. a Facebook app and a Twitter social widget, neither of which was apparently considered). This is the reality of how these specs are developed: in an ivory tower, by people who are too far removed from day-to-day use to know what it’s good for.
Hmmm….
That is really depressing – all the things you mentioned for seamless iframes came to my mind as well, and the fact that the spec writers didn’t consider them is a bummer.
We spent enough time with iframes to share your pain, so I was kind of hoping seamless iframes would ease it, but your info tells me not to keep my hopes up…
Makes me think of Joel Spolsky and his ‘Architecture Astronauts’…
Please send me a gif of a cute cat doing stuff to shake this bad feeling :-).
Dejan
Totally agree. However I’m willing to give the benefit of the doubt here and not chalk it up to architecture astronauting, but to listening too much to your customer.
The browsers have had lots of customers, the overwhelming majority of whom have only had computing careers as web developers. It’s a very lop-sided group, not at all representative of all the types of software engineers that should have a seat at the table when product decisions are being made. In a way, the user agent makers are suffering from the innovator’s dilemma here. There is absolutely too much emphasis on the needs of the current customers and not enough exploration of the unstated or future needs of customers. HTML5 was all about paving the cow-paths. That was a nice thought, but it’s brought us to a place where the web simply can’t compete head to head with native in lots of areas. Those companies and organizations developing native platforms have a much broader representation of interests contributing to the discussion of what the product should be.
Don’t get me wrong. I love what the web has done and absolutely support the goals of the semantic web and the values of documents and the Document Object Model. If anything, native platforms are very weak in this area because not enough of the folks promoting the value of semantics are sitting at the product table for native platforms.
The way I see it, the user agent developers should focus on producing the best damn DMZ possible, because that’s really what the browser is: a demilitarized zone. It’s a very unusual runtime in that it’s an inherently untrusted sandbox where lots of actors (each with their own self-interests, which sometimes lead to cooperation with the user and sometimes to hostility towards the user) can execute code that is privileged to certain information and restricted in its access to the user’s computing resources (CPU, GPU, RAM, screen real estate, etc.).
What I think the user agent developers need to do is focus a lot more on exposing as many low-level primitives as possible in the safest possible way for the users they are representing. For example, the web desperately needs APIs like the Contacts API, which provides a secure way for the user to share their contacts with specific web applications (via ACL policies limiting access by FQDN, for example) instead of letting those apps get full access to the entire address book. The Contacts API specifically allows the open web to return to its decentralized roots instead of letting it consolidate at gatekeepers like Facebook, Twitter and Google. Contacts is but one example; WebGL is working on liberating GL from native land. These are but a few important APIs.
If the user agent developers focused really low level, there would be tons of experiments by really amazing native engineers who finally see APIs they can tap into to build things previously only possible in nativeland. These experiments will offer alternative paths, and hopefully out of these explorations will come some sane standards layered on top. It all needs to be built out like an onion, one layer at a time, with each lower layer serving the right abstractions upon which higher-order layers can be built.
It’s ultimately a product management failure at the spec level. These are amazing, well-intentioned individuals doing thankless work, but what they need is more people like you (Steven) giving them examples upon which to base their decisions. I honestly don’t know how we get more people, representing the whole gamut of native use cases, better involved in sharing unconsidered low-level needs with the spec writers, who are really only getting product requirements from current customers based on their current needs. I wish I had an answer here. I really do.
You had me at ‘paving the cow paths’ – well put.
Sounds like an argument for a better process to adopt de facto standards …
The fact is, people who create specs from imagination are always going to have major blind spots that quickly become apparent to actual practitioners.