Wednesday, October 27, 2004

On GMail and DHTML architecture again

So Jon Udell has been pondering GMail and its architecture on his blog. He's also trying to figure out when to augment DHTML apps with even richer client technology. Below follows what I thought would be a short email that I repeat here for blogospheric conversation's sake.

GMail's architecture is actually very generic for a DHTML app. Everyone with a clue should be trying to leverage the browser and that, in essence, is all they are doing.

I helped implement an analogous architecture in Lotus K-station back in 1999. The major difference now is that everyone is using the "v.5" versions of the browsers; no one cares any longer about Netscape 4.7x.

For me the tipping point for rich DOM/JavaScript applications came when we saw mainstream applications like Hotmail and Yahoo Mail start using DHTML menus. My Yahoo only switched to DHTML menus when it was relaunched a few weeks ago and that's my litmus test for conservatism. (Ironically GMail switched back from DHTML menus to standard HTML selects for its "More actions" and apply label bar a month or so ago - although that might just be from usability testing). A similar event happened in the past year with the way users can rate items in wishlists and also in Netflix without seemingly causing page reloads - you know those five stars.

I mention the spread of all these little DHTML flourishes because I remember back in 2000 spending 3 months and well over 10 heated meetings trying to convince IBM architects to allow the use of simple things like DHTML menus and losing the battle. Of course given that kind of resistance, it was unlikely that features like drag-and-drop which one got from the UI engine in K-station would see the light in WebSphere Portal.

I was arguing that judicious leverage of DHTML could improve the user experience and that it could be done while addressing accessibility concerns. Ostensibly the objections were not about usability or consistency but rather about not wanting to write multiple UIs and a lack of expertise in DHTML programming. I realized then that the argument was not really about "to-DHTML-or-not" but more about comfort with a repartitioning of one's architecture.

It is riskier to do more on the rich browser client because it has been a more brittle platform over this past decade. Companies that do middleware and server-side tooling take a while to move from their core competency. Architects that thrive in that environment are essentially conservative and for good reason... Four years later, I now hear mutterings about drag-and-drop and richer clients in our corridors...

This only underscores the point that Jakob Nielsen's predictions about browser adoption cycles have turned out to be pretty accurate. Even though web application developers have been quietly spreading unobtrusive JavaScript usage in the interim, it is only now that there's a critical mass of clients that can leverage it; when Amazon and Yahoo move, something must be happening.

The developer tools and resources have gotten (slightly) better and there's more experience with the DOM. Increased adoption of broadband also helps reduce latency for the average client so you don't have to fight the inevitable arguments about performance and can couch your advocacy in terms of user interaction. In any case if and when you do have the performance discussion you can always argue that caching as close to the client as possible is a good thing and what better cache than the browser itself. It just so happens that applications like GMail, Bloglines and Oddpost are the state of the art in terms of browser leverage.

I recently wrote about this type of architecture in my recounting of the history of the DHTML spreadsheet and presentation components that are the genetic forebears of OddPost.

The idea is to fetch an HTML skeleton, decide what content you need, fetch that (as XML), and cache it wherever you get a chance. Render incrementally.

The pattern is simple: Database <--> XML (optional) <--> JavaScript object bindings <--> UI bindings (HTML) + UI management code
This pattern works very well for page-oriented applications like portals, email and aggregators. You can cache or preload your JavaScript objects and just manipulate the CSS display or visibility properties of your UI bindings.
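
To make the pattern a little more concrete, here is a minimal sketch of the object-binding and UI-binding layers. Everything in it is hypothetical: the names Message, Panel and PanelView, the synchronous fetchFolder callback and the folder names are mine for illustration, and Panel is only a stand-in for a real DOM element so that the sketch runs outside a browser.

```typescript
// Hypothetical sketch: none of these names come from GMail, K-station or any real toolkit.
interface Message { id: string; subject: string; }

// Stand-in for a DOM element; a real HTMLElement exposes the same style.display shape.
interface Panel { style: { display: string }; innerHTML: string; }

class PanelView {
  private cache = new Map<string, Message[]>(); // JavaScript object bindings, kept on the client
  private panels = new Map<string, Panel>();    // UI bindings, one panel per folder
  fetchCount = 0;                               // how many times we went back to the server

  // fetchFolder is an assumed data source (XML or JavaScript already decoded into objects).
  constructor(private fetchFolder: (folder: string) => Message[]) {}

  show(folder: string): Panel {
    let panel = this.panels.get(folder);
    if (panel === undefined) {
      // First visit: fetch once, cache the objects, build the markup once.
      const messages = this.fetchFolder(folder);
      this.fetchCount++;
      this.cache.set(folder, messages);
      panel = {
        style: { display: "none" },
        // A real application would escape the subject before building markup.
        innerHTML: messages.map(m => "<li>" + m.subject + "</li>").join(""),
      };
      this.panels.set(folder, panel);
    }
    // Switching views only flips the CSS display value; nothing is fetched or rebuilt.
    this.panels.forEach((p, name) => {
      p.style.display = name === folder ? "block" : "none";
    });
    return panel;
  }
}
```

Calling show("inbox"), then show("sent"), then show("inbox") again hits the data source only twice; the third call just toggles visibility on panels that already exist.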

Incremental rendering and multithreaded loading are the name of the game here. Your application is essentially architected as a hypermedia browser, just like the browser itself, and it leverages the browser's built-in core features, our old favorites: incremental rendering and multithreaded downloading.
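
As a rough illustration of that idea, the sketch below issues every request up front and renders each fragment the moment it arrives, in whatever order that happens to be. The Fetcher type and the loadIncrementally function are hypothetical names of mine; Fetcher only imitates the callback style of something like XMLHttpRequest and is not a real browser API.

```typescript
// Hypothetical callback-style request, loosely modelled on how XMLHttpRequest is used.
type Fetcher = (url: string, onDone: (body: string) => void) => void;

function loadIncrementally(
  urls: string[],
  fetchPart: Fetcher,
  render: (url: string, body: string) => void,
  onAllDone: () => void
): void {
  let pending = urls.length;
  if (pending === 0) {
    onAllDone();
    return;
  }
  for (const url of urls) {
    // Every request is started before any response is handled, so downloads overlap.
    fetchPart(url, body => {
      render(url, body); // paint this fragment as soon as it lands
      pending--;
      if (pending === 0) {
        onAllDone();     // fires once, after the last fragment is rendered
      }
    });
  }
}
```

The design choice here is that nothing waits on anything else: a slow fragment delays only itself, and the rest of the page is already usable by the time it shows up.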

The decision about using XML as the data transfer format instead of JavaScript objects is a toss-up, trading off client reach and memory. XML support in Safari and Opera is not as baked as in MSIE or Mozilla, so having JavaScript as your interchange format with your UI engine will buy you increased reach and a smaller footprint.
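
To give a feel for the trade-off, here is the same small payload in both encodings. The feed data is made up, the Feed type is hypothetical, and JSON.parse is used as a safe stand-in for evaluating a JavaScript object literal sent over the wire; treat it as a sketch rather than a measurement.

```typescript
// Hypothetical sample data: two feeds with unread counts.
interface Feed { title: string; unread: number; }

// The payload as XML on the wire. The client still has to parse it and walk the tree.
const asXml =
  "<feeds>" +
  "<feed><title>Example Feed A</title><unread>3</unread></feed>" +
  "<feed><title>Example Feed B</title><unread>0</unread></feed>" +
  "</feeds>";

// The same payload as a JavaScript literal on the wire.
const asJavaScript =
  '[{"title":"Example Feed A","unread":3},{"title":"Example Feed B","unread":0}]';

// One call turns the JavaScript payload into live objects, with no XML parser required.
const feeds: Feed[] = JSON.parse(asJavaScript);
const totalUnread = feeds.reduce((sum, f) => sum + f.unread, 0);

// For this sample the JavaScript form is also the shorter of the two.
const savedChars = asXml.length - asJavaScript.length;
```

The reach argument is the middle step: the JavaScript payload needs nothing beyond the language itself, whereas the XML one depends on how good the browser's XML support is.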

Ultimately though, XML will be your backend data format, so the temptation will increasingly be to use the various XML-on-the-wire APIs (XMLHttpRequest). Maybe another 4 years are needed for this pattern to see more widespread usage; JavaScript on the wire suffices for now.

Perhaps K-station was too bleeding edge trying to go for XML over HTTP, DHTML and extreme leverage of the browser client 5 years ago but that experience was a great testbed for me and I learned lots of lessons about building rich REST-ful applications, the importance of URIs etc.

Again the major missing feature for this rich web application platform is offline usage and synchronization without introducing new security holes in the browser. But then that's why Bosworth is at Google as the rumour goes, right? I suspect he's got other things in mind though...

I would note that the Mozilla folks know that the offline capabilities in their platform are a bust so I'd expect some eyes on this. Look also at applications like FormsPlayer for innovation in this space or anyone doing forms in general. XForms is my current thing and offline forms are a great feature that users can understand and that demos very well; it's also something that the Notes platform does very well. Our old friend, Groove, also does that kind of synchronization well but I'd want that for the web, natively in the browser.

I think that increased leverage of the browser and the DOM is a good thing. It's also a clear trend and for many applications, the browser is good enough. Good enough for Google, good enough for Yahoo, good enough for me.

One note about memory consumption and pushing things onto the client. There's an end-to-end argument for this notion of pushing intelligence to the endpoints, but there is a cost, and in this case it's memory consumption. It's minimal, but it exists. One lesson I learned was that it's good to use the kind of machines that the average client would be using, and consequently I always hoard older machines. I've noticed that with lots of tabbed browsing and increased use of GMail and Bloglines, my four-year-old Windows ME box (Athlon 900 MHz, 512MB RAM) thrashes memory more often, and I've seen more sporadic crashes/freezes with Mozilla. I know this is actually a lagging indicator and that most users are now on Windows 2000 and possibly XP, but it is a pointer to the memory consumption overhead of DHTML apps. All my other machines are fine, and I should note that these applications run just as well on ME as on other platforms in terms of interactivity and all; the problem is just prolonged use and a piece-of-junk memory and resource subsystem in that excuse for an operating system... My next representative machine is a Win2k box, and on that evidence everything is in order even with these highly leveraged applications.



Jep Castelein said...

What an excellent analysis of the area in this article, and in your 'history of DHTML' article. I agree the browser is good enough, but I do see the need for a better toolkit. It's currently so tough to get the most out of DHTML if you start programming from scratch.

I clearly remember Halfbrain, and all the other early attempts to get the most out of DHTML. I've always wondered why they didn't make a decent DHTML toolkit that makes this advanced functionality available to all web developers. Then I started working for a CMS vendor, and sort of forgot about it...

...until earlier this year: I was looking for a new job, and in my native Amsterdam (5 minutes from my apartment :-) I found a company working on such a DHTML toolkit. Now I'm working there as a Product Manager, and I'm still extremely amazed at what a browser is capable of.

It's very similar to the picture you paint: you load HTML and XML into the browser, which are interpreted by a JavaScript library, which is itself loaded module by module, only when needed. Of course, it makes heavy use of XMLHttpRequest. Now on IE5+ and Mozilla, soon on Safari and Opera.

And we are currently starting a large project with IBM Netherlands: we'll see what will happen :-)

Unknown said...

Koranteng -- only just seen your blog, but it's spot on! Out of interest, have you seen the formsPlayer RSS Reader? It uses a 'cookie' technique to save a list of RSS feeds locally for the user, without having to know the installation path. I mention it in relation to your comments about 'offline' working, etc.

All the best,


Mark Birbeck
CEO, Ltd.