Koranteng's Toli: DHTML


Thursday, July 28, 2005

Flickr's Godfather

The 3,000th photo I uploaded to Flickr is a howler. Well at least I think it is, let's have a look (full size image).

A little context is in order...

I, and countless others, had complained about Flickr's excessive use of Flash. First I had inveighed about rendering and accessibility concerns in Cultural Sensitivity in Technology. Then I used Flickr's Flash buttons as a prominent example in The Unloved HTML Button and other Folktales.

In any case, 2 days after the folktales were told, Flickr finally switched from Flash. They no longer use a wrapper for image display - allowing native browser image rendering, and they changed their button toolbar from Flash to DHTML. I'd like to think my snarky comments were the tipping point but I won't flatter myself. They now only use Flash where it's appropriate, for drag and drop organization of photo albums, the kind of job that Flash or applets are particularly well suited for. Now of course they didn't use unloved html buttons in their toolbar but we'll take what we get. DHTML has greater reach than Flash and accessibility concerns are more easily addressed on that front. Also there is an evolving set of design patterns for dealing with unobtrusive DOM scripting and forms.

So there I was, pleasantly surprised by their responsiveness, and going about uploading a little comic image to punctuate some later toli essay. You'll notice from the image that there were a bunch of glitches on the first morning of the big switch. The icons for some of the toolbar buttons weren't showing up. Oh well, we'll ignore that but simply note that if they were standard HTML buttons, there would be no images to download. Moving right along...

Then I noticed a couple of typos: I had tagged the photo as sopronos instead of Sopranos, and the image's title mentioned Godfarther, which tickled me somewhat.

Flickr's Godfather or Flickr Goes Further?

Well anyway, Flickr implements a Click-to-Edit feature, a little unobtrusive DOM scripting that allows you to edit in place, so I corrected the title and hit the save button that appeared. That's when this error message came up and I took the screen capture.

Taking a step back for a moment, let me just say that I love glitches. They expose the interesting aspects of complex systems and, much as we aim for simplicity, software tends inexorably towards complexity. As users of software we see lots of glitches daily. As an engineer, I am always interested in the first few days of a new deployment. You can test all you want but all bets are off when you make contact with real users and the real world. As an example, Technorati's recent makeover exposed lots of unforeseen glitches and they have had to work hard to address most of them in the past month. I was chatting recently with Dale Schultz, globalization architect at IBM, and noted that I have a standard set of user names when I test new pieces of software: I make sure to have hyphens (hence I use my surname), ampersands (Sun & Sun is my canonical company), and, of late, accents (Rokia Traoré) because I've been bitten by various curses in the past in the software I've written. My former team has a José López test user for the same reason. Sam Ruby uses the word Iñtërnâtiônàlizætiøn as his proving ground in the same vein. We got to discussing the tyranny of patents at IBM and I pointed him to the Prior-Art-O-Matic for a laugh. Dale is obviously many steps ahead of me and of course he tried entering a euro symbol and immediately noted that the CGI application was broken and couldn't handle euros. Glitches often tell you a lot about application internals and the things that the developers tried to foresee or, as the case may be, ignored.
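The habit of keeping awkward test names on hand can be turned into a small round-trip check. What follows is only a sketch, not code from any of the software mentioned: `escapeXml`, `unescapeXml` and `survivesRoundTrip` are hypothetical helpers, and the hyphenated name is a made-up stand-in. It pushes each name through XML escaping and URL encoding and checks that it comes back unchanged.

```typescript
// Names that tend to expose encoding bugs. All but the hyphenated one
// (a made-up stand-in) are the examples mentioned in the text.
const awkwardNames: string[] = [
  "Anne-Marie Smith-Jones", // hyphens (hypothetical name)
  "Sun & Sun",              // ampersand
  "Rokia Traoré",           // accent
  "José López",
  "Iñtërnâtiônàlizætiøn",
  "€",                      // euro symbol
];

// Escape the five characters that are special in XML text and attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Reverse of escapeXml; "&amp;" is handled last for the same reason.
function unescapeXml(text: string): string {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

// A name survives if both encodings hand back exactly what went in.
function survivesRoundTrip(name: string): boolean {
  const viaXml = unescapeXml(escapeXml(name));
  const viaUrl = decodeURIComponent(encodeURIComponent(name));
  return viaXml === name && viaUrl === name;
}

// Any name listed here has been mangled somewhere along the way.
const failures: string[] = awkwardNames.filter((name) => !survivesRoundTrip(name));
```

In a real test the round trip would go through the application itself (form post, database, rendered page) rather than through two local functions; the point is simply to keep the awkward names in the fixture list.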

But back to Flickr's Godfather, what can we say about the glitch?

Flickr is passing xml back and forth in their API calls.
They are likely using XMLHttpRequest to do the voodoo of incremental loading without refreshing the page.
There's an API key, probably tied to the user's identity, that is likely passed around in every call. Sensible enough.
They have to implement a Javascript layer to catch API errors and display something to the user.
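Taken together, these observations suggest a thin client-side layer that inspects each XML response and turns failures into something a user can read. Here is a minimal sketch of such a layer, and not Flickr's actual code: the response envelope (`<rsp stat="...">` with an optional `<err code="..." msg="..."/>`) is assumed for illustration, and `parseApiResponse` and `describeFailure` are hypothetical names.

```typescript
// Result of inspecting one API response.
interface ApiResult {
  ok: boolean;
  code: string | null;
  message: string | null;
}

// Pull the status and any error details out of the (assumed) XML envelope.
// Regular expressions keep the sketch self-contained; in a browser the parsed
// document from XMLHttpRequest's responseXML would be the natural choice.
function parseApiResponse(xml: string): ApiResult {
  const stat = /<rsp\b[^>]*\bstat="([^"]*)"/.exec(xml)?.[1] ?? null;
  if (stat === "ok") {
    return { ok: true, code: null, message: null };
  }
  const errTag = /<err\b[^>]*>/.exec(xml)?.[0] ?? "";
  const code = /\bcode="([^"]*)"/.exec(errTag)?.[1] ?? null;
  const message = /\bmsg="([^"]*)"/.exec(errTag)?.[1] ?? null;
  // Anything that is not an explicit "ok" is treated as a failure.
  return { ok: false, code, message: message ?? "Unexpected response from server" };
}

// Turn a failed result into text for the user; returns null on success.
function describeFailure(action: string, result: ApiResult): string | null {
  if (result.ok) {
    return null;
  }
  const detail = result.code !== null ? ` (error ${result.code})` : "";
  return `Could not ${action}: ${result.message ?? "unknown error"}${detail}`;
}
```

The defensive part is the fallback: a malformed or empty response still yields a message rather than a silent failure.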

Now I could have determined a lot of this and more by poking around and doing the View Source investigation. At the time, I wondered if I would have done any different and concluded that my implementation would have been much the same.

I have to say that like many others I'm highly impressed with Flickr, they had defensive programming and had appropriate error messages. Most people wouldn't have bothered dealing with these boundary cases. I haven't seen similar glitches since that first day thus the teething pains were temporary and they continue to add nice features to their service.

In any case, the juxtaposition of Silvio growling and in full bloom, the Godfather typos and the error message that popped up under Silvio's hands certainly made for a little amusement then and even today and now has occasioned a short blog entry. It reminded me of an advertisement for Fosters beer I believe that goes "It touches the parts other beers fail to reach". I guess the analogue in this case is "Flickr Goes Further".

As to why I had uploaded that particular Sopranos image, well let's just say that there's a famous quote from that scene and that's for some later toli.

[Update] I tried to cross-post this to my internal IBM blog only to find that the post was chopped off at Ruby's Iñtërnâtiônàlizætiøn magic word. Thus ironically as I was pointing out glitches, I just got bitten by one. I believe BlogCentral is based on Roller software and I suppose that I'll have to figure out whether the problem is in IBM's additions or in the core framework. The interesting thing about bugs with special characters is that sometimes you can't write the issue up because the software can't handle the characters in question. Perhaps BlogCentral needs a Godfather.

File under: flickr, technology, web, dom, scripting, glitches, funny, whimsy, programming, development, javascript, DHTML, software, error handling, design, photo, error, message, collage, toli

Wednesday, October 27, 2004

On GMail and DHTML architecture again

So Jon Udell has been pondering GMail and its architecture on his blog. He's also trying to figure out when to augment DHTML apps with even richer client technology. Below follows what I thought would be a short email that I repeat here for blogospheric conversation's sake.

GMail's architecture is actually very generic for a DHTML app. Everyone with a clue should be trying to leverage the browser and that, in essence, is all they are doing.

I helped implement an analogous architecture in Lotus K-station back in 1999. The major difference now is that everyone is using the "v.5" versions of the browsers; no one cares any longer about Netscape 4.7x.

For me the tipping point for rich DOM/JavaScript applications came when we saw mainstream applications like Hotmail and Yahoo Mail start using DHTML menus. My Yahoo only switched to DHTML menus when it was relaunched a few weeks ago and that's my litmus test for conservatism. (Ironically GMail switched back from DHTML menus to standard HTML selects for its "More actions" and apply label bar a month or so ago - although that might just be from usability testing). A similar event happened in the past year with the way users can rate items in Amazon.com wishlists and also in Netflix without seemingly causing page reloads - you know those five stars.

I mention the spread of all these little DHTML flourishes because I remember back in 2000 spending 3 months and well over 10 heated meetings trying to convince IBM architects to allow the use of simple things like DHTML menus and losing the battle. Of course given that kind of resistance, it was unlikely that features like drag-and-drop which one got from the UI engine in K-station would see the light in WebSphere Portal.

I was arguing that judicious leverage of DHTML could improve the user experience and that it could be done while addressing accessibility concerns. Ostensibly the objections were not about usability or consistency but rather about not wanting to write multiple UIs and lack of expertise in DHTML programming. I realized then that the argument was not really about "to-DHTML-or-not" but more about comfort with a repartitioning of one's architecture.

It is riskier to do more on the rich browser client because it has been a more brittle platform over this past decade. Companies that do middleware and server-side tooling take a while to move from their core competency. Architects that thrive in that environment are essentially conservative and for good reason... Four years later, I now hear mutterings about drag-and-drop and richer clients in our corridors...

This only underscores the point that Jakob Nielsen's predictions about browser adoption cycles have turned out to be pretty accurate. Even though web application developers have been quietly spreading unobtrusive javascript usage in the interim, it is only now that there's a critical mass of clients that can leverage them; when Amazon and Yahoo move, something must be happening.

The developer tools and resources have gotten (slightly) better and there's more experience with the DOM. Increased adoption of broadband also helps reduce latency for the average client so you don't have to fight the inevitable arguments about performance and can couch your advocacy in terms of user interaction. In any case if and when you do have the performance discussion you can always argue that caching as close to the client as possible is a good thing and what better cache than the browser itself. It just so happens that applications like GMail, Bloglines and Oddpost are the state of the art in terms of browser leverage.

I recently wrote about this type of architecture in my recounting of the history of the DHTML spreadsheet and presentation components that are the genetic forebears of OddPost.

The idea is to fetch an HTML skeleton, decide what content you need, fetch that (as XML), and cache it wherever you get a chance. Render incrementally.

The pattern is simple: Database <-> XML (Optional) <--> JavaScript Object Bindings <--> UI Bindings (HTML) + UI management code

This pattern works very well for page oriented applications like portals, email, aggregators. You can cache or preload your javascript objects and just manipulate the CSS display or visibility attributes for your UI Bindings.
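As a rough sketch of the caching half of this pattern: once a page's UI binding has been built it is kept around, and switching pages only flips the CSS `display` property. `PageSwitcher` and `Panel` are hypothetical names, and the `build` callback stands in for whatever fetches the data and produces the markup.

```typescript
// Minimal shape shared by real DOM elements and test doubles.
interface Panel {
  style: { display: string };
}

// Keeps every page that has been built and shows one at a time by flipping
// the CSS display property, so a revisit needs no rebuild and no refetch.
class PageSwitcher {
  // Object.create(null) avoids clashes with inherited keys such as "constructor".
  private readonly cache: Record<string, Panel | undefined> = Object.create(null);
  private currentId: string | null = null;

  // `build` is only called the first time a page id is requested.
  show(id: string, build: () => Panel): Panel {
    let panel = this.cache[id];
    if (panel === undefined) {
      panel = build();
      this.cache[id] = panel;
    }
    if (this.currentId !== null && this.currentId !== id) {
      const previous = this.cache[this.currentId];
      if (previous !== undefined) {
        previous.style.display = "none";
      }
    }
    panel.style.display = "block";
    this.currentId = id;
    return panel;
  }
}
```

In a browser, `build` might create a `div`, fill it from the cached JavaScript objects and append it to the page skeleton; a real element satisfies the `Panel` shape as it stands.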

Incremental rendering and multithreaded loading is the name of the game here. Your application is essentially architected as a hypermedia browser just like the browser and leverages the browser's built in core features, our old favorites: incremental rendering and multithreaded downloading.

The decision about using XML as the data transfer format instead of Javascript objects is a toss-up, trading off client reach and memory. XML support in Safari and Opera is not as baked as MSIE or Mozilla so having JavaScript as your interchange format with your UI engine will buy you increased reach and smaller footprint.

Ultimately though XML will be your backend data format so the temptation will increasingly be to use the various XML-on-the-wire APIs (XMLHTTPRequest). Maybe another 4 years is needed for this pattern to see more widespread usage, JavaScript on the wire suffices for now.
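To make the trade-off concrete, here is the same made-up record in both wire formats. This is an illustration under stated assumptions: the `MessageSummary` shape and both payloads are invented, `JSON.parse` is used as a modern stand-in for evaluating JavaScript sent over the wire, and the XML side uses regular expressions only to stay self-contained (a browser would hand you a parsed document instead).

```typescript
interface MessageSummary {
  id: number;
  from: string;
  subject: string;
}

// The same (made-up) record in both wire formats.
const asXml: string =
  '<message id="7"><from>someone@example.com</from><subject>Hello</subject></message>';
const asJavaScript: string =
  '{"id": 7, "from": "someone@example.com", "subject": "Hello"}';

// JavaScript on the wire: the text maps straight onto an object.
function fromJavaScript(text: string): MessageSummary {
  const raw = JSON.parse(text) as MessageSummary;
  return { id: Number(raw.id), from: String(raw.from), subject: String(raw.subject) };
}

// XML on the wire: an extra binding step walks the markup and copies
// each value into the object the UI code actually works with.
function fromXml(xml: string): MessageSummary {
  const id = /<message\b[^>]*\bid="([^"]*)"/.exec(xml)?.[1] ?? "";
  const from = /<from>([^<]*)<\/from>/.exec(xml)?.[1] ?? "";
  const subject = /<subject>([^<]*)<\/subject>/.exec(xml)?.[1] ?? "";
  return { id: Number(id), from, subject };
}

// Both routes end at the same JavaScript object binding.
const viaJavaScript: MessageSummary = fromJavaScript(asJavaScript);
const viaXml: MessageSummary = fromXml(asXml);
```

Either way the UI code ends up with the same object; the difference lies in how much work, and how much browser XML support, is needed to get there.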

Perhaps K-station was too bleeding edge trying to go for XML over HTTP, DHTML and extreme leverage of the browser client 5 years ago but that experience was a great testbed for me and I learned lots of lessons about building rich REST-ful applications, the importance of URIs etc.

Again the major missing feature for this rich web application platform is offline usage and synchronization without introducing new security holes in the browser. But then that's why Bosworth is at Google as the rumour goes, right? I suspect he's got other things in mind though...

I would note that the Mozilla folks know that the offline capabilities in their platform are a bust so I'd expect some eyes on this. Look also at applications like FormsPlayer for innovation in this space or anyone doing forms in general. XForms is my current thing and offline forms are a great feature that users can understand and demos very well; it's also something that the Notes platform does very well. Our old friend, Groove, also does that kind of synchronization well but I'd want that for the web, natively in the browser.

I think that increased leverage of the browser and the DOM is a good thing. It's also a clear trend and for many applications, the browser is good enough. Good enough for Google, good enough for Yahoo, good enough for me.

Sidenote:
One note about memory consumption and pushing things onto the client. There's an end-to-end argument for this notion of pushing intelligence to the endpoints but there is a cost and in this case it's memory consumption. It's minimal but it exists in this case. One lesson I learned was that it's good to use the kind of machines that the average client would be using and consequently I always hoard older machines. I've noticed that with lots of tabbed browsing and increased use of GMail and Bloglines, my four year old Windows ME box (Athlon 900 MHz 512MB RAM) thrashes memory more often and I've seen more sporadic crashes/freezes with Mozilla. I know this is actually a lagging indicator and that most users are now on Windows 2000 and possibly XP, but it is a pointer to the memory consumption overhead of DHTML apps. All my other machines are fine and I should note that these applications run just as fine on ME as on other platforms in terms of interactivity and all, it is just prolonged use and a piece-of-junk memory and resource subsystem in that excuse for an operating system... My next representative machine is a Win2k box and on that evidence everything is in order even with these highly leveraged applications.

File under: architecture, DHTML, programming, technology, web, javascript, xml, patterns, html, UI, udell, conversation, history, observation, adoption, DOM, scripting, xml, xmlhttp, interaction, IBM, Lotus, K-station, oddpost, yahoo, google, gmail, toli

Friday, July 16, 2004

On rich web applications, AlphaBlox and Oddpost

I had been meaning to comment about the past few posts over at Loosely Coupled but for some reason the comment entry fields don't show up in Mozilla or Firefox. The idea of having to launch Internet Explorer just to enter a comment didn't sit well with me, but I guess I will do it this once. This is part of a fairly deep vein of conversation in recent months on web applications, rich clients, the browser as a platform, or the location field as the new command line. A certain critical mass seems to have been reached, maybe it's the maturity of the browsers competing with Internet Explorer (especially Mozilla), the long history of security holes in MSIE or the publicity surrounding GMail and OddPost. I might as well join in on the commentary with a view from the trenches.

I work at Lotus software, IBM and was part of the tangled Halfbrain, Alphablox, IBM, Oddpost story at least last year when I worked on the "Simple Browser Productivity Components" - the spreadsheet, presentation and rich text editors that are now bundled with WebSphere Portal, tied to a document management system out of the box. For 6 months, my job was to 'port' these components to Mozilla; IBM for obvious reasons wants everything it ships to work on all platforms and browsers. The spreadsheet and presentation were licensed from AlphaBlox.

I wrote about some of my experiences at length last year in a discussion about "Applications vs. W3C DOM" and the adequacy of standards and browser support for rich web applications. Here's some further history and commentary:

The Halfbrain folks made a quite sensible decision that browser-based productivity applications would be a good thing, after all they, like everyone else, were starting to 'live' in the browser. More to the point, there's a market for good componentry and focused applications - lots of companies have developed spreadsheets, grids, charts, and various other editors and components. You may not make as much money upfront selling these components as shrink-wrapped software, like say a copy of Microsoft Office, but you can certainly tie it to services, other value-add features or even advertising and make money in other ways.

Sidenote: IBM has tried before to crack this market with Lotus eSuite (Java Office Suite for thin clients) which I also worked on after transferring from Lotus Freelance Graphics. Smaller companies have fared better because they understand that this is a niche market that one has to nurture and they didn't have the kind of insane revenue expectations that IBM had for that product (echoes of MS Office). 5 years on I hope/pray/(think?) that the company now understands what these kinds of products/technologies can achieve as part of its Portal and Workplace product portfolio and competitive strategy. One aspect of which is Linux. (cross my fingers since that's where the salary is coming from).

In any case, the Halfbrain folks didn't know any better, forged ahead, and miraculously a year later they managed to get a DHTML spreadsheet working. Everybody who sees it goes "Wow! How did you do that? In Javascript? With the then awful state of tooling for browser development?". It simply works; the UI looks, behaves and feels essentially like Excel, selecting cells, columns, drag and drop etc. Technically I can't say enough about that achievement (especially when I think back to the effort involved in the eSuite spreadsheets and presentation - there was still far more tooling available, and of course, a rich programming model even in the less capable early 1.0 versions of Java than in the DHTML world). Keep in mind that it's a full spreadsheet engine, complete with calculation engine, dependency-checking, macro language. They were crazy enough to even at first target IE4. The programmers were all new to Javascript development, but they were very disciplined and persevered. They lived on the bleeding edge of DHTML, had to invent things at times and work around countless browser bugs. They talked somewhat with the MSIE team who made a lot of fixes for them in IE 5 and especially 5.5. As a very complex DHTML application, I'm sure it was a good test bed for Internet Explorer. In much the same way the folks at Opera last year jumped at a chance to test some of their DOM support with the spreadsheet and presentation.

AlphaBlox, which does business analytics, sees this demo and buys HalfBrain. It makes perfect sense for them, everything is moving to the web. They integrate it with their packages, add conversion services from Excel, beef it up and offer it as part of their products.

Having learnt from the experience of getting the spreadsheet working, and realizing that what they have is essentially a very capable, general purpose DHTML library, one of the guys (now at OddPost/Yahoo) then heroically decides: why not do a Powerpoint clone? He goes ahead and writes a presentation editor !!!! This involves not just drag and drop, selection handling etc but most importantly text editing - i.e. when you add your bulleted lists you need to edit the text. He managed to get this working even without using contentEditable support - it's a hack but users are none the wiser. It worked since IE 5.5 was up to scratch as a DHTML platform. This was an even greater bravura performance. Also, with greater facility with JavaScript, the code was smaller, more object-oriented. AlphaBlox also added some level of Powerpoint import capability.

The Javascript libraries for the spreadsheet and presentation amounted to around 60,000 lines of code, much of it shared, and like any good DHTML toolkit had infrastructure APIs for menuing, windowing, drag and drop, selection handling etc - the kinds of things that the DOM is missing since it's document and not screen-oriented. Sure, performance on very large spreadsheets may be a little poky and the recalc engine may need to be tuned, but those making really extensive use of spreadsheets will always stick to Lotus 123, Excel or more specialized packages. Similarly the drawing primitives of the presentation editor are not what you'd have in Illustrator but I think they are good enough. These kinds of editors target a different audience, they are 'worse' by definition. Of course 'worse is better', something that Christensen knows well.

Some of the HalfBrain folks left AlphaBlox with the idea to write yet another browser version of a productivity application, email, and formed Oddpost, which Yahoo bought last week.

I don't really need to comment much about IBM's goals which are self-evident. Amy Wohl had a decent article from a while back about IBM's strategy Reinventing The Office: IBM Workplace. I'm sure the story and strategy has been tweaked and revised since. IBM, with a gap in its componentry, and looking to beef up its portal story helped with the development of the Midas component in Mozilla, the rich text editor, and then licensed the AlphaBlox spreadsheet and presentation. This is where I and others come in and that was last year's palava.

As far as I can tell, given the evolution of these products, the code got better, more object oriented, and more compact. Also the state of the browser has evolved, removing the need for most of the hacks of yesteryear. If I was starting the spreadsheet from scratch with the current capable browser platforms, it wouldn't be as much of a pipe-dream as back when HalfBrain started. (Joel per contra says 'never rewrite from scratch'). Given what I know about the effort it took on the Mozilla port, if Yahoo is so inclined, it should take 6-8 months or so to have Oddpost working well in Mozilla, maybe less since there's obviously more staff. It would be perhaps a little longer in Safari or Opera since their DOM support (especially Level 2 Range) is weaker.

I talked to the Oddpost folks a few times last year; they were happy to hear that their first born babies were still being tended to - commiserating with me about some of the hacks they had had to implement earlier. With the advent of XMLHttp in MSIE, and IE 6 being a more robust platform for them, I think things were looking good for them as history has shown. Also since their app was their regular email client, they were eating their own dogfood and have every motivation to make it work.

I suspect that IBM could (and maybe should) have acquired them and perhaps there was the kind of usual flirtation/evaluation and general testing of the waters for building business relationships. With all the various mail clients and various technologies that a large company like IBM has, and the trendy push for Eclipse-based components, 'Javascript' components were a harder sell I would guess.

If the Oddpost code is anything like the AlphaBlox code I had to modify, they could probably jump start their porting effort considerably by using a lot of the code I had to write or certainly the 10 or so common design patterns I encountered in doing this work. In a sense there is definitely a missed opportunity for IBM, we've missed out on an application that compares favourably to many others. Of course, there would have been a difficulty in crafting a marketing message about how it fit in with the rest of the vast IBM software portfolio (read Domino, iNotes or any of the various mail apps we have). Again, I am glad that they've found a home at Yahoo, Oddpost is good technology and a good response to GMail. IBM acquiring Alphablox right on the heels of Yahoo taking up Oddpost is just another twist to this tale and likely just a coincidence.

Back to web applications. Mozilla is now the best platform for doing such development. It has the best standards support, is cross-platform and has the best developer tools: DOM Inspector and Venkman, and perhaps even mindshare. In fact (market share be damned), it makes sense to first write your rich web application for Mozilla. Your code will be cleaner to start off, will be better structured and will most likely work in other browsers - but that's my opinion, others say target MSIE first, just look at marketshare! - I would rather ignore that and get the compelling app.

As of Mozilla 1.5, XMLHTTPRequest support was well baked, and people have been using it successfully. I haven't tried Safari but I assume that it is not as good as Mozilla, though fast improving. Back then Opera was missing XMLHttpRequest, ContentEditable and DOM Level 2 Range support. We got some level of Opera support for free once the Mozilla work was done for the editors. The spreadsheet was functional and one could view presentations but not edit them because of the lack of range support and scriptable textareas.

The "rich web application" strategy is a very powerful approach to development - and one we first used when I was working on Lotus K-station (the post-mortem on that is for another day). It entails complete leverage of the browser which, after all, is the ubiquitous client. If the browser adds features you inherit them automatically. A short description of what I think is needed:

A client side framework for managing 'widgets'; a 'widget' is construed as a parameterized blob that produces markup (either in-line html or iframe-based). The data model is pushed to the client, the page is stitched together on the client, augmented by chrome and a code layer handles drag and drop, preview mode, incremental rendering and client side caching etc.
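A minimal sketch of that notion of a widget, assuming inline HTML rather than iframes: each widget is a parameterized function that produces markup, and the page is stitched together on the client with a little chrome around each one. `Widget`, `stitchPage`, `escapeHtml` and the weather example are all hypothetical names for illustration, not the K-station API.

```typescript
// Parameters handed to a widget, keyed by name.
type WidgetParams = Record<string, string>;

// A widget is a parameterized blob that produces markup.
interface Widget {
  id: string;
  render(params: WidgetParams): string;
}

// Escape text before it is placed inside markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// A made-up widget: its markup depends only on its parameters.
const weatherWidget: Widget = {
  id: "weather",
  render: (params) =>
    `<div class="weather">Forecast for ${escapeHtml(params["city"] ?? "unknown")}</div>`,
};

// Stitch the page together on the client: each widget's markup is wrapped
// in a little chrome (here just a section element carrying the widget id).
function stitchPage(widgets: Widget[], paramsById: Record<string, WidgetParams>): string {
  return widgets
    .map((widget) => {
      const body = widget.render(paramsById[widget.id] ?? {});
      return `<section id="${escapeHtml(widget.id)}" class="widget">${body}</section>`;
    })
    .join("\n");
}
```

Drag and drop, preview mode and caching would then be handled by the framework on the wrapping elements, without each widget needing to know about them.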

The idea is to fetch an HTML skeleton, decide what content you need, fetch that (as XML), and cache it wherever you get a chance. Render incrementally.

The pattern is simple:

Database <-> XML (Optional) <--> Javascript Object Bindings <--> UI Bindings (HTML) + UI management code

And:

It's the Latency, stupid.

When dealing with distributed applications, it's the issue of latency that will determine which applications will rule. Users ultimately want applications that are 1. fast to load 2. capable and 3. intuitive. They want all these at the same time. This is where making increased use of the DOM should shine compared to most simple html based UIs. Of course, you have to work hard when writing DOM apps to figure out where in your architecture you can do things incrementally and where to cache. But that's what engineering is all about and how one would get paid.
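One way to do things incrementally is to render a long list a batch at a time, so the first content appears quickly and the browser gets control back between batches. The sketch below assumes a hypothetical `renderIncrementally` helper; the scheduler is passed in so that a browser can supply something like `(task) => setTimeout(task, 0)`.

```typescript
// Hands a task to whatever defers work (a timer in the browser).
type Schedule = (task: () => void) => void;

// Render `items` one batch at a time. The first batch is rendered
// immediately; each later batch is handed to `schedule`, so other work
// can happen in between.
function renderIncrementally<T>(
  items: T[],
  batchSize: number,
  renderBatch: (batch: T[]) => void,
  schedule: Schedule
): void {
  const size = Math.max(1, Math.floor(batchSize));
  let next = 0;
  const step = (): void => {
    if (next >= items.length) {
      return;
    }
    renderBatch(items.slice(next, next + size));
    next += size;
    if (next < items.length) {
      schedule(step);
    }
  };
  step();
}
```

The same shape works for fetching: ask for the first screenful, show it, then request the rest while the user is already reading.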

In K-station the 'widget' was a portlet, your portal page was the drawing canvas that the framework managed, and you could navigate from page to page, incrementally building your pages in memory and switch back and forth instantaneously. With client-side caching, all you're doing is toggling the css display property. In the presentation editor, the analog of the widget was a drawing object (text, image, group etc). GMail and Oddpost follow the same pattern and it is the incremental rendering and caching that distinguishes them in their performance characteristics and makes them 'feel' like desktop apps.

The major missing infrastructure piece in rich web applications is going offline and synchronizing with good security. But that's a story for another day.
