Wednesday, April 13, 2005

On XForms, XPath, CSS, Brevity, Syntax And More

The always interesting Mark Birbeck of formsPlayer has a nice article up on his blog.

Ostensibly it's about "CSS, the XForms Dependency Engine, and 'Dynamic Infosets'", and he starts out by asking why there are two main languages (CSS and XPath) for selecting and addressing nodes within a DOM. It's a good question, and one that many have asked before. The piece is probably the definitive consideration of the question and he's a great guide, walking us through all the issues involved.

What is more interesting to me is the wider point that he goes on to make as he lurches into a very useful discussion of how we design languages and layer and model our systems. Almost in passing he addresses an issue I find most fascinating, which boils down to the importance of ease of authoring and syntax in technology.

The mental makeup of human beings means that brevity matters and "intuitiveness" becomes a concern. As an example, so long as phone numbers were short, they were easily memorable; these days, however, with 10+ digit dialing, we rely on Caller ID and programming numbers into our phones. Thus our cognitive faculties and our short-term powers of recall come into question. Being able to control the nicknames and identifiers we use in our buddy lists is a very significant factor in the spread of instant messaging and now applications like Skype. Identifiers matter significantly in this respect. The simplicity of a URI as a key, and memorable, tenet of the web architecture is a similar case in point.

From another angle on the issue, consider that not everyone can tilt their heads enough to handle the parentheses of a typical Lisp program. Most programmers can, on the whole, and some, like the Paul Grahams of the world, even wear it as a badge of honour. Of course a good computer science program should expose budding engineers to this way of thinking and many do. But these, like the Smalltalk gurus and others, are sadly outliers in the software landscape. I would hazard here that the largest impediment to the widespread adoption of the elegant programming model of Lisp is not that something like recursion is difficult to understand but rather the dissonance that the proliferation of parentheses can cause when Jane Programmer scans a listing in an editor. Vacant stares and cognitive overload ensue.

Marc Andreessen will be remembered for many things, amongst others: Mosaic, Netscape, the AOL merger, a little dotcom hubris some might say (but simply youthful exuberance I would say), evidence in the flesh of what a monopoly like Microsoft can do when provoked, and a pointer, along with Jim Clark, to the role of gravity in deflating bubbles (à la the Great Crash). Historians will point to all that and more.

For me though, his choice of the syntax for the hypertext link is his most lasting contribution to technology and to mankind in general. Others argued otherwise at the time and would have foisted semantic doodles on us. The "View Source" impulse that has led directly to the success of the web, that great conversational engine, would have been stymied by much head-scratching by the everyday people who created many a homepage circa 1995-1999. Those much mocked homepages were wonderful assertions of identity, and the lowered barriers to entry enabled many people to land their flag on this here internet where they, their friends, parents and children now live, shop and commune. If he ever receives an honorary knighthood from King Charles, his coat of arms should read

<a href="http://netscape.com">Mozilla Hyperlink Andreessen</a>
It is the most succinct expression of the ethos of simplicity in human history.

In this vein, I was perplexed that in XPath 1.0 you have to write true() rather than true. In other words, the language has no boolean literals: true() and false() are functions, and a bare true is read as a test for a child element named true. Functions, as we know, need parentheses to indicate their arguments, even when there are none. This always trips me up and maybe this is no longer the case in XPath 2.0. Who knows? I certainly haven't cared to look. What is true is that the cognitive impedance this caused me on my first date with the language will forever taint it in my eyes even though I have daily dealings with it.
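
To make the trap concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. That module implements only a small subset of XPath and does not evaluate function calls such as true(), so this is not a full XPath 1.0 demonstration, and the sample documents and element names are invented for illustration. What it does show is how a path engine reads a bare true: as a name test for a child element called true.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance data; the element names are made up for this sketch.
doc_without = ET.fromstring("<order><paid>yes</paid></order>")
doc_with = ET.fromstring("<order><true>surprise</true></order>")


def bare_true_matches(root):
    """Evaluate the path `true` relative to root: a child-element name test."""
    return root.findall("true")


# No child element is called <true>, so the path selects nothing.
# An XPath 1.0 processor converts an empty node-set to boolean false,
# which is why a hand-written `true` silently behaves like "false".
empty_result = bare_true_matches(doc_without)

# If the data happens to contain a <true> element, the same path selects it.
hit_result = bare_true_matches(doc_with)

print(len(empty_result), len(hit_result))
```

The point of the sketch is that nothing fails loudly: the expression is perfectly legal, it just means something other than what the author intended.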

If you had to say "huh?" when you did your first View Source of a web page, would you have gone with that newfangled web thing or would you have written it off as one of those overly complicated buzzwords that you would look at later "when you had more time"? First impressions and snap judgments (à la Blink) count for a surprising amount in these things.

One of the things that I keep thinking we need, and that I hope someone with an itch will build, is a nice XPath expression editor, a component that parses XPath and walks you through the process of adding conditions and formulating expressions. Maybe a wizard or something, with selectors for picking the various kinds of things that are typical when building forms applications, e.g. "this field should be less than the value from this other field". A component that would let you add a library of custom XPath functions that could implement additional rules. Each of these libraries would be able to specify its own editors but most could just be simple drop-down lists. It's not a big thing to do and you can sketch out a nice design for such a component and knock it out over a weekend. Make it open source and be done with it.
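
As a rough illustration of the core of such a wizard, here is a minimal sketch in Python. Everything in it is hypothetical: the operator labels, the function names and the field names are invented for the example, and a real editor would parse XPath properly rather than assemble strings. It builds the "this field should be less than that other field" expression from drop-down style choices and serializes it as an XForms-style bind, escaping the < so that it survives inside an XML attribute.

```python
from xml.sax.saxutils import quoteattr

# Comparison operators a hypothetical wizard might offer in a drop-down list.
OPERATORS = {
    "less than": "<",
    "at most": "<=",
    "greater than": ">",
    "at least": ">=",
    "equal to": "=",
}


def build_constraint(operator_label, other_field):
    """Return an XPath 1.0 expression comparing the current node to a sibling."""
    if operator_label not in OPERATORS:
        raise ValueError(f"unknown operator: {operator_label!r}")
    # '.' is the field being validated; '../name' addresses a sibling element.
    return f". {OPERATORS[operator_label]} ../{other_field}"


def parentheses_balanced(expression):
    """Cheap sanity check for hand-edited expressions.

    Simplification: parentheses inside quoted string literals are counted too.
    """
    depth = 0
    for ch in expression:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def as_bind(nodeset, constraint):
    """Serialize an XForms-style bind element; quoteattr escapes '<' and '&'."""
    return f"<bind nodeset={quoteattr(nodeset)} constraint={quoteattr(constraint)}/>"


expression = build_constraint("less than", "max")
print(as_bind("min", expression))
```

The author never has to type a parenthesis, a slash or an escaped angle bracket, which is exactly the class of mistake the editor is meant to prevent.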

Still, my focusing on the critical necessity of such a component is simply a recognition that hand-authoring XPath can quickly turn into a nightmare of missed parentheses, predicates and selectors. It is true that authoring in XPath doesn't require as much head scratching as, say, XSLT, in which context I first encountered the language, and which mere mortals like me will never understand even though I used to write in Lisp. But it is something that raises the bar quite high for the average author. In contrast there is something strangely satisfying about editing a style sheet (or maybe it's just that I've grown accustomed to that over the years). Something like this, I expect, is what lies behind the impulse for Web Forms 2.0 and the WHATWG: a pragmatism born of weighing programmers' familiarity with scripting languages and a tenacious devotion to backward compatibility.

More generally though, the issue is that getting general users to author structured content is a big problem; indeed it is a nigh insoluble issue. And all the software that we produce cares very much about structure. The wonder of the spread of HTML and XML is that, ever since Berners-Lee, Bray and others unleashed their projects on us, human beings have adapted to angle brackets, < >, and now don't see them as much ado about anything. The same thing goes for CSS: the tradeoff that was made for syntax is now bearing fruit.

In the past, I've had to deal with writing a number of applications that have had things like 60,000 lines of JavaScript. The messy reality is that of dealing with things like focus, issues of scope in browsers, the power and contradictory complications of late-binding scripting languages, the earlier lack of powerful debuggers and DOM inspectors, the legacy of box-model quirks, as well as the powerful notion of stitching together user interfaces by leveraging the incremental rendering and multi-threaded downloading that is the basis for the hypermedia browser. All this can be done; you can even have page editors and rich spreadsheet and presentation engines in JavaScript. I've written about the heroism of those who write and maintain such things. Google (Suggest, Gmail, Maps and more), Yahoo/Oddpost, IBM, and many others have competitive edges because they have developers with the skillset and, more importantly, the insane programming discipline required to crank out the composable browser voodoo that causes much serendipity for end users. What, however, about the Long Tail of Application Authoring on the web? That's what VB and other environments have catered to on desktop clients. The endpoint in all of this is when a team lead or department head can compose an application for their local concerns without much (or preferably without any) handholding from the IT department, if indeed they have one. People just want to be able to handle their little processes and get on with things. This is what web publishing, and especially blogs and wikis, have done by lowering the bar for authoring, with the attendant benefits in communication and global conversation.

Thus, one of the main questions that will determine the adoption (or lack thereof) of XForms or Web Forms and their ilk is the perplexing matter of whether human beings in the next decade will become as inured to writing true() in an expression as they have become to the angle brackets of HTML and XML. Put a different way, it could well be something completely orthogonal to the merits of the underlying technology that will determine the outcome: it will be the appearance of the kind of code you see when you do View Source on the first cool forms application you encounter. I'm suggesting then that the language acquisition cost, and what I'm terming the cognitive impedance in the average human being of parentheses for functions and forward slashes for selectors, will determine the adoption rates of XForms technology.

I've been working on Forms, and XForms in particular, for the past couple of years. I happen to think that XForms has gotten the abstraction and decomposition right and that it is a great means for lowering the skillset required to model and author the kind of form-based applications that are the glue of the many custom processes of the Long Tail of Software. Indeed my only nitpick is that the specification doesn't include upfront the equivalent of the JavaScript confirm function as a concession to usability, so that you can easily put up a message for the user and let them decide whether they "are really sure that they want to submit their form" or not. It can be done, I've been told, but it isn't emblazoned in the specification. By being too general (they'd argue that something like this needs to consider multiple modalities and the like), they are missing out on something interaction designers would immediately point at as a shortcoming. Again, first impressions count.

I see a bright future in which that much maligned Forms "programming model" that is at the core of the Lotus Notes platform could be brought to the web platform, leveraging the native primitives of the Web style (hypermedia, URIs, linking, etc.). XForms is singularly well suited to do this. For those unfamiliar with Notes/Domino, my handwaving elevator pitch is that it is a platform essentially founded on the fundamental insight that a huge class of applications can be built based on just a few compositional building blocks: Forms, Views and a standard file format (the note, in Notes terms). The brouhahas made about messaging, security, directory services, and all that paraphernalia that marketing people throw about when they pitch the platform to you are all syntactic sugar around the core competency of Forms and Views and the client and server processes that can manage them. A whole cottage industry of business partners is doing very fine, thank you, building custom and evolvable applications for businesses, small and large, everywhere. The fact that email can be construed as a forms application is just a side benefit and detracts from the real focus of the platform. This is much misunderstood by people whose only encounter with Notes is as a mail client. It's really just a forms and views app for people and processes. Incidentally this same platform is most likely what is funding my current work and much of the IBM Software Group, even as resources are spent on other "sanctioned" and more "strategic" approaches. C'est la vie.

One thing I've noticed is that many people seem to want to ignore the lessons learned from the Notes world over the past 15 years and behave as if the forms space is terra incognita - a brave new world indeed. On the contrary, the Forms problem and the wider Process problem are nothing new. These are things that have been with us almost from the time that societies became organized and larger communities formed, as Barry Briggs has pointed out. Whenever I plumb those depths however, I am reminded of the notion that Joel Spolsky so eloquently coined, that in software it is easier to write code than to read code. In software terms, 15 years is an eternity, hence we are fated to reinvent and rewrite anew old software. Just look at WS-* as opposed to CORBA. Sometimes I almost despair at this notion, since it bespeaks a total lack of curiosity and historical memory, even among those who are sitting in the same building as people who have learned comprehensive lessons about the many problems of forms: evolvable schemas, metadata, annotations and the like.

Of course I'll continue to build the tools, the processors, the renderers and the infrastructure plumbing to make the forms dream an easier reality. I'd still argue that adoption will ultimately come down to whether the View Source impulse can be leveraged and whether the average Joe will get turned off by things like true() instead of true. If I were inclined to be a research type, I'd imagine a case study or paper titled something like
"The Importance Of Syntax In Technology Adoption - Historical Insights From The Trenches 1940-2005"
A more prescient Historian of Science would note that the issue of notation in mathematics is similarly a longstanding area of concern. A linguist would add insights about how different societies adopted different writing systems and the impact of the writing system on cognition and development. Anthropologists, sociologists or psychologists would have much to say in this vein.

Technologies like XSLT and XForms, which are the prime users of XPath, are still in their infancy (as are some of the other takes on this problem space from Adobe and Microsoft). Despite having many implementations at its launch, XForms is still ambling towards its inflection point and I'd hazard that the majority of XForms templates and transactions are machine-generated. Fair enough perhaps. Wearing my prediction hat however, it will be very interesting to see what happens 6 months after the default installation of Firefox includes its XForms extension. We're going to see the same thing in microcosm now that Mozilla have announced that Firefox 1.1 will include SVG, which has a more limited utility for mass audiences. With very little tongue in cheek, I'd wonder what contact with a vastly larger audience of form authors will do to XForms implementors. I'd lay bets on the first XForms engine that implements a "quirks mode" in its XPath evaluation engine to deal with common patterns of mistakes in hand-authored forms. It will be a case of omitted parentheses rather than browser tag soup that will cause much fretting in mailing lists the world over. I wonder whether Peter-Paul Koch will then have to add an XForms or XPath section on his invaluable site that documents browser quirks. The litmus test will be the teenager doing a summer job in a lawyer's office who is asked to write a little forms application to help some workflow. If parentheses make their eyes glaze over, I doubt XForms will be used for that custom application (although I hope to be proved wrong). If the typical simplified wiki syntax (whatever Jotspot or SocialText are using) is more intuitive, that will be what gets used.
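
For the sake of argument, here is a minimal sketch of what the smallest possible "quirks mode" pass might look like, written in Python. It is purely hypothetical: no XForms engine is known to do this, the function name is invented, and the regular expressions are a crude stand-in for a real XPath tokenizer. It rewrites a bare true or false into a function call, leaving quoted strings, path steps and existing calls alone.

```python
import re

# Split an expression into quoted string literals and everything else,
# so that text inside quotes is never rewritten.
_LITERAL = re.compile(r"""("[^"]*"|'[^']*')""")

# A bare true/false: not part of a longer name, path step, attribute or
# variable reference, and not already followed by an opening parenthesis.
_BARE_BOOLEAN = re.compile(
    r"(?<![\w/@:.\-$])(true|false)(?![\w:.\-/])(?!\s*\()"
)


def quirks_fix_booleans(expression):
    """Rewrite bare `true`/`false` as `true()`/`false()` (hypothetical quirks pass)."""
    parts = _LITERAL.split(expression)
    # With one capturing group, the odd-indexed parts are the quoted literals;
    # only the even-indexed parts are candidates for rewriting.
    for i in range(0, len(parts), 2):
        parts[i] = _BARE_BOOLEAN.sub(r"\1()", parts[i])
    return "".join(parts)


print(quirks_fix_booleans("../paid = true and not(false)"))
```

The obvious cost is the one that browser quirks modes have always carried: an author who genuinely means a child element named true now gets a different answer, so the repair changes the meaning of some legal expressions.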

The other point Birbeck raises, and here the argument is much stronger, is about the tradeoffs that designers consider when it comes to separating the processing, addressing, eventing and styling models. He speaks to an architectural truism whatever the domain in question. This is where his clarity of thought comes to light. A clarity that stems from being one of the exalted "Invited Experts" on the XForms and HTML W3C working groups, and having an innovative product that daily explores this landscape.

If, for example, you started off in the mad, slapdash world that was early browser development, you might opt instead for a very pragmatic viewpoint on these issues. That's the kind of weighing that has characterized the Mozilla folks. Håkon Lie and Hixie of Opera fall into this category even if they appear to take it to almost militant extremes at times. Still I see where they are coming from. Your take on these things is coloured by contact with the millions of end-user authors and the daily reality of tag soup.

I've only spoken with the Opera folks a couple of times and always forgot to ask them the burning question I have. How difficult is it, by the way, to add an XPath engine to a browser? I've always assumed that the real reason (as opposed to the stated reason, "we have everything we need in scripting and CSS") for Opera's almost visceral objection to XForms has been a concern over footprint, since the same codebase is used on desktop and pervasive clients. But with the necessity of XML engines in browsers (with the now indispensable XMLHttpRequest) and Moore's law at work on your more limited clients, at what stage do protestations about XPath (and hence XForms, which is a dependency engine over and above XPath) become simply bywords for inertia? As a very conservative software engineer personally, I am similarly not inclined to jump on bandwagons just because they are in vogue. Still I'm interested in the architectural thinking that lies behind their position.

If, on the other hand, like Birbeck, you've drunk the XForms Kool-Aid, you'd be a generalist and will be inclined to see almost everything through rose-tinted glasses in terms of separation of model from UI and from eventing and actions. You might recite MVC chapter-and-verse as you lull yourself to sleep at night, self-satisfied at your specification. The irony is that the visual effect of mere parentheses on an average teenager could cut short your sweet dreams of empire building.

Of course it's not always so cut and dried; sometimes you're just in the middle, trying to figure out which of the 45 latest buzzwords you're expected to spout fluently tomorrow to get a raise that beats inflation - or even a promotion. Or maybe you're just trying to code for food and get some real work done by helping a doctor's assistant keep track of HMO paperwork more efficiently or something of that sort. Oh well. Food for thought in any case.

Cross-posted at the Inside Lotus weblog.


1 comment:

James Governor said...

the force is strong within you. seriously i have been having a wander and really enjoy the work you do