Friday, March 17, 2006

Minutiae

A surprisingly large part of a software engineer's life is spent dealing with the little things. Much as I like to write about grand designs and architectural issues or people, processes and communities, all too often, the devil is in the details and I get lost chasing technological quirks. Herewith a sample of just the past week's minutiae, perhaps fodder for historians of science or anthropologists.

Boolean Identity Crisis


I was going over some code in my pet XForms processor implementation and wondering what was going wrong - incidentally IBM has written (at least) 6 forms processors in the past 4 years along with contributing to the Mozilla XForms effort but we won't get into that - I'll save that for a business school case study or something.

After an hour or so of head-scratching this was what I found. In XML Schema Datatypes, the lexical space of boolean values includes not only "true" and "false" but also "0" and "1". I assume the inclusion of 0 and 1 in the specification comes from the legacy of the C language and presumably that makes sense. But I ask, was that a wise decision? The designers of the specification chose a binary representation in a textual format.

The forms processor is written in Java. It turns out that in the Java language, the lexical space of boolean values is case-insensitive "true" and "false". So if you have a mapping layer that goes from schema datatype to Java you have to add special case code to deal with 0 and 1. Presumably also in the other direction, you have to make sure that you don't output "True" instead of "true". I had been linking to a library that hadn't bothered to implement this more robust logic and ran into this boundary condition. Oh well, I thought, I'll have to add an adapter around this library or rip it out and roll my own - an assignment for the weekend.
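For what it's worth, the adapter I have in mind is tiny. Here is a minimal sketch in Javascript rather than Java, assuming all the mapping layer needs is a strict parser and a canonical serializer. The names parseXsdBoolean and formatXsdBoolean are made up for illustration and do not come from any library.

```javascript
// Hypothetical helpers; the names are illustrative, not from any library.

// Parse an xs:boolean lexical value. XML Schema allows exactly
// "true", "false", "1" and "0". trim() is used here as an approximation
// of the schema's whitespace collapsing (an assumption of this sketch).
function parseXsdBoolean(text) {
  switch (String(text).trim()) {
    case "true":
    case "1":
      return true;
    case "false":
    case "0":
      return false;
    default:
      // "True", "TRUE", "yes" and friends are not in the lexical space.
      throw new Error("Not a valid xs:boolean: " + JSON.stringify(text));
  }
}

// Serialize back to the lowercase form, never "True" or "TRUE".
function formatXsdBoolean(value) {
  return value ? "true" : "false";
}
```

The point of being strict on the way in and canonical on the way out is that the quirk gets handled once, at the boundary, rather than leaking into the rest of the code.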

I vaguely remember Sam Ruby mentioning that this was one cause of interoperability issues between SOAP implementations and it stands to reason: we've codified an identity crisis.

Structured Data Footnotes


Jon Udell is a careful man whose blog works differently from the rest of the InfoWorld blogs. The content is well-formed XML, and it follows certain self-imposed rules.
This discipline enables him to work miracles and regularly come up with lots of cool applications. But in the wider world, getting people to author structured data is an often intractable problem. XML is often structurally invalid; as an example the Google Reader team have reported that 15% of feeds on the web are not well formed and that is before considering the feeds' semantic validity.

Thus I was tickled by this footnote he wrote when playing with microformats.
The difference? I ran the page through HTML Tidy to get well-formed XML 2...

2   It wasn't entirely automatic, unfortunately, I had to wrestle with character encoding issues too.
Ladies and Gentlemen, I give you The Gruesome Twosome of Computer Science: Structured Data and Character Encoding. We call them footnotes.

Highlights, Lowlights


It started with an offhand comment, a report that one couldn't set the background color on text in the rich text editor in Mozilla Firefox. Every other command on the editing palette worked. You thought, "15 minutes tops, I'll take it".

So you look up the list of command identifiers in Internet Explorer and notice that the execCommand method has an attribute named BackColor which
Sets or retrieves the background color of the current selection.
You head back to the Midas spec for Mozilla and see that for the backcolor entry
This command will set the background color of the document.
Hmm... Your first thought had been that it was a case-sensitivity problem, "backcolor" rather than "BackColor", but the documentation notes that case doesn't matter. So the problem is rather what you highlighted, in Mozilla, backcolor applies to the document and not to the current selection. This means that you have to use a different identifier to achieve the same effect in Mozilla. You promptly notice one called hilitecolor
This command will set the hilite color of the selection or at the insertion point. It only works with useCSS enabled.
Okay, someone decided to not use the same command as Internet Explorer, fair enough, you've seen worse. As you change the code, you say, it's Mozilla and CSS is enabled by default so all I have to do is switch to using hilitecolor.

Of course that doesn't work. You then tell yourself that you'll just enable CSS with the useCSS attribute (you had looked more closely at the code and noticed that CSS styling had been deliberately turned off for some obscure reason).

Hmm, that didn't work...

So you go back to the spec and you notice a styleWithCSS attribute. Interesting... You try that instead of useCSS. Of course, you happen to be testing using a version of Firefox (1.04) that doesn't support that attribute so there's an exception. Presumably this attribute was introduced in Firefox 1.5 or something. Still it's a good thing that you are using an older version for testing otherwise this would have been another bug (feature doesn't work in Firefox 1.0 etc).

You bang around for a while and go back to the original Midas demo which seems to work correctly if you have the "use CSS" checkbox checked.

You view the source of the demo to figure out what could possibly be making it work and you notice that they are setting the useCSS property to false when the checkbox is checked. Huh? They set useCSS to false in order to enable CSS!

So you go back and look more closely at the documentation. First you notice that the useCSS property is deprecated. Hmm... the plot thickens. Then you read this
useCSS - value: true/false

Note: This command has been replaced with styleWithCSS. It takes the same values as styleWithCSS, but the meaning of true and false are inversed.
Up is down in other words. It's like Alice in Wonderland or something. Do note that there is no word on what version the replacement occurred. Nor indeed is there any footnote on why true and false are "inversed".

Anyway you've finally figured it out, you make the change to enable CSS styling only when applying the hilitecolor attribute, add some future-proofing to use the styleWithCSS attribute in case the deprecated useCSS attribute is removed in later browsers, submit the patches and 3 hours of your life have passed. It was frustrating but there's an object lesson somewhere. If you are building a robust cross-browser rich text editing application, you have some code that has to go through this rigmarole.
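The rigmarole ends up looking something like the sketch below. It is not the actual patch: the function names and the isMozilla flag are placeholders, and it assumes (as in the tale above) that styleWithCSS throws where unsupported, that the deprecated useCSS takes the inverted value, and that the editor wants CSS styling switched back off afterwards.

```javascript
// Switch CSS styling on or off, preferring styleWithCSS and falling back
// to the deprecated useCSS, whose true/false meaning is inverted.
function setCssStyling(doc, enabled) {
  try {
    doc.execCommand("styleWithCSS", false, enabled);
  } catch (e) {
    // Older browsers: styleWithCSS is unknown, so use the inverted flag.
    doc.execCommand("useCSS", false, !enabled);
  }
}

// Hypothetical wrapper: set the background color of the current selection.
// isMozilla stands in for whatever browser test the editor already uses.
function setSelectionBackground(doc, color, isMozilla) {
  if (!isMozilla) {
    // Internet Explorer: BackColor applies to the current selection.
    doc.execCommand("BackColor", false, color);
    return;
  }
  // Mozilla: backcolor means the whole document, so use hilitecolor,
  // which only works while CSS styling is enabled.
  setCssStyling(doc, true);
  try {
    doc.execCommand("hilitecolor", false, color);
  } finally {
    // Restore the editor's deliberate "no CSS" setting (an assumption).
    setCssStyling(doc, false);
  }
}
```

In the editor this would be called with the editable document, for instance setSelectionBackground(editorDocument, "#ffff00", isMozilla), where editorDocument is whatever handle you hold on the editing frame.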

I have a confession. I recounted the above tale because I feel guilty: I could have prevented all this four years ago.

Internet Explorer was the first browser to introduce rich text editing. It was not pretty since they didn't want it to be as good as Microsoft Word, but it worked reasonably. IBM later contributed resources to Mozilla to beef up its rich text editing. Indeed four years ago, I was asked to review the resulting Midas spec by my colleagues who had been working on that functionality. The spec looked serviceable, pointing mostly to the Microsoft documentation since they wisely chose to follow the de-facto standard. I reported a few bugs with the initial implementation and we built a rich text editor around it. It's widely used on the web these days.

I obviously was a bad reviewer because I certainly didn't notice that there was a change in the semantics of the backcolor attribute from "current selection" to "document" nor indeed that a new attribute, hilitecolor, had been introduced in Mozilla. There is absolutely no reason for these discrepancies. Nor do I want to go down the forensic trail that explains why you can't have a background color on text if you don't style with CSS. That's a rathole of its own.

So what do we have here? A de-facto standard was implemented but someone took the liberty to change one aspect of it for whatever reason - semantic purity or something. I suspect I won't be the last developer to waste an afternoon on this or indeed the last user to be cursing about why I can't add a background colour to a piece of text. I hope that Opera and Safari, if and when they have their rich text editing implementation in place, will do a wholesale copying of Internet Explorer's behaviour. I don't want to be chasing these lowlights again. I often see criticism of Microsoft's inconsistent approach to standards but everyone can be as guilty as them on occasion.

Which Side Are You On?


Stefan Tilkov recently asked what's wrong with Javascript? The answer of course is nothing really, it's a fine language as evidenced by looking around current thinking on the language. Indeed Brendan Eich's biggest admitted gotcha about the language is automatic semi-colon insertion. Thus there is no reason it can't be used in environments outside the browsers in which it is most widely deployed as glue - I've played with Rhino on the server-side without any problem.

From what I understand, Jotspot, which incidentally I consider a cunning plan to showcase the Dojo toolkit, uses Javascript as the scripting language on both the server and client. All power to them. Of course with Javascript from the same document executing in both environments, you can get very confused. The answer to one of Jotspot's most frequently asked questions, "Why isn't my Javascript function being called?" is that "your code is not running where you think it is" ergo, you're expecting that the current code you're looking at is server-side rather than client-side, or vice-versa. This is especially true since the object model is likely to be different, the browser DOM is a deliberately constrained environment whereas on the server side you'd want to allow your plugins to do more.

Thus there is a little impedance with using Javascript everywhere. Perhaps it's less confusing to use a different language and syntax for server side code - a tag library in jsp, or embedded java, php, asp or whatever. Or maybe clever syntax coloring in your editor or IDE would do the trick to remind you of the context. Needless to say, you have to decide what side you're on...

I've been working on a project in which the others on my team are seasoned PHP gurus and are occasionally petrified of Javascript - the reason of course being the continued brittleness of the browser platform. When I started work, this bias showed and their initial recommendation was to do as much as possible in PHP on the server-side. They recognize however that we are living in an age of interactivity so we need that shine that comes with moving intelligence to the client. Still when you start doing data-binding and automatic JSON serialization of PHP objects, you get constructs on the browser client that you wouldn't normally use if you were a client-side person. As someone who's very comfortable with both client and server side code (perhaps more comfortable with Java than PHP), I keep running into such peculiarities all the time. Of late I find myself prototyping code in client side Javascript even if it will eventually morph into server side code. Perhaps I need to get more into Python or that Ruby bandwagon. Still you tend to develop a split personality when you develop for the web.

The Null Hypothesis


I got the note: "you can't append a column if you click on a cell in the last column of a table in Internet Explorer".

Huh? I attempted to reproduce the bug and, sure enough, that was the case. Vaguely at the back of my mind I recalled from painful experience in K-station that there were special APIs in the HTML DOM for dealing with tables. Thus I searched the codebase for insertRow and insertCell. Hmmm those functions were nowhere to be found. How were they doing the column insertion, I wondered? My guess was that this was either some innerHTML tricks or simple standard DOM manipulation. Thus I had to dig through the code and eventually encountered the Node.insertBefore conundrum.

Now insertBefore is a method on the Node interface that is part of DOM level 1 specification. Every browser claims at least DOM level 1 support.

It turns out that in certain versions of Internet Explorer the second parameter to the insertBefore call can't be null, you get an Invalid Argument exception. Mozilla handles this condition as one would expect in their Javascript binding; Internet Explorer chokes. [Obscenity]. There was too much code to change to use the HTML-specific methods so I just hacked special case code that ensures that I don't pass a null to Internet Explorer - 3 hours of my life perhaps.
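The special case amounts to something like the following sketch. The helper name insertBeforeOrAppend is mine, and it leans on the DOM rule that inserting before a null reference node is the same as appending at the end.

```javascript
// Hypothetical helper: never hand a null reference node to insertBefore,
// since some versions of Internet Explorer reject it.
function insertBeforeOrAppend(parent, newNode, referenceNode) {
  if (referenceNode === null || referenceNode === undefined) {
    // Per the DOM spec, insertBefore(node, null) appends, so do that directly.
    return parent.appendChild(newNode);
  }
  return parent.insertBefore(newNode, referenceNode);
}
```

For the column bug, the reference node is the cell after the one that was clicked, which is exactly what is missing when the click lands in the last column.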

Now this doesn't amaze me really, thinking back on it, this is probably the reason that somewhere in the bowels of every Javascript library that deals with dynamically adding a new option to a select control, you'll find code like the following:
function appendOptionElement(select, newoption) {
  if (is_ie) // test for internet explorer somehow
    select.add(newoption);
  else
    select.add(newoption, null);
}
The browser is a fragile place and hopefully the frameworks that are being developed will shield you somewhat from such issues, but it is telling that you can't rely on core DOM functionality. Even if these quirks are fixed in Internet Explorer 7, this patched-up code will have to hang around for another 5 years before users will no longer use older versions of the browser. We're in a world of pain in the browser world.

Nulls however are problematic throughout computer science. The arguments around them may sound like angel and pinhead discussions but they are fair questions. What indeed is null? What is zero for that matter? For centuries and throughout the Dark Ages of the West, there was no concept of zero, it took Arabic Algebra to spread that notion. Why should one expect that programmers would have internalized the null concept? Reasonable people can and do differ on how to treat null.

Just now, reading through the latest Dr Dobbs journal, I noticed the following in an article about Consuming .NET Web Services in Oracle JDeveloper
The ATL Server SOAP handler generates an xsi:nil="1" attribute when the element's value is null, or when the array is null or zero-size. Unfortunately, Apache SOAP fails to deserialize UDTs whose fields contain the xsi:nil="1" attribute and expects zero-size arrays to be represented as XSD arrays with the dimension parameter set to zero.
The author then proceeds to outline a variety of workarounds. Now I happen to not be a fan of the SOAP style of programming (in the past I've called it Crusty Old Architecture pronounced SOA with a silent P) but as a developer, I feel the pain. The article is a catalog of kludges and likely mapping errors that have to be worked around - the word "unfortunately" is used entirely too often.

I started my career doing graphical programming thus I perked up when Raymond Chen recently outlined the consequences of invalidating the null window and his anecdote is worth quoting at length.
If however you end up passing NULL as the window handle to the InvalidateRect function, this is treated as a special case for compatibility with early versions of Windows: It invalidates all the windows on the desktop and repaints them.

Even more strangely, passing NULL as the first parameter to ValidateRect has the same behavior of invalidating all the windows. (Yes, it's the "Validate" function, yet it invalidates.) This wacko behavior exists for the same compatibility reason. Yet another example of how programs rely on bugs or undocumented behavior, in this case, the peculiar way a NULL parameter was treated by very early versions of Windows due to lax parameter validation. Changing nearly anything in the window manager raises a strong probability that there will be many programs that were relying on the old behavior, perhaps entirely by accident, and breaking those programs means an angry phone call from a major corporation because their factory control software stopped working.
The null hypothesis strikes again.

Camel Humps


Even when you get past these details, you get into matters of syntax - a longstanding pet topic of mine. Consider the separators that people use for readability. Some people like to use Hungarian notation, others prefer hyphens... well I've already written a hyphenated parable so I'll skip that aspect. Anyway, assume for some insane reason that it makes sense for your spec to have an attribute named "windowTop". There'll undoubtedly be people who will write it "window-top", "window.top" or with some other variant of case, "WindowTop"? If you're dealing with XML where case matters, things will fail. You say potato, I say pubDate anyone?

I've been playing with a product that is essentially a wiki and it turns out that by convention, camel case (or should I write it as CamelCase) is significant in the wiki world as denoting a "WikiWord". I even had to spend half an hour writing glossary entries for these concepts since in our system, user names and page names had to be in that format.

You can imagine however the kinds of issues that arise when you have an html editor that allows you to enter HTML and Javascript along with php code and a specialized wiki syntax - since for some reason, it was decided not to use angle brackets in this product. The most complicated piece of code is going to be the parser. As currently implemented, the parser is a mass (or should I say, a morass) of regular expressions and all kinds of things I'll never understand. There's even special case code in there to handle nested comments in Javascript. What was never foreseen however was that you'd have to deal with user-entered script code. What would happen if camel case is used for variables inside of said script? Well it wasn't pretty when the wiki engine jumped in and treated script as wiki words, let's just say that someone had to come up with a solution.
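To make the collision concrete, here is a rough sketch of one way to keep the wiki engine's hands off script code. It is not what our parser does: the WikiWord pattern, the double-bracket link output and the assumption that script arrives wrapped in script tags are all invented for illustration.

```javascript
// Assumed WikiWord shape: two or more capitalized runs, e.g. HomePage.
const WIKI_WORD = /\b(?:[A-Z][a-z0-9]+){2,}\b/g;
// Assumed delimiter for user-entered script; the capture group makes
// split() keep the script blocks in the result.
const SCRIPT_BLOCK = /(<script[\s\S]*?<\/script>)/gi;

// Hypothetical function: link WikiWords everywhere except inside script.
function linkWikiWords(source) {
  return source
    .split(SCRIPT_BLOCK)
    .map(function (chunk, index) {
      // Odd indexes are the captured script blocks: leave them untouched.
      if (index % 2 === 1) return chunk;
      return chunk.replace(WIKI_WORD, function (word) {
        return "[[" + word + "]]";
      });
    })
    .join("");
}
```

Fed a page containing HomePage in the prose and a PageCount variable inside a script block, this links the former and leaves the latter alone. Of course it is also one more regular expression on the pile, which brings me to the next point.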

There's this concept known as McCabe Cyclomatic Complexity which is basically an application of graph theory to software and is often used to pinpoint potential problems. The basic insight is that if there are too many decisions or paths through your code, it will be buggier, harder to maintain and test. A good guideline is that anything with complexity greater than 15 in this scheme is ripe for refactoring. Luckily I have access to some tools that can generate reports and provide some numbers to validate that nagging sense that a piece of code is getting harder to understand. What worries me is that I keep running into code that laughs at such guidelines - after the latest change the parser code just hit 41 (to give some context, values between 21 and 50 indicate "a complex, high risk program" and with the 50 barrier looming, we are verging towards that notable status of the "untestable program - very high risk"). I'm sure it didn't start out this way and that the steady accretions have been solutions to real problems. Still, technical arteriosclerosis continues its inexorable spread...
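For the curious, the number itself is simple arithmetic on the control-flow graph: edges minus nodes plus twice the number of connected components. A small sketch follows, with function names of my own choosing. The two upper risk bands are the ones quoted above; the two lower ones are filled in from the commonly cited version of the same table, so treat their wording as an assumption.

```javascript
// McCabe: M = E - N + 2P for a control-flow graph with E edges,
// N nodes and P connected components (1 for a single routine).
function cyclomaticComplexity(edgeCount, nodeCount, componentCount = 1) {
  return edgeCount - nodeCount + 2 * componentCount;
}

// Map a complexity value to a risk band. The bands above 20 are quoted
// in the text; the lower two are assumed from the usual table.
function riskBand(complexity) {
  if (complexity > 50) return "untestable program, very high risk";
  if (complexity > 20) return "complex, high risk program";
  if (complexity > 10) return "more complex, moderate risk";
  return "simple program, low risk";
}
```

A lone if/else has four nodes (the test, the two branches, the join) and four edges, so it scores 2: one for the routine plus one for the decision. Our parser at 41 has rather more decisions than that.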

Ruby on Rails and the Zend framework for PHP are founded on favouring convention. We had this code that automatically created the schema for what amounted to a database table. We never actually told users that this was what was going on. Of course once a wider audience started to play with the product, we had to fix the bugs that arose to enforce the convention so that databases would be named as the code expected. A classic case of leaky abstractions I suppose. You can guess the complexity the additional error checking added.

Moving up a level, you get politics - the most aberrant of which have been the feed wars of the past few years perhaps best summarized hilariously by Shelley Powers. Truly, Jesus wept.

Bill de Hóra in an aside wisely noted that
Programmers would rather squabble about minutiae - it seems this transcends community or language choice.
But even when you move beyond whiplash and abrasive personalities, who would have thought that the Atom working group would have spent so much time on the concept of dates. When I saw dissertation-length essays on the various types of dates that might be used in publishing systems, I was content to continue lurking in that community. The thing however is that these details do matter and they are best dealt with up front. The aggregate waste of programmer effort in pursuit of minutiae might keep the profession in business but it surely isn't sustainable.

The East Australian Singularity


I'll conclude by noting that acts of God (or in this case, his flawed proxies: politicians) can come into the picture. Thus I read the following this week: Eastern Australia: Java applications impacted by the change to daylight savings time dates
The running of the Commonwealth Games in Australia in March and April 2006 has resulted in an extension to daylight savings time in the eastern states of Australia, which includes New South Wales, Victoria, South Australia, Australian Capital Territory and Tasmania.

Rather than the clocks being put back an hour at 03:00 on Sunday 26 March 2006, they will now be adjusted at 03:00 on Sunday 2 April 2006. The IBM Software Development Kit (SDK) or Java Runtime Environment (JRE) is not currently aware of the one-time change to the daylight savings time extension. An interim fix is required for your application to have the same time as your operating system.

This change applies to all of the Java environments, irrespective of their setup, whether they are set up to use the operating system time zone information, or the user.timezone custom property that can be set for the SDK or JRE.
I assume television schedules were behind this change to the time/space continuum since the Commonwealth Games have just begun. Or perhaps the politicians were concerned about athletes missing their events or something. Still imagine if you're the Java programmer who has suffered through the badly defined date APIs in Java 1.0 a decade ago, adapted to the improvements in succeeding versions and finally, finally gotten a robust application deployed. Then your manager walks in and tells you about the East Australian Singularity. You're going to have to rework everything since bank transactions might be messed up and there are likely to be blinking clocks etc. We live in a global village so you have to worry about what happens if some mission-critical application that you rely on is being run in Eastern Australia. You can't even have the special case restricted to that country, it's only a sub-region that is affected. At such times I'd prefer being the poor Eastern Australian sheep or lamb, their lot is much easier.

What a way to make a living, if it isn't one thing it's the other. And then there's version hell. Lord help me.


4 comments:

Little Mr Square Eyes said...

"Then your manager walks in and tells you about the East Australian Singularity."

Koranteng, down this way many of us just refer to the extended daylight saving thing as a 'pain in the arse' (granted East Australian Singularity does sound a bit more technically genteel).

If it's any consolation Betty and Phil (aka her maj) popped down under to open the games and she definitely looked like she wanted to be somewhere else as well.

Koranteng said...

To add insult to injury, the Commonwealth Games are not being broadcast in the US. I can't even follow the athletics competition and see how Ghana's athletes are doing - whether Aziz Zakari will wake up on the right side of bed, whether Leo Myles-Mills will fumble the baton in the 4x100 relay. It will have to be a case of monitoring minutiae in the US newspapers if indeed they do cover the games.

Alessandro Vernet said...

Koranteng,

Thank you for sharing all this with us. Definitely very useful! I posted an entry in our blog about what you are saying here regarding 0 and 1 being valid Boolean values in XML schema. And I am just mentioning this here because you might be interested by that blog as you seem to be watching the XForms space.

Alex

Anonymous said...

Thanks for the hilitecolor remark. I found it extremely useful, especially since I had no clue where to look for it.
Cheers
Andrei