Koranteng's Toli: syntax

Saturday, June 14, 2025

Contraction

"Can I help who's next?"

The mantra of the person at the service counter
An aggravation. A provocation to this grammar pedant

For the question always sounds awkward to these ears
As it appears to be a contraction of "Can I help whoever's next?"¹
A contraction born of sheer number of times the question is uttered every day

Mind you, I too would seek to minimize the number of syllables I have to say
I too would pay lip service to the corporate overlords that write my welcome script

It struck me, however, that a little syntax could come to the rescue
That my naysaying can be remedied with some punctuation
For, if rendered as "Can I help? Who's next?"
The insertion of a question mark would solve the concern

The only problem is that, in practice, it never sounds like two questions
And so my ears continue to screech at the damn contraction²:

"Can I help who's next?"

Contraction, a playlist

A mostly dance soundtrack for this pedantic note (spotify version)

Can I help you? by Amnesty
From the crate digger's soul/funk holy grail album
Can I help you by Dani Hageman
Who's next? by Neighborhood Kids
What can I help you with now? by Slick Stomp
Contraction by Dxstxnce
Can I help you by Adam Veldt
Can I help you by Notches
Contraction by Copy
Next! by NCTS
Contraction by Pau viguer
How can I help you by Riverlabs

Bonus beats Excuse me Miss by Jay-Z

...

¹ A former linguist pointed out that "Can I help the person who is next?" might be a better source for the contraction. I concur, although like the service industry invocation of Can I help the next customer? (or client, guest or whatever term The Company's service manual recommends), it is quite the mouthful.
² A political scientist harkened to the more salutary New York City contraction "Next!" which, with its concision and emphatic declarative stance, minimizes the demands on the speaker, clarifies the intent to the hearer, reframes the mooted question as an exclamation, and serves the proletarian interest, being amenable to being uttered by a grunt while still paying lip service to the transactional demands of capital. I heartily recommend the practice.

See previously Public Nuisance Number 64, Less Unfortunate Casualties and Ode to the word nuisant

File under: Small Things, language, syntax, grammar, bureaucracy, culture, observation, perception, humour, poetry, toli

Writing log: May 31, 2025

Friday, December 04, 2009

66 Ways to Franco

It was the kind of thing that you found yourself doing in the middle of the night, musing on an idle question born of two of your concerns: musical obsession and software anthropology. Counting the ways to Franco was an excursion into the realm of metadata, matters of syntax, and a contemplation of the hive mind of the web. The initial insomniac impulse was to create a playlist; your search, however, found a surprising 38 variants of the name in your library and you couldn't locate the song that had triggered your nocturnal foraging.

You remembered that at the beginning of the year you only had a couple of his albums in your collection but the music blogs of this world wide web had sprung to life and you'd steadily filled the gaps while reading them. In this golden age of music distribution, all one needs is a vague memory and an internet connection to fulfill one's aural titillation. Let's see, a cursory glance shows that the mp3 collection now stands at 25,494 songs, 186 GB, or 99 days of non-stop music - probably a third of which was acquired over the past year. Now you do still spend a lot on your musical vices, but you can imagine that it is highly unlikely that $8,500 (at iTunes or Amazon pricing) would have left your insubstantial wallet during this Great Recession. No, your collection grew by osmosis. Moving on...

Your count was akin to last year's discussion about the bewildering number of ways people mangle song and artist names. That led to the sight of those Top 100 ways to write Guns N' Roses – Knockin' on Heaven's Door. What a strange glimpse of the musical Tower of Babel we now have. Take a lexical curio like Guns N' Roses (call it typographical eccentricity), add a few apostrophes and you'll realize that the children of Mr Special Character and Mrs Structured Data are blessed ones. These are the minutiae that software people - and that unwashed sub-clan, the database denizens - have to deal with on a regular basis. Whole businesses have been founded on making sense of such messy data. Information retrieval is the general term of art, and metadata, well, metadata is the data about data. It's one of the hard problems in your line of work.

So what do we have then? Through a set of historical accidents over the past decade - notably the piecemeal standardization of ID tags in the mp3 file format, the mp3 file format itself, the rapid adoption of the internet led by the web, improvements in storage technologies, the development of new portable music players and the decentralized mass digitization and distribution of music - we can behold the glorious results of a democratic exercise of mass data entry. During this time, millions of ordinary people took to their computers and ripped their cd collections. Millions more downloaded and shared this great social bounty. True, the more prudent laggards waited until there was commercial affirmation and legally sanctioned avenues for their digital music. Throughout, however, this mountain of music had to be labeled.

The inevitable curators came along fairly early on in this process - the online databases, Gracenote and then Freedb, to help automate things and identify the cd once you loaded it on the computer. Their altruism however didn't stem the tide of data entry (someone after all had to have entered it once and we all know how that goes). Offerings like MusicBrainz have emerged in recent years as repositories of high quality music metadata, ostensibly on a mission to bring accuracy and fingerprinting to digital music. The commercial services too now loom large as major distribution points, but they too license their listings from some of these databases, and it shows: errors everywhere. More puzzling is that the record companies haven't been of much help; they too don't pay attention to the details of the names of the artists they supposedly promote, nor indeed the titles of the songs. They, like the iTunes and Amazon stores, aren't perfect at information hygiene. You see typos and plain wrong labeling of music. You don't have to take the word of an opinionated metadata curmudgeon; the proof is in the existence of a vibrant ecosystem of tag editing software, free and even commercial. Imagine that, people pay for a product to help them label their music.

Given that even the big boys don't label their music accurately, it has been a true free-for-all and that's not even taking into account the trend of the past couple of years of digitizing (and sharing) all the lost vinyl. I'll only note that the amount of African music that is being rediscovered is frankly startling. The labour of love of those who find and clean dusty grooves, scan album covers, digitize and share their musical memories is a true surplus for society.

It turns out that all these musical curators and aggregators have only been partially successful. It really is a problem with the human factor. When you have humans doing data entry you'll have errors. When you have on the order of millions doing data entry, you'll have large numbers of errors. These glitches appeal to me, truth be told. People label things to remember them and the patterns they use are worthy artifacts. The variants, I'll suggest, are emblematic of both folk memory and the mass creation of semi-structured data. If I could, I'd write an ode to the mp3 tag. In the meantime, I set about to count the ways to Franco.

Last.fm was my weapon of choice - a music recommendation system with attitude. For one, they have been gathering data from all and sundry for years now - they call it audio scrobbling. They gather data about what people listen to, crunch away, and make contextual recommendations. They deal with huge amounts of data and the attendant complexity. One early strategy to work around the inadequacies of ID tags in mp3s was to simply escape from their confines and allow users to tag artists, songs and albums on the website and watch the free-form folksonomy emerge. Web-savvy as they are, they organized these musical objects, each with its url and wiki page. People like to discuss songs, albums and artists. Few algorithms can handle the notion that Orchestra Baobab recorded two albums in 1975 under the Orchestre Bawobab moniker. You need a human intervention to account for this kind of peculiarity. And so they did. It's a simple application of collective wisdom: watch what users do and organize around it. They can even offer radio stations based on tags. At a certain point also, they tried a fingerprinting technique that would process their users' libraries to allow them to normalize the metadata associated with a piece of music. Taking this further, they can simply ask and allow people to correct spellings or suggest alternatives for artists that performed under different incarnations. Once a critical mass is reached you can automatically apply this feedback to tend to this garden of metadata. This is the business of web scale identifiers.
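The "critical mass" idea is easy to picture in code. What follows is only a guess at the shape of such a rule, not anything Last.fm has documented: the function name and both thresholds (`minVotes`, `minShare`) are made up for illustration.

```javascript
// Hypothetical sketch: promote a crowd-suggested spelling once enough users agree.
// minVotes and minShare are invented thresholds, not values from any real service.
function acceptedCorrection(votes, minVotes = 50, minShare = 0.8) {
  // votes: Map of suggested canonical name -> number of users who suggested it
  let total = 0;
  let best = null;
  let bestCount = 0;
  for (const [name, count] of votes) {
    total += count;
    if (count > bestCount) {
      best = name;
      bestCount = count;
    }
  }
  // Apply the correction only when the sample is large and the agreement is strong.
  if (total >= minVotes && bestCount / total >= minShare) {
    return best;
  }
  return null; // not enough agreement yet: leave the tag alone
}
```

With three votes nothing happens; with a hundred votes split 90 to 10 the majority spelling wins; with a 60 to 40 split the humans are still arguing and the code stays out of it.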

And so we come to Franco. Not the fascist Spanish dictator, whose despicable legacy is still paradoxically a touchstone for some American neocons. No. For most Africans, Franco is all you need to say to signify great, pulsing guitar-driven music, rumba, soukous, social commentary and old time good fun. All these are the elements of Franco. It's all the same good, liquid music and great memories of excursions on the dancefloor, a long career spanning four decades. It's been twenty years since François Luambo Makiadi passed away, time enough to count.

And so I counted:

There are at least 66 ways to get to Franco.

I must admit looking at the list gave me pause. How do people remember a musician? How do we remember a piece of music? Only about 9,000 users of Last.fm seem to listen to Franco and yet here we find world class variety. He appears in 66 different guises to the world if you exclude the 3 obvious mistaggings and misspellings. The labelers in a pool of 1.5 million Guns N' Roses listeners could only mangle that band's spelling 56 ways. What was so special about Franco, I wondered, as I looked at the list? The obvious answer is that much of his music is not available commercially in digital form. Instead, a lot of his listeners are typing up the records and cassettes.

Capitalization concerns surface immediately. Some people like ALL CAPS, others are lower case freaks. That is par for the course in a world where search engines basically ignore case. (Sidenote: looking at the top search trends of the year, it appears that users use lower case in search engines, few bother these days with "Britney Spears" when "britney spears" will do. In this age of mobile phones, it seems that the shift key is being used less). Sidenote: we won't digress onto Camel Case discussions at this stage.

Most people think of Franco as synonymous with the band he founded: OK Jazz. But how do people deal with the abbreviation? How does one spell OK? Is it rather "Ok" or "O.K."? Opinions are varied. The OK Jazz band was so named because they began as the house band in the OK Bar in Leopoldville (now Kinshasa). It should be simple then: Franco & OK Jazz say - assuming you go with no dots.

Matters of punctuation, however, open up a can of worms; we stray onto typographical concerns, with pronounced eccentricity in the choice of separators. For delimiters, we see dots, semi-colons, slashes, commas and hyphens.

People have different conventions for conjunctions: it's a case of ampersands (&) for the many and fully spelled out for the few.

We have the language issue, the band was named in French so we'd expect "et", but some labelers are English-speakers hence we get a few instances of "and", "with" and "featuring" showing up. Incidentally the folks in the forums at MusicBrainz will regale you with tales about the epic wars over the conventions for dealing with the word "featuring". Briefly, some people spell it out, others contract to "feat" or "feat." - with the punctuation, or even further to "ft.". Suffice to say that it was much like the egg-cracking debate in Gulliver's Travels.

Lest you think that punctuation doesn't matter, let me interject the following anecdote. Tony! Toni! Toné! were named as such in their debut album, "Who?". In later albums, they were called Tony Toni Toné - with the exclamation points removed. Which is the more accurate name for the band I ask? Could you spin a story from the missing exclamation points? Well I'll engage in mindless speculation on this typographical mystery - stay with me. It's obvious: they changed record label and lost the rights to their name (much like The Jackson 5 had to become The Jacksons when they left Motown). They were shrewd in their negotiations and the price they paid was the removal of the exclamation points. The transformation was from 3 ejaculations (those 3 exclamation marks) to a sedate sentence, perhaps indicating a newfound maturity. Truth be told, the music was better without the exclamation histrionics yet it clearly is the same band. For what it's worth, Last.fm and Amazon both normalize the name without the exclamation points. Anyway, I won't pursue this tall tale further. Back to Franco and OK Jazz...

The OK Jazz band was an orchestra with a shifting cast - in Congo and much of Africa after the second world war, there was a great flowering of such orchestras as proving grounds and incubators - so you have some renderings going with Orchestre OK Jazz. Again native language comes into play, for the English it's orchestra instead of the French orchestre.

Then there are the nicknames and honorifics. Simply put, Africans love titles. OK Jazz acquired the "Tout Puissant" prefix (almighty, literally all powerful). Franco acquired the appellation, "Grand Maître" (Grandmaster). Add in the grammatical concerns and you expand the choices; is it "le tout puissant" or "son tout puissant", ergo is it "the almighty" or "his almighty"? Le Grand Maître Franco & le T.P.O.K. Jazz perhaps? Franco Luambo Makiadi & OK Jazz? Sometimes also, Franco's full name is spelled out as if to underline the vastness of his catalog. And when doing this, the name order varies.

At its most extravagant you'd get something like Le Grand Maitre Franco Luambo Makiadi & Le Tout Puissant Orchestre OK Jazz.

Even then, would you contract to TPOK Jazz? Or T.P. OK Jazz? Or T.P. O.K. Jazz? The plot thickens.
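Much of this variety (case, dots in abbreviations, separators, conjunctions in two languages, a stray accent) can be folded together mechanically, at least as a first pass. Here is a minimal sketch, assuming a hypothetical `artistKey` helper and hand-picked word lists; it is not how Last.fm or MusicBrainz do it, and it is deliberately lossy.

```javascript
// Hypothetical helper: reduce an artist credit to a crude comparison key.
// The word lists are illustrative assumptions, not any real service's rules.
const CONJUNCTIONS = new Set(["&", "and", "et", "with", "featuring", "feat", "ft"]);
const ARTICLES = new Set(["le", "the", "son", "his"]);

function artistKey(name) {
  return name
    .normalize("NFD")                  // decompose accented letters into letter + mark
    .replace(/[\u0300-\u036f]/g, "")   // drop the combining marks: "Maître" -> "Maitre"
    .toLowerCase()                     // ALL CAPS and lower case collapse together
    .replace(/\./g, "")                // "O.K." -> "ok", "T.P." -> "tp", "feat." -> "feat"
    .replace(/&/g, " & ")              // make sure the ampersand stands alone as a token
    .replace(/[;\/,-]+/g, " ")         // semi-colons, slashes, commas, hyphens become spaces
    .split(/\s+/)
    .filter((word) => word && !CONJUNCTIONS.has(word) && !ARTICLES.has(word))
    .join("");                         // "tp ok jazz" and "tpok jazz" end up identical
}
```

Under these assumptions "Franco & OK Jazz" and "FRANCO ET O.K. JAZZ" share a key, as do "Franco et le T.P. OK Jazz" and "franco / TPOK Jazz". The Tout Puissant billing still keys differently from the plain one, and no amount of string folding will tell you that Luambo Makiadi is the same man.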

With such a long career, there was inevitably a shifting of focus; on some albums it was the band that was the lead, on others it was Franco who took the spotlight, and at other times other members took center stage (Sam Mangwana, Vicky, Tabu Ley Rochereau and so forth). The names varied accordingly. It was all Franco, and it was all good, if you don't mind my saying. Trust me, pick almost anything he recorded and you'll be a happy camper.

But to get to my point. The music I'd been looking for at the midnight hour was the following album:

Franco et le T.P. OK Jazz sing for Mobutu

Over his career, there were a number of occasions where Franco had to pay obeisance to his patron, that murderous dictatorial rogue, Mobutu, kleptocrat without equal. The music on those few albums was not the best that he produced in his illustrious career. True, the tracks were danceable but they weren't ecstatic as usual. Some have even detected elements of irony in some of the songs - subversive dog-whistles that undercut the dictator's purpose and propaganda. I imagine some Africanist historian writing the definitive study of this phenomenon, perhaps something titled Musical Resistance in Dictatorial Times in 20th Century Congo: Rumba as Social Subversion. Interestingly enough, as you can see, the word Franco doesn't appear on the billing of the album, it is just plain old Luambo Makiadi. Twenty-five years after its release, I couldn't find Candidat Na Biso Mobutu when I searched my iTunes and Winamp libraries for Franco's music, and it figures: he didn't need the dictator's bloodstains attached to his musical name. Franco was a smart man, he knew all about branding. He is sorely missed.

File under: whimsy, metadata, data, music, Franco, naming, syntax, structure, standards, software, technology, observation, Africa, toli

Friday, March 17, 2006

Minutiae

A surprisingly large part of a software engineer's life is spent dealing with the little things. Much as I like to write about grand designs and architectural issues or people, processes and communities, all too often, the devil is in the details and I get lost chasing technological quirks. Herewith a sample of just the past week's minutiae, perhaps fodder for historians of science or anthropologists.

Boolean Identity Crisis

I was going over some code in my pet XForms processor implementation and wondering what was going wrong - incidentally IBM has written (at least) 6 forms processors in the past 4 years along with contributing to the Mozilla XForms effort but we won't get into that - I'll save that for a business school case study or something.

After an hour or so of head-scratching this was what I found. In XML Schema Datatypes, the lexical space of boolean values includes not only "true" and "false" but also "0" and "1". I assume the inclusion of 0 and 1 in the specification comes from the legacy of the C language and presumably that makes sense. But I ask, was that a wise decision? The designers of the specification chose a binary representation in a textual format.

The forms processor is written in Java. It turns out that when Java parses a boolean from a string (Boolean.valueOf and friends), the lexical space is a case-insensitive "true"; anything else, "1" included, comes out as false. So if you have a mapping layer that goes from schema datatype to Java, you have to add special case code to deal with 0 and 1. Presumably also in the other direction, you have to make sure that you don't output "True" instead of "true". I had been linking to a library that hadn't bothered to implement this more robust logic and ran into this boundary condition. Oh well, I thought, I'll have to add an adapter around this library or rip it out and roll my own - an assignment for the weekend.
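The adapter is small enough to sketch. The forms processor was Java, but to stay with the one language that appears elsewhere in this post the sketch is in Javascript; the function names are hypothetical, and the only thing taken from the specification is the set of four legal lexical forms.

```javascript
// Hypothetical adapter for the xsd:boolean lexical space: "true", "false", "1", "0".
function parseXsdBoolean(text) {
  // Strip surrounding XML whitespace (space, tab, CR, LF) before comparing.
  const value = text.replace(/^[ \t\r\n]+|[ \t\r\n]+$/g, "");
  if (value === "true" || value === "1") return true;
  if (value === "false" || value === "0") return false;
  // XML Schema is case-sensitive, so "True" and "TRUE" are rejected, not guessed at.
  throw new TypeError('not a valid xsd:boolean: "' + text + '"');
}

// Going the other way, always emit the lower-case word form.
function formatXsdBoolean(flag) {
  return flag ? "true" : "false";
}
```

The design choice worth noting is the throw: a lenient parser that quietly maps "True" to false is exactly how the identity crisis spreads from one implementation to the next.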

I vaguely remember Sam Ruby mentioning that this was one cause of interoperability issues between SOAP implementations and it stands to reason: we've codified an identity crisis.

Structured Data Footnotes

Jon Udell is a careful man whose

blog works differently from the rest of the InfoWorld blogs. The content is well-formed XML, and it follows certain self-imposed rules.

This discipline enables him to work miracles and regularly come up with lots of cool applications. But in the wider world, getting people to author structured data is an often intractable problem. XML is often structurally invalid; as an example the Google Reader team have reported that 15% of feeds on the web are not well formed and that is before considering the feeds' semantic validity.

Thus I was tickled by this footnote he wrote when playing with microformats.

The difference? I ran the page through HTML Tidy to get well-formed XML ²...

² It wasn't entirely automatic, unfortunately, I had to wrestle with character encoding issues too.

Ladies and Gentlemen, I give you The Gruesome Twosome of Computer Science: Structured Data and Character Encoding. We call them footnotes.

Highlights, Lowlights

It started with an offhand comment, a report that one couldn't set the background color on text in the rich text editor in Mozilla Firefox. Every other command on the editing palette worked. You thought, "15 minutes tops, I'll take it".

So you look up the list of command identifiers in Internet Explorer and notice that the execCommand method has an attribute named BackColor which

Sets or retrieves the background color of the current selection.

You head back to the Midas spec for Mozilla and see that for the backcolor entry

This command will set the background color of the document.

Hmm... Your first thought had been that it was a case-sensitivity problem, "backcolor" rather than "BackColor", but the documentation notes that case doesn't matter. So the problem is rather what you highlighted, in Mozilla, backcolor applies to the document and not to the current selection. This means that you have to use a different identifier to achieve the same effect in Mozilla. You promptly notice one called hilitecolor

This command will set the hilite color of the selection or at the insertion point. It only works with useCSS enabled.

Okay, someone decided to not use the same command as Internet Explorer, fair enough, you've seen worse. As you change the code, you say, it's Mozilla and CSS is enabled by default so all I have to do is switch to using hilitecolor.

Of course that doesn't work. You then tell yourself that you'll just enable CSS with the useCSS attribute (you had looked more closely at the code and noticed that CSS styling had been deliberately turned off for some obscure reason).

Hmm, that didn't work...

So you go back to the spec and you notice a styleWithCSS attribute. Interesting... You try that instead of useCSS. Of course, you happen to be testing using a version of Firefox (1.0.4) that doesn't support that attribute so there's an exception. Presumably this attribute was introduced in Firefox 1.5 or something. Still it's a good thing that you are using an older version for testing otherwise this would have been another bug (feature doesn't work in Firefox 1.0 etc).

You bang around for a while and go back to the original Midas demo which seems to work correctly if you have the "use CSS" checkbox checked.

You view the source of the demo to figure out what could possibly be making it work and you notice that they are setting the useCSS property to false when the checkbox is checked. Huh? They set useCSS to false in order to enable CSS!

So you go back and look more closely at the documentation. First you notice that the useCSS property is deprecated. Hmm... the plot thickens. Then you read this

useCSS - value: true/false

Note: This command has been replaced with styleWithCSS. It takes the same values as styleWithCSS, but the meaning of true and false are inversed.

Up is down in other words. It's like Alice in Wonderland or something. Do note that there is no word on what version the replacement occurred. Nor indeed is there any footnote on why true and false are "inversed".

Anyway you've finally figured it out, you make the change to enable CSS styling only when applying the hilitecolor attribute, add some future-proofing to use the styleWithCSS attribute in case the deprecated useCSS attribute is removed in later browsers, submit the patches and 3 hours of your life have passed. It was frustrating but there's an object lesson somewhere. If you are building a robust cross-browser rich text editing application, you have some code that has to go through this rigmarole.
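For the record, the rigmarole looks roughly like this. It is a sketch rather than the patch that was submitted: the helper name is made up, it assumes CSS styling is switched off before the call (as it was in that codebase), and it assumes an older browser throws when handed the unknown styleWithCSS command, which is what Firefox 1.0 did in the testing above. Internet Explorer would take its own BackColor path and is not shown.

```javascript
// Hypothetical helper: highlight the current selection in a Midas-style editor.
// `doc` is the editable document; only its execCommand method is used.
function highlightSelection(doc, color) {
  let modern = true;
  try {
    doc.execCommand("styleWithCSS", false, true);  // newer name: true turns CSS styling on
  } catch (e) {
    modern = false;
    doc.execCommand("useCSS", false, false);       // deprecated name: false turns CSS styling on
  }
  try {
    return doc.execCommand("hilitecolor", false, color);
  } finally {
    // Switch CSS styling back off so every other command behaves as before.
    if (modern) {
      doc.execCommand("styleWithCSS", false, false);
    } else {
      doc.execCommand("useCSS", false, true);
    }
  }
}
```

Note the inverted booleans on the two branches: that single pair of lines is the three hours.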

I have a confession. I recounted the above tale because I feel guilty: I could have prevented all this four years ago.

Internet Explorer was the first browser to introduce rich text editing. It was not pretty since they didn't want it to be as good as Microsoft Word, but it worked reasonably. IBM later contributed resources to Mozilla to beef up its rich text editing. Indeed four years ago, I was asked to review the resulting Midas spec by my colleagues who had been working on that functionality. The spec looked serviceable, pointing mostly to the Microsoft documentation since they wisely chose to follow the de-facto standard. I reported a few bugs with the initial implementation and we built a rich text editor around it. It's widely used on the web these days.

I obviously was a bad reviewer because I certainly didn't notice that there was a change in the semantics of the backcolor attribute from "current selection" to "document" nor indeed that a new attribute, hilitecolor, had been introduced in Mozilla. There is absolutely no reason for these discrepancies. Nor do I want to go down the forensic trail that explains why you can't have a background color on text if you don't style with CSS. That's a rathole of its own.

So what do we have here? A de-facto standard was implemented but someone took the liberty to change one aspect of it for whatever reason - semantic purity or something. I suspect I won't be the last developer to waste an afternoon on this or indeed the last user to be cursing about why I can't add a background colour to a piece of text. I hope that Opera and Safari, if and when they have their rich text editing implementation in place, will do a wholesale copying of Internet Explorer's behaviour. I don't want to be chasing these lowlights again. I often see criticism of Microsoft's inconsistent approach to standards but everyone can be as guilty as them on occasion.

Which Side Are You On?

Stefan Tilkov recently asked what's wrong with Javascript? The answer of course is nothing really, it's a fine language as evidenced by looking around current thinking on the language. Indeed Brendan Eich's biggest admitted gotcha about the language is automatic semi-colon insertion. Thus there is no reason it can't be used in environments outside the browsers in which it is most widely deployed as glue - I've played with Rhino on the server-side without any problem.

From what I understand, Jotspot, which incidentally I consider a cunning plan to showcase the Dojo toolkit, uses Javascript as the scripting language on both the server and client. All power to them. Of course with Javascript from the same document executing in both environments, you can get very confused. The answer to one of Jotspot's most frequently asked questions, "Why isn't my Javascript function being called?" is that "your code is not running where you think it is" ergo, you're expecting that the current code you're looking at is server-side rather than client-side, or vice-versa. This is especially true since the object model is likely to be different, the browser DOM is a deliberately constrained environment whereas on the server side you'd want to allow your plugins to do more.

Thus there is a little impedance with using Javascript everywhere. Perhaps it's less confusing to use a different language and syntax for server side code - a tag library in jsp, or embedded java, php, asp or whatever. Or maybe clever syntax coloring in your editor or IDE would do the trick to remind you of the context. Needless to say, you have to decide what side you're on...

I've been working on a project in which the others on my team are seasoned PHP gurus and are occasionally petrified of Javascript - the reason of course being the continued brittleness of the browser platform. When I started work, this bias showed and their initial recommendation was to do as much as possible in PHP on the server-side. They recognize however that we are living in an age of interactivity so we need that shine that comes with moving intelligence to the client. Still when you start doing data-binding and automatic JSON serialization of PHP objects, you get constructs on the browser client that you wouldn't normally use if you were a client-side person. As someone who's very comfortable with both client and server side code (perhaps more comfortable with Java than PHP), I keep running into such peculiarities all the time. Of late I find myself prototyping code in client side Javascript even if it will eventually morph into server side code. Perhaps I need to get more into Python or that Ruby bandwagon. Still you tend to develop a split personality when you develop for the web.

The Null Hypothesis

I got the note: "you can't append a column if you click on a cell in the last column of a table in Internet Explorer".

Huh? I attempted to reproduce the bug and, sure enough, that was the case. Vaguely at the back of my mind I recalled from painful experience in K-station that there were special APIs in the HTML DOM for dealing with tables. Thus I searched the codebase for insertRow and insertCell. Hmmm those functions were nowhere to be found. How were they doing the column insertion, I wondered? My guess was that this was either some innerHTML tricks or simple standard DOM manipulation. Thus I had to dig through the code and eventually encountered the Node.insertBefore conundrum.

Now insertBefore is a method on the Node interface that is part of DOM level 1 specification. Every browser claims at least DOM level 1 support.

It turns out that in certain versions of Internet Explorer the second parameter to the insertBefore call can't be null, you get an Invalid Argument exception. Mozilla handles this condition as one would expect in their Javascript binding; Internet Explorer chokes. [Obscenity]. There was too much code to change to use the HTML-specific methods so I just hacked special case code that ensures that I don't pass a null to Internet Explorer - 3 hours of my life perhaps.
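The special case is tiny once you know it is needed. A sketch, with a made-up function name: the DOM specification says that inserting before a null reference node appends at the end of the child list, so appendChild is an equivalent call that never hands a null to the browser.

```javascript
// Hypothetical wrapper around Node.insertBefore.
// Per the DOM spec, insertBefore(node, null) appends node as the last child,
// so appendChild does the same job without passing the null along.
function insertBeforeSafe(parent, newNode, referenceNode) {
  if (referenceNode === null || referenceNode === undefined) {
    return parent.appendChild(newNode);
  }
  return parent.insertBefore(newNode, referenceNode);
}
```

Appending a column then becomes a matter of calling insertBeforeSafe(row, cell, clickedCell.nextSibling) for each row: in the last column nextSibling is null, and the wrapper quietly appends.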

Now this doesn't amaze me really, thinking back on it, this is probably the reason that somewhere in the bowels of every Javascript library that deals with dynamically adding a new option to a select control, you'll find code like the following:

function appendOptionElement(select, newoption) {
    if (is_ie) // test for internet explorer somehow
        select.add(newoption);
    else
        select.add(newoption, null);
}

The browser is a fragile place and hopefully the frameworks that are being developed will shield you somewhat from such issues, but it is telling that you can't rely on core DOM functionality. Even if these quirks are fixed in Internet Explorer 7, this patched-up code will have to hang around for another 5 years before users will no longer use older versions of the browser. We're in a world of pain in the browser world.

Nulls however are problematic throughout computer science. The arguments around them may sound like angel and pinhead discussions but they are fair questions. What indeed is null? What is zero for that matter? For centuries and throughout the Dark Ages of the West, there was no concept of zero, it took Arabic Algebra to spread that notion. Why should one expect that programmers would have internalized the null concept? Reasonable people can and do differ on how to treat null.

Just now, reading through the latest Dr Dobbs journal, I noticed the following in an article about Consuming .NET Web Services in Oracle JDeveloper

The ATL Server SOAP handler generates an xsi:nil="1" attribute when the element's value is null, or when the array is null or zero-size. Unfortunately, Apache SOAP fails to deserialize UDTs whose fields contain the xsi:nil="1" attribute and expects zero-size arrays to be represented as XSD arrays with the dimension parameter set to zero.

The author then proceeds to outline a variety of workarounds. Now I happen to not be a fan of the SOAP style of programming (in the past I've called it Crusty Old Architecture pronounced SOA with a silent P) but as a developer, I feel the pain. The article is a catalog of kludges and likely mapping errors that have to be worked around - the word "unfortunately" is used entirely too often.

I started my career doing graphical programming thus I perked up when Raymond Chen recently outlined the consequences of invalidating the null window and his anecdote is worth quoting at length.

If however you end up passing NULL as the window handle to the InvalidateRect function, this is treated as a special case for compatibility with early versions of Windows: It invalidates all the windows on the desktop and repaints them.

Even more strangely, passing NULL as the first parameter to ValidateRect has the same behavior of invalidating all the windows. (Yes, it's the "Validate" function, yet it invalidates.) This wacko behavior exists for the same compatibility reason. Yet another example of how programs rely on bugs or undocumented behavior, in this case, the peculiar way a NULL parameter was treated by very early versions of Windows due to lax parameter validation. Changing nearly anything in the window manager raises a strong probability that there will be many programs that were relying on the old behavior, perhaps entirely by accident, and breaking those programs means an angry phone call from a major corporation because their factory control software stopped working.

The null hypothesis strikes again.

Camel Humps

Even when you get past these details, you get into matters of syntax - a longstanding pet topic of mine. Consider the separators that people use for readability. Some people like to use Hungarian notation, others prefer hyphens... well, I've already written a hyphenated parable so I'll skip that aspect. Anyway, assume for some insane reason that it makes sense for your spec to have an attribute named "windowTop". There'll undoubtedly be people who will write it "window-top", "window.top" or with some other variant of case, such as "WindowTop". If you're dealing with XML, where case matters, things will fail. You say potato, I say pubDate anyone?
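A small Python sketch of how the "windowTop" lookup fails, and of the sort of lenient matching a consumer ends up writing. The pane element and the helpers normalize and get_lenient are invented for this example; they are an assumption about one way to cope, not anything a spec prescribes.

```python
import re
import xml.etree.ElementTree as ET

# Someone wrote the attribute with a different case than the spec expects.
elem = ET.fromstring('<pane WindowTop="10" />')

# XML names are case-sensitive, so the exact lookup finds nothing.
strict = elem.get("windowTop")


def normalize(name):
    """Drop hyphen, dot and underscore separators, then fold case."""
    return re.sub(r"[-._]", "", name).lower()


def get_lenient(element, name):
    """Hypothetical helper: match attribute names after normalizing them."""
    wanted = normalize(name)
    for key, value in element.attrib.items():
        if normalize(key) == wanted:
            return value
    return None


lenient = get_lenient(elem, "windowTop")
print(strict, lenient)  # None 10
```

The lenient version rescues "window-top" and "window.top" too, at the price of yet another special case that every reader of the format has to agree on.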

I've been playing with a product that is essentially a wiki and it turns out that, by convention, camel case (or should I write it as CamelCase?) is significant in the wiki world as denoting a "WikiWord". I even had to spend half an hour writing glossary entries for these concepts since, in our system, user names and page names had to be in that format.

You can imagine, however, the kinds of issues that arise when you have an HTML editor that allows you to enter HTML and JavaScript along with PHP code and a specialized wiki syntax - since, for some reason, it was decided not to use angle brackets in this product. The most complicated piece of code is going to be the parser. As currently implemented, the parser is a mass (or should I say, a morass) of regular expressions and all kinds of things I'll never understand. There's even special case code in there to handle nested comments in JavaScript. What was never foreseen, however, was that you'd have to deal with user-entered script code. What would happen if camel case is used for variables inside of said script? Well, it wasn't pretty when the wiki engine jumped in and treated script as wiki words. Let's just say that someone had to come up with a solution.
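The collision is easy to reproduce in a few lines of Python. This is only a sketch under stated assumptions: the [[...]] link markup, the WikiWord pattern and the use of script tags as delimiters are all mine, and the real parser's syntax and eventual fix are not described here.

```python
import re

# Assumed WikiWord rule: two or more capitalized runs, e.g. HomePage.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")

# Assumed delimiter for user-entered script; the capturing group makes
# re.split keep the script blocks in its result.
SCRIPT_BLOCK = re.compile(r"(<script\b.*?</script>)", re.DOTALL | re.IGNORECASE)


def link_wiki_words(text):
    """Naive pass: wrap every WikiWord in (made-up) link markup."""
    return WIKI_WORD.sub(lambda m: f"[[{m.group(0)}]]", text)


def render(page):
    """Link WikiWords everywhere except inside script blocks."""
    parts = SCRIPT_BLOCK.split(page)
    # With one capturing group, script blocks land at the odd indexes.
    return "".join(
        part if i % 2 else link_wiki_words(part)
        for i, part in enumerate(parts)
    )


page = "See HomePage. <script>var PageCount = 3;</script> Ask JohnSmith."
naive = link_wiki_words(page)  # also rewrites the variable inside the script
safe = render(page)            # leaves the script untouched
print(naive)
print(safe)
```

Note that the fix is itself one more regular expression and one more branch, which is how a parser gets to be a morass.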

There's this concept known as McCabe Cyclomatic Complexity, which is basically an application of graph theory to software and is often used to pinpoint potential problems. The basic insight is that if there are too many decisions or paths through your code, it will be buggier and harder to maintain and test. A good guideline is that anything with complexity greater than 15 in this scheme is ripe for refactoring. Luckily I have access to some tools that can generate reports and provide some numbers to validate that nagging sense that a piece of code is getting harder to understand. What worries me is that I keep running into code that laughs at such guidelines - after the latest change, the parser code just hit 41. To give some context, values between 21 and 50 indicate "a complex, high risk program" and, with the 50 barrier looming, we are verging towards that notable status of the "untestable program - very high risk". I'm sure it didn't start out this way and that the steady accretions have been solutions to real problems. Still, technical arteriosclerosis continues its inexorable spread...
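For the curious, the number comes from the control-flow graph of the code: McCabe's measure is M = E - N + 2P, where E is the number of edges, N the number of nodes and P the number of connected components. A minimal Python sketch follows; the function names are mine, the band labels simply restate the thresholds quoted above, and real tools derive the graph from source code rather than taking counts by hand.

```python
def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's M = E - N + 2P for a control-flow graph."""
    return edges - nodes + 2 * components


def risk_band(m):
    """Map a complexity value onto the thresholds mentioned in the text."""
    if m > 50:
        return "untestable program - very high risk"
    if m >= 21:
        return "a complex, high risk program"
    if m > 15:
        return "ripe for refactoring"
    return "within the guideline"


# A lone if/else: condition, then-branch, else-branch and exit make
# 4 nodes joined by 4 edges, so there are two paths through the code.
print(cyclomatic_complexity(edges=4, nodes=4))  # 2

# The parser after the latest change.
print(risk_band(41))  # a complex, high risk program
```

Each extra if, loop or case adds one to the total, which is why a parser that grows by special cases climbs so steadily.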

Ruby on Rails and the Zend framework for PHP are founded on favouring convention. We had this code that automatically created the schema for what amounted to a database table. We never actually told users that this was what was going on. Of course once a wider audience started to play with the product, we had to fix the bugs that arose to enforce the convention so that databases would be named as the code expected. A classic case of leaky abstractions I suppose. You can guess the complexity the additional error checking added.

Moving up a level, you get politics - the most aberrant example of which has been the feed wars of the past few years, perhaps best, and hilariously, summarized by Shelley Powers. Truly, Jesus wept.

Bill de Hóra in an aside wisely noted that

Programmers would rather squabble about minutiae - it seems this transcends community or language choice.

But even when you move beyond whiplash and abrasive personalities, who would have thought that the Atom working group would have spent so much time on the concept of dates? When I saw dissertation-length essays on the various types of dates that might be used in publishing systems, I was content to continue lurking in that community. The thing, however, is that these details do matter and they are best dealt with up front. The aggregate waste of programmer effort in pursuit of minutiae might keep the profession in business but it surely isn't sustainable.

The East Australian Singularity

I'll conclude by noting that acts of God (or in this case, his flawed proxies: politicians) can come into the picture. Thus I read the following this week: Eastern Australia: Java applications impacted by the change to daylight savings time dates

The running of the Commonwealth Games in Australia in March and April 2006 has resulted in an extension to daylight savings time in the eastern states of Australia, which includes New South Wales, Victoria, South Australia, Australian Capital Territory and Tasmania.

Rather than the clocks being put back an hour at 03:00 on Sunday 26 March 2006, they will now be adjusted at 03:00 on Sunday 2 April 2006. The IBM Software Development Kit (SDK) or Java Runtime Environment (JRE) is not currently aware of the one-time change to the daylight savings time extension. An interim fix is required for your application to have the same time as your operating system.

This change applies to all of the Java environments, irrespective of their setup, whether they are set up to use the operating system time zone information, or the user.timezone custom property that can be set for the SDK or JRE.

I assume television schedules were behind this change to the time/space continuum since the Commonwealth Games have just begun. Or perhaps the politicians were concerned about athletes missing their events or something. Still, imagine if you're the Java programmer who has suffered through the badly defined date APIs in Java 1.0 a decade ago, adapted to the improvements in succeeding versions and finally, finally gotten a robust application deployed. Then your manager walks in and tells you about the East Australian Singularity. You're going to have to rework everything since bank transactions might be messed up and there are likely to be blinking clocks etc. We live in a global village so you have to worry about what happens if some mission-critical application that you rely on is being run in Eastern Australia. You can't even have the special case restricted to that country; it's only a sub-region that is affected. At such times I'd prefer being the poor Eastern Australian sheep or lamb. Their lot is much easier.
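To see why a stale rule hurts, here is a self-contained Python sketch that hard-codes the two changeover dates from the bulletin instead of consulting any real time zone database. The offsets of UTC+10 and UTC+11 are my assumption for somewhere like Sydney or Melbourne, and the helper local_wall_clock is made up for the example.

```python
from datetime import datetime, timedelta, timezone

AEST = timedelta(hours=10)  # assumed standard-time offset
AEDT = timedelta(hours=11)  # assumed daylight-time offset

# 03:00 daylight time on the changeover day is 16:00 UTC the day before.
STALE_DST_END = datetime(2006, 3, 25, 16, 0, tzinfo=timezone.utc)   # 26 March
REVISED_DST_END = datetime(2006, 4, 1, 16, 0, tzinfo=timezone.utc)  # 2 April


def local_wall_clock(instant_utc, dst_end_utc):
    """Hypothetical helper: local time under a given end-of-DST rule."""
    offset = AEDT if instant_utc < dst_end_utc else AEST
    return (instant_utc + offset).replace(tzinfo=None)


# An instant that falls inside the one-week extension.
instant = datetime(2006, 3, 28, 0, 0, tzinfo=timezone.utc)

stale = local_wall_clock(instant, STALE_DST_END)      # what the old rule says
revised = local_wall_clock(instant, REVISED_DST_END)  # what the wall clock says
print(stale.time(), revised.time())  # 10:00:00 11:00:00
```

For that one week, every timestamp computed from the old rule is an hour behind the clock on the wall, which is all it takes to mis-stamp a bank transaction.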

What a way to make a living: if it isn't one thing, it's the other. And then there's version hell. Lord help me.

File under: technology, software, abstraction, details, minutiae, glitches, programming, standards, language, absurd, toli
