
Archive for August, 2008

The End of Design By Committee

The W3C always had great intentions. The goal has always been to create great standards that cover all possible situations and respect all special needs. In the early days, it produced great standards that are still alive today, like HTML and CSS. Of course, there were problems. It took years for standards to be supported correctly because of ambiguities. Facing those problems, the W3C decided not to make standards official until they were fully supported by enough implementations. XHTML 2 and CSS 3 never saw the light of day.

Standards like RDF have not become what they were meant to be, nearly 10 years after they were recommended. SOAP and WSDL are huge buzzwords in the SOA world, but they never quite work as well as they should. Implementations are still incompatible, and only subsets of the specs can be used if communication is to be handled properly. Not to mention there are still no traces of XLink, XPointer or XForms anywhere in the ecosystem. All these specifications appeared in the late 1990s or early 2000s. Who were they made for?

The big problem with all of them is that they are so abstract that no one outside the committee that designed them can understand their purpose, let alone consider supporting them. The specifications are too large. Too complicated. Building around XML probably wasn’t the best idea ever. It really is unlike anything else and painful to work with, unless you use even more XML technologies. It does not map well to common programming paradigms. It was only ever similar to HTML and SGML. Maybe those should have been taken as exceptions rather than the rule.

I consider XPath the best specification built around XML, but only because it removes the burden of managing XML and is not itself XML-based. CSS is great for formatting HTML and, again, is not XML-based. XSLT is not too bad because it plays nicely with HTML, but I find some other techniques, like Zope’s TAL, to be a lot more elegant. TAL extends XML without adding to the tag soup.

By ignoring all the details, APIs like SimpleXML allow you to read XML seamlessly, but writing it is a completely different task. XML works everywhere, but it’s always alien to the environment.

Recently, I have noticed that the Web has started to regain its original nature. Standards are emerging rather than being cultivated. The days when companies assigned employees to a consortium in order to write a specification are over. The W3C is still working on its specifications, trying to get them out the door, but nothing new has been started in a long while. During that period, we got to see great standards establish themselves, not because they were supported by the industry, but because they were good.

Think about JSON and YAML. JSON is a subset of JavaScript that is well specified, easy to understand, and easy for a machine to both read and write. YAML is a human-readable format that is formal enough to be parsed by a machine. What do they have in common? They map to programming concepts. Nearly every scripting language can load them into its native data structures in a single function call and write them back out just as easily.
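To make the "single function call" point concrete, here is a minimal sketch using Python's standard `json` module. The record and its field names are made up for illustration; YAML would look much the same but needs a third-party library, so it is left out.

```python
import json

# Hypothetical record; the field names are invented for this example.
text = '{"title": "The End of Design By Committee", "tags": ["standards", "json"], "year": 2008}'

# One call turns JSON text into plain dicts, lists, strings and numbers.
post = json.loads(text)

# From here on it is ordinary data, not a document tree to navigate.
post["tags"].append("yaml")

# One call turns it back into JSON text.
rewritten = json.dumps(post, sort_keys=True)
print(rewritten)
```

There is no schema, no namespace and no node API between the text and the data structure, which is most of the appeal.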

In the end, the problem can be distilled down to a single preference. You can either write a complex specification that covers all possible cases, spend months implementing it and making sure it’s compatible, then spend 5 minutes configuring it to perform all the magic. Or you can have a simple data exchange format, hook it to a scripting language and spend anywhere from 15 minutes to a few hours to do what you need. On a large scale, complex standards are worth it, but in most cases, they are a waste of time.

One of the great aspects of web development is that there are so many problems that thousands of people are thinking about them. Over the years, a great ecosystem of tools and techniques was built. These days, all you need to do is piece together existing components. HTTP is a good environment to make requests and get responses. Data serialization is available. All you need to do is decide how to use them. Recently, Identi.ca/Laconi.ca wrote a small specification for open microblogging. oEmbed lets a site export the location of its images and videos. When you look at those, the first thing that comes to your mind is: Why hasn’t anyone thought about it before?

It doesn’t have to be great. It doesn’t have to be so smart. We only need to agree on something, or do something and get others to follow. There is nothing religious about saying which field name you will use to contain the location or the size of an image. There is no need for namespaces and extensibility. It does not even deserve debates or discussions. Just decision making. It’s a simple problem and it deserves a simple solution.
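As a rough sketch of how little is involved once the field names are agreed upon, here is a consumer for an oEmbed-style photo description. The field names (`type`, `url`, `width`, `height`) follow my reading of the oEmbed photo response; the sample payload is written by hand and `describe_photo` is a hypothetical helper, not part of any library.

```python
import json

# Hand-written sample for illustration; not fetched from any real service.
SAMPLE_RESPONSE = """
{
  "version": "1.0",
  "type": "photo",
  "url": "http://example.com/images/cat.jpg",
  "width": 640,
  "height": 480
}
"""

def describe_photo(payload):
    """Return (url, width, height) from a photo description, or raise ValueError."""
    data = json.loads(payload)
    if data.get("type") != "photo":
        raise ValueError("not a photo response")
    # The whole "contract" is three agreed-upon field names.
    missing = [name for name in ("url", "width", "height") if name not in data]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return data["url"], int(data["width"]), int(data["height"])

url, width, height = describe_photo(SAMPLE_RESPONSE)
print(url, width, height)
```

Everything a consumer needs to know fits in the helper's docstring, which is the point: the hard part was deciding, not implementing.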

The specifications fit on a few sheets of paper. They can be read and understood by anyone who cares, without significant effort. Simple use cases can be illustrated. People get it.

There are so many ways in which different websites can’t talk to each other, which makes it painful to develop applications and forces people to re-implement the same things over and over again. In the new Web-SaaS-driven world, it’s a shame. Especially since the underlying protocol does not prevent anything. It’s just that no one took the time to write down the problem and write down the simplest solution that could work.

Sure, you could go out and write something generic that solves everything (it would probably end up looking like RDF). In the end, unless you know what you’re searching for, there is no way you will find it. Abstract tokens don’t help anyone.

I’m currently writing my own spec (more information soon). What are your problems in integrating with other applications?

Note to self: This post contains too many acronyms and references. I should look these up and link them in the future.

Categories: General, Programming