So the system went live, and the requests are pouring in like crazy, and the phones are off the hook with angry customers on the other line. “Where’s the product I just ordered?”
“Let me get back to you.”
Turns out it was a big mess. Customers were sending orders in broken messages we couldn’t understand; we couldn’t even tell where to send our errors. We had orders with no item identifiers, orders with items inside the shipping address, and orders with ZIP codes instead of phone numbers. What a mess. One order looked like it was CSV, another had its content encoded as a JPG.
Joe said it would only take a couple of weeks, but we could get the server to check all incoming orders and send error responses. Management wasn’t happy. Then Bob chimed in and said it would only take a day to create a schema and run a validator on all the inputs. Guess what we did next?
No more angry customers. In fact, customers are delighted, judging by the sudden increase in order volume. Then we get a call from the CEO. How come sales are up, but we’ve just lost $10m? We check with accounting. Everything matches up. Check with the bank. No fraudulent transactions. Run a report on the sales system, get back the same data. All’s well, but something is amiss.
So we pick a few customers at random, and run through their orders from input to output, and we find it. It turns out some customers managed to game the system. They sent us orders that looked like this:
<order> <item>MacBook Pro 17"</item> <total currency="USD">65</total> </order>
Oops. We listened to Bob. We checked the markup, but we forgot to check the data.
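To make that concrete, here is a rough sketch of the kind of check we skipped. It is not what we ran in production. The `CATALOG` price list, the price in it, and the `check_order` helper are all made up for illustration, and it assumes orders shaped like the one above. The order parses fine as markup. It is the data that gives it away.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

# Hypothetical price list (the price is invented for this example);
# a real system would look prices up in its own catalog.
CATALOG = {'MacBook Pro 17"': Decimal("2499.00")}


def check_order(xml_text):
    """Return a list of problems found in the order's data (empty if none)."""
    order = ET.fromstring(xml_text)  # raises ParseError on broken markup
    item = order.findtext("item")
    total = order.find("total")

    if item not in CATALOG:
        return [f"unknown item: {item!r}"]
    if total is None or total.get("currency") != "USD":
        return ["missing total or unsupported currency"]

    try:
        amount = Decimal(total.text or "")
    except InvalidOperation:
        return [f"total is not a number: {total.text!r}"]

    if amount != CATALOG[item]:
        return [f"total {amount} does not match catalog price {CATALOG[item]}"]
    return []


ORDER = '<order> <item>MacBook Pro 17"</item> <total currency="USD">65</total> </order>'
print(check_order(ORDER))
```

A schema can say that `total` must be a decimal with a currency attribute. It takes code like this, sitting next to the real price data, to say the total is wrong.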
Does the world need another markup definition language?
(There’s a lot more I can say about XSD pretend-extensibility, the namespace illusion of versions, or what happens when all you do is exchange semanticless WSDLs and pray for interoperability, but I feel one point is enough.)
I think the root of the problem is that we’re trying to find an abstraction from the implementation, only to realize — the horror — that no abstraction is as good as the implementation it attempts to replace.
We can’t all agree on the one development language to use. Some camp with Java, some swear by Ruby, some know their Python and some bought into C#. So clearly what we need is another language-neutral language. Like XSD. Yes. No, maybe RELAX NG. Sometimes. But really, Schematron.
At some point, that need to not deal with languages ends up creating more half-baked languages than we already need. We’ve replaced inevitable complexity with unnecessary complexity.