Fault tolerance on the Web and the dark side of Postel's Law

I’ve been reading @adactio’s new book. Pretty much all I have read is great, and I highly recommend reading it.

But in the grand spirit of pedantic blogging, I take issue with this:

“Imperative languages tend to be harder to learn than declarative languages”

That “tend to” is doing a lot of work.

I think Jeremy is conflating declarative with fault tolerant here. HTML and CSS are declarative and fault tolerant. SQL is declarative and decidedly fault intolerant (and quite hard to master to boot). PHP’s type system is a lot more permissive and tolerant (“weak”, one might say) than a language like Python. The former is great for newbies and non-programmers, because adding 1 + "1" (that is, adding a string and an integer together) will give you 2, or at least something that when printed to screen looks vaguely like a two, though under the covers it may be Cthulhu.1 And the behaviour of something like Python is great for jaded old gits like me who don’t want the system to magically convert strings into integers but to blow up as quickly as possible so that it can be fixed before all this nastiness gets stitched together with the rest of the code and causes some real major bugs. The same principle applies on a grander scale with stricter type systems like Java or the broad ML family (including things like Scala, F#, Haskell etc.).
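
To make that contrast concrete, here is a minimal Python sketch of the strict side of the fence (the PHP behaviour is only described in the comments):

    # In PHP, 1 + "1" quietly coerces the string and gives you int(2).
    # Python refuses to guess and blows up immediately instead.
    try:
        result = 1 + "1"
    except TypeError as err:
        print(err)  # unsupported operand type(s) for +: 'int' and 'str'

    # If you really do want arithmetic, you have to say so explicitly:
    print(1 + int("1"))  # 2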

“A misspelt tag in HTML or a missing curly brace in CSS can also cause headaches, but imperative programs must be well‐formed or they won’t run at all.”

Again, it depends on the language. PHP is pretty permissive. If you make a very minor mistake, it will often just carry on and keep going. If you are feeling particularly careless about your errors, you can prefix your expression with an @ sign (PHP’s error-suppression operator) and then security people get to laugh at you. I hesitate to say this was “designed”, but it was at the very least chosen as a behaviour in PHP.

This may all seem rather pedantic, but it is actually quite important. HTML and CSS are declarative and permissively fault-tolerant. That’s a key strength. In an environment like the web, it creates a certain robustness. If your JavaScript fails to load entirely, you can get some very strange behaviour when, say, a function that is expected to be there isn’t. (I frequently use a site that makes multiple Ajax calls; if one fails, say due to a bad connection, the site is unusable and must be reloaded from scratch. The same site is also wrapped in an iOS app, which must be manually restarted.) But if some of your CSS doesn’t load, that’s not the end of the world: you still get something you can read. If some of your HTML has weird new stuff in it, as Jeremy points out elsewhere in the book, that’s still backwards compatible: the browser simply ignores the unknown element and renders its content normally.

This error-handling model, this permissiveness of web technologies, isn’t a side effect of being declarative. It’s a deliberate design choice made by the people who created those technologies for the web. There is a cost to this: it has been incredibly hard for us to secure the web. Permissive error handling can enable, and has enabled, whole classes of security vulnerabilities.

If Postel’s Law gives you the ability to use border-radius in your CSS or aside in your HTML, with some terrible old version of Internet Explorer happily ignoring them rather than vomiting XML warnings all across your screen, then Postel’s Law also comes with the cost of having to worry about downgrade attacks. We collectively left SSLv2 running long after it should have been dead and we got DROWN. We did the same with SSLv3 and we got POODLE. These are examples of ungraceful degradation, and the sad cost is your server being vulnerable to being pwned.2

With the last few attacks on SSL/TLS, it wasn’t just nasty old versions of Internet Explorer on Windows XP getting shut out of the web; it was also non-browser apps that talked to HTTPS-based APIs. The Logjam attack meant that a lot of people upgraded their servers to stop serving DH keypairs below 1024 bits. For most current-day browsers, this was not an issue: Apple, Mozilla, Google, Microsoft and others released patches for their current browsers. Java 6 didn’t get a patch for a very long time. If you had a Java desktop app that consumed an HTTPS-based RESTful API whose server had patched Logjam, that app broke with no graceful degradation, and the general solution was to upgrade to a new version of Java. On OS X, Java used to be distributed by Apple, albeit as something of a reluctant afterthought, so upgrading every desktop Java user on OS X was a bit of a faff. (My particular Java Logjam issue was with… JOSM, the Java OpenStreetMap editor.)

Postel’s Law giveth and it taketh away. One could give this kind of rather pedantic counterexample to Jeremy’s account of Postel’s Law and then conclude hastily “given how bad the web is at solving these security problems, the grass on the native apps side of the fence might just be a little bit greener and yummier”. Oh dear. Just you wait. When every app on a platform is basically reimplementing a site-specific client, you might actually get more fragility. Consider our recent vulnerabilities with SSL/TLS. After something like Logjam, the bat signal went out: fix your servers, fix your clients. On the server side, it generally meant changing a config file for Apache or Nginx in a fairly trivial way and then restarting the web server process. On the client side, it meant downloading a patch for Chrome or Firefox or Safari or whatever. That may have just been rolled into the general system update (Safari) or rolled into the release cycle (Chrome) without the end user even being aware of it. The long tail3 of slightly oddball stuff like Java desktop apps, which tends to affect enterprise users, assorted weirdos like me, and niche use cases, took a bit longer to fix.
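
By way of illustration, the “fairly trivial” server-side change usually looked something like this in nginx. This is a rough sketch only; the file path and cipher list are illustrative, not a recommendation:

    # Generate a fresh 2048-bit DH group first:
    #   openssl dhparam -out /etc/nginx/dhparams.pem 2048
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;               # no SSLv2 (DROWN) or SSLv3 (POODLE)
    ssl_dhparam /etc/nginx/dhparams.pem;               # your own DH group, not a weak shared one
    ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!DES:!RC4;  # no export-grade ciphers (Logjam)
    ssl_prefer_server_ciphers on;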

If every (client-server) app4 that could be in a browser actually were in a browser, then in the case of a security fail, fixing all those apps would be as simple as fixing the browser (and the server, but that’s a separate issue). If everything were a native app, you have to hope they are all using the system-level implementations of things like HTTP, TLS and JSON parsing, otherwise you have a hell of a job keeping them secure after vulnerabilities. We already see things going on in native-app-land (Napland?) that would cause a browser to throw a big stonking error: user data being sent in cleartext rather than over TLS is more common than I care to think about. But the native app won’t scream and shout and say “this is a bad, very wrong, no good idea, stop it unless you really really really want to”, because the UI was developed by the person who created the security issue to start with.

The downside to Postel’s Law5 is that sometimes the graceful degradation is pretty damn graceless. Sadly, the least graceful degradation is often security-related. The web might still be better at mitigating those failures than all but the most attentive native app developers, or it might not. Browser manufacturers may be better at enforcing security policies retroactively than app stores, or they might not. We shall have to wait and see.

The Web is a hot mess.

But we still love it.

  1. Sausage typing: if it looks like a sausage, and tastes like a sausage, try not to think about what it really is.

  2. At risk of giving technical backing to rising reactionary movements, perhaps the modern day variation of Postel’s Law might be: “Be conservative in what you send, and be liberal in what you accept, to the degree that you can avoid catastrophic security failures.”

  3. Wow, it’s been a long time since we talked about them…

  4. Let’s leave aside the semantics of what is or isn’t an app. Incidentally, I was in a pub the other day and saw a few minutes of a football game. There was a big advert during the game from a major UK retail brand that said “DOWNLOAD THE [brand name] APP”. Why? I don’t know. We are back at the ghastly “VISIT OUR WEBSITE, WE WON’T TELL YOU WHY” stage with native mobile apps. I’m waiting for a soft drink brand to exhort me to download their app for no reason on the side of a bottle.

  5. I claim no originality in this observation. See here and here. Interestingly, if you take a look at RFC 760, where Postel’s Law was originally described, it has a rather less soundbitey remark just before it:

    The implementation of a protocol must be robust. Each implementation must expect to interoperate with others created by different individuals. While the goal of this specification is to be explicit about the protocol there is the possibility of differing interpretations. In general, an implementation should be conservative in its sending behavior, and liberal in its receiving behavior. That is, it should be careful to send well-formed datagrams, but should accept any datagram that it can interpret (e.g., not object to technical errors where the meaning is still clear).

    The first two sentences are key…


DNS issues reveal inherent fragility and redundancy of DOIs

Those of you unfamiliar with bibliographic standards may be unaware of digital object identifiers (DOIs). DOIs are used by academic publishers to provide a unique global identifier for academic papers published in journals. A DOI is printed in the following form:

doi:10.1000/182

As I said, the point of a digital object identifier is to provide an identifier for a particular resource. This makes DOIs a duplicate of a rather more popular and widespread system for identifying resources, namely uniform resource locators (URLs), the standard used on the Internet for web addresses.

Why not just use URLs then? Because librarians need DOIs. They need DOIs because you can’t trust the dastardly Internet to keep resources around. Links rot, stuff starts 404ing, people forget to renew domains. If you use a URL, the URL might break. So rather than having academic journal publishers assign URLs and then potentially doing something stupid, the best thing to do is to have a trusted party run their own URL system. The trusted third party in this case is an organisation called the Corporation for National Research Initiatives, and they run the technical infrastructure on behalf of the International DOI Foundation. Note the use of the words “international”, “foundation” and “corporation”—obviously these guys are serious and know what they are doing.

Of course, because DOIs aren’t actually that useful on their own, you need a resolver. You can take a DOI and remove the “doi:” prefix and then append the rest of it to the address http://dx.doi.org/ and you then have… a URL! But then you have the problem that URLs have. Links rot. Servers break. People forget to renew domains. Or, as happened today, the CNRI made a mess of the DOI resolver’s DNS records because they manually renewed the DOI domain name at the last minute. This caused intermittent resolution issues.
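
The mechanics of that resolution step are about as trivial as it sounds; a rough Python sketch:

    def doi_to_url(doi: str) -> str:
        """Turn 'doi:10.1000/182' into a resolver URL."""
        return "http://dx.doi.org/" + doi.removeprefix("doi:")

    print(doi_to_url("doi:10.1000/182"))  # http://dx.doi.org/10.1000/182
    # Fetching that URL just gets you an HTTP redirect to wherever the
    # publisher's copy lives, assuming the resolver's DNS is working that day.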

I mean, sure, if one journal publisher were to accidentally mess up their website for a day, that’d be annoying. But when you put the very act of resolving your identifier in the hands of one organisation, the potential for catastrophe is pretty amazing. Especially given the sheer amount of academic infrastructure that is sitting on top of the DOI infrastructure: citation generators, bibliometric analysis software, personal paper library storage, pre-print metadata stores, institutional subscription-wall software. Whoops.

I’m still not convinced there is any point to DOIs. There is an international, widely adopted standard for identifying digital objects: URLs. URLs aren’t perfect, but you aren’t reliant on librarians to remember to renew their domain name for the whole system to continue running.

You want metadata retrieval? Easy. HTTP gives you that: content negotiation. Hell, the two formats that anyone is going to give a damn about for academic papers (HTML and PDF) can contain metadata—HTML in the form of microformats or RDFa (or, Xenu forbid, microdata), PDF in the form of the Info Dictionary and XMP.
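
As a sketch of what that looks like from the client side, here is some Python asking a hypothetical paper URL for metadata instead of the article itself. The address and Accept value are illustrative, not any real publisher’s API:

    import urllib.request

    # Hypothetical paper URL: the same address can serve the human-readable
    # article or machine-readable citation metadata, chosen by the Accept header.
    PAPER_URL = "https://journal.example.org/articles/182"

    def fetch(url: str, accept: str) -> bytes:
        req = urllib.request.Request(url, headers={"Accept": accept})
        with urllib.request.urlopen(req) as resp:
            return resp.read()

    article = fetch(PAPER_URL, "text/html")
    metadata = fetch(PAPER_URL, "application/vnd.citationstyles.csl+json")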

Today’s silliness with DOIs shows me the inherent fragility of any “let’s just give a number to everything” system. The Internet exists, and these identifier schemes all duplicate what URLs already do, but with more indirection and a constant risk of the lookup server going wrong. If your solution to “things on the internet go away” is to introduce a resolution service that can also go away, you are duplicating entities beyond necessity.