Discussing software, the web, politics, sexuality and the unending supply of human stupidity.

Request for comment: a 'Good API' checklist and committee

The message of open data and linked data is being accepted slowly by businesses and governments.

But it’s taking too long, and the results are often mediocre. As a community of programmers who build software on top of web APIs, we should try to guide the process as best we can.

One way to do this would be simply to have a group of people produce a checklist of what makes a good API. The checklist would be simple and well-documented, and could probably fit on one side of A4.

This group of people would then have a mailing list where they can decide which APIs meet the checklist. APIs that pass would be given a simple certification like “Good API”.

My initial draft of a checklist would consist of things like:

  • Follows relevant web standards as closely as possible. Specifically, HTTP 1.0/1.1, XML 1.0, HTTPS, the use of appropriate MIME media types, HTML/XHTML, the RDF standards, ECMA 262 (for front-end JavaScript-based APIs).
  • I can use it from curl.
  • Doesn’t require an API key. For read-only APIs, the use of an API key is unnecessary. You don’t require people to have a special key to read your website or subscribe to your RSS feed, so there’s no need to track API keys. Require people instead to use the User-Agent header properly. (Perhaps we could call this “ZeroAuth”. Heh.)
  • All data that has been downloaded is syntactically valid: for JSON, it can be loaded without error into a standards-compliant JSON parser; for XML, it is well-formed and, if it specifies a DTD/XSD/RNG schema, it complies with the schema; for RDF, it loads into Jena properly; for Atom, it validates against a machine-readable Atom spec. You get the drift.
  • Resources are discoverable and usable in a RESTful manner.
  • Identifiers are not needlessly cryptic and are preferably URIs.
  • Documentation allows someone who is not familiar with the API to use it with relative simplicity.
  • Client libraries are open source with a free software compatible license.
  • Client libraries follow the common practices of other open source projects in that language (for instance, in Ruby, it is packaged as a RubyGem; in Python, it can be installed with easy_install; in Java, it is available as a JAR and a Maven/Ivy dependency; in C#, the library works on both the Microsoft Windows implementation of the .NET framework and on the current stable release of Mono).
  • Client libraries do not have excessive dependencies.
  • Use-rate limitations of the API are clearly explained, as are all copyright and legal reuse matters.
  • Does not needlessly reinvent the wheel.
  • A commitment is made about the amount of notice given if the API is going to be shut down. This commitment would include the ability to sign up to an announcement-only mailing list and/or blog with RSS/Atom feed, which would only be used for major changes to the API.
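
To make the "syntactically valid" item a bit more concrete, here is a minimal Python sketch of the kind of check a reviewer might run on a downloaded response. It is only an illustration: the function name is_syntactically_valid is made up for this example, it covers just JSON and XML, and for XML it checks well-formedness only, not compliance with a DTD/XSD/RNG schema.

```python
import json
import xml.etree.ElementTree as ET


def _reject_constant(name):
    # Python's json module accepts NaN and Infinity by default;
    # they are not valid JSON, so treat them as errors.
    raise ValueError(f"{name} is not valid JSON")


def is_syntactically_valid(body: str, media_type: str) -> bool:
    """Return True if body parses cleanly for the given media type.

    Hypothetical helper for illustration; not part of any real checklist tool.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    base_type = media_type.split(";")[0].strip().lower()

    if base_type == "application/json" or base_type.endswith("+json"):
        def parse(text):
            return json.loads(text, parse_constant=_reject_constant)
    elif base_type in ("application/xml", "text/xml") or base_type.endswith("+xml"):
        # Well-formedness only: no schema validation happens here.
        parse = ET.fromstring
    else:
        raise ValueError(f"no checker for media type {base_type!r}")

    try:
        parse(body)
    except (ValueError, ET.ParseError):
        # json.JSONDecodeError is a subclass of ValueError.
        return False
    return True
```

For example, is_syntactically_valid('{"a": 1,}', "application/json") returns False because of the trailing comma, which is exactly the sort of thing that breaks strict parsers in other languages.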

Of course, those are just my personal preferences and would have to be worked on a bit. Once the list is specified, we would release the final document under an open content license (CC-BY-SA 3.0) so it could be reused.

The Good API checklist would be opinionated. I’m pro-HTTP. I’m pro-RDF. I’m pro-XML. I’m pro-JSON. (Above all, I’m pro-content negotiation and pro-format-diversity.) I’m pro-using-URIs-as-identifiers. I’m anti-SOAP. I’m anti-API-keys. I’m anti-complexity. I’m anti-complex-documentation. I’m of the opinion that APIs need to be simple enough that I can use them from curl: if I can use them from curl, I can use them from any programming language, since most programming languages now have good HTTP abstractions. I’m pro-HTTP because HTTP is the protocol of the damn web. I’m pro-standards. I’d hope people getting involved with this would be likewise.
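
As a rough illustration of the "if it works from curl, it works anywhere" point, here is a sketch using only Python's standard library. It builds a plain GET request that asks for a format via the Accept header (content negotiation) and identifies the client via User-Agent instead of an API key. The URL and the client name are made up for the example, and the request is only constructed here, not sent.

```python
import urllib.request


def build_request(url: str, media_type: str, user_agent: str) -> urllib.request.Request:
    """Build a keyless GET request that negotiates a format via Accept.

    Roughly equivalent to:
        curl -H "Accept: <media_type>" -A "<user_agent>" <url>
    """
    return urllib.request.Request(
        url,
        headers={
            "Accept": media_type,      # content negotiation: say which format you want
            "User-Agent": user_agent,  # identify yourself properly, no API key needed
        },
        method="GET",
    )


# Hypothetical resource URL and client name, for illustration only.
request = build_request(
    "https://api.example.org/things/42",
    "application/rdf+xml",
    "good-api-checker/0.1 (+https://example.org/contact)",
)
```

Passing the request to urllib.request.urlopen would actually perform the call; against a Good API, swapping the media type for "application/json" should be the only change needed to get the same resource in a different format.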

But the Good API committee wouldn’t be arseholes. The point would be that it is educational: it would be to help people design better APIs. Because better APIs are simpler to use, which means all that fancy long-tail economics voodoo might actually work. Better APIs don’t need custom code, which means there’s less to go wrong. Better APIs mean that people actually trust the API rather than engaging in anti-pattern behaviour like the ever-present “download the whole dataset and mirror it” (because data never changes, right?).

Upon being given an API to try out, we would try it out. Hopefully, this would be about half an hour a week of commitment: because if an API takes longer than half an hour to figure out, it’s probably not a good API. If it is a Good API, we’d pretty much all say “well done, you’ve got it!” but if it isn’t a Good API, we would provide reasons why it isn’t good.

If you are interested in getting involved, leave a comment.