October 9, 2013 / pauldundon

Why you can’t have a RESTful API

One of the questions which arises in discussion of REST is “what is a resource?” The difficulty with this question is that we have an instinctive grasp of what a resource is but it’s hard to pin down exactly what we mean. A resource is whatever flows over the wire; whatever we name with our URIs; the thing the user wants to work with. How we decide whether a given thing counts as a resource is somewhat open, and this is a problem if we want to identify resources and resource types as an early step in application design.

This article by Roy Fielding implies a more specific answer. The “hypertext constraint” on RESTful APIs requires that the possibility of transition between resources be captured in the data for those resources in a way which is defined by the media type, and not the resource type. If we have an “order” resource type and a “customer” resource type, and it’s possible to transition from an order to a customer (i.e., view the details of the customer who placed the order), then this fact must be captured in the details of the order in such a way that the client can present this possibility to the user without knowing that it is displaying an order entity as opposed to any other entity.
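
To make the constraint concrete, here is a minimal sketch in Python. The `links` list with `rel` and `href` fields is an assumed, hypothetical media-type convention (neither Fielding nor this article prescribes one), and the URIs and field values are made up. The point is only that `available_transitions` reads what the media type defines and never asks whether it is holding an order or a customer.

```python
# Hypothetical media type: every representation is a dict with a "links" list.
# The field names ("data", "links", "rel", "href") and the URIs are assumptions
# made for illustration only.

order = {
    "data": {"order_id": 17, "total": "42.00"},
    "links": [
        {"rel": "customer", "href": "/customers/3"},
        {"rel": "self", "href": "/orders/17"},
    ],
}

customer = {
    "data": {"name": "A. Customer"},
    "links": [{"rel": "self", "href": "/customers/3"}],
}


def available_transitions(representation):
    """Return (rel, href) pairs using only the media type's link convention.

    The function never inspects "data", so it behaves the same way for an
    order, a customer, or a resource type the client has never seen.
    """
    return [(link["rel"], link["href"]) for link in representation.get("links", [])]


print(available_transitions(order))
print(available_transitions(customer))
```

A client written this way can offer the "customer" transition to the user purely because the order's data says it exists.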

In his PhD dissertation, Fielding puts it like this: “Representational State Transfer is intended to evoke an image of how a well-designed Web application behaves: presented with a network of Web pages (a virtual state-machine), the user progresses through an application by selecting links (state transitions), resulting in the next page (representing the next state of the application) being transferred to the user and rendered for their use.”

This means that every resource delivered by the server has to contain a list of all the application state transitions which are possible from that resource: the links between application states just are the links between resources. And it follows that resources are application states.
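
The state-machine reading can be sketched as follows. This is an illustration under assumptions, not anything taken from Fielding: the `STATES` table, its URIs and its link names are all hypothetical, and an in-memory dict stands in for the server. Each resource is one application state, and it carries every transition that is possible from it.

```python
# Hypothetical in-memory "server": each URI names one application state, and
# each state lists all of the transitions available from it.
STATES = {
    "/search": {"title": "Search", "links": {"results": "/results"}},
    "/results": {"title": "Results", "links": {"first-hit": "/article", "home": "/search"}},
    "/article": {"title": "Article", "links": {"home": "/search"}},
}


def follow(uri, choices):
    """Start at `uri`, select one link per step, and return the URIs visited.

    A choice is only legal if the current state itself advertises it; the
    client holds no knowledge of the graph beyond the state it is looking at.
    """
    visited = [uri]
    for rel in choices:
        links = STATES[uri]["links"]
        if rel not in links:
            raise ValueError(f"state {uri} offers no transition {rel!r}")
        uri = links[rel]
        visited.append(uri)
    return visited


print(follow("/search", ["results", "first-hit", "home"]))
```

Because the only way to move is through a link in the current state, the set of resources and the set of application states coincide in this model.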

This brings a number of challenges. To help articulate these, we introduce the notion of a desideratum. If we use Google to search for “REST”, we get back some search results (which is what we wanted) along with some ads, the Google logo, and other sundry items which aren’t particularly what we wanted. The search results are the desideratum: the thing we wanted. It’s also the thing we would want to get access to if we bookmarked the page and used the bookmark later – at this point, we wouldn’t care if the logo had changed or we saw different ads. Similarly, it’s what we want someone else to see if we share a link with them.

We might be tempted to think that desiderata are resources. However:

  1. An application state might permit transitions which have nothing to do with the content of the desideratum being viewed. A top-level menu, for example, has links which are largely independent of the item on display.
  2. An application state might forbid transitions which are part of the content of a desideratum in order to encourage users to take a particular path during a process.
  3. The same desideratum might appear in more than one location in an application state graph. For example, a shopping cart might be visible in a view which is designed to allow searching for products in a catalogue, with a “checkout” link leading to another view of the shopping cart with a “proceed to payment” link.
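
The three points above can be sketched with the shopping cart example. Everything named here (`cart`, `MENU`, `browsing_state`, `checkout_state`, the URIs and link names) is hypothetical and chosen only for illustration.

```python
# The desideratum: a cart, including links that belong to its own content.
cart = {
    "items": ["kettle", "tea bags"],
    "content_links": {
        "kettle": "/products/kettle",
        "tea bags": "/products/tea-bags",
    },
}

# Point 1: menu links that have nothing to do with the cart's content.
MENU = {"home": "/", "help": "/help"}


def browsing_state(cart):
    """Catalogue view: menu links, the cart's own links, and a checkout link."""
    links = {**MENU, **cart["content_links"], "checkout": "/checkout"}
    return {"data": cart["items"], "links": links}


def checkout_state(cart):
    """Checkout view of the same cart.

    Point 2: the product links are part of the cart's content, but this state
    withholds them to keep the user on the path to payment.
    """
    return {"data": cart["items"], "links": {"proceed-to-payment": "/payment"}}


# Point 3: one desideratum, two different application states.
print(browsing_state(cart)["links"])
print(checkout_state(cart)["links"])
```

Both states carry the same data, yet they are different resources under the hypertext constraint because their outbound transitions differ.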

Thus, if resources are application states, then resources are not desiderata. This is significant because the temptation in writing a RESTful API is to provide HTTP access to a database of desiderata, which is, I take it, the main frustration Fielding raises in his article.

It’s useful to distinguish, at this point, between a desideratum which is an HTML document and a desideratum such as a customer record which is expressed as an HTML document. The distinction becomes clear if we consider what would happen if the user wanted to modify the desideratum in question. We can imagine a browser with an edit mode, which allowed the user to edit HTML (perhaps in a WYSIWYG sort of way) and post the results back to the server. If someone were part of a team publishing research findings, for example, then they might work with HTML documents in this way, and would expect, when entering “edit” mode, to see all the relevant markup, links and the like. On the other hand, if they were working in an office maintaining a list of contacts, and wanted to edit a phone number, when entering edit mode they would be surprised to see, and probably not able to understand, the markup around the number.

If the desiderata are HTML documents, then the issue we have identified is not a problem. As a document author, I determine which other documents might be of interest to my reader, and embed links to these in the document as I write it. Thus, there are no transitions which are not content-determined (point 1). Similarly, I have no interest in the process my reader follows; on the contrary, I want them to be able to move around as freely as possible. Thus, there are no transitions which are possible but forbidden (point 2), and no document needs to appear more than once in the application graph (point 3). All my links are in the data and the application graph is simply an emergent property of those links; there is no “application design process” as such because the application graph is fully determined by the data.
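
A small sketch of what “emergent” means here: given nothing but the documents, the graph can be read straight out of their markup. The `DOCUMENTS` table and its URIs are invented for the example; `html.parser.HTMLParser` is the standard-library parser.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


# Hypothetical documents, each written independently by its author.
DOCUMENTS = {
    "/article": '<p>See <a href="/fielding">Fielding</a> and <a href="/rest">REST</a>.</p>',
    "/fielding": '<p>Back to <a href="/article">the article</a>.</p>',
    "/rest": "<p>No outbound links.</p>",
}


def application_graph(documents):
    """Derive the state graph from the links the authors embedded."""
    graph = {}
    for uri, html in documents.items():
        collector = LinkCollector()
        collector.feed(html)
        graph[uri] = collector.hrefs
    return graph


print(application_graph(DOCUMENTS))
```

Nobody wrote the graph down; it falls out of the documents, which is the sense in which no application design process is needed.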

This is really the beauty of the web; we see it when we realise that the web is not a collection of applications, but one big application, engineered concurrently by millions of people, able to work independently. You can search Google and find this article; you can read it and then go and read Fielding’s comments about the hypertext constraint. The whole journey will involve multiple servers, and software engineered by possibly hundreds of people, yet all this is completely transparent – the web can work in this way precisely because no-one designs it; the shape of the web emerges from the bottom up, not the top down.

The difficulty begins because the driver for having an API is presumably that we want to write multiple applications, designed from the top down, which manipulate the same desiderata; and those desiderata are not HTML documents, but contact records, product details, tallies of who has made tea in the office etc. Thus, there must be multiple application designs in play, and multiple application state graphs. Each of those graphs defines a set of application states. Since these are resources, the server must be able to respond to a request for any application state we define. That is, the server owns the definition of possible application states. And since each application state encodes the possibilities of its outbound transitions, the server owns the definition of these as well. Thus, the interface to the server defines the application’s state graph.
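
As a rough illustration of this coupling, consider two applications built over the same contact record. The application names, state names and URI scheme below are all assumptions invented for the sketch. If every application state is a resource, the server has to know each application's graph in order to answer requests for its states.

```python
# The shared desideratum (hypothetical data).
DESIDERATA = {"contacts/1": {"name": "Ann", "phone": "555-0100"}}

# Two hypothetical applications, each designed top-down with its own graph:
# state name -> states reachable from it.
APP_GRAPHS = {
    "address-book": {"list": ["view"], "view": ["edit", "list"], "edit": ["view"]},
    "dialler": {"keypad": ["call"], "call": ["keypad"]},
}


def server_state(app, state, contact_id="contacts/1"):
    """What a server honouring the hypertext constraint would have to return:
    the data plus every outbound transition of this application state."""
    return {
        "data": DESIDERATA[contact_id],
        "links": [f"/{app}/{target}" for target in APP_GRAPHS[app][state]],
    }


def states_server_must_define(app_graphs):
    """Every state of every application becomes a resource on the server."""
    return sorted(f"/{app}/{state}" for app, graph in app_graphs.items() for state in graph)


print(server_state("address-book", "view"))
print(states_server_must_define(APP_GRAPHS))
```

Adding a third application, or a new screen to an existing one, means changing `APP_GRAPHS` on the server, which is the sense in which the server's interface defines the application's state graph.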

The point of an API is, of course, that it is independent of the state graph – it abstracts the data storage and processing operations which are common across applications, without prejudicing how the application uses those operations. A RESTful API which satisfies the hypertext constraint simply cannot do this, so is not, in fact, an API.

