7 Tips for Designing a Better REST API

If you need to develop a REST API for a database-driven application, it’s almost irresistible not to use the database tables as REST resources, the four HTTP methods as CRUD operations, and then simply expose your thinly-wrapped database as a REST API:

OperationHTTPSQL
CreatePOSTINSERT
ReadGETSELECT
UpdatePUTUPDATE
DeleteDELETEDELETE
The mapping between SQL and HTTP is deceptively simple.

The problem is that one of the foundations of the REST architecture is that the client-facing representation of a resource must be independent of the underlying implementation, and implementations details should definitely not be leaked to the client, which is all too easy with the database-driven approach.

It’s also important to ask yourself if an almost raw database is the best interface you can offer your API users? I mean there is already a near-perfect language for doing CRUD operations on database tables, it’s called SQL… And you probably have some business logic on top of those tables that your API users would appreciate not having to re-implement in their own code.

So how do you move beyond this database-oriented thinking and closer to a more RESTful design for your API?

Let’sThe mapping between SQL and HTTP is deceptively easy. find out…

1. Begin with the API User in Mind

Bestselling author and architect Sam Newman’s great book on microservices provides a powerful alternative to the database-driven approach for designing REST web services. It’s useful even if you don’t plan to use microservices.

Newman suggests that you divide your application into bounded contexts (similar to business areas). Each bounded context should provide an explicit interface for those who wish to interact with it. Implementation details of the bounded context that don’t need to be exposed to the outside world are hidden behind the interface.

You should use this explicit interface as the basis for your API design. Start by asking yourself what business capabilities do the API user needs, rather than what data that should be shared. In other words, ask yourself what does this bounded context do? and then ask yourself what data does it need to do that?

The promise is that if you wait with thinking about shared data until you know what business capabilities you need to offer, it will lead to a less database-oriented design.

I think his approach is a good way to jolt you out of the database-driven mindset, but you need to be careful that you don’t end up designing a REST-RPC hybrid.

What I also like about this approach is that it minimizes the interface and doesn’t expose all data by default, but hides internal data (like logging and configuration tables) from the client and instead focuses on what the client actually needs.

This also fits beautifully with veteran API designer Joshua Bloch’s maxim saying that When in doubt, leave it out (from his highly popular presentation on API design), and it also harmonizes with the REST principle that a representation of a resource doesn’t need to look like the underlying resource, but can be changed to make it easier for the client.

So feel free to think about what would be the easiest interface for the API user, and then let your resource take data from multiple tables and leave out columns that are irrelevant to the job that clients need to perform.

2. Use Subresources to Show Relationships

An attractive alternative to only using top-level resources is to use subresources to make the relationships between resources more obvious to the API user, and to reduce dependencies on keys inside the resource representation.

So how do you decide what resources should be subresources? A rule of thumb is that if the resource is a part of another resource then it should be a subresource (i.e. composition).

For example, if you have a customer, an order and an order line then an order line is a part of an order, but an order is not a part of a customer (i.e. the two exists independently and a customer is not made up of orders!)

So the URIs would look like this:

/customers/{id}

/orders/{id}/lines/{id}

A different rule of thumb is to also include aggregations as subresources. That is, a belongs to relationship. If we use this rule then an order belongs to a customer, so the path would look like this:

/customers/{id}/orders/{id}/lines/{id}

So what rule should you pick?

The idea with subresource is to make your API more readable. For example, even if you don’t know the API you can quickly guess that POST /customer/123/orders will create a new order for customer 123.

However, if you end up with more than about two levels then the URI starts to become really long and the readability is reduced.

You also need to be aware that subresources cannot be used outside the scope of their parent resource. In the second example, you need a customer id before you can lookup a order, so if you want a list of all open orders (regardless of customer) then you cannot do it in the second example.

Ehh, so what to pick?

If you want a flexible API, aim for fewer subresources. If you want a more readable API, aim for more subresources.

The important thing is that whatever rule of thumb you pick then be consistent about it. I mean the API user might disagree with your decision, but if you are using it consistently throughout your API, he or she will probably forgive you.

3. Use Snapshots for Dashboard Data

A deal-breaker for using subresources is that the client might need to access data across subresources to get data for a dashboard or something similar. For example, a manager might want to get some statistics about orders across all customers.

Before you go ahead and flatten your whole API, there are two alternatives you should consider.

First, remember that there is nothing that prevents you from having multiple URIs that point to the same underlying resource, so beside /customers/{id}/orders/{id}, you could add an extra URI to query orders outside of the customer scope:

/orders/{id}

To minimize the duplication of functionality, you can limit the top-level /orders URI to only accept GET requests, so if clients want to create a new order, they will always do it in the context of a customer.

Of course, you need to be careful not to duplicate resources unnecessarily, but if there is a customer need, then there is a customer need, and we need to find the best possible solution.

In RESTful Web Services Cookbook, Chief Engineer at eBay (and former Yahoo Architect) Subbu Allamaraju suggests an alternative approach called snapshots.

For example, if an order manager wants to see some specific statistics (5 latest orders, 5 biggest clients, etc.) then we can create a snapshot resource that finds and returns all these information:

/orders/snapshot

I personally like the snapshot approach better, because it doesn’t feel like querying a database. But with that said, the snapshot approach requires an intimate knowledge of the API user, and the extra order top-level resource will offer more flexibility.

4. Use Links for Relationships

Another way to show relationships between resources, without falling back on using keys in an SQL-like manner, is to embed links inside your responses:

{
  "id": 123,
  "title": "Mr.",
  "firstname": "Han",
  "surname": "Solo",
  "emailPromotion": "No",
  "_links": {
    "self": {
      "href": "https://api.example.com/v1/customers/123"
    }, 
    "contactDetails": {
      "href": "https://api.example.com/v1/customers/123/contactDetails"
    },
    "orders": {
      "href": "https://api.example.com/v1/orders?customerId=123"
    }
  }
}

A cool thing about links is that they allow autodiscovery by clients! When the client gets the response back then it can see in the _link section what other actions it can follow from here. This is just like when you are surfing the web where you come to a page and then you can follow its links to new pages.

Another nice thing is that clients will have fewer hard-coded links in their code, which will make the code more robust. If the client wants to see the customer’s orders, it can just follow the orders link to get them.

However, there are different opinions about if you should use links, or not…

Vinay Sahni writes in his excellent blog post Best Practices for Designing a Pragmatic RESTful API that links are a good idea, but we are not ready to use them yet. On the other hand, the RESTful maturity model says that when you start using links you have reached the highest level of REST maturity.

So what to do?

Well, Dr. Roy Fielding, an expert on software architectures and the inventor of REST architectural style, flatly said on his blog that if you don’t use links it ain’t REST services, and he kindly encourages you to use another buzz word for your API!

5. Hide Internal Codes

In an earlier post, I incidentally leaked internal codes in the job_id column on the employees table:

{
  "jobId": "SH_CLERK"
}

Needless to say, this is an implementation detail that gives away that we are using a relational database, and experienced Oracle users would instantly spot that it’s Oracle’s sample HR schema. This leaking makes it harder to switch to a document-oriented database, like MongoDB, where there is no concept of foreign keys.

But even if the chance of switching to MongoDB is zero, it still makes the response harder to read.

So a better approach is to let the REST API translate the internal code to the human-readable value that the code represents (i.e. “Shipping Clerk”) and then also remove the Id part of the field name.

{
  "job": "Shipping Clerk"
}

This version is definitely more readable, but a fair concern is if the service will be slower now that it needs to lookup the value? I used to be an avid reader of Tom Kyte, the Oracle DB expert, and still remember that you should always optimize from measurements. I mean there’s a good chance that the HTTP cache will help us out and make it less of a bottleneck than it appears at first glance.

As a rule of thumb, if performance means everything to you (or you have a lot of lookup fields) then you might consider leaking the internal codes. Otherwise, you should provide a more readable API by hiding them.

6. Translate Automatically

But what about translations? What if you have a multilingual application that has translations of the internal code in multiple languages? How do you handle that?

Simple! You let the API user specify its preferred language in the Accept-Language HTTP header. For example, Accept-Language: da and then the REST API should automatically translate it into Danish:

{
  "job": "Shippingmedarbejder"
}

7. Create a Resource for Metadata

So what if the API user needs to show a drop-down list that shows all possible jobs? How will he or she get a complete list of all possible values for the job field?

The easy solution for you is simply to write the list of possible values in your API documentation and then the API user can hardcode them in the drop-down list. But this will lead to fragile client apps that need to be updated when new job types are added, and it just doesn’t feel very web-like to rely on offline metadata.

A more robust solution is to create a metadata subresource that provides list of values and other metadata that are needed when using the resource. For example, a /employees/metadata subresource could provide the API user will all the metadata needed to interact with the employees resource.

This solution is similar to how Atlassian is doing in the JIRA API. If you make sure that the response from the metadata subresource is cached properly it shouldn’t affect performance adversely and you will provide a more flexible API that leads to more stable client apps.

That’s it for now. I really hope that some of these tips will help you design better REST APIs. Thanks for reading and take good care until next time!