[1/3] Clean API Endpoints? Si, Dai!

Selecting Resources Firebase Style

Since my first years as a backend developer I have often struggled to come up with meaningful endpoints to share with my frontend colleagues. After ten years of struggle, I have settled on a set of practices that seem to fit most of the scenarios I encounter in my everyday job.

Each time we hire a new junior I see them stumbling upon the same issues I had to overcome, so I put some effort into formalising these practices, and here they are!

What's Already There

Of course I know I'm not the first one to feel the need for best practices for RESTful API endpoints. Looking around the web, I found this article from Microsoft Azure, which I find quite complete and clear. For starters, the key concepts to grasp are:

  1. What is REST?
  2. HTTP Methods: What and How?
  3. What is a Resource?

When you think you have a good understanding of those key concepts, keep reading, as it's time to dig in!

The Firebase Realtime Database Approach

One thing that really helped me visualise and organise API endpoints is Firebase Realtime Database, which is basically a huge JSON file you can explore by adding segments to a URL.

This helped me realise two things:

  1. Basic CRUD API endpoints can replicate the underlying data structure and organisation: you're already putting effort into structuring your data in whatever grouping your database supports (tables, collections etc.), so I guess those are your resources, right?
  2. You can SELECT a specific piece of information in your database using a path: something like /table_name, /table_name/row_id, /table_name/row_id/column_name etc.
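
To make the path idea concrete, here is a minimal Python sketch that walks a nested dictionary one URL segment at a time. The function name select and the sample data are illustrative assumptions of mine; this is not Firebase's actual API, just the same idea in miniature.

```python
from typing import Any


def select(data: Any, path: str) -> Any:
    """Walk a nested dict following the segments of a URL-like path."""
    node = data
    # "/users/42/last_name" -> ["users", "42", "last_name"]; empty segments are skipped.
    for segment in filter(None, path.split("/")):
        if not isinstance(node, dict) or segment not in node:
            raise KeyError(f"nothing found at segment {segment!r} of {path!r}")
        node = node[segment]
    return node


# Hypothetical data, shaped like a tiny "huge JSON file".
db = {"users": {"42": {"first_name": "Carlo", "last_name": "Moretti"}}}

print(select(db, "/users"))               # the whole "table"
print(select(db, "/users/42"))            # a single "row"
print(select(db, "/users/42/last_name"))  # a single "column": Moretti
```

Each extra segment narrows the selection, which is the mental model the rest of this article builds on.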

So my base idea is that:

A good endpoint starts with selecting the right information you want to interact with!

Selecting Galore!

Let's dig in with some examples, starting with the "huge JSON database" case.

{
  "users": {
    "79ee89cc-fdde-4b63-8e12-953a23be8cc3": {
      "first_name": "Carlo",
      "last_name": "Moretti",
      "email": "carlo@nodai.wtf",
      "dogs": [
        "a18c50e9-1845-4e5b-bb3c-e6edec7d10e0"
      ]
    },
    "66056e5c-67d6-4600-b921-3001701565ca": {
      "first_name": "Jennifer",
      "last_name": "Rovetta",
      "email": "jenny@nodai.wtf",
      "dogs": [
        "a18c50e9-1845-4e5b-bb3c-e6edec7d10e0"
      ]
    },
    "491d8774-d304-4c30-81d3-42e5a3bcc024": {
      "first_name": "Paolo",
      "last_name": "Moretti",
      "email": "paolo@nodai.wtf",
      "dogs": [
        "88833e53-4d48-4d2a-9abb-75bfe3181670"
      ]
    }
  },
  "dogs": {
    "a18c50e9-1845-4e5b-bb3c-e6edec7d10e0": {
      "name": "Buck",
      "age": 14,
      "breed": "Mixed",
      "owners": [
         "79ee89cc-fdde-4b63-8e12-953a23be8cc3",
         "66056e5c-67d6-4600-b921-3001701565ca"
      ]
    },
    "88833e53-4d48-4d2a-9abb-75bfe3181670": {
      "name": "Gus",
      "age": 7,
      "breed": "German Shepherd",
      "owners": [
         "491d8774-d304-4c30-81d3-42e5a3bcc024"
      ]
    }
  }
}

Let's get going with some nice users endpoint paths:

  • GET - /users says – «give me the list of all users»
  • GET - /users/:userId says – «give me the user with the given userId»
  • GET - /users?last_name=Moretti says – «give me the list of users with last name Moretti»
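
As a rough sketch of how the third endpoint could behave, the snippet below treats every query string parameter as an equality filter on the users collection. The get_users function and the trimmed-down USERS dictionary are assumptions made for illustration; a real service would delegate this to its framework and database.

```python
from urllib.parse import parse_qsl, urlsplit

# Trimmed-down copy of the "users" collection from the JSON above.
USERS = {
    "79ee89cc-fdde-4b63-8e12-953a23be8cc3": {"first_name": "Carlo", "last_name": "Moretti"},
    "66056e5c-67d6-4600-b921-3001701565ca": {"first_name": "Jennifer", "last_name": "Rovetta"},
    "491d8774-d304-4c30-81d3-42e5a3bcc024": {"first_name": "Paolo", "last_name": "Moretti"},
}


def get_users(url: str) -> list[dict]:
    """Hypothetical handler for GET /users: always returns a list."""
    # "/users?last_name=Moretti" -> {"last_name": "Moretti"}
    filters = dict(parse_qsl(urlsplit(url).query))
    return [
        {"id": user_id, **user}
        for user_id, user in USERS.items()
        # A user is kept only if it matches every filter.
        if all(user.get(field) == value for field, value in filters.items())
    ]


print(len(get_users("/users")))  # 3
print([u["first_name"] for u in get_users("/users?last_name=Moretti")])  # ['Carlo', 'Paolo']
```

Note that the filtered call still returns a list, even when nothing matches.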

Let's go deeper:

  • GET - /users/:userId/dogs says – «give me the list of dogs of the user with the given userId»
  • GET - /users/:userId/dogs?min_age=12 says – «give me the list of dogs of the user with the given userId whose age is greater than or equal to 12»
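
The nested selection can be sketched in a few lines of Python. The get_user_dogs function is a hypothetical handler, and DB is a reduced copy of the JSON above; min_age is applied as a filter on the resulting list.

```python
from typing import Optional

CARLO = "79ee89cc-fdde-4b63-8e12-953a23be8cc3"
PAOLO = "491d8774-d304-4c30-81d3-42e5a3bcc024"
BUCK = "a18c50e9-1845-4e5b-bb3c-e6edec7d10e0"
GUS = "88833e53-4d48-4d2a-9abb-75bfe3181670"

# Reduced copy of the example database.
DB = {
    "users": {
        CARLO: {"first_name": "Carlo", "dogs": [BUCK]},
        PAOLO: {"first_name": "Paolo", "dogs": [GUS]},
    },
    "dogs": {
        BUCK: {"name": "Buck", "age": 14},
        GUS: {"name": "Gus", "age": 7},
    },
}


def get_user_dogs(user_id: str, min_age: Optional[int] = None) -> list[dict]:
    """Hypothetical handler for GET /users/:userId/dogs[?min_age=N]."""
    user = DB["users"][user_id]  # raises KeyError for an unknown user
    # Resolve the dog ids stored on the user into full dog objects.
    dogs = [{"id": dog_id, **DB["dogs"][dog_id]} for dog_id in user["dogs"]]
    if min_age is not None:
        dogs = [dog for dog in dogs if dog["age"] >= min_age]
    return dogs


print([d["name"] for d in get_user_dogs(CARLO, min_age=12)])  # ['Buck']
print(get_user_dogs(PAOLO, min_age=12))                       # []
```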

But Let's Not Go Too Far!

Let's say you used the endpoint GET - /users/:userId/dogs to fetch the list of a given user's dogs and you now want to get the details of a single dog in that list. At this point, you might feel it is legitimate to create an endpoint like this:

GET - /users/:userId/dogs/:dogId whose "inner voice" says – «give me the dog with id dogId from the list of dogs owned by the user with id userId»

But does this really make sense? Well, it depends...

Does the fact that the dog we are looking for belongs to the given user affect what we expect?

I think in most cases the answer would be no. In that case it's way better to "step back" to the base dogs resource and just call:

GET - /dogs/:dogId

On the other hand, we may want to receive a 404 - Not Found error if we try to fetch info about a dog that is not owned by the given user. In that case, the "nested" selection would work great!

I think a general rule when building a selection path is:

Does the context you're giving affect what you expect? If not, then go back to the base resource!
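
The difference between the two selections can be sketched as follows. Everything here is illustrative: NotFound stands in for an HTTP 404 response, and get_dog and get_user_dog are hypothetical handlers over a reduced copy of the example data.

```python
CARLO = "79ee89cc-fdde-4b63-8e12-953a23be8cc3"
PAOLO = "491d8774-d304-4c30-81d3-42e5a3bcc024"
BUCK = "a18c50e9-1845-4e5b-bb3c-e6edec7d10e0"
GUS = "88833e53-4d48-4d2a-9abb-75bfe3181670"

DB = {
    "users": {CARLO: {"dogs": [BUCK]}, PAOLO: {"dogs": [GUS]}},
    "dogs": {BUCK: {"name": "Buck"}, GUS: {"name": "Gus"}},
}


class NotFound(Exception):
    """Stands in for an HTTP 404 - Not Found response."""


def get_dog(dog_id: str) -> dict:
    """GET /dogs/:dogId - ownership does not affect the result."""
    if dog_id not in DB["dogs"]:
        raise NotFound(f"dog {dog_id}")
    return DB["dogs"][dog_id]


def get_user_dog(user_id: str, dog_id: str) -> dict:
    """GET /users/:userId/dogs/:dogId - the owner is part of the selection."""
    user = DB["users"].get(user_id)
    # The dog must exist in this user's own list, otherwise it is "not found".
    if user is None or dog_id not in user["dogs"]:
        raise NotFound(f"dog {dog_id} of user {user_id}")
    return get_dog(dog_id)


print(get_dog(GUS))              # found, whoever owns it
print(get_user_dog(PAOLO, GUS))  # found, Paolo owns Gus
try:
    get_user_dog(CARLO, GUS)     # Carlo does not own Gus
except NotFound as error:
    print("404:", error)
```

If the nested handler would never raise where the flat one succeeds, the user context adds nothing and the flat endpoint is enough.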

Attribute Level Selections

Though I hinted earlier that /table_name/row_id/column_name would be a possible selection path, I'm not too fond of such a selection: it goes too deep into inspecting the model, and it gets quite confusing to figure out the effects and responses of such endpoints.

Let's consider the following examples:

  • GET - /users/last_name says – «give me the list of last names of all the users»
  • GET - /users/:userId/last_name says – «give me the last name of the user with the given userId»

But what do we expect the response of such endpoints to be?

  • A (list of) string(s): not quite right if we want to stick with JSON formatted responses and resource models.
  • A (list of) { "last_name": "...." } objects: also quite confusing, because those items do not actually represent a resource and might be redundant to handle client side.

On the other hand, we might want to populate a select (dropdown) with the list of all available last names in the system, so calling something like:

GET - /users/last_name?distinct=true which says – «give me the list of all distinct last names of all users»

might actually make sense!
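
A minimal sketch of what the distinct flag could do, assuming the endpoint returns a plain list of strings (one of the two response shapes discussed above, chosen here only for brevity). The get_last_names function is hypothetical.

```python
# Last names taken from the example database above.
USERS = {
    "79ee89cc-fdde-4b63-8e12-953a23be8cc3": {"last_name": "Moretti"},
    "66056e5c-67d6-4600-b921-3001701565ca": {"last_name": "Rovetta"},
    "491d8774-d304-4c30-81d3-42e5a3bcc024": {"last_name": "Moretti"},
}


def get_last_names(users: dict, distinct: bool = False) -> list[str]:
    """Hypothetical handler for GET /users/last_name[?distinct=true]."""
    names = [user["last_name"] for user in users.values()]
    if distinct:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        names = list(dict.fromkeys(names))
    return names


print(get_last_names(USERS))                 # ['Moretti', 'Rovetta', 'Moretti']
print(get_last_names(USERS, distinct=True))  # ['Moretti', 'Rovetta']
```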

Singulars and Plurals

A general rule is to use plural resource names, because a plural represents a collection of such resources. So users represents the collection of all users, which can map to a MongoDB collection, a table in PostgreSQL, etc.

When selecting a collection you should always expect a list of resources.

Query string parameters should represent filters to apply to such a list, so selecting a collection with query string parameters should still return a list of filtered resources.

Single objects should be selected using an identifier (like a UUID), so to get a specific object, add the identifier to the path after the collection it is found in, such as /users/:userId.

Now let's imagine each user can have a list of addresses which are user-specific and can't really be shared between users, so a user can look like:

{
  "first_name": "Carlo",
  "last_name": "Moretti",
  "email": "carlo@nodai.wtf",
  "dogs": [
    "a18c50e9-1845-4e5b-bb3c-e6edec7d10e0"
  ],
  "addresses": [
    {
      "favourite": true,
      "address": "Via Lunga 123333",
      "zip_code": "20100",
      "city": "Brescia",
      "country": "Italy"
    },
    {
      "address": "Via Corta 1/10",
      "zip_code": "20100",
      "city": "Brescia",
      "country": "Italy"
    }
  ]
}

Since addresses aren't resources and don't have an identifier, you shouldn't be able to select a single address with a path! You should stop at /users/:userId/addresses or at most /users/:userId/addresses?favourite=true. Both must return lists of addresses, even though the second one will always contain at most a single item, because a user can have at most one favourite address.
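
Here is a small sketch of that rule, assuming a hypothetical get_user_addresses handler: with or without the favourite filter, the return value is always a list.

```python
from typing import Optional

# The example user from above, reduced to the fields used here.
USER = {
    "first_name": "Carlo",
    "addresses": [
        {"favourite": True, "address": "Via Lunga 123333", "city": "Brescia"},
        {"address": "Via Corta 1/10", "city": "Brescia"},
    ],
}


def get_user_addresses(user: dict, favourite: Optional[bool] = None) -> list[dict]:
    """Hypothetical handler for GET /users/:userId/addresses[?favourite=...]."""
    addresses = user.get("addresses", [])
    if favourite is not None:
        # An address without the "favourite" key counts as not favourite.
        addresses = [a for a in addresses if a.get("favourite", False) == favourite]
    return addresses


print(len(get_user_addresses(USER)))             # 2
print(get_user_addresses(USER, favourite=True))  # a list with one address
```

The client never has to guess whether it will receive an object or a list: a collection path always yields a list.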

So to wrap it up:

Collections and sub-collections should always be addressed with a plural and return a list of items. Only lists should accept query string filters. Only items that have a proper identifier should be selectable.

Next Steps

In this article I explained some best-practice rules I follow when creating endpoints to select information (with a nice HTTP GET method!). In the next articles I'll cover:

  • [2/3] Apply CRUD methods to selected information
  • [3/3] Other non CRUD methods and actions