API with NestJS #77. Offset and keyset pagination with raw SQL queries

JavaScript NestJS

This entry is part 77 of 121 in the API with NestJS

So far, we have returned the full content of our tables. However, as our database grows, this might prove not to be the best approach in terms of performance. A popular solution is to serve the data in chunks by presenting multiple pages or implementing infinite scrolling. In this article, we implement its back-end aspect using NestJS and PostgreSQL. We also compare various approaches to achieving it and point out their advantages and disadvantages.

You can find the code from this article in this repository.

Offset and limit

Let’s start by investigating this simple query:

It returns all of the records from the table.

 

A significant thing to acknowledge is that the order of the above rows is not guaranteed. However, when implementing pagination, we depend on the order of rows to be predictable. Therefore, we should use the clause.

To start paginating our data, we need to limit the number of rows in our query. To do that, we need the statement.

Thanks to the above, we now get the first ten items instead all of them. This allows us to present the user with the first page of results.

To serve the second page of the data, we need to specify the starting point of our query. We can use the keyword to specify how many rows we want to skip.

Above, we omit the first ten posts and get ten posts in the results. In our case, it gives us entities with ids from 11 to 20. This is where the order of our data plays a significant role. We can easily modify it by changing the clause, but keeping some order is important.

Counting the number of rows

It is a common approach to display the number of data pages to the user. For example, if we have fifty rows and display ten per page, we have five data pages.

To do the above, we need to know the number of rows of data in our table. To do that, we can use the keyword.

We can use the keyword while selecting data from some columns. When doing that, we need to specify the section of the data we are counting by partitioning it. For example, we can count the number of posts by a certain author.

To present the results in a readable way, let’s only display one row per author using the keyword.

Above, we can see that the author with id wrote two posts, and the author with id wrote forty posts.

In our case, we want to count the total number of posts. However, even though that’s the case, we still need to use the clause.

Grouping and partitioning data with the function is a good topic for a separate article.

The whole idea is to count the number of rows and fetch their details in the same transaction to keep the integrity of the data. When we run a single query, PostgreSQL wraps it in a transaction out of the box.

We can define a transaction separately if we want to count the posts in a separate statement.

If you want to know more about transactions, check out API with NestJS #76. Working with transactions using raw SQL queries

It is also important to notice that PostgreSQL returns the result of as big int. The maximum value of a regular integer is 2³¹⁻¹ (2,147,483,647), and for a big integer, it is 2⁶³⁻¹ (9,223,372,036,854,775,807).

Unfortunately, JavaScript does not know how to parse big integers to JSON out of the box.

Uncaught TypeError: Do not know how to serialize a BigInt

If we don’t expect our table to hold more than 2,147,483,647 elements, we can cast the result of to a regular integer.

Implementing offset pagination with NestJS

When implementing the offset pagination with NestJS, we expect the user to provide the offset and limit as query parameters. To handle that, we can create a designated class.

paginationParams.ts

We then use it in our controller.

posts.controller.ts

The last step is to implement the logic in our class.

posts.repository.ts

A significant thing above is that we provide default values for offset and limit:

  • providing for offset means that we don’t intend to skip any rows,
  • by setting the limit to , we state that we don’t want to limit the results.

Doing all of the above, we end up with fully functional offset pagination.

Disadvantages

The offset and limit approach to pagination is widely used. Unfortunately, it has some significant disadvantages.

The most important caveat is that the database needs to compute all of the rows skipped by the keyword. This can take a toll on the performance:

  • first, the database sorts all of the rows as specified in the clause,
  • then, PostgreSQL drops the number of rows specified in the .

Aside from the above issue, we can run into a problem with consistency:

  1. the first user fetches page number one with posts,
  2. the second user creates a new post that ends up on page number one,
  3. the first user fetches the second page.

Unfortunately, the above operations cause the first user to see the last element of the first page again on the second page. Besides that, the user missed the element added to the first page.

Advantages

The offset approach is very common and straightforward to implement. It is also very easy to change the column we use for sorting, including multiple columns. It makes it an acceptable solution in many cases, especially if the offset is not expected to be big and the data inconsistencies are acceptable.

Keyset pagination

We can take another approach to pagination by filtering out the data we’ve already seen using the keyword instead of . First, let’s run the following query:

In the results, we can see that the last post has an id of . We can now use this knowledge to request posts with the id bigger than .

To get the next page of results, we need to inspect the above results and notice that the id of the last row is . We can use that to modify our clause.

Unfortunately, this exposes the most significant disadvantages of the keyset pagination. To get a chunk of data, we need to know the id of the last element of the previous chunk. This makes traversing more than one page at once impossible.

To change the column by which we order our elements, we need to modify both and clauses.

Counting the number of rows

It is crucial to notice that using the clause affects the rows counted with . To deal with this issue, we need to count the rows separately. We can create an explicit transaction or use a Common Table Expression query using the statement.

Implementing keyset pagination with NestJS

First, let’s modify our class to accept an additional query parameter.

paginationParams.ts

We also need to modify our to handle the additional parameter.

posts.repository.ts

Disadvantages

The most apparent disadvantage of the keyset pagination is that the users need to know the id of the row they want to start with. However, we could overcome that by mixing the offset-based pagination with the keyset pagination.

Additionally, the column used in the clause should have an index for an additional performance boost. Fortunately, PostgreSQL creates an index for every primary key out of the box. Therefore, the keyset pagination should perform well when using ids.

Also, ordering the results by text fields might not be straightforward if we want to use natural sorting. If you want to know more, check out this answer on StackOverflow.

Advantages

The keyset pagination can be a significant performance improvement over the offset-based approach. It also solves the data inconsistency issue we can experience with offset pagination. The user adding or removing elements between fetching chunks of data does not cause elements to be duplicated or skipped.

Summary

In this article, we’ve gone through two different approaches to pagination with PostgreSQL. After pointing out their advantages and disadvantages, we can conclude that each can be a reasonable solution. The keyset pagination is more restrictive but can provide a performance boost. Fortunately, we can mix different ways of paginating the data, and combining the offset and keyset pagination can cover a wide variety of cases.

Series Navigation<< API with NestJS #76. Working with transactions using raw SQL queriesAPI with NestJS #78. Generating statistics using aggregate functions in raw SQL >>
Subscribe
Notify of
guest
1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Bert
Bert
1 year ago

Keyset pagination seems great. Although an issue I see : when the first user checks page 2, the second user deletes the entity with ID 20, the first user fetches page 3

But isn’t keyset pagination only usable with generated number ID’s?