- 1. API with NestJS #1. Controllers, routing and the module structure
- 2. API with NestJS #2. Setting up a PostgreSQL database with TypeORM
- 3. API with NestJS #3. Authenticating users with bcrypt, Passport, JWT, and cookies
- 4. API with NestJS #4. Error handling and data validation
- 5. API with NestJS #5. Serializing the response with interceptors
- 6. API with NestJS #6. Looking into dependency injection and modules
- 7. API with NestJS #7. Creating relationships with Postgres and TypeORM
- 8. API with NestJS #8. Writing unit tests
- 9. API with NestJS #9. Testing services and controllers with integration tests
- 10. API with NestJS #10. Uploading public files to Amazon S3
- 11. API with NestJS #11. Managing private files with Amazon S3
- 12. API with NestJS #12. Introduction to Elasticsearch
- 13. API with NestJS #13. Implementing refresh tokens using JWT
- 14. API with NestJS #14. Improving performance of our Postgres database with indexes
- 15. API with NestJS #15. Defining transactions with PostgreSQL and TypeORM
- 16. API with NestJS #16. Using the array data type with PostgreSQL and TypeORM
- 17. API with NestJS #17. Offset and keyset pagination with PostgreSQL and TypeORM
- 18. API with NestJS #18. Exploring the idea of microservices
- 19. API with NestJS #19. Using RabbitMQ to communicate with microservices
- 20. API with NestJS #20. Communicating with microservices using the gRPC framework
- 21. API with NestJS #21. An introduction to CQRS
- 22. API with NestJS #22. Storing JSON with PostgreSQL and TypeORM
- 23. API with NestJS #23. Implementing in-memory cache to increase the performance
- 24. API with NestJS #24. Cache with Redis. Running the app in a Node.js cluster
- 25. API with NestJS #25. Sending scheduled emails with cron and Nodemailer
- 26. API with NestJS #26. Real-time chat with WebSockets
- 27. API with NestJS #27. Introduction to GraphQL. Queries, mutations, and authentication
- 28. API with NestJS #28. Dealing in the N + 1 problem in GraphQL
- 29. API with NestJS #29. Real-time updates with GraphQL subscriptions
- 30. API with NestJS #30. Scalar types in GraphQL
- 31. API with NestJS #31. Two-factor authentication
- 32. API with NestJS #32. Introduction to Prisma with PostgreSQL
- 33. API with NestJS #33. Managing PostgreSQL relationships with Prisma
- 34. API with NestJS #34. Handling CPU-intensive tasks with queues
- 35. API with NestJS #35. Using server-side sessions instead of JSON Web Tokens
- 36. API with NestJS #36. Introduction to Stripe with React
- 37. API with NestJS #37. Using Stripe to save credit cards for future use
- 38. API with NestJS #38. Setting up recurring payments via subscriptions with Stripe
- 39. API with NestJS #39. Reacting to Stripe events with webhooks
- 40. API with NestJS #40. Confirming the email address
- 41. API with NestJS #41. Verifying phone numbers and sending SMS messages with Twilio
- 42. API with NestJS #42. Authenticating users with Google
- 43. API with NestJS #43. Introduction to MongoDB
- 44. API with NestJS #44. Implementing relationships with MongoDB
- 45. API with NestJS #45. Virtual properties with MongoDB and Mongoose
- 46. API with NestJS #46. Managing transactions with MongoDB and Mongoose
- 47. API with NestJS #47. Implementing pagination with MongoDB and Mongoose
- 48. API with NestJS #48. Definining indexes with MongoDB and Mongoose
- 49. API with NestJS #49. Updating with PUT and PATCH with MongoDB and Mongoose
- 50. API with NestJS #50. Introduction to logging with the built-in logger and TypeORM
- 51. API with NestJS #51. Health checks with Terminus and Datadog
- 52. API with NestJS #52. Generating documentation with Compodoc and JSDoc
- 53. API with NestJS #53. Implementing soft deletes with PostgreSQL and TypeORM
- 54. API with NestJS #54. Storing files inside a PostgreSQL database
- 55. API with NestJS #55. Uploading files to the server
- 56. API with NestJS #56. Authorization with roles and claims
- 57. API with NestJS #57. Composing classes with the mixin pattern
- 58. API with NestJS #58. Using ETag to implement cache and save bandwidth
- 59. API with NestJS #59. Introduction to a monorepo with Lerna and Yarn workspaces
- 60. API with NestJS #60. The OpenAPI specification and Swagger
- 61. API with NestJS #61. Dealing with circular dependencies
- 62. API with NestJS #62. Introduction to MikroORM with PostgreSQL
- 63. API with NestJS #63. Relationships with PostgreSQL and MikroORM
- 64. API with NestJS #64. Transactions with PostgreSQL and MikroORM
- 65. API with NestJS #65. Implementing soft deletes using MikroORM and filters
- 66. API with NestJS #66. Improving PostgreSQL performance with indexes using MikroORM
- 67. API with NestJS #67. Migrating to TypeORM 0.3
- 68. API with NestJS #68. Interacting with the application through REPL
- 69. API with NestJS #69. Database migrations with TypeORM
- 70. API with NestJS #70. Defining dynamic modules
- 71. API with NestJS #71. Introduction to feature flags
- 72. API with NestJS #72. Working with PostgreSQL using raw SQL queries
- 73. API with NestJS #73. One-to-one relationships with raw SQL queries
- 74. API with NestJS #74. Designing many-to-one relationships using raw SQL queries
- 75. API with NestJS #75. Many-to-many relationships using raw SQL queries
- 76. API with NestJS #76. Working with transactions using raw SQL queries
- 77. API with NestJS #77. Offset and keyset pagination with raw SQL queries
- 78. API with NestJS #78. Generating statistics using aggregate functions in raw SQL
- 79. API with NestJS #79. Implementing searching with pattern matching and raw SQL
- 80. API with NestJS #80. Updating entities with PUT and PATCH using raw SQL queries
- 81. API with NestJS #81. Soft deletes with raw SQL queries
- 82. API with NestJS #82. Introduction to indexes with raw SQL queries
- 83. API with NestJS #83. Text search with tsvector and raw SQL
- 84. API with NestJS #84. Implementing filtering using subqueries with raw SQL
- 85. API with NestJS #85. Defining constraints with raw SQL
- 86. API with NestJS #86. Logging with the built-in logger when using raw SQL
- 87. API with NestJS #87. Writing unit tests in a project with raw SQL
- 88. API with NestJS #88. Testing a project with raw SQL using integration tests
- 89. API with NestJS #89. Replacing Express with Fastify
- 90. API with NestJS #90. Using various types of SQL joins
- 91. API with NestJS #91. Dockerizing a NestJS API with Docker Compose
- 92. API with NestJS #92. Increasing the developer experience with Docker Compose
- 93. API with NestJS #93. Deploying a NestJS app with Amazon ECS and RDS
- 94. API with NestJS #94. Deploying multiple instances on AWS with a load balancer
- 95. API with NestJS #95. CI/CD with Amazon ECS and GitHub Actions
- 96. API with NestJS #96. Running unit tests with CI/CD and GitHub Actions
- 97. API with NestJS #97. Introduction to managing logs with Amazon CloudWatch
- 98. API with NestJS #98. Health checks with Terminus and Amazon ECS
- 99. API with NestJS #99. Scaling the number of application instances with Amazon ECS
- 100. API with NestJS #100. The HTTPS protocol with Route 53 and AWS Certificate Manager
- 101. API with NestJS #101. Managing sensitive data using the AWS Secrets Manager
- 102. API with NestJS #102. Writing unit tests with Prisma
- 103. API with NestJS #103. Integration tests with Prisma
- 104. API with NestJS #104. Writing transactions with Prisma
- 105. API with NestJS #105. Implementing soft deletes with Prisma and middleware
- 106. API with NestJS #106. Improving performance through indexes with Prisma
- 107. API with NestJS #107. Offset and keyset pagination with Prisma
- 108. API with NestJS #108. Date and time with Prisma and PostgreSQL
- 109. API with NestJS #109. Arrays with PostgreSQL and Prisma
- 110. API with NestJS #110. Managing JSON data with PostgreSQL and Prisma
- 111. API with NestJS #111. Constraints with PostgreSQL and Prisma
- 112. API with NestJS #112. Serializing the response with Prisma
- 113. API with NestJS #113. Logging with Prisma
- 114. API with NestJS #114. Modifying data using PUT and PATCH methods with Prisma
- 115. API with NestJS #115. Database migrations with Prisma
- 116. API with NestJS #116. REST API versioning
- 117. API with NestJS #117. CORS – Cross-Origin Resource Sharing
- 118. API with NestJS #118. Uploading and streaming videos
- 119. API with NestJS #119. Type-safe SQL queries with Kysely and PostgreSQL
- 120. API with NestJS #120. One-to-one relationships with the Kysely query builder
- 121. API with NestJS #121. Many-to-one relationships with PostgreSQL and Kysely
As our database grows, we need to put more and more emphasis on performance. A popular approach to dealing with this problem is by implementing indexes. In this article, we look into doing that with raw SQL in a project that does not use ORM.
Introducing indexes
So far, in this series, we’ve created a table for storing posts.
1 2 3 4 5 6 |
CREATE TABLE posts ( id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY, title text NOT NULL, post_content text NOT NULL, author_id int REFERENCES users(id) NOT NULL ) |
We’ve also implemented getting a list of posts written by a particular author.
posts.repository.ts
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 |
import { Injectable, } from '@nestjs/common'; import DatabaseService from '../database/database.service'; import PostModel from './post.model'; @Injectable() class PostsRepository { constructor(private readonly databaseService: DatabaseService) {} async getByAuthorId( authorId: number, offset = 0, limit: number | null = null, idsToSkip = 0, ) { const databaseResponse = await this.databaseService.runQuery( ` WITH selected_posts AS ( SELECT * FROM posts WHERE author_id=$1 AND id > $4 ORDER BY id ASC OFFSET $2 LIMIT $3 ), total_posts_count_response AS ( SELECT COUNT(*)::int AS total_posts_count FROM posts WHERE author_id=$1 AND id > $4 ) SELECT * FROM selected_posts, total_posts_count_response `, [authorId, offset, limit, idsToSkip], ); const items = databaseResponse.rows.map( (databaseRow) => new PostModel(databaseRow), ); const count = databaseResponse.rows[0]?.total_posts_count || 0; return { items, count, }; } // ... } export default PostsRepository; |
The above method also implements pagination. If you want to know more, check out API with NestJS #77. Offset and keyset pagination with raw SQL queries
In our SQL query, we add WHERE author_id=$1. Because of that, PostgreSQL needs to scan the entire posts table to find matching records. The bigger the table, the more it affects the performance. Let’s run a simple query and see how fast it is
1 2 3 |
EXPLAIN ANALYZE SELECT * FROM posts WHERE author_id = 14 |
The crucial part of the above query plan is that PostgreSQL uses the parallel sequential scan that takes quite a lot of time. We can deal with this problem by adding an index.
Adding an index
An index can make some of our queries faster by organizing a table using a particular column. Let’s create a migration that adds an index using the author_id column.
1 |
npx knex migrate:make add_post_author_id_index |
20221105181728_add_post_author_id_index.ts
1 2 3 4 5 6 7 8 9 10 11 12 13 |
import { Knex } from 'knex'; export async function up(knex: Knex): Promise<void> { return knex.raw(` CREATE INDEX post_author_id_index ON posts (author_id) `); } export async function down(knex: Knex): Promise<void> { return knex.raw(` DROP INDEX post_author_id_index `); } |
After running the npx knex migrate:latest command, our database is ready to go. Let’s rerun the same query and compare.
1 2 3 |
EXPLAIN ANALYZE SELECT * FROM posts WHERE author_id = 14 |
As we can see above, the execution time of our query dropped drastically. Now, PostgreSQL can run an index scan instead of a parallel sequential one.
When we create an index, PostgreSQL maintains a data structure organized using a certain column. We can imagine the index as key and value pairs. In the case of the above example, the keys are author ids, and the values point to particular posts.
author_id | post_id |
---|---|
1 | 1 |
2 | 2 |
2 | 3 |
2 | 4 |
3 | 5 |
To be more precise, the actual data structures PostgreSQL uses for indexing are more intricate. By default, PostgreSQL implements the B-tree data structure with each leaf poiting to a particular table row.
Thanks to having a data structure sorted by the author id, PostgreSQL can quickly find all posts written by a particular author. However, besides the obvious advantages when fetching data, indexes have some important downsides.
Indexes can speed up our SELECT queries but make our inserts and updates slower. This is because PostgreSQL needs to update the indexes each time we modify the data. Indexes also require additional space in our database.
Multi-column indexes
Making queries with multiple conditions is a popular case. Let’s take a look at our comments table:
1 2 3 4 5 6 7 |
CREATE TABLE comments ( id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY, content text NOT NULL, post_id int REFERENCES posts(id) NOT NULL, author_id int REFERENCES posts(id) NOT NULL, deletion_date timestamptz ); |
A common query would be selecting comments that are not deleted and are related to a particular post.
1 2 |
SELECT * FROM comments WHERE post_id=1 AND deletion_date IS NULL |
Creating an index just on the post_id would speed up the above query. However, if we want to take it even further, we could create a multi-column index.
1 2 |
CREATE INDEX post_id_deletion_date_comments_index ON comments (post_id, deletion_date) |
Unique indexes
In one of the previous articles in this series, we’ve defined a table of users.
1 2 3 4 5 6 |
CREATE TABLE users ( id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY, email text NOT NULL UNIQUE, name text NOT NULL, password text NOT NULL ) |
We’ve decided to mark the email column as UNIQUE. Because of that, every time we insert a new row into the users table, PostgreSQL checks if the given email is already in the table.
The important thing is that creating a unique constraint causes PostgreSQL to create an index. Thanks to that, the database can quickly search through all existing records to determine if a given value is unique.
The index created through the UNIQUE keyword can also be helpful in the SELECT queries if the database decides it might cause a performance boost.
Types of indexes
All of the above indexes so far have used the B-tree structure. This is because it fits most use cases, but there are other options too.
Hash indexes
The hash indexes use the hash table under the hood, which might prove to be beneficial in some use cases.
1 |
CREATE INDEX posts_title_index ON posts USING hash (title) |
Generalized Inverted Indexes (GIN)
For the GIN indexes to work, we might need to enable the pg_trim and btree_gin extensions.
1 2 |
CREATE EXTENSION pg_trgm; CREATE EXTENSION btree_gin; |
Once we do the above, we can use the USING GIN keyword to create the index.
1 |
CREATE INDEX posts_title_index ON posts USING GIN (title) |
Generalized Search Tree (GIST)
The GIST indexes might be preferable over GIN in some cases. They might be useful when indexing geometric data and implementing text search.
For it to work, we might need to enable the btree_gist extension.
1 |
CREATE EXTENSION btree_gist; |
We can now use the USING GIST keyword.
1 |
CREATE INDEX posts_title_index ON posts USING GIST (title) |
Block Range Indexes (BRIN)
When dealing with data types with a linear sort order, the Block Range Indexes might prove to be useful.
1 |
CREATE INDEX posts_title_index ON posts USING BRIN (title) |
Summary
In this article, we’ve gone through the basics of indexes and considered their advantages and disadvantages. Then, we implemented an example that improves the performance of our SELECT queries.
Besides the regular indexes, we’ve also mentioned multi-column indexes and indexes using data structures different from the B-tree. Some of them are worth a separate article, so stay tuned!