API with NestJS #98. Health checks with Terminus and Amazon ECS

March 6, 2023

This entry is part 98 of 121 in the API with NestJS series

In one of the previous parts of this series, we learned how to use the Elastic Container Service to deploy multiple instances of our application. With this architecture, we maintain the target group, where each target is a single instance of our application. Thanks to that, the load balancer can route a particular API request to one of the registered targets.

Before redirecting the traffic to a particular target, the load balancer must know if the target can handle it. To determine that, the load balancer periodically sends requests to all registered targets to test them. We call those tests health checks. Thanks to performing them, the load balancer redirects the traffic only to the healthy targets.
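
To make the idea more concrete, below is a minimal sketch of routing that skips unhealthy targets. It does not show how the AWS load balancer is implemented. The Target shape, the createRouter function, and the round-robin strategy are assumptions made purely for illustration.

```typescript
// Hypothetical model of a target registered in a target group
interface Target {
  id: string;
  healthy: boolean;
}

// Returns a function that picks targets in round-robin order,
// considering only the targets currently marked as healthy
function createRouter(targets: Target[]) {
  let nextIndex = 0;

  return function pickTarget(): Target | undefined {
    const healthyTargets = targets.filter((target) => target.healthy);
    if (healthyTargets.length === 0) {
      // No healthy target means the request cannot be routed anywhere
      return undefined;
    }
    const target = healthyTargets[nextIndex % healthyTargets.length];
    nextIndex += 1;
    return target;
  };
}

const pickTarget = createRouter([
  { id: 'task-a', healthy: true },
  { id: 'task-b', healthy: false },
  { id: 'task-c', healthy: true },
]);

// Three consecutive requests go to task-a, task-c, and task-a again
const pickedIds = [pickTarget(), pickTarget(), pickTarget()].map(
  (target) => target?.id,
);
```

The unhealthy task-b never receives a request, which is the behavior we rely on when we expose a health check endpoint.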

A common approach is to create a designated endpoint that responds with the status of the application. To create it, we can use Terminus, a health check library that integrates with NestJS through the @nestjs/terminus package.

Using Terminus

Let’s start by installing the Terminus library.

npm install @nestjs/terminus

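To introduce an endpoint using Terminus, we should create a new controller. For NestJS to inject the HealthCheckService into it, the module that declares the controller needs to import the TerminusModule from @nestjs/terminus.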

health.controller.ts

import { Controller, Get } from '@nestjs/common';
import { HealthCheckService, HealthCheck } from '@nestjs/terminus';

@Controller('health')
class HealthController {
  constructor(private healthCheckService: HealthCheckService) {}

  @Get()
  @HealthCheck()
  check() {
    // An empty array means that no health indicators are checked yet
    return this.healthCheckService.check([]);
  }
}

export default HealthController;

The @HealthCheck() decorator is optional. As we can see under the hood, it allows for integrating Terminus with Swagger.

The most important thing above is the healthCheckService.check method. The code we have so far gives us a straightforward health check.

Built-in health indicators

We can perform more advanced checks using the health indicators built into Terminus. With each of them, we can verify a particular aspect of our application.

A very good example is the TypeOrmHealthIndicator. Under the hood, it performs a simple SELECT SQL query to verify that our database is up and running. Doing that also ensures we’ve established a connection successfully.

There are also the MikroOrmHealthIndicator, SequelizeHealthIndicator, and MongooseHealthIndicator if you are using a library other than TypeORM.

health.controller.ts

import { Controller, Get } from '@nestjs/common';
import {
  HealthCheckService,
  HealthCheck,
  TypeOrmHealthIndicator,
} from '@nestjs/terminus';

@Controller('health')
class HealthController {
  constructor(
    private healthCheckService: HealthCheckService,
    private typeOrmHealthIndicator: TypeOrmHealthIndicator,
  ) {}

  @Get()
  @HealthCheck()
  check() {
    return this.healthCheckService.check([
      // The 'database' key names this indicator in the response
      () => this.typeOrmHealthIndicator.pingCheck('database'),
    ]);
  }
}

export default HealthController;

The healthCheckService.check method responds with a few properties:

status - if all of our health indicators report success, it equals ok. Otherwise, it can be error or shutting_down. If the status is not ok, the endpoint responds with 503 Service Unavailable instead of 200 OK.
info - has data about every healthy indicator
error - contains information about every unhealthy indicator
details - has data about every indicator, both healthy and unhealthy
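
The relationship between these properties can be modeled with a small sketch that does not depend on any library. It is a simplified illustration, not the Terminus implementation. The summarize and toHttpStatus functions are hypothetical names, the message value is made up, and the shutting_down status is left out.

```typescript
// Each indicator reports its result under its own key
type IndicatorReport = Record<
  string,
  { status: 'up' | 'down'; message?: string }
>;

interface HealthSummary {
  status: 'ok' | 'error';
  info: IndicatorReport;
  error: IndicatorReport;
  details: IndicatorReport;
}

// Splits the indicator results into healthy (info) and unhealthy (error) ones
function summarize(reports: IndicatorReport[]): HealthSummary {
  const info: IndicatorReport = {};
  const error: IndicatorReport = {};

  for (const report of reports) {
    for (const [name, entry] of Object.entries(report)) {
      if (entry.status === 'up') {
        info[name] = entry;
      } else {
        error[name] = entry;
      }
    }
  }

  const hasFailures = Object.keys(error).length > 0;
  return {
    status: hasFailures ? 'error' : 'ok',
    info,
    error,
    // details combines every indicator, regardless of its status
    details: { ...info, ...error },
  };
}

// Maps the overall status to the HTTP status code of the response
function toHttpStatus(summary: HealthSummary): number {
  return summary.status === 'ok' ? 200 : 503;
}

const exampleSummary = summarize([
  { database: { status: 'up' } },
  { storage: { status: 'down', message: 'hypothetical failure message' } },
]);
// exampleSummary.info contains only database,
// exampleSummary.error contains only storage,
// and toHttpStatus(exampleSummary) gives 503
const exampleHttpStatus = toHttpStatus(exampleSummary);
```

A single unhealthy indicator is enough to turn the whole response into a 503, which is what the load balancer reacts to.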

Terminus offers more health indicators than just those related to the database:

HttpHealthIndicator - allows us to make an HTTP request and verify if it's working as expected
MemoryHealthIndicator - verifies if the process does not exceed a specific memory limit
DiskHealthIndicator - checks how much disk storage is in use
MicroserviceHealthIndicator - ensures a given microservice is up. To learn more about microservices, check out API with NestJS #18. Exploring the idea of microservices
GRPCHealthIndicator - verifies if a service is working as expected using the standard health check specification of gRPC

Custom health indicators

The above list contains health indicators for various ORMs. However, in some parts of this series, we’ve worked with raw SQL without any ORM.

Fortunately, we can set up a custom health indicator. To do that, we need to extend the HealthIndicator class.

databaseHealthIndicator.ts

import { Injectable } from '@nestjs/common';
import {
  HealthIndicator,
  HealthIndicatorResult,
  HealthCheckError,
} from '@nestjs/terminus';
import DatabaseService from '../database/database.service';

@Injectable()
class DatabaseHealthIndicator extends HealthIndicator {
  constructor(private readonly databaseService: DatabaseService) {
    super();
  }

  async isHealthy(): Promise<HealthIndicatorResult> {
    try {
      // A trivial query that succeeds only if the database responds
      await this.databaseService.runQuery('SELECT 1');
      return this.getStatus('database', true);
    } catch (error) {
      throw new HealthCheckError(
        'DatabaseHealthIndicator failed',
        this.getStatus('database', false),
      );
    }
  }
}

export default DatabaseHealthIndicator;

The this.getStatus method generates the health indicator result that ends up in the info, error, and details objects.
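
To get a feel for the shape of that result, we can look at a small stand-in function. It is a sketch that assumes the result is an object keyed by the indicator name with an up or down status. The buildStatus name and the StatusResult type are hypothetical and are not a part of Terminus.

```typescript
// A single indicator entry: the status plus any optional extra data
interface StatusEntry {
  status: 'up' | 'down';
  [extra: string]: unknown;
}

type StatusResult = Record<string, StatusEntry>;

// A stand-in for this.getStatus(key, isHealthy, data)
function buildStatus(
  key: string,
  isHealthy: boolean,
  data: Record<string, unknown> = {},
): StatusResult {
  // In this sketch, the status is written last, so extra data cannot override it
  return { [key]: { ...data, status: isHealthy ? 'up' : 'down' } };
}

// { database: { status: 'up' } }
const healthyResult = buildStatus('database', true);

// { database: { reason: 'hypothetical detail', status: 'down' } }
const unhealthyResult = buildStatus('database', false, {
  reason: 'hypothetical detail',
});
```

The key we pass, such as database, is the name under which the indicator shows up in the response of our endpoint.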

To include it, we must add the DatabaseHealthIndicator to the providers of our module and call our new isHealthy method in the HealthController.

health.controller.ts

import { Controller, Get } from '@nestjs/common';
import { HealthCheckService, HealthCheck } from '@nestjs/terminus';
import DatabaseHealthIndicator from './databaseHealthIndicator';

@Controller('health')
class HealthController {
  constructor(
    private healthCheckService: HealthCheckService,
    private databaseHealthIndicator: DatabaseHealthIndicator,
  ) {}

  @Get()
  @HealthCheck()
  check() {
    return this.healthCheckService.check([
      // Our custom indicator is used the same way as the built-in ones
      () => this.databaseHealthIndicator.isHealthy(),
    ]);
  }
}

export default HealthController;

Setting the health check with AWS

We need to point the load balancer to our /health endpoint. We do that when setting up the load balancer while starting tasks in our Elastic Container Service cluster.

When creating the target group for our cluster, we specify /health as the health check path. Thanks to that, the load balancer periodically sends requests to the /health endpoint to determine if a particular instance of our NestJS application is working as expected.
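
A target group usually does not change its mind after a single response. Instead, it waits for a number of consecutive successes or failures. The sketch below models that idea under simplifying assumptions: the createTargetTracker function, the threshold values, and treating only the 200 status code as a success are illustrative and do not describe the exact AWS behavior.

```typescript
type TargetState = 'healthy' | 'unhealthy';

// Hypothetical settings that mirror the idea of health check thresholds
interface ThresholdSettings {
  healthyThreshold: number; // consecutive successes needed to become healthy
  unhealthyThreshold: number; // consecutive failures needed to become unhealthy
}

function createTargetTracker(
  settings: ThresholdSettings,
  initialState: TargetState = 'unhealthy',
) {
  let state: TargetState = initialState;
  // Counts consecutive results that contradict the current state
  let streak = 0;

  return {
    // Records the status code of one health check and returns the new state
    record(statusCode: number): TargetState {
      const passed = statusCode === 200;
      const agreesWithState = (state === 'healthy') === passed;

      if (agreesWithState) {
        streak = 0;
        return state;
      }

      streak += 1;
      const needed = passed
        ? settings.healthyThreshold
        : settings.unhealthyThreshold;
      if (streak >= needed) {
        state = passed ? 'healthy' : 'unhealthy';
        streak = 0;
      }
      return state;
    },
  };
}

const tracker = createTargetTracker({
  healthyThreshold: 2,
  unhealthyThreshold: 3,
});
const afterFirstSuccess = tracker.record(200); // still 'unhealthy'
const afterSecondSuccess = tracker.record(200); // becomes 'healthy'
```

With such thresholds, a single failed request to /health does not immediately take a target out of rotation.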

If our task takes a long time to start, the load balancer might mark it as unhealthy before it is ready, and the task might get shut down. We can prevent that by setting up the health check grace period when configuring our service. This gives our tasks additional time to reach a healthy state.
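
The effect of the grace period can be sketched as a simple rule: failed health checks are ignored until enough time has passed since the task started. The shouldReplaceTask function and the TaskHealthInput shape below are hypothetical and only model that rule. They are not a part of any AWS SDK.

```typescript
// Hypothetical input describing a single health check of a task
interface TaskHealthInput {
  startedAtMs: number; // when the task started, in milliseconds
  nowMs: number; // the current time, in milliseconds
  gracePeriodSeconds: number;
  healthCheckPassed: boolean;
}

// Decides whether a task should be replaced because of a failed health check
function shouldReplaceTask(input: TaskHealthInput): boolean {
  const elapsedSeconds = (input.nowMs - input.startedAtMs) / 1000;
  const inGracePeriod = elapsedSeconds < input.gracePeriodSeconds;

  // A failing check counts only after the grace period is over
  return !input.healthCheckPassed && !inGracePeriod;
}

// 30 seconds after the start, with a 60-second grace period: false
const replacedTooEarly = shouldReplaceTask({
  startedAtMs: 0,
  nowMs: 30000,
  gracePeriodSeconds: 60,
  healthCheckPassed: false,
});

// 90 seconds after the start, the same failure counts: true
const replacedAfterGracePeriod = shouldReplaceTask({
  startedAtMs: 0,
  nowMs: 90000,
  gracePeriodSeconds: 60,
  healthCheckPassed: false,
});
```

When choosing the value, it makes sense to base it on how long our application usually needs before the /health endpoint starts responding successfully.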

Verifying if our tasks are running

In the previous part of this series, we learned how to manage logs with Amazon CloudWatch. We also created the LoggerInterceptor that logs every endpoint requested in our API.

Let’s look at the logs to verify if the load balancer requests our /health endpoint.

It’s also worth looking at the “Health and metrics” tab on the page dedicated to the service running our tasks.

If no target is healthy, the load balancer cannot handle the incoming traffic. So if our application does not work, it’s one of the first things to check when debugging.

It’s also worth looking at the “Deployment and events” tab. If something goes wrong with our deployment, the issue will often be visible in the “events” table.

Summary

In this article, we learned what health checks are and how to design them. We also used them together with Amazon Elastic Container Service to verify if the instances of our NestJS application were running correctly. While doing so, we’ve learned more about debugging our NestJS app running with AWS and why we need to care about the health of our tasks running in the cluster.

There is still more to learn about running NestJS with AWS, so stay tuned!
