What HTTP status code should I use for a health check?

I am implementing an endpoint, /_status/, which performs some data health checks against our database.

For example, we collect measurements, and the state should go "bad" if the most recent measurement is more than an hour old.

I would like to point Pingdom at this URL so that we can use their alerting infrastructure to let us know when something is wrong.

In the "good" state, I will serve an HTML page with an HTTP status of 200 OK. But what would be the appropriate HTTP status code for the "bad" state? Or would it be more correct not to convey this information through a status code at all, and instead put it in the HTML content?

Thanks!

+6
3 answers

We just had a similar discussion in our group. For our purposes, we decided that HTTP response codes should report whether the server succeeded or failed in fulfilling the request. For a GET, that means whether it can respond with the requested resource. In this case, the requested resource is a health report, so as long as you return it successfully, the response should be 200.

We return JSON for health checks, with a top-level "isHealthy" field set to true or false. Our load balancer and other monitors parse the JSON and use this field to determine whether the system is healthy.
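As a rough illustration of that idea, here is a minimal Python sketch. Only the "isHealthy" field comes from the description above; the function names, the one-hour threshold (borrowed from the question), and the monitor logic are assumptions made for the example, not our actual code.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumed threshold, taken from the question: data older than one hour is "bad".
MAX_AGE = timedelta(hours=1)


def build_health_report(last_measurement_at, now=None):
    """Return (status_code, body). The status is always 200 because the
    health report itself was produced successfully."""
    now = now or datetime.now(timezone.utc)
    is_healthy = (now - last_measurement_at) <= MAX_AGE
    return 200, json.dumps({"isHealthy": is_healthy})


def monitor_says_healthy(status_code, body):
    """Hypothetical monitor-side check: require a 200 response and
    a top-level isHealthy field that is exactly true."""
    if status_code != 200:
        return False
    try:
        return json.loads(body).get("isHealthy") is True
    except (ValueError, AttributeError):
        # Body was not valid JSON, or not a JSON object.
        return False
```

The monitor treats anything it cannot parse as unhealthy, so a broken endpoint does not silently pass the check.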

If you do not want to parse JSON in your monitors, you can try adding a custom response header that indicates the binary state of the system, for example System-Health: true or System-Health: false. You may have some luck finding monitors that can check this.
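A small Python sketch of the header variant follows. The System-Health header name is the example from above; both helper functions are hypothetical, and whether a given monitoring product can inspect custom headers is something you would need to check.

```python
def health_headers(is_healthy):
    """Server side: extra response headers carrying the binary health state."""
    return [("System-Health", "true" if is_healthy else "false")]


def header_says_healthy(headers):
    """Monitor side: look for System-Health in a list of (name, value) pairs.
    Header names are compared case-insensitively."""
    for name, value in headers:
        if name.lower() == "system-health":
            return value.strip().lower() == "true"
    # Header missing: treat the system as unhealthy.
    return False
```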

If you really want to use a response code, I would recommend an additional endpoint called "health" that returns "204 No Content" when the system is healthy and "404 Not Found" when it is not. In this case, the resource identified by the URL is, symbolically, the health of your system: if the system is healthy, you can return a successful response; if it is unhealthy, its health cannot be found, hence the 404.
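The 204/404 mapping could look roughly like this as a Python WSGI app. This is only a sketch: make_health_app and the check callable are hypothetical names, and routing the app to a "health" URL is left to whatever server or framework you use.

```python
from http import HTTPStatus


def make_health_app(check):
    """Build a WSGI app for the "health" endpoint.

    check is a hypothetical zero-argument callable that returns True
    when the system is healthy.
    """
    def app(environ, start_response):
        # Healthy: the "health" resource exists but has no body (204).
        # Unhealthy: the "health" resource cannot be found (404).
        status = HTTPStatus.NO_CONTENT if check() else HTTPStatus.NOT_FOUND
        start_response(
            f"{status.value} {status.phrase}",
            [("Content-Length", "0")],
        )
        return [b""]

    return app
```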

+5

Well... this is an old question, but I ended up here, so I thought I would add my two cents. It seems pretty obvious that 2xx should be returned if everything is OK.

If the health check fails, I think it should return a 5xx result (4xx indicates that the client is at fault for the request, while 2xx and 3xx both indicate success to some extent).

I think 5xx is correct because this is a special request that reports on the state of the entire service. In addition, most load balancers offer liveness checks based on response codes, and not all of them offer a way to parse more complex payloads (except perhaps regular-expression matching, which can make the check fragile).

I agree with @Julien that 500 (in particular) does not seem appropriate, and we decided to use 503 Service Unavailable.

503 seems to be suitable for several reasons:

  • It is a result code in the 5xx family, which indicates that something is wrong on the server side.
  • It is temporary in nature, indicating that the service can recover.
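
To make the 200/503 approach concrete, here is a minimal Python WSGI sketch built around the staleness rule from the question. The names make_status_app and get_last_measurement_time, the response bodies, and the one-hour constant are all assumptions for illustration; how the last measurement time is actually read from the database is not shown.

```python
from datetime import datetime, timedelta, timezone
from http import HTTPStatus

# Assumed threshold from the question: measurements older than an hour are "bad".
MAX_AGE = timedelta(hours=1)


def make_status_app(get_last_measurement_time,
                    now=lambda: datetime.now(timezone.utc)):
    """Build a WSGI app for the status endpoint.

    get_last_measurement_time is a hypothetical zero-argument callable
    returning a timezone-aware datetime (for example, from a database query).
    """
    def app(environ, start_response):
        age = now() - get_last_measurement_time()
        if age <= MAX_AGE:
            status, body = HTTPStatus.OK, b"good\n"
        else:
            # 503: a server-side problem that is expected to be temporary.
            status = HTTPStatus.SERVICE_UNAVAILABLE
            body = b"bad: measurements are stale\n"
        start_response(
            f"{status.value} {status.phrase}",
            [("Content-Type", "text/plain"),
             ("Content-Length", str(len(body)))],
        )
        return [body]

    return app
```

For a quick local try, the app can be served with the standard library, for example wsgiref.simple_server.make_server("", 8000, app).serve_forever(), and a monitor or load balancer then only needs to look at the status code.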
+3

If your data is "bad" because of a service failure (even if it is just a job that did not succeed), then HTTP 500 seems like a valid response. It means that something is broken.

It is not very specific; it amounts to shrugging and saying:

The 500 (Internal Server Error) status code indicates that the server encountered an unexpected condition that prevented it from fulfilling the request.

IETF RFC 7231, section 6.6.1

0

Source: https://habr.com/ru/post/974040/

