Testing HTTP clients


tl;dr

  • Contract testing is nice, but it doesn’t remove the responsibility from consumers to still test failure cases.

  • You should write software with the assumption that things will fail, and have a plan for when they do, because failures can have weird flow-on consequences.

I am writing an HTTP client/wrapper, what options do I have for testing?

A general guiding principle when it comes to any testing is “how early and how easily can I validate the correctness of my system?”

Like a lot of conversations that involve testing, this comes down to understanding how you are designing your architecture (both the system as a whole and the service itself).

With that in mind, there are a few questions I would start by asking:

  • Is the service you are consuming built internally at your company, or is it an externally maintained service?
  • How do you consume this API: is it via a client library of some kind, or are you writing your own client?
  • What are the consequences if a transaction fails somewhere?

A general call out: I personally really dislike mocking an HTTP client of any kind. I prefer always using an actual HTTP client that acts against a stub server hosted using a tool such as WireMock.NET, LocalStack or Pact if it’s relevant (but more on that later).
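To make that concrete, here’s a minimal sketch of that approach with WireMock.NET and xUnit. The /orders/1 path, the response shape and the test name are all made up for illustration; the point is that a real HttpClient talks to a real (local) HTTP server:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using WireMock.Server;
using WireMock.RequestBuilders;
using WireMock.ResponseBuilders;
using Xunit;

public class OrdersClientTests
{
    [Fact]
    public async Task GetOrder_HandlesAHappyPathResponse()
    {
        // Spin up a real HTTP server on a random local port.
        using var server = WireMockServer.Start();

        server
            .Given(Request.Create().WithPath("/orders/1").UsingGet())
            .RespondWith(Response.Create()
                .WithStatusCode(200)
                .WithHeader("Content-Type", "application/json")
                .WithBodyAsJson(new { id = 1, status = "shipped" }));

        // A plain HttpClient pointed at the stub, so the whole HTTP stack
        // (headers, content types, deserialisation) is actually exercised.
        using var http = new HttpClient { BaseAddress = new Uri(server.Url!) };
        var response = await http.GetAsync("/orders/1");

        Assert.True(response.IsSuccessStatusCode);
    }
}
```

Nothing is mocked here: if your wrapper mishandles headers, content types or response parsing, this style of test has a chance of catching it.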

Internally maintained services

When it comes to testing against internally maintained systems, one of my preferred tools is Pact for consumer driven contract testing. Part of the benefit of this is not actually testing your code, but helping solve the sociotechnical problem of “how do we ensure breaking changes can’t occur and how do we ensure teams talk”. The important thing when trying to utilise this, though, is to ensure you are following Postel’s Law: the more fields you couple to that you don’t actually rely on, the less valuable this test becomes.
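For flavour, here is a rough consumer-side sketch using PactNet (the names orders-consumer and orders-api and the payload are invented). Note the body only pins the fields this consumer actually reads, per Postel’s Law:

```csharp
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using PactNet;
using Xunit;

public class OrdersContractTests
{
    [Fact]
    public async Task GetOrder_MatchesTheAgreedContract()
    {
        var pact = Pact.V3("orders-consumer", "orders-api", new PactConfig())
            .WithHttpInteractions();

        pact
            .UponReceiving("a request for order 1")
            .WithRequest(HttpMethod.Get, "/orders/1")
            .WillRespond()
            .WithStatus(HttpStatusCode.OK)
            // Only the fields we actually rely on; coupling to more makes
            // the contract stricter (and less valuable) than it needs to be.
            .WithJsonBody(new { id = 1, status = "shipped" });

        await pact.VerifyAsync(async ctx =>
        {
            using var http = new HttpClient { BaseAddress = ctx.MockServerUri };
            var response = await http.GetAsync("/orders/1");
            Assert.True(response.IsSuccessStatusCode);
        });
    }
}
```

The generated pact file is what gets verified against the real provider, which is where the “teams have to talk” part comes in.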

The issue with Pact is that it doesn’t actually help you with failure or unexpected cases.

  • What happens if you’re coupling to an enum that suddenly returns a new value?
  • What if the service times out?
  • What if an authorization policy changes and 403s suddenly start getting thrown?
  • What if the service throws a 500?
  • What if the service throws a 429?
  • What if the service throws a 418?
  • What if a reverse proxy fails between you and the provider and you suddenly start getting HTML back from an nginx generic error page?

These are all scenarios that don’t make sense to test as part of a contract, and some of them should still be considered (a 418 is always a risk).

When it comes to considering these kinds of issues I still prefer spinning up a temporary HTTP server and controlling the responses returned by that server. I still consider this a “unit” test because it’s how I validate the unit of work of my client, but the strategy is also invaluable as we move into locally runnable integration tests that check the side effects of downstream failures on things like your stored database state.
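Two of the scenarios from the list above, sketched against WireMock.NET (the paths and assertions are placeholders; what your wrapper should do in each case is the interesting part):

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using WireMock.Server;
using WireMock.RequestBuilders;
using WireMock.ResponseBuilders;
using Xunit;

public class OrdersClientFailureTests
{
    [Fact]
    public async Task GetOrder_SurvivesAnNginxStyleHtmlErrorPage()
    {
        using var server = WireMockServer.Start();

        // The reverse proxy scenario: HTML where your client expects JSON.
        server
            .Given(Request.Create().WithPath("/orders/1").UsingGet())
            .RespondWith(Response.Create()
                .WithStatusCode(502)
                .WithHeader("Content-Type", "text/html")
                .WithBody("<html><body><h1>502 Bad Gateway</h1></body></html>"));

        using var http = new HttpClient { BaseAddress = new Uri(server.Url!) };
        var response = await http.GetAsync("/orders/1");

        // Whatever your wrapper does here (typed error, retry, dead-letter)
        // is exactly the behaviour worth pinning down with a test.
        Assert.Equal(HttpStatusCode.BadGateway, response.StatusCode);
    }

    [Fact]
    public async Task GetOrder_FailsFastWhenTheServiceHangs()
    {
        using var server = WireMockServer.Start();

        // The timeout scenario: delay the response past the client timeout.
        server
            .Given(Request.Create().WithPath("/orders/1").UsingGet())
            .RespondWith(Response.Create()
                .WithStatusCode(200)
                .WithDelay(TimeSpan.FromSeconds(10)));

        using var http = new HttpClient
        {
            BaseAddress = new Uri(server.Url!),
            Timeout = TimeSpan.FromMilliseconds(500),
        };

        await Assert.ThrowsAsync<TaskCanceledException>(
            () => http.GetAsync("/orders/1"));
    }
}
```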

You’re not alone if you don’t account for all failure scenarios. Google has a great story in their SRE book about the Global Chubby Planned Outage, in which they deliberately took down a core service because they knew people were consuming it with the assumption it would never go down, but, when you take it down, uhh…

Its high reliability provided a false sense of security because the services could not function appropriately when Chubby was unavailable, however rarely that occurred.

Externally maintained services

External services such as cloud services or third-party APIs are not a good fit for contract testing because, unsurprisingly enough, there’s no internal contract.

There’s always a double-edged sword here: testing locally gives you great and quick feedback cycles (and also what I call the train factor), but actually deployed infrastructure gives actual certainty.

Local testing and local development should always be the primary priority for general productivity, and this generally, unfortunately, means some concessions need to be made. You might consider using something like LocalStack for local development against an emulated AWS environment, and couple that with TestContainers to get repeatable tests for all engineers and in CI. Likewise you can utilise a stub HTTP client locally and a live client in AWS, and use unit tests that hit an HTTP server running in something such as WireMock to check the validity of both your production and development clients against known contracts.
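A rough sketch of that combination, assuming the Testcontainers.LocalStack module for .NET and the AWS SDK for S3 (the bucket and key names are made up); the same wrapper code can then point at real AWS in a deployed environment:

```csharp
using System.Threading.Tasks;
using Amazon.Runtime;
using Amazon.S3;
using Amazon.S3.Model;
using Testcontainers.LocalStack;
using Xunit;

public class S3WrapperTests : IAsyncLifetime
{
    // The container lifecycle lives inside the test class, so the test is
    // repeatable for every engineer and in CI without a shared environment.
    private readonly LocalStackContainer _localStack =
        new LocalStackBuilder().Build();

    public Task InitializeAsync() => _localStack.StartAsync();
    public Task DisposeAsync() => _localStack.DisposeAsync().AsTask();

    [Fact]
    public async Task CanRoundTripAnObject()
    {
        var s3 = new AmazonS3Client(
            new BasicAWSCredentials("test", "test"),
            new AmazonS3Config
            {
                // Point the SDK at the emulator instead of real AWS.
                ServiceURL = _localStack.GetConnectionString(),
                ForcePathStyle = true,
            });

        await s3.PutBucketAsync("test-bucket");
        await s3.PutObjectAsync(new PutObjectRequest
        {
            BucketName = "test-bucket",
            Key = "hello.txt",
            ContentBody = "hello",
        });

        var stored = await s3.GetObjectAsync("test-bucket", "hello.txt");
        Assert.Equal("hello.txt", stored.Key);
    }
}
```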

Here comes the actual fun part though: how do you account for things your emulators can’t simulate (e.g. API errors)? When it comes to integration tests I love libraries like terratest, which allows your Infrastructure as Code to be spun up for the duration of a test and then instantly torn down afterwards.

This allows you to ensure that all API requests make sense against live infra, with up-to-date API limitations/restrictions in mind rather than just the emulated ones, as well as testing things like “do my security groups allow actual communication”.

This approach doesn’t always play nice with “fast local tests” because you are also paying the slow upfront delay of the infrastructure being created.

With this in mind, I do love coupling this with spinning up local containers running tools such as LocalStack, as long as you’re ok with “this isn’t perfect, but it’s ok I guess”.

Weird failure cases to think about

There is one anecdotal story I heard of someone writing a Lambda function, listening to an SQS queue, that created new versions of templates within AWS’s Service Catalogue, which allows end users to create resources from known safe templates.

They found a scenario where the general flow was Event > Service Catalogue > Database, but they started to face issues in the database transaction, and those failures did not lead to a reversion of the Service Catalogue update.

When this failed, the use of Lambda led to the event retrying multiple times. The consequence? A very, very, very quick jump to version 40-something of the template.

Ultimately this is a trivial example, but it still shows the importance of testing a full system for failure cases and understanding the atomic state of any transaction and the relevant touch points.

If you do a POST request to create a resource, and the transaction fails after the POST succeeds, should you also fire a follow-up DELETE request?
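One hedged sketch of that idea, a compensating DELETE when the local step fails after the remote create succeeded (all the names here are invented for illustration):

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

public class OrderCreator
{
    private readonly HttpClient _http;
    private readonly Func<Uri, Task> _saveToDatabase;

    public OrderCreator(HttpClient http, Func<Uri, Task> saveToDatabase)
    {
        _http = http;
        _saveToDatabase = saveToDatabase;
    }

    public async Task CreateOrderAsync(object order)
    {
        // Step 1: the remote side effect.
        var response = await _http.PostAsJsonAsync("/orders", order);
        response.EnsureSuccessStatusCode();
        var location = response.Headers.Location
            ?? throw new InvalidOperationException("No Location header returned.");

        try
        {
            // Step 2: the local side effect.
            await _saveToDatabase(location);
        }
        catch
        {
            // Compensate: undo the remote create so the two sides don't drift.
            // The DELETE itself can also fail, which is exactly the kind of
            // partial-failure path a stub server lets you exercise.
            await _http.DeleteAsync(location);
            throw;
        }
    }
}
```

Whether a compensating request is the right answer (versus retries plus idempotency keys) depends on the API, but either way it’s the whole transaction you need to test, not just the POST.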

This is a lot of why you can’t just rely on contract testing or unit testing when it comes to failure scenarios.

The HTTP request isn’t the only thing influenced by the outcome; it’s the transaction as a whole.

Sure, you could stub out the HTTP client, but that just feels like a weird level to mock at, given the tooling that exists to do the job better and more easily.

I have an off the shelf SDK, what should I do?

Hope it’s unit tested against their contract I guess?

I still like using LocalStack because there have been a few times where the AWS API has a “you need either A or B in your request” constraint; the compiler doesn’t catch it, but LocalStack does.
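As one concrete illustration of that class of constraint (an assumption on my part, not necessarily the exact case above): DynamoDB’s CreateTable needs ProvisionedThroughput unless BillingMode is PAY_PER_REQUEST. This compiles happily, and only an actual endpoint such as LocalStack will reject it:

```csharp
using System.Collections.Generic;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

var client = new AmazonDynamoDBClient();

var request = new CreateTableRequest
{
    TableName = "orders",
    AttributeDefinitions = new List<AttributeDefinition>
    {
        new("id", ScalarAttributeType.S),
    },
    KeySchema = new List<KeySchemaElement>
    {
        new("id", KeyType.HASH),
    },
    // Neither BillingMode nor ProvisionedThroughput is set, so the request
    // should fail with a ValidationException at runtime; the compiler and a
    // hand-rolled HTTP mock would both wave it through.
};

await client.CreateTableAsync(request);
```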

It’s lightweight enough that I’m happy with it for common workflows, but I also know that if I ever need LocalStack Pro, either I should get scared about the architecture of whatever I’m doing, or the actual value-add of emulation there is so niche that I’m ok considering it an edge case I’ll check in a post-deploy smoke test.