Little server patterns: Independent tests

In this blog series I’ve written about replacing global configurations with dependency parameters and validating and failing quickly at the boundaries of your service. This post is about why and how to write independent tests for your servers.
Problem: Dependent tests are flaky and inextensible
I worked with a startup that had a web service with a single large set of fixture data used across almost all tests. The idea was to model how a large real-world data set might look, so that tests would mimic the real world. That is a noble goal in and of itself. The problem is that this one set of fixtures was shared across all tests, so every test effectively depended on every other test.
Every test addition or change modified this shared set of fixtures, but each test had different needs and assumptions. Every change risked breaking every other test, or worse: keeping other tests green while quietly removing the very condition they were supposed to catch. Tests were hard to add, so there were far fewer tests than needed, and the result was a flaky, poorly-tested mess.
This is an extreme example, but similar problems crop up in almost every project I’ve been a part of. The underlying issue is tests that depend on other tests. Problems come up in many ways:
- Unrelated tests failing at the same time. A lot of wasted effort goes into debugging and fixing tests that fail simply because they depend on changes made in another test.
- Time-consuming test debugging. When tests affect other tests, you never know why a test fails.
- Poor test coverage. When adding a test means figuring out how it interacts with many other tests, you tend to avoid adding tests. It’s also easy to accidentally invalidate other tests.
- Slow, non-parallelizable tests. Running tests in parallel can make them fast, but dependent tests can’t be easily run in parallel.
This might sound familiar if you read my previous post about dependency parameters. That’s because global configurations are a common way in which tests become dependent on each other. But for tests specifically, here are some patterns I often see in the wild that violate test independence:
- Shared servers. Test suites often spin up a single server before tests start, run a battery of tests against the server, and then spin down the server at the end. This makes it hard to configure individual tests for individual needs, and changes to shared server state like caches and connections can cause tests to fail or pass unexpectedly.
- Shared databases. Most test suites I see spin up a single database and then run all tests against it. If care is not taken, data left behind by earlier tests can affect later tests.
- Shared data. Even if tests run against completely independent databases and data is created independently, if two tests depend on the same data fixture setup then there is a direct link between the two tests.
- Global function mocking, HTTP capture, and time capture. Functions might be mocked out by mutating them globally, or function calls might be captured globally and tests run against those captures. The same goes for capturing all HTTP calls, or all calls to get the current time.
Ideal: The independent test
In an ideal world, any test can be run in parallel with any other test without the tests affecting each other. This means:
- Independent servers: Spin up a separate server on a unique port for each test. This avoids shared server configurations, caches, and internal states.
- Independent data: Each test creates its own data without shared data fixtures.
- Independent databases: Spin up a separate database for each test, accepting connections on a unique port. Each test inserts its own data into an empty world.
- Independent mocking: Instead of using global function, module, HTTP, or time mocking, each test configures these dependencies for the server independently.
Independent tests are easier to understand, update, add, and optimize. The data and dependencies that a test needs can all be seen in the test setup. You don’t need to think about other tests when you change one test.
In an ideal case, all tests can be run in parallel and each test does the following:
- Spins up a new database.
- Spins up a new server talking to the database.
- Inserts its own data into the database.
- Runs the test.
- Tears down the server and database.
Of course, you may have tests that don’t need the server, a database, or other parts of this setup. But if you’re running a server, you probably have tests that do. I favor these kinds of API-level integration tests because they are “end-to-end” as far as your server is concerned and so test real functionality, while still being scoped tightly enough to cover many nuanced cases.
You don’t have to actually run your tests in parallel, but it’s a good way to check if your tests are independent.
Real world independent tests
“OK, but this is a fantasy land dream,” you say. “I have hundreds of tests. I can’t spin up and tear down servers and databases for every single test. I can’t painstakingly hand-craft artisan data for each and every test. This will be slow and painful.”
Maybe. But maybe not. Computers are fast. Spinning up a fresh database in Docker and a fresh server can be extremely fast. Wiping database tables is trivial. Inserting data often doesn’t have to take long. And in many cases, running tests in parallel can keep it fast enough.
Tests should also be fast, easy to create, and easy to maintain. I find that if you shoot for the ideal of independent tests, you often get close while still meeting those goals. Keep the ideal in mind as you back off and make compromises to stay fast and easy, but only back off as much as you need to.
Practicalities #1: Independent servers
Ideal case: spin up a new server on a unique port for each test.
You can write test helpers for spinning up new servers and getting connections to those servers. I’ve had projects with hundreds of tests where this is plenty fast. Some care needs to be taken to make sure servers get closed properly, but this can usually be abstracted into test setup helpers. You may also need a bit of shared global state to generate unique ports, which bends strict test independence but preserves the broader principle.
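Here is a minimal sketch of such a helper using Jest hooks. The names startServer, stopServer, Server, and ServerConfig are assumptions standing in for whatever your own server module exposes; the port counter is the bit of shared state mentioned above, offset per Jest worker so parallel test files don’t collide.
// testServerHelpers.ts (hypothetical sketch)
// startServer, stopServer, Server, and ServerConfig stand in for your
// own server module's API.
import { startServer, stopServer, Server, ServerConfig } from "./server"
// Shared port counter, offset per Jest worker so test files running in
// parallel in separate processes don't pick the same ports.
let nextPort = 4000 + Number(process.env.JEST_WORKER_ID ?? "0") * 200
export const uniquePort = (): number => nextPort++
export interface ServerContext {
  server?: Server
  port?: number
}
// Call inside a describe block: every test gets its own server on its
// own port, and the server is torn down after each test.
export const withTestServer = (cfg: Omit<ServerConfig, "port">): ServerContext => {
  const ctx: ServerContext = {}
  beforeEach(async () => {
    const port = uniquePort()
    ctx.port = port
    ctx.server = await startServer({ ...cfg, port })
  })
  afterEach(async () => {
    if (ctx.server) {
      await stopServer(ctx.server)
      ctx.server = undefined
    }
  })
  return ctx
}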
Backing off, in order:
- Use a single shared server, but clear server state between tests.
- Use a single shared server, but use fresh connections for each test.
Practicalities #2: Independent databases
Ideal case: every test starts with a fresh connection to a new database. Spin up a fresh database on a unique port for each test.
This could be done with databases in Docker. But if spinning up databases is slow or difficult, you may need to back off from the ideal. I’ll confess that I usually don’t achieve the ideal setup with databases and use one of the back-off options instead. But it might be perfectly feasible, depending on your setup.
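If you do want to try the ideal, here is a rough sketch of what per-test database spin-up might look like by shelling out to the Docker CLI. The image, credentials, port range, and readiness handling are all placeholder assumptions to adapt to your setup.
// testDatabaseHelpers.ts (hypothetical sketch)
// Spins up a throwaway Postgres container for a test and tears it down
// afterwards. Assumes Docker is installed and postgres:16 is available.
import { execSync } from "child_process"
// Offset per Jest worker to avoid port collisions between parallel test files.
let nextDbPort = 5500 + Number(process.env.JEST_WORKER_ID ?? "0") * 100
export interface TestDatabase {
  containerId: string
  connectionString: string
}
export const startTestDatabase = (): TestDatabase => {
  const port = nextDbPort++
  const containerId = execSync(
    `docker run -d --rm -e POSTGRES_PASSWORD=test -p ${port}:5432 postgres:16`
  ).toString().trim()
  // In practice you would also wait here until the database accepts
  // connections, and run your migrations against it.
  return {
    containerId,
    connectionString: `postgres://postgres:test@localhost:${port}/postgres`
  }
}
export const stopTestDatabase = (db: TestDatabase): void => {
  execSync(`docker rm -f ${db.containerId}`)
}
Libraries like Testcontainers wrap this kind of container management up more robustly, but the idea is the same.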
Backing off, in order:
- Spin up only one database, but run each test against a separate schema (see the sketch below). This means your server configuration needs to specify the schema it is running against, and every database call needs to use the specified schema.
- Spin up one database with one schema, but wipe the database between each test.
- Spin up one database with one schema and wipe the database between suites of tests.
- If you do nothing else, at least keep the test database separate from the development/staging/production databases.
Remember that truncating database tables is usually quite fast. Inserting data can be slower, but this depends on the project and it hasn’t usually been an issue for me.
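Here is a rough sketch of the first back-off option above: one shared Postgres database, but a unique schema per test, using node-postgres. The connection string, naming scheme, and migration step are assumptions.
// testSchemaHelpers.ts (hypothetical sketch)
// One shared Postgres database, but each test runs against its own schema.
import { Pool } from "pg"
let schemaCounter = 0
export interface SchemaContext {
  schema?: string
}
export const withTestSchema = (): SchemaContext => {
  const pool = new Pool({ connectionString: process.env.TEST_DATABASE_URL })
  const ctx: SchemaContext = {}
  beforeEach(async () => {
    ctx.schema = `test_${process.env.JEST_WORKER_ID ?? "0"}_${schemaCounter++}`
    await pool.query(`CREATE SCHEMA "${ctx.schema}"`)
    // Run your migrations against the new schema here, and configure the
    // server under test to use ctx.schema (for example via search_path).
  })
  afterEach(async () => {
    await pool.query(`DROP SCHEMA IF EXISTS "${ctx.schema}" CASCADE`)
  })
  afterAll(async () => {
    await pool.end()
  })
  return ctx
}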
Practicalities #3: Independent data
Ideal case: every test inserts all of its own data at the start of the test.
Creating data for tests shouldn’t be hard. If it’s hard, make it easy. Write helpers that create data with sensible defaults. Tests use these helpers and modify the defaults for their particular cases. You can set this up in any language, but in JS/TS it might be as simple as writing functions like this:
// fixtures/product.ts
export const defaultProduct = (id: string): Product => {
  return {
    id,
    name: "Test product",
    description: "Test description",
    price: 50,
    tags: []
  }
}
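The usage below also assumes a defaultTag helper built in the same style; a possible sketch, with the Tag shape made up for illustration:
// fixtures/tag.ts (hypothetical sketch)
export const defaultTag = (name: string): Tag => {
  return {
    name,
    description: "Test tag"
  }
}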
You then use the functions like this:
const testProduct = {
  ...defaultProduct("mytestid"),
  name: "Thneed",
  tags: [defaultTag("fabric"), defaultTag("branded")]
}
Backing off:
- Test suites share some inserted data, but data is wiped between test suites.
Practicalities #4: Independent mocking
Ideal case: every test sets up its own mocking of functions, modules, HTTP requests, and time as needed. No mocking is global, so the mocking from one test does not affect any other.
This one is pretty easy to set up if you follow the dependency parameter pattern I outlined in a previous post.
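As a quick reminder of what that looks like in practice, here is a hypothetical sketch: the server takes its external dependencies (here an auth client and a clock) as parameters, and each test passes in its own fakes. The AuthClient and Clock shapes and the config fields are assumptions for illustration.
// A hypothetical sketch of per-test mocking via dependency parameters.
interface AuthClient {
  verifyToken: (token: string) => Promise<{ userId: string } | null>
}
interface Clock {
  now: () => Date
}
// Fakes defined by and for this one test; nothing global is mutated,
// so other tests are unaffected and everything can run in parallel.
const fakeAuth: AuthClient = {
  verifyToken: async () => ({ userId: "test-user" })
}
const fixedClock: Clock = {
  now: () => new Date("2024-01-01T00:00:00Z")
}
it("sees only the fakes it was given", async () => {
  const server = await startServer({
    ...mkDefaultTestConfig().server,
    auth: fakeAuth,
    clock: fixedClock
  })
  // ...make requests against the server; it uses only the fakes above.
  await stopServer(server)
})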
Backing off:
- Tests use global mocking, but mocking is reset between tests. This can be quite convenient: Clojure, for instance, has functions like with-redefs, and there are snazzy libraries for HTTP, module, and function capture and mocking. But as soon as you go this route you can no longer run tests in parallel, and every test that uses mocking affects every other test, so some care is needed.
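For completeness, here is a rough sketch of that back-off in Jest: global spies and fake timers, reset after every test. The paymentApi module is made up for illustration.
// A sketch of the back-off: global mocking, reset between tests.
import * as paymentApi from "./paymentApi" // hypothetical module
afterEach(() => {
  // Without these resets, mocks and fake timers leak into other tests.
  jest.restoreAllMocks()
  jest.useRealTimers()
})
it("charges the card", async () => {
  jest.spyOn(paymentApi, "charge").mockResolvedValue({ status: "ok" })
  jest.useFakeTimers()
  jest.setSystemTime(new Date("2024-01-01T00:00:00Z"))
  // ...exercise the code under test; this stays safe only as long as
  // mocking is reset and these tests are not run in parallel with
  // others that rely on the real paymentApi.
})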
Example: Pulling it all together
Here is an example test setup closely inspired by projects I have worked on with independent servers, a shared database with independent schemas for each test, independent data, and independent mocking of external services. The example is TypeScript using Jest testing syntax but you can do something similar in any language.
// testHelpers.ts
const withTestServerAndDatabase = (cfg: TestConfig): TestContext => {
  const ctx: TestContext = {
    cfg: cfg
  }
  beforeEach(async () => {
    ctx.server = await startServer(cfg.server)
  })
  afterEach(async () => {
    await stopServer(ctx.server)
  })
  // Includes unique port to make requests against server
  // Includes unique schema for database calls
  return ctx
}
// products.test.ts
describe("Products API", () => {
  describe("get products", () => {
    describe("products", () => {
      const cfg: TestConfig = mkDefaultTestConfig()
      const ctx: TestContext = withTestServerAndDatabase(cfg)
      it("returns products", async () => {
        const testProducts = [
          {...mkProduct(1), name: "Thneed"},
          {...mkProduct(2), name: "Thneed Premium"}
        ]
        await dbInsert(testProducts)
        const response = await apiDataRequest(ctx, "/products")
        const expectedProducts = [
          {...mkProduct(1), name: "Thneed"},
          {...mkProduct(2), name: "Thneed Premium"}
        ]
        expect(response.products).toEqual(expectedProducts)
      })
    })
    describe("metadata", () => {
      const cfg: TestConfig = {
        ...mkDefaultTestConfig(),
        // Contrived example, but you can change mocked services for each test
        auth: mockAuthService
      }
      const ctx: TestContext = withTestServerAndDatabase(cfg)
      it("includes metadata in response", async () => {
        const testProducts = [
          {...mkProduct(1), name: "Thneed"},
          {...mkProduct(2), name: "Thneed Premium"}
        ]
        await dbInsert(testProducts)
        const response = await apiDataRequest(ctx, "/products")
        const expectedMeta = {
          count: 2,
          offset: 0,
          total: 2,
          remaining: 0
        }
        expect(response.meta).toEqual(expectedMeta)
      })
    })
  })
})
Remember the ideal
We eventually cleaned up the testing mess at the startup I mentioned earlier. We wrote all new tests as independent tests that create their own data, and ported the old tests over bit by bit to create their own data instead of relying on shared fixtures. It came together over a few months while we developed other features, and the result was a better-tested, less buggy, more maintainable service.
The new tests were not 100% independent from each other – they shared database connections, among other things – but that’s not the point. The point is that by remembering and striving for the ideal of independent tests you can write better tests and better software. Every test should be runnable in parallel with every other test. Or as close as you can get.
Thanks to Atte for the insightful feedback on this and previous posts.