Context Makes Tests Reusable

I've been designing and implementing a testing framework for around a year. The API was simple and stable for a long time, but a couple of things were bothering me: setting up and tearing down testing environments is cumbersome, and reusing tests is inconvenient.

I found out that a minor adjustment to the API can make a huge difference and significantly improve UX, so I'm sharing my observations and findings with you.

This is a testing library design note and a list of cool consequences of one design decision that made a whole class of tasks easier.

The first two sections provide the context necessary for understanding the syntax and semantics of the code examples. The third section introduces "The Problem". The following sections reveal a deceptively simple design decision and gradually show how it addresses the issues from the third section and radically improves the overall situation.

Original Design and Syntax

After dozens of iterations, weeks of research and multiple workarounds for implicit and explicit constraints, I ended up with quite a clean syntax and three primary entities: assertion, test and suite. This is what the test definition syntax looks like:

(suite "outer"
 (suite "nested"
  (test "the test" ; test can be re-executed multiple times
   (define val (prepare-value somehow))
   (is (good? val)) ; asserting
   (is (really-good? val))) ; asserting again
  (test "small"
   (is (= 4 (+ 2 2))))))

The way it works is (maybe) not intuitive, but simple: the test macro captures its body, creates a thunk (a zero-argument procedure) out of it and sends this procedure with some extra data to the test runner. This process is called test loading: the test runner registers a test (test/body-procedure + test/description + test/metadata) and can later execute and re-execute test/body-procedure whenever needed.
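To make the loading step more tangible, here is a minimal sketch of how such a macro could capture a body without running it. It assumes a toy runner that keeps loaded tests in a plain list; toy-test, *loaded-tests*, load-test! and run-loaded-tests are hypothetical names for illustration, not part of suitbl or SRFI-269.

```scheme
;; Toy registry (an assumption): a list of (description . thunk) pairs.
(define *loaded-tests* '())

(define (load-test! description thunk)
  (set! *loaded-tests*
        (cons (cons description thunk) *loaded-tests*)))

;; The macro wraps the body into a zero-argument procedure instead of
;; evaluating it in place, then hands it over to the registry.
(define-syntax toy-test
  (syntax-rules ()
    ((_ description body ...)
     (load-test! description (lambda () body ...)))))

;; Execute every loaded test in load order; safe to call repeatedly.
(define (run-loaded-tests)
  (for-each (lambda (entry) ((cdr entry)))
            (reverse *loaded-tests*)))

(define counter 0)

;; Loading only registers the test; the body does not run yet.
(toy-test "increments counter"
  (set! counter (+ counter 1)))
```

Calling (run-loaded-tests) twice executes the captured body twice, which is the re-execution property the real test runner relies on.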

The test being a first-class runtime entity is a must-have property for interactive development workflows (REPL-driven or similar). Moreover, having a test plus information from previous runs, we can schedule the next executions very effectively and integrate the run with other development tooling, e.g. we can schedule the quickest previously failed tests to be executed first and, in case of failure, halt or pause further execution and immediately bring up a debugger.
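The scheduling idea can be sketched in a few lines. The data layout below (description, failed flag, duration in milliseconds) and the names previous-run and run-before? are assumptions made for illustration; the sketch also relies on the two-argument sort procedure that Guile provides out of the box.

```scheme
;; Hypothetical results of the previous run:
;; (description failed? duration-in-ms)
(define previous-run
  '(("slow passing" #f 900)
    ("slow failing" #t 700)
    ("fast passing" #f 5)
    ("fast failing" #t 10)))

(define (failed? entry) (cadr entry))
(define (duration entry) (caddr entry))

;; Failed tests go first; within each group, quicker tests go first.
(define (run-before? a b)
  (cond ((and (failed? a) (not (failed? b))) #t)
        ((and (failed? b) (not (failed? a))) #f)
        (else (< (duration a) (duration b)))))

;; Order in which the tests would be executed next time.
(define schedule (map car (sort previous-run run-before?)))
```

With this ordering a failure shows up as early as possible, which is exactly the moment when pausing the run and opening a debugger pays off.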

A test can exist on its own, but for big projects it's better to keep things organized. The suite entity helps with that and serves two purposes: grouping related entities and building hierarchies. After the outer suite is loaded, the test runner will be aware of the test hierarchy:

📂 outer
└─ 📂 nested
   ├─ 📄 the test
   └─ 📄 small

This is cool on its own, but becomes even cooler when we add the metadata feature into the equation. The concept of metadata is simple: you can add arbitrary data to a test or suite. Like this:

(test "my test"
 'metadata '((tags . (slow integration)))
 (is (good? val)))

This way we can provide more info to the test runner, which can later be used to simplify our lives, e.g. we can mark some tests with a slow tag and ask the test runner to temporarily skip them during [re-]executions. It keeps our feedback loop tight, but still allows us to easily run all the tests once in a while. Of course, it's only one of the use cases. I bet you already have a bunch of cool ideas of what we can do with access to test-runner internals and this mechanism, but let's explore metadata for suites first.

Syntax-wise it looks exactly the same, but internally it works a bit differently. The test runner loads the whole hierarchy and for each test computes a test/compound-metadata value by merging the metadata of all enclosing suites and the test itself. So tests basically "inherit" the metadata of their ancestors (suites). This way we can mark a whole suite of tests with a particular tag or provide a fixture for setting up a db connection for all db-related tests at once.
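A rough sketch of such a merge is shown here. The exact merge rules are not spelled out in this note (for example, how fixtures from different levels are combined), so the sketch assumes the simplest possible policy: metadata is an association list and inner entries shadow outer ones on lookup. The name compound-metadata and the sample keys are hypothetical.

```scheme
;; metadata-chain is ordered from the outermost suite to the test itself.
;; Inner alists end up in front, so assq finds the innermost value first.
(define (compound-metadata metadata-chain)
  (let loop ((rest metadata-chain) (acc '()))
    (if (null? rest)
        acc
        (loop (cdr rest) (append (car rest) acc)))))

(define outer-md  '((tags . (db)) (timeout . 10)))
(define nested-md '((timeout . 30)))
(define test-md   '((tags . (slow integration))))

(define merged
  (compound-metadata (list outer-md nested-md test-md)))

;; Lookups see the most specific value for each key.
(define merged-timeout (cdr (assq 'timeout merged)))
(define merged-tags (cdr (assq 'tags merged)))
```

A real implementation would likely treat some keys specially, e.g. accumulate fixtures from all levels instead of shadowing them.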

Dynamically Scoped Variables

A small, almost off-topic, but important section that explains parameters (dynamically scoped variables) with a short example. We will need this understanding in the next section.

(define s-s-v 6)

(define (fn)
  (display s-s-v))

(let ((s-s-v 7))
  ;; let doesn't affect statically/lexically scoped s-s-v captured by fn
  (fn) ; prints 6
  (display s-s-v) ; prints 7
  )

(define d-s-v (make-parameter 6))

(define (fn2)
  (display (d-s-v)))

(parameterize ((d-s-v 7))
  ;; parameterize directly affects the value of dynamically scoped d-s-v
  (fn2) ; prints 7
  (display (d-s-v)) ; prints 7
  )

AFAIK, in modern PLT, lexical scoping is preferred over dynamic scoping for general-purpose languages. Code with lexically scoped variables is easier to reason about and maintain.

Sometimes dynamically scoped variables can be useful, e.g. when you want to add a new parameter to a function, but don't want to propagate an extra argument through all the callers. In this case dynamically scoped variables can be a compromise: they are still better than global mutable variables, and it's less refactoring work than updating all the callers' signatures and adding an extra argument to them.
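Here is a small illustration of that compromise. The names log-prefix, format-line and report-status are made up for the example; only make-parameter and parameterize are standard.

```scheme
;; The "new parameter" lives in a dynamically scoped variable.
(define log-prefix (make-parameter "[app]"))

;; The innermost function reads it directly.
(define (format-line msg)
  (string-append (log-prefix) " " msg))

;; The intermediate caller keeps its old signature and knows
;; nothing about log-prefix.
(define (report-status)
  (format-line "all good"))

(define default-line (report-status))

;; The outermost caller overrides the value for a dynamic extent.
(define test-line
  (parameterize ((log-prefix "[test]"))
    (report-status)))
```

The price is visible too: nothing in the signature of report-status tells the reader that its result depends on log-prefix.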

Still, they introduce unnecessary coupling and make control flow much more cumbersome and opaque, so it's better to avoid them when possible.

The Inconvenience of the Original Design

All the prerequisites are discussed and set, so it's time to get back to the testing library. Suites gave us grouping, hierarchies and metadata inheritance, and while all that is cool, there is a fundamental issue: only the test runner has access to it. Tests have no clue about their surroundings, and that's limiting. Let's sketch a hypothetical test suite and dissect it next.

(define db* (make-parameter #f))

(define create-admin-fixture
  (lambda (f)
    (init-db-with-admin-user! (db*) ...)
    (f)))

(define create-user-fixture
  (lambda (f)
    (init-db-with-basic-user! (db*) ...)
    (f)))

(define db-connection-fixture
  ;; In real-world code it's better to use dynamic-wind to make sure
  ;; teardown is executed on exception or other non-local control
  ;; transfer.
  (lambda (f)
    (parameterize ((db* (open-db-connection ...)))
      (f)
      (close-db-connection! (db*)))))

(define-suite (user-tests)
  'metadata
  `((fixtures ,create-user-fixture))

  (test "user is present"
    (is (user-exists? (db*) "user")))

  (test "user id is set"
    (is (number? (user-id (db*) "user"))))

  (test "user is not admin"
    (is (not (member "admins" (get-user-groups (db*) "user"))))))

(define-suite (admin-tests)
  'metadata
  `((fixtures ,create-admin-fixture))

  (test "user is present"
    (is (user-exists? (db*) "admin")))

  (test "user id is set"
    (is (number? (user-id (db*) "admin"))))

  (test "user is admin"
    (is (member "admins" (get-user-groups (db*) "admin")))))

(define-suite (multiple-users-tests)
  'metadata
  `((fixtures ,create-admin-fixture ,create-user-fixture))

  ;; TODO: [Andrew Tropin, 2026-06-15] Make tests unaware of database,
  ;; by introducing users to context for db-csv-export-test

  (test "there are two users"
    (define users (get-users (db*)))
    (is (= 2 (length users))))

  (test "admin and user are present"
    (define users (get-users (db*)))
    (define user-names (map user-name users))

    (is (member "admin" user-names))
    (is (member "user" user-names))))


(define-suite (db-tests)
  'metadata
  `((fixtures ,db-connection-fixture))

  (user-tests)
  (admin-tests)
  (multiple-users-tests))

The code is relatively straightforward and, due to its size, still looks quite elegant, but when we start scaling up the number of tests in our project, we will see some issues emerging and biting us. Let's focus on the two primary ones that are already visible in this suite.

From the library design introduction we remember that tests are independent executable units: they can be run and re-run in arbitrary order, multiple times. They must work regardless of the order and number of re-runs. That means we have to set up a proper clean environment every time we run a test, and that's totally fine and expected.

A simple option would be to repeat setup and teardown in each test manually, but it leads to significant repetition when the setup is the same for multiple tests, and the extra setup code visually obscures the test's logic.

Luckily, in the example above we do it with fixtures. Fixtures are reusable and composable. The issue here is that the test runner has no direct communication channel with tests, which means there is no way to explicitly share the execution context with a test.

For the fixtures it means we are forced to provide the execution environment via dynamically scoped variables. This couples tests and fixtures via the db* parameter and brings all the cons of parameters. Excessive coupling is no good either.

Not having access to the execution context from inside the test means we can't adjust its behavior and reuse the same test in multiple different contexts. In real life it is often handy, e.g. to test API-compatible implementations with the same test suite, or when we want to run the same checks with slightly different settings or data sources.

Let's call those issues Excessive Coupling and Context Unawareness. This way it will be easier to reference them and discuss how the solution addresses them.

The Solution

Syntax-wise the solution is exceptionally simple: we just add an argument to the test that can be referenced in its body. We wrap it in parentheses together with the description to make it look clearer. That's it:

;; The old syntax
(test "description"
  (is (ok? (some-external-dependency*)))
  (is (good? value)))

;; becomes =>

(test ("description" ctx)
  (is (ok? (assoc-ref ctx 'value)))
  (is (good? value)))

;; ctx, _, context you name it

In addition to the syntax change, our test's test/body-procedure changes from a zero-argument to a single-argument procedure. We also need to update the test-runner implementation to properly construct the context and pass it to test/body-procedure, but that's trivial.
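A minimal sketch of the runner side could look like the code below. It assumes the simplest policy, where the context is just the test's compound metadata, and it models a loaded test as a plain list. make-toy-test, toy-test-metadata, toy-test-body and run-toy-test are hypothetical names, not the suitbl or SRFI-269 API.

```scheme
;; Hypothetical loaded test: description, compound metadata and a
;; single-argument body procedure.
(define (make-toy-test description metadata body-procedure)
  (list description metadata body-procedure))

(define (toy-test-metadata test) (cadr test))
(define (toy-test-body test) (caddr test))

;; The runner constructs the context (here: the metadata as is)
;; and passes it to the body procedure.
(define (run-toy-test test)
  (let ((ctx (toy-test-metadata test)))
    ((toy-test-body test) ctx)))

(define example-test
  (make-toy-test "reads value from context"
                 '((value . 42))
                 (lambda (ctx) (cdr (assq 'value ctx)))))

(define example-result (run-toy-test example-test))
```

The body procedure no longer reaches out for its environment; everything it needs arrives through the single ctx argument.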

Now we are all set, so let's see the effect.

Contextual Awareness

Contextual Awareness provides multiple benefits, but we will start with deduplication. You probably noticed a repetitive pattern in user-tests and admin-tests. The existing checks are almost identical, with the minor difference of the user name being checked. The original code was:

(define-suite (user-tests)
  'metadata
  `((fixtures ,create-user-fixture))

  (test "user is present"
    (is (user-exists? (db*) "user")))

  (test "user id is set"
    (is (number? (user-id (db*) "user"))))

  (test "user is not admin"
    (is (not (member "admins" (get-user-groups (db*) "user"))))))

(define-suite (admin-tests)
  'metadata
  `((fixtures ,create-admin-fixture))

  (test "user is present"
    (is (user-exists? (db*) "admin")))

  (test "user id is set"
    (is (number? (user-id (db*) "admin"))))

  (test "user is admin"
    (is (member "admins" (get-user-groups (db*) "admin")))))

Let's extract the common part into a separate suite, refactor it to the new syntax and generalize it.

(define-suite (user-set-correctly-tests)
  (test ("user is present" ctx)
    (is (user-exists? (db*) (assoc-ref ctx 'sut/user))))

  (test ("user id is set" ctx)
    (is (number? (user-id (db*) (assoc-ref ctx 'sut/user))))))

Instead of a hardcoded user name, we now obtain it from ctx. To add the name to the context, we just modify the metadata of the enclosing suites. sut stands for subject under test.

(define-suite (user-set-correctly-tests)
  (test ("user is present" ctx)
    (is (user-exists? (db*) (assoc-ref ctx 'sut/user))))

  (test ("user id is set" ctx)
    (is (number? (user-id (db*) (assoc-ref ctx 'sut/user))))))

(define-suite (user-tests)
  'metadata
  `((fixtures ,create-user-fixture)
    (sut/user . "user"))

  (user-set-correctly-tests)

  (test ("user is not admin" _)
    (is (not (member "admins" (get-user-groups (db*) "user"))))))

(define-suite (admin-tests)
  'metadata
  `((fixtures ,create-admin-fixture)
    (sut/user . "admin"))

  (user-set-correctly-tests)

  (test ("user is admin" _)
    (is (member "admins" (get-user-groups (db*) "admin")))))

In some scenarios duplication can be completely justified, but for cases with a lot of reuse, such as an API-compatible library reimplementation, copy-pasting and monkeypatching tests is a guaranteed road to hell.

Of course, code reuse is not the only benefit of contextual awareness. With the execution context accessible, a test can control its own behavior. For example, if a test sees fast-run? set to #t, it can skip expensive computations and related assertions. Or, in our user-set-correctly-tests suite, we can add an extra assertion that checks, for the user named "admin", that the corresponding permission field in the database is initialized with a correct value. Imagination is the limit now, not the testing library :)
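The fast-run? idea can be sketched with a plain procedure standing in for a test body. The helper expensive-check, the counter and the 'skipped return value are assumptions made for the sake of a self-contained example; in a real suite the body would contain assertions instead.

```scheme
;; Counts how many times the costly part actually ran.
(define expensive-calls 0)

(define (expensive-check)
  (set! expensive-calls (+ expensive-calls 1))
  #t)

;; A body that adapts to the context: when fast-run? is set to #t
;; it skips the expensive part, otherwise it performs it.
(define (context-aware-body ctx)
  (let ((entry (assq 'fast-run? ctx)))
    (if (and entry (cdr entry))
        'skipped
        (expensive-check))))

(define fast-result (context-aware-body '((fast-run? . #t))))
(define full-result (context-aware-body '()))
```

The same body behaves differently depending on what the enclosing suites or the runner put into the context, without any change to the test itself.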

Excessive Uncoupling

Let's explore the situation with dynamic vars, fixtures and unnecessary coupling, and how much it has improved or worsened, hehe.

Decoupling tests from dynamic vars is easy, we just obtain db or any other part of execution environment from the context:

(test ("db access example" ctx)
  (define db (assoc-ref ctx 'db))
  (is (db-connection? db)))

With this small change, the test became self-contained and decoupled from the fixture implementation, dynamic vars and whatever else. The dependency is now explicit: the test connects directly to the test runner through the context argument.

After this update the test runner needs to construct a proper execution context and call test/body-procedure with the context as an argument. The implementation is trivial, but this also affects how fixtures work. Instead of relying on some shared dynamically scoped variable and its implicit parameterization, they now just enrich the context with values necessary for other fixtures and tests, and pass it to the next fixture or test in the stack.

(define (db-connection-fixture f)
  (lambda (ctx)
    (let ((db (open-db-connection ...)))
      ;; Add `(db . ,db) pair to context, so the further fixtures and
      ;; tests have access to db
      (f (acons 'db db ctx))

      (cleanup-db! db)
      (close-db-connection! db))))

(define (create-admin-fixture f)
  (lambda (ctx)
    (init-db-with-admin-user! (assoc-ref ctx 'db))
    (f ctx)))

(define (create-user-fixture f)
  (lambda (ctx)
    (init-db-with-basic-user! (assoc-ref ctx 'db))
    (f ctx)))

You have probably already seen this pattern, where a number of composable functions wrap each other in some particular order to build a context and possibly process the return value and enrich it on the way back. Such functions are usually called middleware. This composition is very sensitive to order, but the implementation is dead simple.
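A sketch of such a composition is shown here, under the assumption that the first fixture in the list becomes the outermost layer. The fixtures add-db-fixture and add-users-fixture and the helper wrap-with-fixtures are hypothetical and use fake data instead of a real database.

```scheme
;; A fixture takes the next procedure in the stack and returns a
;; procedure of one argument (the context).
(define (add-db-fixture f)
  (lambda (ctx)
    (f (cons (cons 'db 'fake-db-connection) ctx))))

(define (add-users-fixture f)
  (lambda (ctx)
    ;; Relies on 'db being present, so it must be wrapped by
    ;; add-db-fixture, not the other way around.
    (let ((db (cdr (assq 'db ctx))))
      (f (cons (cons 'users (list db "admin" "user")) ctx)))))

;; Wrap body so that the first fixture in the list is the outermost one.
(define (wrap-with-fixtures fixtures body)
  (if (null? fixtures)
      body
      ((car fixtures) (wrap-with-fixtures (cdr fixtures) body))))

(define wrapped-body
  (wrap-with-fixtures (list add-db-fixture add-users-fixture)
                      (lambda (ctx) (cdr (assq 'users ctx)))))

;; Start with an empty context; each layer enriches it on the way down.
(define users-seen-by-body (wrapped-body '()))
```

Swapping the two fixtures in the list would make add-users-fixture look up 'db before it exists, which is the order sensitivity mentioned above.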

The Synergetic Power Of Friendship

Cool, context awareness is added and excessive coupling is removed, but can we now do what was impossible before? I've got an example where both improvements combine, synergize and provide quite a nice developer experience.

We have multiple-users-tests, but haven't touched it yet. How about making this suite generic enough that it can be used both for testing the db connection and a csv backup?

;; The original code
(define-suite (multiple-users-tests)
  'metadata
  `((fixtures ,create-admin-fixture ,create-user-fixture))

  (test "there are two users"
    (define users (get-users (db*)))
    (is (= 2 (length users))))

  (test "admin and user are present"
    (define users (get-users (db*)))
    (define user-names (map user-name users))

    (is (member "admin" user-names))
    (is (member "user" user-names))))

First, we will make the tests unaware of the source the users are coming from. They will only know that somebody provides a list of users via the context. After rewriting the tests to the new syntax and removing the dependency on db* we get the following:

(define-suite (multiple-users-tests)
  (test ("there are two users" ctx)
    (define users (assoc-ref ctx 'users))
    (is (= 2 (length users))))

  (test ("admin and user are present" ctx)
    (define users (assoc-ref ctx 'users))
    (define user-names (map user-name users))

    (is (member "admin" user-names))
    (is (member "user" user-names))))

After that we bring back the db-based functionality. We need an extra fixture, which extracts users from the database and puts them into the context.

(define (db->users-fixture f)
  (lambda (ctx)
    (chain
     (assoc-ref ctx 'db) ; obtain db connection from context
     (get-users _) ; pass it as an argument to get-users
     (acons 'users _ ctx) ; add a list of users to the context
     (f _) ; call fixture/test further down the stack
     )))

(define-suite (db-multiple-users-tests)
  'metadata
  `((fixtures ,create-admin-fixture ,create-user-fixture ,db->users-fixture))

  (multiple-users-tests))

The refactoring is kinda complete and we can add new functionality: running multiple-users-tests on a csv backup source instead of a db connection. One more fixture and we are ready to go.

(define (csv-backup->users-fixture f)
  (lambda (ctx)
    (chain
     ;; (assoc-ref ctx 'users-csv-file-name)
     "resources/users-table-backup.csv" ; hardcode filename for now
     (get-users-from-csv _) ; obtain list of users from csv
     (acons 'users _ ctx) ; add a list of users to the context
     (f _) ; call fixture/test further down the stack
     )))

(define-suite (csv-backup-tests)
  'metadata
  `((fixtures ,csv-backup->users-fixture))

  (multiple-users-tests))

Simple, neat and concise. Now imagine we go further and try to implement a full test suite for backup validation. We already have fixtures for connecting to the db and initializing the necessary values, so all we need is to add a fixture which serializes a db table to a file, and we are ready to go: just take the already existing db tests, make them independent of the data source and run them in both the original db suites and the backup verification suites.

We won't demonstrate the implementation here, but I hope the idea sparks joy and curiosity in your head.

Conclusion

With one small change, we were able to achieve quite a lot. Let's quickly recap what we've got:

A slightly more verbose syntax :'(
A direct communication channel between test-runner and tests
A subjectively clearer test syntax which reads as a re-runnable entity
Tests are reusable now
Tests are context-aware and smart now! :)

Computation-wise, it didn't add anything that was impossible before. Both variants are still Turing-equivalent, but it definitely changed the UX/DX quite a lot and, I hope, for the better.

The new syntax is reflected in the second draft of SRFI-269. The fixtures and test-runner modifications for the suitbl testing library are still work-in-progress at the time of writing.

Support

The work on suitbl testing library and SRFI-269 is proudly funded by NLnet.

If you enjoyed the reading, consider supporting me, my work and projects. Any help, be it a coin or a word, makes a difference.

Future work

The one thing that still bothers me is that metadata is attached to the suite-loader, which means that to adjust metadata you have to wrap one suite into another. Seems reasonable, but there is another option: we could make it possible to call the suite-loader with an optional metadata argument, which will enhance the original metadata, so instead of:

(define-suite (my-suite)
  'metadata
  '((original-metadata . #t))
  (test ...)
  (test ...))

(define-suite (proxy-1-suite)
  'metadata
  '((extra-metadata . 1))
  (my-suite))

(define-suite (proxy-2-suite)
  'metadata
  '((extra-metadata . 2))
  (my-suite))

(define-suite (main-suite)
  (proxy-1-suite)
  (proxy-2-suite)
  (other-important-suite))

we can do something like:

(define-suite (my-suite)
  'metadata
  '((original-metadata . #t))
  (test ...)
  (test ...))

(define-suite (main-suite)
  (my-suite '((extra-metadata . 1)))
  (my-suite '((extra-metadata . 2)))
  (other-important-suite))

Definitely looks more elegant, but it's a very recent design idea, so other implications are not clear yet. I need a bit of time to process and think about it.
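To get a feel for the idea, here is a toy model of a suite-loader with an optional metadata argument. It is only a thought experiment: make-suite-loader, the list-based suite representation and the "extra entries shadow original ones" policy are all assumptions, not a proposal for the actual implementation.

```scheme
;; Returns a suite-loader: a procedure that can be called with or
;; without extra metadata and yields (name metadata).
(define (make-suite-loader name base-metadata)
  (lambda extra
    (let ((extra-metadata (if (null? extra) '() (car extra))))
      ;; Extra entries go first, so they shadow the original ones
      ;; on lookup while the original entries stay available.
      (list name (append extra-metadata base-metadata)))))

(define my-suite
  (make-suite-loader "my-suite" '((original-metadata . #t))))

(define loaded-1 (my-suite '((extra-metadata . 1))))
(define loaded-2 (my-suite '((extra-metadata . 2))))
(define loaded-plain (my-suite))
```

Each call produces its own enriched metadata, so the same suite can be loaded several times with different settings and without any proxy suites.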

BTW, SRFI-269 will be finalized in a couple of weeks, so there is still a chance to share feedback and uncovered use cases, or propose changes before the spec is set in stone.

Appendix

Complete example

All the snippets put together in one place:

(define (db-connection-fixture f)
  (lambda (ctx)
    (let ((db (open-db-connection ...)))
      ;; Add `(db . ,db) pair to context, so the further fixtures and
      ;; tests have access to db
      (f (acons 'db db ctx))

      (cleanup-db! db)
      (close-db-connection! db))))

(define (create-admin-fixture f)
  (lambda (ctx)
    (init-db-with-admin-user! (assoc-ref ctx 'db))
    (f ctx)))

(define (create-user-fixture f)
  (lambda (ctx)
    (init-db-with-basic-user! (assoc-ref ctx 'db))
    (f ctx)))

;;; User tests

(define-suite (user-set-correctly-tests)
  (test ("user is present" ctx)
    (is (user-exists? (assoc-ref ctx 'db) (assoc-ref ctx 'sut/user))))

  (test ("user id is set" ctx)
    (is (number? (user-id (assoc-ref ctx 'db) (assoc-ref ctx 'sut/user))))))

(define-suite (user-tests)
  'metadata
  `((fixtures ,create-user-fixture)
    (sut/user . "user"))

  (user-set-correctly-tests)

  (test ("user is not admin" ctx)
    (is (not (member "admins" (get-user-groups (assoc-ref ctx 'db) "user"))))))

(define-suite (admin-tests)
  'metadata
  `((fixtures ,create-admin-fixture)
    (sut/user . "admin"))

  (user-set-correctly-tests)

  (test ("user is admin" ctx)
    (is (member "admins" (get-user-groups (assoc-ref ctx 'db) "admin")))))

(define-suite (multiple-users-tests)
  (test ("there are two users" ctx)
    (define users (assoc-ref ctx 'users))
    (is (= 2 (length users))))

  (test ("admin and user are present" ctx)
    (define users (assoc-ref ctx 'users))
    (define user-names (map user-name users))

    (is (member "admin" user-names))
    (is (member "user" user-names))))

;;; Database suites

(define (db->users-fixture f)
  (lambda (ctx)
    (chain
     (assoc-ref ctx 'db) ; obtain db connection from context
     (get-users _) ; pass it as an argument to get-users
     (acons 'users _ ctx) ; add a list of users to the context
     (f _) ; call fixture/test further down the stack
     )))

(define-suite (db-multiple-users-tests)
  'metadata
  `((fixtures ,create-admin-fixture ,create-user-fixture ,db->users-fixture))

  (multiple-users-tests))

(define-suite (db-tests)
  'metadata
  `((fixtures ,db-connection-fixture))

  (user-tests)
  (admin-tests)
  (db-multiple-users-tests))

;;; CSV backup suite

(define (csv-backup->users-fixture f)
  (lambda (ctx)
    (chain
     ;; (assoc-ref ctx 'users-csv-file-name)
     "resources/users-table-backup.csv" ; hardcode filename for now
     (get-users-from-csv _) ; obtain list of users from csv
     (acons 'users _ ctx) ; add a list of users to the context
     (f _) ; call fixture/test further down the stack
     )))

(define-suite (csv-backup-tests)
  'metadata
  `((fixtures ,csv-backup->users-fixture))

  (multiple-users-tests))


;;; All tests suite

(define-suite (all-tests)
  (db-tests)
  (csv-backup-tests))