TDD
Specification by Tests: LLM‑Driven TDD
Until now, we have often asked the LLM to produce the final result right away: write the code, assemble the screen, wire the API. There is another, equally valid way to build with an LLM: express the requirement as checks first and only then implement. Using tests as the specification gives a concrete, objective definition of done and keeps the assistant from guessing.
Test‑driven development (TDD) is exactly that discipline. A small, precise expectation is written as a test; the minimum code is created to satisfy it; the code is then cleaned up without changing behavior. Starting from a failing test turns vague instructions into something measurable, and every discovered issue becomes another test, so the fix is locked in.
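A minimal sketch of that cycle, assuming a Vitest-style test runner and a hypothetical applyDiscount function. First the expectation, written while no implementation exists:

```ts
import { describe, it, expect } from "vitest";
// Step 1 (red): the expectation comes before the implementation.
import { applyDiscount } from "./pricing"; // hypothetical module under test

describe("applyDiscount", () => {
  it("applies a percentage discount and never drops below zero", () => {
    expect(applyDiscount(100, 0.2)).toBe(80);
    expect(applyDiscount(10, 1.5)).toBe(0); // over-discounting clamps to zero
  });
});
```

```ts
// Step 2 (green): pricing.ts, the minimum code that satisfies the test.
export function applyDiscount(price: number, rate: number): number {
  return Math.max(0, price - price * rate);
}
```

Step 3 (refactor) then reshapes the implementation freely, as long as the test stays green.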
Where this approach fits best
- Well‑defined flows with observable outcomes (signup, checkout, ticket creation).
- Adapters & clients for third‑party APIs (deterministic inputs/outputs, pagination, rate‑limit handling).
- Business rules & validation (eligibility, pricing, form rules, content policies).
- Transformations & utilities (parsers, normalizers, formatters) that are easy to isolate.
- Contracts at boundaries: microfrontend remotes, public component APIs, endpoint tables.
- Security & safety rules (authorization boundaries, data redaction) that must never regress.
Less suitable for open‑ended visual design or rapidly shifting specs until acceptance criteria settle.
Workflow (docs → failing tests → code → green)
- Collect inputs: product brief, endpoint table, data model, security rules.
- Ask the LLM for a test plan: list user stories, edge cases, and negative paths.
- Generate executable tests (start small): unit/contract tests with minimal fixtures and clear pass/fail.
- Review & trim: remove implementation hints; keep behavior‑only assertions.
- Run tests → they fail (by design).
- Ask the LLM to implement the minimal code to satisfy the failing tests.
- Iterate: add edge cases; refactor with tests green.
- Add acceptance checks for critical flows (happy path + key errors).
- Keep tests as living docs: when requirements change, update tests first.
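To make the "review & trim" step concrete, here is a hedged before/after with hypothetical names (Vitest-style mocks assumed): the first test pins an internal call and breaks on refactoring, the second asserts only observable behavior.

```ts
import { it, expect, vi } from "vitest";
import { registerUser } from "./signup"; // hypothetical module under test

// Too tied to implementation: fails whenever internals are reorganized.
it("calls hashPassword exactly once", async () => {
  const hashPassword = vi.fn().mockResolvedValue("hashed");
  await registerUser({ email: "a@b.c", password: "s3cret" }, { hashPassword });
  expect(hashPassword).toHaveBeenCalledTimes(1);
});

// Behavior-only: survives refactoring, still fails if the rule is violated.
it("never returns or stores the plaintext password", async () => {
  const account = await registerUser({ email: "a@b.c", password: "s3cret" });
  expect(JSON.stringify(account)).not.toContain("s3cret");
});
```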
Test types (start with a small set)
- Unit / business‑rule tests: pure functions; no network or database.
- Contract tests for API clients: verify request shape, headers, pagination, 429 handling; run against a local mock.
- Component/contract tests for microfrontends: mount public exports; check props/events only.
- Integration tests (thin): a narrow slice through adapter → service → response; fast and deterministic (sketched after this list).
- Acceptance/E2E (very few): smoke the critical user paths with stable selectors/test IDs.
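The thin integration slice can be illustrated with a stubbed adapter driving a service; the names below are hypothetical and a Vitest-style runner is assumed.

```ts
import { it, expect } from "vitest";
import { createTicketService } from "./ticketService"; // hypothetical factory under test

it("creates a ticket through the adapter and returns its id", async () => {
  // In-memory adapter stub: deterministic, no network or database.
  const adapter = {
    insertTicket: async (ticket: { title: string }) => ({ id: "T-1", ...ticket }),
  };
  const service = createTicketService(adapter);

  const result = await service.createTicket({ title: "Login broken" });

  expect(result).toEqual({ id: "T-1", title: "Login broken" });
});
```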
Prompt templates
Generate a test plan from docs
```text
Context: <paste product brief, endpoint table, data rules>.
Task: Propose a minimal test plan that defines behavior without prescribing implementation. Cover: happy paths, edge cases, negative cases, and security/authorization rules. Output a numbered list with short titles and expected outcomes.
```
Produce executable unit tests
```text
From tests #1–#3 in the plan, generate executable unit tests in <framework>. Use small, explicit fixtures. No network or file I/O. Assert behavior only. If a rule depends on time/randomness, inject a clock/seed.
```
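What such a generated test might look like, with the clock injected as an argument so the rule stays deterministic (hypothetical isOfferActive, Vitest assumed):

```ts
import { describe, it, expect } from "vitest";
import { isOfferActive } from "./offers"; // hypothetical rule under test

describe("isOfferActive", () => {
  const offer = { startsAt: "2024-05-01T00:00:00Z", endsAt: "2024-05-31T23:59:59Z" };

  it("is active between start and end", () => {
    expect(isOfferActive(offer, new Date("2024-05-15T12:00:00Z"))).toBe(true);
  });

  it("is inactive once the end date has passed", () => {
    expect(isOfferActive(offer, new Date("2024-06-01T00:00:00Z"))).toBe(false);
  });
});
```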
Contract tests for a third‑party API client
```text
Given the official API spec <link or excerpt>, generate contract tests that:
- build requests with exact paths/methods/headers/fields,
- verify pagination and rate-limit handling (at most one backoff retry on 429),
- forbid unknown fields.
Use a local mock server with canned responses.
```
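A sketch of one such contract test. To stay self-contained it injects a fetch stub instead of spinning up a mock server; the client name, endpoint, and auth header are assumptions, and Vitest's vi.fn() provides the stub.

```ts
import { it, expect, vi } from "vitest";
import { createInvoiceClient } from "./invoiceClient"; // hypothetical client under test

it("builds the exact request and retries once on 429", async () => {
  const fetchStub = vi
    .fn()
    // First call is rate limited...
    .mockResolvedValueOnce(new Response(null, { status: 429, headers: { "Retry-After": "0" } }))
    // ...second call succeeds with a canned body.
    .mockResolvedValueOnce(
      new Response(JSON.stringify({ items: [], nextCursor: null }), { status: 200 })
    );

  const client = createInvoiceClient({
    baseUrl: "https://api.example.com",
    apiKey: "k",
    fetch: fetchStub,
  });
  const page = await client.listInvoices({ cursor: null });

  // Exact request shape: path, method, headers.
  const [url, init] = fetchStub.mock.calls[0];
  expect(url).toBe("https://api.example.com/v1/invoices");
  expect(init.method).toBe("GET");
  expect(init.headers.Authorization).toBe("Bearer k");

  // Exactly one backoff retry, then the parsed page.
  expect(fetchStub).toHaveBeenCalledTimes(2);
  expect(page).toEqual({ items: [], nextCursor: null });
});
```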
Microfrontend remote contract
```text
For remote "profile/Widget": write tests that mount the exposed component and verify props/events only. Do not import internal files. Fail if undocumented props are used. Provide a minimal host stub.
```
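One possible shape for that test, using React Testing Library and assuming the remote "profile/Widget" is aliased to its built export in the test config; the props and user id are made up for illustration.

```tsx
import React from "react";
import { it, expect, vi } from "vitest";
import { render, screen, fireEvent } from "@testing-library/react";
// Only the documented public export is imported, never internal files.
import Widget from "profile/Widget";

it("emits onEdit with the user id when the edit button is clicked", () => {
  const onEdit = vi.fn();
  render(<Widget userId="42" onEdit={onEdit} />);

  fireEvent.click(screen.getByRole("button", { name: /edit/i }));
  expect(onEdit).toHaveBeenCalledWith("42");
});
```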
Implement to pass tests
```text
Write the minimal code to make these tests pass. Do not change tests. If a test is ambiguous, propose a clearer assertion. Keep functions small; no side effects outside specified adapters.
```
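Continuing the invoice-client example from the contract-test sketch above, the minimal implementation the LLM might produce could look like this (same assumed names; nothing beyond what the tests demand):

```ts
// invoiceClient.ts: the least code that makes the contract tests pass.
type Deps = { baseUrl: string; apiKey: string; fetch: typeof fetch };

export function createInvoiceClient({ baseUrl, apiKey, fetch }: Deps) {
  async function request(path: string): Promise<Response> {
    const init = { method: "GET", headers: { Authorization: `Bearer ${apiKey}` } };
    let res = await fetch(`${baseUrl}${path}`, init);
    if (res.status === 429) {
      // Exactly one backoff retry on rate limiting, as the tests specify.
      res = await fetch(`${baseUrl}${path}`, init);
    }
    return res;
  }

  return {
    async listInvoices(_opts: { cursor: string | null }) {
      const res = await request("/v1/invoices");
      return res.json();
    },
  };
}
```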
Extend with acceptance checks
```text
Generate 2–3 acceptance tests for the primary user flow. Use stable selectors/test IDs. Mock network at the boundary adapter. Keep each test under 2 seconds.
```
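One of those acceptance tests could look like the sketch below, assuming Playwright and hypothetical test IDs, with the network mocked at the boundary route:

```ts
import { test, expect } from "@playwright/test";

test("user can submit the signup form and sees a confirmation", async ({ page }) => {
  // Mock the boundary adapter's endpoint with a canned response.
  await page.route("**/api/signup", (route) =>
    route.fulfill({
      status: 201,
      contentType: "application/json",
      body: JSON.stringify({ id: "u-1" }),
    })
  );

  await page.goto("/signup");
  await page.getByTestId("signup-email").fill("a@b.c");
  await page.getByTestId("signup-submit").click();

  await expect(page.getByTestId("signup-success")).toBeVisible();
});
```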
Review checklist for LLM‑written tests
- Does each test state behavior, not implementation details?
- Are selectors/IDs stable and tied to public contracts?
- Are there clear negative tests (forbidden actions, validation failures)?
- Are network calls mocked with official spec shapes only?
- Is flakiness minimized (no sleeps; use events/awaits; fixed seeds)?
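On the last point, one common way to remove sleeps is to drive fake timers instead of waiting in real time; a minimal sketch with Vitest's timer API and a hypothetical debounced save:

```ts
import { it, expect, vi } from "vitest";
import { debounceSave } from "./autosave"; // hypothetical debounced function

it("saves once after the debounce window, without real sleeps", async () => {
  vi.useFakeTimers();
  const save = vi.fn();
  const debounced = debounceSave(save, 500);

  debounced();
  debounced();
  await vi.advanceTimersByTimeAsync(500); // jump the clock instead of sleeping

  expect(save).toHaveBeenCalledTimes(1);
  vi.useRealTimers();
});
```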
Summary
Treat tests as the contract. Let the LLM draft the tests from your documentation, then implement to make them pass. Start with a few unit/contract checks, keep acceptance tests minimal, and require exact API shapes to avoid hallucinations. This keeps behavior explicit, enables parallel work, and provides reliable guardrails as the product evolves.