When you are working inside a sprint, test coverage often lags behind development. Features ship, PRs merge, and tests get written later — if at all. With Claude Code and the Spur MCP, you can generate and validate tests as part of your development workflow, closing the gap between code changes and test coverage. This guide walks you through a real-world workflow for generating tests from code context, creating them in Spur via MCP, and iterating on failures — all from within your editor.

Who is this for?

  • QA engineers who want to scale test creation without sacrificing quality
  • Developers who want to validate their changes before merging
  • Teams looking to shift testing left into the sprint cycle

Prerequisites

  • Spur MCP set up in Claude Code
  • A GitHub or Jira MCP connected (for PR/ticket context)
  • Claude Code with access to your project codebase

The Workflow

Step-by-Step

1. Feed your agent a PR or ticket

Start by pointing Claude Code at the change you want to test. This can be a pull request, a Jira ticket, or even a commit hash.
"Here's PR #247 — it updates the search modal to clear the
search term on close. Can you generate test cases for this?"
Claude reads the PR description, code diffs, and changed files to understand what was modified. Even PRs with minimal descriptions work well — the code diff itself provides rich context about what changed and what needs testing.

2. Claude identifies test areas from code context

With access to your full codebase, Claude goes beyond the PR diff. It examines the surrounding code, related components, and existing test coverage to identify what should be tested. For example, Claude might identify:
  • The primary flow (search term clears on modal close)
  • Edge cases (what if the search had active results?)
  • Related areas (does the search state persist across navigation?)
You can also ask Claude to prioritize broadly:
"What are the highest-priority areas of our app that need
test coverage? Focus on cart, checkout, and navigation."
Claude analyzes your codebase structure, identifies critical user flows, and ranks them by importance — giving you a prioritized list of test cases to create.

3. Generate test steps

Claude outputs structured test steps that map directly to what Spur expects. Each test case includes a clear description, preconditions, and step-by-step actions with expected outcomes. Example output:
Test: Search term clears on modal close
1. Navigate to the homepage
2. Click the search icon to open the search modal
3. Type "jacket" in the search field
4. Verify search results appear
5. Close the search modal
6. Reopen the search modal
7. Verify the search field is empty
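Before copying steps into Spur, it can help to sanity-check them mechanically. The sketch below holds the test above in a hypothetical structured form (the field names `name`, `steps`, `action`, and `expect` are illustrative, not Spur's actual schema) and flags two common authoring problems:

```python
# Hypothetical structured form of the test above; the field names
# (name, steps, action, expect) are illustrative, not Spur's schema.
test_case = {
    "name": "Search term clears on modal close",
    "steps": [
        {"action": "Navigate to the homepage"},
        {"action": "Click the search icon to open the search modal"},
        {"action": 'Type "jacket" in the search field'},
        {"action": "Verify search results appear", "expect": "results visible"},
        {"action": "Close the search modal"},
        {"action": "Reopen the search modal"},
        {"action": "Verify the search field is empty", "expect": "field empty"},
    ],
}

def lint_test_case(tc: dict) -> list[str]:
    """Flag common authoring problems before creating the test in Spur."""
    if not tc["steps"]:
        return ["test has no steps"]
    problems = []
    # A test that never verifies anything cannot fail meaningfully.
    if not any("expect" in s or s["action"].lower().startswith("verify")
               for s in tc["steps"]):
        problems.append("test never verifies anything")
    return problems

print(lint_test_case(test_case))  # → []
```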

4. Create the test in Spur

Once you are happy with the test steps, copy them into Spur to create the test. CRUD operations for test creation and editing via MCP are coming soon; once they land, Claude will create and update tests directly, eliminating the manual copy step.

5. Run the test

Use the Spur MCP to trigger the test directly from your editor:
"Run the search modal test on staging"
Claude calls run_test with the correct environment and monitors the run. You stay in your editor the entire time.

6. Analyze and iterate on failures

When a test fails, Claude pulls the full debugging context automatically:
  • get_test_run_overview — What failed and at which step
  • get_test_run_console_logs — JavaScript errors or warnings
  • get_test_run_network_logs — Failed API calls or unexpected responses
  • get_test_run_screenshots — Visual state of the browser at failure
Claude then correlates the failure with your codebase to determine whether it is a test issue (wrong selector, timing, incorrect step) or an actual bug in the code. If the test steps need adjustment, Claude revises them and you run again. If it is a real bug, Claude helps you fix it in your code.
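The test-issue-versus-real-bug call can be partly mechanized. A rough first-pass heuristic (the log shapes here are assumptions, not the actual format returned by the `get_test_run_*` tools): server errors and uncaught exceptions usually point at the app, while selector and timeout failures usually point at the test:

```python
def triage_failure(console_logs: list[str], network_logs: list[dict]) -> str:
    """Rough first-pass classification of a failed run.

    The log shapes are hypothetical; adapt to whatever the
    get_test_run_* tools actually return.
    """
    # Server errors or uncaught exceptions suggest a real bug in the app.
    if any(entry.get("status", 0) >= 500 for entry in network_logs):
        return "likely app bug: server error during the flow"
    if any("Uncaught" in line for line in console_logs):
        return "likely app bug: uncaught JavaScript exception"
    # Selector and timeout failures usually mean the test needs adjusting.
    if any("selector" in line.lower() or "timeout" in line.lower()
           for line in console_logs):
        return "likely test issue: revise selectors or add waits"
    return "unclear: inspect screenshots and step-level overview"

print(triage_failure(["Timeout waiting for selector .search-input"], []))
# → likely test issue: revise selectors or add waits
```

This is only a starting point; ambiguous failures still need the screenshots and step-level overview that Claude pulls via MCP.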

Scaling with Code-to-Test Mapping

As your test suite grows, you can maintain a mapping between code areas and their corresponding Spur tests. This lets you (or your CI pipeline) automatically identify which tests to run when specific code changes.

How it works

  1. Map code areas to tests — Maintain a reference (in your repo or in Spur) that links code paths to test IDs
  2. When a PR touches a file — Claude checks the mapping and identifies which Spur tests cover the affected area
  3. Run only relevant tests — Instead of running the full suite, you run targeted tests that correspond to the change
"I changed the cart component. What Spur tests should I run?"
Claude cross-references the changed files with your test mapping and calls list_tests to confirm coverage, then runs the relevant subset with run_tests.

Example mapping

# test-mapping.yaml
homepage:
  paths: ["src/components/Home/**", "src/pages/index.*"]
  spur_tests: ["test_id_1", "test_id_2"]

cart:
  paths: ["src/components/Cart/**", "src/context/CartContext.*"]
  spur_tests: ["test_id_3", "test_id_4", "test_id_5"]

search:
  paths: ["src/components/Search/**"]
  spur_tests: ["test_id_6", "test_id_7"]

Pairing with Other MCPs

This workflow becomes even more powerful when you combine multiple MCPs:
  • GitHub: PR diffs, commit history, changed files
  • Jira / Linear: ticket context, acceptance criteria, sprint priorities
See Connecting with Other MCP Tools for setup details.

CI/CD Integration

To fully close the loop, you can trigger Spur tests from your CI pipeline on deploy previews:
  1. Your CI builds a deploy preview for the PR
  2. A GitHub Action triggers Spur tests against the preview URL
  3. Spur runs the relevant tests using environment URL overrides
  4. Results feed back to the PR as a status check
This means every PR gets tested automatically against a live preview — catching regressions before they reach staging or production. See CI/CD Integration for setup instructions.
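As a rough shape, the pipeline above might look like the following workflow. This is a sketch only: the script names are placeholders, and how you actually invoke Spur (CLI, API, or a published action) depends on your setup, as described in the CI/CD Integration guide.

```yaml
# .github/workflows/spur-preview-tests.yml — illustrative sketch only.
name: Spur preview tests
on:
  pull_request:

jobs:
  spur-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Placeholder: block until the deploy preview for this PR is live.
      - name: Wait for deploy preview
        run: ./scripts/wait-for-preview.sh
      # Placeholder: trigger the relevant Spur tests with the preview URL
      # as the base URL override; results surface as this job's status.
      - name: Run Spur tests against the preview
        run: ./scripts/run-spur-tests.sh --base-url "$PREVIEW_URL"
```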

Tips from Real Usage

Ask Claude to survey your codebase for high-priority test areas first. Then work through them systematically — one area at a time (cart, then navigation, then checkout). This is more productive than trying to test everything at once.
When authoring tests, use relative paths instead of hardcoded domains. This makes it easy to run the same tests across staging, production, and deploy previews by overriding the base URL in your environment configuration.
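Concretely, a step that targets a relative path like `/cart` resolves against whichever base URL the environment supplies (the domains below are made-up examples):

```python
from urllib.parse import urljoin

# One relative step path works unchanged against every environment's
# base URL; only the configured base changes per environment.
STEP_PATH = "/cart"
for base in ["https://staging.example.com",
             "https://www.example.com",
             "https://deploy-preview-247.example.app"]:
    print(urljoin(base, STEP_PATH))
# → https://staging.example.com/cart
# → https://www.example.com/cart
# → https://deploy-preview-247.example.app/cart
```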
As you iterate with Claude Code, it learns your patterns — what step types to use, how to structure verifications, and what edge cases matter for your app. The more you use it, the less iteration you need.
Each test should validate one flow. Avoid combining multiple scenarios into a single test — it makes failures harder to diagnose and steps harder to maintain.
The most efficient workflow is: generate, create, run, analyze failure, fix steps, run again. Claude can drive this entire loop, only pausing for your approval before running tests.