When you are working inside a sprint, test coverage often lags behind development. Features ship, PRs merge, and tests get written later — if at all. With Claude Code and the Spur MCP, you can generate and validate tests as part of your development workflow, closing the gap between code changes and test coverage. This guide walks you through a real-world workflow for generating tests from code context, creating them in Spur via MCP, and iterating on failures — all from within your editor.

Who is this for?

  • QA engineers who want to scale test creation without sacrificing quality
  • Developers who want to validate their changes before merging
  • Teams looking to shift testing left into the sprint cycle

Prerequisites

  • Spur MCP set up in Claude Code
  • A GitHub or Jira MCP connected (for PR/ticket context)
  • Claude Code with access to your project codebase

The Workflow

Step-by-Step

Step 1: Feed your agent a PR or ticket

Start by pointing Claude Code at the change you want to test. This can be a pull request, a Jira ticket, or even a commit hash.
"Here's PR #247 — it updates the search modal to clear the
search term on close. Can you generate test cases for this?"
Claude reads the PR description, code diffs, and changed files to understand what was modified. Even PRs with minimal descriptions work well — the code diff itself provides rich context about what changed and what needs testing.
Step 2: Claude identifies test areas from code context

With access to your full codebase, Claude goes beyond the PR diff. It examines the surrounding code, related components, and existing test coverage to identify what should be tested. For example, Claude might identify:
  • The primary flow (search term clears on modal close)
  • Edge cases (what if the search had active results?)
  • Related areas (does the search state persist across navigation?)
You can also ask Claude to prioritize broadly:
"What are the highest-priority areas of our app that need
test coverage? Focus on cart, checkout, and navigation."
Claude analyzes your codebase structure, identifies critical user flows, and ranks them by importance — giving you a prioritized list of test cases to create.
Step 3: Generate test steps

Claude outputs structured test steps that map directly to what Spur expects. Each test case includes a clear description, preconditions, and step-by-step actions with expected outcomes. Example output:
Test: Search term clears on modal close
1. Navigate to the homepage
2. Click the search icon to open the search modal
3. Type "jacket" in the search field
4. Verify search results appear
5. Close the search modal
6. Reopen the search modal
7. Verify the search field is empty
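The exact shape Spur expects is not reproduced in this guide, but it can help to think of each step as an action plus an optional expected outcome. Here is a minimal sketch in Python; the TestStep class and its field names are illustrative assumptions, not Spur's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TestStep:
    # Hypothetical shape for illustration; Spur's real step schema may differ.
    action: str
    expected: Optional[str] = None  # None when the step has no explicit check

# The "search term clears on modal close" example as structured data
SEARCH_MODAL_TEST = [
    TestStep("Navigate to the homepage"),
    TestStep("Click the search icon to open the search modal"),
    TestStep('Type "jacket" in the search field'),
    TestStep("Verify search results appear", expected="Results are visible"),
    TestStep("Close the search modal"),
    TestStep("Reopen the search modal"),
    TestStep("Verify the search field is empty", expected="Search field is empty"),
]
```

Structuring steps this way, rather than as free text, is what lets Claude revise a single step later (for example, step 3) without touching the rest of the test.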
Step 4: Create the test in Spur

Once you are happy with the test steps, ask Claude to create the test directly in Spur. Claude will:
  1. Call list_suites to discover available suites and their URL keys
  2. Show you a full test summary — suite name, title, URL key and resolved URL, and all numbered steps
  3. Ask for your explicit approval before saving
  4. Call create_test to save the test to the suite
"Create this test in the Checkout suite using the staging URL key"
After creation, Claude returns the new test_id, which it can use immediately to run the test. If you want to revise the test after reviewing the result, ask Claude to update it:
"Update that test — change step 3 to click the 'Add to cart' button instead"
Claude calls update_test with the revised steps, again showing you what will change before saving.
Step 5: Run the test

Use the Spur MCP to trigger the test directly from your editor:
"Run the search modal test on staging"
Claude calls run_tests with the correct environment name and monitors the run. You stay in your editor the entire time.
Step 6: Analyze and iterate on failures

When a test fails, Claude pulls the full debugging context automatically:
  • get_test_run_overview — What failed and at which step
  • get_test_run_console_logs — JavaScript errors or warnings
  • get_test_run_network_logs — Failed API calls or unexpected responses
  • get_test_run_screenshots — Visual state of the browser at failure
Claude then correlates the failure with your codebase to determine whether it is a test issue (wrong selector, timing, incorrect step) or an actual bug in the code. If the test steps need adjustment, Claude calls update_test to revise them in place and you run again, with no need to go back to the Spur UI. If it is a real bug, Claude helps you fix it in your code.

Scaling with Code-to-Test Mapping

As your test suite grows, you can maintain a mapping between code areas and their corresponding Spur tests. This lets you (or your CI pipeline) automatically identify which tests to run when specific code changes.

How it works

  1. Map code areas to tests — Maintain a reference (in your repo or in Spur) that links code paths to test IDs
  2. When a PR touches a file — Claude checks the mapping and identifies which Spur tests cover the affected area
  3. Run only relevant tests — Instead of running the full suite, you run targeted tests that correspond to the change
"I changed the cart component. What Spur tests should I run?"
Claude cross-references the changed files with your test mapping and calls list_tests to confirm coverage, then runs the relevant subset with run_tests.

Example mapping

# test-mapping.yaml
homepage:
  paths: ["src/components/Home/**", "src/pages/index.*"]
  spur_tests: ["test_id_1", "test_id_2"]

cart:
  paths: ["src/components/Cart/**", "src/context/CartContext.*"]
  spur_tests: ["test_id_3", "test_id_4", "test_id_5"]

search:
  paths: ["src/components/Search/**"]
  spur_tests: ["test_id_6", "test_id_7"]

Pairing with Other MCPs

This workflow becomes even more powerful when you combine multiple MCPs:
  • GitHub: PR diffs, commit history, changed files
  • Jira / Linear: ticket context, acceptance criteria, sprint priorities
See Connecting with Other MCP Tools for setup details.

CI/CD integration

To fully close the loop, you can trigger Spur tests from your CI pipeline on deploy previews:
  1. Your CI builds a deploy preview for the PR
  2. A GitHub Action triggers Spur tests against the preview URL
  3. Spur runs the relevant tests using Override URLs to target the preview environment
  4. Results feed back to the PR as a status check
This means every PR gets tested automatically against a live preview — catching regressions before they reach staging or production.
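The steps above can be sketched as a single GitHub Actions job. Everything Spur-specific in this sketch is an assumption for illustration: the API endpoint, secret names, and request payload are not documented in this guide, so substitute the actual trigger mechanism your Spur setup provides.

```yaml
# .github/workflows/spur-preview-tests.yml (illustrative sketch only)
name: Spur preview tests
on:
  pull_request:

jobs:
  spur-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Run Spur tests against the deploy preview
        run: |
          # Hypothetical API call; replace with your real Spur trigger
          curl -fsS -X POST "$SPUR_API_URL/runs" \
            -H "Authorization: Bearer $SPUR_API_KEY" \
            -H "Content-Type: application/json" \
            -d "{\"override_url\": \"$PREVIEW_URL\", \"tests\": [\"test_id_3\"]}"
        env:
          SPUR_API_URL: ${{ secrets.SPUR_API_URL }}  # hypothetical secret
          SPUR_API_KEY: ${{ secrets.SPUR_API_KEY }}  # hypothetical secret
          PREVIEW_URL: ${{ secrets.PREVIEW_URL }}    # in practice, supplied by your deploy step
```

The override_url field stands in for Spur's Override URLs feature, which points the tests at the preview environment instead of staging.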

Related guides

  • CI/CD Integration: Set up GitHub or GitLab pipelines to trigger tests automatically.
  • Override URLs: Redirect tests to deploy previews or feature branch URLs at runtime.

Tips from Real Usage

  • Ask Claude to survey your codebase for high-priority test areas first, then work through them systematically, one area at a time (cart, then navigation, then checkout). This is more productive than trying to test everything at once.
  • When authoring tests, use relative paths instead of hardcoded domains. This makes it easy to run the same tests across staging, production, and deploy previews by overriding the base URL in your environment configuration.
  • As you iterate with Claude Code, it learns your patterns: what step types to use, how to structure verifications, and which edge cases matter for your app. The more you use it, the less iteration you need.
  • Each test should validate one flow. Combining multiple scenarios into a single test makes failures harder to diagnose and steps harder to maintain.
  • The most efficient workflow is: generate, create, run, analyze the failure, fix the steps, run again. Claude can drive this entire loop, pausing only for your approval before running tests.