Available Tools
| Tool | What it does |
|---|---|
list_test_plans | Lists all test plans with suites, environments, and last run time. Pass plan_id to get the full editable configuration |
create_test_plan | Creates a new test plan grouping suites and environments for coordinated execution |
update_test_plan | Updates an existing test plan — name, description, environments, or suite configuration |
run_test_plan | Triggers a full plan run across all configured suites and environments |
get_test_plan_runs | Returns recent run history for a plan: timestamps, pass/fail counts, run IDs |
get_test_plan_run_overview | Shows every test result in a run — failures first with failure reasons and task_ids for drilling in |
Once you have task_ids from get_test_plan_run_overview, use the standard run analysis tools to investigate individual failures:
- get_test_run_overview — Step-level failure summary for a specific task
- get_test_run_screenshots — Visual state of the browser at each step
- get_test_run_console_logs — JavaScript errors and browser warnings
- get_test_run_network_logs — HTTP requests and responses
Common Workflows
Create a new test plan
When you ask for a new plan by suite and environment names, Claude calls list_suites and list_environments to resolve those names to IDs, shows you a full plan summary, and waits for your approval before calling create_test_plan to save it.
Update a test plan
Claude calls list_test_plans with the plan_id to get the current configuration, then shows you what will change and asks for approval before calling update_test_plan. When updating suites, the full new list replaces the existing one — so Claude copies across all existing suites and adds the new one.
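The replace-not-merge semantics are worth internalizing, since sending only the new suite would silently drop the rest. A minimal sketch of the safe pattern, where the dict shapes and field names are assumptions for illustration, not the real tool contract:

```python
# Illustrative sketch of the replace-not-merge rule for suites when updating
# a test plan. The payload field names ("plan_id", "suites") are assumed.

def build_update_payload(current_plan: dict, new_suite_id: int) -> dict:
    """Copy all existing suite IDs and append the new one, because
    update_test_plan replaces the whole suite list rather than merging."""
    suites = list(current_plan["suites"])      # keep everything already there
    if new_suite_id not in suites:
        suites.append(new_suite_id)            # add only the new suite
    return {"plan_id": current_plan["plan_id"], "suites": suites}

# Example: a plan that already runs suites 12 and 34 gains suite 56.
plan = {"plan_id": 665, "suites": [12, 34]}
print(build_update_payload(plan, new_suite_id=56))
# {'plan_id': 665, 'suites': [12, 34, 56]}
```

Passing `{"suites": [56]}` instead would leave the plan running only suite 56 — exactly the mistake the copy-then-append step avoids.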
Trigger a regression run and wait for results
Claude calls run_test_plan with the plan ID (discovered via list_test_plans) and then polls get_test_plan_runs to monitor progress.
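The run-then-poll pattern can be sketched as below. The two functions stand in for the MCP tool calls and are stubbed with canned responses so the sketch is self-contained; the response shapes (`run_id`, `status`, pass/fail counts) are assumptions, not the documented schema:

```python
import time

def run_test_plan(plan_id):
    return {"run_id": 107272}                  # stub: queue a run, get its ID

_polls = iter([
    {"run_id": 107272, "status": "running"},
    {"run_id": 107272, "status": "completed", "passed": 41, "failed": 2},
])

def get_test_plan_runs(plan_id):
    return [next(_polls)]                      # stub: recent runs, newest first

def run_and_wait(plan_id, poll_seconds=0):
    run_id = run_test_plan(plan_id)["run_id"]
    while True:
        latest = get_test_plan_runs(plan_id)[0]
        # stop once our run has settled into a terminal state
        if latest["run_id"] == run_id and latest["status"] != "running":
            return latest
        time.sleep(poll_seconds)               # back off between polls

result = run_and_wait(665)
print(result["status"])  # completed
```

In practice Claude handles this loop for you; the sketch only shows why a non-zero poll interval matters for long plan runs.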
Triage all failures from the latest run
Claude calls list_test_plans to find the plan, get_test_plan_runs to find the most recent run ID, then get_test_plan_run_overview to list every failure with its reason.
Deep-dive into a specific failure
Claude takes the task_id from the failed test in the plan overview and calls get_test_run_overview, get_test_run_screenshots, and get_test_run_console_logs to build a full picture of why it failed.
Full triage loop
For teams doing post-deploy validation, a complete triage looks like this: you ask Claude to work through every failure in one pass. Claude takes each task_id from the plan overview, pulls the artifacts for each one, and returns a triage summary — distinguishing test issues (wrong selector, timing, stale step) from actual product regressions.
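The loop Claude runs can be sketched like this. The overview data, the artifact call, and the classification heuristic are all stubs invented for illustration — the real tools return richer data and Claude's judgment is not a one-line rule:

```python
# Minimal sketch of the one-pass triage loop: overview -> artifacts -> verdict.
# Field names and the classification heuristic are assumptions.

OVERVIEW = [  # stub for get_test_plan_run_overview: failures listed first
    {"task_id": "t1", "test": "checkout", "reason": "element not found: #buy"},
    {"task_id": "t2", "test": "login", "reason": "HTTP 500 from /api/session"},
]

def get_artifacts(task_id):
    # stub standing in for get_test_run_overview / screenshots / console logs
    return {"console_errors": [] if task_id == "t1" else ["500 Internal Server Error"]}

def classify(failure, artifacts):
    # crude heuristic: server errors in the console suggest a product
    # regression; otherwise suspect a test issue (selector, timing, stale step)
    if any("500" in err for err in artifacts["console_errors"]):
        return "product regression"
    return "test issue"

triage = {f["test"]: classify(f, get_artifacts(f["task_id"])) for f in OVERVIEW}
print(triage)  # {'checkout': 'test issue', 'login': 'product regression'}
```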
Step-by-Step Example
Discover your test plans
Claude calls list_test_plans and returns a list with plan IDs, suite counts, environment names, and the last run time for each.
Trigger a run
Claude calls run_test_plan with the plan ID. The run is queued and Claude confirms it has started.
Check run history
Claude calls get_test_plan_runs and returns a table of run IDs, timestamps, and pass/fail/error counts so you can pick a run to investigate.
Get a full overview of a run
Claude calls get_test_plan_run_overview. Failures are listed first with their failure reasons. Each failure includes a task_id for deeper investigation.
Tips
Use run IDs from URLs
You can pass a run ID directly from a Spur URL. In /test-plans/665/runs/107272, the run ID is 107272. Drop it straight into your prompt: “Give me the overview for run 107272.”
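Extracting the run ID is just a matter of taking the path segment after "runs". A tiny sketch, assuming the URL path shape shown above:

```python
# Pull the run ID out of a Spur URL path of the form
# /test-plans/<plan_id>/runs/<run_id>.

def run_id_from_url(url: str) -> str:
    # the run ID is the path segment immediately after "runs"
    parts = [p for p in url.rstrip("/").split("/") if p]
    return parts[parts.index("runs") + 1]

print(run_id_from_url("/test-plans/665/runs/107272"))  # 107272
```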
Triage failures before the standup
Run a plan at the start of your day and ask Claude to triage every failure before your team standup. You’ll arrive with a prioritized list of what needs attention and what’s just noise.
Combine with create_test and update_test
When get_test_plan_run_overview reveals a test issue (not a product bug), ask Claude to fix it in place: “Update the steps for that test to handle the new modal.” Claude calls update_test to revise the steps without you leaving your editor.