Available Tools
| Tool | What it does |
|---|---|
list_test_plans | Lists all test plans with suites, environments, and last run time. Pass plan_id to get the full editable configuration |
create_test_plan | Creates a new test plan grouping suites and environments for coordinated execution |
update_test_plan | Updates an existing test plan — name, description, environments, or suite configuration |
run_test_plan | Triggers a full plan run across all configured suites and environments |
get_test_plan_runs | Returns recent run history for a plan: timestamps, pass/fail counts, run IDs |
get_test_plan_run_overview | Shows every test result in a run — failures first with failure reasons and task_ids for drilling in |
Once you have task_ids from get_test_plan_run_overview, use the standard run analysis tools to investigate individual failures:
- get_test_run_overview — Step-level failure summary for a specific task
- get_test_run_screenshots — Visual state of the browser at each step
- get_test_run_console_logs — JavaScript errors and browser warnings
- get_test_run_network_logs — HTTP requests and responses
Common Workflows
Create a new test plan
When you ask for a new plan by suite and environment names, Claude calls list_suites and list_environments to resolve those names to IDs, shows you a full plan summary, and waits for your approval before calling create_test_plan to save it.
Update a test plan
Claude calls list_test_plans with the plan_id to get the current configuration, then shows you what will change and asks for approval before calling update_test_plan. When updating suites, the full new list replaces the existing one — so Claude copies across all existing suites and adds the new one.
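The replace-not-merge semantics are worth internalizing, since sending only the new suite would silently drop the rest. A minimal sketch of the safe pattern, where the dict shapes and field names are assumptions for illustration, not the real tool contract:

```python
# Illustrative sketch of the replace-not-merge rule for suites when updating
# a test plan. The payload field names ("plan_id", "suites") are assumed.

def build_update_payload(current_plan: dict, new_suite_id: int) -> dict:
    """Copy all existing suite IDs and append the new one, because
    update_test_plan replaces the whole suite list rather than merging."""
    suites = list(current_plan["suites"])      # keep everything already there
    if new_suite_id not in suites:
        suites.append(new_suite_id)            # add only the new suite
    return {"plan_id": current_plan["plan_id"], "suites": suites}

# Example: a plan that already runs suites 12 and 34 gains suite 56.
plan = {"plan_id": 665, "suites": [12, 34]}
print(build_update_payload(plan, new_suite_id=56))
# {'plan_id': 665, 'suites': [12, 34, 56]}
```

Passing `{"suites": [56]}` instead would leave the plan running only suite 56 — exactly the mistake the copy-then-append step avoids.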
Trigger a regression run and wait for results
Claude calls run_test_plan with the plan ID (discovered via list_test_plans) and then polls get_test_plan_runs to monitor progress.
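The run-then-poll pattern can be sketched as below. The two functions stand in for the MCP tool calls and are stubbed with canned responses so the sketch is self-contained; the response shapes (`run_id`, `status`, pass/fail counts) are assumptions, not the documented schema:

```python
import time

def run_test_plan(plan_id):
    return {"run_id": 107272}                  # stub: queue a run, get its ID

_polls = iter([
    {"run_id": 107272, "status": "running"},
    {"run_id": 107272, "status": "completed", "passed": 41, "failed": 2},
])

def get_test_plan_runs(plan_id):
    return [next(_polls)]                      # stub: recent runs, newest first

def run_and_wait(plan_id, poll_seconds=0):
    run_id = run_test_plan(plan_id)["run_id"]
    while True:
        latest = get_test_plan_runs(plan_id)[0]
        # stop once our run has settled into a terminal state
        if latest["run_id"] == run_id and latest["status"] != "running":
            return latest
        time.sleep(poll_seconds)               # back off between polls

result = run_and_wait(665)
print(result["status"])  # completed
```

In practice Claude handles this loop for you; the sketch only shows why a non-zero poll interval matters for long plan runs.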
Triage all failures from the latest run
Claude calls list_test_plans to find the plan, get_test_plan_runs to find the most recent run ID, then get_test_plan_run_overview to list every failure with its reason.
Deep-dive into a specific failure
Claude takes the task_id from the failed test in the plan overview and calls get_test_run_overview, get_test_run_screenshots, and get_test_run_console_logs to build a full picture of why it failed.
Full triage loop
For teams doing post-deploy validation, a complete triage looks like this: you ask Claude to work through every failure in one pass. Claude takes each task_id from the plan overview, pulls the artifacts for each one, and returns a triage summary — distinguishing test issues (wrong selector, timing, stale step) from actual product regressions.
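The loop Claude runs can be sketched like this. The overview data, the artifact call, and the classification heuristic are all stubs invented for illustration — the real tools return richer data and Claude's judgment is not a one-line rule:

```python
# Minimal sketch of the one-pass triage loop: overview -> artifacts -> verdict.
# Field names and the classification heuristic are assumptions.

OVERVIEW = [  # stub for get_test_plan_run_overview: failures listed first
    {"task_id": "t1", "test": "checkout", "reason": "element not found: #buy"},
    {"task_id": "t2", "test": "login", "reason": "HTTP 500 from /api/session"},
]

def get_artifacts(task_id):
    # stub standing in for get_test_run_overview / screenshots / console logs
    return {"console_errors": [] if task_id == "t1" else ["500 Internal Server Error"]}

def classify(failure, artifacts):
    # crude heuristic: server errors in the console suggest a product
    # regression; otherwise suspect a test issue (selector, timing, stale step)
    if any("500" in err for err in artifacts["console_errors"]):
        return "product regression"
    return "test issue"

triage = {f["test"]: classify(f, get_artifacts(f["task_id"])) for f in OVERVIEW}
print(triage)  # {'checkout': 'test issue', 'login': 'product regression'}
```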
Step-by-Step Example
Discover your test plans
Claude calls list_test_plans and returns a list with plan IDs, suite counts, environment names, and the last run time for each.
Trigger a run
Claude calls run_test_plan with the plan ID. The run is queued and Claude confirms it has started.
Check run history
Claude calls get_test_plan_runs and returns a table of run IDs, timestamps, and pass/fail/error counts so you can pick a run to investigate.
Get a full overview of a run
Claude calls get_test_plan_run_overview. Failures are listed first with their failure reasons. Each failure includes a task_id for deeper investigation.
Tips
Use run IDs from URLs
You can pass a run ID directly from a Spur URL. In /test-plans/665/runs/107272, the run ID is 107272. Drop it straight into your prompt: “Give me the overview for run 107272.”
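Extracting the run ID is just a matter of taking the path segment after "runs". A tiny sketch, assuming the URL path shape shown above:

```python
# Pull the run ID out of a Spur URL path of the form
# /test-plans/<plan_id>/runs/<run_id>.

def run_id_from_url(url: str) -> str:
    # the run ID is the path segment immediately after "runs"
    parts = [p for p in url.rstrip("/").split("/") if p]
    return parts[parts.index("runs") + 1]

print(run_id_from_url("/test-plans/665/runs/107272"))  # 107272
```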
Triage failures before the standup
Run a plan at the start of your day and ask Claude to triage every failure before your team standup. You’ll arrive with a prioritized list of what needs attention and what’s just noise.
Combine with create_test and update_test
When get_test_plan_run_overview reveals a test issue (not a product bug), ask Claude to fix it in place: “Update the steps for that test to handle the new modal.” Claude calls update_test to revise the steps without you leaving your editor.