The most-asked-for v3 feature wasn't a new tool category — it was a way for the AI agent to actually see what happened after running its code. v3 ships nine pie_* tools that drive Play-In-Editor end-to-end, and five test-authoring tools that scaffold and run UE automation specs. Together they form what we've been calling the "closed loop" — agent writes code, agent hits Play, agent observes, agent fixes itself.
This tutorial builds a self-correcting playtester from scratch in about 30 minutes. By the end you'll have a chat workflow that:
- Scaffolds a new automation spec under Source/<YourProject>Tests/.
- Builds the project.
- Runs the spec inside PIE.
- Captures the screenshot + last runtime error if the spec fails.
- Edits the player Blueprint to fix the failure.
- Repeats until the spec passes — or gives up cleanly with a written-up failure report.
Prereqs: Unreal Engine 5.7, Unreal MCP Server v3 installed, Claude Code (or any MCP client) connected to http://localhost:13579/mcp.
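If Claude Code isn't wired up yet, one way to register the endpoint with recent Claude Code builds (the server name unreal is arbitrary; verify afterwards with claude mcp list):

claude mcp add --transport http unreal http://localhost:13579/mcp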
Step 1: Verify the new tools are exposed
In your AI client, ask:
"List the tools in the unreal MCP server whose name starts with
pie_orcreate_automation_."
You should see all nine PIE tools and the five test-authoring tools. If you don't, open Project Settings → Plugins → Unreal MCP Server and confirm bEnablePIETools = true and bEnableTestAuthoringTools = true. Both default to true and ship in the Gameplay preset.
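That prompt resolves to a plain MCP tools/list request under the hood, so you can sanity-check the surface yourself (response truncated):

{ "method": "tools/list" }
// → { "tools": [ { "name": "pie_start", ... }, { "name": "pie_screenshot", ... },
//                { "name": "create_automation_spec", ... }, ... ] }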
Step 2: Scaffold a spec
Ask the agent:
"Create a new automation spec under
Source/MyGameTests/calledJumpHeightSpecthat places the FirstPersonCharacter at the origin, simulates a Spacebar press, waits 0.5s, and asserts the character's Z is greater than 90 units. Also list the spec after creating it so I can see the file path."
The agent will issue:
{ "method": "tools/call", "params": {
"name": "create_automation_spec",
"arguments": {
"target_dir": "MyGameTests",
"name": "JumpHeight",
"describe": "FirstPersonCharacter jump",
"it_blocks": [
{ "name": "rises above 90 units after Space press", "code": "<auto-generated>" }
]
}
} }
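For reference, the scaffolded file uses Unreal's standard spec macros. A minimal sketch of what the generated JumpHeightSpec.cpp might look like (illustrative; the exact flags and generated body depend on the tool):

#include "Misc/AutomationTest.h"

// Registers under "MyGame.JumpHeight", so the It() name below yields the
// full test path referenced later in the tutorial.
DEFINE_SPEC(FJumpHeightSpec, "MyGame.JumpHeight",
    EAutomationTestFlags::EditorContext | EAutomationTestFlags::ProductFilter)

void FJumpHeightSpec::Define()
{
    Describe("FirstPersonCharacter jump", [this]()
    {
        It("rises above 90 units after Space press", [this]()
        {
            // <auto-generated>: spawn FirstPersonCharacter at the origin,
            // inject the Space press, advance 0.5s, assert Z > 90.
        });
    });
}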
It then calls list_automation_specs { filter: "JumpHeight" } to confirm the test isn't yet registered (specs only register after a build).
Step 3: Build the project
The agent calls your project's build pipeline (this is project-specific — many users wire the build to a package_build tool in their setup, or shell out via a separate MCP). For Blueprint-only changes you can skip the build; for C++ specs you need a real compile.
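For example, if your setup exposes a build tool, the call is just another tools/call (the name package_build and its arguments here are whatever your own wiring defines, not part of v3):

{ "method": "tools/call", "params": {
  "name": "package_build",
  "arguments": { "target": "MyGameEditor", "configuration": "Development" }
} }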
After the build, list_automation_specs { filter: "JumpHeight" } will return the registered test name (typically MyGame.JumpHeight.rises above 90 units after Space press).
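That check and a plausible response (response shape illustrative):

{ "method": "tools/call", "params": { "name": "list_automation_specs", "arguments": { "filter": "JumpHeight" } } }
// → { "specs": [ "MyGame.JumpHeight.rises above 90 units after Space press" ] }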
Step 4: Run the spec end-to-end
Now the closed loop. The agent runs:
{ "method": "tools/call", "params": {
"name": "run_automation_specs",
"arguments": { "filter": "JumpHeight" }
} }
This is a synchronous runner: it iterates each matching spec, accumulates pass/fail/duration results, and persists them into a session-scoped last-report buffer. If it passes, you're done. If not:
{ "method": "tools/call", "params": { "name": "get_last_test_report" } }
The report comes back with the failed assertion message. Most "jump doesn't work" failures fall into one of three buckets:
- The character's JumpAction Enhanced Input mapping isn't wired.
- The player Blueprint's OnJump event hits an Accessed None error (e.g., SkeletalMesh is null at startup).
- Gravity / character movement settings put the jump apex below 90 units.
The agent now needs to see which one. Two paths:
Path A — The runtime error
{ "method": "tools/call", "params": { "name": "get_last_runtime_error" } }
If there was a BP exception during the jump, this returns the most recent OnScriptException payload — class, function, node graph path, error string. The agent uses this to navigate straight to the broken node.
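For the Accessed None case from the list above, the payload might look like this (field names illustrative; the guaranteed contents are the class, function, node graph path, and error string):

{
  "class": "BP_FirstPersonCharacter_C",
  "function": "OnJump",
  "node_path": "BP_FirstPersonCharacter/EventGraph/OnJump",
  "error": "Accessed None trying to read property SkeletalMesh"
}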
Path B — The screenshot
If there's no runtime error but the assertion still failed, the spec likely succeeded mechanically (input got pressed) but the character didn't actually move. The agent calls:
{ "method": "tools/call", "params": { "name": "pie_start", "arguments": { "mode": "Selected", "num_players": 1 } } }
{ "method": "tools/call", "params": { "name": "pie_send_input", "arguments": { "key": "SpaceBar", "event": "Pressed" } } }
{ "method": "tools/call", "params": { "name": "pie_step_frame" } }
{ "method": "tools/call", "params": { "name": "pie_step_frame" } }
{ "method": "tools/call", "params": { "name": "pie_screenshot" } }
The screenshot comes back as a base64 PNG. A multimodal agent can look at it directly and reason: "The character is still standing, no animation, no jump arc — this is an input-mapping failure, not gravity."
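On the wire this is standard MCP image content, so any protocol-aware client can render it (data truncated):

{ "content": [ {
  "type": "image",
  "data": "iVBORw0KGgoAAAANSUhEUgAA...",
  "mimeType": "image/png"
} ] }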
Step 5: Apply the fix
Now the agent issues tools/call against any of the 51 Blueprint tools to fix the mapping or wire the missing node:
{ "method": "tools/call", "params": {
"name": "add_blueprint_input_action_event",
"arguments": {
"blueprint_path": "/Game/FirstPerson/BP_FirstPersonCharacter",
"action": "/Game/FirstPerson/Input/Actions/IA_Jump"
}
} }
{ "method": "tools/call", "params": {
"name": "compile_blueprint",
"arguments": { "blueprint_path": "/Game/FirstPerson/BP_FirstPersonCharacter" }
} }
Then back to Step 4 — re-run the spec. Loop until pass.
Step 6: Wrap it in a transaction and source-control submit
Once the spec passes, you want the whole sequence of edits to land as one logical change in source control. Use multi-call transactions and the source-control tools:
{ "method": "transactions/begin", "params": { "label": "Wire jump input + spec" } }
{ "method": "tools/call", "params": { "name": "sc_check_out", "arguments": { "files": [...] } } }
// ... all the edits ...
{ "method": "transactions/commit" }
{ "method": "tools/call", "params": {
"name": "sc_submit",
"arguments": {
"files": [...],
"description": "agent-fix: wire IA_Jump on BP_FirstPersonCharacter; JumpHeightSpec passes"
}
} }
One Ctrl+Z reverts every edit. One source-control commit lands the whole change with a description that names what the agent did.
What this changes
The closed loop is the difference between AI as a code generator and AI as a colleague. A code generator produces text and waits for you to verify it. A colleague writes the code, runs it, sees what happened, and tells you when it's done.
That's been the missing piece in MCP servers for game engines. v3 ships it. Try the loop, then check out the PIE Control and Test Authoring docs for the full surface area.
→ Get Unreal MCP Server v3 — launch pricing for 5 days
→ v3 release notes
→ Production-safe agents devlog