Godot — snapshot-test your game flow
Script a walkthrough of your game, mark the moments that matter, and let Dungbeetle fail CI when the gameplay changes — with a diff you can read. This guide runs the example game that ships in the CLI repository: title menu → collect three squares → game over.
By the end you'll have: a green headless run in ~3 seconds, a deliberate gameplay change caught as a two-line semantic diff, and a flake harness proving the run is deterministic.
Screencast — the whole loop, live: doctor → baseline → green ci → a collectible moves 40px → ci fails with the gameplay diff → revert → flake proves 5/5 deterministic runs.
Flow video — the terminal session replayed, then the failing visual-mode run's HTML report: the gameplay diff plus before/after/diff screenshots at every marker.
Prerequisites
- A Godot 4.x binary (the editor download is fine).
- The Dungbeetle CLI repository (the example lives in examples/game/), or your own project with the adapter installed.
Point Dungbeetle at your Godot binary once:
export DUNGBEETLE_GODOT_PATH="/Applications/Godot.app/Contents/MacOS/Godot"
1. Look at the pieces
Three files make a game target:
The config (examples/game/dungbeetle.config.json) — a game target pointing at a Godot project and a walkthrough:
{
  "kind": "game",
  "name": "godot-demo",
  "engine": "godot",
  "project": "examples/game/godot",
  "walkthrough": "examples/game/walkthrough.json"
}
The walkthrough (examples/game/walkthrough.json) scripts the flow, with markers at menu, mid-run, and game over:
{
  "steps": [
    { "wait": 10 },
    { "screenshot": "menu" },
    { "input": "ui_accept" },
    { "waitFor": "Main:started == true", "timeoutTicks": 60 },
    { "input": "move_right", "mode": "down" },
    { "waitFor": "Main:game_over == true", "timeoutTicks": 600 },
    { "input": "move_right", "mode": "up" },
    { "wait": 10 },
    { "screenshot": "game-over" },
    { "assert": "Player:collected >= 3" }
  ]
}
The game's opt-in (examples/game/godot/player.gd), where nodes join the dungbeetle group and expose the fields that matter:
func _ready() -> void:
    add_to_group("dungbeetle")

func get_dungbeetle_state() -> Dictionary:
    return {"collected": collected}
2. Validate the setup
npm run game:doctor
PASS game-walkthrough:godot-demo - Walkthrough is valid …
PASS game-adapter:godot-demo - Adapter 0.1.0 installed (protocol v1–v1, CLI speaks v1).
PASS game-engine:godot-demo - Engine binary … reports version 4.7.stable…
PASS game-determinism:godot-demo - Deterministic run enforced: seed 0, 60 physics ticks/s …
3. Capture the baseline
npm run game:update
The run is fully headless (no window, no Xvfb) because the default semantic mode snapshots game state, not pixels. It takes about 3 seconds: the engine boots, the adapter holds the scene behind a deterministic gate, the walkthrough drives it tick by tick, and each marker records an allowlisted scene-tree state.
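To make the tick-by-tick idea concrete, here is a minimal Python sketch of a walkthrough interpreter driving a toy stand-in for the game. It is not Dungbeetle's implementation: ToyGame, its movement rules (start position, speed), holds, and run_walkthrough are all hypothetical, and the condition parser only handles the two operators used in the example walkthrough. Tick counts from this toy will not match the real example game.

```python
import json
import operator

OPS = {"==": operator.eq, ">=": operator.ge}


class ToyGame:
    """Hypothetical stand-in for the example game: menu, three pickups, game over."""

    ITEMS = (220.0, 320.0, 420.0)  # collectible x positions, as in main.gd
    SPEED = 2.0                    # assumed pixels per tick

    def __init__(self):
        self.tick = 0
        self.held = set()
        self.main = {"started": False, "game_over": False}
        self.player = {"x": 50.0, "collected": 0}  # assumed start position

    def press(self, action, mode="tap"):
        if mode == "down":
            self.held.add(action)
        elif mode == "up":
            self.held.discard(action)
        elif action == "ui_accept":
            self.main["started"] = True

    def step(self):
        """Advance exactly one fixed tick; no wall-clock time is involved."""
        self.tick += 1
        running = self.main["started"] and not self.main["game_over"]
        if running and "move_right" in self.held:
            self.player["x"] += self.SPEED
            self.player["collected"] = sum(self.player["x"] >= i for i in self.ITEMS)
            self.main["game_over"] = self.player["collected"] == len(self.ITEMS)

    def state(self):
        """Expose only allowlisted fields, in the spirit of get_dungbeetle_state()."""
        return {"Main": dict(self.main), "Player": dict(self.player)}


def holds(game, condition):
    """Evaluate a condition such as 'Main:started == true' against the state."""
    target, op, raw = condition.split()
    node, field = target.split(":")
    return OPS[op](game.state()[node][field], json.loads(raw))


def run_walkthrough(game, steps):
    """Drive the game step by step and record the state at each marker."""
    markers = {}
    for s in steps:
        if "wait" in s:
            for _ in range(s["wait"]):
                game.step()
        elif "input" in s:
            game.press(s["input"], s.get("mode", "tap"))
        elif "waitFor" in s:
            waited = 0
            while not holds(game, s["waitFor"]):
                if waited == s["timeoutTicks"]:
                    raise TimeoutError(s["waitFor"])
                game.step()
                waited += 1
        elif "screenshot" in s:  # a marker: in this sketch it records state only
            markers[s["screenshot"]] = {"tick": game.tick, "state": game.state()}
        elif "assert" in s:
            assert holds(game, s["assert"]), s["assert"]
    return markers


STEPS = [
    {"wait": 10},
    {"screenshot": "menu"},
    {"input": "ui_accept"},
    {"waitFor": "Main:started == true", "timeoutTicks": 60},
    {"input": "move_right", "mode": "down"},
    {"waitFor": "Main:game_over == true", "timeoutTicks": 600},
    {"input": "move_right", "mode": "up"},
    {"wait": 10},
    {"screenshot": "game-over"},
    {"assert": "Player:collected >= 3"},
]

markers = run_walkthrough(ToyGame(), STEPS)
```

Because the toy advances only on explicit ticks, running the same steps against a fresh ToyGame yields identical markers every time, which is the property a baseline comparison relies on.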
4. Re-run and compare — green
npm run game:ci
✅ PASSED game:godot-demo
5. Break the game, read the diff
Move the third collectible 40px farther, in examples/game/godot/main.gd:
-const ITEM_POSITIONS := [220.0, 320.0, 420.0]
+const ITEM_POSITIONS := [220.0, 320.0, 460.0]
npm run game:ci
❌ FAILED game:godot-demo
~ $.markers.game-over.state.Player.position[0]: 398 → 438
~ $.markers.game-over.tick: 174 → 194
That's the point of semantic-first: the player travelled farther and the run ended 20 ticks later. A pixel tool would show you a red smear; this reads as a gameplay change. Revert the change and the run is green again.
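A diff of this shape can be produced by walking two nested snapshots and reporting each changed leaf by its path. The following Python sketch illustrates the idea only; semantic_diff, the ABSENT placeholder, and the snapshot layout (including the y coordinate 300 and the collected field) are assumptions, not Dungbeetle's actual format or code.

```python
class _Absent:
    """Placeholder for a key present on only one side (hypothetical rendering)."""

    def __repr__(self):
        return "(absent)"


ABSENT = _Absent()


def semantic_diff(before, after, path="$"):
    """Yield one '~ path: old → new' line per changed leaf value."""
    if isinstance(before, dict) and isinstance(after, dict):
        for key in sorted(before.keys() | after.keys()):
            yield from semantic_diff(
                before.get(key, ABSENT), after.get(key, ABSENT), f"{path}.{key}"
            )
    elif (
        isinstance(before, list)
        and isinstance(after, list)
        and len(before) == len(after)
    ):
        for i, (b, a) in enumerate(zip(before, after)):
            yield from semantic_diff(b, a, f"{path}[{i}]")
    elif before != after:
        yield f"~ {path}: {before!r} → {after!r}"


# Assumed snapshot layout, filled with the values from the failing run above.
baseline = {"markers": {"game-over": {
    "tick": 174,
    "state": {"Player": {"collected": 3, "position": [398, 300]}},
}}}
current = {"markers": {"game-over": {
    "tick": 194,
    "state": {"Player": {"collected": 3, "position": [438, 300]}},
}}}

for line in semantic_diff(baseline, current):
    print(line)
# ~ $.markers.game-over.state.Player.position[0]: 398 → 438
# ~ $.markers.game-over.tick: 174 → 194
```

Unchanged fields such as collected produce no output, which is why a 40px level edit reads as two lines rather than a full state dump.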
6. Prove it's deterministic
npm run game:flake
✅ game:godot-demo — 5/5 runs identical
dungbeetle flake --repeat N captures the target N times with no baseline and reports any run-to-run divergence per marker. It is the canary to wire into CI before you trust a new walkthrough.
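The per-marker comparison can be pictured with a short Python sketch. This is an illustration under assumptions, not the CLI's logic: flake_report and summarize are hypothetical names, each capture is modelled as a plain dict of marker name to recorded state, and every run is compared against the first one rather than against a majority.

```python
def flake_report(captures):
    """Map each diverging marker to the indices of runs that differ from run 0.

    An empty result means every run recorded identical state at every marker.
    """
    reference = captures[0]
    report = {}
    for index, run in enumerate(captures[1:], start=1):
        for marker in sorted(reference.keys() | run.keys()):
            if reference.get(marker) != run.get(marker):
                report.setdefault(marker, []).append(index)
    return report


def summarize(name, captures):
    """Render a one-line verdict for N captures of the same target."""
    report = flake_report(captures)
    total = len(captures)
    if not report:
        return f"{name}: {total}/{total} runs identical"
    details = ", ".join(f"{m} (runs {runs})" for m, runs in report.items())
    return f"{name}: divergence at {details}"


# Five identical captures, then a copy where run 3 ended one tick late.
stable = [{"menu": {"tick": 10}, "game-over": {"tick": 174}} for _ in range(5)]
flaky = [dict(run) for run in stable]
flaky[3] = {"menu": {"tick": 10}, "game-over": {"tick": 175}}

print(summarize("game:godot-demo", stable))
print(summarize("game:godot-demo", flaky))
```

Reporting by marker rather than by whole run points at where in the flow nondeterminism first appears, which is usually the useful clue when a walkthrough flakes.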
7. Add screenshots (optional)
Set "mode": "visual" on the target and re-baseline. Each marker now also captures a screenshot (locally a window flashes briefly; on Linux CI use xvfb-run). Visual changes are advisory by default: they are reported next to the semantic diff and rendered as per-marker side-by-side / onion-skin comparisons in the cloud review UI, but they only gate the run if you set "screenshotMode": "strict".
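The gating rule described here can be sketched in a few lines of Python. The gate function is hypothetical, and "advisory" as the name of the default screenshotMode value is an assumption; the guide only names "strict".

```python
def gate(semantic_changes, visual_changes, screenshot_mode="advisory"):
    """Return (passed, advisory_notes) for one run.

    Semantic changes always fail the run. Visual changes fail it only in
    strict mode; otherwise they are passed through as advisory notes.
    """
    strict = screenshot_mode == "strict"
    failures = list(semantic_changes)
    if strict:
        failures += list(visual_changes)
    notes = [] if strict else list(visual_changes)
    return (not failures, notes)


# A pixel-only change at the menu marker: reported, but the run stays green.
print(gate([], ["menu"]))
# The same change with strict screenshots gates the run.
print(gate([], ["menu"], screenshot_mode="strict"))
```

Keeping screenshots advisory lets rendering noise such as font or driver differences show up in review without turning CI red.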
8. Review in the cloud
Push a run to a Dungbeetle server and the walkthrough's semantic gameplay diff (marker states, positions, the pickup tick) is reviewable in a browser. In visual mode each marker (menu, mid-run, game-over) also carries a screenshot, so the review pairs the state diff with per-marker before/after/diff images. The images are advisory by default; semantic state stays the gate.
Screencast: sign in → open the failing run → the marker-state diff plus per-marker screenshot comparisons → approve and promote the new baselines.
Next steps
- The full config, step, and determinism reference: game snapshots.
- Push runs to the cloud for per-marker visual review — see cloud.
- Installing the adapter into your own project: the adapter.