Symptom

Two failure modes were observed on the Verify Agent Ghost workflow runs for PR #216:

  1. Two identical verify runs were queued for the same head SHA. Run 25290400159 (action opened) was created at 20:47:55Z and run 25290400368 (action labeled) at 20:47:56Z, both against commit b02e73e5. The concurrency group verify-agent-ghost serialised them, but both still consumed runner time.
  2. The Install built packages step exited with "sudo-rs: timed out" exactly 300s after the step started. The log held nothing else useful, only the hard stop. Because this step runs after ~30 min of WASM / ProGuard / shadowJar builds, each failed run wastes a large amount of runner time.

Root cause

  1. Duplicate runs. When a PR is created with a label already attached (the dev agent’s normal flow uses gh pr create --label needs-qa-verify), GitHub fires both opened and labeled webhook events. The previous trigger config (types: [opened, synchronize, reopened, labeled]) admitted both, and the if condition was true in both cases: label present AND (action != labeled OR the label just added is needs-qa-verify). The opened event passes because its action is not labeled, and the labeled event passes because the added label is needs-qa-verify. Two distinct workflow runs therefore fired against the identical PR head SHA.
  2. sudo timeout. The ghost runner’s ben user does not have a NOPASSWD entry covering apt-get, systemctl, rm /srv/krill/..., or journalctl -u krill. sudo apt-get install -y … therefore prompted for a password on a tty, received none from the non-interactive GitHub Actions shell, and waited until sudo-rs’s default passwd_timeout of 300s elapsed before exiting.
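
The trigger described in item 1 can be sketched as the workflow fragment below. It is a reconstruction from the description above, not a copy of the actual workflow file: the pull_request event, the job name verify, the runs-on labels, and the exact spelling of the if expression are all assumptions.

```yaml
# Hypothetical reconstruction of the previous trigger; not the real workflow file.
name: Verify Agent Ghost

on:
  pull_request:                # event name is an assumption
    # Both `opened` and `labeled` are admitted, so a PR created with
    # --label needs-qa-verify produces two events for the same head SHA.
    types: [opened, synchronize, reopened, labeled]

concurrency:
  # Serialises the two runs, but does not drop the duplicate.
  group: verify-agent-ghost

jobs:
  verify:                      # job name is an assumption
    runs-on: [self-hosted]     # runner labels are an assumption
    # True for `opened` (action != 'labeled') and true for `labeled`
    # (the added label is needs-qa-verify), hence two runs.
    if: >-
      contains(github.event.pull_request.labels.*.name, 'needs-qa-verify') &&
      (github.event.action != 'labeled' ||
      github.event.label.name == 'needs-qa-verify')
    steps:
      - run: echo "verify steps elided"
```

Read this way, neither the types filter nor the if condition distinguishes the two events, which is why both runs were created.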

Fix

The host-side fix (the actual NOPASSWD entry on the ghost runner) lives outside this repo. The expected /etc/sudoers.d/krill-ghost-runner contents:

ben ALL=(ALL) NOPASSWD: /usr/bin/apt-get, /usr/bin/systemctl, /usr/bin/journalctl, /usr/bin/rm /srv/krill/data/*.db, /usr/bin/dpkg
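
Once the entry is installed, one way to confirm it from the workflow is a step that runs sudo with -n (non-interactive), which makes sudo fail immediately instead of waiting for a password. The step below is a sketch under assumptions, not part of the existing workflow: the step name and the use of apt-get --version as a harmless probe are illustrative.

```yaml
# Hypothetical preflight step; the name and probe command are assumptions.
- name: Check passwordless sudo
  run: |
    # -n: never prompt; fail at once if a password would be required.
    if ! sudo -n /usr/bin/apt-get --version >/dev/null; then
      echo "::error::ben lacks NOPASSWD for apt-get on this runner"
      exit 1
    fi
```

Placed before the long build steps, a check like this would surface a missing sudoers entry in seconds rather than after the ~30 min build and a 300s wait.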

Prevention