A client launched with no stored servers sits on the ClientScreen showing “No Nodes Yet” and the avatar bubble “Searching for Krill servers…”. A server that comes online after the client never gets discovered — node download and SSE connection never start. Restarting the app makes the same server appear immediately.
Peer discovery was one-shot. Each side (ClientBeaconSupervisor,
ServerBeaconSupervisor) fired a single startup beacon and then only listened.
An idle client could only learn about a later-starting server from that
server’s single best-effort UDP multicast startup beacon; one dropped packet
(or IGMP-membership / interface timing on the just-joined listener) meant
permanent silence, because neither side ever re-announced. The connection
chain (ClientBeaconWireHandler → ClientServerConnector.connectWire → cert →
node download → SSE) was intact — it simply never got triggered. Restart
“fixed” it only because it made the client emit a fresh port=0 probe that the
now-listening server replied to (ServerBeaconWireHandler answers any new
peer’s probe). The BeaconSupervisor / Multicast KDocs already claimed
beacons “fire periodically” — the implementation had drifted to one-shot.
ClientBeaconSupervisor now runs a reprobeJob alongside the listener: while
AppLifecycle.isForeground is true it re-broadcasts the presence probe every
REPROBE_INTERVAL_MS (5s) via the existing BeaconSender (itself rate-limited,
so the cadence is an upper bound). Because the server replies to a new peer’s
probe, a server coming online after the client is now discovered within one
interval — no restart. Backgrounded apps skip the re-probe to respect battery.
Discovery over best-effort UDP multicast must never depend on a single packet
arriving — periodic re-announce/re-probe is mandatory (this is why mDNS/SSDP
poll). The regression test (ClientBeaconSupervisorReprobeTest) advances
virtual time and asserts the supervisor keeps emitting beacons after the
startup one, and that backgrounding pauses them. When an interface KDoc says a
behavior is “periodic,” a test should pin that contract so an implementation
can’t silently regress it to one-shot.