Symptom

A server (ghost.local) went offline; the desktop client correctly showed it as disconnected. When the user opened the server’s settings and deleted the node, it immediately re-appeared and kept re-appearing. Logs showed an endless cycle: begin connection attempt → server unreachable, skipping connection → ClientNodeManager: Observing node ... Server ghost WARN → begin connection attempt.

Root cause

Two reconnect engines resurrect a deleted server:

alarm() calls update(), whose structural-insert path used getOrPut(node.id) { MutableStateFlow(node).also { observeNode(it) } }. For an id that had just been deleted, and was therefore absent from the map, this re-created the node and re-armed its observer. The observer immediately re-dispatched into the connector, which restarted the connection attempt and closed the loop. delete() removed the node from the map but had no way to say “and keep it gone,” so any in-flight reconnect won the race.
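The resurrection can be reproduced in a few lines. The sketch below is a simplified model, not the project’s code: Node and NaiveNodeManager are hypothetical stand-ins, a plain map replaces the MutableStateFlow entries, and a counter stands in for observeNode().

```kotlin
// Hypothetical, simplified model of the bug; not the real ClientNodeManager.
data class Node(val id: String, val name: String)

class NaiveNodeManager {
    private val nodes = mutableMapOf<String, Node>()

    // Counts how many times an observer would have been armed.
    var observersArmed = 0
        private set

    // Structural insert: creates the entry (and arms an observer) whenever the id is absent.
    fun update(node: Node) {
        nodes.getOrPut(node.id) {
            observersArmed++ // stands in for observeNode(...)
            node
        }
    }

    // Removes the entry but leaves no record that the removal was deliberate.
    fun delete(id: String) {
        nodes.remove(id)
    }

    fun contains(id: String): Boolean = id in nodes
}

fun main() {
    val manager = NaiveNodeManager()
    val ghost = Node("ghost-id", "ghost")

    manager.update(ghost)     // initial discovery
    manager.delete(ghost.id)  // user deletes the server
    manager.update(ghost)     // late update from an in-flight reconnect

    println(manager.contains(ghost.id)) // true: the deleted node is back
    println(manager.observersArmed)     // 2: the observer was armed again
}
```

Because getOrPut cannot distinguish “never seen” from “just deleted,” the second update() behaves exactly like a first discovery.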

Fix

Tombstone deliberately deleted servers. ClientNodeManager now holds a deletedServers set. delete() synchronously adds the server’s install id to it; update()’s structural-insert path drops any update whose id is tombstoned, so alarm() or a late SSE event can’t re-create the node; allowServer(id) clears the tombstone and isServerDeleted(id) exposes it.

The tombstone is cleared only on an unambiguous “server is back” signal: ClientBeaconWireHandler calls allowServer(wire.installId) when a live peer beacon arrives, and ClientServerConnector clears it on an explicit manual (form) re-add.

EventClient breaks its retry loop when the server is tombstoned, and ClientServerConnector.performConnectionSanityCheck refuses tombstoned servers, so the client actually stops trying to reconnect.

The installId() seam in ClientNodeManager was parameterised (constructor default installIdProvider = installId) so that delete() is unit-testable without reading the real ~/.krill/install_id.
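The tombstone behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: ServerNode, TombstoningNodeManager and connectWithRetry are hypothetical names, a single id stands in for both the node id and the install id, and a simple lock makes the tombstone check and the insert atomic with respect to delete(). Only deletedServers, delete(), update(), allowServer() and isServerDeleted() mirror names from the fix.

```kotlin
// Hypothetical sketch of the tombstone idea; the real classes differ.
data class ServerNode(val id: String, val name: String)

class TombstoningNodeManager {
    private val lock = Any()
    private val nodes = mutableMapOf<String, ServerNode>()
    private val deletedServers = mutableSetOf<String>()

    // Records the tombstone and removes the node under one lock,
    // so a racing update() cannot slip in between the two steps.
    fun delete(id: String) {
        synchronized(lock) {
            deletedServers.add(id)
            nodes.remove(id)
        }
    }

    // Returns false and drops the update when the id is tombstoned.
    fun update(node: ServerNode): Boolean = synchronized(lock) {
        if (node.id in deletedServers) {
            false
        } else {
            nodes[node.id] = node
            true
        }
    }

    // Clears the tombstone, e.g. on a live beacon or a manual re-add.
    fun allowServer(id: String) {
        synchronized(lock) { deletedServers.remove(id) }
    }

    fun isServerDeleted(id: String): Boolean = synchronized(lock) { id in deletedServers }

    fun contains(id: String): Boolean = synchronized(lock) { id in nodes }
}

// Hypothetical retry loop: stops as soon as the server is tombstoned.
// Returns the number of connection attempts actually made.
fun connectWithRetry(
    id: String,
    manager: TombstoningNodeManager,
    maxAttempts: Int,
    attempt: () -> Boolean
): Int {
    var attempts = 0
    while (attempts < maxAttempts) {
        if (manager.isServerDeleted(id)) break
        attempts++
        if (attempt()) break
    }
    return attempts
}

fun main() {
    val manager = TombstoningNodeManager()
    val ghost = ServerNode("ghost-id", "ghost")

    manager.update(ghost)
    manager.delete(ghost.id)

    println(manager.update(ghost))       // false: late update is dropped
    println(manager.contains(ghost.id))  // false: node stays deleted
    println(connectWithRetry(ghost.id, manager, 5) { false }) // 0: no attempts made

    manager.allowServer(ghost.id)        // explicit "server is back" signal
    println(manager.update(ghost))       // true: node may be re-created
}
```

Keeping the tombstone separate from the node map is what lets update() tell “never seen” apart from “deliberately removed,” and routing every clear through allowServer() keeps the set of events that can revive a server small and explicit.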

Prevention