Symptom

After a node’s collection job was killed by a CancellationException thrown from dispatch, the node became permanently un-observable: a subsequent observe() call for the same node id silently skipped setup, no collection job was started, and the UI received no further state updates for that node.

Root cause

The finally block inside DefaultNodeObserver’s inner collection coroutine only logged on exit; it never removed the dead job from the jobs map:

jobs[node.value.id] = observerScope.launch {
    try {
        node.collect(collector)
    } finally {
        logger.w("exited node observing job ${node.details()}")
        // jobs[node.value.id] is NOT removed here, so the stale entry persists
    }
}

When dispatch threw a CancellationException, the inner job was cancelled and exited via finally, but jobs[id] still held the dead Job reference. The idempotency check at the top of observe()

if (!jobs.containsKey(node.value.id)) {  }

found the stale entry and returned without creating a new collection job, so every subsequent observation for that node id was silently dropped.
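The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual DefaultNodeObserver: LeakyObserver, the String id, and the Flow<Int> source are hypothetical stand-ins, and it assumes kotlinx.coroutines is on the classpath. The collector throws a CancellationException, the job dies, and the second observe() call is skipped because the map still holds the dead Job.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for the observer; all names are illustrative.
class LeakyObserver(private val scope: CoroutineScope) {
    val jobs = mutableMapOf<String, Job>()

    // Returns true if a new collection job was started, false if skipped.
    fun observe(id: String, source: Flow<Int>): Boolean {
        if (jobs.containsKey(id)) return false // idempotency check
        jobs[id] = scope.launch {
            try {
                // Simulates dispatch throwing a CancellationException.
                source.collect { throw CancellationException("dispatch failed") }
            } finally {
                // Mirrors the bug: nothing removes jobs[id] on exit.
            }
        }
        return true
    }
}

fun main() = runBlocking {
    val observer = LeakyObserver(CoroutineScope(SupervisorJob() + Dispatchers.Default))
    val source = flow { emit(1) }

    val first = observer.observe("node-1", source)
    val dead = observer.jobs.getValue("node-1")
    dead.join() // wait until the collection job has exited

    // The job is finished, but its map entry blocks the re-observe.
    val second = observer.observe("node-1", source)
    println("first=$first completed=${dead.isCompleted} second=$second")
    // prints: first=true completed=true second=false
}
```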

Fix

val id = node.value.id
jobs[id] = observerScope.launch {
    try {
        node.collect(collector)
    } finally {
        logger.w("exited node observing job ${node.details()}")
        withContext(NonCancellable) {
            mutex.withLock { jobs.remove(id) }
        }
    }
}

Added a regression test, "re-observe after CancellationException restarts dispatch", to DefaultNodeObserverDispatchTest.
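The shape of such a regression check might look like the sketch below. It is not the actual test from DefaultNodeObserverDispatchTest: SelfCleaningObserver, the dispatch lambda parameter, and the Flow<Int> source are hypothetical, and the sketch assumes that observe() performs its idempotency check and map insertion under the same mutex that the finally block uses for removal.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.NonCancellable
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
import kotlinx.coroutines.withContext
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical observer that applies the cleanup from the fix.
class SelfCleaningObserver(private val scope: CoroutineScope) {
    private val mutex = Mutex()
    private val jobs = mutableMapOf<String, Job>()

    // Returns the new job, or null if the id is already being observed.
    suspend fun observe(
        id: String,
        source: Flow<Int>,
        dispatch: suspend (Int) -> Unit,
    ): Job? = mutex.withLock {
        if (jobs.containsKey(id)) return@withLock null
        scope.launch {
            try {
                source.collect { dispatch(it) }
            } finally {
                // NonCancellable lets the cleanup suspend on the mutex
                // even though this coroutine is already cancelled.
                withContext(NonCancellable) {
                    mutex.withLock { jobs.remove(id) }
                }
            }
        }.also { jobs[id] = it } // stored before the lock is released
    }
}

fun main() = runBlocking {
    val observer = SelfCleaningObserver(CoroutineScope(SupervisorJob() + Dispatchers.Default))
    val source = flow { emit(1) }
    val dispatched = AtomicInteger(0)

    // First observation: dispatch throws, which kills the collection job.
    val first = observer.observe("node-1", source) {
        throw CancellationException("dispatch failed")
    }
    first?.join() // join also waits for the finally block to finish

    // Second observation for the same id must start a fresh job.
    val second = observer.observe("node-1", source) { dispatched.incrementAndGet() }
    second?.join()

    check(first != null && second != null) { "re-observe should start a new job" }
    check(dispatched.get() == 1) { "the second job should dispatch the emitted value" }
    println("re-observe restarted dispatch")
}
```

In this sketch the new job is stored in the map while observe() still holds the mutex, so a job that dies immediately cannot run its removal before its own entry exists. Whether the real observe() holds the lock at that point is an assumption here.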

Prevention