Source-driven nodes (Triggers, Calculations, TaskList, etc.) occasionally fired twice on a single root-cause change. The duplicates were not stable enough to catch in tests; they showed up as e.g. an SMTP email arriving twice from a single threshold cross, or a TaskList reset cascading where it should not have.
Underneath the surface, two unrelated complaints were both true:
“update() should be CRUD, not invocation.” Every write to the DB went
through ServerNodeManager.update(node), which called node.type.emit(node)
— the KrillApp observer hook — re-entering the processor’s post(). CRUD
and processor invocation were entangled in one function.
When nodeAction = RESET triggered
a TaskList reset, the TaskList still fanned out to its own subscribers,
potentially waking downstream webhooks / SMTP nodes that the user did not
intend to fire on a reset.
Phase 2 (unify-source-verb-wiring) shipped the source-owned-verb dispatch
model and retired NodeState.RESET as a verb carrier — but it left the EXECUTE
half of the dispatch riding the very anti-pattern it was supposed to retire:
ServerNodeManager.update() called node.type.emit(node) on every
persistence write. CRUD and wake were one function; a processor’s own
persistence write re-entered itself.
ServerNodeManager.doNodeVerbOnTarget() existed as a “self-execute” API
whose only mechanism was stamping NodeState.EXECUTED onto a node and
letting the observer wake the processor. Seven sites called it (cron tick,
webhook ingress, click, task expiry, serial monitor, lifecycle startup,
LogicGate hand-off).
NodeProcessor.onSourceTrigger’s default implementation was
post(node.copy(state = NodeState.EXECUTED)), so a source-triggered EXECUTE
looped right back through the state-stamp / observer / post path.
Paths reached by both type.emit (push) AND
executeSources (pull) fired twice. EventMonitor’s PIN_CHANGED/SNAPSHOT_UPDATE
handlers called executeSources after the underlying write that already went
through update()’s type.emit.
The RESET-cascade was a separate but related design gap: each receiver had to
remember to suppress executeSources(self) on RESET, and most didn’t. Forcing
every downstream node type to be RESET-aware is the same silent-fall-back failure
mode Phase 1 D1 rejected.
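
The push-plus-pull double fire can be reproduced in a few lines. This is a minimal sketch, not the real Krill code: LegacyManager, its wakes list, and the rule that the push path wakes every subscriber of the written node are simplifying assumptions made for illustration.

```kotlin
// Hypothetical, simplified stand-ins; not the real Krill classes.
enum class NodeState { NONE, EXECUTED }

data class Node(
    val id: String,
    val state: NodeState = NodeState.NONE,
    val sources: List<String> = emptyList()
)

class LegacyManager {
    private val nodes = mutableMapOf<String, Node>()
    val wakes = mutableListOf<String>() // ids of woken subscribers, in order

    fun create(node: Node) {
        nodes[node.id] = node
    }

    // Legacy shape: the persistence write also notifies observers (push).
    fun update(node: Node) {
        nodes[node.id] = node
        wakeSubscribersOf(node.id)
    }

    // Pull path: a handler asks for the node's subscribers to run.
    fun executeSources(changed: Node) {
        wakeSubscribersOf(changed.id)
    }

    private fun wakeSubscribersOf(sourceId: String) {
        nodes.values.filter { sourceId in it.sources }.forEach { wakes += it.id }
    }
}

// One root-cause change: a pin write followed by the handler's fan-out.
fun legacyPinChange(): List<String> {
    val manager = LegacyManager()
    val pin = Node("pin")
    manager.create(pin)
    manager.create(Node("smtp", sources = listOf("pin")))
    manager.update(pin.copy(state = NodeState.EXECUTED)) // write, which pushes
    manager.executeSources(pin)                          // handler, which pulls
    return manager.wakes
}
```

Under these assumptions legacyPinChange() records the smtp subscriber twice for one pin change, which is the shape of the duplicate email described above.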
Make CRUD and invocation two separate concerns with a single, deliberate seam:
ServerNodeManager.update() becomes pure CRUD. No type.emit. CRUD writes
never wake server processors.
ServerNodeManager.invoke(target, by: NodeIdentity, verb: NodeAction) is
the one entry point that wakes a server processor. It broadcasts the activity
pulse and dispatches to NodeProcessor.onInvoke. by is always a full
NodeIdentity (nodeId + hostId) — never a bare string — so cross-server
originators retain their hostId end-to-end.
doNodeVerbOnTarget is removed. Every self-fire caller migrates to
invoke(self, by = self.id(), verb = EXECUTE). Self-fire is the degenerate
case where by == target.
NodeProcessor.onInvoke replaces onSourceTrigger’s state-stamp default:
EXECUTE calls process(node) directly; RESET is an explicit no-op (the
unify-source-verb rule against silent fall-back to EXECUTE is preserved).
Client-originated invocation arrives through a POST /node/{id}/invoke route
taking an InvokeRequest { by, verb } body, not through clients posting a
state-stamped node update.
executeSources fans out via invoke(), the same seam. The legacy
wakeFromSource → onSourceTrigger path is retained in the SDK for cross-server
SSE transport only.
A receiver invoked with verb = RESET
performs its reset semantics and does NOT call executeSources(self). EXECUTE
cascades; RESET stops. (Worked example: a Button with nodeAction = RESET
targeting a TaskList resets the TaskList without cascading to its downstream
OutgoingWebHook → SMTP chain. The user gets the reset they asked for; no
shadow email.)
NodeState.EXECUTED survives only as a UI/SSE visual signal. Server
processors stop reading node.state == EXECUTED to drive forward work. Every
server processor’s post() override is dropped or trimmed to CRUD-lifecycle
arms only (Pin pi4j register/unregister, MQTT SUB init, SilentAlarm watchdog
re-arm — all flagged for follow-up rewiring when type.emit is finally
removed from update()).
meta.sources wires invocation (who wakes
this node). meta.inputs wires data (what snapshots this node reads when
invoked). Every processor that previously read meta.sources.first().nodeId
to fetch data was migrated to meta.inputs.first() + Node.snapshot() —
including SMTP, OutgoingWebHook, Compute, MQTT, Trigger, Lambda, and
SerialDevice. Variable names, log messages, and comments using “source” for
data now use “input”.
Regression-locked by InvocationSeamTest: a single source change invokes each
subscriber exactly once; a Button click invokes its subscriber exactly once; a
Cron tick fans out exactly once; meta.inputs plays no role in invocation
dispatch (changing inputs without changing sources does not change who wakes).
The worked-example end-to-end tests (TaskList → cross-server OutgoingWebHook →
SMTP EXECUTE chain; Button-RESET-stops-at-TaskList chain) are pending and
require a multi-host harness.
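
The sources/inputs split can be sketched as two independent lookups. The shapes below are hypothetical simplifications: Meta, Graph, subscribersOf, readInput, and the snapshot property are illustrative names (the real API is described above only as meta.sources, meta.inputs, and Node.snapshot()).

```kotlin
// Hypothetical, simplified shapes; not the real Krill types.
data class NodeIdentity(val nodeId: String, val hostId: String)

data class Meta(
    val sources: List<NodeIdentity> = emptyList(), // who wakes this node
    val inputs: List<NodeIdentity> = emptyList()   // whose snapshot it reads when woken
)

data class Node(val id: NodeIdentity, val meta: Meta = Meta(), val snapshot: String = "")

class Graph(nodes: List<Node>) {
    private val byId = nodes.associateBy { it.id }

    // Invocation dispatch consults meta.sources only.
    fun subscribersOf(source: NodeIdentity): List<NodeIdentity> =
        byId.values.filter { source in it.meta.sources }.map { it.id }

    // Data reads consult meta.inputs only.
    fun readInput(node: Node): String? =
        node.meta.inputs.firstOrNull()?.let { byId[it]?.snapshot }
}

fun sourcesVsInputsDemo(): Triple<List<String>, List<String>, String?> {
    val host = "host-a"
    val cron = Node(NodeIdentity("cron", host))
    val sensor = Node(NodeIdentity("sensor", host), snapshot = "21.5")
    val smtp = Node(
        NodeIdentity("smtp", host),
        Meta(sources = listOf(cron.id), inputs = listOf(sensor.id))
    )
    val graph = Graph(listOf(cron, sensor, smtp))
    val wokenByCron = graph.subscribersOf(cron.id).map { it.nodeId }
    // The sensor is an input but not a source, so a sensor change wakes nobody.
    val wokenBySensor = graph.subscribersOf(sensor.id).map { it.nodeId }
    return Triple(wokenByCron, wokenBySensor, graph.readInput(smtp))
}
```

In this sketch the cron node wakes smtp, the sensor wakes nothing, and smtp still reads the sensor's snapshot when it runs, so editing inputs cannot change who wakes.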
Three rules, each enforceable by a structural test or compiler check:
One seam. ServerNodeManager.invoke(target, by, verb) is the ONLY path
that wakes a server processor. CRUD writes (update, create, delete,
setStateToNone, alarm, updatePinState, updateMetaData, setErrorState) MUST
NOT wake a processor. The test asserts that an update(node) to a node that
is neither a source nor self-fired wakes no processor.
No verb via state stamp. No code path writes state = NodeState.EXECUTED
(or any state) with the intent of waking a processor. The non-exhaustive
when (node.state) sweep over server processors confirms no EXECUTED arm
drives forward logic — only UI/SSE signals.
RESET is terminal at the receiver. A receiver’s process() cascades to
executeSources(self) only when invoked with verb = EXECUTE. The
InvocationSeamTest source-driven RESET test (worked-example, pending the
multi-host harness) is intended to lock this in.
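
The first and third rules can be illustrated together with a minimal sketch. It reuses the names from this document (ServerNodeManager, invoke, onInvoke, executeSources, NodeIdentity, NodeAction) but the bodies, the wakeLog field, and the counter-style value are assumptions for illustration; the processor shown behaves like a TaskList-style receiver that implements RESET itself.

```kotlin
// Hypothetical, simplified stand-ins for the types named above.
enum class NodeAction { EXECUTE, RESET }

data class NodeIdentity(val nodeId: String, val hostId: String)

data class Node(
    val id: NodeIdentity,
    val sources: List<NodeIdentity> = emptyList(),
    val value: Int = 0
)

class ServerNodeManager {
    private val nodes = mutableMapOf<NodeIdentity, Node>()
    val wakeLog = mutableListOf<String>() // one "target<-by:VERB" entry per wake

    // Pure CRUD: stores the node and wakes nothing.
    fun update(node: Node) {
        nodes[node.id] = node
    }

    // The single seam: the only function that wakes a processor.
    fun invoke(target: NodeIdentity, by: NodeIdentity, verb: NodeAction) {
        val node = nodes[target] ?: return
        wakeLog += "${target.nodeId}<-${by.nodeId}:$verb"
        onInvoke(node, verb)
    }

    private fun onInvoke(node: Node, verb: NodeAction) {
        when (verb) {
            NodeAction.EXECUTE -> {
                update(node.copy(value = node.value + 1)) // CRUD write, no re-entry
                executeSources(node)                      // EXECUTE cascades
            }
            NodeAction.RESET -> update(node.copy(value = 0)) // RESET stops here
        }
    }

    // Fan-out goes back through invoke(), never through update().
    private fun executeSources(source: Node) {
        nodes.values.filter { source.id in it.sources }
            .forEach { invoke(it.id, by = source.id, verb = NodeAction.EXECUTE) }
    }
}

fun seamDemo(): List<String> {
    val host = "host-a"
    val button = NodeIdentity("button", host)
    val taskList = NodeIdentity("taskList", host)
    val webhook = NodeIdentity("webhook", host)
    val manager = ServerNodeManager()
    manager.update(Node(button))
    manager.update(Node(taskList, sources = listOf(button)))
    manager.update(Node(webhook, sources = listOf(taskList)))
    manager.invoke(taskList, by = button, verb = NodeAction.RESET)   // stops at taskList
    manager.invoke(taskList, by = button, verb = NodeAction.EXECUTE) // cascades to webhook
    return manager.wakeLog
}
```

In this sketch the three update() calls leave wakeLog empty, the RESET produces a single entry for taskList, and only the EXECUTE adds a second hop to webhook.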
The lookalike trap: a CRUD-lifecycle post() arm (e.g. Pin’s USER_EDIT →
reconfigure pi4j) looks like a state-driven wake but is doing legitimate
side-effect work. Each such arm is flagged with a TODO(§2.2) comment naming
the explicit hook it needs once type.emit is removed from update(). The
rule: a post() body that calls a forward-logic helper is the anti-pattern;
a post() body that responds to a CRUD state (USER_EDIT, CREATE_OR_OVERWRITE,
DELETING) is a transitional bridge with a documented next step.
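
A transitional post() arm of the permitted kind might look like the sketch below. PinProcessor, the sideEffects list, and the string payloads are hypothetical; only the state names and the TODO(§2.2) convention come from this document.

```kotlin
// Hypothetical, simplified stand-ins; not the real Krill classes.
enum class NodeState { NONE, USER_EDIT, CREATE_OR_OVERWRITE, DELETING, EXECUTED }

data class Node(val id: String, val state: NodeState)

class PinProcessor {
    val sideEffects = mutableListOf<String>() // stands in for pi4j calls

    fun post(node: Node) {
        when (node.state) {
            // TODO(§2.2): move to an explicit hook once type.emit leaves update().
            NodeState.USER_EDIT, NodeState.CREATE_OR_OVERWRITE ->
                sideEffects += "register ${node.id}"
            // TODO(§2.2): same, for the unregister side.
            NodeState.DELETING -> sideEffects += "unregister ${node.id}"
            // No EXECUTED arm: forward work is reachable only through invoke().
            else -> Unit
        }
    }
}
```

Posting a node stamped EXECUTED to this processor does nothing, while a USER_EDIT still reconfigures the pin, which is the distinction the rule draws.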
§2.2/§3.2: remove node.type.emit(node) from update() and delete
doNodeVerbOnTarget. Blocked on app teams shipping the /invoke route
integration so clicks survive the cut.
§2.3-§2.5: rewire the CRUD-lifecycle hooks the post() sweep left as
breadcrumbs (Pin pi4j register/unregister, SilentAlarm watchdog re-arm on
meta edit, MQTT SUB subscription init, Graph autowiring, DataPoint USER_EDIT
ingest) so they fire from ServerNodeManager.update / create / delete
directly rather than via the legacy observer chain.
§3.4: wire EventMonitor’s cross-server SOURCE_TRIGGERED inbound to
invoke(localTarget, triggeringSource, nodeAction). No-op today.
§8.4/§8.5: worked-example multi-host tests (TaskList → OutgoingWebHook
cross-server → SMTP EXECUTE chain; Button-RESET-stops-at-TaskList). Needs a
fake-second-host loopback harness.
§9: krill-mcp invoke_node(target, by, verb) tool + inputs surface so
QA can verify source/input distinction end-to-end.