Krill Connectivity & Synchronization Report
Krill Codebase Quality Review Report
Date: November 30, 2025
Scope: /server, /krill-sdk, /shared/commonMain, /composeApp/desktopMain
Platform-specific modules excluded: iOS, Android, WASM
Overall Code Quality Score
🟡 68/100 - Fair Quality with Notable Improvement Areas
| Category | Score | Notes |
|---|---|---|
| Architecture | 70/100 | Good separation of concerns, but coupling between modules needs work |
| Coroutine Safety | 55/100 | Multiple scope management issues, potential memory leaks |
| Thread Safety | 50/100 | Mutable collections accessed from multiple coroutines without synchronization |
| Error Handling | 60/100 | Inconsistent error handling, some silent failures |
| Code Organization | 75/100 | Good modular structure, clear feature boundaries |
| Documentation | 40/100 | Minimal inline documentation, some TODO comments |
| Completeness | 70/100 | Several unimplemented features marked with TODO |
Entry Point Analysis
1. Ktor Server Entry Point: server/src/main/kotlin/krill/zone/Application.kt
flowchart TD
subgraph Main["main()"]
A[Start] --> B["runBlocking { serverWarmup() }"]
B --> C["embeddedServer(Netty, ...)"]
C --> D["module = Application::module"]
D --> E[".start(wait = true)"]
style A fill:#90EE90
style B fill:#FFB6C1
style E fill:#90EE90
end
subgraph Warmup["serverWarmup()"]
W1[Set isServer = true] --> W2{Host node exists?}
W2 -->|No| W3[Create ServerMetaData]
W2 -->|Yes| W4[Skip creation]
W3 --> W5["nm.update(node, post=true, observe=true)"]
W4 --> W6{Platform is Pi?}
W5 --> W6
W6 -->|Yes| W7[PiManager.initPins]
W6 -->|No| W8[Done]
W7 --> W8
style W3 fill:#FFFF99
end
B -.-> Warmup
Issues Identified:
| Severity | Issue | Location |
|---|---|---|
| 🔴 HIGH | runBlocking used in main thread - blocks startup | Application.kt:18 |
| 🟡 MEDIUM | Global mutable isServer flag without synchronization | Platform.kt:8 |
| 🟡 MEDIUM | Redundant repeat(headerPins.size) loop creating duplicate pins | Application.kt:43-44 |
2. Application Module: server/src/main/kotlin/krill/zone/server/Server.kt
flowchart TD
subgraph Module["Application.module()"]
M1[Install WebSockets] --> M2[Install CORS anyHost]
M2 --> M3[Install ContentNegotiation]
M3 --> M4[Install ShutDownUrl]
M4 --> M5[Setup Routing]
style M2 fill:#FFB6C1
end
subgraph Lifecycle["Monitor Events"]
L1[ApplicationStarted] --> L2["serverScope.launch MQTTBroker.start()"]
L2 --> L3["NodeEventBus.subscribe"]
L3 --> L4["ServerReady: nm.init(), BeaconService.start()"]
L4 --> L5[ApplicationStopping: serverScope.cancel]
L5 --> L6[ApplicationStopped: unsubscribe]
style L3 fill:#FFB6C1
end
M5 --> Lifecycle
Issues Identified:
| Severity | Issue | Location |
|---|---|---|
| 🔴 HIGH | Global serverScope never properly cancelled on error scenarios | Server.kt:31 |
| 🔴 HIGH | CORS anyHost() allows any origin - security risk | Server.kt:43-44 |
| 🔴 HIGH | Jobs map mutableMapOf<String, Job>() accessed without synchronization | Server.kt:37 |
| 🟡 MEDIUM | NodeEventBus subscriptions never cleaned up properly | Server.kt:225-284 |
| 🟡 MEDIUM | Silent exception catch at line 259 catch (_: Exception) | Server.kt:259 |
| 🟡 MEDIUM | Nested monitor.subscribe inside ApplicationStarted handler | Server.kt:287-332 |
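The CORS finding above could be narrowed by allowing only known origins instead of calling anyHost(). The following is a minimal sketch, assuming Ktor 2.x or later (where the plugin lives in io.ktor.server.plugins.cors.routing). The function name configureCors, the trustedHosts parameter, and the chosen methods and headers are hypothetical and would need to match what the Krill clients actually send.

```kotlin
import io.ktor.http.HttpHeaders
import io.ktor.http.HttpMethod
import io.ktor.server.application.Application
import io.ktor.server.application.install
import io.ktor.server.plugins.cors.routing.CORS

// Hypothetical helper: trustedHosts would come from configuration, e.g. "krill.example.org:8442".
fun Application.configureCors(trustedHosts: List<String>) {
    install(CORS) {
        // Allow only the listed origins, and only over HTTPS.
        trustedHosts.forEach { host -> allowHost(host, schemes = listOf("https")) }
        // Assumed set of methods and headers; adjust to the routes the server exposes.
        allowMethod(HttpMethod.Post)
        allowMethod(HttpMethod.Delete)
        allowHeader(HttpHeaders.ContentType)
    }
}
```

With an empty trustedHosts list this configuration allows no cross-origin requests at all, which is a safer default than anyHost().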
3. Desktop App Entry Point: composeApp/src/desktopMain/kotlin/krill/zone/main.kt
flowchart TD
subgraph DesktopMain["main(args)"]
D1["deleteReadyFile()"] --> D2[Parse demo arg]
D2 --> D3[Load icon]
D3 --> D4["Window(onCloseRequest = exitApplication)"]
D4 --> D5["App(demo) { exitApplication() }"]
style D1 fill:#90EE90
end
subgraph AppComposable["App.kt"]
A1[Set demoMode] --> A2[DarkBlueGrayTheme]
A2 --> A3[AppScaffold]
end
subgraph AppScaffold
S1["LaunchedEffect: initializeCore()"] --> S2{ready.value?}
S2 -->|No| S3[Show loading]
S2 -->|Yes| S4[MainNodeScreen]
end
D5 --> AppComposable
AppComposable --> AppScaffold
flowchart TD
subgraph InitCore["initializeCore()"]
I1["nm.init()"] --> I2["ClientCore.startWasmSocket()"]
I2 --> I3{demo mode?}
I3 -->|No| I4["PeerConnector.start()"]
I3 -->|Yes| I5[Skip PeerConnector]
style I4 fill:#FFFF99
end
Issues Identified:
| Severity | Issue | Location |
|---|---|---|
| 🟡 MEDIUM | deleteReadyFile() called before window creation - could leave stale file on crash | main.kt:10 |
| 🟡 MEDIUM | PeerConnector creates new unmanaged CoroutineScope | PeerConnector.kt:15 |
| 🟡 MEDIUM | Multiple CoroutineScopes without lifecycle management | Various |
High Priority Issues
🔴 Critical: Thread-Unsafe Mutable Collections
Multiple mutable collections are accessed from coroutines without synchronization:
// Server.kt:37 - accessed from multiple coroutines
val jobs = mutableMapOf<String, Job>()
// NodeManager.kt:52-54 - modified from multiple launch blocks
private val jobs = mutableMapOf<String, Job>()
private val nodes: MutableMap<String, NodeFlow> = mutableMapOf()
// MQTTBroker.kt:26
val jobs = mutableMapOf<String, Job>()
// NodeEventBus.kt:11 - public mutable list
val subscribers = mutableListOf<NodeEvent>()
// ClientSocketManager.kt:16 - accessed from coroutines
val sessions = mutableSetOf<DefaultClientWebSocketSession>()
Recommendation: Use ConcurrentHashMap or wrap access with Mutex:
private val jobs = ConcurrentHashMap<String, Job>()
// or
private val jobsLock = Mutex()
private val jobs = mutableMapOf<String, Job>()
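A ConcurrentHashMap makes individual reads and writes safe, but a "cancel the old job, then store the new one" sequence still needs to happen as one step. The sketch below shows one way the Mutex variant might be used for that. JobRegistry, replace, size, and demoJobRegistry are hypothetical names introduced for illustration and are not part of the Krill codebase; the example assumes kotlinx.coroutines is on the classpath.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.awaitCancellation
import kotlinx.coroutines.cancel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

// Hypothetical helper: keeps at most one running job per key.
class JobRegistry(private val scope: CoroutineScope) {
    private val lock = Mutex()
    private val jobs = mutableMapOf<String, Job>()

    // Cancels any job stored under [id] and launches a replacement while
    // holding the lock, so concurrent callers never interleave the two steps.
    suspend fun replace(id: String, block: suspend CoroutineScope.() -> Unit): Job =
        lock.withLock {
            jobs.remove(id)?.cancel()
            scope.launch(block = block).also { jobs[id] = it }
        }

    suspend fun size(): Int = lock.withLock { jobs.size }
}

// Returns (first job cancelled?, second job active?, number of stored jobs).
fun demoJobRegistry(): Triple<Boolean, Boolean, Int> = runBlocking {
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
    val registry = JobRegistry(scope)
    val first = registry.replace("node-1") { awaitCancellation() }
    val second = registry.replace("node-1") { awaitCancellation() }
    val result = Triple(first.isCancelled, second.isActive, registry.size())
    scope.cancel() // release the demo scope so nothing outlives the call
    result
}
```

Registering two jobs under the same key leaves one entry in the map: the first job is cancelled and the second stays active. Mutex is suspending rather than blocking, which is why replace and size are suspend functions.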
🔴 Critical: Orphaned CoroutineScopes
Several places create new CoroutineScopes that are never cancelled:
// screenCore.kt:147 - creates scope that's never cancelled
CoroutineScope(Dispatchers.Default).launch { ... }
// MainNodeScreen.kt:84 - same issue
CoroutineScope(Dispatchers.Default).launch { ... }
// MQTTBroker.kt:94 - creates new scope per publish
jobs[node.id] = CoroutineScope(Dispatchers.Default).launch { ... }
// PeerConnector.kt:15 - scope never cancelled
val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
Recommendation: Use structured concurrency with parent scopes or lifecycle-aware scopes.
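As a rough illustration of the recommendation, a component can derive its scope from a parent scope and expose an explicit stop hook, so that cancelling the owner also cancels the component's work. ManagedConnector, start, stop, and demoManagedConnector are hypothetical names, and the polling loop is a placeholder rather than the actual PeerConnector logic; the example assumes kotlinx.coroutines is available.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.cancel
import kotlinx.coroutines.delay
import kotlinx.coroutines.isActive
import kotlinx.coroutines.launch

// Hypothetical component whose scope is a child of the scope that owns it.
class ManagedConnector(parent: CoroutineScope) {
    // The SupervisorJob is attached to the parent's Job, so cancelling the
    // parent scope cancels everything launched here as well.
    private val scope = CoroutineScope(
        parent.coroutineContext + SupervisorJob(parent.coroutineContext[Job]) + Dispatchers.Default
    )

    fun start(): Job = scope.launch {
        while (isActive) {
            delay(1_000) // placeholder for periodic work
        }
    }

    // Explicit shutdown for callers that want to stop only this component.
    fun stop() = scope.cancel()
}

// Returns (job active after start?, job cancelled after the owner is cancelled?).
fun demoManagedConnector(): Pair<Boolean, Boolean> {
    val appScope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
    val connector = ManagedConnector(appScope)
    val job = connector.start()
    val activeBefore = job.isActive
    appScope.cancel() // cancellation propagates from the owner to the connector
    return activeBefore to job.isCancelled
}
```

The same shape could apply to the per-publish scope in MQTTBroker: launching from one owned scope avoids creating a new, untracked CoroutineScope on every call.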
🔴 Critical: Potential Memory Leaks
- NodeManager observer pattern (NodeManager.kt:328-368):
  - Jobs are added to map but cancellation logic has race conditions
  - Lines 356-359 cancel and relaunch in same block without proper synchronization
- ServerNodeProcess (ServerNodeProcess.kt:16-17):
  lateinit var job: Job // Instance variable, but class instances may be created multiple times
- BeaconService (BeaconService.kt:14-15):
  private var jobs: List<Job> = emptyList() // Jobs not cancelled on service destruction
🟡 Medium: Anti-Patterns Detected
- God Object Pattern: NodeManager handles too many responsibilities:
  - Node CRUD operations
  - Job management
  - HTTP communication
  - Swarm management
  - Chain building
- Global State: Multiple object singletons with mutable state:
  - NodeEventBus
  - ComposeCore
  - ScreenCore
  - MQTTBroker
  - ClientCore
- Hardcoded Values:
  - Port 8442 in multiple places
  - File paths like /etc/krill/, /srv/krill/, /var/lib/krill/
🔴 Critical: Hardcoded Certificate Passwords
Certificate passwords are hardcoded as "changeit" in multiple locations:
// KtorConfig.kt:19
password = "changeit".toCharArray()
// KtorConfig.kt:28-29
keyStorePassword = { "changeit".toCharArray() },
privateKeyPassword = { "changeit".toCharArray() }
// MQTTBroker.kt:31
keyStorePassword = "changeit",
This is a critical security vulnerability. These passwords should be:
- Read from environment variables or secure configuration
- Never committed to source control
- Unique per deployment
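A minimal sketch of the first point, reading the password from the environment and failing fast when it is absent. The variable name KRILL_KEYSTORE_PASSWORD and the function keystorePassword are assumptions made for this example; the project may prefer a secrets manager or a configuration file with restricted permissions.

```kotlin
// Hypothetical environment variable name; pick whatever fits the deployment.
const val KEYSTORE_PASSWORD_ENV = "KRILL_KEYSTORE_PASSWORD"

// Reads the keystore password from the supplied environment map and throws
// IllegalStateException when it is missing or blank, rather than falling
// back to a hardcoded default such as "changeit".
fun keystorePassword(env: Map<String, String> = System.getenv()): CharArray {
    val value = env[KEYSTORE_PASSWORD_ENV]?.takeIf { it.isNotBlank() }
        ?: error("$KEYSTORE_PASSWORD_ENV must be set")
    return value.toCharArray()
}
```

Passing the environment as a parameter keeps the function testable: a test can supply a plain map instead of mutating the process environment.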
Coroutine and Scope Analysis
graph TB
subgraph "Server Scopes"
SS[serverScope<br/>Global, SupervisorJob+Default]
SS --> SS1["MQTTBroker jobs"]
SS --> SS2["ServerSocketManager broadcasts"]
SS --> SS3["Node observation jobs"]
SS --> SS4["BeaconService"]
style SS fill:#90EE90
end
subgraph "Client Scopes"
CS[ComposeCore.scope<br/>Global, Default+SupervisorJob]
CS --> CS1["showPairing delays"]
CS --> CS2["checkForInteraction"]
style CS fill:#90EE90
end
subgraph "Unmanaged Scopes"
US1["CoroutineScope(Dispatchers.Default)<br/>screenCore.kt:147"]
US2["CoroutineScope(Dispatchers.Default)<br/>screenCore.kt:200"]
US3["CoroutineScope(SupervisorJob + Default)<br/>PeerConnector.kt:15"]
US4["CoroutineScope(Default)<br/>MQTTBroker.kt:94"]
style US1 fill:#FFB6C1
style US2 fill:#FFB6C1
style US3 fill:#FFB6C1
style US4 fill:#FFB6C1
end
subgraph "NodeManager Scope"
NMS[NodeManager.scope<br/>Instance, Default+SupervisorJob]
NMS --> NMS1["update posts"]
NMS --> NMS2["delete operations"]
NMS --> NMS3["observe flows"]
style NMS fill:#FFFF99
end
Legend:
- 🟢 Green (#90EE90): Properly managed scopes - [GOOD]
- 🟡 Yellow (#FFFF99): Needs review - [REVIEW]
- 🔴 Pink (#FFB6C1): Unmanaged/leaking scopes - [CRITICAL]
Server HTTP Transaction Flow
sequenceDiagram
participant C as Client
participant R as Routing
participant NM as NodeManager
participant FS as FileOperations
participant WS as WebSocket
participant MQTT as MQTTBroker
C->>R: POST /node/{id}
R->>R: call.receive<Node>()
R->>NM: nm.update(node, post=true, observe=false)
alt node.host == installId
NM->>FS: fileOperations.update(node)
else remote host
NM->>NM: postNode(node) via HTTP
end
NM-->>NM: updateSwarm(node.id)
R-->>C: 200 OK + Node
Note over NM,MQTT: On NodeState.EXECUTED
NM->>NM: buildChain(node).invoke(scope)
NM->>WS: ServerSocketManager.broadcast(node)
NM->>MQTT: MQTTBroker.publish(node)
TODO Items for Agent Completion
| Location | TODO | Agent Prompt |
|---|---|---|
| MQTTBroker.kt:21-23 | Send smaller DTO with just id and snapshot or command | “Refactor MQTTBroker.publish() to send a lightweight DTO containing only node.id and snapshot instead of the full Node object to reduce MQTT message size” |
| DataStore.kt:33 | Check for triggers that would prevent data recording | “Implement trigger checking logic in DataStore.post() method that evaluates DiscardAbove and DiscardBelow triggers before recording data to file” |
| SilentTriggerManager.kt:23,28 | Implement waitJob completion logic and post() method | “Implement the SilentTriggerManager.waitJob() method to fire an alarm event after the delay period and implement the post() method to properly handle silent alarm triggering” |
| TriggerEventProcessor.kt:32 | Refresh nodes when reading for latest snapshot | “Modify TriggerEventProcessor to call nm.refresh() or re-read the datapoint node before evaluating trigger to ensure latest snapshot value is used” |
| NodeMetaData.kt:223 | Port should not be hardcoded | “Add a configurable port parameter to ServerMetaData.createMetadata() or read from configuration instead of hardcoding 8442” |
| HardwareDiscovery.kt:13 | Turn on i2cdetect with raspi-config | “Create documentation or setup script that enables I2C interface on Raspberry Pi using raspi-config during Krill installation” |
| HardwareDiscovery.kt:101 | HAT META implementation | “Implement proper HAT metadata parsing in readHatInfo() function to correctly parse /proc/device-tree/hat entries and create SerialDeviceMetaData” |
| MediaPlayer.jvm.kt:190 | mediaPlayer not implemented | “Implement the mediaPlayer actual property in MediaPlayer.jvm.kt by returning a properly initialized MediaPlayerJvm instance” |
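To make the DataStore.kt:33 item more concrete, the check might look like the sketch below. It assumes that DiscardAbove and DiscardBelow each carry a single numeric threshold and that a value strictly beyond a threshold is dropped; the real trigger types in Krill may hold more state. DiscardLimits and shouldRecord are hypothetical names.

```kotlin
// Hypothetical model of the two discard triggers; null means "no limit configured".
data class DiscardLimits(
    val discardAbove: Double? = null,
    val discardBelow: Double? = null,
)

// Returns true when the value should be written to the data store.
// Assumption: values exactly equal to a threshold are still recorded.
fun shouldRecord(value: Double, limits: DiscardLimits): Boolean {
    val tooHigh = limits.discardAbove?.let { value > it } ?: false
    val tooLow = limits.discardBelow?.let { value < it } ?: false
    return !tooHigh && !tooLow
}
```

A caller such as DataStore.post() could evaluate this before appending to the file and skip the write when it returns false.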
Architecture Diagrams
Node Hierarchy
classDiagram
class Node {
+id: String
+parent: String
+host: String
+type: KrillApp
+state: NodeState
+meta: NodeMetaData
}
class KrillApp {
<<sealed>>
+exec: (CoroutineScope, Node) -> Unit
+children: List~KrillApp~
}
class NodeMetaData {
<<interface>>
+name: String
}
KrillApp <|-- Server
KrillApp <|-- Client
KrillApp <|-- DataPoint
KrillApp <|-- SerialDevice
KrillApp <|-- RuleEngine
KrillApp <|-- Project
Server <|-- Pin
DataPoint <|-- Trigger
DataPoint <|-- CalculationEngine
DataPoint <|-- Compute
Trigger <|-- HighThreshold
Trigger <|-- LowThreshold
Node --> KrillApp : type
Node --> NodeMetaData : meta
Data Flow Architecture
flowchart LR
subgraph Devices["Input Sources"]
SD[Serial Devices<br/>USB/UART]
ZB[Zigbee<br/>Coordinator]
WH[Webhooks<br/>HTTP]
MT[MQTT<br/>Clients]
end
subgraph Server["Krill Server"]
NM[NodeManager]
DP[DataPoint<br/>Processor]
TE[Trigger<br/>Engine]
RE[Rule<br/>Engine]
DS[DataStore<br/>Time Series]
end
subgraph Outputs["Output Actions"]
GPIO[GPIO Pins]
WB[Outgoing<br/>Webhooks]
SCR[Scripts]
BRD[WebSocket<br/>Broadcast]
end
subgraph Clients["Krill Clients"]
DA[Desktop App]
WA[WASM App]
MA[Mobile Apps]
end
SD --> NM
ZB --> NM
WH --> NM
MT --> NM
NM --> DP
DP --> DS
DP --> TE
TE --> RE
RE --> GPIO
RE --> WB
RE --> SCR
NM --> BRD
BRD --> DA
BRD --> WA
BRD --> MA
Revised Prompt for Future Reviews
Review the Kotlin code in the Krill platform focusing on:
**Scope:**
- `/server/src/main/kotlin/krill/zone/` - Ktor server implementation
- `/krill-sdk/src/commonMain/kotlin/krill/zone/` - Core SDK shared code
- `/krill-sdk/src/jvmMain/kotlin/krill/zone/` - JVM-specific implementations
- `/shared/src/commonMain/kotlin/krill/zone/` - Shared multiplatform code
- `/composeApp/src/commonMain/kotlin/krill/zone/` - Compose UI code
- `/composeApp/src/desktopMain/kotlin/krill/zone/` - Desktop-specific code
**Exclude:** iOS, Android, WASM platform-specific modules
**Entry Points:**
1. Server: `server/src/main/kotlin/krill/zone/Application.kt` (Ktor Netty server)
2. Desktop: `composeApp/src/desktopMain/kotlin/krill/zone/main.kt` (Compose Desktop)
**Analysis Required:**
1. **Coroutine Analysis:**
- Map all CoroutineScope declarations and their lifecycle management
- Identify orphaned scopes (created but never cancelled)
- Check Job management in maps/collections
- Evaluate structured concurrency usage
2. **Thread Safety:**
- Find mutable collections accessed from coroutines
- Check for synchronization mechanisms (Mutex, lock, synchronized)
- Identify potential race conditions in Node/Job management
3. **Memory Leaks:**
- Check observer/subscription patterns for cleanup
- Review StateFlow/SharedFlow usage and collection lifecycle
- Identify retained references in singletons
4. **Architecture:**
- Evaluate module dependencies and coupling
- Check for circular dependencies
- Review singleton usage patterns
5. **Feature Definitions:**
- Read `/content/feature/*.json` for feature specifications
- Cross-reference with `KrillApp.kt` implementation
- Identify gaps between spec and implementation
**Output Format:**
- Overall quality score (0-100) with breakdown
- Mermaid diagrams for:
- Entry point flow
- Coroutine scope hierarchy
- Data flow architecture
- Tables for issues (severity, location, description)
- TODO items with agent prompts for completion
- Professional evaluation summary
**Previous Issues to Re-verify:**
- serverScope lifecycle management in Server.kt
- Jobs map synchronization in NodeManager, MQTTBroker
- NodeEventBus subscriber cleanup
- CORS anyHost() security
Professional Application Evaluation
Executive Summary
Krill is an ambitious IoT/automation platform built on modern Kotlin Multiplatform technology with Ktor, Compose Multiplatform, and coroutines. The project aims to provide a unified system for:
- Data Collection - Time-series data from sensors and devices
- Automation - Rule-based triggers and actions
- Device Control - GPIO, serial devices, Zigbee integration
- Monitoring - Real-time visualization across platforms
Strengths 🟢
- Modern Technology Stack
- Kotlin Multiplatform enables code sharing across JVM, WASM, iOS, and Android targets
- Ktor provides lightweight, coroutine-native server
- Compose Multiplatform offers declarative UI
- kotlinx.serialization for type-safe data handling
- Good Architectural Concepts
- Node-based data model is flexible and extensible
- Feature definitions in JSON allow UI/behavior configuration
- Sealed class hierarchy for KrillApp provides type-safe feature modeling
- Code generation for feature content maintains consistency
- Network Discovery
- Multicast beacon service for automatic server discovery
- MQTT broker for real-time data streaming
- WebSocket support for client-server communication
- Hardware Integration
- Pi4J integration for Raspberry Pi GPIO
- Serial device monitoring and communication
- Zigbee coordinator support
Areas for Improvement 🟡
- Production Readiness
- Thread safety issues need resolution before production use
- Memory leak potential in observer patterns
- Hardcoded credentials and paths
- Missing comprehensive error handling
- Code Quality
- Inconsistent coroutine scope management
- Some God Object anti-patterns (NodeManager)
- Missing unit tests
- Sparse documentation
- Security
- CORS allows any origin
- Hardcoded certificate passwords
- No authentication layer visible in routing
Market Potential 📊
The project occupies an interesting niche:
| Comparison | Krill Advantage | Competitor Advantage |
|---|---|---|
| vs Home Assistant | Native Kotlin, lighter weight | Massive ecosystem, community |
| vs Node-RED | Type safety, multiplatform | Visual programming, plugins |
| vs OpenHAB | Modern stack, simpler | Mature, enterprise features |
Target Markets:
- DIY IoT Enthusiasts - Good fit for Raspberry Pi hobbyists
- Industrial Light Automation - Data logging, simple rule engines
- Educational - Kotlin Multiplatform showcase project
Recommended Roadmap
Phase 1: Stability (1-2 months)
- Fix thread safety issues with concurrent collections
- Implement proper coroutine scope lifecycle management
- Add authentication/authorization to server
- Configuration externalization (remove hardcoded values)
- Add comprehensive logging
Phase 2: Testing (1 month)
- Unit tests for NodeManager, DataStore
- Integration tests for server routes
- UI tests for Compose components
- Load testing for MQTT/WebSocket
Phase 3: Features (2-3 months)
- Complete TODO implementations
- Add more trigger types
- Implement calculation engine
- Dashboard/visualization improvements
- Mobile app completion
Phase 4: Production (1-2 months)
- Security audit
- Performance optimization
- Documentation
- Deployment automation
- Monitoring/alerting for server health
Conclusion
Krill shows significant potential as a modern IoT automation platform. The use of Kotlin Multiplatform positions it well for cross-platform deployment, and the architectural foundations are sound. However, the codebase needs attention to thread safety, memory management, and production hardening before it’s ready for serious deployment.
Recommendation: Continue development with focus on stability and testing. The project would benefit from:
- A dedicated pass to fix coroutine/threading issues
- Security hardening before any public deployment
- Community engagement to grow the ecosystem
Rating: ⭐⭐⭐ (3/5) - Promising but needs maturation
Report generated by automated code review. Manual verification recommended for critical findings.