Beacon Listener and Certificate Refactoring Analysis

An analysis of the beacon listener's critical paths and of opportunities to make the certificate download process more efficient and reliable.

Current Architecture Analysis

1. Beacon Listener Critical Path

Startup Flow:

  1. Client Apps (DefaultClientProcessor):
    • processClientNode() → sends initial beacon via beaconSender.sendSignal()
    • Starts startBeaconListener() via multicast.receiveBeacons()
    • Listener runs indefinitely in background job
  2. Servers (ServerServerProcessor):
    • processServerNode() → sends initial beacon via beaconSender.sendSignal()
    • Starts startBeaconListener() via multicast.receiveBeacons()
    • Listener runs indefinitely in background job

Beacon Processing Flow:

graph LR
    A[Beacon Received] --> B[handleIncomeWire]
    B --> C[BeaconProcessor.processWire]
    C --> D[ServerHandshakeProcess.trustServer]

2. Certificate Download Process (CURRENT - INEFFICIENT)

Current Flow (Every Beacon):

  1. Beacon received with wire.port > 0 (indicates server)
  2. ServerHandshakeProcess.trustServer() called
  3. establishTrust() → trustHttpClient.fetchPeerCert() downloads cert
  4. Compare cert bytes with existing file
  5. If changed: write file + rebuildHttpClient() (creates new HttpClient with all certs)
  6. downloadAndSyncServerData() → downloads nodes using httpClient
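
Steps 1 and 4 above can be sketched in isolation. This is a minimal sketch rather than the project's code: the `Wire` data class is a hypothetical stand-in that models only the `peerId` and `port` fields mentioned in this post, and `isServerBeacon` and `certChanged` are illustrative names.

```kotlin
// Hypothetical stand-in for the project's Wire type; only the two
// fields referenced in this post are modelled.
data class Wire(val peerId: String, val port: Int)

// Step 1: a beacon with port > 0 is treated as coming from a server.
fun isServerBeacon(wire: Wire): Boolean = wire.port > 0

// Step 4: compare the downloaded cert with the stored one.
// `existing` is null when no cert file has been written yet.
fun certChanged(downloaded: ByteArray, existing: ByteArray?): Boolean =
    existing == null || !downloaded.contentEquals(existing)
```

Comparing with `contentEquals` matters here because `==` on two `ByteArray` values compares references, not contents.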

Problems:

  • ✅ Certificate downloaded EVERY time beacon received (even if unchanged)
  • ✅ One-off insecure HttpClient created for EACH cert download
  • ✅ Full HttpClient rebuild even when cert unchanged
  • ❌ No error handling for SSL failures during node download
  • ❌ No retry logic if cert is outdated/invalid

3. Beacon Listener Lifecycle Issues

Client (DefaultClientProcessor):

  • Uses mutableStateOf<Job?> for beacon job tracking
  • Listener only started ONCE per node processing
  • ⚠️ If listener crashes, it’s NOT restarted

Server (ServerServerProcessor):

  • Uses simple var beaconJob: Job? for tracking
  • Listener only started ONCE per node processing
  • ⚠️ If listener crashes, it’s NOT restarted
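
The "not restarted" behaviour can be reproduced with a small, self-contained sketch. It assumes kotlinx.coroutines is on the classpath; `crashedListenerIsActive` and the simulated failure are illustrative and not taken from the project.

```kotlin
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.cancel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Launches a "listener" that fails immediately and reports whether the
// job is still active afterwards. Nothing restarts it.
fun crashedListenerIsActive(): Boolean = runBlocking {
    val handler = CoroutineExceptionHandler { _, e ->
        println("listener failed: ${e.message}")
    }
    // SupervisorJob keeps the failure from cancelling the rest of the scope.
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default + handler)

    val beaconJob: Job = scope.launch {
        throw IllegalStateException("simulated socket failure")
    }
    beaconJob.join()  // Wait until the failed job has finished

    val active = beaconJob.isActive
    scope.cancel()
    active
}
```

Once the job has failed, `isActive` stays `false` for good, so a one-time `launch` with no supervision leaves the node deaf to beacons until the process restarts.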

Proposed Refactoring

Phase 1: Certificate Download Optimization

A. Lazy Certificate Fetching

Change: Only download cert when needed (first connection or after connection failure)

Implementation:

class ServerHandshakeProcess {
    private val certCache = mutableMapOf<String, CertificateInfo>()
    
    suspend fun trustServer(wire: Wire): Result<Unit> {
        val serverId = wire.peerId
        
        // Check cache first
        val cachedCert = certCache[serverId]
        if (cachedCert != null && !cachedCert.isExpired()) {
            return Result.success(Unit)
        }
        
        // Only download if not cached or expired
        return establishTrust(wire)
    }
}
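
The snippet above calls `CertificateInfo.isExpired()`, which this post does not define. One possible shape is sketched below; the field names, the 10-minute default TTL, and the use of the JVM's `System.currentTimeMillis()` are all assumptions. Note that this TTL is a cache lifetime, not the certificate's own validity period.

```kotlin
// Hypothetical cache entry for trustServer(). All names and the
// default TTL are assumptions made for illustration.
class CertificateInfo(
    val bytes: ByteArray,
    val fetchedAtMillis: Long = System.currentTimeMillis(),
    val ttlMillis: Long = 10 * 60 * 1000L,
) {
    // The clock is passed in so the expiry rule can be tested deterministically.
    fun isExpired(nowMillis: Long = System.currentTimeMillis()): Boolean =
        nowMillis - fetchedAtMillis >= ttlMillis
}
```

A short TTL bounds how long a stale cert can stay trusted, at the cost of more downloads. The right value is what the TTL tuning step in the migration strategy is meant to find.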

B. Smart HttpClient Rebuild

Change: Only rebuild HttpClient if certificate actually changed

Implementation:

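import kotlinx.coroutines.CancellationException

private suspend fun establishTrust(wire: Wire): Result<Unit> {
    val newCert = try {
        trustHttpClient.fetchPeerCert(wire)
    } catch (e: CancellationException) {
        throw e  // Never swallow coroutine cancellation
    } catch (e: Exception) {
        return Result.failure(e)  // Surface download failures to the caller
    }
    val existingCert = fileOperations.readCertificate(wire.peerId)

    if (newCert.contentEquals(existingCert)) {
        logger.d { "Certificate unchanged for ${wire.peerId}" }  // Skip rebuild
    } else {
        // Only write and rebuild if cert changed
        fileOperations.writeCertificate(wire.peerId, newCert)
        rebuildHttpClient()
    }

    // Cache the cert so trustServer() can skip the next download.
    // Assumption: CertificateInfo can be constructed from the raw bytes.
    certCache[wire.peerId] = CertificateInfo(newCert)
    return Result.success(Unit)
}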

C. Connection Retry with Cert Refresh

Change: On SSL failure, try refreshing certificate once

Implementation:

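// JVM-specific import; other targets need their platform's TLS exception type
import javax.net.ssl.SSLException

suspend fun downloadAndSyncServerData(wire: Wire) {
    try {
        nodeHttp.syncFromServer(wire)
    } catch (e: SSLException) {
        logger.w { "SSL error, refreshing certificate for ${wire.peerId}" }

        // Force cert refresh; stop here if trust cannot be re-established
        certCache.remove(wire.peerId)
        establishTrust(wire).getOrThrow()

        // Retry once; a second failure propagates to the caller
        nodeHttp.syncFromServer(wire)
    }
}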
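
The retry-once pattern can also be factored into a small helper so that the "exactly one retry" rule is easy to test. This is a sketch only; `retryOnce` and its parameter names are hypothetical and not part of the project.

```kotlin
// Runs `action`; if it fails with an exception accepted by `shouldRetry`,
// runs `recover` and then tries `action` exactly one more time.
// A second failure propagates to the caller.
inline fun <T> retryOnce(
    shouldRetry: (Exception) -> Boolean,
    recover: () -> Unit,
    action: () -> T,
): T =
    try {
        action()
    } catch (e: Exception) {
        if (!shouldRetry(e)) throw e
        recover()
        action()
    }
```

Because the function is `inline`, the lambdas may call suspend functions when `retryOnce` is used inside a coroutine. In the flow above, `shouldRetry` would check for an SSL exception and `recover` would evict the cached cert and re-establish trust.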

Phase 2: Beacon Listener Resilience

A. Supervised Listener

Change: Monitor beacon listener and restart on crash

Implementation:

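import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.delay
import kotlinx.coroutines.isActive
import kotlinx.coroutines.launch

class BeaconListenerSupervisor(
    private val scope: CoroutineScope,
    private val multicast: MulticastService
) {
    private var listenerJob: Job? = null
    private val supervisorJob: Job

    init {
        supervisorJob = scope.launch {
            while (isActive) {
                if (listenerJob?.isActive != true) {
                    logger.w { "Beacon listener not active, (re)starting..." }
                    startListener()
                }
                delay(5000)  // Check every 5 seconds
            }
        }
    }

    private fun startListener() {
        listenerJob = scope.launch {
            try {
                multicast.receiveBeacons().collect { wire ->
                    processBeacon(wire)
                }
            } catch (e: CancellationException) {
                throw e  // Normal shutdown, not a crash
            } catch (e: Exception) {
                logger.e(e) { "Beacon listener crashed" }
            }
        }
    }

    // Stop the supervisor first so it cannot restart the listener
    fun stop() {
        supervisorJob.cancel()
        listenerJob?.cancel()
    }
}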

B. Exponential Backoff on Failure

Implementation:

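import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.currentCoroutineContext
import kotlinx.coroutines.delay
import kotlinx.coroutines.isActive

private suspend fun startListenerWithBackoff() {
    var attempt = 0
    var backoffMs = 1000L  // Renamed so it does not shadow delay()

    // A plain suspend function has no CoroutineScope receiver,
    // so isActive is read from the current coroutine context
    while (currentCoroutineContext().isActive) {
        try {
            multicast.receiveBeacons().collect { wire ->
                processBeacon(wire)
                attempt = 0  // Reset on success
                backoffMs = 1000L
            }
        } catch (e: CancellationException) {
            throw e  // Let cancellation end the loop
        } catch (e: Exception) {
            attempt++
            logger.e(e) { "Beacon listener failed (attempt $attempt)" }
            delay(backoffMs)
            backoffMs = (backoffMs * 2).coerceAtMost(30000L)  // Max 30 seconds
        }
    }
}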
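
To make the resulting schedule concrete, the doubling-with-cap rule can be expressed as a pure function. `backoffDelayMs` is a hypothetical helper; its defaults mirror the 1-second base and 30-second cap used above.

```kotlin
// Delay to wait before retry number `attempt` (0-based): starts at
// `baseMs`, doubles on each attempt, and is capped at `maxMs`.
fun backoffDelayMs(attempt: Int, baseMs: Long = 1_000L, maxMs: Long = 30_000L): Long {
    var delayMs = baseMs
    repeat(attempt) { delayMs = (delayMs * 2).coerceAtMost(maxMs) }
    return delayMs
}
```

With the defaults, consecutive failures wait 1 s, 2 s, 4 s, 8 s, 16 s, and then 30 s for every further attempt.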

Phase 3: Metrics and Monitoring

Add metrics for:

  • Beacon receive rate
  • Certificate download count
  • HttpClient rebuild count
  • Listener restart count
  • SSL failure count

Implementation:

class BeaconMetrics {
    var beaconsReceived = 0L
    var certsDownloaded = 0L
    var httpClientRebuilds = 0L
    var listenerRestarts = 0L
    var sslFailures = 0L
    
    fun report() {
        logger.i {
            """
            Beacon Metrics:
            - Beacons: $beaconsReceived
            - Cert Downloads: $certsDownloaded
            - HttpClient Rebuilds: $httpClientRebuilds
            - Listener Restarts: $listenerRestarts
            - SSL Failures: $sslFailures
            """.trimIndent()
        }
    }
}
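
The plain `var` counters above can lose increments if beacons are handled on more than one thread. A JVM-only sketch using `AtomicLong` is shown below; `AtomicBeaconMetrics` and `downloadRatio` are hypothetical names, only two of the five counters are shown, and a multiplatform build would need a different atomic primitive.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// JVM-only sketch: AtomicLong makes increments safe when beacons are
// handled on several threads at once. Only two counters are shown.
class AtomicBeaconMetrics {
    private val beaconsReceived = AtomicLong()
    private val certsDownloaded = AtomicLong()

    fun onBeacon() { beaconsReceived.incrementAndGet() }
    fun onCertDownload() { certsDownloaded.incrementAndGet() }

    // Fraction of beacons that triggered a cert download (0.0 when idle).
    fun downloadRatio(): Double {
        val beacons = beaconsReceived.get()
        return if (beacons == 0L) 0.0 else certsDownloaded.get().toDouble() / beacons
    }
}
```

A ratio like this gives a direct way to check whether the caching change has the intended effect once it is deployed.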

Expected Improvements

Certificate Download Optimization

  • Current: N downloads per minute (one per beacon)
  • After: 1 download per server discovery + on-demand refreshes
  • Estimated reduction: ~95% fewer certificate downloads (to be confirmed with the Phase 3 metrics)

HttpClient Rebuild Optimization

  • Current: Rebuilds even when cert unchanged
  • After: Only rebuilds when cert actually changes
  • Estimated reduction: ~99% fewer rebuilds (to be confirmed with the Phase 3 metrics)
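
The percentages above depend on a beacon rate that this post does not state. As a rough check using nothing more than simple ratios, the hypothetical helper below shows what they imply: ~95% corresponds to about one download per 20 beacons, and ~99% to about one rebuild per 100.

```kotlin
// Fraction of operations avoided when `after` operations replace `before`.
fun reduction(before: Int, after: Int): Double {
    require(before > 0 && after in 0..before) { "need 0 <= after <= before, before > 0" }
    return 1.0 - after.toDouble() / before
}
```

For example, `reduction(20, 1)` is about 0.95 and `reduction(100, 1)` is about 0.99. The actual figures depend on how often beacons arrive and how often certs really change.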

Listener Reliability

  • Current: No automatic restart on crash
  • After: Supervised with automatic restart
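  • Improvement: a crashed listener is restarted within about 5 seconds (the supervisor's check interval) instead of staying down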

Migration Strategy

  1. ✅ Implement cert caching (backward compatible)
  2. ✅ Add metrics (observability)
  3. ✅ Implement smart rebuild (optimization)
  4. ✅ Add listener supervisor (reliability)
  5. ✅ Monitor in production
  6. ✅ Tune cache TTL based on metrics

Security Considerations

  • Certificate cache must expire appropriately
  • Failed cert validation must not use cached cert
  • Insecure HttpClient only used for cert download (not data sync)
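  • Cert refresh on SSL error must not weaken security: the refresh goes through the insecure HttpClient, so a newly fetched cert should be verified (for example, against a fingerprint confirmed out of band) before it replaces the trusted one; otherwise an attacker who can trigger an SSL failure could substitute their own cert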
This post is licensed under CC BY 4.0 by the author.