From 7a7d19e73bcbf78820cf3e4312fdfec103fbfac7 Mon Sep 17 00:00:00 2001
From: Fabio Fernandes Valente
Date: Fri, 26 Jun 2026 16:22:01 -0500
Subject: [PATCH] fix(macos): retry launchd reload on transient bootstrap failure
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

refresh_launchd_plist_if_needed ran `launchctl bootout` then
`launchctl bootstrap` with errors silenced (`2>/dev/null` in the
detached helper, `check=False` in the direct subprocess path). Under
high load or a launchd race, the bootout succeeds, removing the service
from launchd, but the follow-up bootstrap fails silently. The service
stays unregistered; KeepAlive can't revive a service launchd no longer
knows about, so the gateway stays dark until a manual
`launchctl bootstrap`.

Observed incident (2026-06-26): `/restart` in chat triggered a planned
drain; during the drain a separate call re-triggered the plist refresh,
which bootout'd the live service. Under loadavg 9.48 the bootstrap
failed silently: 2h35min offline until manual recovery.

Fix: retry the bootstrap up to 5 times with 2s back-off, verify with
`launchctl list