hermes-agent

Ben 8ab7246c45 fix(gateway): stamp drain marker with instantiation epoch so a durable-volume restart clears it (NS-570)		2026-06-26 18:59:41 +05:30

The external-drain marker .drain_request.json is written under HERMES_HOME, which on Hermes Cloud is a persistent Fly volume (/opt/data). A begin-drain marker therefore SURVIVES the post-update machine restart. But the disruptive lifecycle actions a drain protects (auto-update / image migrate / env edit / profile change) all restart the machine, which is exactly the signal the drain is over. The freshly-restarted gateway re-read the orphaned marker on its startup reconcile and parked itself back in 'draining', refusing every new turn indefinitely (NS-570: ~52 min until manually cleared).

Fix: stamp the marker with an identity of THIS container/VM instantiation (kernel boot_id + PID 1 start time, read from /proc) and treat a marker whose epoch differs from the current instantiation as absent. A deliberate restart → new PID 1 → new epoch → stale marker ignored → gateway boots 'running'. A marker written during the current instantiation (the live drain) still matches; an s6 respawn of just the gateway (PID 1/init unchanged) keeps the same epoch, so an in-flight drain is still honoured (D4a reversibility preserved).

The staleness check is lenient and never fail-closed: a legacy marker with no epoch, a corrupt/contentless marker, or an environment with no /proc (epoch unavailable) all degrade to the original presence-only behaviour.

NAS is untouched; it only ever POSTs begin/cancel-drain over HTTP, and the marker file is purely gateway-internal IPC. The fix is entirely within gateway/drain_control.py; the watcher and the dashboard endpoint go through the same drain_requested()/write_drain_request() chokepoints and need no functional change.
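
The epoch check described in the commit message can be sketched roughly as follows. This is an illustrative sketch only, not the code in gateway/drain_control.py: the function names `read_instantiation_epoch` and `marker_is_live`, the `"epoch"` JSON key, and the `boot_id:starttime` string format are all assumptions made for the example. It relies only on two standard Linux /proc entries: `/proc/sys/kernel/random/boot_id` and the `starttime` field (field 22) of `/proc/1/stat`.

```python
import json
from pathlib import Path
from typing import Optional


def read_instantiation_epoch(proc_root: Path = Path("/proc")) -> Optional[str]:
    """Return an identity for this container/VM instantiation, or None.

    Hypothetical helper: combines the kernel boot_id with the start time
    of PID 1. Returns None when /proc is unavailable or unreadable.
    """
    try:
        boot_id = (proc_root / "sys/kernel/random/boot_id").read_text().strip()
        stat = (proc_root / "1/stat").read_text()
        # The comm field may contain spaces or parentheses, so split on the
        # LAST ')' and count from there. The first token after it is field 3
        # (state), so field 22 (starttime) sits at index 19.
        start_time = stat.rsplit(")", 1)[1].split()[19]
    except (OSError, IndexError):
        return None
    if not boot_id:
        return None
    return f"{boot_id}:{start_time}"


def marker_is_live(marker_path: Path, current_epoch: Optional[str]) -> bool:
    """Decide whether a drain marker should be honoured.

    Lenient by design: anything that cannot be compared falls back to
    presence-only behaviour (marker exists -> drain requested).
    """
    if not marker_path.exists():
        return False
    try:
        data = json.loads(marker_path.read_text())
    except (OSError, ValueError):
        return True  # corrupt or contentless marker: presence-only
    marker_epoch = data.get("epoch") if isinstance(data, dict) else None
    if not marker_epoch or current_epoch is None:
        return True  # legacy marker or no /proc: presence-only
    # Only a marker stamped by a different instantiation counts as stale.
    return marker_epoch == current_epoch
```

Under these assumptions, a full machine restart changes the PID 1 start time (and usually the boot_id), so the stored epoch no longer matches and the marker is ignored, while a respawn of only the gateway process leaves both values unchanged and the drain stays in effect.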
assets	fix: improve telegram topic mode setup	2026-05-04 12:07:17 -07:00
builtin_hooks	remove: BOOT.md built-in hook (#17093)	2026-04-28 09:50:27 -07:00
platforms	fix(whatsapp_cloud): resolve reply-to text so the agent sees reply context (#52957)	2026-06-26 01:05:05 -07:00
relay	feat(relay): multi-platform-per-agent: list identity, provision-loop, N-hello, per-frame egress (Phase 1.5) (#52830)	2026-06-26 17:32:46 +10:00
__init__.py	docs(gateway): mention Weixin in gateway help and docstrings	2026-05-12 17:08:51 -07:00
authz_mixin.py	fix(relay): authorize relay-delivered events by delivery, not source.platform (#52306)	2026-06-25 14:21:09 +10:00
channel_directory.py	docs(sessions): clarify sessions.json is the gateway routing index, not the session list (#51726)	2026-06-23 23:56:36 -07:00
code_skew.py	fix(gateway): refuse model switch on stale checkout to avoid env_float ImportError	2026-06-24 04:16:54 +05:30
config.py	Address email pairing review feedback	2026-06-21 22:43:57 -07:00
delivery.py	fix(delivery): drop env-var knob, flag all chunking adapters	2026-06-22 05:41:22 -07:00
display_config.py	feat(discord): render reasoning as -# subtext via display.reasoning_style (#51168)	2026-06-23 10:44:02 -07:00
drain_control.py	fix(gateway): stamp drain marker with instantiation epoch so a durable-volume restart clears it (NS-570)	2026-06-26 18:59:41 +05:30
hooks.py	feat(hooks): expose thread_id and chat_type in agent:start/end context (#41672)	2026-06-07 19:16:36 -07:00
kanban_watchers.py	fix(kanban): honor kanban.auto_decompose toggle live, without a gateway restart (#50358)	2026-06-21 12:43:44 -07:00
memory_monitor.py	Port from cline/cline#10343: periodic gateway memory logging (#27102)	2026-05-16 12:55:23 -07:00
message_timestamps.py	feat(gateway): inject stable human-readable message timestamps	2026-06-16 15:49:59 -07:00
mirror.py	fix(cron): mirror continuable cron as a labelled user turn (alternation-safe)	2026-06-24 20:27:05 -07:00
pairing.py	fix(gateway): preserve WhatsApp pairing approvals across JID/LID alias flips	2026-05-23 01:46:34 -07:00
platform_registry.py	refactor(plugins): add apply_yaml_config_fn registry hook	2026-05-13 22:20:30 -07:00
response_filters.py	fix(gateway): suppress exact silence tokens without mutating history	2026-06-14 03:25:08 -07:00
restart.py	fix(gateway): exit 78 (EX_CONFIG) on fatal startup errors, s6 finish script stops restart loop	2026-06-24 16:34:51 +10:00
rich_sent_store.py	fix(telegram): resolve replies to rich (sendRichMessage) messages	2026-06-16 13:04:20 -07:00
run.py	fix(gateway): stamp drain marker with instantiation epoch so a durable-volume restart clears it (NS-570)	2026-06-26 18:59:41 +05:30
runtime_footer.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
scale_to_zero.py	feat(gateway): scale-to-zero idle detection + dormant-quiesce (Phase 0)	2026-06-24 18:47:18 -07:00
session.py	fix(gateway): dedupe user turns on transient failure (#47237)	2026-06-26 00:11:17 +05:30
session_context.py	fix(api-server): stop silently promising async delivery on stateless HTTP path (#50319)	2026-06-21 12:15:14 -07:00
shutdown_forensics.py	chore: ruff auto-fixes: collapsible-else-if, if-stmt-min-max, dict.fromkeys (#23926)	2026-05-11 11:03:29 -07:00
slash_access.py	feat(gateway): per-platform admin/user split for slash commands (salvage of #4443) (#23373)	2026-05-10 12:33:54 -07:00
slash_commands.py	fix: stop reporting cache-hit rate and cost across all UI surfaces (#52717)	2026-06-25 15:21:22 -07:00
status.py	fix(gateway): scope dashboard liveness fallback to the profile	2026-06-25 10:25:54 +10:00
sticker_cache.py	fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes	2026-05-19 00:12:41 -07:00
stream_consumer.py	fix(gateway): respect adapter decline of fresh-final to prevent double delivery	2026-06-21 13:55:50 -07:00
stream_dispatch.py	feat(gateway): structured stream-event protocol + Telegram draft formatting parity (#37250)	2026-06-02 00:33:50 -07:00
stream_events.py	feat(gateway): structured stream-event protocol + Telegram draft formatting parity (#37250)	2026-06-02 00:33:50 -07:00
whatsapp_identity.py	fix(whatsapp): normalize bare phone targets to JIDs before bridge send	2026-06-21 13:32:22 -07:00