hermes-agent/cron
teknium1 b48cacb97b fix(gateway,cron): guard cron model-tool path + add auto-resume loop breaker (#30719)
Completes the #30719 restart-loop defenses. Defenses 1-2 (the
_HERMES_GATEWAY guard on `hermes gateway stop|restart` + terminal_tool,
and the cron-creation lifecycle filter) already landed on main, but two
gaps remained:

- The agent's `cronjob` model tool calls cron.jobs.create_job directly,
  bypassing the hermes_cli.cron.cron_create CLI filter, so lifecycle
  commands scheduled via the model tool were only blocked at execution
  time (terminal_tool), not at creation. Moved the filter to a shared
  cron/lifecycle_guard.py enforced at create_job — the single chokepoint
  every job-creation path hits (CLI + model tool). Re-exported
  _contains_gateway_lifecycle_command from hermes_cli.cron so
  terminal_tool's import keeps working.
- No breaker for the auto-resume loop itself. Defenses 1-2 cover the
  cron/CLI/terminal paths, but any other SIGTERM source (e.g. a raw
  terminal("launchctl kickstart ai.hermes.gateway")) still triggers the
  boot->auto-resume->re-run cycle. Added gateway/restart_loop_guard.py:
  counts restart-interrupted boots in a rolling window (config
  gateway.restart_loop_guard, default 3 boots / 60s) and skips
  auto-resume for that boot once tripped. The gateway still comes up and
  serves real inbound messages; it just stops replaying the session that
  keeps killing it, putting a human back in the loop.

Also tightened the lifecycle regex over main's version: dropped
`hermes gateway start` (benign), required the gateway identifier on the
launchctl/systemctl branches (so `launchctl unload
ai.hermes.update-checker.plist` and `systemctl restart
hermes-meta.service` no longer false-positive), added the inverse
pkill token order, and fixed the binary-script bypass (decode with
errors='replace' instead of swallowing UnicodeDecodeError). The
create_job guard resolves relative script paths under HERMES_HOME/scripts
the same way the scheduler does, so a bare script name is scanned as the
file that actually runs.

Design and much of defense-2 originate from PR #33395 (@kshitijk4poor),
which itself salvaged #30728 (@SimoKiihamaki). Rebuilt against current
main since defenses 1-2 had already landed under different names.

Closes #30719.

Co-authored-by: SimoKiihamaki <simo.kiihamaki@gmail.com>
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-07-01 02:48:36 -07:00
..
scripts fix(cron-recipes): pre-release hardening — honest cadences, strict slot names, surface-aware UX 2026-06-11 10:49:47 -07:00
__init__.py docs: clarify gateway service scopes (#1378) 2026-03-14 21:17:41 -07:00
blueprint_catalog.py docs: finish Automation Blueprints terminology rebrand (#44470) 2026-06-11 17:22:22 -04:00
jobs.py fix(gateway,cron): guard cron model-tool path + add auto-resume loop breaker (#30719) 2026-07-01 02:48:36 -07:00
lifecycle_guard.py fix(gateway,cron): guard cron model-tool path + add auto-resume loop breaker (#30719) 2026-07-01 02:48:36 -07:00
scheduler.py security(cron): fail closed in scheduler backstop when validator errors 2026-07-01 14:23:01 +05:30
scheduler_provider.py fix(cron): avoid provider package shadowing core cron 2026-06-23 23:39:22 -07:00
suggestion_catalog.py fix(cron-recipes): pre-release hardening — honest cadences, strict slot names, surface-aware UX 2026-06-11 10:49:47 -07:00
suggestions.py fix(cron): make per-profile cron isolation intentional and tested (#4707) (#53570) 2026-06-27 03:55:01 -07:00