hermes-agent

Author	SHA1	Message	Date
Que0x	62882b8e6f	fix(matrix): isolate per-event failures in _dispatch_sync gather `_dispatch_sync` gathers the mautrix per-event handler tasks with a bare `asyncio.gather(*tasks)`. Without `return_exceptions=True`, the first handler that raises aborts the gather, so the sibling events in the same sync response are dropped unprocessed — the exception propagates up to the sync loop, which logs a single "sync error" and moves on. The invite/redaction gathers a few lines above already use `return_exceptions=True`. Use `return_exceptions=True` and log each failing handler, so one bad event no longer takes out the rest of its batch and per-event failures stay visible. Regression test: a batch with one failing and one succeeding handler no longer raises, the good handler still runs, and the failure is logged (mutation-verified — reverting re-raises RuntimeError out of _dispatch_sync).	2026-07-03 03:27:47 -07:00
Eugeniusz Gilewski	e4dbb67bf5	fix(security): remove model-controlled delegate ACP transport Source: https://github.com/NousResearch/hermes-agent/pull/52346 Related prior work: https://github.com/NousResearch/hermes-agent/pull/39462 Related prior work: https://github.com/NousResearch/hermes-agent/pull/27426 Maintainer direction: https://github.com/NousResearch/hermes-agent/pull/52346#issuecomment-4854881612 Remove acp_command and acp_args from the model-facing delegate_task schema and dispatch paths. Child agents can still use ACP subprocess transport when it comes from trusted delegation config or parent inheritance, but a model tool call can no longer choose the command or arguments that reach child construction. This is salvageable because the risky boundary is model control over child ACP transport, not ACP itself. The patch follows the maintainer direction from the source discussion by preserving trusted ACP configuration and prior integration work while removing the untrusted tool-call fields from both top-level and per-task delegate inputs. Reproduced on main by passing acp_command through delegate_task and observing it reach _build_child_agent. Verified after the fix that model dispatch strips the hidden top-level fields and per-task hidden fields are ignored before child construction. Co-authored-by: Carlosian <claudlos@agentmail.to> Co-authored-by: ssiweifnag <120658181+ssiweifnag@users.noreply.github.com> Co-authored-by: nikshepsvn <23241247+nikshepsvn@users.noreply.github.com>	2026-07-03 03:27:47 -07:00
liuhao1024	1bcc52c14e	fix(dashboard): use pattern match for .env sensitive file guard Replace the exact-filename frozenset with _is_sensitive_filename() that matches .env plus any .env.<suffix> variant. This covers shorthand suffixes like .env.prod that the previous enumeration missed. Add test_sensitive_env_suffix_variants_blocked regression test covering .env.prod, .env.dev, .env.staging.local, and .env.ci. Addresses review feedback from egilewski on PR #57507.	2026-07-03 03:27:47 -07:00
liuhao1024	bc55c201c7	fix(dashboard): block .env files from managed-files API The dashboard Files tab could list, read, and download .env files containing API keys when running with a bind-mounted Hermes home directory (e.g. docker run -v ~/.hermes:/opt/data). Add _SENSITIVE_FILENAMES frozenset and filter these from list_managed_files(), read_managed_file(), and download_managed_file(). Return 403 for direct read/download attempts on sensitive files. Fixes #57505	2026-07-03 03:27:47 -07:00
srojk34	16332af60b	security(gateway): anchor api_server MEDIA tag resolution to safe paths _resolve_media_to_data_urls's ad-hoc _MEDIA_TAG_RE matched any bare token after MEDIA: (no absolute-path anchor) and read the resolved path directly with no denylist. A relative/traversal path like MEDIA:../../../../etc/passwd.png slipped through, and any image-suffixed file the process could read (including under ~/.ssh, ~/.aws, etc.) was base64-inlined into the API response if its path merely appeared in the model's own final reply text. Every other platform adapter's MEDIA: handling already goes through two shared primitives in gateway/platforms/base.py: - MEDIA_TAG_CLEANUP_RE, which anchors the path to ~/, /, or a Windows drive letter plus a known deliverable extension. - validate_media_delivery_path, which resolves symlinks and rejects paths under the credential/system-path denylist. Reuse both here instead of the local unanchored pattern and naive Path().expanduser() resolution.	2026-07-03 03:27:47 -07:00
srojk34	47764f19f4	fix(browser): apply private-page guard to browser_cdp frame_id routing browser_cdp's frame_id (OOPIF) path returned early via _browser_cdp_via_supervisor before _browser_cdp_private_guard ever ran, unlike the stateless path a few lines below. A model that navigated a cloud browser to a private/internal URL could still read page content by passing frame_id, bypassing the same SSRF/private-page boundary already enforced on Runtime.evaluate, Page.navigate, and other raw CDP calls. Apply the same guard call used by the stateless path before dispatching to the supervisor, so both routing modes share one boundary.	2026-07-03 03:27:47 -07:00
dsad	4470d957cb	fix(browser): block Camofox input on private pages	2026-07-03 03:27:47 -07:00
Teknium	b14d75f8af	fix(update): prevent and self-heal half-updated venvs on Windows (#57659) Root-causes the July 2026 Windows incident chain (locked _brotlicffi.pyd / _sodium.pyd during install, then 'No module named annotated_doc' with 'hermes update' insisting 'Already up to date!'): - hermes update: probe venv core imports even when the checkout is current; a half-updated venv (dep sync killed mid-flight by a locked .pyd) is now detected and repaired instead of being reported as up to date - hermes update (Windows): after pausing gateways, refuse to mutate the venv while other processes run from the venv interpreter (the Desktop backend runs as python.exe so the hermes.exe shim guard never saw it); --force keeps the old behavior - install.ps1 venv stage: disarm gateway autostart Scheduled Tasks before the kill sweep (they respawn the gateway inside the kill->delete window), make the sweep a bounded loop requiring 3 clean passes, and rename-then-delete the old venv (a rename succeeds even with mapped DLLs) with stale-dir cleanup on the next run - desktop updater: 'venv shim still locked after 15s' now ABORTS the update hand-off (restarting our backend, surfacing the holder to the user) instead of 'proceeding anyway (force)' into guaranteed venv corruption; the unlock wait also re-kills respawned backends each poll tick	2026-07-03 03:24:08 -07:00
LeonSGP43	bb24ac6f20	fix(gateway): preserve queued native image attachments	2026-07-03 03:21:09 -07:00
tt-a1i	e880396488	fix(gateway): key native image handoff by session	2026-07-03 03:21:09 -07:00
Brooklyn Nicholson	c1e825399c	test(gateway): stub get_compression_tip in stale-guard db mock The routing-heal added to get_or_create_session calls SessionDB.get_compression_tip; the stale-guard suite's bare MagicMock db returned a Mock the heal then assigned as session_id, failing JSON serialization. Model the real contract (a non-compressed session's tip is itself) so the heal is a correct no-op.	2026-07-03 04:46:01 -05:00
Brooklyn Nicholson	52d0d671e7	fix(desktop): poll messaging sessions so platform traffic appears live Inbound Telegram/WeChat/Discord messages are written by the background gateway, not the desktop websocket that drives local chats. Without explicit polling the messaging sidebar and the open transcript stay frozen until the user manually refreshes. Desktop: - MESSAGING_POLL_INTERVAL_MS (10 s): interval poll of the messaging session list so new platform sessions surface automatically. - ACTIVE_MESSAGING_SESSION_POLL_INTERVAL_MS (5 s): poll the currently-viewed messaging transcript and re-hydrate the chat state when the FNV-1a signature changes (hash covers role + timestamp + content). - sameCronSignature now compares lineage_root_id / source / profile / preview / message_count / last_active / ended_at so stale previews and activity times are no longer silently ignored. - sessionMatchesStoredId helper de-dups the id / _lineage_root_id check. - refreshMessagingSessions exposed from useSessionListActions so the controller can use it in the poll effect. Gateway: - SessionStore._compression_tip_for_session_id: look up the latest compression continuation for a session id. - SessionStore._heal_compression_tip_locked: rewrite a stale entry to the compression child before returning it, so a restart or failed send no longer leaves the store pinned to the compressed parent. Co-authored-by: lawyer112 <lawyer112@users.noreply.github.com>	2026-07-03 04:29:22 -05:00
teknium1	eb99f82ce4	fix(browser): surface launch diagnostics when debug browser never opens the CDP port Follow-up to the salvaged early-exit retry fix (#35617): the debug-browser launch path was fire-and-forget (stderr to DEVNULL, no logging), so every platform failure — Windows singleton forward to an existing instance, bad profile dir, missing shared libraries, policy blocks — collapsed into the same unactionable 'port 9222 isn't responding yet' message and debug reports contained nothing. - launch_chrome_debug() returns a structured ChromeDebugLaunch with per-candidate attempts (state, exit code, stderr tail) - browser stderr is captured to <hermes_home>/chrome-debug/launch-stderr.log - clean exit (code 0) without the port opening is detected as Chromium's single-instance forward and produces a targeted user hint to close all running instances of that browser - crash exits surface the stderr tail (e.g. missing libnspr4.so) - every spawn/exit is logged to agent.log so hermes debug share captures it - CLI (/browser connect) and TUI/desktop (browser.manage) both print the hint	2026-07-03 01:05:22 -07:00
LeonSGP43	c74f093523	fix(browser): retry next candidate when debug launch exits early	2026-07-03 01:05:22 -07:00
Teknium	c7103c637c	feat(desktop): CLI/dashboard parity — skills hub, MCP test/toggle/catalog, maintenance ops, log filters (#57441) * feat(desktop): CLI/dashboard parity — skills hub browser, MCP test/toggle/catalog, maintenance ops, log filters Brings desktop GUI to parity with hermes skills/mcp/doctor/backup/debug-share/curator/memory CLI commands and the dashboard's System + Skills-hub pages: - Skills page: new Browse Hub tab (search official/GitHub/community sources, preview SKILL.md, security scan verdicts, install/update with live action log) - MCP settings: connection test (tool listing), per-server enable/disable toggle, and a Catalog tab installing Nous-approved MCP servers with env prompts - Command Center: new Maintenance section (doctor, security audit, backup, debug share links, curator status/pause/run, memory file status + reset) - Command Center system logs: file (agent/errors/gateway/desktop), level, and substring filters instead of a fixed agent.log tail - hermes.ts API client + types for all the above; en/zh locale strings (ja and zh-hant inherit via defineLocale) * feat(desktop): backend model catalogs in toolset config — hermes tools parity Completes the `hermes tools` parity gap: after picking an image/video generation backend the CLI runs a model picker (e.g. FAL's multi-model catalog with speed/strengths/price); the desktop toolset drawer now has the same flow as a radio-card list. - web_server: GET /api/tools/toolsets/{name}/models (catalog + current + default for the active or named provider row) and PUT .../model (validated write to image_gen.model / video_gen.model), reusing the CLI's plugin catalog helpers so GUI and `hermes tools` stay in lockstep - desktop: ModelCatalogPicker in ToolsetConfigPanel — per-model cards with speed/strengths/price, in-use + default badges, disabled until the backend is the active one; provider selection now mirrors is_active locally so the catalog unlocks without a refetch - tests: 3 backend endpoint tests (catalog shape invariants, persist + validation), 2 component tests, 2 API-contract tests; en/zh strings	2026-07-03 01:02:47 -07:00
Teknium	6eb39c2bbe	fix(opencode-go): heal stripped /v1 base_url so non-minimax models stop 404ing (#57585) OpenCode Go serves minimax/qwen via Anthropic Messages (base URL without /v1 — the SDK appends /v1/messages) and glm/kimi/deepseek/mimo via OpenAI chat completions (base URL WITH /v1). The runtime stripped /v1 for anthropic-routed models, and the TUI/desktop + gateway persisted that stripped URL to model.base_url. Every later chat_completions model then POSTed to https://opencode.ai/zen/go/chat/completions — a 404 (the marketing site). Result: only minimax worked; glm/deepseek/kimi all 404ed. - New normalize_opencode_base_url(): symmetric /v1 normalization — strip for anthropic_messages, re-append for chat_completions / codex_responses on opencode.ai hosts (heals persisted stripped URLs; custom proxy overrides untouched) - Applied at all three former one-way strip sites (resolve_runtime_provider x2, switch_model) - opencode_model_api_mode: all Qwen models on Go AND Zen now route via /v1/messages per current published endpoint tables (previously only qwen3.7-max on Go — qwen3.6-plus etc. would 404 the same way) - Catalog refresh: Go gains deepseek-v4-pro/flash, glm-5.2, kimi-k2.7-code, minimax-m3, qwen3.7-plus; Zen gains glm-5.2, kimi-k2.7-code, minimax-m3, qwen3.7-plus Reported by IndieSuperhuman on X: opencode-go 404s for any model other than minimax.	2026-07-03 00:46:45 -07:00
Teknium	372f8195c7	fix(moa): default temperatures to unset — provider default, like single-model agents (#57440) A single-model Hermes agent never sends temperature; the provider default applies. MoA hardcoded reference_temperature=0.6 / aggregator_temperature=0.4, and the coercion float(preset.get(key, 0.6) or 0.6) made unset IMPOSSIBLE to express: absent, null, empty, and even an explicit 0 all collapsed to the baked-in default. Every MoA advisor and aggregator therefore ran at 0.6/0.4 while the same model running solo used the provider default — silently skewing solo-vs-MoA comparisons and overriding provider-tuned defaults. - moa_config normalization: temperatures coerce to None when absent/blank/invalid (new _coerce_float_or_none); explicit values incl. 0 honored. - moa_loop: _preset_temperature() resolves preset values; None flows to call_llm, which already omits the parameter when None (same contract as max_tokens). Aggregator still inherits the acting agent's own configured temperature when the preset doesn't pin one. - conversation_loop (context-mode MoA): same resolution, no more hardcoded 0.6/0.4 at the call site. - DEFAULT_CONFIG preset + web_server payload models + docs updated: unset is the default, pinning stays available.	2026-07-03 00:22:49 -07:00
kshitijk4poor	e1a1dac848	fix(agent): enforce marker-strip invariant with a single terminal sweep (#57491) Follow-up to the per-site strips from the review gate. The two copy-site strips are correct but positional — a copy site added after the assembly loops would re-leak _db_persisted into the child-session flush. Add a single terminal sweep (_strip_persistence_markers) run once on the fully-assembled compressed list so the invariant 'no compacted message leaves compress() carrying a persistence marker' is structural, not dependent on copy-site order. - agent/context_compressor.py: _strip_persistence_markers() called before compress() returns; helper docstring notes the sweep is the authoritative guard - tests/agent/test_context_compressor.py: structural regression — neuter the per-site helper to a leaking copy, assert the terminal sweep still strips - tests/run_agent/test_compression_persistence.py: pin the fixture assumption behind the exact-equality row-count assertion	2026-07-03 12:51:12 +05:30
nankingjing	3e204bd771	fix(agent): strip _db_persisted when assembling rotation compression transcript (#57491) Shallow messages[i].copy() during context compression propagated the _db_persisted marker from cached gateway incremental flushes into the post-rotation compressed list. _flush_messages_to_session_db then skipped every row when writing to the new child session, so gateway restarts lost the compacted transcript (severe amnesia). Strip the marker in _fresh_compaction_message_copy() and add regression tests for rotation flush + compressor assembly. Fixes #57491	2026-07-03 12:51:12 +05:30
kshitijk4poor	5e2b051e60	test(slack): give the MPIM reaction-guard test real teeth The reaction-guard regression test defined a local _should_react lambda and asserted it against itself — a tautology that would stay green even if the production guard at _handle_slack_message reverted to (is_dm or is_mentioned), re-introducing the unmentioned-MPIM reaction spam this PR fixes. Replace it with a shared _reaction_guard helper plus a source-introspection test that pins the production expression: asserts (is_one_to_one_dm or is_mentioned) is present and (is_dm or is_mentioned) is absent. Mutation-checked — reverting the adapter guard now fails the test. Follow-up self-review finding on the salvage of #57339.	2026-07-03 12:34:53 +05:30
Victor Kyriazakos	accd672054	fix(slack): MPIMs (group DMs) obey shared-surface mention gating + reaction guard Group DMs (MPIMs) were classified as DMs and thereby exempted from every operator control that shared surfaces are supposed to honor: allowed_channels, require_mention, strict_mention, free_response_channels, and the reaction guard. Symptom: the bot added 👀/✅ to unmentioned MPIM messages and still invoked the agent (which then returned NO_REPLY) instead of the gateway dropping the event before model execution. Removing an MPIM from allowed_channels did not disable it. Root cause is the DM classification at adapter.py: is_dm = channel_type in {"im", "mpim"} used for BOTH routing exemptions and reaction gating. An MPIM is a shared surface (multiple humans can see and trigger the bot), not a private 1:1 DM, so it must be gated like a channel. This behavior was introduced/reinforced by a trail of Slack group-DM PRs: - #4633 fix(slack): treat group DMs (mpim) like DMs + reaction guard - #54632 fix(slack): subscribe to message.mpim + mpim scopes so group DMs work - #54663 fix(slack): group DMs work OOTB + reinstall nudge #54632/#54663 correctly made MPIM messages reachable; #4633 over-reached by giving them the DM mention/reaction exemptions. This corrects only that over-reach. Fix (minimal): introduce `is_one_to_one_dm = channel_type == "im"` and key the two EXEMPTION sites off it instead of `is_dm`: - mention/allowlist gating block (`if not is_one_to_one_dm and bot_uid:`) - reaction guard (`(is_one_to_one_dm or is_mentioned)`) `is_dm` is intentionally retained for session/thread scoping and chat_type labeling, where treating an MPIM as a persistent multi-party conversation is correct — only the mention/reaction exemptions were wrong. Docs: slack.md now distinguishes 1:1 DMs (mention-exempt) from group DMs (shared surface; obey require_mention/strict_mention/allowed_channels/free_response_channels; reactions only when @mentioned). Tests: +7 in test_slack_mention.py (MPIM unmentioned dropped under require_mention and strict_mention; MPIM mentioned processed; MPIM off allowed_channels dropped; MPIM in free_response opted in; 1:1 IM still exempt; reaction guard drops unmentioned MPIM). Updated _would_process to model the is_one_to_one_dm gating + strict_mention. 72 passed.	2026-07-03 12:34:53 +05:30
Gille	551e5af50d	fix(config): preserve owner on atomic writes (#56644)	2026-07-03 14:27:34 +10:00
Gille	e9ce250374	fix(file-tools): preserve container paths for docker file ops (#56637)	2026-07-03 14:18:20 +10:00
Brooklyn Nicholson	89acc19606	fix(dump): flag API keys visible only to the shell, not the managed backend hermes debug share reads os.getenv — the invoking terminal's environment — but launchd/systemd and the desktop-spawned `serve` backend load credentials from ~/.hermes/.env, not the login shell. A key exported in the shell but absent from .env is invisible to the backend, yet the dump printed a bare "set", sending support down a phantom "the key is configured" path. This was the actual trap behind a "Desktop has no web_search / no tools" report: FIRECRAWL_API_KEY was a shell export (so `debug share` in a terminal read "firecrawl set") but not in .env, so the launchd backend's check_web_api_key returned False and web_search was gated off — which a contributor then misdiagnosed as a missing `desktop` platform registration. The dump now annotates any key set in-process but missing from ~/.hermes/.env with "(shell only — not in .env; managed/desktop backend may not see it)" so the mismatch is obvious instead of hidden behind "set".	2026-07-02 19:52:18 -05:00
Teknium	64ed99a6e6	fix(webhook): close per-delivery session at the true end of the run (#57423) The merged webhook session-close fix (#57370, salvaging #57322) wrapped handle_message in a try/finally — but BasePlatformAdapter.handle_message is fire-and-forget: it spawns _process_message_background and returns before the agent run starts. The finally-close therefore ran BEFORE get_or_create_session created the session row, found no session_id, and silently no-op'd — the ghost-session leak persisted on the real path. (The shipped test masked this by stubbing handle_message with a fake that created the row synchronously.) Move the close to an on_processing_complete override — the lifecycle hook the base class fires at the TRUE end of the run, on the success, failure, and cancellation paths alike. Empirically verified through the real fire-and-forget pipeline: before, ended_at stayed NULL; after, ended_at is set with end_reason=webhook_complete and the row is prunable. Tests now stub only the runner-side _message_handler (the seam the live gateway injects) so handle_message / _process_message_background / on_processing_complete all run for real; adds an AsyncSessionDB-facade coverage test for the coroutine-await branch.	2026-07-02 17:39:09 -07:00
kshitijk4poor	ed4123792c	refactor(providers): dedupe extra_headers normalizer + key picker groups by headers Follow-up to @helix4u's #57336 salvage. Two review findings: - W1: model-picker grouped custom-provider rows by (api_url, credential, api_mode) but NOT extra_headers. Entries sharing a URL+credential+api_mode yet declaring different headers (e.g. per-tenant routing behind one proxy) collapsed into one row and probed /models with whichever header set was seen first (order-dependent). Fold a canonical header identity into group_key so distinct header-authed endpoints stay separate; drops the now-dead first-non-empty merge branch. - W2: the extra_headers stringify+None-filter comprehension existed in 5 copies (config.py x2, runtime_provider.py, model_switch.py, models.py). Extract one shared hermes_cli.config.normalize_extra_headers primitive; all sites now call it. Tests: +normalize_extra_headers unit tests, +regression test proving two same-endpoint entries with different headers stay distinct and each probes with its own headers. 223 targeted tests pass; ruff clean.	2026-07-03 04:23:15 +05:30
helix4u	ab40e952f3	fix(providers): pass extra headers to model discovery	2026-07-03 04:23:15 +05:30
kshitijk4poor	0950dae2fa	Merge remote-tracking branch 'upstream/main' into HEAD # Conflicts: # scripts/release.py	2026-07-03 03:52:15 +05:30
kshitijk4poor	201b646d67	fix(gateway): complete on_session_end coverage across all eviction paths Follow-up to the cherry-picked #31856 fix. The contributor's guard defers idle-TTL eviction until the session store reports the session expired, so the expiry watcher can tear the agent down and fire MemoryProvider.on_session_end() with the live transcript. Two gaps remained: 1. Memory-leak regression for mode='none' sessions. _is_session_expired() returns False forever for the 'none' reset policy, so the naive guard would never idle-evict those agents — reopening the unbounded-cache leak the idle sweep (#11565) exists to relieve. Added SessionStore.is_session_finalizable() (a public predicate: will the expiry watcher EVER finalize this session?) and gate the deferral on it. mode='none' agents fall through to soft eviction as before. 2. on_session_end still dropped on the LRU-cap path. Both cache-pressure paths (_enforce_agent_cache_cap and _sweep_idle_cached_agents) soft-evict via _release_evicted_agent_soft, which by design does NOT fire on_session_end. If cache pressure evicts a finalizable-but-not-yet-expired agent before it expires, the watcher later finds no cached agent and the hook is skipped. Added _commit_memory_before_soft_evict(): at LRU eviction, if the session is finalizable and not yet expired, commit end-of-session extraction via the live agent's own (fully-scoped) memory manager using commit_memory_session() — extraction WITHOUT provider teardown, so the eviction stays soft and a resumed turn keeps working. Skipped for mode='none' (no missed boundary to compensate) and expired sessions (the watcher tears those down directly). This closes #11205 for ALL eviction paths and reset policies, not just the idle-sweep + finite-policy case, while preserving the soft-eviction resumability contract (never calls close() on a live session). Tests: 5 new cases in test_agent_cache.py (mode='none' still reaped, LRU-cap commits for finalizable / skips for none, real is_session_finalizable predicate); all mutation-checked. Contributor's original 2 tests updated to assert the finalizable path explicitly.	2026-07-03 03:46:43 +05:30
Hermes Trismegistus	90b618f48a	fix(gateway): keep idle cached agents alive until session actually expires The idle-TTL sweep (_sweep_idle_cached_agents) was evicting agents as soon as they passed _AGENT_CACHE_IDLE_TTL_SECS, even when the session hadn't expired yet. In daily-reset mode the reset can fire hours after the last user message — evicting the agent early means the session-expiry watcher has no agent in cache to call on_session_end() with, so memory providers miss the live transcript. Now the sweep checks the session store before evicting: if the session still exists and hasn't expired, the agent stays in cache so the expiry watcher can tear it down properly later. When the session store is unavailable or throws, falls back to the original eviction behavior (safe default). Fixes: #11205	2026-07-03 03:46:43 +05:30
kshitijk4poor	1c93799b49	fix(agent): self-review follow-ups on vLLM local-context salvage Self-review (ruff+ty lint diff = 0 net-new; 2-agent deep review) surfaced one Warning + comment-accuracy nits; no Critical: - W1: the local-probe TTL cache memoized None (probe failure) for 30s, so a probe that failed during a startup race would suppress a legit retry once the server came up. Cache only positive results — still fully bounds the hot-path probe rate (reachable servers cache their value) while an unreachable one re-probes on the next call. Add a regression test asserting a None result is NOT cached (retry re-probes); mutation-verified. - Tighten the platform-guard comment: gateway/TUI/cron already construct with quiet_mode=True (gated by `not agent.quiet_mode`), so the guard's active job is CLI dedup vs show_banner, not "filling the gateway/TUI gap" as originally worded. Verified not-issues (per review): positive-value 30s cache does not break the reconcile-after-restart freshness contract (restart = fresh process, empty cache); cache key is collision-safe; platform guard is correct in both directions (no runtime path leaves platform None on a non-CLI surface). Tests: 149 passed. ruff clean; ty 0 net-new vs base.	2026-07-03 03:36:22 +05:30
kshitijk4poor	e73adb5043	fix(dashboard): disable ws keepalive ping on loopback to survive event-loop stalls Desktop/dashboard WebSocket connections drop during long agent operations (delegate_task subagents, large model outputs) when the uvicorn event loop is GIL-starved for minutes. Root cause: uvicorn's ws keepalive ping runs on the SAME event loop as agent turns. A single synchronous GIL-holding call on a worker thread (a regex/scrub over a large output, or a long subagent turn) freezes the loop, so it cannot process the incoming pong within ws_ping_timeout and uvicorn closes an otherwise-healthy connection (#53773: 'event loop stalled 226.3s'; #48445/#50005). Loosening the timeout only raises the threshold — a multi-minute stall sails past any finite window. The keepalive ping exists to detect half-open connections (reverse-proxy 524, dropped tunnels), which cannot happen on loopback: there is no network or proxy in the path, and a dead local client tears the socket down with a real FIN/RST that starlette surfaces as WebSocketDisconnect regardless of the ping. So on loopback the ping provides ~no liveness value while actively killing recoverable stalls — disable it entirely (ws_ping_interval/timeout=None). Non-loopback (public) binds sit behind a Cloudflare Tunnel where half-open IS a real failure mode, so the ping stays at 20/20 to detect it. Empirically verified (real uvicorn + websockets peer): with ws_ping=None the server never closes a silent peer during an 8s window; with the pre-fix 2s/2s window uvicorn closes it. A genuinely-dead client still fires the WebSocketDisconnect reap path regardless of the ping. Note: this fixes the local Desktop case (the OP's scenario). A remote Desktop over an authenticated public dashboard route (McCalebTheSecond's comment) keeps the ping and needs the deeper GIL-hotspot fix — tracked separately. Closes #53773	2026-07-03 03:33:22 +05:30
kshitijk4poor	b9a197ec59	fix(agent): resolve review findings on vLLM local-context salvage Salvage review of #56431 surfaced one Critical + two Warning issues; fix them on top of the contributor's cherry-picked commits: 1. Critical — duplicate non-agentic warning on the interactive CLI. The new agent_init warning fires on every platform, but cli.py show_banner() already warns on CLI (richer output + /model hint), so a CLI user saw the warning twice per startup. Guard the agent_init emit to skip platform=="cli" — it now fills exactly the gateway/TUI gap the PR intended, no duplication. 2. Warning — vLLM error-parse regex under-matched. The patterns required a literal space before the number, so "max_model_len: 32768", "=32768", "(32768)", and "... is 32768" all returned None. Broaden both patterns to accept :/=/(/ 'is' delimiters. Add a parametrized test over all delimiter variants. 3. Warning — per-call live probe latency on local endpoints. The new reconcile-on-hit + pre-defaults step-7 probe made every local resolution fire a synchronous network probe (banner + /model switch + compressor update_model each within one startup). Add a 30s in-process TTL cache keyed by (model, base_url) around _query_local_context_length so back-to-back resolutions reuse one round-trip; not persisted to disk, so the reconcile freshness contract (re-probe after restart) is preserved. Add an autouse fixture clearing the cache between tests + TTL coverage. Tests: 148 passed (was 138). ruff clean.	2026-07-03 03:27:13 +05:30
kshitijk4poor	65cb70b8d0	refactor(gateway): add SessionStore.peek_session_id public accessor for webhook close Replace the webhook delivery-close path's direct reach into private SessionStore._entries (which also bypassed the store lock) with a public, lock-held peek_session_id(session_key) accessor. Mirrors the existing lookup_by_session_id inverse helper. Keeps a getattr fallback for older stores / test doubles. Adds a unit test for the accessor.	2026-07-03 03:26:53 +05:30
Gumclaw	14882bab7e	fix(gateway): close webhook sessions on delivery completion so prune can reap them Webhook deliveries created a unique one-shot session (delivery_id baked into the session key at gateway/platforms/webhook.py:668) but the adapter fired handle_message via asyncio.create_task WITHOUT ever ending the session (webhook.py:713, pre-fix). Nothing else closes it: the gateway caches/expires the agent per session_key but never calls end_session for the webhook path, and _end_session_on_close teardown doesn't run for these fire-and-forget tasks. SessionDB.prune_sessions (hermes_state.py:4965) only deletes rows WHERE ended_at IS NOT NULL. So every webhook session stayed with ended_at NULL -> unprunable -> unbounded state.db growth. This was the primary driver of the SQLite lock-contention gateway outage. Fix: wrap the delivery in _run_delivery_and_close, which awaits handle_message and then (in finally, so failures still reap) calls _end_webhook_session -> SessionDB.end_session(session_id, 'webhook_complete'). This mirrors how cron closes its session with 'cron_complete' (cron/scheduler.py:3065). end_session is first-reason-wins and no-ops on an already-ended row, so it never clobbers a compression/agent_close reason. Adds tests/gateway/test_webhook_session_close.py asserting the invariant (a completed webhook session has ended_at set + is prunable), including the error-path case, against a real SessionStore + SessionDB.	2026-07-03 03:26:53 +05:30
infinitycrew39	53063d92b0	test(agent): cover local vLLM context-length resolution Add regression tests for vLLM max_model_len error parsing, stale local cache reconciliation, live probes over llama defaults, and the 64K minimum guard on persistent cache writes. (cherry picked from commit 1cb47ef437de7ce289cb358e8d6b89e9194b43ed)	2026-07-03 03:22:51 +05:30
kshitijk4poor	033d7bf259	fix(slack): guard blank-line list continuation on next-item lookahead Refine the blank-line handling so a blank line only continues a list run when the next non-blank line is another list item. This keeps a list -> paragraph -> list sequence as three separate blocks and matches the contiguous-list layout for mixed/nested lists (one rich_text block, split into sub-lists by (indent, ordered)), rather than emitting a separate block per item. Adds regression tests for the mixed blank-separated layout and the list->paragraph->list boundary.	2026-07-03 02:55:22 +05:30
liuhao1024	d3c8a155cb	fix(slack): keep blank-line-separated ordered items in one rich_text_list When a Markdown ordered list has blank lines between items (common in LLM-authored content), the list run loop breaks on each blank line. Slack numbers each rich_text_list independently, so N items produce N lists each starting at 1. Skip blank lines inside the list run as soft separators instead of breaking, so ordered items stay in one rich_text_list and Slack renders the correct numbering. Fixes #57076	2026-07-03 02:55:22 +05:30
Yingliang Zhang	67472fbaa4	fix(tui_gateway): route setup.runtime_check and setup.status to RPC pool setup.runtime_check and setup.status are polled by the Desktop frontend on connect and periodically (use-status-snapshot → evaluateRuntimeReadiness), but neither was in _LONG_HANDLERS, so dispatch() ran both inline on the WS reader thread. Under GIL pressure from concurrent agent turns (terminal I/O, large output, background-process completions) either can block for seconds: - setup.runtime_check → resolve_runtime_provider() (config read, auth check, may probe the provider endpoint) - setup.status → _has_any_provider_configured() (provider config + credential scan) While either blocks the reader thread the WS read loop can't service later requests; the frontend RPC timeout fires, the client drops the socket, and the lost setup.runtime_check response reads as ready=false: a false "needs setup" / "Settings failed to load" even though the provider is configured. Route both to the RPC pool (same precedent as #55545's session.list/pet.info/process.list). The handlers are read-only and pool writes go through the lock-guarded write_json, so there's no ordering or safety concern. Test asserts all 5 frontend-polled RPCs are pool-routed. Co-authored-by: izumi0uu <izumi0uu@gmail.com>	2026-07-02 15:44:37 -05:00
Brooklyn Nicholson	1501a338c3	fix(cli): stop profile-bound backends before deleting so rmtree converges delete_profile stopped only the process named in gateway.pid, but a Desktop app spawns a headless `serve`/`dashboard` backend per profile that holds the profile's SQLite connection open and keeps writing sessions/WAL/sandbox files. That backend is never in gateway.pid, so a CLI `hermes profile delete` run while the Desktop app is up left it writing into the tree — rmtree's final rmdir then failed with ENOTEMPTY (#47368 "Bug 2"), and pre-guard it also resurrected the directory. - _profile_bound_backend_pids(): find running Hermes backends bound to this profile via a `--profile <name>` selector or a HERMES_HOME env resolving to the profile dir. Tightly scoped — current-user only, backend subcommands (serve/dashboard/gateway) only so an interactive chat is never killed, and never this process or its ancestors. - _stop_profile_backends(): terminate them (graceful, then force), best-effort so it can never make delete worse. - _rmtree_with_retry(): a few spaced retries absorb the ENOTEMPTY / Windows file-lock race from a just-terminated writer's in-flight -wal/-shm/sandbox writes instead of failing the whole delete on a race the next attempt wins. Complements the recreation guard (deleted profiles no longer reappear) and the Desktop teardown-before-delete flow; this is the CLI-side convergence fix for a delete run while a Desktop-managed backend is live. Part of #47368.	2026-07-02 15:31:35 -05:00
Brooklyn Nicholson	5a6720b884	fix(desktop,tui-gateway,zai): stop thinking-off from reverting to medium A Z.ai desktop user reported thinking reverting to medium after one turn, burning ~200% of a week's credits in 4 days despite reasoning_effort: false in config.yaml. Four compounding bugs: - _session_info reported reasoning_effort "" for disabled reasoning, indistinguishable from unset; the desktop adopted it after the first turn, wiping its sticky "thinking off" pick so every later chat reverted to the default effort. - config.set key=reasoning always wrote agent.reasoning_effort to global config.yaml, so every desktop model-menu selection (preset.effort ?? 'medium') clobbered the user's configured value. Now session-scoped like the messaging gateway's /reasoning, landing on create_reasoning_override so lazily-built sessions keep it too. - YAML `reasoning_effort: false`/`off`/`no` (boolean False) was coerced to "" by every loader's `str(x or "")`, silently re-enabling thinking. parse_reasoning_effort now treats False/"false"/"disabled" as {"enabled": False}; loaders (tui gateway, gateway, cli, cron, delegate) pass the raw value through. The desktop config reader also crashed on the boolean (false.trim()), aborting voice/STT settings. - The zai provider profile never sent thinking on the wire, and GLM-4.5+ defaults to thinking ON server-side, so disabling reasoning was a silent no-op on direct Z.ai, the actual token burner. The profile now emits extra_body.thinking {"type": "enabled"|"disabled"} for thinking-capable GLM models, mirroring the DeepSeek profile. Also: /new (session reset) now carries reasoning_config across the rebuild like model_override; config.get reasoning prefers the session's live value and maps a config False to "none"; Settings shows "Off" instead of a blank select for hand-written false.	2026-07-02 15:23:47 -05:00
teknium1	254328bf56	fix(auth): remove stale loopback_pkce reference in xAI quarantine removal list The terminal-refresh quarantine filtered in-memory entries on source == "device_code" but built removed_ids from the deleted "loopback_pkce" source name, so the revoked device-code entry was never pruned from the persisted pool in auth.json. Also restores the _print_loopback_ssh_hint test suite scoped to Spotify (the helper's remaining caller) instead of deleting it wholesale.	2026-07-02 13:17:41 -07:00
Jaaneek	5ef0b8acb0	feat(auth): make xAI Grok OAuth device-code-only, drop loopback login Replace the loopback/PKCE-callback server and manual-paste fallback with the RFC 8628 device-code flow as the only xAI Grok OAuth login path. The flow works in headless/SSH/container sessions with no 127.0.0.1 listener, shrinking the local attack surface. - Poll the token endpoint with server-provided interval, honoring slow_down and expires_in; store tokens with auth_mode oauth_device_code. - Adaptive proactive refresh skew for short-lived device-code JWTs; rotated tokens sync back to auth.json, the global root store, and the credential pool (no refresh-token replay). - Clear source suppression on successful re-login (CLI + dashboard) and drop the duplicate dashboard pool entry so exactly one seeded device_code entry exists. - Use the shared device_code source name for consistency with the nous/codex device-code providers. - Desktop: remove the loopback OAuth flow states and dead type variants; pkce providers' sign-in URL selection is unchanged. - Docs (EN + zh-Hans) rewritten for device-code login; drop the deleted --manual-paste flag from documented commands.	2026-07-02 13:17:41 -07:00
LeonSGP43	472d75193f	Prevent deleted profile skeleton revival	2026-07-02 15:11:56 -05:00
teknium1	a2d49de801	fix(terminal): also set MSYS2_ARG_CONV_EXCL for MSYS2/Cygwin bash fallback MSYS_NO_PATHCONV is honored by Git for Windows bash only. _find_bash's final shutil.which fallback can return MSYS2-proper or Cygwin bash, which ignore it and honor MSYS2_ARG_CONV_EXCL instead. Set both so argv path conversion stays disabled regardless of which bash flavor spawns. Also subsumes the cmd /c mangling in #56147.	2026-07-02 11:48:03 -07:00
xxxigm	51c01062d4	test(terminal): cover MSYS_NO_PATHCONV defaults on Windows env builders	2026-07-02 11:48:03 -07:00
David Zhang	30e947e0a0	feat(gateway): persist per-session /model overrides across gateway restarts Per-session /model overrides (_session_model_overrides) were in-memory only, so a gateway restart silently reverted every session to the global default model. Persist the non-secret parts (model/provider/base_url ONLY — never api_key) into the session entry in sessions.json and lazily rehydrate them on first use after a restart, re-resolving credentials through the normal runtime provider resolution. - gateway/session.py: SessionEntry.model_override field with sanitize_model_override() (allowlist: model/provider/base_url) applied on both serialization and deserialization; SessionStore.set_model_override / get_model_override accessors. reset_session() already creates a fresh entry, so /new keeps its clear-on-reset semantics — a restart cannot resurrect an override the user reset away. - gateway/slash_commands.py: write-through at both /model set sites (text command + picker) after storing the in-memory override. - gateway/run.py: _rehydrate_session_model_override() called from _resolve_session_agent_runtime(); in-memory state always wins, credentials are re-resolved per provider (credential-less fallback on failure). Session expiry finalization also drops the persisted override. - tests/gateway/test_session_model_override_persistence.py: restart round-trip, /new clearing, api_key-never-serialized (including tampered sessions.json), rehydration + live-state precedence + credential-failure degradation. Salvaged from #3659 by @Git-on-my-level, narrowed to the restart-persistence gap confirmed in triage.	2026-07-02 05:51:12 -07:00
Jneeee	b98baa3039	feat(config): extra HTTP headers for LLM API calls (#3526 salvage) Named providers / custom_providers entries in config.yaml now accept an extra_headers dict scoped to that endpoint — for reverse proxies, API gateways, and custom auth schemes (e.g. Cloudflare Access service tokens). - hermes_cli/config.py: normalize extra_headers on provider entries (_normalize_custom_provider_entry + providers-dict translation), add get_custom_provider_extra_headers / apply_custom_provider_extra_headers_to_client_kwargs helpers keyed on base_url (case/trailing-slash insensitive, no substring bypass — mirrors the TLS helpers) - hermes_cli/runtime_provider.py: surface extra_headers in the resolved runtime for named custom providers (providers dict, legacy custom_providers list, and the credential-pool path) - run_agent.py / agent/agent_init.py: merge per-provider extra_headers onto the OpenAI client default_headers at construction and on every _apply_client_headers_for_base_url re-application (credential swaps, rebuilds), most-specific level wins; OpenAI-wire only (native Anthropic/Bedrock scoped out) - agent/auxiliary_client.py: accept model.extra_headers as an alias of model.default_headers for the global variant - cli-config.yaml.example: documented commented example - Header values are treated as secrets and never logged Salvaged from PR #3526 by @jneeee, reimplemented against current main. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-07-02 05:33:25 -07:00
Mibayy	4a09b692ec	feat(api-server): per-client model routing via model_routes (#3176 salvage) Adds a no-code routing layer to the OpenAI-compatible API server so one Hermes deployment can map different API clients to different model/provider backends. Clients pick a backend by sending a configured alias as the OpenAI 'model' field; unmatched values fall back to the global model. Configured aliases are listed by GET /v1/models. Precedence (highest first): session /model override > model_routes route > global config. Route provider credentials resolve through _resolve_runtime_agent_kwargs_for_provider (same seam as channel_overrides); per-route api_key/base_url are upstream provider credential overrides — never caller auth, never logged. Salvaged and rebased from PR #3176 by @Mibayy onto current main.	2026-07-02 05:23:28 -07:00
Mibayy	ce9aa869fc	feat(commands): /compact alias + --preview/--dry-run flags for /compress (#3243 salvage) Salvaged from PR #3243 by @Mibayy, reimplemented against current main (the original diff targeted a removed gateway/run.py handler). - /compact is now a first-class alias of /compress (CLI, gateway, Telegram/Slack/Discord command lists, autocomplete) — also fixes the dangling '/compact' references in gateway error messages (gateway/run.py context-exhausted banners). - --preview / --dry-run: report what WOULD be compressed (message counts, token estimate, 'here [N]' boundary) without touching the transcript. Flags coexist with the existing 'here [N]' / focus-topic args on both the CLI and gateway surfaces via shared pure helpers in hermes_cli/partial_compress.py. - --aggressive (LLM-free hard truncation) is intentionally NOT implemented: it would need its own transcript-persistence branch outside the guarded _compress_context rotation machinery (#44794 data-loss class). The flag is recognized and returns an explanatory message pointing at '/compress here [N]' and /undo instead of being mis-parsed as a focus topic. - locales: gateway.compress.aggressive_unsupported added to all 16 catalogs (parity test enforced). - release.py: AUTHOR_MAP entry for contributor credit.	2026-07-02 05:10:31 -07:00
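
Commit b9a197ec59 in the log above describes a 30s in-process TTL cache keyed by (model, base_url) around _query_local_context_length. The sketch below illustrates that idea only. The names cached_context_length and clear_cache, the injected probe callable, and the module-level dict are hypothetical and are not taken from the repository. It also assumes every probe result is cached, including None, which the commit message does not specify.

```python
import time
from typing import Callable, Dict, Optional, Tuple

TTL_SECONDS = 30.0  # the 30s figure comes from the commit message

# (model, base_url) -> (time of probe, probed context length)
_cache: Dict[Tuple[str, str], Tuple[float, Optional[int]]] = {}


def cached_context_length(
    model: str,
    base_url: str,
    probe: Callable[[str, str], Optional[int]],
    now: Callable[[], float] = time.monotonic,
) -> Optional[int]:
    """Return the probed context length, reusing a result younger than the TTL."""
    key = (model, base_url)
    current = now()
    hit = _cache.get(key)
    if hit is not None and current - hit[0] < TTL_SECONDS:
        return hit[1]  # fresh enough: skip the network round-trip
    value = probe(model, base_url)  # hypothetical stand-in for the live probe
    _cache[key] = (current, value)
    return value


def clear_cache() -> None:
    """Drop all entries, e.g. from a test fixture between tests."""
    _cache.clear()
```

Because the dict lives only in process memory, a restart starts with an empty cache and the next resolution probes again, which is consistent with the "re-probe after restart" contract the commit mentions.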
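
Commit 30e947e0a0 in the log above persists per-session /model overrides through an allowlist (model/provider/base_url only, never api_key) applied on both serialization and deserialization. A minimal sketch of such an allowlist filter follows. The function name sanitize_model_override appears in the commit message, but the body, the ALLOWED_KEYS constant, and the choice to drop empty or non-string values are assumptions made for illustration.

```python
from typing import Any, Dict, Optional

# Keys named in the commit message as safe to persist.
ALLOWED_KEYS = ("model", "provider", "base_url")


def sanitize_model_override(raw: Any) -> Optional[Dict[str, str]]:
    """Keep only allowlisted, non-empty string fields; return None if nothing remains."""
    if not isinstance(raw, dict):
        return None
    clean = {
        key: raw[key]
        for key in ALLOWED_KEYS
        if isinstance(raw.get(key), str) and raw[key]
    }
    return clean or None


# Applying the filter on load as well as on save means a hand-edited
# sessions.json cannot smuggle a secret back into memory.
tampered = {"model": "some-model", "provider": "custom", "api_key": "sk-secret"}
print(sanitize_model_override(tampered))
```

An allowlist is used here rather than deleting known-bad keys, so any field added later stays out of the persisted entry unless it is explicitly permitted.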

7034 commits