hermes-agent

Author	SHA1	Message	Date
teknium1	51feecc2b1	fix(security): block shell-collapse rm -rf / spellings at the hardline floor rm -rf //, /., /./, /.. and //* all resolve to / in the shell but slipped past the root-filesystem hardline pattern, whose target group only matched the literal / and /* tokens. They fell to the softer DANGEROUS_PATTERNS 'delete in root path' rule, which --yolo / approvals.mode=off / cron approve-mode are designed to bypass — leaving the one unconditional floor open to a full root wipe under yolo. Broaden the root token from '/\|/\\\|/ \\' to '/[/.]\\*' inside _hardline_rm_path so any root-anchored path whose components collapse back to / (repeated slashes plus ./.. segments) with an optional trailing glob is caught. A trailing real segment (/tmp, /home, /.ssh) still fails to match and stays with the softer rules. Co-authored-by: kernel-t1 <214165399+kernel-t1@users.noreply.github.com>	2026-07-01 02:46:38 -07:00
teknium1	d15a288812	chore(release): map arthurzhang author for PR #34718 salvage	2026-07-01 02:45:22 -07:00
Ruzzgar	e13b6ce1c6	test(redact): cover Slack App-Level (xapp-) token redaction	2026-07-01 02:45:22 -07:00
ArthurZhang	fdb9620ac4	security(agent): redact Slack App-Level (xapp-) tokens The xapp-<num>-<hash> format used by Slack App-Level / Socket Mode tokens was missing from both agent/redact.py prefix patterns and gateway/run.py gateway secret patterns, so SLACK_APP_TOKEN values could leak through to chat users even with security.redact_secrets enabled. Adds an anchored xapp-\d+- pattern to both redaction paths.	2026-07-01 02:45:22 -07:00
skyzh	cc7d20d683	feat(raft): add gateway setup wizard Add an interactive Raft setup flow for hermes gateway setup. The wizard follows the existing platform adapter setup pattern, persists RAFT_PROFILE to the Hermes env file, preserves an existing profile when the user declines reconfiguration, and registers the flow via setup_fn. Add focused Raft adapter coverage for saving RAFT_PROFILE, keeping an existing profile, and registering setup_fn. Signed-off-by: skyzh <skyzh@mail.build> Signed-off-by: HaoHao <HaoHao@mail.build>	2026-07-01 02:45:11 -07:00
Teknium	da6d5fcd13	fix(auth): serialize Codex OAuth pool refresh under the auth-store lock (#56233 ) The credential-pool Codex refresh path synced tokens from auth.json and then POSTed the refresh_token to OpenAI's token endpoint without holding the cross-process auth-store lock across the whole read->POST->write-back sequence. Because Codex refresh tokens are single-use, two concurrent Hermes processes could both adopt the same on-disk token and both POST it; the loser got refresh_token_reused / invalid_grant. Wrap the Codex OAuth branch of _refresh_entry in the existing shared _auth_store_lock (reentrant, cross-process flock) using the same extended-timeout pattern resolve_codex_runtime_credentials() already uses. A waiting process now blocks on the lock and, once inside, the in-lock re-sync picks up the rotated token the winner persisted and skips its own POST. Also send User-Agent: hermes-cli/<version> on the refresh request. Credit @cooper-oai (#34820) for identifying the concurrent-refresh reuse race; this ships the narrow lock-serialization fix without the separate Codex auth-store partition.	2026-07-01 02:45:07 -07:00
heathley	a8a97c358f	fix(matrix): block unsafe image redirects per-hop Matrix outbound image downloads validated only the final URL after following redirects, so a public URL that 302-redirects to loopback / private-network / cloud-metadata endpoints had already connected to the unsafe hop before the check ran. Re-validate every redirect hop before following it: - aiohttp path resolves redirects manually with allow_redirects=False, validating each Location via is_safe_url (aiohttp can't use the httpx response event hook). - httpx fallback installs the shared _ssrf_redirect_guard event hook. Regression tests cover per-hop blocking of an unsafe redirect, following a safe redirect chain, and httpx guard wiring.	2026-07-01 02:44:57 -07:00
Teknium	868fa9566a	fix(security): block /proc//auxv and /proc//pagemap read leaks auxv leaks AT_RANDOM (stack canary seed) + AT_BASE/AT_PHDR load addresses — an ASLR oracle on par with maps. pagemap exposes virtual->physical translation. Both slipped through the endswith tuple alongside the maps family covered by the salvaged commit. Adds regression coverage for auxv/pagemap and for the per-thread /proc/<pid>/task/<tid>/<file> alias form (endswith catches both). Follow-up on #32238, closes #34430.	2026-07-01 02:44:53 -07:00
AhmetArif0	64e6b98ba8	fix(security): extend /proc read block to smaps, smaps_rollup, numa_maps, mem PR #4609 blocked /proc//maps to prevent ASLR layout leakage, but the endswith("/maps") check does not match /proc//smaps or /proc//smaps_rollup — both expose the same virtual-address layout and bypass the guard. /proc//numa_maps carries the same data with NUMA annotations and is equally bypassed. /proc/*/mem (raw process memory) is added as defence-in-depth; it requires address knowledge to exploit but is blocked for consistency. Extends the endswith tuple in _is_blocked_device_path() to cover all four variants and adds regression assertions for all new paths to test_proc_sensitive_pseudo_files_blocked. Partially addresses #4427.	2026-07-01 02:44:53 -07:00
Teknium	275e293f54	fix(matrix): decline dead/abandoned invites instead of retrying forever (#56222 ) An invite to a room with no remaining members surfaces as "no servers in the room have been provided" or "room not found" on join. The pending invite was never cleared, so every gateway startup re-attempted the join and re-emitted the warning indefinitely. Detect that specific failure mode by narrow error-message match and call leave_room to decline the invite; transient/network errors leave the invite untouched for the next sync. Adds 5 tests. Reimplements the matrix portion of #33953 onto the current plugin adapter (gateway/platforms/matrix.py was relocated to plugins/platforms/matrix/adapter.py since the PR was opened). The two gateway/status.py fixes from that PR (wrapper-subcommand rejection, psutil start-time fallback) already landed on main independently. Reported by @Bougey; original patch authored by @KiraKatana.	2026-07-01 02:44:18 -07:00
sprmn24	88d6e833f1	fix(agent): wrap list-type untrusted content in untrusted_tool_result _maybe_wrap_untrusted() only wrapped str-typed tool outputs. When a high-risk tool (web_extract, browser_*) returns a multimodal content list ([{type:text},{type:image_url}]) — which _tool_result_content_for _active_model() produces by unwrapping the _multimodal envelope for vision-capable providers — the text part reached the model completely unguarded. An attacker page that ships one image bypassed the entire untrusted-data wrapper. Extend the wrapper to handle list content: each {type:text} part is run through the same string-wrapping path (min-char threshold, delimiter neutralization, one well-formed block), image/video parts pass through untouched so the list stays valid for vision adapters. Recursing into the existing string branch means the list path inherits the delimiter defang and the no-forgeable-fast-path hardening from #56172 for free. The outer list is rebuilt (not returned by identity), so callers compare by value.	2026-07-01 02:44:09 -07:00
rrevenanttt	0c0b4b6989	fix(security): collapse $IFS whitespace obfuscation before approval checks ## What does this PR do? Closes a critical bypass of the dangerous-command approval system. The normalizer that every command passes through before pattern matching (`_normalize_command_for_detection`) already strips ANSI, null bytes, fullwidth Unicode, backslash escapes and empty-quote token splits — but it did nothing about the shell `IFS` variable. In any POSIX shell `$IFS` and `${IFS}` expand to whitespace, so a command written as `rm${IFS}-rf${IFS}/` is executed by the live shell as `rm -rf /` while the detection regexes — which anchor on literal `\s` between a command and its arguments — never fire. The impact is severe: this evades BOTH layers at once. It slips past every entry in `DANGEROUS_PATTERNS` (so `curl${IFS}...\|sh`, `sed${IFS}-i` against `~/.hermes/config.yaml`, sudo privilege flags, etc. auto-run with no approval prompt) AND the unconditional hardline floor that is documented as un-bypassable "not even with --yolo" (`rm -rf /`, `mkfs`, `dd` to a raw block device, `shutdown`/`reboot`, fork bomb). A prompt-injected or malicious instruction could wipe the host filesystem or power the box off while the approval system reports nothing. Confirmed at runtime before the fix: `detect_hardline_command('rm${IFS}-rf /')` returned `(False, None)`. The fix mirrors the shell's own expansion: it collapses `$IFS` / `${IFS}` (including the bash substring form `${IFS:0:1}`) to a single space inside the existing de-obfuscation block, so the whitespace-anchored patterns match exactly as they do for the un-obfuscated command. It is deliberately narrow and safe — a `\b` word boundary keeps it from touching unrelated variables like `$IFSACONFIG`, so it cannot introduce false positives on legitimate commands. ## Related Issue N/A ## Type of Change - [x] 🔒 Security fix ## Changes Made - `tools/approval.py`: in `_normalize_command_for_detection`, substitute `$IFS` / `${IFS}` (and `${IFS:...}`) expansions with a literal space before dangerous/hardline pattern matching, alongside the existing backslash and empty-quote de-obfuscation. - `tests/tools/test_approval.py`: add `TestIFSWhitespaceBypass` covering the brace, bare and substring IFS forms against both `detect_hardline_command` and `detect_dangerous_command`, plus regression guards that a look-alike variable (`$IFSACONFIG`) and plain safe commands are not flagged. Import `detect_hardline_command`. ## How to Test 1. Reproduce the hole (pre-fix): `detect_hardline_command('rm${IFS}-rf /')` returns `(False, None)` and `detect_dangerous_command(...)` returns `(False, ...)`, i.e. a host-destroying command is auto-approved. 2. With the fix applied, both now flag the command: hardline match "recursive delete of root filesystem" and dangerous match "delete in root path". 3. Run the suite: `pytest tests/tools/test_approval.py tests/tools/test_hardline_blocklist.py -q` — the new `TestIFSWhitespaceBypass` cases pass and nothing else regresses. ## Checklist ### Code - [x] I've read the Contributing Guide - [x] My commit messages follow Conventional Commits (`fix(scope):`, etc.) - [x] I searched for existing PRs to make sure this isn't a duplicate - [x] My PR contains only changes related to this fix (no unrelated commits) - [x] I've run the relevant tests and they pass (two pre-existing failures are environmental: missing optional deps in the minimal venv, not caused by this change) - [x] I've added tests for my changes - [x] I've tested on my platform: macOS 15 (Darwin 25.5) ### Documentation & Housekeeping - [x] I've updated relevant documentation (README, `docs/`, docstrings) — or N/A - [x] I've updated `cli-config.yaml.example` if I added/changed config keys — or N/A - [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows — or N/A - [x] I've considered cross-platform impact (Windows, macOS) — the change is a pure string transform with no platform-specific behavior; footgun gate passes - [x] I've updated tool descriptions/schemas if I changed tool behavior — or N/A	2026-07-01 02:44:04 -07:00
mrparker0980	10a54ccc2c	fix(security): anchor @file context refs to canonical read deny-list `@file` / `@folder` context-reference expansion enforced its own narrow deny-list (`_ensure_reference_path_allowed` in `agent/context_references.py`) that only covered `~/.ssh` keys, a handful of shell dotfiles, `~/.hermes/.env`, and `skills/.hub`. It never blocked the credential stores that the canonical read guard (`agent/file_safety.get_read_block_error`) protects: provider API keys (`~/.hermes/auth.json`), Anthropic OAuth tokens (`~/.hermes/.anthropic_oauth.json`), MCP OAuth material (`~/.hermes/mcp-tokens/`), webhook HMAC secrets, and project-local `.env` files. This matters because the messaging gateway feeds untrusted remote text straight into reference expansion: `gateway/run.py` calls `preprocess_context_references_async(..., allowed_root=_msg_cwd)` where `_msg_cwd` defaults to the operator's HOME when `TERMINAL_CWD` is unset. A chat peer (Telegram/Discord/Slack/...) could send `@file:~/.hermes/auth.json`, pass the `allowed_root` check (it resolves under HOME), slip past the narrow list, and have the operator's live keys read into the agent's context — where the model would typically echo or act on them. Rather than duplicate and re-sync a second secret list, this routes the guard through the existing single source of truth. A reviewer might ask "why not just add `auth.json` to the local list?" — because the local list has already drifted once (a prior commit had to add `.config/gh`); anchoring to `get_read_block_error` means every future addition there protects this path too. The narrow checks are kept as a fallback since they also cover dirs that guard does not (`.aws`, `.gnupg`, `.kube`, etc.), and the canonical lookup is wrapped so it can never crash reference expansion. N/A - [x] 🔒 Security fix - `agent/context_references.py`: `_ensure_reference_path_allowed` now also consults `agent.file_safety.get_read_block_error` after its existing checks and refuses the reference when that canonical guard flags the resolved path. The lookup is wrapped so guard-resolution failures fall back to the explicit checks instead of breaking expansion. - `tests/agent/test_context_references.py`: added `test_blocks_canonical_read_denylist_credential_stores`, asserting that `@file` attaches for `auth.json`, `.anthropic_oauth.json`, `mcp-tokens/`, and a project-local `.env` are all refused and their secret bodies never reach the expanded message. - `scripts/release.py`: added the contributor email to `AUTHOR_MAP` (release gate). 1. `scripts/run_tests.sh tests/agent/test_context_references.py` — all 15 tests pass, including the new credential-store case. 2. Regression proof: stash `agent/context_references.py`, run the suite with `-- -k canonical`, and confirm the new test fails (secrets leak into the message) without the fix; restore and confirm it passes. 3. `ruff check agent/context_references.py tests/agent/test_context_references.py` and `python scripts/check-windows-footguns.py agent/context_references.py tests/agent/test_context_references.py` both pass. - [x] I've read the Contributing Guide - [x] My commit messages follow Conventional Commits (`fix(scope):`, etc.) - [x] I searched for existing PRs to make sure this isn't a duplicate - [x] My PR contains only* changes related to this fix (plus the AUTHOR_MAP release gate) - [x] I've run the test suite for the touched area and all tests pass - [x] I've added tests for my changes (required for bug fixes) - [x] I've tested on my platform: macOS 15 (Darwin 25.5) - [x] I've updated relevant documentation (README, `docs/`, docstrings) — or N/A - [x] I've updated `cli-config.yaml.example` if I added/changed config keys — or N/A - [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows — or N/A - [x] I've considered cross-platform impact (Windows, macOS) — or N/A - [x] I've updated tool descriptions/schemas if I changed tool behavior — or N/A	2026-07-01 02:43:49 -07:00
kshitijk4poor	53b017f03e	refactor(gateway): share error-text blob between not_found classifiers Follow-up to the #55780 dead-target not_found blast-radius fix (merged in #56225). classify_send_error and is_chat_level_not_found each built their own lowercased error blob, but divergently: classify_send_error appended the exception CLASS NAME while is_chat_level_not_found did not. A caller passing exc= to both could get inconsistent answers on the same failure. - Extract _error_blob(exc, error_text) as the single source of truth both classifiers use (str(exc) when non-empty + class name; no stray leading space). - Align is_chat_level_not_found's signature to (exc, error_text), matching classify_send_error, removing the swapped-positional footgun; update the sole caller and the three tests to keyword form. - Add a regression guard asserting _error_blob keeps the class name. Surfaced by the hermes-pr-review Phase 2c structured review of #56225.	2026-07-01 15:11:38 +05:30
Teknium	01e681aa48	docs: unify /new and /reset rows in gateway slash-commands table (#56235 ) The messaging gateway table still listed /new ("Start a new conversation") and /reset ("Reset conversation history") as two separate commands with divergent descriptions. /reset is an alias of /new (see COMMAND_REGISTRY in hermes_cli/commands.py) — same handler, fresh session ID + history. Collapse them into one row matching the registry wording and the CLI table already on line 39. Closes #42829.	2026-07-01 02:39:39 -07:00
kshitijk4poor	8f1d22d7ed	chore(release): map r266-tech contributor noreply email for #55780 salvage	2026-07-01 15:01:33 +05:30
r266-tech	46f45104c4	fix(gateway): don't mark an entire chat dead on thread/message-level not_found #55115 added the dead-target registry so confirmed-dead delivery targets are short-circuited. Its documented scope (gateway/dead_targets.py) is deliberately narrow: only whole-chat deaths -- the `forbidden` and chat-level `not_found` (`chat not found`) kinds -- should be recorded; "Thread/topic-level not_found is NOT recorded here ... a deleted topic does not mean the parent chat is dead." But the implementation doesn't honor that scope. classify_send_error collapses chat-level "chat not found" AND thread/message-level not_found ("thread not found", "topic_deleted", "message_id_invalid", "message to edit/reply not found") into one "not_found" kind, _DEAD_ERROR_KINDS contains "not_found" wholesale, and deliver()'s except marks the PARENT chat_id dead. So a single deleted Telegram topic or edited-away message permanently marks the entire chat (and every future scheduled / cron / agent delivery to it) dead -- silently. The adapter self-heal the docstring relies on only covers the non-private-group thread retry; named-DM-topic and message-level failures propagate to deliver()'s except and wrongly kill the whole chat. Add is_chat_level_not_found() (factoring the not_found substrings into chat-level vs sub-chat-level constants) and gate the delivery dead-path: a "not_found" only marks the target dead when it is chat-level. classify_send_error's public contract is unchanged (still returns "not_found" for every shape); only the mark_dead decision is refined, restoring the registry's documented scope. Cross-platform: telegram/slack/discord delivery all flow through classify_send_error -> mark_dead. Adds regression tests through the real deliver() path plus helper/classifier units.	2026-07-01 15:01:33 +05:30
Teknium	4580c03e7d	test(gateway): align salvaged #54947-cluster tests with async cache helper The three salvaged PRs (#46647, #54583, #55013) were authored against a tree where _refresh_agent_cache_message_count was sync and _session_db was the raw SessionDB. On current main the helper is async and awaits the AsyncSessionDB facade, and _run_agent was split into _run_agent_inner. - Wrap test _session_db in AsyncSessionDB so the awaited get_session works - Make refresh-calling tests async + await the helper - Point the placement-guard test at _run_agent_inner (recursion lives there post-mixin-extraction) - Relocated production call sites now correctly await the async helper	2026-07-01 02:29:24 -07:00
teknium1	116a63d3a0	chore(release): map jcjc81 + Tranquil-Flow in AUTHOR_MAP for #54947 cluster salvage	2026-07-01 02:29:24 -07:00
Tranquil-Flow	e7562c394f	fix(gateway): skip cross-process guard on session_id switch under same session_key (#54947 ) The cross-process coherence guard (#45966) compares the session's on-disk message_count against the snapshot stored next to the cached agent, and rebuilds the agent on a mismatch. The guard is correct when the cache snapshot and the live count both refer to the same DB row. But the agent cache is keyed by session_key, which can group multiple conversation threads (different session_ids) under the same key — and the message_count values belong to DIFFERENT DB rows. When the user switches from session A to session B under the same session_key, the cache hit returns A's cached agent. The guard then compares A's snapshot count (A.message_count) against B's live count (B.message_count) — they are NEVER equal because they track different conversations — and invalidates the cache. Every session switch busts the prompt cache and forces a fresh agent build. The post-turn re-baseline (#46237) made it worse: it reads the live count from the CURRENT session_entry.session_id, so each switch overwrites the original snapshot with the new session's count, causing the very next switch BACK to the original session to fire the guard again. This is the bug from #54947 (P0, sweeper:risk-session-state, sweeper:risk-caching). Fix: * Record the snapshot's session_id alongside the message_count in the cache tuple: (agent, sig, mc, session_id) — a 4-tuple. The cache build at the AIAgent construction site stores the active session_id. * The cache-hit guard skips the cross-process count comparison when the active session_id differs from the snapshot's session_id — the comparison is meaningless across different DB rows, so the agent is REUSED without invalidation. The cross- process guard still fires when the session_id matches and the live count differs (genuine cross-process write on the SAME session). * _refresh_agent_cache_message_count checks the snapshot's session_id: when it differs from the current session_id, the snapshot is intentionally left untouched (overwriting it would corrupt the original conversation's baseline and cause the switch-back to fire the guard). The legacy 3-tuple shape (no session_id) is still re-baselined as before. * Backward-compat: - 2-tuple (agent, sig) — unchanged, opts out of the guard. - 3-tuple (agent, sig, mc) — unchanged behavior, standard cross-process check. - pending sentinel — unchanged, untouched by re-baseline. - new 4-tuple (agent, sig, mc, session_id) — full session_id- aware guard with skip on mismatch. Tests: * tests/gateway/test_session_id_cache_coherence.py — 7 tests covering L1-L5 from LAYERS.md: - L1 session_id switch must REUSE - L2 cache tuple records snapshot's session_id - L3 re-baseline skips when session_id differs - L4 same-session_id turns still re-baseline (#46237 holds) - L5 legacy 2-tuples and pending sentinels untouched - legacy 3-tuple (no session_id) still guarded (#45966 holds) - 3-tuple transitions to 3-tuple (not 4-tuple) on re-baseline No regressions in 70 existing tests in test_agent_cache.py or 137 related session tests. Co-authored with #52197 (deferred cleanup of evicted agents); both fixes compose cleanly.	2026-07-01 02:29:24 -07:00
Jason	aa4731598c	fix(gateway): re-baseline agent cache count after first-turn session_meta The cross-process cache-coherence guard (#45966) compares a session's on-disk message_count against a snapshot stored next to the cached agent, rebuilding the agent on a mismatch so a foreign writer (e.g. the dashboard backend) can't leave the in-memory transcript stale. On a fresh gateway conversation the post-turn re-baseline (_refresh_agent_cache_message_count) ran BEFORE the first-turn `session_meta` marker row was appended to the transcript. That append goes through append_to_transcript -> append_message, which increments message_count unconditionally. So the snapshot was left exactly one short of the live count, and on turn 2 of every fresh conversation the guard mistook this process's own session_meta write for a foreign write, evicting and rebuilding the cached agent — silently busting the per-conversation prompt cache the cache exists to protect. Move the re-baseline to after the turn's full transcript persistence block (including the session_meta append and the compression session_id swap). The snapshot now matches the live count, so the guard fires only on genuinely foreign writes. This also makes the call honor its own documented contract of using the compaction-updated session_id. Adds a regression test that drives the real _handle_message_with_agent against a real SessionDB and asserts the invariant: after a fresh first turn, snapshot == live message_count, so the next turn's guard reuses the cached agent. Fails before this change, passes after.	2026-07-01 02:29:24 -07:00
Evo	6bc0a7ce80	test(gateway): pin in-band follow-up re-baseline boundary + placement	2026-07-01 02:29:24 -07:00
Evo	b4cacba6ae	fix(gateway): re-baseline agent-cache message_count before in-band queued follow-up turn The cross-process cache-coherence guard (#45966) re-baselines the cached agent's message_count only on the external-turn boundary (#46237, at _handle_message_with_agent). The in-band queued (/queue) follow-up recurses into _run_agent mid-chain with the stale build-time snapshot, so the follow-up's guard sees the first turn's own writes as a mismatch and rebuilds the agent -- re-introducing the every-turn rebuild / prompt-cache destruction #46237 set out to prevent, on the in-band path. Re-baseline before the recursion, symmetric with the accepted external-path fix.	2026-07-01 02:29:24 -07:00
kshitijk4poor	22a137ed40	fix(agent): prefer late-completing real result over timeout message (review) Review follow-up on the concurrent-tool deadline salvage. timed_out_indices is snapshotted from not_done at the deadline; a worker can still finish and write results[i] in the window before the post-execution result loop reads it. The loop unconditionally replaced results[i] with a fabricated 'timed out' message for any snapshotted index, discarding a genuinely-successful (just-late) result. Gate the timeout message on 'and r is None' so a real result always wins. Add a regression test that forces the snapshot-vs-result-loop race deterministically (mutation-checked: reverting the guard fails it). Also document the intentional detached-worker leak at the executor abandon site.	2026-07-01 14:56:52 +05:30
Gustavo Mendes	c1784e9093	fix(agent): bound concurrent tool execution with a wall-clock deadline A tool with no internal interrupt check (read_file, web_search, or a wedged terminal backend) that never returns keeps the concurrent-tool poll loop alive forever: the loop only breaks when all futures finish or an interrupt is requested, and the 30s heartbeat resets the gateway idle monitor so idle-kill never fires. The ThreadPoolExecutor was also used as a context manager, so its __exit__ joined the hung worker with wait=True. Add a wall-clock batch deadline (HERMES_CONCURRENT_TOOL_TIMEOUT_S, default 420s — above the 360s web_extract timeout; 0/negative disables). When it fires: cancel pending futures, signal an interrupt to the worker threads, abandon the executor (shutdown wait=False, cancel_futures=True) so hung threads aren't joined, and return a per-tool 'timed out' result for the unfinished calls while still surfacing the finished ones. Also fixes the latent futures.index(f) lookup (ambiguous with duplicate futures) by tracking a future->index map. Salvaged from #54562. Co-authored-by: Gustavo Mendes <87918773+gustavosmendes@users.noreply.github.com>	2026-07-01 14:56:52 +05:30
Teknium	913e661a09	fix(cache): stop verification-loop synthetic nudges from persisting (#56194 ) verify_on_stop / pre_verify append a synthetic assistant "done" plus a synthetic user nudge to keep the agent going one more turn before it can claim completion. Both were flagged (_verification_stop_synthetic on the nudge only), but the flags were never registered in _EPHEMERAL_SCAFFOLDING_FLAGS, so the central _is_ephemeral_scaffolding() filter that guards both persistence sinks (SQLite flush + JSON snapshot) let them through. The resumed transcript then inherited loop-only scaffolding, invalidating the prompt-prefix cache on later turns. - add _verification_stop_synthetic and _pre_verify_synthetic to _EPHEMERAL_SCAFFOLDING_FLAGS (the single chokepoint both sinks use) - flag the blocked attempt assistant message too, not just the nudge, so the whole synthetic pair drops together and persistence does not keep a premature done with the nudge stripped (assistant to assistant adjacency) The API-payload leak claimed in the report is already handled: the chat_completions transport strips every underscore-prefixed message key before the wire, so the marker never reaches strict providers. Reported by patppham.	2026-07-01 02:26:06 -07:00
Teknium	522a5e93b2	chore(release): map x9x9x9x9x9x91 for #49247 salvage	2026-07-01 02:18:56 -07:00
Tim Roth	24cb80fd72	test(provider): pin api.anthropic.com host on fallback api_mode Pins that a custom provider on the native api.anthropic.com host resolves to anthropic_messages on the try_activate_fallback path. From #49247.	2026-07-01 02:18:56 -07:00
Teknium	18c61bb8cf	fix(provider): match api.anthropic.com host on fallback api_mode detection Widen the salvaged #32243 fix to the try_activate_fallback path: a custom provider pointed at the native api.anthropic.com host (no /anthropic path suffix, name != anthropic) fell through to chat_completions -> POST /v1/chat/completions -> 404. Match the host the same way determine_api_mode() and _detect_api_mode_for_url() now do. Absorbs #49247.	2026-07-01 02:18:56 -07:00
xxxigm	9efe01c3a0	test(runtime): pin Anthropic OAuth → /v1/messages routing across runtime branches End-to-end regression coverage for #32243 that asserts every runtime branch resolving an Anthropic endpoint returns `api_mode == "anthropic_messages"`: * `_resolve_explicit_runtime` — the path used when a Hermes subcommand passes an explicit `--api-key` / `--base-url`. Pins that a stale persisted `model.api_mode: chat_completions` from a prior provider migration cannot override the anthropic pin. * `_resolve_runtime_from_pool_entry` — the path triggered by `hermes auth add anthropic --type oauth` (the exact flow from the issue). Same stale-api_mode regression pinned here. * `_try_resolve_from_custom_pool` — the user-defined `providers:` / `custom_providers:` path that depends on the URL detector fix landed in the prior commit. Asserts both the detector fallback fires for `api.anthropic.com` and that an explicit `api_mode_override` still wins (so users who DELIBERATELY pointed a chat_completions transport at api.anthropic.com for OpenAI-compat experiments aren't hijacked). Co-locates the three contracts so a future refactor of one branch cannot silently diverge from the others and re-introduce the "out of extra usage" 400 on fresh OAuth Pro/Max credentials.	2026-07-01 02:18:56 -07:00
xxxigm	a2251b40eb	test(provider): pin api.anthropic.com → anthropic_messages URL detection Add a dedicated `TestDirectAnthropicHost` class to `test_detect_api_mode_for_url.py` covering the native Anthropic host shape (bare, trailing slash, /v1 suffix, uppercase host) plus the two negative-space regressions that matter for security: lookalike subdomains (`api.anthropic.com.attacker.test`) and path-segment spoofing (`https://proxy.example.test/api.anthropic.com/v1`) must NOT be classified as native — leaking an Anthropic OAuth token to either would be the worst case. Refs #32243.	2026-07-01 02:18:56 -07:00
xxxigm	a344c92050	fix(provider): route api.anthropic.com to anthropic_messages api_mode (#32243 ) `_detect_api_mode_for_url` previously returned `None` for the bare `api.anthropic.com` host, causing every URL-fallback path (custom_providers, direct-alias, the api-key fallback inside `resolve_runtime_provider`) to default to `chat_completions` for native Anthropic — which routes requests to the OpenAI-compat `/chat/completions` shim instead of the native `/v1/messages` endpoint. Pro/Max OAuth subscriptions are only billed against the native Messages API; the shim bills against a separate "extra usage" pool that is empty by default, so a freshly authorized Pro/Max credential 400s with "You're out of extra usage" the moment it's used — even on an account that has consumed nothing for the current cycle. Brings the helper in line with `hermes_cli.providers.determine_api_mode` which already mapped `api.anthropic.com` to `anthropic_messages`.	2026-07-01 02:18:56 -07:00
itenev	f981d47cb0	fix(gateway): prevent Discord disconnects from blocking event loop models_dev.py's fetch uses a synchronous requests.get(timeout=15). Called from the async gateway message handlers, it blocked the event loop for up to 15s, starving Discord heartbeats and causing ClientConnectionResetError disconnects. Adds get_model_context_length_async() which offloads the entire sync resolution chain to a worker thread via asyncio.to_thread(), and switches the two async gateway call sites (_prepare_inbound_message_text, _handle_message_with_agent) to await it. The loop stays responsive; the sync path remains the single source of truth for the cache. Salvaged from PR #22753 by @itenev. Follow-up: dropped the unused fetch_models_dev_async/lookup_models_dev_context_async aiohttp variants from the original PR (dead code with zero callers that had drifted from the sync cache logic) — the to_thread wrapper already runs the sync path off-loop, so they were redundant.	2026-07-01 02:17:35 -07:00
Teknium	d57a4c197c	fix(tools): stop _strategy_exact emitting overlapping matches (#56211 ) _strategy_exact advanced its scan cursor by pos+1 instead of pos+len(pattern), so self-overlapping patterns (e.g. "aa" in "aaaa") matched at overlapping offsets. _apply_replacements works in reverse order, so the second replacement operated on already-modified content using stale offsets — corrupting the file and reporting the wrong count under replace_all=True. Advancing by len(pattern) matches str.replace() semantics.	2026-07-01 02:13:13 -07:00
Teknium	ea533e7f41	chore(release): map justin-cyhuang contributor email for #31960 salvage	2026-07-01 02:12:25 -07:00
Justin Huang	74d2660aeb	fix(gateway): await async post-delivery callbacks in chained wrapper When two features register a post-delivery callback for the same session (e.g. background-review release + /goal continuation), the second registration is composed with the first via a `_chained` wrapper. That wrapper was `def _chained()` — a sync function calling each callback via `_prev()` / `_new()` and discarding the return value. For sync callbacks that's fine. For async callbacks (such as the `_deliver()` coroutine the /goal feature registers to inject the continuation prompt) the returned coroutine was silently dropped: RuntimeWarning: coroutine '_deliver' was never awaited. Outer invoker in `_handle_message` already checks `inspect.isawaitable(_post_result)` and awaits — but only sees the wrapper's return value, which was `None`. Fix: make `_chained` async, iterate over chained callbacks, await any that return an awaitable. Outer invoker already handles awaitable wrappers, so no other change is needed. Tested: * Added two regression tests in test_post_delivery_callback_chaining.py covering an async callback chained behind sync (and vice versa). * Updated existing chaining tests + test_run_cleanup_progress.py to await the popped callback when it's awaitable. * 62 tests pass across the touched suites. Live-validated on Discord: /goal continuations now arrive after the first turn's response is delivered (previously silent). Refs: NousResearch/hermes-agent#31922	2026-07-01 02:12:25 -07:00
xxxigm	8b14080e30	test(tui): pin bundle shape to prevent #31227 from regressing Vitest regression that builds `dist/entry.js` and checks two structural invariants required for startup to not hang: 1. Zero `async "<path>"() { … }` keys inside any `__esm` definition. esbuild only emits the `async` form when a module body contains top-level await; the `__esm` helper at the top of the bundle does not await nested inits, so any async wrapper participating in a circular module graph would deadlock the boot `await Promise.all([…])` in `src/entry.tsx`. 2. No `node_modules/ink/build/index.js` or `node_modules/ink-text-input/build/index.js` modules. Their absence is what makes invariant 1 hold today; if a future commit re-introduces the `ink-text-input` re-export, this test catches it before the bundle ships. The test rebuilds the bundle on demand when the source is newer than `dist/entry.js`, runs in <100ms with no TTY needed, and is hermetic on a clean checkout.	2026-07-01 02:10:32 -07:00
xxxigm	53d2c4191f	docs(tui): clarify why @hermes/ink is aliased to source in build.mjs Update the comment on the `alias` entry to mention the second reason the source-inline is needed: keeping the upstream `ink` / `ink-text-input` graph out of the bundle (which fixed the startup deadlock in #31227). Code path is unchanged.	2026-07-01 02:10:32 -07:00
xxxigm	18297899d7	fix(tui): drop ink-text-input re-export from @hermes/ink entry-exports (#31227 ) The dashboard TUI bundle hung at startup with only 141 bytes of ANSI reset sequences and a blank screen forever. Root cause: esbuild's lightweight `__esm` helper at the top of `dist/entry.js` does not await nested async init, so a circular async cycle in the module graph never resolves. The cycle came from re-exporting ``TextInput`/`UncontrolledTextInput`` from `'ink-text-input'` here — that npm package depends on the upstream `ink` package, whose graph loops back through React + our in-tree `@hermes/ink` ink fork. The result: `init_entry_exports` was emitted as `async … await init_build4()` (where `build4` is `node_modules/ink-text-input/build`), and the top-level `await Promise.all([init_entry_exports().then(...)])` in `src/entry.tsx` deadlocked waiting on the dangling Promise. Nobody in `ui-tui/` actually imports `TextInput` from `@hermes/ink` — the composer uses the in-tree `src/components/textInput.tsx` widget instead. Drop the re-export from the source so the bundle no longer inlines the upstream ink graph at all. Callers that legitimately want the upstream widget can still import it from the dedicated `@hermes/ink/text-input` subpath, which sits outside `entry-exports` and so does not get inlined into consumers' bundles. After the fix: * `dist/entry.js` shrinks from 2.9MB → 2.4MB (~11.5k fewer bundled lines) with zero `async __esm` wrappers remaining. * `init_entry_exports` is now a synchronous `__esm` module. * The bundle's top-level await chain resolves in ~30ms instead of hanging.	2026-07-01 02:10:32 -07:00
kshitijk4poor	a658f3b28b	fix(security): strip dynamic Hermes secrets from all subprocess spawn env Subprocesses spawned by the terminal tool, execute_code, Docker backend, and the codex app-server could inherit Hermes-internal secrets that the name-based `_HERMES_PROVIDER_ENV_BLOCKLIST` can't enumerate, because they're injected into `os.environ` at runtime under dynamic names: - `AUXILIARY_<TASK>_API_KEY` / `AUXILIARY_<TASK>_BASE_URL` — per-task side-LLM credentials bridged from `config.yaml[auxiliary]` by gateway/run.py and cli.py (vision, web_extract, approval, compression, plugin-registered tasks). Often separate, higher-spend keys plus base URLs pointing at private endpoints. - `GATEWAY_RELAY__SECRET` / `_KEY` / `_TOKEN` — relay-auth material provisioned by gateway/relay. Additionally, agent/transports/codex_app_server.py built its spawn env from a raw `os.environ.copy()`, bypassing the centralized `hermes_subprocess_env()` helper entirely — handing every codex subprocess the full Tier-1 secret set (GH_TOKEN, gateway bot tokens, Modal/Daytona infra tokens, dashboard session token) unfiltered. This is the #29157 sibling spawn-site gap; copilot_acp_client already routes through the helper. Fix — single chokepoint: - Add `_is_hermes_internal_secret(key)` in tools/environments/local.py as the single source of truth for the dynamic secret patterns. Matches AUXILIARY__API_KEY / _BASE_URL and GATEWAY_RELAY__SECRET/_KEY/_TOKEN; leaves non-secret AUXILIARY__PROVIDER/_MODEL and GATEWAY_RELAY routing hints visible. - Wire the predicate into every spawn path unconditionally (ignores skill env_passthrough opt-in AND inherit_credentials — a model-driving CLI never needs these): `_sanitize_subprocess_env` (both loops), `_make_run_env` (foreground), `hermes_subprocess_env` (Tier-1), and the Docker forward filter. - Add the static GATEWAY_RELAY_* names to `_HERMES_PROVIDER_ENV_BLOCKLIST` so the exact-match path catches them independently of the predicate. - Add the GATEWAY_RELAY_ID/_SECRET/_DELIVERY_KEY triplet to `_ALWAYS_STRIP_KEYS` (Tier-1) so it is stripped unconditionally on EVERY spawn surface — including the codex/copilot `inherit_credentials=True` path that skips the Tier-2 blocklist. `_SECRET`/`_DELIVERY_KEY` are already predicate-matched; `_ID` has no secret suffix, so enumerating it here is what closes its leak on the inherit path (self-review W1). - Defense in depth: env_passthrough.py `_is_hermes_provider_credential()` now consults the same predicate, so a skill can't register these names as passthrough and tunnel them into an execute_code / terminal child. - Route codex_app_server through `hermes_subprocess_env(inherit_credentials=True)` — strips Tier-1 + dynamic-internal secrets while provider creds (which codex needs to authenticate) still flow. Consolidates PRs #53715 (necoweb3 — the _is_hermes_internal_secret backbone + Docker filter), #53503 (srojk34 — env_passthrough guard), and #55709 (srojk34 — codex routing). Retires #52348 (claudlos): its copilot half is already on main, and its codex half used the full-strip `_sanitize_subprocess_env` which would break codex provider auth — the correct tier is `inherit_credentials=True`. Tests: TestHermesInternalDynamicSecrets (terminal + predicate + passthrough override), TestInternalDynamicSecrets (hermes_subprocess_env both tiers), TestSpawnEnvSecretStripping (codex spawn env), plus env_passthrough defense-in-depth cases. Co-authored-by: necoweb3 <sswdarius@gmail.com> Co-authored-by: srojk34 <286497132+srojk34@users.noreply.github.com> Co-authored-by: claudlos <claudlos@agentmail.to>	2026-07-01 14:37:22 +05:30
Omar Baradei	053424c486	fix(agent): preserve final_response on failure returns AIAgent.run_conversation() promises a dict with final_response, but 16 terminal-failure branches returned dicts that either omitted the key or set it to None. Callers that index result['final_response'] directly (run_agent.py chat() + the __main__ printer) turn a real provider/context failure into an opaque KeyError instead of surfacing the actionable error. Every offending branch already carried usable 'error' text, so this mirrors that text into final_response for all 16 sites (8 that omitted the key, 8 that returned None). Adds an AST regression test that fails if any run_conversation() dict return omits final_response or sets it to a literal None, and tightens the invalid-response test to assert final_response == error.	2026-07-01 02:04:28 -07:00
teknium1	43edbae638	fix(telegram): widen NoneType reconnect guard to the conflict-retry path The network-error reconnect ladder (#55992) captured a stable self._app local across its awaits and failed fast when the adapter was torn down mid-sleep. The 409-conflict retry path had the identical unguarded self._app.updater.start_polling() deref — a concurrent disconnect() during its RETRY_DELAY sleep would raise the same 'NoneType' object has no attribute 'updater' and, on a non-final retry, land in limbo. Apply the same stable-local + fail-fast pattern so the existing except block reschedules or escalates to fatal.	2026-07-01 02:03:58 -07:00
joaomarcos	fb8efbb4a8	fix(gateway): ignore stale fatal-error notifications from superseded adapters A delayed fatal-error notification from an adapter instance that has already been replaced by a successful reconnect (a different adapter object now owns the platform slot) was still processed: it overwrote the platform's runtime status back to retrying/fatal and could re-queue an already-healthy platform for reconnection. Snapshot the current owner of the platform slot at the top of _handle_adapter_fatal_error and bail out before any side effect when it belongs to a different, already-installed adapter.	2026-07-01 02:03:58 -07:00
joaomarcos	a682091044	fix(telegram): close reconnect races that leave adapter half-destroyed _handle_polling_network_error's chained retry never updated self._polling_error_task, so the reentrancy guard shared with the heartbeat loop and the pending-updates probe went stale mid-recovery, letting more than one recovery attempt run concurrently against the same adapter. Combined with a TOCTOU window in _handle_adapter_fatal_error (the adapter was only removed from self.adapters in a finally block after awaiting disconnect()), two concurrent fatal notifications for the same adapter could both pass the "still installed" check and call disconnect() twice, which is where the reported "'NoneType' object has no attribute 'updater'" originates once self._app is cleared by the first call. - Reassign the chained retry task to self._polling_error_task so the guard reflects an in-flight recovery. - Capture self._app in a local variable across the stop/start_polling sequence instead of re-reading self._app between awaits. - Claim (pop) the adapter from self.adapters before awaiting disconnect() in _handle_adapter_fatal_error, not after, closing the TOCTOU window for a concurrent notification on the same adapter.	2026-07-01 02:03:58 -07:00
Teknium	259e6b87a7	fix(teams-pipeline): reject dot-only recording display_name Path(raw).name reduces '..'/'.'/'' to themselves, so basename extraction alone still let a Graph-provided display_name of '..' or '../' escape the temp recording directory (tmp_dir / '..' resolves to the parent). Reject the dot-only basenames explicitly and fall back to the artifact id. Extends @outsourc-e's regression coverage with the dot-only cases.	2026-07-01 02:03:48 -07:00
Eric	ac18a8658b	test(teams-pipeline): cover path traversal sanitization	2026-07-01 02:03:48 -07:00
memosr	3590543312	fix(security): strip directory components from Teams recording display_name to prevent path traversal	2026-07-01 02:03:48 -07:00
teknium1	6d30f8c0ab	chore: add AUTHOR_MAP entry for PR #52534 salvage (@qWaitCrypto)	2026-07-01 02:03:40 -07:00
qWaitCrypto	e1ff736f26	fix(anthropic): preserve ordered replay cache markers	2026-07-01 02:03:40 -07:00
qWaitCrypto	80d71e8d2e	fix(anthropic): preserve tool use cache markers	2026-07-01 02:03:40 -07:00

1 2 3 4 5 ...

13991 commits