hermes-agent

Author	SHA1	Message	Date
Teknium	d57a4c197c	fix(tools): stop _strategy_exact emitting overlapping matches (#56211) _strategy_exact advanced its scan cursor by pos+1 instead of pos+len(pattern), so self-overlapping patterns (e.g. "aa" in "aaaa") matched at overlapping offsets. _apply_replacements works in reverse order, so the second replacement operated on already-modified content using stale offsets, corrupting the file and reporting the wrong count under replace_all=True. Advancing by len(pattern) matches str.replace() semantics.	2026-07-01 02:13:13 -07:00
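The cursor fix in d57a4c197c can be illustrated with a minimal sketch. The function names below are hypothetical stand-ins, not the real `_strategy_exact` / `_apply_replacements` code; the point is only that advancing by the pattern length yields non-overlapping spans, which are then safe to replace in reverse order.

```python
def find_exact_matches(content: str, pattern: str) -> list[tuple[int, int]]:
    """Return non-overlapping (start, end) spans of pattern in content."""
    if not pattern:
        return []
    spans = []
    cursor = 0
    while True:
        pos = content.find(pattern, cursor)
        if pos == -1:
            break
        spans.append((pos, pos + len(pattern)))
        # Advance past the whole match (not pos + 1), so "aa" in "aaaa"
        # yields two spans instead of three overlapping ones.
        cursor = pos + len(pattern)
    return spans


def replace_spans(content: str, spans: list[tuple[int, int]], replacement: str) -> str:
    """Apply replacements back to front so earlier offsets stay valid."""
    for start, end in reversed(spans):
        content = content[:start] + replacement + content[end:]
    return content


spans = find_exact_matches("aaaa", "aa")
result = replace_spans("aaaa", spans, "b")
```

With non-overlapping spans, the reverse-order rewrite agrees with `str.replace()` on the same input.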
kshitijk4poor	a658f3b28b	fix(security): strip dynamic Hermes secrets from all subprocess spawn env Subprocesses spawned by the terminal tool, execute_code, Docker backend, and the codex app-server could inherit Hermes-internal secrets that the name-based `_HERMES_PROVIDER_ENV_BLOCKLIST` can't enumerate, because they're injected into `os.environ` at runtime under dynamic names: - `AUXILIARY_<TASK>_API_KEY` / `AUXILIARY_<TASK>_BASE_URL` — per-task side-LLM credentials bridged from `config.yaml[auxiliary]` by gateway/run.py and cli.py (vision, web_extract, approval, compression, plugin-registered tasks). Often separate, higher-spend keys plus base URLs pointing at private endpoints. - `GATEWAY_RELAY__SECRET` / `_KEY` / `_TOKEN` — relay-auth material provisioned by gateway/relay. Additionally, agent/transports/codex_app_server.py built its spawn env from a raw `os.environ.copy()`, bypassing the centralized `hermes_subprocess_env()` helper entirely — handing every codex subprocess the full Tier-1 secret set (GH_TOKEN, gateway bot tokens, Modal/Daytona infra tokens, dashboard session token) unfiltered. This is the #29157 sibling spawn-site gap; copilot_acp_client already routes through the helper. Fix — single chokepoint: - Add `_is_hermes_internal_secret(key)` in tools/environments/local.py as the single source of truth for the dynamic secret patterns. Matches AUXILIARY__API_KEY / _BASE_URL and GATEWAY_RELAY__SECRET/_KEY/_TOKEN; leaves non-secret AUXILIARY__PROVIDER/_MODEL and GATEWAY_RELAY routing hints visible. - Wire the predicate into every spawn path unconditionally (ignores skill env_passthrough opt-in AND inherit_credentials — a model-driving CLI never needs these): `_sanitize_subprocess_env` (both loops), `_make_run_env` (foreground), `hermes_subprocess_env` (Tier-1), and the Docker forward filter. - Add the static GATEWAY_RELAY_* names to `_HERMES_PROVIDER_ENV_BLOCKLIST` so the exact-match path catches them independently of the predicate. 
- Add the GATEWAY_RELAY_ID/_SECRET/_DELIVERY_KEY triplet to `_ALWAYS_STRIP_KEYS` (Tier-1) so it is stripped unconditionally on EVERY spawn surface — including the codex/copilot `inherit_credentials=True` path that skips the Tier-2 blocklist. `_SECRET`/`_DELIVERY_KEY` are already predicate-matched; `_ID` has no secret suffix, so enumerating it here is what closes its leak on the inherit path (self-review W1). - Defense in depth: env_passthrough.py `_is_hermes_provider_credential()` now consults the same predicate, so a skill can't register these names as passthrough and tunnel them into an execute_code / terminal child. - Route codex_app_server through `hermes_subprocess_env(inherit_credentials=True)` — strips Tier-1 + dynamic-internal secrets while provider creds (which codex needs to authenticate) still flow. Consolidates PRs #53715 (necoweb3 — the _is_hermes_internal_secret backbone + Docker filter), #53503 (srojk34 — env_passthrough guard), and #55709 (srojk34 — codex routing). Retires #52348 (claudlos): its copilot half is already on main, and its codex half used the full-strip `_sanitize_subprocess_env` which would break codex provider auth — the correct tier is `inherit_credentials=True`. Tests: TestHermesInternalDynamicSecrets (terminal + predicate + passthrough override), TestInternalDynamicSecrets (hermes_subprocess_env both tiers), TestSpawnEnvSecretStripping (codex spawn env), plus env_passthrough defense-in-depth cases. Co-authored-by: necoweb3 <sswdarius@gmail.com> Co-authored-by: srojk34 <286497132+srojk34@users.noreply.github.com> Co-authored-by: claudlos <claudlos@agentmail.to>	2026-07-01 14:37:22 +05:30
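A rough sketch of the single-predicate idea from a658f3b28b follows. The regular expression is an assumption about the shape of the dynamic names described in the message, and `is_internal_secret` / `filtered_env` are hypothetical names; the real patterns live in `tools/environments/local.py`.

```python
import re

# Assumed name shapes, inferred from the commit message only:
#   AUXILIARY_<TASK>_API_KEY / AUXILIARY_<TASK>_BASE_URL
#   GATEWAY_RELAY_..._SECRET / _KEY / _TOKEN
_INTERNAL_SECRET_RE = re.compile(
    r"^(?:AUXILIARY_[A-Z0-9_]+_(?:API_KEY|BASE_URL)"
    r"|GATEWAY_RELAY_[A-Z0-9_]*(?:SECRET|KEY|TOKEN))$"
)


def is_internal_secret(key: str) -> bool:
    """True for dynamically named secrets that must never reach a child."""
    return _INTERNAL_SECRET_RE.match(key) is not None


def filtered_env(env: dict[str, str]) -> dict[str, str]:
    """Copy env without internal secrets; routing hints stay visible."""
    return {k: v for k, v in env.items() if not is_internal_secret(k)}


child_env = filtered_env({
    "PATH": "/usr/bin",
    "AUXILIARY_VISION_API_KEY": "sk-example",
    "AUXILIARY_VISION_MODEL": "some-model",
    "GATEWAY_RELAY_DELIVERY_KEY": "example",
})
```

Keeping the check in one predicate means each spawn path calls the same function, so a new secret suffix only has to be added in one place.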
Teknium	7534b5be2c	fix(security): anchor rm hardline rules to command position (#56193) A literal "rm -rf /" carried as DATA inside another command's quoted argument (a PR title, a git commit -m message, an echo/printf arg) tripped the unconditional root-filesystem hardline and could not run at all. `gh pr create --title "block rm -rf / spellings"` was blocked outright, because the bare rm path branch matched the mid-string "rm" (via \brm) with the space after "/" satisfying its (\s|$) terminator. Anchor the shared _RM_FLAG_PREFIX to _CMDPOS so the rm hardline rules fire only when rm is an actual command word (start of line, after a separator ; && || |, after a subshell opener $()/backtick, or after sudo/env/exec wrappers), not when the string appears as an argument value. Broaden the bare-path terminator to also accept shell metacharacters ) ` ; | & so a real wipe inside a command substitution is still caught. The quoted-path branch is unchanged, so quoted root/HOME paths stay blocked. Adds regression tests for both directions: data-arg false positives must NOT block, real wipes at every command position must block.	2026-07-01 01:54:43 -07:00
claudlos	b24708eda0	security(cron): block base_url overrides that exfiltrate provider credentials The model-facing cronjob tool accepts free-form provider + base_url. On fire, the scheduler pairs the named provider's stored credential with the job's base_url, so a prompt-injected job (e.g. provider=anthropic, base_url=https://attacker/v1) sends the real API key to an attacker endpoint. A base_url with no provider inherits the default provider's key for the same effect. Add a fail-closed guard at the tool boundary: a base_url override is allowed only for the custom/BYOK sentinel, a configured custom_providers entry, or when the override host matches the named provider's own endpoint; an override without an explicit provider is rejected. The trust boundary is the caller, so operator-configured base_urls for named providers are unaffected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-07-01 14:23:01 +05:30
necoweb3	dc8b5b4f47	fix(approval): detect encoding-based dangerous command bypass (#30100) echo <base64> | base64 -d | bash (and base32/base16, xxd -r, tr transforms, openssl base64/enc -d) decode a dangerous command at runtime; the raw text carries no dangerous keyword, so the denylist never fired. Adds DANGEROUS_PATTERNS entries for decode-and-execute pipes into a shell.	2026-07-01 01:39:10 -07:00
YLChen-007	4b5fce66f5	fix(approval): flag remote content via command substitution (#26964) eval $(curl ...), source $(wget ...), and . $(curl ...) executed remote content but were not covered by the existing pipe-to-shell / process-substitution patterns. Adds a DANGEROUS_PATTERNS entry so these command-substitution forms consistently request approval. Original authorship preserved from PR #26965 (bot-authored commit re-attributed to the human contributor).	2026-07-01 01:39:10 -07:00
xy200303	1ebc56ca39	fix(approval): detect shell-expanded command names (#36846) Command-name obfuscation bypassed the dangerous-command denylist: the executable name could be spelled with shell tricks that survive regex matching but still resolve to a blocked command at runtime, such as $(echo rm), ${0/x/r}m, backticks, and printf substitutions. Adds a non-executing shell-word scanner that deobfuscates only at command positions (start, after ;|&&||, inside $(...), after sudo/env/exec/... wrappers) and feeds the resulting variants through the existing HARDLINE_PATTERNS / DANGEROUS_PATTERNS, with no second blocklist. Scoping to command words keeps ordinary arguments (echo $(echo rm) -rf /) from being promoted into command names. Co-authored-by: egilewski <1078345+egilewski@users.noreply.github.com>	2026-07-01 01:39:10 -07:00
teknium1	17f07aebdc	fix(security): close shell line-continuation bypass in command detection `_normalize_command_for_detection` strips backslash-escapes before matching DANGEROUS_PATTERNS and HARDLINE_PATTERNS, but the strip rule was `re.sub(r'\\([^\n])', r'\1', ...)` — its `[^\n]` class deliberately skips newlines. A backslash immediately followed by a newline is a POSIX line continuation: the shell removes BOTH characters and joins the tokens, so `rm -rf \<newline>/` executes as `rm -rf /`. With the dangling backslash left in place, the structured rm/dd/mkfs patterns no longer match because a literal `\` sits wedged between the tokens they expect to be adjacent. The worst consequence is on the HARDLINE floor. The dangerous-command layer still fired here only by accident (the generic `\brm\s+-[^\s]r` "recursive delete" rule needs no path), and that layer is bypassed by `--yolo` / `approvals.mode=off`. The hardline blocklist — the unconditional floor reserved for catastrophic, unrecoverable commands and meant to hold even under yolo — anchors the root path directly after the flags, so `rm -rf \<newline>/`, `rm -r\<newline>f /`, and `rm -rf \<newline>~` all slipped past it entirely. A yolo session could therefore wipe the root filesystem. The fix collapses line continuations (`\` + `\n` or `\r\n`) to nothing, mirroring the shell, before the existing escape strip runs. This was the gap left by `621bf3a87`, which added the escape strip but only for non-newline chars. ## What does this PR do? Closes a shell line-continuation bypass in the dangerous-command detector. Before: `rm -rf \<newline>/` normalized to `rm -rf \<newline>/`, so the hardline root-delete patterns did not match and the command could run under `--yolo`. After: line continuations are collapsed first, the command normalizes to `rm -rf /`, and the hardline floor blocks it unconditionally. 
## Related Issue N/A ## Type of Change - [x] 🔒 Security fix ## Changes Made - `tools/approval.py`: in `_normalize_command_for_detection`, add `command = re.sub(r'\\\r?\n', '', command)` ahead of the existing backslash-escape strip so shell line continuations (`\`+newline, LF or CRLF) are removed exactly as the shell would, instead of leaving a stray backslash that breaks the structured patterns. - `tests/tools/test_hardline_blocklist.py`: add a parametrized `test_hardline_blocks_line_continuation` covering the root, in-flag, home, CRLF, and mkfs continuation forms, plus `test_line_continuation_root_wipe_cannot_bypass_hardline` asserting the continuation root wipe stays blocked even with `HERMES_YOLO_MODE=1`. ## How to Test 1. Reproduce: stash the `tools/approval.py` change and run `scripts/run_tests.sh tests/tools/test_hardline_blocklist.py`; the new line-continuation cases fail (`rm -rf \<newline>/` is not flagged hardline, and leaks past the floor under yolo). 2. Restore the change and rerun the file; all 106 tests pass. 3. Regression: `scripts/run_tests.sh tests/tools/test_approval.py` (the existing fullwidth/ANSI/null-byte normalization and multiline cases still pass).	2026-07-01 01:38:59 -07:00
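The ordering described in 17f07aebdc (collapse line continuations first, then strip the remaining backslash escapes) can be sketched as below. The two `re.sub` calls are the ones quoted in the commit message; the function name `normalize_for_detection` is a hypothetical stand-in for `_normalize_command_for_detection`, which does more than this.

```python
import re


def normalize_for_detection(command: str) -> str:
    """Mirror the shell's handling of backslashes before pattern matching."""
    # Step 1: a backslash followed by LF or CRLF is a line continuation.
    # The shell removes both characters and joins the tokens.
    command = re.sub(r'\\\r?\n', '', command)
    # Step 2: strip remaining backslash escapes (the pre-existing rule,
    # which skips newlines and therefore needed step 1 in front of it).
    command = re.sub(r'\\([^\n])', r'\1', command)
    return command


joined = normalize_for_detection("rm -rf \\\n/")
```

After normalization the structured patterns see `rm -rf /` with the flags and path adjacent, which is what the hardline rules expect.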
teknium1	1d8bd73414	fix(approval): treat # as comment boundary only when whitespace-preceded The salvaged write-target boundary included `#` in its char class, so a `#` glued to the redirect/tee path (`echo x > .env#backup`) matched as a comment boundary and flagged the write as dangerous. But the shell writes to the distinct file `.env#backup`, not `.env`: a false positive, same class as the config.yaml.bak case the PR already excluded. Drop `#` from the boundary; a real trailing comment is always whitespace-preceded (\s). Adds regression tests for .env#backup, config.yaml#backup, and tee .env#backup staying out of the deny.	2026-07-01 01:27:26 -07:00
friendshipisover	7bfdc0bca6	fix(security): close env/config write-deny bypass via trailing arg or comment The dangerous-command approval gate has rules that flag a shell command when it overwrites a project `.env` or `config.yaml`; these files hold API keys, DB passwords, and (for `config.yaml`) the approval policy itself, so a write to them should require user approval. The matching `write_file`/`patch` deny on the file-tools side was paired with these terminal-side rules so neither path is an open door. The redirection and `tee` rules anchored the sensitive path with `_COMMAND_TAIL` (`(?:\s*(?:&&|\|\||;).*)?$`), which only tolerates the rest of the line being empty or starting with a command separator. The problem: in POSIX shell the redirection target is fixed regardless of what trails it. `echo secret > .env extra` still truncates `.env` (the `extra` is just another argument to `echo`), and `echo secret > .env # note` does too (the `#` starts a comment). Because neither tail is a separator, the old anchor failed to match and the command sailed through approval, so a prompt-injected step could overwrite a project `.env`/`config.yaml` unprompted. The system-path redirection rule one line above never had this restriction and already caught these forms. The fix introduces `_WRITE_TARGET_BOUNDARY`, a lookahead that only requires the path token to END at a shell word boundary (whitespace, quote, separator, redirection operator, `#`, or EOL) rather than demanding the rest of the line be empty. It is applied to the two stream-write rules (redirection and `tee`) where the sensitive path is always a write target. The `cp`/`mv`/`install` rule deliberately keeps `_COMMAND_TAIL`: there the sensitive file is only a target when it is the LAST argument (the destination), so requiring end-of-line is correct and keeps `cp config.yaml backup.yaml` (config.yaml as the source) out of the deny. ## What does this PR do? 
Closes a bypass in the dangerous-command approval gate where a trailing argument or `#` comment after a `>`/`>>`/`tee` write target let a command overwrite a project `.env` or `config.yaml` without triggering approval, even though the shell still overwrites the file. ## Related Issue N/A ## Type of Change - [x] 🔒 Security fix ## Changes Made - `tools/approval.py`: add `_WRITE_TARGET_BOUNDARY` (a word-boundary lookahead) and use it instead of `_COMMAND_TAIL` in the two project-env/config stream-write patterns ("overwrite project env/config via tee" and "via redirection"). `_COMMAND_TAIL` is kept and still used by the `cp`/`mv`/`install` rule, where end-of-line anchoring is the correct semantics. - `tests/tools/test_approval.py`: add regression tests for `> .env extra`, `> .env # note`, `>> config.yaml foo`, and `tee .env backup` (now flagged), plus `> config.yaml.bak` (must stay safe, since it is a different file). ## How to Test 1. Reproduce: before the fix, `detect_dangerous_command("echo secret > .env extra")` returns `(False, None, None)`, so the overwrite is not flagged. 2. Apply the fix; the same call now returns the "overwrite project env/config via redirection" detection. 3. Run `pytest tests/tools/test_approval.py -q`; the new cases pass and the existing `cp config.yaml backup.yaml` / `config.yaml.bak` false-positive guards still hold.	2026-07-01 01:27:26 -07:00
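To make the word-boundary lookahead from 7bfdc0bca6 concrete, here is a minimal sketch of a redirection rule. All three pattern constants are hypothetical approximations, not the actual definitions in `tools/approval.py`, and `#` is left out of the boundary class in line with the follow-up commit 1d8bd73414.

```python
import re

# Hypothetical boundary: the path token must END at whitespace, a quote,
# a separator, a redirection operator, or end of line. '#' is omitted so
# `.env#backup` (a distinct file) is not flagged.
WRITE_TARGET_BOUNDARY = r"(?=[\s\"';&|<>]|$)"
# Hypothetical sensitive-path matcher: optional directory prefix + file name.
SENSITIVE_PATH = r"(?:[^\s\"'<>|;&]*/)?(?:\.env|config\.yaml)"
REDIRECT_WRITE = re.compile(r">>?\s*[\"']?" + SENSITIVE_PATH + WRITE_TARGET_BOUNDARY)


def overwrites_project_config(command: str) -> bool:
    """True when a > or >> redirection targets .env or config.yaml."""
    return REDIRECT_WRITE.search(command) is not None


flagged = overwrites_project_config("echo secret > .env extra")
```

Because the lookahead only constrains where the path token ends, trailing arguments and whitespace-preceded comments no longer hide the write, while `config.yaml.bak` still fails to match.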
kshitijk4poor	83ae65487e	test(browser): cover guard-inactive + camofox short-circuit paths; fix blank lines Review follow-up on the private-page action guard: - Add test_guard_inactive_does_not_block_or_probe: when the SSRF guard is inactive (local backend / allow_private_urls), click/type/press must proceed WITHOUT probing the page URL. This is the branch most likely to silently regress if the guard condition is inverted; a mutation check (flipping the condition) confirms the test fails as designed. - Add test_camofox_short_circuits_before_guard: camofox mode returns from the dedicated camofox_* path before the guard runs; guards never consulted. - Fix PEP8: 3 -> 2 blank lines before _blocked_private_page_action.	2026-07-01 13:56:49 +05:30
dsad	3e4c138251	fix(browser): block private-page interactions after eval navigation	2026-07-01 13:56:49 +05:30
rrevenanttt	a81b519d41	fix(security): close hardline rm bypass via quoted paths and ${HOME} ## What does this PR do? Closes a critical hole in the hardline command floor. HARDLINE_PATTERNS is the unconditional last line of defense: detect_hardline_command runs BEFORE every yolo / approvals.mode=off / cron approve-mode bypass, so it is the only gate standing between the agent (or a prompt-injected instruction) and an irrecoverable disk wipe. The three rm rules anchored on a bare path token, and _normalize_command_for_detection never strips shell quotes, so the ordinary, recommended shell idioms slipped straight through: `rm -rf "/"`, `rm -rf '/'`, `rm -rf "/etc"`, `rm -rf "$HOME"`, `rm -rf ${HOME}`, `rm -rf "${HOME}"`. All of these returned NO hardline match. A leading quote pushes the path out of reach of the flag group, a trailing quote breaks the `(\s|$)` terminator, and the `${HOME}` brace form was never listed at all. Under --yolo, approvals.mode=off, or cron approve-mode the dangerous-command layer is also skipped, so these commands reached execution with zero gate, which is exactly the unrecoverable data loss the floor is documented to make impossible. Because quoting paths and `${HOME}` are normal shell usage, not exotic obfuscation, this is a high-severity, easily-triggered bypass. The fix makes the rm path matcher quote- and brace-tolerant while staying conservative: a path is matched when it is either fully wrapped in its own matching quote pair (`"/"`) or bare with a whitespace/end terminator. The matching-quote requirement is deliberate so the change adds no new false positives: a dangerous-looking string that is merely an argument to another command (e.g. `git commit -m "rm -rf /"`) has a closing quote but no opening quote of its own around the path, so neither branch fires. 
## Related Issue N/A ## Type of Change - [x] 🔒 Security fix ## Changes Made - `tools/approval.py`: added `_hardline_rm_path()` (matches a destructive path either fully quoted or bare-with-terminator), factored the protected system-dir list into `_HARDLINE_SYSTEM_DIRS` and the rm flag prefix into `_RM_FLAG_PREFIX`, and rebuilt the three rm `HARDLINE_PATTERNS` on top of them, adding the `${HOME}` brace form. Kept as plain concatenation so regex backslashes never land inside an f-string field (Python 3.11 floor). - `tests/tools/test_hardline_blocklist.py`: added quoted (`"/"`, `'/'`, `"/etc"`, `"$HOME"`, ...) and brace (`${HOME}`, `"${HOME}"`) cases to the must-block set, a dedicated `_QUOTED_BRACE_BYPASS` regression parametrization, no-false-positive guards (`git commit -m "rm -rf /"`), and extended the yolo-cannot-bypass integration test to cover the quoted/brace forms. ## How to Test 1. Reproduce the bypass on `main`: `detect_hardline_command('rm -rf "/"')` returns `(False, None)`, so the floor lets it through. 2. With this change it returns `(True, "recursive delete of root filesystem")`; the same holds for `'/'`, `"/etc"`, `"$HOME"`, `${HOME}`, `"${HOME}"`. 3. Run the suite: `scripts/run_tests.sh tests/tools/test_hardline_blocklist.py` reports 125 passed, including the new bypass and no-false-positive cases.	2026-07-01 01:25:24 -07:00
zapabob	500c2b1e46	fix(security): close SSRF redirect-guard bypass across all httpx download hooks Inside httpx AsyncClient response event hooks, response.next_request is often None even for a genuine redirect, so guards keyed on `if response.is_redirect and response.next_request` silently never fire. A public URL that 302s to http://169.254.169.254/ was followed anyway, defeating the pre-flight is_safe_url() check. Resolve the redirect target from the Location header (via urljoin, so relative Locations work too), falling back to next_request only when no Location is present. Extracted as tools.url_safety.redirect_target_from_response and wired into every SSRF redirect guard: - gateway/platforms/base.py (shared image + audio download for all platforms) - tools/vision_tools.py (two download hooks) - plugins/platforms/slack/adapter.py Original fix by @zapabob (PR #35940), which targeted the since-refactored gateway/platforms/slack.py; reconstructed onto the current shared sites and widened to the whole bug class.	2026-07-01 01:18:53 -07:00
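The Location-header resolution described in 500c2b1e46 might look roughly like the sketch below. It uses only the standard library rather than httpx, takes the headers as a plain dict with lowercase keys, and the function names are hypothetical (the real helper is `tools.url_safety.redirect_target_from_response`). The IP check is a deliberately narrow stand-in for `is_safe_url()` and only understands IP-literal hosts.

```python
import ipaddress
from typing import Optional
from urllib.parse import urljoin, urlsplit


def redirect_target(request_url: str, headers: dict,
                    next_request_url: Optional[str] = None) -> Optional[str]:
    """Resolve where a redirect points, preferring the Location header."""
    location = headers.get("location")
    if location:
        # urljoin handles absolute and relative Location values alike.
        return urljoin(request_url, location)
    # Fall back to the client's own idea of the next request, if any.
    return next_request_url


def is_private_ip_literal(url: str) -> bool:
    """Narrow illustration: flags only hosts written as private IP literals."""
    host = urlsplit(url).hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname; a real guard would also resolve DNS
    return ip.is_private or ip.is_link_local or ip.is_loopback


target = redirect_target("https://example.com/img.png",
                         {"location": "http://169.254.169.254/"})
blocked = target is not None and is_private_ip_literal(target)
```

Reading the header directly means the guard still fires when the client has not yet populated its next-request object inside the response hook.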
kshitijk4poor	e09ff88d02	fix(browser): close remaining CDP-URL leak paths in supervisor (review) Review of the salvage found the timeout-message redaction left the more common failure mode unguarded: when the first websockets.connect(cdp_url) fails (bad URI / refused / TLS), the raw websockets exception -- which embeds the full cdp_url incl. ?token= and user:pass@ -- is stashed as _start_error and re-raised verbatim by start(), and two reconnect logger.warning sites log the same raw exception. Add a module-level _redact_cdp_error_text() chokepoint (delegating to agent.redact.redact_cdp_url) and route all four supervisor egress points through it: - start() TimeoutError message (already covered; kept) - start() _start_error re-raise -> now raises a redacted RuntimeError with 'from None' so no secret leaks via message OR traceback cause chain - connect-failed and session-dropped reconnect warnings Guard tests assert the re-raised message is redacted for both token and userinfo, the raw cause is suppressed, and the helper preserves non-secret context (host/reason). Verified with a mutation check: reverting to the raw 'raise err' fails the new tests. Correct the redact_cdp_url docstring to scope its guarantee to direct-URL redaction and point exception callers at the supervisor helper.	2026-07-01 13:43:58 +05:30
kshitijk4poor	c626dded13	refactor(redact): consolidate CDP-URL log redaction into one chokepoint The session-log fix (browser_tool._sanitize_url_for_logs) and the supervisor attach-timeout fix (CDPSupervisor.start) both composed the same three redactors (redact_sensitive_text -> _redact_url_query_params -> _redact_url_userinfo) to mask CDP endpoint credentials. Two copies of one policy drift: tune one site (e.g. add fragment masking) and the other silently re-leaks. Promote that composition to a single public helper redact_cdp_url() in agent/redact.py -- the one place the CDP-URL redaction policy lives -- and route both call sites through it (_sanitize_url_for_logs becomes a thin wrapper; the supervisor imports the helper instead of re-composing the private redactors). Add direct unit tests for the seam covering query tokens, multiple credentials, userinfo passwords, plain-URL passthrough, non-string/exception coercion, and None. No behavior change at the call sites; both leak paths remain closed.	2026-07-01 13:43:58 +05:30
srojk34	265da9cadb	fix(browser): redact CDP URL token in _create_cdp_session log and supervisor timeout PR #54851 added _sanitize_url_for_logs() and wired it into the three log sites inside _resolve_cdp_override(). A fourth site was missed: _create_cdp_session() logs the already-resolved cdp_url unconditionally, and CDPSupervisor.start() interpolates the raw cdp_url[:80] into the attach-timeout TimeoutError (which _ensure_cdp_supervisor() logs with %s). Both leak query-string credentials (e.g. ?token=secret from hosted CDP providers) into Hermes logs. Sanitize the URL at both remaining sites. The raw URL is preserved unmodified in the returned session dict and used for the real connection; only the logged/error representation is redacted. Salvaged from #55883. Co-authored-by: srojk34 <286497132+srojk34@users.noreply.github.com>	2026-07-01 13:43:58 +05:30
Jace Nibarger	060779bb76	fix: bound threat-pattern/FTS5 regex input and cover V4A Move-File edits Salvaged from PR #35130 (the safe subset of jnibarger01's security pass): - threat_patterns.py: replace unbounded (?:\w+\s+)* filler with bounded {0,8} + cap scan input at MAX_SCAN_CHARS (64KiB), and bound the .* runs in the exfil/config-mod patterns. Kills catastrophic backtracking on adversarial near-misses. - hermes_state.py: cap FTS5 query length (MAX_FTS5_QUERY_CHARS) and extract quoted phrases with a linear scan instead of a regex so pathological quote runs can't induce backtracking. - acp_adapter/edit_approval.py + agent/tool_dispatch_helpers.py: recognize '*** Move File: src -> dst' V4A headers so patch-mode edits are permissioned/traversal-checked (previously only Update/Add/Delete), and surface a proposal for mode=patch V4A calls (previously replace-only). Tests: +ReDoS-bound + FTS5-cap + Move-File-target + V4A-approval cases.	2026-07-01 01:05:28 -07:00
zapabob	8e492b5567	fix(file): block credential paths from search results	2026-07-01 01:02:35 -07:00
Matt Kotsenas	dd22c2f533	fix(mcp): preserve 'definitions' as a property name in tool schemas The MCP input-schema normalizer in _normalize_mcp_input_schema promotes the legacy JSON Schema 'definitions' meta-keyword to '$defs' (draft 2019-09+) so local '$ref' resolution works downstream. The previous walk renamed any key named 'definitions' anywhere in the tree, including inside 'properties' dicts. That turned user-facing parameter names into '$defs', producing property keys that contain '$', which Anthropic and OpenAI both reject with HTTP 400 (pattern '^[a-zA-Z0-9_.-]{1,64}$'). Real-world repro: an MCP server that exposes a CI/pipelines tool whose 'definitions' parameter is an array of pipeline-definition IDs. Such a tool is enough on its own to break every conversation, because the full tools array is sent on every request. Fix: when descending into a 'properties' or 'patternProperties' mapping, iterate property-name -> schema pairs directly, leaving the property names verbatim. Ordinary JSON Schema semantics resume inside each property's schema, so a legitimately nested 'definitions' meta-keyword inside a property's schema is still promoted. Adds two regression tests: - test_definitions_as_property_name_is_preserved (the property-name case) - test_definitions_property_and_meta_keyword_coexist (both forms in one schema; the property name stays, the meta-keyword promotes)	2026-07-01 01:02:23 -07:00
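The traversal rule from dd22c2f533 can be sketched as a small recursive walk. `promote_definitions` is a hypothetical name, not the real `_normalize_mcp_input_schema`, and the sketch leaves `$ref` strings untouched; it only shows how property names are kept verbatim while the `definitions` meta-keyword is still promoted.

```python
def promote_definitions(node):
    """Rename the 'definitions' meta-keyword to '$defs', except where
    'definitions' is a user-facing property name."""
    if isinstance(node, list):
        return [promote_definitions(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key in ("properties", "patternProperties") and isinstance(value, dict):
            # Keys of this mapping are property names: keep them verbatim
            # and resume normal schema handling inside each value.
            out[key] = {name: promote_definitions(schema)
                        for name, schema in value.items()}
        elif key == "definitions":
            out["$defs"] = promote_definitions(value)
        else:
            out[key] = promote_definitions(value)
    return out


schema = {
    "type": "object",
    "properties": {
        "definitions": {"type": "array", "items": {"type": "integer"}},
        "config": {"type": "object", "definitions": {"X": {"type": "string"}}},
    },
    "definitions": {"Y": {"type": "number"}},
}
normalized = promote_definitions(schema)
```

In `normalized`, the parameter called `definitions` survives as a property, while both the top-level and the nested meta-keyword become `$defs`.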
峯岸亮	bc6cd46925	fix(agent): restrict todo hydration to paired assistant todo calls The gateway/API server rebuilds the in-memory TodoStore by replaying caller-supplied conversation_history. _hydrate_todo_store previously accepted any role:tool message containing a "todos" array, so a forged bare tool result could seed arbitrary todo state and re-inflate context every turn (GHSA-5g4g-6jrg-mw3g). Restrict hydration to tool results paired with an earlier assistant todo tool call (matching tool_call_id, function name == todo, no user/system boundary between). Reuse the existing _get_tool_call_id/ name_static helpers so dict- and object-shaped tool calls both work. Add a generous MAX_TODO_RESULT_CHARS payload guard to drop absurd forged results before parsing; item/content caps already exist on main. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-07-01 01:02:17 -07:00
binhnt92	bcfc7458fa	fix remote sync-back credential overwrite	2026-07-01 01:00:31 -07:00
Justin Ohms	8f21311906	fix(delegation): route native-SDK providers through runtime resolver; fail on '(empty)' sentinel Two related bugs caused subagent delegation to silently return empty summaries with 0 tokens when the user configured delegation.provider=bedrock alongside delegation.base_url=https://bedrock-runtime.<region>.amazonaws.com. Root cause #1 — misrouting in _resolve_delegation_credentials(): The configured_base_url branch unconditionally forced provider='custom' and api_mode='chat_completions', only specializing for chatgpt.com, anthropic, and kimi hosts. Bedrock (and other native-SDK providers) fell through as 'custom' + chat_completions, which then POSTed OpenAI-shaped JSON at Bedrock's native API. Bedrock rejected the payload and returned nothing, which looked like an empty LLM response to the child agent. Fix: when provider is one of {bedrock, vertex, google, google-genai}, skip the base_url short-circuit and fall through to resolve_runtime_provider(), which knows how to construct the proper SDK client. base_url can still be forwarded through that path for regional overrides. Root cause #2 — '(empty)' sentinel accepted as success: After N retries of empty LLM responses, run_agent.py emits the literal string '(empty)' as final_response. _run_single_child then hit `elif summary:` — '(empty)' is truthy, so status became 'completed' and the parent surfaced a blank result with no error. Users saw api_calls=4, tokens=0, duration~0.4s, status=completed. Fix: treat final_response.strip() == '(empty)' as a failure so the parent surfaces it instead of silently accepting zero-content 'success'. Both paths were reproduced in a live Hermes TUI session on us-west-2 Bedrock (provider=bedrock, model=us.anthropic.claude-sonnet-4-6) and are covered by new tests in tests/tools/test_delegate.py.	2026-07-01 00:45:31 -07:00
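The second fix in 8f21311906 comes down to not trusting truthiness alone. A minimal sketch, with `classify_child_result` as a hypothetical helper rather than the real `_run_single_child` logic:

```python
from typing import Optional

EMPTY_SENTINEL = "(empty)"  # literal emitted after repeated empty LLM responses


def classify_child_result(final_response: Optional[str]) -> str:
    """Return 'completed' only when the child produced real content."""
    summary = (final_response or "").strip()
    # '(empty)' is a truthy string, so a bare `if summary:` would accept it.
    if not summary or summary == EMPTY_SENTINEL:
        return "failed"
    return "completed"


status = classify_child_result("(empty)")
```

Treating the sentinel as a failure lets the parent surface an error instead of a blank "successful" summary.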
LeonSGP43	55d92516c8	fix(skills): publish fetchable metadata for official skills	2026-07-01 00:40:56 -07:00
teknium1	56d4bfe4ba	fix(approval): honour tirith_fail_open in cron-deny tirith path + tests Follow-up to the salvaged #22070. The cron-deny tirith ImportError branch was unconditionally fail-open; now it honours security.tirith_fail_open: false by blocking (a cron session has no user to approve), mirroring the main flow's fail-closed synthesis (#20733). Adds regression tests: tirith-only content threat blocked in cron-deny, plus fail-closed/fail-open ImportError behavior.	2026-07-01 00:13:36 -07:00
Rodrigo	c50f517bff	fix(approval): run tirith check in cron-deny mode to catch content-level threats In check_all_command_guards, the cron-deny path only ran detect_dangerous_command (regex patterns). The tirith check starts at line 1017, after the early return at line 1002, so content-level threats caught only by tirith (homograph URLs, pipe-to-interpreter, terminal injection) were silently approved in cron sessions even with approvals.cron_mode: deny. Add a tirith call inside the cron-deny block, mirroring the same ImportError guard used in the main flow. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-07-01 00:13:36 -07:00
DanAsBjorn	a537baa81d	fix(matrix): route text-only send_message through adapter for E2EE support Text-only Matrix messages sent via the send_message engine (hermes send, cron deliver: matrix) arrived unencrypted (red padlock) in E2EE rooms. Media sends already routed through the mautrix adapter and encrypted fine, but text-only sends took the raw-HTTP standalone_sender_fn path, which never encrypts. Route ALL Matrix sends through _send_matrix_via_adapter so text is encrypted too. The adapter reuses the live gateway's E2EE session when available (#46310) and falls back to an encryption-aware ephemeral adapter for standalone/cron contexts. The registry standalone_sender_fn stays registered for the contract; it is simply no longer reached for Matrix. Salvaged from PR #20259 onto current main (the original patched the pre-#41112 _send_matrix branch, which had since moved to the plugin's standalone path). Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-07-01 00:12:11 -07:00
teknium1	0f66995e2a	fix(approval): catch GNU long-flag abbreviations for chown --recursive and git push --force GNU tools accept unique long-option prefix abbreviations at runtime, so `chown --recurs root` and `git push --forc` evaded the approval gate's exact-match `--recursive`/`--force` patterns. Switch those two entries to prefix matches (--recur[a-z], --forc[a-z]). The rm/chmod/sed long-flag patterns were left unchanged: every abbreviation of those is already caught by the sibling short-flag and target patterns (rm -[^s]r, base chmod 777, sed -[^s]i), so prefix-matching them is a no-op. Only chown (beyond the coincidental case-insensitive r->R catch) and git push had genuine gaps. Co-authored-by: Subway2023 <subw3@mail2.sysu.edu.cn>	2026-06-30 17:32:28 -07:00
Scott Gabel	4a7a6fd401	fix(approval): redact secrets in user-facing approval prompts The dangerous-command approval prompt renders the flagged command so the user can decide whether to approve. If the agent constructed it with a credential (curl -H 'Authorization: Bearer sk-...', psql postgres://user:pw@host, an execute_code script with api_key = 'sk-...'), that secret hit stdout and, via the gateway notify payload, Discord/Slack messages — which are screenshottable and forwardable. Apply the existing agent.redact.redact_sensitive_text() to every user-facing approval surface. Redaction is display-only: the raw command still executes after approval, and approval persistence keys off pattern_key (not the command text), so the allowlist is unaffected. Decision context (URL, flags, command structure) is preserved; only the secret value masks. Covers all surfaces, including the execute_code path the original PR missed: - prompt_dangerous_approval(): callback + stdout fallback - check_all_command_guards(): gateway approval_data + cron/batch pending fallback - check_execute_code_guard(): gateway approval_data + no-notifier pending fallback (script body can embed credentials) Adds TestApprovalPromptRedaction covering callback redaction, no-over-redaction of clean commands, and the execute_code pending fallback. Salvaged from PR #13139 by @sgabel; extended to the execute_code surface.	2026-06-30 17:29:11 -07:00
haileymarshall	9f22f36625	fix(mcp-oauth): anchor 401 handler task to prevent GC mid-flight `handle_401` spawned a dedup'd recovery coroutine via `asyncio.create_task(_do_handle())` and discarded the returned task reference. Python's event loop only keeps weak references to tasks, so the coroutine could be garbage-collected before it called `pending.set_result(...)`. Every concurrent caller awaiting that future then hangs forever, and the `finally: entry.pending_401.pop(...)` cleanup never runs — so subsequent 401s for the same key latch onto the dead future too. Same pattern the adapter-side fixes address (#11997, #11998, #12000, #12001, #12006). Hold the task in a process-wide set on the manager and discard it via `add_done_callback` once it completes. Regression test covers both the structural invariant (task tracked, then removed on completion) and a concurrent dedup path with a forced `gc.collect()` between the handler's await points.	2026-06-30 16:56:15 -07:00
WuKongAI-CMU	0ea3861b33	fix: keep persisted tool results inside their storage directory Tool call ids are used to name persisted large-result files. Treating that id as a raw path segment allowed traversal-like ids to resolve outside hermes-results even though the shell command quoted metacharacters. Convert ids to single filename stems, preserve normal ids, and add a short hash when normalization is needed so unsafe ids do not collide silently. Constraint: Avoid new dependencies and preserve existing tool-result paths for normal tool call ids Rejected: Quote only the path | shell quoting does not prevent ../ path traversal Confidence: high Scope-risk: narrow Reversibility: clean Tested: source /Users/peter/hermes-agent/venv/bin/activate && pytest tests/tools/test_tool_result_storage.py -q Tested: source /Users/peter/hermes-agent/venv/bin/activate && python -m compileall tools/tool_result_storage.py tests/tools/test_tool_result_storage.py Tested: git diff --check	2026-06-30 16:39:41 -07:00
etherman-os	2a3dbcaf46	fix(terminal): prevent corrupted session snapshots during init The init snapshot dumped functions with a line-based filter: declare -f | grep -vE '^_[^_]' That strips a function's header line (e.g. `_foo () `) but leaves the orphaned `{ ... }` body behind, corrupting the snapshot that is sourced before every command. Sourcing the torn snapshot runs leftover body code and breaks subsequent commands (intermittent exit 127). - Filter private (`_`-prefixed) functions by NAME via `declare -F` and dump only the wanted whole definitions, so a body is never torn. Guard against an empty name list (bare `declare -f` dumps everything). - Treat a non-zero bootstrap exit code as snapshot-init failure, so execution safely falls back to login-shell-per-command mode. - Add a regression test asserting snapshot_ready stays false when bootstrap exits non-zero. Preserves the atomic-write ($BASHPID temp + mv -f) machinery from #38249.	2026-06-30 15:51:17 -07:00
kyssta-exe	20871c1d94	fix(skills): require review forks to read before writing skills	2026-06-30 15:49:36 -07:00
Erosika	a6175d1f93	style(profile): trim verbose comments to one or two lines	2026-06-30 15:30:06 -07:00
Erosika	bc396dafda	test(profile): two-profile regression suite + preserve skills_hub monkeypatch seam - tools/skills_hub.py: the per-call resolvers now honor a test-injected real module attribute (patch.object(hub, 'SKILLS_DIR', ...) / monkeypatch.setattr) before falling back to dynamic profile resolution. PEP 562 __getattr__ only fires when no real attribute exists, so an unpatched module resolves the active profile and a patched one respects the test's value — keeping the existing skills_hub test seam intact (5 tests had broken). - tests/test_profile_isolation_runtime.py: real two-profile (no-mock) suite driving each previously-leaking site under override A then B and asserting the active profile's path/identity is used: skills_hub paths + derived constants + default-arg resolution, gateway cache getters (incl. the monkeypatch-still-wins seam), rich_sent_store path, and thread/executor context propagation (raw-thread hazard documented; primitive + _run_async worker proven to preserve the override).	2026-06-30 15:30:06 -07:00
Erosika	09af0a8c1d	fix(profile): propagate profile context across thread/executor boundaries A bare threading.Thread / ThreadPoolExecutor worker starts with an empty contextvars.Context, so the context-local profile override (_HERMES_HOME_OVERRIDE) does not cross the spawn boundary. In single-process multi-profile runtimes (desktop tui_gateway) the worker then resolves get_hermes_home() to the launch/default profile, leaking one profile's reads/writes into another. The fix primitive (tools.thread_context.propagate_context_to_thread, which copies the parent context) already exists; the leaking spawns simply did not use it. - model_tools.py _run_async: wrap the worker-thread loop runner. This is the generic sync->async bridge for every async tool, so wrapping it here fixes the leak for all async tools at once (verified: an async tool reading get_hermes_home() under an override now resolves the active profile). - run_agent.py bg-review thread: wrap so MEMORY.md / skill review writes land in the spawning turn's profile (#54937 path). - tools/async_delegation.py: wrap both single + batch executor.submit calls so detached children resolve the dispatching profile's paths. Scope: the vision CPU executor is intentionally left unwrapped — it runs pure in-memory encode/resize and never resolves profile-scoped paths.	2026-06-30 15:30:06 -07:00
Erosika	10e60060d9	fix(profile): resolve import-time path globals per-call to honor profile override In single-process multi-profile runtimes (desktop tui_gateway), profile scoping is a context-local ContextVar override, not a process env var. Three subsystems froze their HERMES_HOME-derived paths at import time (or read os.environ directly), pinning every later profile to whichever profile first imported the module — a cross-profile data leak. - tools/skills_hub.py: SKILLS_DIR/HUB_DIR/LOCK_FILE/etc. were module constants frozen at import. Replace with per-call resolver functions; add a PEP 562 module __getattr__ so external 'from tools.skills_hub import SKILLS_DIR' callers (all function-local) resolve dynamically with no call-site changes. Convert default-arg bindings (HubLockFile/TapsManager) and the derived HERMES_INDEX_CACHE_FILE constant too. - gateway/platforms/base.py: image/audio/video/document cache-dir getters now re-resolve via get_hermes_dir() per call, falling back to the module constant when a test has monkeypatched it (preserves the existing test seam). Media-delivery safe-roots already enumerate all profiles' cache dirs (#31733), so per-profile resolution does not break delivery. - gateway/rich_sent_store.py: _store_path() read os.environ['HERMES_HOME'] directly, bypassing the override entirely; route through get_hermes_home().	2026-06-30 15:30:06 -07:00
srojk34	795913d3b0	fix(kanban): restrict goal_mode kanban_block to genuine external blockers The judge gate added for kanban_complete (Issue #38367, PR #38388) only covers one of the two exit paths out of run_kanban_goal_loop(). The loop treats status == "blocked" as terminal identically to "done" (and any other status outside running/ready/done/blocked also stops the loop — see goals.py's status dispatch). A goal_mode worker that has learned kanban_complete is gated can simply call kanban_block(reason="anything") to escape the loop with zero judge involvement, fully defeating the intent of #38367's fix. This is Issue #38696, filed as the explicit follow-up by a reviewer on PR #38388: "kanban_complete is one way out; kanban_block is another... A worker that learns the complete path is gated can shift to calling block to escape the loop with the same effect." Implements the issue's "Option B" (deterministic allowlist, no extra judge LLM call) using the kind taxonomy that already exists in kb.VALID_BLOCK_KINDS, rather than inventing a new judge_goal() outcome type (judge_goal only returns done/continue/wait/skipped — there's no "is this block legitimate" verdict to hook the issue's "Option A" pseudocode onto without expanding the judge's contract). goal_mode tasks may only block with kind in {dependency, needs_input} — the two kinds that represent a genuine external blocker the worker cannot resolve itself. `capability`, `transient`, and an unset kind are rejected with a message directing the worker to kanban_complete instead, which the judge now gates. Non-goal_mode tasks are completely unaffected.	2026-06-30 14:29:42 -07:00
kshitijk4poor	a5e8cd4d40	fix(memory): degrade gracefully after repeated at-capacity consolidation failures (#42405) Builds on the zero-match feedback fix (previous commit) to close the silent-hang symptom: when memory is at capacity, a failed `add`/`replace`/`remove` consolidation could loop the whole turn to iteration-budget exhaustion and deliver no user-facing reply. #41755 turned the at-capacity overflow error into a commanded in-turn retry ("...then retry this add — all in this turn"); combined with the fragile substring-only `replace`/`remove` matching (LLMs can't reliably re-quote a long entry verbatim), the model loops add↔replace on inexact guesses until the turn dies. The existing tool_guardrails halt would catch this, but hard_stop_enabled is opt-in (off by default), so a default install still hangs. This fixes it at the memory layer without changing global guardrail behavior: - MemoryStore tracks per-turn consolidation failures; after a cap (3) it drops the "retry in this turn" instruction and returns a terminal "leave memory unchanged, continue your reply" result, so a failed memory side effect can never block the turn's reply. - The counter resets on any successful write (progress) and at each turn boundary (turn_context.reset_consolidation_failures, guarded via getattr so plugin memory stores without the method are a no-op). Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-30 20:01:16 +05:30
kyssta-exe	62a1bf4c55	fix(tools): return previews on zero-match in replace/remove to prevent memory retry loops (#42405) - replace() and remove() now return entry previews and current_entries when no entry matches old_text, matching the multi-match and add-limit error behavior - add() limit error also now returns previews for consistency - Agent can self-correct after a failed replace/remove instead of looping blindly until turn budget is exhausted with no user response	2026-06-30 20:01:16 +05:30
kshitijk4poor	824f2279da	refactor(registry): drop dead toolset-check helpers after per-tool availability Follow-up to the per-tool availability derivation: `_snapshot_toolset_checks` and `_evaluate_toolset_check` had no remaining callers once the four availability surfaces switched to `_toolset_has_exposable_tools`. Remove both, drop the no-op `quiet` param from the new helper, and document why `_toolset_checks` is still written (banner.py reads it via TOOLSET_REQUIREMENTS to classify unavailable toolsets as lazy-init vs disabled).	2026-06-30 17:47:37 +05:30
xxxigm	6e84257717	fix(registry): derive toolset availability from per-tool checks Doctor and banner used the first check_fn registered for a toolset, so desktop-only read_terminal gated the whole terminal toolset even though terminal and process still expose at runtime. Fixes #54820	2026-06-30 17:47:37 +05:30
memosr	12f5624a76	fix(security): bind tool_override authorization to handler's defining plugin module egilewski found the prior sink gate was transient: it only applied while PluginManager executed register(ctx). A plugin could defer a direct registry.register(..., override=True) to a post-load callback/thread, after the scope was cleared, and still replace a built-in. Make authorization durable by binding it to where the handler is DEFINED (handler.__globals__['__name__']) rather than to call timing. At load, each plugin's module namespace is mapped to its allow_tool_override opt-in in a table that is never cleared. The sink resolves the handler's owning plugin module and rejects an override from any plugin namespace without opt-in, regardless of when or on which thread the call happens. Plugin namespaces with no recorded policy are treated as not-opted-in (fail-closed). Built-in and MCP handlers live outside the plugin namespace and are unaffected. Adds a regression test for the delayed/post-load direct-registry override.	2026-06-30 04:00:42 -07:00
memosr	3101222312	fix(security): enforce tool_override opt-in at registry sink to close direct-import bypass The opt-in gate lived only in PluginContext.register_tool, so a plugin could bypass it by importing tools.registry and calling registry.register(..., override=True) directly. Enforce the same gate at the sink: during plugin load, the registry rejects an override from a plugin without operator opt-in regardless of the path taken. Built-in and MCP registrations (no active plugin scope) are unaffected. Adds a regression test covering the direct-registry bypass.	2026-06-30 04:00:42 -07:00
Jeffgithub0029	b7c4369ca0	fix(telegram): chunk formatted messages with UTF-16 length accounting The standalone send path (_send_telegram, used by the send_message tool, cron delivery, and out-of-process callers) chunked the raw message on UTF-16 length, then formatted and sent the result un-rechunked. MarkdownV2 escaping inflates the text (`!`/`.`/`-` -> `\!`/`\.`/`\-`), so a 4096 UTF-16-unit raw message can become ~8192 units once formatted and gets rejected by Telegram as 'Message is too long'. Move all text chunking into _send_telegram, after formatting: split the formatted MarkdownV2/HTML text on UTF-16 length so every send is <=4096, with per-chunk plain-text fallback and thread-not-found retry preserved. Media attaches after all text chunks. (#28557)	2026-06-30 03:51:08 -07:00
nikshepsvn	d82a69b624	fix(tools): prune acp_command from delegate_task schema when no ACP CLI is on PATH Defense-in-depth follow-up to the runtime guard added in the previous commit. Models on headless hosts (Railway / Fly / Docker / fresh VPS) without any ACP CLI installed occasionally hallucinate ``acp_command="copilot"`` from the schema description, despite the explicit "Do NOT set" instruction. The runtime guard prevented the crash but the model still wasted a tool turn and got an opaque silent fallback. This commit removes the temptation at its source: ``_build_dynamic_schema_overrides`` now strips ``acp_command`` and ``acp_args`` from both the top-level and per-task schemas when none of the known ACP CLIs (``copilot``, ``claude``, ``codex``) are detectable on PATH. The model literally never sees the fields, so it cannot pass them. The runtime guard from the previous commit stays in place as defense-in-depth for internal callers, tests, and any future code path that bypasses the schema. ``_acp_binary_available`` is intentionally NOT cached: ``shutil.which`` is cheap, and avoiding the cache means the schema reacts to mid-session installs without requiring a process restart. Tests: - ``test_schema_prunes_acp_command_when_no_acp_binary`` - ``test_schema_keeps_acp_command_when_binary_available`` - ``test_acp_binary_available_checks_known_clis`` Full ``test_delegate.py`` suite: 136/136 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-30 03:41:46 -07:00
nikshepsvn	2e0b591076	fix(tools): validate acp_command binary exists before forcing copilot-acp transport When a model passes `acp_command="copilot"` (or any other binary name) in a `delegate_task` tool call, `_build_child_agent` unconditionally sets `effective_provider = "copilot-acp"`, which routes the subagent through `CopilotACPClient`. That client spawns the named binary via subprocess; if it isn't on PATH, every retry raises RuntimeError and an asyncio cleanup race during error delivery can take the entire gateway down. This is a real failure mode on headless deploys (Railway / Fly / VPS / Docker) where `copilot` / `claude` / etc. aren't installed. The schema does say "Do NOT set unless the user explicitly told you an ACP CLI is installed," but models occasionally pass it anyway — particularly for X (Twitter) search prompts where Grok seems to associate ACP with "search assistance." Reproduction: - Headless install (no `copilot` binary on PATH) - Set provider to xai-oauth + model grok-4.3 - Telegram prompt: "Search X for crypto twitter trends" - Grok decides to delegate and passes `acp_command="copilot"` - Subagent crashes 3x, gateway crashes on the 3rd retry teardown Fix: validate the binary exists on PATH via `shutil.which` before honoring the override. If missing, log a warning and fall through to the parent's default transport. No behavior change when the binary IS present (covered by `test_build_child_agent_honors_acp_command_when_binary_present`). Tests: - `test_build_child_agent_ignores_acp_command_when_binary_missing` - `test_build_child_agent_honors_acp_command_when_binary_present` Verified on Python 3.11 (macOS) and 3.12 (Debian 13 container). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-30 03:41:46 -07:00
georgex8001	62b9fb6623	fix(acp): thread-safe interactive approval via contextvars Concurrent ACP sessions run on a shared ThreadPoolExecutor (max_workers=4). Each _run_agent mutated the process-global os.environ["HERMES_INTERACTIVE"] and restored it in finally, so one session's restore could clobber another's set mid-run — dropping the second session onto the non-interactive auto-approve path, executing a dangerous command without the approval callback firing (GHSA-96vc-wcxf-jjff). Replace the env-var flag with a thread/task-local contextvar in tools.approval. The two HERMES_INTERACTIVE read sites in approval.py now go through _is_interactive_cli() (contextvar-first, env fallback for legacy single-threaded CLI callers). The ACP executor sets the contextvar instead of os.environ; the existing contextvars.copy_context() wrapper isolates each session's write. Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>	2026-06-30 03:24:58 -07:00
Markus Phan	cd9f5cc671	fix(delegate): route subagent progress lines through _safe_print for ACP stdio delegate_task's per-task completion display emitted lines like "✓ [1/3] Research done (17.92s)" via a bare print(). Under ACP (and any headless JSON-RPC stdio host where AIAgent routes human output to stderr via a custom _print_fn), these landed on stdout and corrupted the protocol frame stream, surfacing as "Failed to parse JSON message: ✓ [3/3] …" in the ACP adapter. Add _emit_parent_console() which prefers parent_agent._safe_print (the same hook AIAgent uses for every other user-facing print) and falls back to print() only when no router is wired up or it raises. CLI behavior is unchanged. The PR's other fix (preset toolset expansion) is already covered on main by _expand_parent_toolsets(), so only the stdio-safe printing change is salvaged here.	2026-06-30 03:16:22 -07:00
teknium1	35a0803a3b	fix(delegation): budget subagent summaries against parent context headroom Batch delegation returned each subagent's full final_response verbatim into the parent's context. A fan-out of N children could dump 60k+ tokens at once, blowing the parent's context window and — on rate-limited providers — triggering a compression/429 death spiral (429 misread as context-too-large -> window step-down -> retry loop -> conversation dies). Cap each summary against the parent's remaining context headroom split across the batch (not a magic char count). When trimming, mirror the web_extract convention: spill the full text to cache/delegation (mounted into remote backends via credential_files._CACHE_DIRS) and return a head+tail window (75/25, line-snapped) plus a footer with the exact read_file offset to page the omitted middle. Both the subagent's opening AND its closing (outcomes / files-changed / issues, which live at the end) survive in-context, and nothing is lost — the parent can read_file the full version on any backend. delegation.max_summary_chars (default 24000) is a static ceiling layered on top as belt-and-suspenders for models that ignore 'be concise'; 0 disables it. Child prompt tightened to lead with outcomes / bullets. Co-authored-by: rc-int <rcint@klaith.com>	2026-06-30 03:07:40 -07:00
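
The task-anchoring fix described in 9f22f36625 (hold the task in a set, discard it via `add_done_callback` once it completes) can be sketched as below. This is a minimal illustration under stated assumptions, not the Hermes code: `TaskAnchor`, `spawn`, `_recover`, and `demo` are hypothetical names, and the real `handle_401` dedup logic is not reproduced.

```python
import asyncio


class TaskAnchor:
    """Hypothetical helper: keeps strong references to in-flight tasks."""

    def __init__(self):
        # The event loop only holds weak references to tasks, so an
        # unreferenced task may be garbage-collected before it finishes.
        self._tasks = set()

    def spawn(self, coro):
        task = asyncio.create_task(coro)
        self._tasks.add(task)                        # anchor the task
        task.add_done_callback(self._tasks.discard)  # release when done
        return task

    def __len__(self):
        return len(self._tasks)


async def _recover(pending):
    # Stand-in for the recovery coroutine that resolves the shared future.
    await asyncio.sleep(0)
    pending.set_result("refreshed")


async def demo():
    anchor = TaskAnchor()
    pending = asyncio.get_running_loop().create_future()
    anchor.spawn(_recover(pending))
    tracked_while_running = len(anchor)
    result = await pending
    await asyncio.sleep(0)  # give the done callback a chance to run
    return tracked_while_running, result, len(anchor)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The set is the only thing keeping the task alive between `create_task` and completion; the done callback keeps the set from growing without bound.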

1973 commits