hermes-agent

Author	SHA1	Message	Date
LeonSGP43	55d92516c8	fix(skills): publish fetchable metadata for official skills	2026-07-01 00:40:56 -07:00
teknium1	56d4bfe4ba	fix(approval): honour tirith_fail_open in cron-deny tirith path + tests Follow-up to the salvaged #22070. The cron-deny tirith ImportError branch was unconditionally fail-open; now it honours security.tirith_fail_open: false by blocking (a cron session has no user to approve), mirroring the main flow's fail-closed synthesis (#20733). Adds regression tests: tirith-only content threat blocked in cron-deny, plus fail-closed/fail-open ImportError behavior.	2026-07-01 00:13:36 -07:00
Rodrigo	c50f517bff	fix(approval): run tirith check in cron-deny mode to catch content-level threats In check_all_command_guards, the cron-deny path only ran detect_dangerous_command (regex patterns). The tirith check starts at line 1017, after the early return at line 1002, so content-level threats caught only by tirith (homograph URLs, pipe-to-interpreter, terminal injection) were silently approved in cron sessions even with approvals.cron_mode: deny. Add a tirith call inside the cron-deny block, mirroring the same ImportError guard used in the main flow. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-07-01 00:13:36 -07:00
DanAsBjorn	a537baa81d	fix(matrix): route text-only send_message through adapter for E2EE support Text-only Matrix messages sent via the send_message engine (hermes send, cron deliver: matrix) arrived unencrypted (red padlock) in E2EE rooms. Media sends already routed through the mautrix adapter and encrypted fine, but text-only sends took the raw-HTTP standalone_sender_fn path, which never encrypts. Route ALL Matrix sends through _send_matrix_via_adapter so text is encrypted too. The adapter reuses the live gateway's E2EE session when available (#46310) and falls back to an encryption-aware ephemeral adapter for standalone/cron contexts. The registry standalone_sender_fn stays registered for the contract; it is simply no longer reached for Matrix. Salvaged from PR #20259 onto current main (the original patched the pre-#41112 _send_matrix branch, which had since moved to the plugin's standalone path). Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-07-01 00:12:11 -07:00
teknium1	0f66995e2a	fix(approval): catch GNU long-flag abbreviations for chown --recursive and git push --force GNU tools accept unique long-option prefix abbreviations at runtime, so `chown --recurs root` and `git push --forc` evaded the approval gate's exact-match `--recursive`/`--force` patterns. Switch those two entries to prefix matches (--recur[a-z], --forc[a-z]). The rm/chmod/sed long-flag patterns were left unchanged: every abbreviation of those is already caught by the sibling short-flag and target patterns (rm -[^s]r, base chmod 777, sed -[^s]i), so prefix-matching them is a no-op. Only chown (beyond the coincidental case-insensitive r->R catch) and git push had genuine gaps. Co-authored-by: Subway2023 <subw3@mail2.sysu.edu.cn>	2026-06-30 17:32:28 -07:00
Scott Gabel	4a7a6fd401	fix(approval): redact secrets in user-facing approval prompts The dangerous-command approval prompt renders the flagged command so the user can decide whether to approve. If the agent constructed it with a credential (curl -H 'Authorization: Bearer sk-...', psql postgres://user:pw@host, an execute_code script with api_key = 'sk-...'), that secret hit stdout and, via the gateway notify payload, Discord/Slack messages — which are screenshottable and forwardable. Apply the existing agent.redact.redact_sensitive_text() to every user-facing approval surface. Redaction is display-only: the raw command still executes after approval, and approval persistence keys off pattern_key (not the command text), so the allowlist is unaffected. Decision context (URL, flags, command structure) is preserved; only the secret value masks. Covers all surfaces, including the execute_code path the original PR missed: - prompt_dangerous_approval(): callback + stdout fallback - check_all_command_guards(): gateway approval_data + cron/batch pending fallback - check_execute_code_guard(): gateway approval_data + no-notifier pending fallback (script body can embed credentials) Adds TestApprovalPromptRedaction covering callback redaction, no-over-redaction of clean commands, and the execute_code pending fallback. Salvaged from PR #13139 by @sgabel; extended to the execute_code surface.	2026-06-30 17:29:11 -07:00
haileymarshall	9f22f36625	fix(mcp-oauth): anchor 401 handler task to prevent GC mid-flight `handle_401` spawned a dedup'd recovery coroutine via `asyncio.create_task(_do_handle())` and discarded the returned task reference. Python's event loop only keeps weak references to tasks, so the coroutine could be garbage-collected before it called `pending.set_result(...)`. Every concurrent caller awaiting that future then hangs forever, and the `finally: entry.pending_401.pop(...)` cleanup never runs — so subsequent 401s for the same key latch onto the dead future too. Same pattern the adapter-side fixes address (#11997, #11998, #12000, #12001, #12006). Hold the task in a process-wide set on the manager and discard it via `add_done_callback` once it completes. Regression test covers both the structural invariant (task tracked, then removed on completion) and a concurrent dedup path with a forced `gc.collect()` between the handler's await points.	2026-06-30 16:56:15 -07:00
WuKongAI-CMU	0ea3861b33	fix: keep persisted tool results inside their storage directory Tool call ids are used to name persisted large-result files. Treating that id as a raw path segment allowed traversal-like ids to resolve outside hermes-results even though the shell command quoted metacharacters. Convert ids to single filename stems, preserve normal ids, and add a short hash when normalization is needed so unsafe ids do not collide silently. Constraint: Avoid new dependencies and preserve existing tool-result paths for normal tool call ids Rejected: Quote only the path | shell quoting does not prevent ../ path traversal Confidence: high Scope-risk: narrow Reversibility: clean Tested: source /Users/peter/hermes-agent/venv/bin/activate && pytest tests/tools/test_tool_result_storage.py -q Tested: source /Users/peter/hermes-agent/venv/bin/activate && python -m compileall tools/tool_result_storage.py tests/tools/test_tool_result_storage.py Tested: git diff --check	2026-06-30 16:39:41 -07:00
etherman-os	2a3dbcaf46	fix(terminal): prevent corrupted session snapshots during init The init snapshot dumped functions with a line-based filter: declare -f | grep -vE '^_[^_]' That strips a function's header line (e.g. `_foo () `) but leaves the orphaned `{ ... }` body behind, corrupting the snapshot that is sourced before every command. Sourcing the torn snapshot runs leftover body code and breaks subsequent commands (intermittent exit 127). - Filter private (`_`-prefixed) functions by NAME via `declare -F` and dump only the wanted whole definitions, so a body is never torn. Guard against an empty name list (bare `declare -f` dumps everything). - Treat a non-zero bootstrap exit code as snapshot-init failure, so execution safely falls back to login-shell-per-command mode. - Add a regression test asserting snapshot_ready stays false when bootstrap exits non-zero. Preserves the atomic-write ($BASHPID temp + mv -f) machinery from #38249.	2026-06-30 15:51:17 -07:00
kyssta-exe	20871c1d94	fix(skills): require review forks to read before writing skills	2026-06-30 15:49:36 -07:00
Erosika	a6175d1f93	style(profile): trim verbose comments to one or two lines	2026-06-30 15:30:06 -07:00
Erosika	bc396dafda	test(profile): two-profile regression suite + preserve skills_hub monkeypatch seam - tools/skills_hub.py: the per-call resolvers now honor a test-injected real module attribute (patch.object(hub, 'SKILLS_DIR', ...) / monkeypatch.setattr) before falling back to dynamic profile resolution. PEP 562 __getattr__ only fires when no real attribute exists, so an unpatched module resolves the active profile and a patched one respects the test's value — keeping the existing skills_hub test seam intact (5 tests had broken). - tests/test_profile_isolation_runtime.py: real two-profile (no-mock) suite driving each previously-leaking site under override A then B and asserting the active profile's path/identity is used: skills_hub paths + derived constants + default-arg resolution, gateway cache getters (incl. the monkeypatch-still-wins seam), rich_sent_store path, and thread/executor context propagation (raw-thread hazard documented; primitive + _run_async worker proven to preserve the override).	2026-06-30 15:30:06 -07:00
Erosika	09af0a8c1d	fix(profile): propagate profile context across thread/executor boundaries A bare threading.Thread / ThreadPoolExecutor worker starts with an empty contextvars.Context, so the context-local profile override (_HERMES_HOME_OVERRIDE) does not cross the spawn boundary. In single-process multi-profile runtimes (desktop tui_gateway) the worker then resolves get_hermes_home() to the launch/default profile, leaking one profile's reads/writes into another. The fix primitive (tools.thread_context.propagate_context_to_thread, which copies the parent context) already exists; the leaking spawns simply did not use it. - model_tools.py _run_async: wrap the worker-thread loop runner. This is the generic sync->async bridge for every async tool, so wrapping it here fixes the leak for all async tools at once (verified: an async tool reading get_hermes_home() under an override now resolves the active profile). - run_agent.py bg-review thread: wrap so MEMORY.md / skill review writes land in the spawning turn's profile (#54937 path). - tools/async_delegation.py: wrap both single + batch executor.submit calls so detached children resolve the dispatching profile's paths. Scope: the vision CPU executor is intentionally left unwrapped — it runs pure in-memory encode/resize and never resolves profile-scoped paths.	2026-06-30 15:30:06 -07:00
Erosika	10e60060d9	fix(profile): resolve import-time path globals per-call to honor profile override In single-process multi-profile runtimes (desktop tui_gateway), profile scoping is a context-local ContextVar override, not a process env var. Three subsystems froze their HERMES_HOME-derived paths at import time (or read os.environ directly), pinning every later profile to whichever profile first imported the module — a cross-profile data leak. - tools/skills_hub.py: SKILLS_DIR/HUB_DIR/LOCK_FILE/etc. were module constants frozen at import. Replace with per-call resolver functions; add a PEP 562 module __getattr__ so external 'from tools.skills_hub import SKILLS_DIR' callers (all function-local) resolve dynamically with no call-site changes. Convert default-arg bindings (HubLockFile/TapsManager) and the derived HERMES_INDEX_CACHE_FILE constant too. - gateway/platforms/base.py: image/audio/video/document cache-dir getters now re-resolve via get_hermes_dir() per call, falling back to the module constant when a test has monkeypatched it (preserves the existing test seam). Media-delivery safe-roots already enumerate all profiles' cache dirs (#31733), so per-profile resolution does not break delivery. - gateway/rich_sent_store.py: _store_path() read os.environ['HERMES_HOME'] directly, bypassing the override entirely; route through get_hermes_home().	2026-06-30 15:30:06 -07:00
srojk34	795913d3b0	fix(kanban): restrict goal_mode kanban_block to genuine external blockers The judge gate added for kanban_complete (Issue #38367, PR #38388) only covers one of the two exit paths out of run_kanban_goal_loop(). The loop treats status == "blocked" as terminal identically to "done" (and any other status outside running/ready/done/blocked also stops the loop — see goals.py's status dispatch). A goal_mode worker that has learned kanban_complete is gated can simply call kanban_block(reason="anything") to escape the loop with zero judge involvement, fully defeating the intent of #38367's fix. This is Issue #38696, filed as the explicit follow-up by a reviewer on PR #38388: "kanban_complete is one way out; kanban_block is another... A worker that learns the complete path is gated can shift to calling block to escape the loop with the same effect." Implements the issue's "Option B" (deterministic allowlist, no extra judge LLM call) using the kind taxonomy that already exists in kb.VALID_BLOCK_KINDS, rather than inventing a new judge_goal() outcome type (judge_goal only returns done/continue/wait/skipped — there's no "is this block legitimate" verdict to hook the issue's "Option A" pseudocode onto without expanding the judge's contract). goal_mode tasks may only block with kind in {dependency, needs_input} — the two kinds that represent a genuine external blocker the worker cannot resolve itself. `capability`, `transient`, and an unset kind are rejected with a message directing the worker to kanban_complete instead, which the judge now gates. Non-goal_mode tasks are completely unaffected.	2026-06-30 14:29:42 -07:00
kshitijk4poor	a5e8cd4d40	fix(memory): degrade gracefully after repeated at-capacity consolidation failures (#42405) Builds on the zero-match feedback fix (previous commit) to close the silent-hang symptom: when memory is at capacity, a failed `add`/`replace`/`remove` consolidation could loop the whole turn to iteration-budget exhaustion and deliver no user-facing reply. #41755 turned the at-capacity overflow error into a commanded in-turn retry ("...then retry this add — all in this turn"); combined with the fragile substring-only `replace`/`remove` matching (LLMs can't reliably re-quote a long entry verbatim), the model loops add↔replace on inexact guesses until the turn dies. The existing tool_guardrails halt would catch this, but hard_stop_enabled is opt-in (off by default), so a default install still hangs. This fixes it at the memory layer without changing global guardrail behavior: - MemoryStore tracks per-turn consolidation failures; after a cap (3) it drops the "retry in this turn" instruction and returns a terminal "leave memory unchanged, continue your reply" result, so a failed memory side effect can never block the turn's reply. - The counter resets on any successful write (progress) and at each turn boundary (turn_context.reset_consolidation_failures, guarded via getattr so plugin memory stores without the method are a no-op). Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-30 20:01:16 +05:30
kyssta-exe	62a1bf4c55	fix(tools): return previews on zero-match in replace/remove to prevent memory retry loops (#42405) - replace() and remove() now return entry previews and current_entries when no entry matches old_text, matching the multi-match and add-limit error behavior - add() limit error also now returns previews for consistency - Agent can self-correct after a failed replace/remove instead of looping blindly until turn budget is exhausted with no user response	2026-06-30 20:01:16 +05:30
kshitijk4poor	824f2279da	refactor(registry): drop dead toolset-check helpers after per-tool availability Follow-up to the per-tool availability derivation: `_snapshot_toolset_checks` and `_evaluate_toolset_check` had no remaining callers once the four availability surfaces switched to `_toolset_has_exposable_tools`. Remove both, drop the no-op `quiet` param from the new helper, and document why `_toolset_checks` is still written (banner.py reads it via TOOLSET_REQUIREMENTS to classify unavailable toolsets as lazy-init vs disabled).	2026-06-30 17:47:37 +05:30
xxxigm	6e84257717	fix(registry): derive toolset availability from per-tool checks Doctor and banner used the first check_fn registered for a toolset, so desktop-only read_terminal gated the whole terminal toolset even though terminal and process still expose at runtime. Fixes #54820	2026-06-30 17:47:37 +05:30
memosr	12f5624a76	fix(security): bind tool_override authorization to handler's defining plugin module egilewski found the prior sink gate was transient: it only applied while PluginManager executed register(ctx). A plugin could defer a direct registry.register(..., override=True) to a post-load callback/thread, after the scope was cleared, and still replace a built-in. Make authorization durable by binding it to where the handler is DEFINED (handler.__globals__['__name__']) rather than to call timing. At load, each plugin's module namespace is mapped to its allow_tool_override opt-in in a table that is never cleared. The sink resolves the handler's owning plugin module and rejects an override from any plugin namespace without opt-in, regardless of when or on which thread the call happens. Plugin namespaces with no recorded policy are treated as not-opted-in (fail-closed). Built-in and MCP handlers live outside the plugin namespace and are unaffected. Adds a regression test for the delayed/post-load direct-registry override.	2026-06-30 04:00:42 -07:00
memosr	3101222312	fix(security): enforce tool_override opt-in at registry sink to close direct-import bypass The opt-in gate lived only in PluginContext.register_tool, so a plugin could bypass it by importing tools.registry and calling registry.register(..., override=True) directly. Enforce the same gate at the sink: during plugin load, the registry rejects an override from a plugin without operator opt-in regardless of the path taken. Built-in and MCP registrations (no active plugin scope) are unaffected. Adds a regression test covering the direct-registry bypass.	2026-06-30 04:00:42 -07:00
Jeffgithub0029	b7c4369ca0	fix(telegram): chunk formatted messages with UTF-16 length accounting The standalone send path (_send_telegram, used by the send_message tool, cron delivery, and out-of-process callers) chunked the raw message on UTF-16 length, then formatted and sent the result un-rechunked. MarkdownV2 escaping inflates the text (`!`/`.`/`-` -> `\!`/`\.`/`\-`), so a 4096 UTF-16-unit raw message can become ~8192 units once formatted and gets rejected by Telegram as 'Message is too long'. Move all text chunking into _send_telegram, after formatting: split the formatted MarkdownV2/HTML text on UTF-16 length so every send is <=4096, with per-chunk plain-text fallback and thread-not-found retry preserved. Media attaches after all text chunks. (#28557)	2026-06-30 03:51:08 -07:00
nikshepsvn	d82a69b624	fix(tools): prune acp_command from delegate_task schema when no ACP CLI is on PATH Defense-in-depth follow-up to the runtime guard added in the previous commit. Models on headless hosts (Railway / Fly / Docker / fresh VPS) without any ACP CLI installed occasionally hallucinate ``acp_command="copilot"`` from the schema description, despite the explicit "Do NOT set" instruction. The runtime guard prevented the crash but the model still wasted a tool turn and got an opaque silent fallback. This commit removes the temptation at its source: ``_build_dynamic_schema_overrides`` now strips ``acp_command`` and ``acp_args`` from both the top-level and per-task schemas when none of the known ACP CLIs (``copilot``, ``claude``, ``codex``) are detectable on PATH. The model literally never sees the fields, so it cannot pass them. The runtime guard from the previous commit stays in place as defense-in-depth for internal callers, tests, and any future code path that bypasses the schema. ``_acp_binary_available`` is intentionally NOT cached: ``shutil.which`` is cheap, and avoiding the cache means the schema reacts to mid-session installs without requiring a process restart. Tests: - ``test_schema_prunes_acp_command_when_no_acp_binary`` - ``test_schema_keeps_acp_command_when_binary_available`` - ``test_acp_binary_available_checks_known_clis`` Full ``test_delegate.py`` suite: 136/136 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-30 03:41:46 -07:00
nikshepsvn	2e0b591076	fix(tools): validate acp_command binary exists before forcing copilot-acp transport When a model passes `acp_command="copilot"` (or any other binary name) in a `delegate_task` tool call, `_build_child_agent` unconditionally sets `effective_provider = "copilot-acp"`, which routes the subagent through `CopilotACPClient`. That client spawns the named binary via subprocess; if it isn't on PATH, every retry raises RuntimeError and an asyncio cleanup race during error delivery can take the entire gateway down. This is a real failure mode on headless deploys (Railway / Fly / VPS / Docker) where `copilot` / `claude` / etc. aren't installed. The schema does say "Do NOT set unless the user explicitly told you an ACP CLI is installed," but models occasionally pass it anyway — particularly for X (Twitter) search prompts where Grok seems to associate ACP with "search assistance." Reproduction: - Headless install (no `copilot` binary on PATH) - Set provider to xai-oauth + model grok-4.3 - Telegram prompt: "Search X for crypto twitter trends" - Grok decides to delegate and passes `acp_command="copilot"` - Subagent crashes 3x, gateway crashes on the 3rd retry teardown Fix: validate the binary exists on PATH via `shutil.which` before honoring the override. If missing, log a warning and fall through to the parent's default transport. No behavior change when the binary IS present (covered by `test_build_child_agent_honors_acp_command_when_binary_present`). Tests: - `test_build_child_agent_ignores_acp_command_when_binary_missing` - `test_build_child_agent_honors_acp_command_when_binary_present` Verified on Python 3.11 (macOS) and 3.12 (Debian 13 container). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-30 03:41:46 -07:00
georgex8001	62b9fb6623	fix(acp): thread-safe interactive approval via contextvars Concurrent ACP sessions run on a shared ThreadPoolExecutor (max_workers=4). Each _run_agent mutated the process-global os.environ["HERMES_INTERACTIVE"] and restored it in finally, so one session's restore could clobber another's set mid-run — dropping the second session onto the non-interactive auto-approve path, executing a dangerous command without the approval callback firing (GHSA-96vc-wcxf-jjff). Replace the env-var flag with a thread/task-local contextvar in tools.approval. The two HERMES_INTERACTIVE read sites in approval.py now go through _is_interactive_cli() (contextvar-first, env fallback for legacy single-threaded CLI callers). The ACP executor sets the contextvar instead of os.environ; the existing contextvars.copy_context() wrapper isolates each session's write. Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>	2026-06-30 03:24:58 -07:00
Markus Phan	cd9f5cc671	fix(delegate): route subagent progress lines through _safe_print for ACP stdio delegate_task's per-task completion display emitted lines like "✓ [1/3] Research done (17.92s)" via a bare print(). Under ACP (and any headless JSON-RPC stdio host where AIAgent routes human output to stderr via a custom _print_fn), these landed on stdout and corrupted the protocol frame stream, surfacing as "Failed to parse JSON message: ✓ [3/3] …" in the ACP adapter. Add _emit_parent_console() which prefers parent_agent._safe_print (the same hook AIAgent uses for every other user-facing print) and falls back to print() only when no router is wired up or it raises. CLI behavior is unchanged. The PR's other fix (preset toolset expansion) is already covered on main by _expand_parent_toolsets(), so only the stdio-safe printing change is salvaged here.	2026-06-30 03:16:22 -07:00
teknium1	35a0803a3b	fix(delegation): budget subagent summaries against parent context headroom Batch delegation returned each subagent's full final_response verbatim into the parent's context. A fan-out of N children could dump 60k+ tokens at once, blowing the parent's context window and — on rate-limited providers — triggering a compression/429 death spiral (429 misread as context-too-large -> window step-down -> retry loop -> conversation dies). Cap each summary against the parent's remaining context headroom split across the batch (not a magic char count). When trimming, mirror the web_extract convention: spill the full text to cache/delegation (mounted into remote backends via credential_files._CACHE_DIRS) and return a head+tail window (75/25, line-snapped) plus a footer with the exact read_file offset to page the omitted middle. Both the subagent's opening AND its closing (outcomes / files-changed / issues, which live at the end) survive in-context, and nothing is lost — the parent can read_file the full version on any backend. delegation.max_summary_chars (default 24000) is a static ceiling layered on top as belt-and-suspenders for models that ignore 'be concise'; 0 disables it. Child prompt tightened to lead with outcomes / bullets. Co-authored-by: rc-int <rcint@klaith.com>	2026-06-30 03:07:40 -07:00
MarioYounger	3b2bb30c5d	fix(security): harden heredoc approval, NFKC homograph fold, env-var filter Three independent security-scanner hardenings, re-homed onto the current shared threat-pattern architecture (tools/threat_patterns.py): - approval.py: add bash/sh/zsh/ksh heredoc to DANGEROUS_PATTERNS. The existing heredoc pattern only covered python/perl/ruby/node, so `bash <<'EOF' ... EOF` ran arbitrary shell — including exfil pipelines whose inner commands don't individually match a pattern — with no prompt. - threat_patterns.py: apply unicodedata.normalize("NFKC", ...) before pattern matching so full-width / compatibility homographs (e.g. `ｃａｔ ~/.hermes/.env`) are folded to ASCII and no longer bypass the keyword scanners. Invisible-char detection still runs on the raw content first (NFKC can strip those codepoints). - code_execution_tool.py: add CREDS/BEARER/APIKEY to _SECRET_SUBSTRINGS so vars like HERMES_LLM_CREDS, API_BEARER, MY_APIKEY are scrubbed from the sandbox env. PASS was intentionally dropped from the original proposal — it false-positives on BYPASS_CACHE / COMPASS_DIR / PASSENGER_HOST while PASSWORD/PASSWD already cover the credential cases. The original PR also proposed a 'synonym' injection pattern block (overlook/forget/set aside/bypass/discard + developer-mode); dropped here because it false-positives on ordinary AGENTS.md/SOUL.md prose ("don't forget to follow the rules", "run in developer mode"), exactly the bossy-English class threat_patterns.py is documented to avoid. Salvaged from #9028. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-30 02:59:46 -07:00
0xbyt4	e6f66bc0f0	fix(security): cover Move and no-space headers in patch_tool sensitive path check patch_tool extracts V4A patch paths so _check_sensitive_path can refuse writes to /etc/, /boot/, etc. before they reach the low-level file ops. The extraction regex had two gaps: 1. `* Move File: src -> dst` was never extracted (regex only matched Update/Add/Delete), so a Move targeting /etc/crontab skipped the pre-check and fell back on the narrower file_operations deny list. 2. The regex required `\\s+` after `` but patch_parser uses `\\s`, so `**Update File: /etc/hosts` (no space) parsed + applied while skipping the check. Loosen the leading whitespace to \\s and add a Move regex that checks both endpoints. Move endpoints also run through the same '..' traversal rejection as the other V4A headers (closes the sibling gap on current main, which gained that traversal guard after this PR was opened).	2026-06-30 02:50:24 -07:00
Teknium	b03635daea	fix(approval): catch hermes gateway stop/restart behind a profile flag (#55515) The gateway-lifecycle guard's hermes-CLI pattern required `hermes` and `gateway` to be adjacent, so a profile flag slipped the agent past it: `hermes -p ade gateway restart` was not flagged. That is the exact form from the 2026-04-11 ade-profile self-kill loop. Allow an optional run of global flags (`-p ade`, `--profile ade`, multiple flags) between `hermes` and the gateway subcommand. launchctl self-termination is already covered on main by #33071; this narrows the only remaining real gap.	2026-06-30 02:48:30 -07:00
Brooklyn Nicholson	e5253d852b	fix(desktop): tree-kill Windows terminal descendants Ensure Windows desktop and local terminal teardown kill full process trees so Git Bash descendants cannot survive wrapper exits and accumulate across retries.	2026-06-30 04:23:27 -05:00
kshitijk4poor	c9269fbfb6	fix(web_extract): bound stored full-text size + give concrete read_file offset Two robustness gaps from the #54843 truncate-store path: - _store_full_text wrote the full clean page to cache/web with no upper bound (path.write_text(content)); a multi-MB page → unbounded per-extract disk write. Cap at MAX_STORED_TEXT_CHARS (2MB, the pre-truncate-store refusal ceiling) with a marker when capped. - The truncation footer told the model 'read_file ... offset=<line>' — a literal placeholder it had to guess. Compute the real starting line of the omitted middle (head line count + 1) so the first read_file lands in the gap.	2026-06-30 00:19:49 -07:00
beardthelion	14c4a849b7	fix(kanban): make goal_mode judge gate truly fail-open Follow-up to the judge gate. judge_goal() is fail-open at the source: when no auxiliary model is reachable it returns a "continue" verdict that is indistinguishable from a real "not done yet" judgment. The gate treated any non-"done" verdict as a rejection, so an unconfigured or degraded auxiliary model would wedge every goal_mode worker — it could never close its own task. That contradicted the gate's own "fail-open" comment. Probe judge availability before enforcing (the same auxiliary client lookup judge_goal performs) and only gate when a judge is actually reachable. When none is, completion proceeds. Also fix the rejection guidance: kanban_create takes parents=[...], not parent=. Add test_complete_goal_mode_allows_when_judge_unavailable covering the fail-open path; update the rejection test to force the availability probe.	2026-06-29 22:20:19 -07:00
beardthelion	b3c1b3b3f3	fix(kanban): address review feedback on goal_mode judge gate Apply naqerl's review comments on PR #38388: - Hoist `from hermes_cli.goals import judge_goal` to module-level imports so an import failure surfaces at module init, not lazily on the first goal-mode completion (no circular import: hermes_cli package init is trivial and does not load tools.kanban_tools). - Narrow the fail-open `try` to wrap only the judge_goal() call. The verdict check and its rejection `return tool_error(...)` now live outside the handler, so a failure there can no longer be swallowed by the broad except. - Pass `exc_info=True` to the logger.warning call per CONTRIBUTING.md. Update the test mock target to tools.kanban_tools.judge_goal, since the hoisted import rebinds the name into this module's namespace.	2026-06-29 22:20:19 -07:00
beardthelion	0b33bc5396	fix(kanban): gate goal_mode task completion with auxiliary judge Prevents workers in goal_mode from bypassing the auxiliary judge by calling kanban_complete before acceptance criteria are met. The tool handler now synchronously invokes the goal judge against the task's title/body and the completion summary. If the verdict is not "done", the completion is rejected with actionable guidance for the agent. This keeps kanban_db.py as a pure SQLite wrapper while intercepting the bypass exactly at the agent tool-call boundary, aligning with Hermes separation of concerns. Fixes #38367 Co-authored-by: CommandCodeBot <noreply@commandcode.ai>	2026-06-29 22:20:19 -07:00
Jaaneek	9ce79cd642	feat(xai): Imagine public-URL storage, chaining & video edit/extend Add durable public-URL output and URL-based chaining to xAI Grok Imagine: - Store generated media on files-cdn with permanent public HTTPS URLs (public_url: true, no expiry by default). - Chain by URL: generate -> edit -> extend each take a prior result's public HTTPS URL (or a data URI / local file for inputs). - Add provider-specific xai_video_edit and xai_video_extend tools. - Image generation: public-URL/storage output, multi-reference edits, and ~/ local-path support for image edits. Credentials use xAI Grok device-code OAuth (separate PR).	2026-06-29 21:11:58 -07:00
Teknium	ee8cbfdc03	feat(web_extract): truncate-and-store instead of LLM summarization (#54843) * feat(web_extract): truncate-and-store instead of LLM summarization web_extract no longer runs an auxiliary LLM over scraped pages. The extract backends (Firecrawl/Tavily/Exa/Parallel) already return clean, boilerplate-stripped markdown, so we return it directly: pages within a char budget (default 15000, web.extract_char_limit) come back whole; larger pages get a head+tail window plus an explicit footer giving the stored full-text path and the read_file call to page through the omitted middle. The full clean text is written to cache/web (mounted read-only into remote backends like the other cache dirs), so nothing is lost. Inline base64 images are converted to [IMAGE: alt] placeholders (token bombs dropped) while real http(s) image URLs are preserved as links so the agent can still web_extract/vision_analyze them. Removes process_content_with_llm + the chunked summarizer + check_auxiliary_model + _resolve_web_extract_auxiliary. context_references._default_url_fetcher is updated to the truncate path and its stale data.documents shape read is fixed to results (it was silently returning empty). Live before/after eval (firecrawl, 4 URLs): 11.7x faster overall (176.6s -> 15.1s); 10-60x on large pages. Quality identical; findability 4/4 (answer recoverable from stored full text on every truncated page). web_search is unchanged. No own scraper added; no changes to web_search. * fix(web_extract): add char_limit to execute_code web_extract stub The new web_extract char_limit param must appear in the code_execution_tool _TOOL_STUBS signature (and doc line) or test_stubs_cover_all_schema_params fails: the stub schema must cover every real schema param.	2026-06-29 10:00:49 -07:00
Ruzzgar	576424cc1c	fix(security): redact browser CDP endpoint logs	2026-06-29 04:25:26 -07:00
teknium1	9f97915163	fix(browser): route open-timeout base through _safe_command_timeout Wire the salvaged _safe_command_timeout() guard into the surviving open-timeout call site. _get_open_command_timeout() feeds the browser_navigate 'open' path; this closes the last call site that could observe a None timeout from a torn cache (#14331), since the original PR's max(_get_command_timeout(), 60) site no longer exists on main (now routed through _get_open_command_timeout).	2026-06-29 02:24:57 -07:00
Sanjay Santhanam	c79e6bceae	fix(browser_tool): resolve race in _get_command_timeout cache returning None (#14331) # Conflicts: # tools/browser_tool.py	2026-06-29 02:24:57 -07:00
teknium1	75317d82d0	fix(vision): narrow the fan-out cap to the CPU encode burst only The original cap held a process-global slot across the WHOLE vision analysis (image load + encode + LLM call) with a default of min(CPUs, 4). That serialized legitimate multi-image workflows — "compare these 6 screenshots", "read this 10-page scan", "analyze every frame" — behind a 4-wide gate, and on the native fast path it even throttled calls that make no LLM request at all. Excess calls queued (blocking acquire, nothing dropped), but the latency hit on real fan-out was the wrong tradeoff. The incident was CPU exhaustion, not call count: concurrent base64/resize bursts saturated every core and left none to service the shared event loop serving /api/status. So cap ONLY that: - A dedicated, bounded ThreadPoolExecutor (_vision_cpu_executor) runs the encode/resize/dimension-check off the caller's loop, sized to the host's usable core count with NO fixed ceiling — the cap tracks the actual exhausted resource (cores), not a magic number. Excess encodes queue on the executor; cores stay free for the loop. - The LLM call is deliberately OUTSIDE the executor, so multi-image workflows keep full request concurrency. - Override via auxiliary.vision.max_concurrency / HERMES_VISION_MAX_CONCURRENCY (honored verbatim, including above core count); sub-1 ignored. - _vision_concurrency_slot() is now a no-op shim for back-compat. Tests assert: resolver defaults to host cores with no ceiling; env/config override (incl. above cores); sub-1 rejection; the executor is dedicated and core-sized; encode runs on a vision-encode thread; and crucially that encode bursts are bounded to the cap while the analyses themselves stay fully concurrent (calls_peak > cap).	2026-06-29 01:27:10 -07:00
Ben Barclay	eddfecd2ce	fix(vision): cap vision_analyze fan-out concurrency process-wide A single agent turn can fan out N vision_analyze calls at once; the classic trigger is "analyze every frame of this video", where ffmpeg explodes a clip into dozens of frames and the model calls vision_analyze on each. Every call does a CPU-heavy base64-encode/resize burst AND holds a long-lived LLM stream open. The tool executor runs concurrent tool calls on a per-session ThreadPoolExecutor (_MAX_TOOL_WORKERS=8), and multiple agent sessions share one process (the dashboard runs the agent in-process), so there was no global ceiling. In prod (June 2026) a video-frame fan-out pinned a worker thread at ~100% CPU and starved the shared asyncio event loop that also serves the dashboard's /api/status liveness probe, flapping the instance to UNHEALTHY even though nothing had crashed. Add a process-global threading.BoundedSemaphore that bounds how many vision analyses run concurrently across the whole process, held across the entire analysis (image load + encode + LLM call) in the single _handle_vision_analyze chokepoint (covers both the native fast path and the legacy aux-LLM path). It is a threading semaphore, NOT asyncio: each vision call is dispatched through model_tools._run_async on a per-thread event loop, so an asyncio primitive bound to one loop cannot coordinate across them. The acquire is offloaded via run_in_executor so waiting for a slot never blocks the calling loop. Default: min(host CPUs, 4), floored at 1, i.e. respect the host's concurrency, or lower. Override via auxiliary.vision.max_concurrency (config.yaml) or HERMES_VISION_MAX_CONCURRENCY (env). Values < 1 are ignored so the cap can never be disabled into an unbounded fan-out. Tests: bounded-fan-out regression guard + a control proving it would fail without the cap; resolver tests for host-cpu default, ceiling clamp, low-cpu host, env override, and sub-1 rejection. Pre-existing handler tests updated for the now-async _handle_vision_analyze. Verified via the real registry.dispatch -> _run_async per-thread-loop path (16 concurrent calls, peak bounded to cap).	2026-06-29 01:27:10 -07:00
kaishi00	08d6195bc4	fix(camofox): auto-recover from stale tab 404 on navigate When a Camofox browser tab is garbage collected (idle timeout, browser recycle), the held tab_id becomes stale. The next browser_navigate call hits /tabs/{stale_id}/navigate -> HTTP 404 -> unhandled HTTPError. Catch the 404 in camofox_navigate, clear the stale tab_id, and create a fresh tab via _ensure_tab. The agent recovers transparently without requiring a session restart. Other tab operations (snapshot, click, type, etc.) use the same pattern but only fail if the tab dies between successful calls — much rarer. The navigate fix covers 95%+ of cases since navigate is always the entry point.	2026-06-29 01:26:24 -07:00
liuhao1024	fe38d50833	fix(tools): read browser.command_timeout in Camofox HTTP client The Camofox browser backend hardcoded a 30s HTTP timeout via _DEFAULT_TIMEOUT, ignoring the user's browser.command_timeout config. The main browser_tool path already reads this config via _get_command_timeout(). This commit adds an equivalent _get_command_timeout() to browser_camofox.py that reads browser.command_timeout from config with caching, and switches all HTTP helper methods (_post, _get, _get_raw, _delete) to use it as the default timeout. Fixes #40843	2026-06-29 01:26:24 -07:00
刘昊	babd9168ba	fix(browser): send Authorization header in Camofox HTTP calls when CAMOFOX_API_KEY is set The five HTTP call sites in browser_camofox.py (_ensure_tab, _post, _get, _get_raw, _delete) did not include Authorization headers, causing 403 Forbidden when the Camofox server has API key auth enabled. Added _auth_headers() helper and wired it into all five call sites. The health check endpoint (/health) is left without auth since it is a connectivity probe, not a browser operation. Regression test covers: header present when key set, absent when unset, blank key produces empty headers. Fixes #20476	2026-06-29 01:26:24 -07:00
liuhao1024	270456308c	fix(tools): send listItemId instead of sessionKey in Camofox tab creation The Camoufox REST API server expects `listItemId` in the `POST /tabs` body, but `_ensure_tab` was sending `sessionKey`. This caused a 400 Bad Request on every `browser_navigate` call. The parameter name mismatch is visible in the same file: line 283 already reads `tab.get("listItemId")` when adopting existing tabs, confirming the server-side field name. Fixes #37960	2026-06-29 01:26:24 -07:00
Ben Barclay	1289f12812	fix(memory): lazy-install supermemory + mem0 SDKs like honcho/hindsight The supermemory and mem0 memory providers shipped third-party SDKs (supermemory / mem0ai) that are not core dependencies, but — unlike the honcho and hindsight providers — they imported those SDKs directly with no tools.lazy_deps.ensure() preflight and had no LAZY_DEPS allowlist entry. On the published Docker image the agent venv is sealed (HERMES_DISABLE_LAZY_INSTALLS=1) and lazy installs are redirected to a writable durable target (HERMES_LAZY_INSTALL_TARGET). honcho/hindsight route through ensure() and install fine there; supermemory/mem0 never called it, so their SDK was never installed on a hosted instance and the provider silently reported itself unavailable even with the API key set. Fixes: - Add memory.supermemory + memory.mem0 to the LAZY_DEPS allowlist (tools/lazy_deps.py), pinned to current PyPI releases. - Call ensure('memory.<x>', prompt=False) at each SDK-import chokepoint (_SupermemoryClient.__init__; Mem0MemoryProvider._create_backend), mirroring honcho's wrapped try/except shape. - Drop the SDK-import gate from supermemory's is_available() — it was a chicken-and-egg trap (provider never loaded on a sealed venv, so ensure() never ran). Now key-presence only, like honcho/mem0. - Add matching pyproject extras [supermemory]/[mem0]; update the lazy-covered-extras contract test (excluded from [all] by policy). Tests prove each path fails without the fix and the real sealed-venv durable-target gate accepts both features.	2026-06-29 00:25:36 -07:00
Ben Barclay	8fe800ee1a	fix(file-tools): sanitize host/relative cwd override before it reaches container sandbox (#54447) (#54616) (cherry picked from commit 82132f7911ecf71f27ee5657870bf4105cecf8e2) Co-authored-by: Tranquil-Flow <66773372+Tranquil-Flow@users.noreply.github.com>	2026-06-29 15:32:20 +10:00
Ruzzgar	313a8c6833	fix(skills): replace string prefix check with strict path containment	2026-06-28 21:14:01 -07:00
Brooklyn Nicholson	ae465e9fb8	Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/desktop-multiterminal	2026-06-28 21:37:52 -05:00

1950 commits