hermes-agent

Author	SHA1	Message	Date
etherman-os	2a3dbcaf46	fix(terminal): prevent corrupted session snapshots during init The init snapshot dumped functions with a line-based filter: declare -f | grep -vE '^_[^_]' That strips a function's header line (e.g. `_foo () `) but leaves the orphaned `{ ... }` body behind, corrupting the snapshot that is sourced before every command. Sourcing the torn snapshot runs leftover body code and breaks subsequent commands (intermittent exit 127). - Filter private (`_`-prefixed) functions by NAME via `declare -F` and dump only the wanted whole definitions, so a body is never torn. Guard against an empty name list (bare `declare -f` dumps everything). - Treat a non-zero bootstrap exit code as snapshot-init failure, so execution safely falls back to login-shell-per-command mode. - Add a regression test asserting snapshot_ready stays false when bootstrap exits non-zero. Preserves the atomic-write ($BASHPID temp + mv -f) machinery from #38249.	2026-06-30 15:51:17 -07:00
kyssta-exe	20871c1d94	fix(skills): require review forks to read before writing skills	2026-06-30 15:49:36 -07:00
Erosika	a6175d1f93	style(profile): trim verbose comments to one or two lines	2026-06-30 15:30:06 -07:00
Erosika	bc396dafda	test(profile): two-profile regression suite + preserve skills_hub monkeypatch seam - tools/skills_hub.py: the per-call resolvers now honor a test-injected real module attribute (patch.object(hub, 'SKILLS_DIR', ...) / monkeypatch.setattr) before falling back to dynamic profile resolution. PEP 562 __getattr__ only fires when no real attribute exists, so an unpatched module resolves the active profile and a patched one respects the test's value — keeping the existing skills_hub test seam intact (5 tests had broken). - tests/test_profile_isolation_runtime.py: real two-profile (no-mock) suite driving each previously-leaking site under override A then B and asserting the active profile's path/identity is used: skills_hub paths + derived constants + default-arg resolution, gateway cache getters (incl. the monkeypatch-still-wins seam), rich_sent_store path, and thread/executor context propagation (raw-thread hazard documented; primitive + _run_async worker proven to preserve the override).	2026-06-30 15:30:06 -07:00
Erosika	09af0a8c1d	fix(profile): propagate profile context across thread/executor boundaries A bare threading.Thread / ThreadPoolExecutor worker starts with an empty contextvars.Context, so the context-local profile override (_HERMES_HOME_OVERRIDE) does not cross the spawn boundary. In single-process multi-profile runtimes (desktop tui_gateway) the worker then resolves get_hermes_home() to the launch/default profile, leaking one profile's reads/writes into another. The fix primitive (tools.thread_context.propagate_context_to_thread, which copies the parent context) already exists; the leaking spawns simply did not use it. - model_tools.py _run_async: wrap the worker-thread loop runner. This is the generic sync->async bridge for every async tool, so wrapping it here fixes the leak for all async tools at once (verified: an async tool reading get_hermes_home() under an override now resolves the active profile). - run_agent.py bg-review thread: wrap so MEMORY.md / skill review writes land in the spawning turn's profile (#54937 path). - tools/async_delegation.py: wrap both single + batch executor.submit calls so detached children resolve the dispatching profile's paths. Scope: the vision CPU executor is intentionally left unwrapped — it runs pure in-memory encode/resize and never resolves profile-scoped paths.	2026-06-30 15:30:06 -07:00
Erosika	10e60060d9	fix(profile): resolve import-time path globals per-call to honor profile override In single-process multi-profile runtimes (desktop tui_gateway), profile scoping is a context-local ContextVar override, not a process env var. Three subsystems froze their HERMES_HOME-derived paths at import time (or read os.environ directly), pinning every later profile to whichever profile first imported the module — a cross-profile data leak. - tools/skills_hub.py: SKILLS_DIR/HUB_DIR/LOCK_FILE/etc. were module constants frozen at import. Replace with per-call resolver functions; add a PEP 562 module __getattr__ so external 'from tools.skills_hub import SKILLS_DIR' callers (all function-local) resolve dynamically with no call-site changes. Convert default-arg bindings (HubLockFile/TapsManager) and the derived HERMES_INDEX_CACHE_FILE constant too. - gateway/platforms/base.py: image/audio/video/document cache-dir getters now re-resolve via get_hermes_dir() per call, falling back to the module constant when a test has monkeypatched it (preserves the existing test seam). Media-delivery safe-roots already enumerate all profiles' cache dirs (#31733), so per-profile resolution does not break delivery. - gateway/rich_sent_store.py: _store_path() read os.environ['HERMES_HOME'] directly, bypassing the override entirely; route through get_hermes_home().	2026-06-30 15:30:06 -07:00
srojk34	795913d3b0	fix(kanban): restrict goal_mode kanban_block to genuine external blockers The judge gate added for kanban_complete (Issue #38367, PR #38388) only covers one of the two exit paths out of run_kanban_goal_loop(). The loop treats status == "blocked" as terminal identically to "done" (and any other status outside running/ready/done/blocked also stops the loop — see goals.py's status dispatch). A goal_mode worker that has learned kanban_complete is gated can simply call kanban_block(reason="anything") to escape the loop with zero judge involvement, fully defeating the intent of #38367's fix. This is Issue #38696, filed as the explicit follow-up by a reviewer on PR #38388: "kanban_complete is one way out; kanban_block is another... A worker that learns the complete path is gated can shift to calling block to escape the loop with the same effect." Implements the issue's "Option B" (deterministic allowlist, no extra judge LLM call) using the kind taxonomy that already exists in kb.VALID_BLOCK_KINDS, rather than inventing a new judge_goal() outcome type (judge_goal only returns done/continue/wait/skipped — there's no "is this block legitimate" verdict to hook the issue's "Option A" pseudocode onto without expanding the judge's contract). goal_mode tasks may only block with kind in {dependency, needs_input} — the two kinds that represent a genuine external blocker the worker cannot resolve itself. `capability`, `transient`, and an unset kind are rejected with a message directing the worker to kanban_complete instead, which the judge now gates. Non-goal_mode tasks are completely unaffected.	2026-06-30 14:29:42 -07:00
kshitijk4poor	a5e8cd4d40	fix(memory): degrade gracefully after repeated at-capacity consolidation failures (#42405) Builds on the zero-match feedback fix (previous commit) to close the silent-hang symptom: when memory is at capacity, a failed `add`/`replace`/`remove` consolidation could loop the whole turn to iteration-budget exhaustion and deliver no user-facing reply. #41755 turned the at-capacity overflow error into a commanded in-turn retry ("...then retry this add — all in this turn"); combined with the fragile substring-only `replace`/`remove` matching (LLMs can't reliably re-quote a long entry verbatim), the model loops add↔replace on inexact guesses until the turn dies. The existing tool_guardrails halt would catch this, but hard_stop_enabled is opt-in (off by default), so a default install still hangs. This fixes it at the memory layer without changing global guardrail behavior: - MemoryStore tracks per-turn consolidation failures; after a cap (3) it drops the "retry in this turn" instruction and returns a terminal "leave memory unchanged, continue your reply" result, so a failed memory side effect can never block the turn's reply. - The counter resets on any successful write (progress) and at each turn boundary (turn_context.reset_consolidation_failures, guarded via getattr so plugin memory stores without the method are a no-op). Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-30 20:01:16 +05:30
kyssta-exe	62a1bf4c55	fix(tools): return previews on zero-match in replace/remove to prevent memory retry loops (#42405) - replace() and remove() now return entry previews and current_entries when no entry matches old_text, matching the multi-match and add-limit error behavior - add() limit error also now returns previews for consistency - Agent can self-correct after a failed replace/remove instead of looping blindly until turn budget is exhausted with no user response	2026-06-30 20:01:16 +05:30
kshitijk4poor	824f2279da	refactor(registry): drop dead toolset-check helpers after per-tool availability Follow-up to the per-tool availability derivation: `_snapshot_toolset_checks` and `_evaluate_toolset_check` had no remaining callers once the four availability surfaces switched to `_toolset_has_exposable_tools`. Remove both, drop the no-op `quiet` param from the new helper, and document why `_toolset_checks` is still written (banner.py reads it via TOOLSET_REQUIREMENTS to classify unavailable toolsets as lazy-init vs disabled).	2026-06-30 17:47:37 +05:30
xxxigm	6e84257717	fix(registry): derive toolset availability from per-tool checks Doctor and banner used the first check_fn registered for a toolset, so desktop-only read_terminal gated the whole terminal toolset even though terminal and process still expose at runtime. Fixes #54820	2026-06-30 17:47:37 +05:30
memosr	12f5624a76	fix(security): bind tool_override authorization to handler's defining plugin module egilewski found the prior sink gate was transient: it only applied while PluginManager executed register(ctx). A plugin could defer a direct registry.register(..., override=True) to a post-load callback/thread, after the scope was cleared, and still replace a built-in. Make authorization durable by binding it to where the handler is DEFINED (handler.__globals__['__name__']) rather than to call timing. At load, each plugin's module namespace is mapped to its allow_tool_override opt-in in a table that is never cleared. The sink resolves the handler's owning plugin module and rejects an override from any plugin namespace without opt-in, regardless of when or on which thread the call happens. Plugin namespaces with no recorded policy are treated as not-opted-in (fail-closed). Built-in and MCP handlers live outside the plugin namespace and are unaffected. Adds a regression test for the delayed/post-load direct-registry override.	2026-06-30 04:00:42 -07:00
memosr	3101222312	fix(security): enforce tool_override opt-in at registry sink to close direct-import bypass The opt-in gate lived only in PluginContext.register_tool, so a plugin could bypass it by importing tools.registry and calling registry.register(..., override=True) directly. Enforce the same gate at the sink: during plugin load, the registry rejects an override from a plugin without operator opt-in regardless of the path taken. Built-in and MCP registrations (no active plugin scope) are unaffected. Adds a regression test covering the direct-registry bypass.	2026-06-30 04:00:42 -07:00
Jeffgithub0029	b7c4369ca0	fix(telegram): chunk formatted messages with UTF-16 length accounting The standalone send path (_send_telegram, used by the send_message tool, cron delivery, and out-of-process callers) chunked the raw message on UTF-16 length, then formatted and sent the result un-rechunked. MarkdownV2 escaping inflates the text (`!`/`.`/`-` -> `\!`/`\.`/`\-`), so a 4096 UTF-16-unit raw message can become ~8192 units once formatted and gets rejected by Telegram as 'Message is too long'. Move all text chunking into _send_telegram, after formatting: split the formatted MarkdownV2/HTML text on UTF-16 length so every send is <=4096, with per-chunk plain-text fallback and thread-not-found retry preserved. Media attaches after all text chunks. (#28557)	2026-06-30 03:51:08 -07:00
nikshepsvn	d82a69b624	fix(tools): prune acp_command from delegate_task schema when no ACP CLI is on PATH Defense-in-depth follow-up to the runtime guard added in the previous commit. Models on headless hosts (Railway / Fly / Docker / fresh VPS) without any ACP CLI installed occasionally hallucinate ``acp_command="copilot"`` from the schema description, despite the explicit "Do NOT set" instruction. The runtime guard prevented the crash but the model still wasted a tool turn and got an opaque silent fallback. This commit removes the temptation at its source: ``_build_dynamic_schema_overrides`` now strips ``acp_command`` and ``acp_args`` from both the top-level and per-task schemas when none of the known ACP CLIs (``copilot``, ``claude``, ``codex``) are detectable on PATH. The model literally never sees the fields, so it cannot pass them. The runtime guard from the previous commit stays in place as defense-in-depth for internal callers, tests, and any future code path that bypasses the schema. ``_acp_binary_available`` is intentionally NOT cached: ``shutil.which`` is cheap, and avoiding the cache means the schema reacts to mid-session installs without requiring a process restart. Tests: - ``test_schema_prunes_acp_command_when_no_acp_binary`` - ``test_schema_keeps_acp_command_when_binary_available`` - ``test_acp_binary_available_checks_known_clis`` Full ``test_delegate.py`` suite: 136/136 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-30 03:41:46 -07:00
nikshepsvn	2e0b591076	fix(tools): validate acp_command binary exists before forcing copilot-acp transport When a model passes `acp_command="copilot"` (or any other binary name) in a `delegate_task` tool call, `_build_child_agent` unconditionally sets `effective_provider = "copilot-acp"`, which routes the subagent through `CopilotACPClient`. That client spawns the named binary via subprocess; if it isn't on PATH, every retry raises RuntimeError and an asyncio cleanup race during error delivery can take the entire gateway down. This is a real failure mode on headless deploys (Railway / Fly / VPS / Docker) where `copilot` / `claude` / etc. aren't installed. The schema does say "Do NOT set unless the user explicitly told you an ACP CLI is installed," but models occasionally pass it anyway — particularly for X (Twitter) search prompts where Grok seems to associate ACP with "search assistance." Reproduction: - Headless install (no `copilot` binary on PATH) - Set provider to xai-oauth + model grok-4.3 - Telegram prompt: "Search X for crypto twitter trends" - Grok decides to delegate and passes `acp_command="copilot"` - Subagent crashes 3x, gateway crashes on the 3rd retry teardown Fix: validate the binary exists on PATH via `shutil.which` before honoring the override. If missing, log a warning and fall through to the parent's default transport. No behavior change when the binary IS present (covered by `test_build_child_agent_honors_acp_command_when_binary_present`). Tests: - `test_build_child_agent_ignores_acp_command_when_binary_missing` - `test_build_child_agent_honors_acp_command_when_binary_present` Verified on Python 3.11 (macOS) and 3.12 (Debian 13 container). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-30 03:41:46 -07:00
georgex8001	62b9fb6623	fix(acp): thread-safe interactive approval via contextvars Concurrent ACP sessions run on a shared ThreadPoolExecutor (max_workers=4). Each _run_agent mutated the process-global os.environ["HERMES_INTERACTIVE"] and restored it in finally, so one session's restore could clobber another's set mid-run — dropping the second session onto the non-interactive auto-approve path, executing a dangerous command without the approval callback firing (GHSA-96vc-wcxf-jjff). Replace the env-var flag with a thread/task-local contextvar in tools.approval. The two HERMES_INTERACTIVE read sites in approval.py now go through _is_interactive_cli() (contextvar-first, env fallback for legacy single-threaded CLI callers). The ACP executor sets the contextvar instead of os.environ; the existing contextvars.copy_context() wrapper isolates each session's write. Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>	2026-06-30 03:24:58 -07:00
Markus Phan	cd9f5cc671	fix(delegate): route subagent progress lines through _safe_print for ACP stdio delegate_task's per-task completion display emitted lines like "✓ [1/3] Research done (17.92s)" via a bare print(). Under ACP (and any headless JSON-RPC stdio host where AIAgent routes human output to stderr via a custom _print_fn), these landed on stdout and corrupted the protocol frame stream, surfacing as "Failed to parse JSON message: ✓ [3/3] …" in the ACP adapter. Add _emit_parent_console() which prefers parent_agent._safe_print (the same hook AIAgent uses for every other user-facing print) and falls back to print() only when no router is wired up or it raises. CLI behavior is unchanged. The PR's other fix (preset toolset expansion) is already covered on main by _expand_parent_toolsets(), so only the stdio-safe printing change is salvaged here.	2026-06-30 03:16:22 -07:00
teknium1	35a0803a3b	fix(delegation): budget subagent summaries against parent context headroom Batch delegation returned each subagent's full final_response verbatim into the parent's context. A fan-out of N children could dump 60k+ tokens at once, blowing the parent's context window and — on rate-limited providers — triggering a compression/429 death spiral (429 misread as context-too-large -> window step-down -> retry loop -> conversation dies). Cap each summary against the parent's remaining context headroom split across the batch (not a magic char count). When trimming, mirror the web_extract convention: spill the full text to cache/delegation (mounted into remote backends via credential_files._CACHE_DIRS) and return a head+tail window (75/25, line-snapped) plus a footer with the exact read_file offset to page the omitted middle. Both the subagent's opening AND its closing (outcomes / files-changed / issues, which live at the end) survive in-context, and nothing is lost — the parent can read_file the full version on any backend. delegation.max_summary_chars (default 24000) is a static ceiling layered on top as belt-and-suspenders for models that ignore 'be concise'; 0 disables it. Child prompt tightened to lead with outcomes / bullets. Co-authored-by: rc-int <rcint@klaith.com>	2026-06-30 03:07:40 -07:00
MarioYounger	3b2bb30c5d	fix(security): harden heredoc approval, NFKC homograph fold, env-var filter Three independent security-scanner hardenings, re-homed onto the current shared threat-pattern architecture (tools/threat_patterns.py): - approval.py: add bash/sh/zsh/ksh heredoc to DANGEROUS_PATTERNS. The existing heredoc pattern only covered python/perl/ruby/node, so `bash <<'EOF' ... EOF` ran arbitrary shell — including exfil pipelines whose inner commands don't individually match a pattern — with no prompt. - threat_patterns.py: apply unicodedata.normalize("NFKC", ...) before pattern matching so full-width / compatibility homographs (e.g. `ｃａｔ ~/.hermes/.env`) are folded to ASCII and no longer bypass the keyword scanners. Invisible-char detection still runs on the raw content first (NFKC can strip those codepoints). - code_execution_tool.py: add CREDS/BEARER/APIKEY to _SECRET_SUBSTRINGS so vars like HERMES_LLM_CREDS, API_BEARER, MY_APIKEY are scrubbed from the sandbox env. PASS was intentionally dropped from the original proposal — it false-positives on BYPASS_CACHE / COMPASS_DIR / PASSENGER_HOST while PASSWORD/PASSWD already cover the credential cases. The original PR also proposed a 'synonym' injection pattern block (overlook/forget/set aside/bypass/discard + developer-mode); dropped here because it false-positives on ordinary AGENTS.md/SOUL.md prose ("don't forget to follow the rules", "run in developer mode"), exactly the bossy-English class threat_patterns.py is documented to avoid. Salvaged from #9028. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-30 02:59:46 -07:00
0xbyt4	e6f66bc0f0	fix(security): cover Move and no-space headers in patch_tool sensitive path check patch_tool extracts V4A patch paths so _check_sensitive_path can refuse writes to /etc/, /boot/, etc. before they reach the low-level file ops. The extraction regex had two gaps: 1. `*** Move File: src -> dst` was never extracted (regex only matched Update/Add/Delete), so a Move targeting /etc/crontab skipped the pre-check and fell back on the narrower file_operations deny list. 2. The regex required `\\s+` after `***` but patch_parser uses `\\s*`, so `***Update File: /etc/hosts` (no space) parsed + applied while skipping the check. Loosen the leading whitespace to \\s* and add a Move regex that checks both endpoints. Move endpoints also run through the same '..' traversal rejection as the other V4A headers (closes the sibling gap on current main, which gained that traversal guard after this PR was opened).	2026-06-30 02:50:24 -07:00
Teknium	b03635daea	fix(approval): catch hermes gateway stop/restart behind a profile flag (#55515) The gateway-lifecycle guard's hermes-CLI pattern required `hermes` and `gateway` to be adjacent, so a profile flag slipped the agent past it: `hermes -p ade gateway restart` was not flagged. That is the exact form from the 2026-04-11 ade-profile self-kill loop. Allow an optional run of global flags (`-p ade`, `--profile ade`, multiple flags) between `hermes` and the gateway subcommand. launchctl self-termination is already covered on main by #33071; this narrows the only remaining real gap.	2026-06-30 02:48:30 -07:00
Brooklyn Nicholson	e5253d852b	fix(desktop): tree-kill Windows terminal descendants Ensure Windows desktop and local terminal teardown kill full process trees so Git Bash descendants cannot survive wrapper exits and accumulate across retries.	2026-06-30 04:23:27 -05:00
kshitijk4poor	c9269fbfb6	fix(web_extract): bound stored full-text size + give concrete read_file offset Two robustness gaps from the #54843 truncate-store path: - _store_full_text wrote the full clean page to cache/web with no upper bound (path.write_text(content)); a multi-MB page → unbounded per-extract disk write. Cap at MAX_STORED_TEXT_CHARS (2MB, the pre-truncate-store refusal ceiling) with a marker when capped. - The truncation footer told the model 'read_file ... offset=<line>' — a literal placeholder it had to guess. Compute the real starting line of the omitted middle (head line count + 1) so the first read_file lands in the gap.	2026-06-30 00:19:49 -07:00
beardthelion	14c4a849b7	fix(kanban): make goal_mode judge gate truly fail-open Follow-up to the judge gate. judge_goal() is fail-open at the source: when no auxiliary model is reachable it returns a "continue" verdict that is indistinguishable from a real "not done yet" judgment. The gate treated any non-"done" verdict as a rejection, so an unconfigured or degraded auxiliary model would wedge every goal_mode worker — it could never close its own task. That contradicted the gate's own "fail-open" comment. Probe judge availability before enforcing (the same auxiliary client lookup judge_goal performs) and only gate when a judge is actually reachable. When none is, completion proceeds. Also fix the rejection guidance: kanban_create takes parents=[...], not parent=. Add test_complete_goal_mode_allows_when_judge_unavailable covering the fail-open path; update the rejection test to force the availability probe.	2026-06-29 22:20:19 -07:00
beardthelion	b3c1b3b3f3	fix(kanban): address review feedback on goal_mode judge gate Apply naqerl's review comments on PR #38388: - Hoist `from hermes_cli.goals import judge_goal` to module-level imports so an import failure surfaces at module init, not lazily on the first goal-mode completion (no circular import: hermes_cli package init is trivial and does not load tools.kanban_tools). - Narrow the fail-open `try` to wrap only the judge_goal() call. The verdict check and its rejection `return tool_error(...)` now live outside the handler, so a failure there can no longer be swallowed by the broad except. - Pass `exc_info=True` to the logger.warning call per CONTRIBUTING.md. Update the test mock target to tools.kanban_tools.judge_goal, since the hoisted import rebinds the name into this module's namespace.	2026-06-29 22:20:19 -07:00
beardthelion	0b33bc5396	fix(kanban): gate goal_mode task completion with auxiliary judge Prevents workers in goal_mode from bypassing the auxiliary judge by calling kanban_complete before acceptance criteria are met. The tool handler now synchronously invokes the goal judge against the task's title/body and the completion summary. If the verdict is not "done", the completion is rejected with actionable guidance for the agent. This keeps kanban_db.py as a pure SQLite wrapper while intercepting the bypass exactly at the agent tool-call boundary, aligning with Hermes separation of concerns. Fixes #38367 Co-authored-by: CommandCodeBot <noreply@commandcode.ai>	2026-06-29 22:20:19 -07:00
Jaaneek	9ce79cd642	feat(xai): Imagine public-URL storage, chaining & video edit/extend Add durable public-URL output and URL-based chaining to xAI Grok Imagine: - Store generated media on files-cdn with permanent public HTTPS URLs (public_url: true, no expiry by default). - Chain by URL: generate -> edit -> extend each take a prior result's public HTTPS URL (or a data URI / local file for inputs). - Add provider-specific xai_video_edit and xai_video_extend tools. - Image generation: public-URL/storage output, multi-reference edits, and ~/ local-path support for image edits. Credentials use xAI Grok device-code OAuth (separate PR).	2026-06-29 21:11:58 -07:00
Teknium	ee8cbfdc03	feat(web_extract): truncate-and-store instead of LLM summarization (#54843) * feat(web_extract): truncate-and-store instead of LLM summarization web_extract no longer runs an auxiliary LLM over scraped pages. The extract backends (Firecrawl/Tavily/Exa/Parallel) already return clean, boilerplate-stripped markdown, so we return it directly: pages within a char budget (default 15000, web.extract_char_limit) come back whole; larger pages get a head+tail window plus an explicit footer giving the stored full-text path and the read_file call to page through the omitted middle. The full clean text is written to cache/web (mounted read-only into remote backends like the other cache dirs), so nothing is lost. Inline base64 images are converted to [IMAGE: alt] placeholders (token bombs dropped) while real http(s) image URLs are preserved as links so the agent can still web_extract/vision_analyze them. Removes process_content_with_llm + the chunked summarizer + check_auxiliary_model + _resolve_web_extract_auxiliary. context_references._default_url_fetcher is updated to the truncate path and its stale data.documents shape read is fixed to results (it was silently returning empty). Live before/after eval (firecrawl, 4 URLs): 11.7x faster overall (176.6s -> 15.1s); 10-60x on large pages. Quality identical; findability 4/4 (answer recoverable from stored full text on every truncated page). web_search is unchanged. No own scraper added; no changes to web_search. * fix(web_extract): add char_limit to execute_code web_extract stub The new web_extract char_limit param must appear in the code_execution_tool _TOOL_STUBS signature (and doc line) or test_stubs_cover_all_schema_params fails — the stub schema must cover every real schema param.	2026-06-29 10:00:49 -07:00
Ruzzgar	576424cc1c	fix(security): redact browser CDP endpoint logs	2026-06-29 04:25:26 -07:00
teknium1	9f97915163	fix(browser): route open-timeout base through _safe_command_timeout Wire the salvaged _safe_command_timeout() guard into the surviving open-timeout call site. _get_open_command_timeout() feeds the browser_navigate 'open' path; this closes the last call site that could observe a None timeout from a torn cache (#14331), since the original PR's max(_get_command_timeout(), 60) site no longer exists on main (now routed through _get_open_command_timeout).	2026-06-29 02:24:57 -07:00
Sanjay Santhanam	c79e6bceae	fix(browser_tool): resolve race in _get_command_timeout cache returning None (#14331) # Conflicts: # tools/browser_tool.py	2026-06-29 02:24:57 -07:00
teknium1	75317d82d0	fix(vision): narrow the fan-out cap to the CPU encode burst only The original cap held a process-global slot across the WHOLE vision analysis (image load + encode + LLM call) with a default of min(CPUs, 4). That serialized legitimate multi-image workflows — "compare these 6 screenshots", "read this 10-page scan", "analyze every frame" — behind a 4-wide gate, and on the native fast path it even throttled calls that make no LLM request at all. Excess calls queued (blocking acquire, nothing dropped), but the latency hit on real fan-out was the wrong tradeoff. The incident was CPU exhaustion, not call count: concurrent base64/resize bursts saturated every core and left none to service the shared event loop serving /api/status. So cap ONLY that: - A dedicated, bounded ThreadPoolExecutor (_vision_cpu_executor) runs the encode/resize/dimension-check off the caller's loop, sized to the host's usable core count with NO fixed ceiling — the cap tracks the actual exhausted resource (cores), not a magic number. Excess encodes queue on the executor; cores stay free for the loop. - The LLM call is deliberately OUTSIDE the executor, so multi-image workflows keep full request concurrency. - Override via auxiliary.vision.max_concurrency / HERMES_VISION_MAX_CONCURRENCY (honored verbatim, including above core count); sub-1 ignored. - _vision_concurrency_slot() is now a no-op shim for back-compat. Tests assert: resolver defaults to host cores with no ceiling; env/config override (incl. above cores); sub-1 rejection; the executor is dedicated and core-sized; encode runs on a vision-encode thread; and crucially that encode bursts are bounded to the cap while the analyses themselves stay fully concurrent (calls_peak > cap).	2026-06-29 01:27:10 -07:00
Ben Barclay	eddfecd2ce	fix(vision): cap vision_analyze fan-out concurrency process-wide A single agent turn can fan out N vision_analyze calls at once — the classic trigger is "analyze every frame of this video", where ffmpeg explodes a clip into dozens of frames and the model calls vision_analyze on each. Every call does a CPU-heavy base64-encode/resize burst AND holds a long-lived LLM stream open. The tool executor runs concurrent tool calls on a per-session ThreadPoolExecutor (_MAX_TOOL_WORKERS=8), and multiple agent sessions share one process (the dashboard runs the agent in-process), so there was no global ceiling. In prod (June 2026) a video-frame fan-out pinned a worker thread at ~100% CPU and starved the shared asyncio event loop that also serves the dashboard's /api/status liveness probe, flapping the instance to UNHEALTHY even though nothing had crashed. Add a process-global threading.BoundedSemaphore that bounds how many vision analyses run concurrently across the whole process, held across the entire analysis (image load + encode + LLM call) in the single _handle_vision_analyze chokepoint (covers both the native fast path and the legacy aux-LLM path). It is a threading semaphore, NOT asyncio: each vision call is dispatched through model_tools._run_async on a per-thread event loop, so an asyncio primitive bound to one loop cannot coordinate across them. The acquire is offloaded via run_in_executor so waiting for a slot never blocks the calling loop. Default: min(host CPUs, 4), floored at 1 — respect the host's concurrency, or lower. Override via auxiliary.vision.max_concurrency (config.yaml) or HERMES_VISION_MAX_CONCURRENCY (env). Values < 1 are ignored so the cap can never be disabled into an unbounded fan-out. Tests: bounded-fan-out regression guard + a control proving it would fail without the cap; resolver tests for host-cpu default, ceiling clamp, low-cpu host, env override, and sub-1 rejection. Pre-existing handler tests updated for the now-async _handle_vision_analyze. Verified via the real registry.dispatch -> _run_async per-thread-loop path (16 concurrent calls, peak bounded to cap).	2026-06-29 01:27:10 -07:00
kaishi00	08d6195bc4	fix(camofox): auto-recover from stale tab 404 on navigate When a Camofox browser tab is garbage collected (idle timeout, browser recycle), the held tab_id becomes stale. The next browser_navigate call hits /tabs/{stale_id}/navigate -> HTTP 404 -> unhandled HTTPError. Catch the 404 in camofox_navigate, clear the stale tab_id, and create a fresh tab via _ensure_tab. The agent recovers transparently without requiring a session restart. Other tab operations (snapshot, click, type, etc.) use the same pattern but only fail if the tab dies between successful calls — much rarer. The navigate fix covers 95%+ of cases since navigate is always the entry point.	2026-06-29 01:26:24 -07:00
liuhao1024	fe38d50833	fix(tools): read browser.command_timeout in Camofox HTTP client The Camofox browser backend hardcoded a 30s HTTP timeout via _DEFAULT_TIMEOUT, ignoring the user's browser.command_timeout config. The main browser_tool path already reads this config via _get_command_timeout(). This commit adds an equivalent _get_command_timeout() to browser_camofox.py that reads browser.command_timeout from config with caching, and switches all HTTP helper methods (_post, _get, _get_raw, _delete) to use it as the default timeout. Fixes #40843	2026-06-29 01:26:24 -07:00
刘昊	babd9168ba	fix(browser): send Authorization header in Camofox HTTP calls when CAMOFOX_API_KEY is set The five HTTP call sites in browser_camofox.py (_ensure_tab, _post, _get, _get_raw, _delete) did not include Authorization headers, causing 403 Forbidden when the Camofox server has API key auth enabled. Added _auth_headers() helper and wired it into all five call sites. The health check endpoint (/health) is left without auth since it is a connectivity probe, not a browser operation. Regression test covers: header present when key set, absent when unset, blank key produces empty headers. Fixes #20476	2026-06-29 01:26:24 -07:00
liuhao1024	270456308c	fix(tools): send listItemId instead of sessionKey in Camofox tab creation The Camoufox REST API server expects `listItemId` in the `POST /tabs` body, but `_ensure_tab` was sending `sessionKey`. This caused a 400 Bad Request on every `browser_navigate` call. The parameter name mismatch is visible in the same file: line 283 already reads `tab.get("listItemId")` when adopting existing tabs, confirming the server-side field name. Fixes #37960	2026-06-29 01:26:24 -07:00
Ben Barclay	1289f12812	fix(memory): lazy-install supermemory + mem0 SDKs like honcho/hindsight The supermemory and mem0 memory providers shipped third-party SDKs (supermemory / mem0ai) that are not core dependencies, but — unlike the honcho and hindsight providers — they imported those SDKs directly with no tools.lazy_deps.ensure() preflight and had no LAZY_DEPS allowlist entry. On the published Docker image the agent venv is sealed (HERMES_DISABLE_LAZY_INSTALLS=1) and lazy installs are redirected to a writable durable target (HERMES_LAZY_INSTALL_TARGET). honcho/hindsight route through ensure() and install fine there; supermemory/mem0 never called it, so their SDK was never installed on a hosted instance and the provider silently reported itself unavailable even with the API key set. Fixes: - Add memory.supermemory + memory.mem0 to the LAZY_DEPS allowlist (tools/lazy_deps.py), pinned to current PyPI releases. - Call ensure('memory.<x>', prompt=False) at each SDK-import chokepoint (_SupermemoryClient.__init__; Mem0MemoryProvider._create_backend), mirroring honcho's wrapped try/except shape. - Drop the SDK-import gate from supermemory's is_available() — it was a chicken-and-egg trap (provider never loaded on a sealed venv, so ensure() never ran). Now key-presence only, like honcho/mem0. - Add matching pyproject extras [supermemory]/[mem0]; update the lazy-covered-extras contract test (excluded from [all] by policy). Tests prove each path fails without the fix and the real sealed-venv durable-target gate accepts both features.	2026-06-29 00:25:36 -07:00
Ben Barclay	8fe800ee1a	fix(file-tools): sanitize host/relative cwd override before it reaches container sandbox (#54447) (#54616) (cherry picked from commit 82132f7911ecf71f27ee5657870bf4105cecf8e2) Co-authored-by: Tranquil-Flow <66773372+Tranquil-Flow@users.noreply.github.com>	2026-06-29 15:32:20 +10:00
Ruzzgar	313a8c6833	fix(skills): replace string prefix check with strict path containment	2026-06-28 21:14:01 -07:00
Brooklyn Nicholson	ae465e9fb8	Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/desktop-multiterminal	2026-06-28 21:37:52 -05:00
Brooklyn Nicholson	e117cfdff0	feat(desktop): live agent terminals + agent-driven tab close Make the read-only agent terminal mirrors stream in real time and give the agent a desktop-only way to dismiss its own tabs. - Stream background output live: the local reader used a blocking read(4096) that buffered small periodic output until EOF, so agent tabs only "filled in" at process exit. Switch to buffer.read1(4096) (decoded) for incremental chunks. - Route agent.terminal.output / terminal.close to the window that owns the process (its gateway session) instead of an empty session id, so events actually reach the desktop renderer. - Add close_terminal: a HERMES_DESKTOP-gated tool (sibling of read_terminal) that drops a process's read-only tab WITHOUT killing it via process_registry.on_close; output keeps buffering and the user can reopen from the status stack. - ⌘W now closes a focused agent tab: mark the agent instance data-terminal and focus it on activation so isFocusWithin routes there. - ensureTerminal() no longer spawns an extra user shell when a tab already exists (e.g. opening a background task from the status stack).	2026-06-28 21:15:14 -05:00
LIC99	dda3268d09	fix(approvals): warn and default to manual on unknown approvals.mode _normalize_approval_mode() previously accepted any string, so an unknown value like 'auto' fell through every downstream mode check (off/smart) and silently behaved like manual with no signal. Validate against the known modes (manual/smart/off), emit a warning for anything else, and default to manual to match the config default and the rest of the function. Bug 1 from the original PR (/approve & /deny bypassing the running-agent guard) already landed on main independently, so only the mode-validation fix is salvaged here. Fixes #4261 Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-28 19:04:18 -07:00
aaronagent	5c1ac6c70d	fix(config): strip `export` prefix in .env parsers across three modules All three .env parsers use `line.partition("=")` without stripping the bash-compatible `export ` prefix first. A line like `export API_KEY=sk-...` produces key `"export API_KEY"` instead of `"API_KEY"`, silently ignoring the variable and causing auth failures for users who copy-paste from bash profiles or follow tutorials that include `export`. - tools/skills_tool.py: `load_env()` for skill environment - hermes_cli/config.py: `load_env()` for core config - hermes_cli/main.py: `_has_any_provider_configured()` inline parser Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-06-28 18:53:00 -07:00
Teknium	9860d93f2a	fix(terminal): require approval for host-bound Docker commands (#54483) * fix(terminal): require approval for host-bound Docker commands The Docker terminal backend blanket-skips dangerous-command approval on the assumption that the container is isolated from the host. That holds only when nothing is bind-mounted in. Once a host path is exposed (via TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE or a host-path entry in TERMINAL_DOCKER_VOLUMES), a command like `rm -rf /workspace` reaches real host files but is still auto-approved. Detect host bind mounts and route those sessions through the normal approval flow. Isolated Docker keeps the fast path. The same gating is applied to the execute_code guard, which had the identical blanket skip. Co-authored-by: Hermes Agent <agent@nousresearch.com> * chore: add AUTHOR_MAP entry for PR #6436 salvage (Kolektori) * test: accept has_host_access kwarg in _check_all_guards mocks The host-bound Docker approval fix adds a has_host_access kwarg to the _check_all_guards wrapper. Six pre-existing tests monkeypatch it with a fixed (command, env_type) / (cmd, env) lambda signature, which now raises TypeError when terminal_tool passes the new kwarg. Widen those mock signatures to accept **kwargs. --------- Co-authored-by: Kolektori <256073454+Kolektori@users.noreply.github.com> Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-29 11:35:41 +10:00
Ben Barclay	7cfa2fa13f	fix(docker): gate resource limit flags on cgroup controller availability (#54516) On hosts where the cgroup v2 cpu/memory/pids controllers are not delegated to the docker/podman process (unprivileged Proxmox LXCs, some rootless and nested setups), --pids-limit/--cpus/--memory cause every container start to fail with OCI runtime error / exit 126, breaking terminal + execute_code. - Add _cgroup_limits_available(image): one-shot, host-wide cached probe that spawns a throwaway container from the sandbox image itself (sleep 0) with all three flags together, mirroring the existing _storage_opt_supported probe-and-degrade pattern. - Remove --pids-limit from static _BASE_SECURITY_ARGS; apply it (default 256 via _DEFAULT_PIDS_LIMIT) in resource_args gated on the probe. - Gate --cpus and --memory on the same probe. Behavior unchanged on cgroup-capable hosts; graceful degradation with a one-time warning where controllers aren't delegated. Fixes #6568. (cherry picked from commit c933880b7ee2ce4d1167e0f89caa2d233db5639f) Co-authored-by: angelos <angelos@oikos.lan.home.malaiwah.com>	2026-06-29 11:01:08 +10:00
Brooklyn Nicholson	520212cc59	feat(desktop): stream agent terminal output live instead of polling Replace the 5s output_tail poll (which often showed nothing) with a real push stream. The process registry gains an on_output sink called from its reader threads with each chunk; the tui_gateway wires it to emit agent.terminal.output {process_id, chunk} (write_json is _stdout_lock-guarded, so emitting from the reader thread is safe). The desktop routes chunks by process id straight into the read-only agent xterm via a small writer registry, with a capped backlog so a tab opened mid-stream (or reopened) replays what it missed. Drops the fragile poll/tail path: no session-key matching, no truncation, no lag — full-fidelity ANSI, env-agnostic (local/docker/ssh).	2026-06-28 19:33:43 -05:00
Brooklyn Nicholson	cb1bb1a48d	refactor(windows): unify windowless spawn form across the touched sites windows_hide_flags() already returns 0 on POSIX (and creationflags=0 is the no-op default there, exactly how server.py::_list_repo_files does it), so drop the IS_WINDOWS import + ternary/one-use-dict gating and just pass creationflags=windows_hide_flags() directly. Tests lose the now-pointless IS_WINDOWS monkeypatch.	2026-06-28 17:44:47 -05:00
Brooklyn Nicholson	32087e4bc9	fix(windows): hide console flash on checkpoint git + skills_hub gh probes The #54236/#54417 backend git/gh sweep routed git_probe, the repo-file picker, coding_context, context_references, copilot_auth, and the gateway process scans through CREATE_NO_WINDOW, but two sibling spawn legs that also run inside the console-less desktop/gateway backend were missed: - tools/checkpoint_manager.py `_run_git` (and the one-shot `git init --bare` in `_init_store`) — when checkpoints are enabled, every file-mutating turn fires multiple bare `git` calls (status, add, write-tree/commit-tree, update-ref). Spawned from a parent with no console (Electron spawns the backend with windowsHide → CREATE_NO_WINDOW), each one allocates its own conhost window → a flurry of terminal popups. - tools/skills_hub.py `GitHubAuth._try_gh_cli` — `gh auth token`, the same bug class as the already-fixed copilot_auth gh probe. Route both through `windows_hide_flags()` (no-op on POSIX), matching the established per-site pattern. Tests added to tests/test_windows_subprocess_no_window_flags.py.	2026-06-28 17:41:47 -05:00
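
Note on 5c1ac6c70d (strip `export` prefix in .env parsers): the failure mode in that message is that `line.partition("=")` on `export API_KEY=sk-...` yields the key `"export API_KEY"`. A minimal sketch of the fix follows. The function name `parse_env_line` and the handling of blank and comment lines are assumptions for illustration only, not the code in tools/skills_tool.py, hermes_cli/config.py, or hermes_cli/main.py.

```python
from typing import Optional, Tuple


def parse_env_line(line: str) -> Optional[Tuple[str, str]]:
    """Parse one KEY=VALUE line; return None for blanks, comments, or no '='.

    Hypothetical helper for illustration; not the project's actual parser.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    # Drop a bash-style `export ` prefix before splitting on "=".
    if stripped.startswith("export "):
        stripped = stripped[len("export "):].lstrip()
    key, sep, value = stripped.partition("=")
    if not sep:
        return None
    return key.strip(), value.strip()


# Without the prefix strip, the key here would be "export API_KEY".
print(parse_env_line("export API_KEY=sk-123"))
```

Lines without the prefix take the same path, so `API_KEY=sk-123` and `export API_KEY=sk-123` yield the same key.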

1942 commits