hermes-agent

Author	SHA1	Message	Date
liuhao1024	d3c8a155cb	fix(slack): keep blank-line-separated ordered items in one rich_text_list When a Markdown ordered list has blank lines between items (common in LLM-authored content), the list run loop breaks on each blank line. Slack numbers each rich_text_list independently, so N items produce N lists each starting at 1. Skip blank lines inside the list run as soft separators instead of breaking, so ordered items stay in one rich_text_list and Slack renders the correct numbering. Fixes #57076	2026-07-03 02:55:22 +05:30
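The list-run behavior described in d3c8a155cb can be sketched roughly as follows. This is a minimal illustration, not the adapter's code: `ORDERED_ITEM` and `collect_ordered_run` are hypothetical names, and the look-ahead rule (a blank line only counts as a soft separator when another ordered item follows) is an assumption.

```python
import re

# Hypothetical pattern for a Markdown ordered item such as "1. text" or "2) text".
ORDERED_ITEM = re.compile(r"^\s*\d+[.)]\s+(.*)$")


def collect_ordered_run(lines, start):
    """Collect one run of ordered items beginning at lines[start].

    Blank lines between items are skipped as soft separators, so the whole
    run can be emitted as a single list. Returns (items, next_index).
    """
    items = []
    i = start
    while i < len(lines):
        match = ORDERED_ITEM.match(lines[i])
        if match:
            items.append(match.group(1))
            i += 1
            continue
        if not lines[i].strip():
            # Look past consecutive blank lines for another ordered item.
            j = i
            while j < len(lines) and not lines[j].strip():
                j += 1
            if j < len(lines) and ORDERED_ITEM.match(lines[j]):
                i = j
                continue
        break
    return items, i
```

With this shape, three items separated by blank lines come back as one run, which a caller could turn into one rich_text_list instead of three.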
Yingliang Zhang	67472fbaa4	fix(tui_gateway): route setup.runtime_check and setup.status to RPC pool setup.runtime_check and setup.status are polled by the Desktop frontend on connect and periodically (use-status-snapshot → evaluateRuntimeReadiness), but neither was in _LONG_HANDLERS — so dispatch() ran both inline on the WS reader thread. Under GIL pressure from concurrent agent turns (terminal I/O, large output, background-process completions) either can block for seconds: - setup.runtime_check → resolve_runtime_provider() (config read, auth check, may probe the provider endpoint) - setup.status → _has_any_provider_configured() (provider config + credential scan) While either blocks the reader thread the WS read loop can't service later requests; the frontend RPC timeout fires, the client drops the socket, and the lost setup.runtime_check response reads as ready=false — a false "needs setup" / "Settings failed to load" even though the provider is configured. Route both to the RPC pool (same precedent as #55545's session.list/pet.info/ process.list). The handlers are read-only and pool writes go through the lock-guarded write_json, so there's no ordering or safety concern. Test asserts all 5 frontend-polled RPCs are pool-routed. Co-authored-by: izumi0uu <izumi0uu@gmail.com>	2026-07-02 15:44:37 -05:00
Brooklyn Nicholson	1501a338c3	fix(cli): stop profile-bound backends before deleting so rmtree converges delete_profile stopped only the process named in gateway.pid, but a Desktop app spawns a headless `serve`/`dashboard` backend per profile that holds the profile's SQLite connection open and keeps writing sessions/WAL/sandbox files. That backend is never in gateway.pid, so a CLI `hermes profile delete` run while the Desktop app is up left it writing into the tree — rmtree's final rmdir then failed with ENOTEMPTY (#47368 "Bug 2"), and pre-guard it also resurrected the directory. - _profile_bound_backend_pids(): find running Hermes backends bound to this profile via a `--profile <name>` selector or a HERMES_HOME env resolving to the profile dir. Tightly scoped — current-user only, backend subcommands (serve/dashboard/gateway) only so an interactive chat is never killed, and never this process or its ancestors. - _stop_profile_backends(): terminate them (graceful, then force), best-effort so it can never make delete worse. - _rmtree_with_retry(): a few spaced retries absorb the ENOTEMPTY / Windows file-lock race from a just-terminated writer's in-flight -wal/-shm/sandbox writes instead of failing the whole delete on a race the next attempt wins. Complements the recreation guard (deleted profiles no longer reappear) and the Desktop teardown-before-delete flow; this is the CLI-side convergence fix for a delete run while a Desktop-managed backend is live. Part of #47368.	2026-07-02 15:31:35 -05:00
Brooklyn Nicholson	5a6720b884	fix(desktop,tui-gateway,zai): stop thinking-off from reverting to medium A Z.ai desktop user reported thinking reverting to medium after one turn, burning ~200% of a week's credits in 4 days despite reasoning_effort: false in config.yaml. Four compounding bugs: - _session_info reported reasoning_effort "" for disabled reasoning, indistinguishable from unset — the desktop adopted it after the first turn, wiping its sticky "thinking off" pick so every later chat reverted to the default effort. - config.set key=reasoning always wrote agent.reasoning_effort to global config.yaml, so every desktop model-menu selection (preset.effort ?? 'medium') clobbered the user's configured value. Now session-scoped like the messaging gateway's /reasoning, landing on create_reasoning_override so lazily-built sessions keep it too. - YAML `reasoning_effort: false`/`off`/`no` (boolean False) was coerced to "" by every loader's `str(x or "")`, silently re-enabling thinking. parse_reasoning_effort now treats False/"false"/"disabled" as {"enabled": False}; loaders (tui gateway, gateway, cli, cron, delegate) pass the raw value through. The desktop config reader also crashed on the boolean (false.trim()), aborting voice/STT settings. - The zai provider profile never sent thinking on the wire, and GLM-4.5+ defaults to thinking ON server-side — so disabling reasoning was a silent no-op on direct Z.ai, the actual token burner. The profile now emits extra_body.thinking {"type": "enabled"\|"disabled"} for thinking-capable GLM models, mirroring the DeepSeek profile. Also: /new (session reset) now carries reasoning_config across the rebuild like model_override; config.get reasoning prefers the session's live value and maps a config False to "none"; Settings shows "Off" instead of a blank select for hand-written false.	2026-07-02 15:23:47 -05:00
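The YAML-boolean coercion problem in 5a6720b884 is easy to reproduce in isolation. The sketch below is illustrative only: `naive_load` mimics the `str(x or "")` pattern named in the message, and `parse_reasoning_effort_sketch` is a hypothetical stand-in whose return shape for enabled values is an assumption (only `{"enabled": False}` is taken from the commit).

```python
def naive_load(raw):
    # The coercion described in the commit: False and None both collapse to "".
    return str(raw or "")


def parse_reasoning_effort_sketch(raw):
    """Hypothetical stand-in that keeps 'disabled' distinct from 'unset'."""
    if raw is False:
        return {"enabled": False}
    if isinstance(raw, str):
        text = raw.strip().lower()
        if text in {"false", "disabled"}:
            return {"enabled": False}
        if not text:
            return None  # unset: the caller falls back to its default
        # Assumed shape for an enabled effort level.
        return {"enabled": True, "effort": text}
    return None  # None or an unrecognized type is treated as unset here
```

Passing the raw value through to the parser, rather than stringifying it first, is what keeps `reasoning_effort: false` from being read as "not configured".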
teknium1	254328bf56	fix(auth): remove stale loopback_pkce reference in xAI quarantine removal list The terminal-refresh quarantine filtered in-memory entries on source == "device_code" but built removed_ids from the deleted "loopback_pkce" source name, so the revoked device-code entry was never pruned from the persisted pool in auth.json. Also restores the _print_loopback_ssh_hint test suite scoped to Spotify (the helper's remaining caller) instead of deleting it wholesale.	2026-07-02 13:17:41 -07:00
Jaaneek	5ef0b8acb0	feat(auth): make xAI Grok OAuth device-code-only, drop loopback login Replace the loopback/PKCE-callback server and manual-paste fallback with the RFC 8628 device-code flow as the only xAI Grok OAuth login path. The flow works in headless/SSH/container sessions with no 127.0.0.1 listener, shrinking the local attack surface. - Poll the token endpoint with server-provided interval, honoring slow_down and expires_in; store tokens with auth_mode oauth_device_code. - Adaptive proactive refresh skew for short-lived device-code JWTs; rotated tokens sync back to auth.json, the global root store, and the credential pool (no refresh-token replay). - Clear source suppression on successful re-login (CLI + dashboard) and drop the duplicate dashboard pool entry so exactly one seeded device_code entry exists. - Use the shared device_code source name for consistency with the nous/codex device-code providers. - Desktop: remove the loopback OAuth flow states and dead type variants; pkce providers' sign-in URL selection is unchanged. - Docs (EN + zh-Hans) rewritten for device-code login; drop the deleted --manual-paste flag from documented commands.	2026-07-02 13:17:41 -07:00
LeonSGP43	472d75193f	Prevent deleted profile skeleton revival	2026-07-02 15:11:56 -05:00
teknium1	a2d49de801	fix(terminal): also set MSYS2_ARG_CONV_EXCL for MSYS2/Cygwin bash fallback MSYS_NO_PATHCONV is honored by Git for Windows bash only. _find_bash's final shutil.which fallback can return MSYS2-proper or Cygwin bash, which ignore it and honor MSYS2_ARG_CONV_EXCL instead. Set both so argv path conversion stays disabled regardless of which bash flavor spawns. Also subsumes the cmd /c mangling in #56147.	2026-07-02 11:48:03 -07:00
xxxigm	51c01062d4	test(terminal): cover MSYS_NO_PATHCONV defaults on Windows env builders	2026-07-02 11:48:03 -07:00
David Zhang	30e947e0a0	feat(gateway): persist per-session /model overrides across gateway restarts Per-session /model overrides (_session_model_overrides) were in-memory only, so a gateway restart silently reverted every session to the global default model. Persist the non-secret parts (model/provider/base_url ONLY — never api_key) into the session entry in sessions.json and lazily rehydrate them on first use after a restart, re-resolving credentials through the normal runtime provider resolution. - gateway/session.py: SessionEntry.model_override field with sanitize_model_override() (allowlist: model/provider/base_url) applied on both serialization and deserialization; SessionStore.set_model_override / get_model_override accessors. reset_session() already creates a fresh entry, so /new keeps its clear-on-reset semantics — a restart cannot resurrect an override the user reset away. - gateway/slash_commands.py: write-through at both /model set sites (text command + picker) after storing the in-memory override. - gateway/run.py: _rehydrate_session_model_override() called from _resolve_session_agent_runtime(); in-memory state always wins, credentials are re-resolved per provider (credential-less fallback on failure). Session expiry finalization also drops the persisted override. - tests/gateway/test_session_model_override_persistence.py: restart round-trip, /new clearing, api_key-never-serialized (including tampered sessions.json), rehydration + live-state precedence + credential-failure degradation. Salvaged from #3659 by @Git-on-my-level, narrowed to the restart-persistence gap confirmed in triage.	2026-07-02 05:51:12 -07:00
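The allowlist idea behind sanitize_model_override() in 30e947e0a0 can be sketched as below. The function name `sanitize_model_override_sketch` and the choice to drop empty or non-string values are assumptions; only the three allowed keys come from the commit message.

```python
# Keys the commit says may be persisted; api_key is deliberately absent.
ALLOWED_KEYS = ("model", "provider", "base_url")


def sanitize_model_override_sketch(override):
    """Return a copy holding only allowlisted, non-empty string fields."""
    if not isinstance(override, dict):
        return None
    clean = {
        key: override[key]
        for key in ALLOWED_KEYS
        if isinstance(override.get(key), str) and override[key]
    }
    return clean or None
```

Applying the same filter on both write and read means a hand-edited sessions.json cannot smuggle a secret back into the in-memory override.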
Jneeee	b98baa3039	feat(config): extra HTTP headers for LLM API calls (#3526 salvage) Named providers / custom_providers entries in config.yaml now accept an extra_headers dict scoped to that endpoint — for reverse proxies, API gateways, and custom auth schemes (e.g. Cloudflare Access service tokens). - hermes_cli/config.py: normalize extra_headers on provider entries (_normalize_custom_provider_entry + providers-dict translation), add get_custom_provider_extra_headers / apply_custom_provider_extra_headers_to_client_kwargs helpers keyed on base_url (case/trailing-slash insensitive, no substring bypass — mirrors the TLS helpers) - hermes_cli/runtime_provider.py: surface extra_headers in the resolved runtime for named custom providers (providers dict, legacy custom_providers list, and the credential-pool path) - run_agent.py / agent/agent_init.py: merge per-provider extra_headers onto the OpenAI client default_headers at construction and on every _apply_client_headers_for_base_url re-application (credential swaps, rebuilds), most-specific level wins; OpenAI-wire only (native Anthropic/Bedrock scoped out) - agent/auxiliary_client.py: accept model.extra_headers as an alias of model.default_headers for the global variant - cli-config.yaml.example: documented commented example - Header values are treated as secrets and never logged Salvaged from PR #3526 by @jneeee, reimplemented against current main. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-07-02 05:33:25 -07:00
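The base_url-keyed lookup described in b98baa3039 (case and trailing-slash insensitive, no substring match) might look something like this. `normalize_base_url` and `extra_headers_for` are hypothetical helpers, the provider entries are made-up data, and lowercasing the entire URL is a simplifying assumption.

```python
def normalize_base_url(url):
    # Assumption: compare whole URLs after trimming, dropping trailing
    # slashes, and lowercasing.
    return url.strip().rstrip("/").lower()


def extra_headers_for(base_url, providers):
    """Return the extra_headers of the entry whose base_url matches exactly."""
    target = normalize_base_url(base_url)
    for entry in providers:
        if normalize_base_url(entry.get("base_url", "")) == target:
            return dict(entry.get("extra_headers") or {})
    return {}


# Illustrative provider entry; the header name is the one Cloudflare Access
# uses for service tokens, the value is a placeholder.
PROVIDERS = [
    {
        "base_url": "https://proxy.example.com/v1/",
        "extra_headers": {"CF-Access-Client-Id": "placeholder-id"},
    }
]
```

Exact equality after normalization is what rules out a longer URL that merely starts with a configured endpoint.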
Mibayy	4a09b692ec	feat(api-server): per-client model routing via model_routes (#3176 salvage) Adds a no-code routing layer to the OpenAI-compatible API server so one Hermes deployment can map different API clients to different model/provider backends. Clients pick a backend by sending a configured alias as the OpenAI 'model' field; unmatched values fall back to the global model. Configured aliases are listed by GET /v1/models. Precedence (highest first): session /model override > model_routes route > global config. Route provider credentials resolve through _resolve_runtime_agent_kwargs_for_provider (same seam as channel_overrides); per-route api_key/base_url are upstream provider credential overrides — never caller auth, never logged. Salvaged and rebased from PR #3176 by @Mibayy onto current main.	2026-07-02 05:23:28 -07:00
Mibayy	ce9aa869fc	feat(commands): /compact alias + --preview/--dry-run flags for /compress (#3243 salvage) Salvaged from PR #3243 by @Mibayy, reimplemented against current main (the original diff targeted a removed gateway/run.py handler). - /compact is now a first-class alias of /compress (CLI, gateway, Telegram/Slack/Discord command lists, autocomplete) — also fixes the dangling '/compact' references in gateway error messages (gateway/run.py context-exhausted banners). - --preview / --dry-run: report what WOULD be compressed (message counts, token estimate, 'here [N]' boundary) without touching the transcript. Flags coexist with the existing 'here [N]' / focus-topic args on both the CLI and gateway surfaces via shared pure helpers in hermes_cli/partial_compress.py. - --aggressive (LLM-free hard truncation) is intentionally NOT implemented: it would need its own transcript-persistence branch outside the guarded _compress_context rotation machinery (#44794 data-loss class). The flag is recognized and returns an explanatory message pointing at '/compress here [N]' and /undo instead of being mis-parsed as a focus topic. - locales: gateway.compress.aggressive_unsupported added to all 16 catalogs (parity test enforced). - release.py: AUTHOR_MAP entry for contributor credit.	2026-07-02 05:10:31 -07:00
Morgan K	39bff67957	feat(gateway): add 'log' option to display.tool_progress Salvage of #3459 by @keslerm, reimplemented against the restructured progress-callback block in gateway/run.py (resolve_display_setting, needs_progress_queue, thinking-relay). Duplicate PR #3458 by @dlkakbs was submitted 4 minutes earlier with the same feature — both credited. Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com> tool_progress: log keeps the chat silent and appends timestamped tool-call lines to ~/.hermes/logs/tool_calls.log via a dedicated queue drained by an async writer (RotatingFileHandler 5MB x 3, RedactingFormatter so secrets never land on disk). Gateway-only by design; thinking_progress relaying and the webhook gate are unaffected. /verbose now cycles off -> new -> all -> verbose -> log.	2026-07-02 05:09:38 -07:00
Mibayy	070ac2a719	fix(status): label provider as custom when config.yaml model.base_url is set Salvage of the surviving hunk of #3296 by @Mibayy. The PR's gateway _handle_provider_command hunk targets code removed on main (/provider was absorbed into /model + /status, which already read model.base_url); the hermes status mislabel was the remaining live symptom: _effective_provider_label() only checked the legacy OPENAI_BASE_URL env var, so a custom endpoint configured canonically in config.yaml still displayed as OpenRouter.	2026-07-02 04:59:02 -07:00
kshitijk4poor	019950560d	refactor(image-gen): reuse shared image sniffer + raster allowlist in codex backend Replace the plugin-local _IMAGE_MAGIC_MIME table + _sniff_image_mime body with a delegation to agent.image_routing._sniff_mime_from_bytes, the canonical magic-byte sniffer already used across the codebase, then gate its result to the raster formats gpt-image-2's Responses input_image actually accepts (png/jpeg/gif/webp). The shared sniffer also recognizes SVG/TIFF/ICO; without the allowlist those would pass local validation and be rejected server-side with an opaque HTTP 400. Gating locally fails them cleanly as invalid_image_input. Adds a regression test for SVG rejection. Follow-up on top of @CrazyBoyM's #55828.	2026-07-02 17:12:24 +05:30
CrazyBoyM	460235d584	test(image-gen): cap Codex reference inputs	2026-07-02 17:12:24 +05:30
CrazyBoyM	ecffd290a3	feat(image-gen): support Codex image inputs	2026-07-02 17:12:24 +05:30
Evo	a4a562ff0c	fix(browser): guard Camofox snapshot/vision/images on private pages Follow-up to #56874, which added the Camofox private-page SSRF guard (_camofox_current_page_private_url) but wired it only into the Camofox eval path (_camofox_eval). The other Camofox content-read tools — camofox_snapshot, camofox_get_images, and camofox_vision — still read the current page's accessibility tree / images / screenshot without the guard, so on a non-local Camofox backend they can return the content of an intranet or cloud-metadata page (e.g. 169.254.169.254) that the terminal itself can't reach. Apply the same guard, gated on _eval_ssrf_guard_active (non-local backend, not a local sidecar, allow_private_urls unset) and fail-open on probe failure, matching the eval-path guard and the main-browser snapshot/vision guards. camofox_back is intentionally not changed: its target is unknown until navigation completes, and the subsequent content read is already guarded. Adds regression tests covering the three read tools blocking on a private page, the public-page pass-through, and the guard-inactive no-probe path.	2026-07-02 17:07:17 +05:30
HexLab98	ede4d12561	test(codex): cover gateway-scale stale timeout floor and TTFB gate	2026-07-02 17:05:05 +05:30
Teknium	3f2a56d1a4	fix(cli): reliable interrupts, bounded exit, and exit feedback (#57000 ) Three CLI reliability fixes: 1. Interrupt reliability: chat() only re-queued the user's interrupt message when the turn result carried interrupted=True. When the agent thread raced past its last interrupt check (or finished) before the interrupt landed, the message was silently dropped — and the stale _interrupt_requested flag left on the agent instantly aborted the NEXT turn. Un-acknowledged interrupt messages are now re-queued as the next turn and the stale flag is cleared (only when the agent thread actually exited). The clarify-race path also parks the message in _pending_input instead of dropping it. 2. Slow exit (5+ min): stdlib ThreadPoolExecutor workers are non-daemon and joined unconditionally by concurrent.futures' atexit hook — even after shutdown(wait=False). One wedged tool worker (abandoned after interrupt/timeout) held the process open forever. Promoted async_delegation's daemon executor to a shared tools/daemon_pool module and adopted it in tool_executor (concurrent tool batches), memory_manager (background sync), delegate_tool (child timeout wrapper + batch fan-out), and skills_hub (source fan-out). Added a 30s exit watchdog (HERMES_EXIT_WATCHDOG_S) armed at _run_cleanup start as a backstop for wedged cleanup steps. 3. Exit jank: after prompt_toolkit tears down the input/status bars the terminal sat silent for the whole cleanup window, looking hung. Print 'Shutting down… (finalizing session)' immediately at exit start. E2E: live PTY interrupt of a foreground 'sleep 120' terminal tool now aborts in ~1s and the typed message runs as the next turn; wedged-worker + wedged-cleanup subprocess exits in 5.8s (watchdog) instead of hanging.	2026-07-02 04:20:43 -07:00
Tarun Ravikumar	2068754d6f	feat(api-server): inline MEDIA: image tags as base64 data URLs for remote frontends Salvage of the surviving piece of #2696 by @tarunravi. The PR's other two changes (tool progress streaming, SSE None-sentinel fix) were independently superseded on main by the structured hermes.tool.progress SSE events and the rewritten queue-drain loop. Remote OpenAI-compatible frontends can't read server-local file paths, so MEDIA:<path> tags (browser screenshots, generated images) were dead text. _resolve_media_to_data_urls() now inlines small (<=5MB) local images as markdown data URLs across all four response surfaces: chat completions (non-streaming), session chat, session chat stream final event, and the Responses API. Non-image, missing, or oversized paths pass through untouched.	2026-07-02 03:23:44 -07:00
CharmingGroot	88bd1c01e1	fix(email): harden adapter against malformed IMAP responses Salvage of #2794 by @CharmingGroot, ported to the relocated plugins/platforms/email/adapter.py: - Guard raw_email = msg_data[0][1] against IndexError/TypeError and non-bytes payloads. UIDs are added to _seen_uids before fetch, so an exception mid-batch permanently skipped every remaining message in the batch — now the bad message is logged and skipped instead. - Message-ID domain generation falls back to 'localhost' when EMAIL_ADDRESS lacks '@' (now via a shared _message_id_domain() helper covering all 3 send paths; the PR fixed 2 of 3).	2026-07-02 03:12:53 -07:00
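The two guards in 88bd1c01e1 can be sketched in isolation. `extract_raw_email` and `message_id_domain` are hypothetical names (the commit only names a shared `_message_id_domain()` helper); the behavior for an address ending in `@` is an assumption.

```python
def extract_raw_email(msg_data):
    """Return the raw message bytes from an IMAP fetch result, or None.

    Guards the msg_data[0][1] access against short, None, or non-bytes
    payloads so one malformed response does not abort the whole batch.
    """
    try:
        payload = msg_data[0][1]
    except (IndexError, TypeError):
        return None
    return payload if isinstance(payload, bytes) else None


def message_id_domain(address):
    """Domain part of an address, falling back to 'localhost' without '@'."""
    _, sep, domain = address.rpartition("@")
    return domain if sep and domain else "localhost"
```

A caller would log and skip a message when `extract_raw_email` returns None, then continue with the rest of the batch.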
crazywriter1	0010c14e66	feat(gateway): per-channel model and system prompt overrides (Fixes #1955) - ChannelOverride + channel_overrides on PlatformConfig - Resolve model/runtime: session /model, then channel_overrides, then global - Thread/parent channel lookup; bridge discord.channel_overrides from YAML - Drop unrelated test and delegate_tool changes from PR scope	2026-07-02 03:08:11 -07:00
crazywriter1	ebef73f6b8	feat(gateway): per-channel model and system prompt overrides (Fixes #1955) - config: ChannelOverride + PlatformConfig.channel_overrides - run: _resolve_model_for_channel, _get_system_prompt_for_channel, channel provider runtime - tests: channel overrides + config guard for bare runner; conftest asyncio fix; slack/whatsapp warning filters Made-with: Cursor	2026-07-02 03:08:11 -07:00
Teknium	902b0b70e4	test: env-flag 'on' truthy behavior contract (#2863 follow-up)	2026-07-02 03:00:59 -07:00
VolodymyrBg	ea5d75befd	fix(webhook): remove unused payload from delivery state	2026-07-02 03:00:17 -07:00
Teknium	6e369a3762	feat(delegation): unify concurrency caps — deprecate max_async_children (#56955 ) delegation.max_concurrent_children is now the single cap for both a batch's parallelism and concurrent background delegation units. - _get_max_async_children() delegates to _get_max_concurrent_children(); a leftover max_async_children key logs a one-time deprecation warning - config v32→33 migration removes the stale key, folding a raised max_async_children into max_concurrent_children (max wins, no lost headroom) - capacity error messages now point at max_concurrent_children - pool-at-capacity sync fallback now attaches an explanatory note so the model/user know why the call blocked instead of dispatching async Previously users who raised max_concurrent_children (e.g. to 15) still hit the invisible default-3 async cap: the 4th background delegate_task silently ran inline, blocking the turn with no signal.	2026-07-02 02:53:39 -07:00
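The "max wins" fold in the v32→33 migration from 6e369a3762 could be sketched like this. The function name and the handling of a missing or non-integer max_concurrent_children are assumptions; only the two key names and the max-wins rule come from the commit.

```python
def fold_async_cap(config):
    """Return a new config with max_async_children folded away.

    The stale key is removed; if it was raised above
    max_concurrent_children, the larger value is kept so no headroom is lost.
    """
    delegation = dict(config.get("delegation") or {})
    stale = delegation.pop("max_async_children", None)
    if isinstance(stale, int) and not isinstance(stale, bool):
        current = delegation.get("max_concurrent_children")
        if not isinstance(current, int) or stale > current:
            # Assumption: with no usable current value, adopt the stale one.
            delegation["max_concurrent_children"] = stale
    migrated = dict(config)
    migrated["delegation"] = delegation
    return migrated
```

The input dict is left untouched, which makes the step easy to test and to re-run.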
Teknium	14639ded77	fix(terminal): stop stripping CLAUDE_CODE_OAUTH_TOKEN from spawned subprocesses (#56935 ) CLAUDE_CODE_OAUTH_TOKEN is set and owned by the user's Claude Code install (subscription OAuth), not a Hermes-managed inference credential — Claude subscription auth is not a working Hermes provider path. Blocklisting it broke agent-spawned claude CLIs: with no token in the child env, claude fell through to the shared macOS Keychain / ~/.claude/.credentials.json store and, on auth failure, cleared it — logging the user out of their interactive Claude sessions and the desktop app. Exempt it from _HERMES_PROVIDER_ENV_BLOCKLIST (it arrives via the anthropic registry entry, so discard explicitly with rationale). ANTHROPIC_API_KEY / ANTHROPIC_TOKEN and every other provider credential remain stripped, and the GHSA-rhgp-j443-p4rf fail-closed passthrough guard is unchanged for everything still on the blocklist. Fixes #55878	2026-07-02 02:13:30 -07:00
kshitijk4poor	b837f07dcd	fix(agent): route restore custom-pool match through canonical helper Follow-up on the salvaged #56392 guard. The cherry-picked change matched custom:<name> pool entries against the primary by raw base_url string equality, which (a) can't disambiguate two named custom providers sharing one gateway base_url and (b) left a latent bare-"custom" entry bypass. Route the match through get_custom_provider_pool_key(rt[base_url]) compared against the entry's custom:<name> key, mirroring the sibling guard in recover_with_credential_pool. Use CUSTOM_POOL_PREFIX instead of the literal. Add regression tests for the custom same-endpoint (swap) and cross-endpoint (skip) branches, plus the plain-provider fallback-pool case from #56885.	2026-07-02 13:41:53 +05:30
openhands	820a052575	fix(agent): keep primary runtime restore on matching credential pool (#56374)	2026-07-02 13:41:53 +05:30
Teknium	fb403a3a73	fix(auxiliary): retry transient blips harder + isolate client cache per model (#56889 ) Two related hardening fixes for auxiliary calls (which include MoA reference advisors — a pinned-model path where provider fallback is not a meaningful recovery): 1. Transient-transport retries: the same-provider retry on a connection reset / timeout / 5xx / 408 was a single attempt, then fallback. For a pinned aux call a second blip silently loses the call (root of the run2 double-advisor 'Connection error' collapse — a genuine upstream blip). Now retries N times with exponential backoff, N = auxiliary.transient_retries (default 2 -> 3 total attempts, clamped [0,6]). Compression-on-timeout fast-fail carve-out preserved. 2. Per-model client-cache isolation: _client_cache_key excluded the model, so two concurrent auxiliary calls to the same provider/base_url/key but different models (e.g. an opus + gpt-5.5 MoA fan-out) shared one cache entry and could race each other's client lifecycle. Model now participates in the key -> distinct clients, no cross-call races. Same-model reuse unchanged. - agent/auxiliary_client.py: _transient_retry_count() + backoff loop; model in _client_cache_key and both call sites. - hermes_cli/config.py: auxiliary.transient_retries default (2). - tests: new retry/isolation tests; updated 2 stale-expectation tests to the corrected behavior (per-model resolve; N-retry escalation). Backoff base is overridable (_TRANSIENT_RETRY_BACKOFF_BASE) so tests don't sleep.	2026-07-02 01:09:37 -07:00
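The retry policy described in fb403a3a73 (N retries with exponential backoff, N clamped to [0, 6], default 2) can be sketched as follows. `clamp_retries` and `call_with_transient_retries` are hypothetical names, the 0.5 s backoff base is an assumed value, and only `ConnectionError`/`TimeoutError` stand in for the transient conditions the commit lists.

```python
import time


def clamp_retries(value, default=2):
    """Coerce a configured retry count to an int in [0, 6]."""
    try:
        count = int(value)
    except (TypeError, ValueError):
        count = default
    return max(0, min(6, count))


def call_with_transient_retries(fn, retries, backoff_base=0.5, sleep=time.sleep):
    """Call fn(), retrying transient failures with exponential backoff.

    Total attempts are retries + 1; the last failure is re-raised.
    The sleep function is injectable so tests do not have to wait.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            if attempt >= retries:
                raise
            sleep(backoff_base * (2 ** attempt))
            attempt += 1
```

Injecting `sleep` plays the same role as the overridable backoff base mentioned in the commit: the retry path can be exercised without real delays.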
Nick Mason	80733413f9	fix(tools): don't drop a toolset from platform inference when a tool is registered into it _get_platform_tools reverse-maps a platform composite to configurable toolsets with an all-tools subset test. Because get_toolset() merges registry-registered tools into a toolset, a tool added to a toolset (delegate_cli -> delegation; desktop-only read_terminal -> terminal) that the static composite never listed made the subset test fail, silently dropping the entire toolset on api_server and other inference-based platforms. Compare the toolset's static membership at all three reverse-map sites. Fixes #49622. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-07-02 13:25:25 +05:30
Nick Mason	5317993a6d	fix(tools): expose static (pre-registry-merge) toolset view for platform inference Adds include_registry=True kwarg to resolve_toolset/get_toolset. When False, returns only the static TOOLSETS view with no registry-merged tools — the composite-authored membership platform reverse-mapping must compare against. Default True preserves all existing behavior; this is the enabling half of the api_server toolset-drop fix (#49622). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-07-02 13:25:25 +05:30
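The toolset-drop failure fixed by 80733413f9 and 5317993a6d comes down to which membership set the subset test uses. The sketch below is illustrative: the tables, `get_toolset_sketch`, and `infer_toolsets` are hypothetical, and the tool names are placeholders loosely based on the ones mentioned in the commits.

```python
# Static, composite-authored membership (illustrative data).
STATIC_TOOLSETS = {
    "delegation": {"delegate_task"},
    "terminal": {"terminal"},
}
# Tools registered into a toolset at runtime (illustrative data).
REGISTRY_EXTRAS = {"delegation": {"delegate_cli"}}


def get_toolset_sketch(name, include_registry=True):
    """Return a toolset's tools, optionally without registry-merged ones."""
    tools = set(STATIC_TOOLSETS[name])
    if include_registry:
        tools |= REGISTRY_EXTRAS.get(name, set())
    return tools


def infer_toolsets(composite_tools, include_registry):
    """Reverse-map a platform composite to the toolsets it fully contains."""
    return sorted(
        name
        for name in STATIC_TOOLSETS
        if get_toolset_sketch(name, include_registry) <= composite_tools
    )
```

With the merged view, the registered `delegate_cli` is not in the composite, so the whole `delegation` toolset fails the subset test; with the static view it passes.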
Ray	6a58badfdc	fix(browser): guard Camofox eval private pages Extends the browser private-network eval guard to the Camofox backend. On main, _browser_eval() returned early in Camofox mode before running the shared private-URL literal pre-scan and before re-checking the page URL after eval, leaving Camofox as a sibling backend that could execute browser_console(expression=...) against private/internal targets. - move the eval private-URL literal pre-scan before the Camofox early return - add a Camofox current-page private-URL probe via the evaluate endpoint - withhold Camofox eval results when the page is now private/internal Follow-up to browser private-network hardening in #56173, #56526, #56664. Salvage of #56764 by @rayjun (rayoo), cherry-picked to preserve authorship.	2026-07-02 13:10:30 +05:30
kshitijk4poor	f2b8a5d541	test(gateway): assert _record_gateway_session_peer fires only on the persisted split The fake _SessionStore tracked peer_records but no test read it, leaving #55300's peer-record behavior unasserted. Add a positive assertion on the persist path and negative (== []) assertions on the two stale/moved-binding skip paths, so the peer-record side effect is bound. Mutation-verified: removing the production _record_gateway_session_peer call makes the positive assertion fail. Co-authored-by: João Vitor Cunha <jvsantos.cunha@gmail.com>	2026-07-02 12:49:42 +05:30
kshitijk4poor	ed6f80a20c	test(gateway): align fake SessionStore with _record_gateway_session_peer The #55300 peer-recording call now fires on the failed-turn compression split path; the fake _SessionStore in test_compression_failure_session_sync (carried in with #55721's test changes) lacked that method. Add a call-tracking no-op so the combined salvage's tests pass. Co-authored-by: João Vitor Cunha <jvsantos.cunha@gmail.com>	2026-07-02 12:49:42 +05:30
r266-tech	2a04137322	fix(gateway): preserve platform + gateway_session_key on /compress temp agent Manual /compress built a temporary AIAgent without the originating platform / stable gateway session key, so an external context engine ingested the retained transcript tail as source=cli during /compress and again as the real platform on resume (duplicate cli,telegram rows). Pass platform=_platform_config_key(source.platform) + the in-scope gateway_session_key, mirroring the normal gateway turn. Assigned into runtime_kwargs (single-valued, authoritative) so they neither collide into a duplicate-kwarg TypeError nor lose to a stale resolver value. Fixes #50422.	2026-07-02 12:49:42 +05:30
Jake Present	00ec3b1884	fix(gateway): ignore stale compression session splits	2026-07-02 12:49:42 +05:30
João Vitor Cunha	d5b4879d4a	fix(gateway): preserve peer routing across compression recovery	2026-07-02 12:49:42 +05:30
Teknium	543d305bbb	feat(moa): add reference_max_tokens to cap advisor output and cut turn latency (#56756 ) MoA per-turn latency is dominated by advisor GENERATION: turn wall time correlates ~0.88 with output tokens and ~-0.03 with input tokens (measured over 52 turns). Each turn waits for the slowest advisor to finish writing, and advisors were uncapped — writing multi-thousand-token essays the aggregator only needs the gist of. Add an opt-in per-preset reference_max_tokens knob (mirrors reference_temperature) that caps ADVISOR output only; the acting aggregator is never capped. Default None = uncapped, so existing presets are byte-for-byte unchanged (no regression). Wired through both MoA execution paths (MoAChatCompletions.create and aggregate_moa_context). E2E: same task, closed preset uncapped vs reference_max_tokens=600 -> 59s to 33s (~44% faster), final answer identical/correct. - hermes_cli/moa_config.py: _coerce_int_or_none helper + reference_max_tokens in _normalize_preset/_default_preset/flattened view - agent/moa_loop.py: read preset.reference_max_tokens, pass to reference fan-out - agent/conversation_loop.py: pass reference_max_tokens on the per-turn path - tests + docs	2026-07-02 00:16:35 -07:00
Ben Barclay	9be39de0f2	fix(auth): make HERMES_PORTAL_BASE_URL/NOUS_PORTAL_BASE_URL bypass the Portal host allowlist (#56864) Ben caught that the initial approach (widening _NOUS_PORTAL_ALLOWED_HOSTS to include the staging host) was the wrong fix -- env vars are supposed to override the allowlist, mirroring how NOUS_INFERENCE_BASE_URL already bypasses _ALLOWED_NOUS_INFERENCE_HOSTS via _nous_inference_env_override(). The actual bug: both resolve_nous_access_token and resolve_nous_runtime_credentials read `_optional_base_url(state.get("portal_base_url")) or os.getenv(...) or ...` -- a plain `or` chain where the STORED state value wins first (short-circuits before the env vars are even read), and then whichever value won gets run through the same _NOUS_PORTAL_ALLOWED_HOSTS gate regardless of its source. So a hosted agent stamped with HERMES_PORTAL_BASE_URL=<staging> in its env AND a staging portal_base_url already persisted to auth.json would still get silently rewritten to prod on every refresh, because the env var never even got a chance to be consulted. Revert the previous _NOUS_PORTAL_ALLOWED_HOSTS widening entirely -- staying prod-only preserves the allowlist's actual job (rejecting an untrusted network-provided portal_base_url persisted to auth.json by a compromised Portal response). Add _nous_portal_env_override() (mirrors _nous_inference_env_override()) and restructure both call sites so the env override is checked FIRST and, when set, wins outright and skips the allowlist gate entirely -- the allowlist only ever runs against the fallback (stored-state-or-default) path now. Rewrote tests/hermes_cli/test_nous_portal_staging_allowlist.py to test the actual fix: the helper function, and an end-to-end resolve_nous_access_token proof that the env override wins even when state ALSO has the staging host stored (the exact incident shape), that it wins over a stored PROD host too, and that the allowlist's heal-to-prod behaviour for an untrusted stored value is preserved when no override is set.	2026-07-02 06:52:46 +00:00
kshitij	88d1d6206f	fix(streaming): handle completed responses with empty/None choices (#55933) (#56713) * fix(streaming): handle completed responses with empty/None choices The streaming fallback guard added in #55932 recognized a completed response object only when its `choices` was a non-empty list. But an adapter can return a completed response whose `choices` is `None` or an empty list (an error / content-filter / terminal frame); that is still a whole, non-iterable response, not a token stream. Those shapes fell through to `for chunk in stream` and crashed with 'types.SimpleNamespace' object is not iterable, which is exactly issue #55933 (MoA `openai-codex` aggregator on TUI/Desktop, where a stream consumer forces the streaming path). Broaden the guard to discriminate on the PRESENCE of a `choices` attribute (a genuine provider Stream object exposes none), disable streaming for the session, and return the completed object so the outer loop's normal invalid-response validation handles empty/None choices via its retry path instead of iterating. Based on the diagnosis in #56525 by @spiky02plateau (that PR normalized the MoA aggregator return with a one-shot chunk iterator; the common text/tool-call crash was already fixed at this seam by #55932, so this extends the existing guard to cover only the remaining empty/None-choices gap). Fixes #55933 * refactor(streaming): simplify empty-choices guard body and parametrize tests Post-review cleanup (no behavior change): - Inline the single-use `response_choices` local and drop the redundant `if first_choice is not None else None` guard (getattr(None, ...) already returns the default safely). - Collapse the two near-identical empty/None-choices regression tests into one `@pytest.mark.parametrize` case. Mutation-verified: reverting the guard to the old non-empty-list condition still makes both parametrized cases fail with the historical 'types.SimpleNamespace' object is not iterable.
--------- Co-authored-by: spiky02plateau <155588579+spiky02plateau@users.noreply.github.com>	2026-07-02 06:36:20 +05:30
kshitijk4poor	76be770091	test(moa): assert aux cap against model resolver, not frozen literal Follow-up to the salvaged fix: the regression test asserted a frozen max_tokens == 128_000 literal, coupling it to the Opus-4-8 model table. Assert against _get_anthropic_max_output("claude-opus-4-8") plus > 2000 instead, so the test survives model-table churn while still catching a regression to the old `or 2000` fallback.	2026-07-02 06:31:18 +05:30
helix4u	7951250947	fix(moa): lift hidden Anthropic aux output cap	2026-07-02 06:31:18 +05:30
kshitij	4d5d9fffd0	Merge pull request #56582 from srojk34/fix/vertex-credentials-env-leak security(terminal): strip VERTEX_CREDENTIALS_PATH/GOOGLE_APPLICATION_CREDENTIALS from subprocess env	2026-07-02 06:08:55 +05:30
srojk34	7f64cce96d	security(vertex): route credential/project/region resolution through the profile secret scope agent/vertex_adapter.py resolved VERTEX_CREDENTIALS_PATH, GOOGLE_APPLICATION_CREDENTIALS, VERTEX_PROJECT_ID, and VERTEX_REGION via raw os.environ.get() instead of the profile-scoped get_secret() every other credential lookup in hermes_cli/runtime_provider.py uses. In a multiplex gateway serving several profiles from one process, os.environ still holds whichever profile's .env python-dotenv loaded at boot — so a raw read here let one profile's turn silently mint a Vertex OAuth2 token from, and get billed against, a different profile's GCP service account. No error, no fail-closed guard: the multiplex UnscopedSecretError protection was bypassed entirely because these reads never went through get_secret(). - _resolve_credentials_path/_resolve_project_override/_resolve_region now call agent.secret_scope.get_secret(), matching the _getenv() pattern already used for every other provider's credentials. - get_vertex_credentials()'s ADC fallback (google.auth.default()) reads GOOGLE_APPLICATION_CREDENTIALS from os.environ internally, bypassing get_secret() entirely — closed with a narrow guard: when multiplexing is active and this profile's scope has no Vertex credentials of its own, but os.environ still carries a value (left by a different profile's boot-time dotenv load), refuse ADC rather than silently authenticate as a stranger. - Zero behavior change for single-profile installs: get_secret() falls through to os.environ transparently whenever multiplexing is off. Same bug class as the already-fixed _HERMES_OAUTH_FILE/_AUTH_JSON_PATH/ HOOKS_DIR cross-profile leaks, now closed for Vertex's OAuth2 credential path.	2026-07-02 06:07:56 +05:30
kshitij	2f7c51a3e2	Merge pull request #56605 from simpolism/codex/discord-inline-bot-mentions fix(discord): ignore reply-ping-only mentions for bot-authored messages	2026-07-02 05:23:44 +05:30
dsad	830860306d	Guard browser CDP on private pages	2026-07-02 05:23:23 +05:30
kshitijk4poor	676236bb1d	fix(agent): honor custom CA certs on aux client + harden TLS resolution The salvaged fix wired per-provider ssl_ca_cert / ssl_verify (and HERMES_CA_BUNDLE) into the MAIN OpenAI client. This follow-up: - Auxiliary client parity: process_bootstrap.build_keepalive_http_client accepts and forwards verify; auxiliary_client._resolve_aux_verify mirrors the main-client TLS resolution (via load_config_readonly, the read-only fast path) so compression/vision/web_extract/title-gen/session_search honor the same per-provider CA. Without this, chat worked against a private-CA endpoint but every auxiliary call still failed APIConnectionError. - switch_model now reads custom_providers from live config (load_config_readonly) instead of the init-time agent._custom_providers snapshot, so ssl_ca_cert / ssl_verify edits are honored on mid-session model switch — matching the context-length reload (#15779). - Drop the dead client-level verify= where a custom httpx transport is used (httpx ignores it there); verify lives on the transport. Fix docstrings. Applies to both run_agent._build_keepalive_http_client and process_bootstrap. - resolve_httpx_verify: add CURL_CA_BUNDLE to the env chain (consistency with agent/ssl_guard._CA_BUNDLE_ENV_VARS) and emit a loud logger.warning naming the endpoint whenever ssl_verify:false disables verification. - get_custom_provider_tls_settings: case-insensitive base_url match (config dedup already lowercases; scheme/host are case-insensitive) so a mixed-case entry doesn't silently drop its CA. Exact match preserved — no prefix bypass. - Demote best-effort except Exception: pass in agent_init/switch_model to logger.debug(exc_info=True). - Tests for aux verify forwarding, _resolve_aux_verify, case-insensitive match, and prefix-bypass rejection.	2026-07-02 04:51:56 +05:30
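
The precedence fix described in commit 9be39de0f2 (env override checked first, allowlist applied only to the stored-state-or-default fallback) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the host names, the default URL, the function names, and the relative order of the two env vars are all hypothetical stand-ins.

```python
from urllib.parse import urlparse

# Hypothetical stand-ins; the real allowlist and default are not shown in the log.
ALLOWED_PORTAL_HOSTS = {"portal.example.com"}
DEFAULT_PORTAL_URL = "https://portal.example.com"
# Assumed lookup order between the two variables named in the commit.
OVERRIDE_ENV_VARS = ("HERMES_PORTAL_BASE_URL", "NOUS_PORTAL_BASE_URL")


def portal_env_override(env):
    """Return the first non-empty override found in env, else None."""
    for name in OVERRIDE_ENV_VARS:
        value = (env.get(name) or "").strip()
        if value:
            return value.rstrip("/")
    return None


def resolve_portal_base_url(state, env):
    """Pick the portal URL: env override first, gated stored value second."""
    # 1. An operator-set env var wins outright and skips the allowlist.
    override = portal_env_override(env)
    if override:
        return override
    # 2. Fallback path: the stored value is untrusted, so gate it by host.
    stored = (state.get("portal_base_url") or "").strip().rstrip("/")
    if stored and urlparse(stored).hostname in ALLOWED_PORTAL_HOSTS:
        return stored
    # 3. A missing or non-allowlisted stored host heals back to the default.
    return DEFAULT_PORTAL_URL
```

Compared with a plain `stored or env or default` chain, the stored value can no longer short-circuit the env vars, and the allowlist keeps its job of rejecting an untrusted persisted URL when no override is set.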

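The guard from commit 88d1d6206f, which discriminates on the presence of a `choices` attribute and no longer requires a non-empty list, might look like the sketch below. The function name and return shape are hypothetical, and the sketch omits the "disable streaming for the session" step the commit also mentions.

```python
from types import SimpleNamespace


def consume_stream(stream_or_response):
    """Return ("completed", obj) for a whole response, else ("streamed", text)."""
    # A completed response exposes `choices`, even when it is None or [].
    # Assumption taken from the commit message: a real provider stream has none.
    if hasattr(stream_or_response, "choices"):
        # Hand the object back so the caller's validation/retry path sees it.
        return "completed", stream_or_response
    parts = []
    for chunk in stream_or_response:
        choices = getattr(chunk, "choices", None) or []
        first = choices[0] if choices else None
        # getattr(None, ...) safely returns the default, so no extra None check.
        delta = getattr(first, "delta", None)
        parts.append(getattr(delta, "content", None) or "")
    return "streamed", "".join(parts)
```

With the older non-empty-list condition, `SimpleNamespace(choices=None)` would reach the `for` loop and raise the "object is not iterable" error quoted in the commit message.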
6997 commits