hermes-agent

Author	SHA1	Message	Date
snav	35eb93c8df	fix(codex-runtime): re-running /codex-runtime codex_app_server when already enabled now triggers migration The /codex-runtime slash command short-circuits with "openai_runtime already set" when invoked with the same value as the current config, and crucially skips the entire migration block below. The check conflates two things: (a) "the config value is correct" and (b) "the world state (managed block in ~/.codex/config.toml, hermes-tools MCP callback, plugin discovery) is converged". Common footgun this exposes: a user who pre-sets `model.openai_runtime: codex_app_server` directly in config.yaml (reasonable thing to do) and then runs /codex-runtime codex_app_server to trigger migration sees "already set" and silently gets no migration. ~/.codex/config.toml never receives the managed block, the hermes-tools MCP callback never registers, and codex falls through to its default runtime instead of the app-server one — visibly successful but functionally partial setup. The migration is idempotent by design (it replaces its own managed block in place between MIGRATION_MARKER and MIGRATION_END_MARKER), so re-running it is safe and cheap. Fix the short-circuit to fall through to migration when re-applying codex_app_server while skipping the config persist (no value-level change needed). The disable case (re-applying "auto") still short-circuits because disabling doesn't touch ~/.codex/config.toml at all. The user-visible message changes to "openai_runtime already set to codex_app_server — re-applying migration" so re-runs surface what happened. 
Regression test (test_reapply_codex_app_server_runs_migration) asserts: - migrate() was called when re-applying - persist_callback was NOT called (no config write on no-op transitions) - migration output (MCP servers, sandbox default) surfaces in the user-visible message - requires_new_session is True so callers know to /reset Verified RED→GREEN: the test fails on origin/main with "migration must run on reapply, not just first enable" and passes with this fix. Full test_codex_runtime_switch.py suite: 31 passed.	2026-07-01 23:51:54 +05:30
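The idempotent "replace the managed block in place" behaviour described in this commit can be sketched as follows. This is a minimal illustration, not the project's migration code: the marker strings and the `apply_managed_block` helper are hypothetical stand-ins, since the log names MIGRATION_MARKER and MIGRATION_END_MARKER but does not show their values.

```python
import re

# Hypothetical marker values; the real constants are not shown in the log.
MIGRATION_MARKER = "# >>> hermes managed block >>>"
MIGRATION_END_MARKER = "# <<< hermes managed block <<<"


def apply_managed_block(config_text: str, body: str) -> str:
    """Insert or replace the managed block; re-running with the same body is a no-op."""
    block = f"{MIGRATION_MARKER}\n{body.rstrip()}\n{MIGRATION_END_MARKER}\n"
    pattern = re.compile(
        re.escape(MIGRATION_MARKER) + r".*?" + re.escape(MIGRATION_END_MARKER) + r"\n?",
        re.DOTALL,
    )
    if pattern.search(config_text):
        # A function replacement avoids backslash-escape handling in the new block.
        return pattern.sub(lambda _match: block, config_text, count=1)
    separator = "" if not config_text or config_text.endswith("\n") else "\n"
    return config_text + separator + block


once = apply_managed_block('model = "x"\n', "a = 1")
twice = apply_managed_block(once, "a = 1")
updated = apply_managed_block(once, "a = 2")
```

Because the second call rewrites only the text between the markers, applying the same body twice leaves the file unchanged, which is what makes a re-run safe.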
Teknium	eae3700b16	fix(moa): raise aux timeouts to 900s and give the Codex aux path a stable prompt_cache_key (#56395) Two independent MoA auxiliary-call fixes: #53866: auxiliary.moa_reference.timeout and auxiliary.moa_aggregator.timeout were 600s while moa_agent was 120s. Raise both to 900s so a genuinely long reference/aggregator turn (mixed providers, deep reasoning, long tool chains) has headroom instead of being cut mid-generation. #53735: _CodexCompletionsAdapter (the Codex/Responses auxiliary path used by the MoA acting-aggregator, compression, web_extract, session_search, etc.) never set prompt_cache_key, so it stayed cache-cold while the MAIN Responses transport (agent/transports/codex.py) was warm. Derive the same content-addressed key via the shared _content_cache_key(instructions, tools) helper and set it on the aux Responses request, with the same host guards the main transport uses (xAI carries the key in extra_body; GitHub/Copilot opts out of cache-key routing). Tests: 5 new prompt_cache_key cases (set+prefixed, stable across identical prefix, differs on different instructions, skipped for xai/github hosts). tests/agent/test_auxiliary_client.py 279 pass; tests/hermes_cli/test_config.py 130 pass.	2026-07-01 06:02:40 -07:00
teknium1	3f6c6bd29e	fix(vertex): surface Vertex on the desktop Keys tab for provider parity The provider-parity contract (tests/hermes_cli/test_provider_parity.py) requires every hermes model provider to be configurable in the desktop Providers tabs. Vertex authenticates via OAuth2 (service-account JSON / ADC) and has no api_key_env_vars, so — like bedrock's aws_sdk — it needs its credential env var tagged to the provider card explicitly. Tag VERTEX_CREDENTIALS_PATH to the vertex card in _catalog_provider_env_metadata().	2026-07-01 05:25:33 -07:00
Steve Lawton	c73e74386b	feat(vertex): add Google Vertex AI provider for Gemini (OAuth2) Adds Vertex AI as a first-class provider for Gemini models via Vertex's OpenAI-compatible endpoint. Vertex authenticates with short-lived OAuth2 access tokens (service-account JSON or ADC), not a static API key — the missing piece behind the recurring requests (#13484, #12639, #56259). - agent/vertex_adapter.py: OAuth2 token minting + refresh-on-expiry (5-min margin), ADC->service-account fallback, global vs regional endpoint URLs. Config precedence: env var > config.yaml > default. - plugins/model-providers/vertex/: provider profile (auth_type=vertex), reuses Gemini's extra_body.google.thinking_config translation. - runtime_provider: vertex short-circuit BEFORE the credential pool so a credentials-file path is never mistaken for a static API key; mints a fresh token + computes base_url per resolve. - run_agent + conversation_loop: _try_refresh_vertex_client_credentials() re-mints the token and rebuilds the client on a mid-session 401, so a long-lived gateway agent survives token expiry (~1h). - auxiliary_client: vertex auth_type branch for side-LLM tasks. - config.yaml: vertex.project_id / vertex.region (non-secret, bridged to env); credential path stays in .env (VERTEX_CREDENTIALS_PATH). - setup wizard + model picker: dedicated _model_flow_vertex; curated google/gemini-* model list; --provider choices. - pricing/metadata: Vertex prices off the gemini docs snapshot; endpoint host auto-maps to the vertex provider (no probe spam). - lazy_deps + pyproject [vertex] extra: google-auth, opt-in only. - docs: guides/google-vertex.md + providers page; tests for adapter + runtime resolution. Salvages and modernizes #8427 by @slawt onto current main: rewired from the legacy PROVIDER_REGISTRY path to the provider-profile architecture, moved non-secret config out of .env into config.yaml, and added the per-turn 401 token-refresh the original lacked.	2026-07-01 05:25:33 -07:00
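The "refresh-on-expiry (5-min margin)" rule from this commit can be illustrated with a short sketch. The `CachedToken` type and `needs_refresh` function are hypothetical names chosen for the example; the adapter's actual structure is not shown in the log.

```python
import time
from dataclasses import dataclass
from typing import Optional

REFRESH_MARGIN_SECONDS = 5 * 60  # the 5-minute margin mentioned in the commit


@dataclass
class CachedToken:
    value: str
    expires_at: float  # epoch seconds


def needs_refresh(token: Optional[CachedToken], now: Optional[float] = None) -> bool:
    """Return True when there is no token or it expires within the margin."""
    if token is None:
        return True
    current = time.time() if now is None else now
    return current >= token.expires_at - REFRESH_MARGIN_SECONDS


token = CachedToken(value="example-token", expires_at=1000.0 + 3600.0)
fresh = needs_refresh(token, now=1000.0)
near_expiry = needs_refresh(token, now=1000.0 + 3600.0 - 299.0)
```

Refreshing slightly early means a request started just before expiry is unlikely to carry a token that lapses in flight.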
HODLCLONE	70f8b96d17	fix: preserve Nous runtime auth path label	2026-07-01 05:06:00 -07:00
HODLCLONE	6ed2f5d76f	fix: make Nous Portal access token resolution resilient - Track auth store source path on Nous state reads and write rotated OAuth refresh tokens back to the same store, preventing stale-token replays when Hermes falls back to a global/root auth.json. - Skip Nous fallback entries locally when no access/refresh token is present, suppressing repeated failed resolution attempts within a session. - Sync session model metadata after fallback switches so the gateway DB reflects the backend that actually served the latest turn.	2026-07-01 05:06:00 -07:00
yongjin	a0beb52a50	fix(browser): harden browser tool safety boundaries Add policy gates and output redaction for browser/CDP surfaces, strengthen session ownership tracking, and block credential-like query parameters before third-party browser/web backends receive URLs. Inspired by the agbrowse review: keep local browser magic-link flows possible while preventing cloud reader/browser escalation from receiving opaque token, code, signature, or key query parameters.	2026-07-01 05:04:41 -07:00
teknium1	34de127200	fix(auth): widen portal_base_url allowlist guard to runtime credential path The salvaged PR guarded only resolve_nous_access_token; the primary resolve_nous_runtime_credentials path also POSTs the refresh token to portal_base_url on refresh with no allowlist check. Mirror the guard there so a poisoned host can't receive the bearer, and drop the stray duplicated allowlist comment. Adds a sibling-site regression test.	2026-07-01 04:57:40 -07:00
szzhoujiarui	f3c5327e67	fix(auth): validate portal_base_url and migrate stale api.nousresearch.com (#44710)	2026-07-01 04:57:40 -07:00
Jack Earnest	9138176dcd	fix(gateway): don't resolve node symlink into profile dir generate_systemd_unit() and generate_launchd_plist() used Path(shutil.which('node')).resolve().parent to find the node bin dir. When ~/.local/bin/node is a symlink into a specific profile's node install (e.g. ~/.hermes/profiles/<p>/node/bin/node), .resolve() chases it and bakes that one profile's path into EVERY profile's service definition. This breaks profile isolation and makes systemd_unit_is_current() perpetually False: each gateway rewrites its unit + daemon-reload on every boot, destabilizing multi-profile setups into a ~5-minute restart loop (observed NRestarts ~1600 across two gateways). Fix: use Path(resolved_node).parent — the directory where node is found on PATH — instead of chasing the symlink to its resolved target. This keeps generated service definitions profile-agnostic. Affects both the systemd (Linux) and launchd (macOS) unit generators.	2026-07-01 04:57:21 -07:00
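The difference between the two path computations in this commit is easy to reproduce with a temporary directory. The layout below is invented for the demonstration, and the helper name `node_bin_dirs` is hypothetical; only `Path.resolve()` and `Path.parent` come from the commit text.

```python
import os
import tempfile
from pathlib import Path


def node_bin_dirs(node_on_path: str) -> tuple[Path, Path]:
    """Return (directory where node was found, directory after chasing symlinks)."""
    found = Path(node_on_path)
    return found.parent, found.resolve().parent


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    # Stand-in for a profile-specific node install.
    profile_bin = root / "profiles" / "p1" / "node" / "bin"
    profile_bin.mkdir(parents=True)
    (profile_bin / "node").write_text("#!/bin/sh\n")
    # Stand-in for ~/.local/bin/node, a symlink into that profile.
    local_bin = root / "local" / "bin"
    local_bin.mkdir(parents=True)
    os.symlink(profile_bin / "node", local_bin / "node")

    path_dir, resolved_dir = node_bin_dirs(str(local_bin / "node"))
```

Here `path_dir` stays at the shared `local/bin` directory, while `resolved_dir` points into the single profile, which is the value that leaked into every profile's service definition.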
srojk34	a76aa6198c	fix(cli): flush un-persisted messages before /resume and /branch end the old session compress_context() and /new already flush un-persisted messages before calling end_session() (fixed in #47202), but /resume and /branch still call end_session() directly. When a turn is interrupted mid-flight and the user immediately runs /resume or /branch, messages generated during that turn have not yet been written to state.db and are silently lost on session rotation. Add the same best-effort _flush_messages_to_session_db() call before end_session() in both _handle_resume_command and _handle_branch_command, mirroring the pattern established in cli.py:new_session(). Regression tests verify the flush is called when an agent is present.	2026-07-01 17:08:55 +05:30
kshitijk4poor	fb7a38ad21	fix(macos): compose launchd reload retry with _launchctl_bootstrap + drain-aware window Reworks @valenteff's #53277 fix per review (Teknium's 3 findings): - Route refresh_launchd_plist_if_needed's bootstrap through the existing _launchctl_bootstrap() EIO-recovery helper (canonical since #56256), wrapped in a wall-clock retry loop, instead of an ad-hoc 5x2s loop. - Window sized to agent.restart_drain_timeout (default 180s), not a fixed ~10s: the failure happens while the old gateway is still draining (finding 1). - Retry on subprocess.TimeoutExpired too, not just CalledProcessError — a bootstrap timeout after bootout otherwise escapes and leaves the service unloaded (finding 2). - Confirm success with launchctl list, not a bare bootstrap exit 0 (finding 3); mirror verify+drain-window in the detached-helper bash path. - Shared helpers _launchd_reload_log_path / _append_launchd_reload_log / _launchctl_label_registered / _retry_launchctl_bootstrap_until_registered. 3 new tests cover retry-until-listed, TimeoutExpired-retried, deadline-exhaust. E2E: real reload log + mocked launchctl — retries CalledProcessError+TimeoutExpired, verifies via launchctl list, logs failures.	2026-07-01 16:56:14 +05:30
Fabio Fernandes Valente	7a7d19e73b	fix(macos): retry launchd reload on transient bootstrap failure refresh_launchd_plist_if_needed ran `launchctl bootout` then `launchctl bootstrap` with errors silenced (`2>/dev/null` in the detached helper, `check=False` in the direct subprocess path). Under high load or a launchd race, the bootout succeeds — removing the service from launchd — but the follow-up bootstrap fails silently. The service stays unregistered; KeepAlive can't revive a service launchd no longer knows about, so the gateway stays dark until a manual `launchctl bootstrap`. Observed incident (2026-06-26): `/restart` in chat triggered a planned drain; during the drain a separate call re-triggered the plist refresh, which bootout'd the live service. Under loadavg 9.48 the bootstrap failed silently — 2h35min offline until manual recovery. Fix: retry the bootstrap up to 5 times with 2s back-off, verify with `launchctl list <label>` afterwards, and log failures to ~/.hermes/logs/launchd-reload.log so the health watchdog can detect a persistent orphan. Mirrors the contract across both the detached helper (refresh inside gateway tree) and the direct subprocess path (refresh from external CLI). Existing tests pass: - test_refresh_defers_reload_when_running_inside_gateway_tree - test_refresh_uses_direct_reload_when_not_inside_gateway_tree Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-07-01 16:56:14 +05:30
Teknium	81595cd588	fix(dashboard): run plugin gate after auth + enable example fixture Follow-up on the salvaged #47491 commits: - Register _plugin_api_runtime_gate BEFORE the auth middlewares so it executes AFTER them, and add an explicit auth check: unauthenticated requests to /api/plugins/<name>/ fall through to auth's 401 instead of this gate's 404. Prevents the gate from becoming a plugin-name oracle (an unauthenticated caller could otherwise fingerprint installed/enabled plugins by status code). Keeps test_non_kanban_plugin_route_requires_auth green. - Enable the 'example' user plugin in the _install_example_plugin test fixture so the auth / static-asset-allowlist tests still reach the real serving paths now that user plugins are gated on plugins.enabled. - Mark the runtime-gate unit-test scopes as authenticated so they exercise the enabled/disabled policy under the new auth-first ordering.	2026-07-01 04:05:15 -07:00
manusjs	b2e0086f1b	fix(dashboard): enforce plugin disabled gate at request time and for bundled assets Address two residual bypasses identified in review: 1. Add _plugin_api_runtime_gate middleware that checks plugins.enabled/ plugins.disabled on every request to /api/plugins/{name}/... routes. Previously, disabling a plugin at runtime had no effect on its already- mounted API routes until a restart. 2. Extend serve_plugin_asset to check plugins.disabled for bundled plugins. Previously, only user plugins were gated — a bundled plugin in plugins.disabled would still serve assets from the unauthenticated /dashboard-plugins/{name}/... endpoint. Both fixes ensure the enabled/disabled policy is evaluated live at request time, not just at startup. Adds regression tests covering: - Middleware blocks disabled user plugin API routes (404) - Middleware blocks user plugin removed from enabled set (404) - Middleware passes enabled user plugin API routes - Middleware blocks disabled bundled plugin API routes (404) - Bundled plugin assets return 404 when disabled - Bundled plugin assets served normally when not disabled - User plugin asset gating still works correctly	2026-07-01 04:05:15 -07:00
manusjs	7cff95644d	fix(dashboard): gate plugin asset serving and API mount on plugins.enabled User-installed dashboard plugins had their assets served and Python backend code imported without checking the plugins.enabled allowlist. This meant a plugin installed in the plugins directory but not enabled could still execute code at dashboard startup and serve arbitrary files. Changes: - get_dashboard_plugins API: filter out user plugins not in enabled set - serve_plugin_asset: reject requests for disabled/non-enabled user plugins - _mount_plugin_api_routes: skip Python import for non-enabled user plugins - Bundled plugins still load by default but respect explicit disables Fixes #46435	2026-07-01 04:05:15 -07:00
HiaHia	8feeb0ccb8	fix(gateway): retry launchd bootstrap after bootout on EIO for install/start On macOS, `launchctl bootstrap` of a label still registered in the domain fails with 5: Input/output error (EIO). That is the already loaded case — a stale registration from an interrupted restart or a bootout that didn't settle — recoverable by booting the leftover out and bootstrapping again, and distinct from the domain being genuinely unmanageable. launchd_install and launchd_start (both bootstrap paths) treated exit 5 as 'launchd cannot manage this macOS version' and silently degraded to a detached process, losing auto-start at login and crash-restart. Centralize bootstrap in _launchctl_bootstrap(), which on EIO boots the stale label out and retries once; only if the retry also fails does the error propagate so callers apply their existing _launchctl_domain_unsupported fallback for a genuinely broken domain. launchd_restart already boots out before bootstrapping (its drained job is almost always still registered, so a plain bootstrap would hit EIO on the common path), so it keeps its explicit pre-bootout rather than routing through the bootstrap-first helper. Corrected the stale exit-5 comment that claimed it always meant an unmanageable domain. Adds TestLaunchctlBootstrapEioRetry covering clean bootstrap (no bootout), EIO -> bootout -> retry success, persistent EIO re-raise, and non-EIO re-raise without a spurious bootout.	2026-07-01 03:21:20 -07:00
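The bootstrap, bootout, retry-once flow described above can be sketched with injected callables so that it runs without launchctl. This is an assumption-laden illustration: `bootstrap_with_eio_retry` and the fake callables are hypothetical, and only the exit status 5 and the retry-once policy are taken from the commit message.

```python
import subprocess
from typing import Callable

EIO_EXIT_CODE = 5  # exit status the commit reports for "Input/output error"


def bootstrap_with_eio_retry(
    bootstrap: Callable[[], None], bootout: Callable[[], None]
) -> None:
    """Try bootstrap; on EIO, boot the stale label out and retry exactly once."""
    try:
        bootstrap()
        return
    except subprocess.CalledProcessError as exc:
        if exc.returncode != EIO_EXIT_CODE:
            raise  # non-EIO failures propagate without a spurious bootout
    bootout()
    bootstrap()  # a second failure propagates to the caller's fallback


calls = []


def flaky_bootstrap() -> None:
    calls.append("bootstrap")
    if calls.count("bootstrap") == 1:
        raise subprocess.CalledProcessError(EIO_EXIT_CODE, ["launchctl", "bootstrap"])


def fake_bootout() -> None:
    calls.append("bootout")


bootstrap_with_eio_retry(flaky_bootstrap, fake_bootout)
```

Keeping the retry to a single attempt preserves the caller's existing fallback for a domain that is genuinely unmanageable.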
teknium1	b48cacb97b	fix(gateway,cron): guard cron model-tool path + add auto-resume loop breaker (#30719) Completes the #30719 restart-loop defenses. Defenses 1-2 (the _HERMES_GATEWAY guard on `hermes gateway stop|restart` + terminal_tool, and the cron-creation lifecycle filter) already landed on main, but two gaps remained: - The agent's `cronjob` model tool calls cron.jobs.create_job directly, bypassing the hermes_cli.cron.cron_create CLI filter, so lifecycle commands scheduled via the model tool were only blocked at execution time (terminal_tool), not at creation. Moved the filter to a shared cron/lifecycle_guard.py enforced at create_job, the single chokepoint every job-creation path hits (CLI + model tool). Re-exported _contains_gateway_lifecycle_command from hermes_cli.cron so terminal_tool's import keeps working. - No breaker for the auto-resume loop itself. Defenses 1-2 cover the cron/CLI/terminal paths, but any other SIGTERM source (e.g. a raw terminal("launchctl kickstart ai.hermes.gateway")) still triggers the boot->auto-resume->re-run cycle. Added gateway/restart_loop_guard.py: counts restart-interrupted boots in a rolling window (config gateway.restart_loop_guard, default 3 boots / 60s) and skips auto-resume for that boot once tripped. The gateway still comes up and serves real inbound messages; it just stops replaying the session that keeps killing it, putting a human back in the loop. Also tightened the lifecycle regex over main's version: dropped `hermes gateway start` (benign), required the gateway identifier on the launchctl/systemctl branches (so `launchctl unload ai.hermes.update-checker.plist` and `systemctl restart hermes-meta.service` no longer false-positive), added the inverse pkill token order, and fixed the binary-script bypass (decode with errors='replace' instead of swallowing UnicodeDecodeError). 
The create_job guard resolves relative script paths under HERMES_HOME/scripts the same way the scheduler does, so a bare script name is scanned as the file that actually runs. Design and much of defense-2 originate from PR #33395 (@kshitijk4poor), which itself salvaged #30728 (@SimoKiihamaki). Rebuilt against current main since defenses 1-2 had already landed under different names. Closes #30719. Co-authored-by: SimoKiihamaki <simo.kiihamaki@gmail.com> Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-07-01 02:48:36 -07:00
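A rolling-window breaker of the kind this commit describes might look like the sketch below. It is an in-memory illustration only: the class name and method are hypothetical, a real guard would have to persist boot timestamps across process restarts, and tripping on the third boot inside the window is an assumed reading of "3 boots / 60s".

```python
from collections import deque


class RestartLoopGuard:
    """Count restart-interrupted boots in a rolling window (in-memory sketch)."""

    def __init__(self, max_boots: int = 3, window_seconds: float = 60.0) -> None:
        self.max_boots = max_boots
        self.window_seconds = window_seconds
        self._boots = deque()

    def record_interrupted_boot(self, now: float) -> bool:
        """Record a boot at time `now`; return True if auto-resume should be skipped."""
        self._boots.append(now)
        # Drop boots that have fallen out of the rolling window.
        while self._boots and now - self._boots[0] > self.window_seconds:
            self._boots.popleft()
        return len(self._boots) >= self.max_boots


guard = RestartLoopGuard()
results = [guard.record_interrupted_boot(t) for t in (0.0, 10.0, 20.0)]
```

The guard only decides whether to replay the interrupted session; in the commit's design the gateway itself still starts and serves inbound messages.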
Teknium	da6d5fcd13	fix(auth): serialize Codex OAuth pool refresh under the auth-store lock (#56233) The credential-pool Codex refresh path synced tokens from auth.json and then POSTed the refresh_token to OpenAI's token endpoint without holding the cross-process auth-store lock across the whole read->POST->write-back sequence. Because Codex refresh tokens are single-use, two concurrent Hermes processes could both adopt the same on-disk token and both POST it; the loser got refresh_token_reused / invalid_grant. Wrap the Codex OAuth branch of _refresh_entry in the existing shared _auth_store_lock (reentrant, cross-process flock) using the same extended-timeout pattern resolve_codex_runtime_credentials() already uses. A waiting process now blocks on the lock and, once inside, the in-lock re-sync picks up the rotated token the winner persisted and skips its own POST. Also send User-Agent: hermes-cli/<version> on the refresh request. Credit @cooper-oai (#34820) for identifying the concurrent-refresh reuse race; this ships the narrow lock-serialization fix without the separate Codex auth-store partition.	2026-07-01 02:45:07 -07:00
xxxigm	a344c92050	fix(provider): route api.anthropic.com to anthropic_messages api_mode (#32243) `_detect_api_mode_for_url` previously returned `None` for the bare `api.anthropic.com` host, causing every URL-fallback path (custom_providers, direct-alias, the api-key fallback inside `resolve_runtime_provider`) to default to `chat_completions` for native Anthropic, which routes requests to the OpenAI-compat `/chat/completions` shim instead of the native `/v1/messages` endpoint. Pro/Max OAuth subscriptions are only billed against the native Messages API; the shim bills against a separate "extra usage" pool that is empty by default, so a freshly authorized Pro/Max credential 400s with "You're out of extra usage" the moment it's used, even on an account that has consumed nothing for the current cycle. Brings the helper in line with `hermes_cli.providers.determine_api_mode` which already mapped `api.anthropic.com` to `anthropic_messages`.	2026-07-01 02:18:56 -07:00
Harish Kukreja	01bf61c865	fix(runtime): honor NOUS_INFERENCE_BASE_URL across pool/explicit/aux paths Upstream #52270 added `_nous_inference_env_override()` but wired it into only `resolve_nous_runtime_credentials`. Three sibling resolution paths still ignored the override, so a self-hosted Nous inference endpoint set via `NOUS_INFERENCE_BASE_URL` was silently dropped whenever credentials arrived through any of them: - the credential-pool path (`_resolve_runtime_from_pool_entry`) - the explicit-provider path (`_resolve_explicit_runtime`) - the auxiliary side-LLM client (`_pool_runtime_base_url`) Route all three through the same auth-layer reader so every `NOUS_INFERENCE_BASE_URL` read shares one normalization path (trailing-slash stripping, blank -> empty) and the documented trusted-bypass intent stays in one place. The override is live-only: it wins for the base URL returned this run but is never persisted to auth.json or the credential pool, so an ephemeral dev/staging value cannot poison durable auth state. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-01 01:52:06 -07:00
kernel-t1	b944c6e821	fix(cli): stop .env sanitizer from splitting secrets that embed a known KEY= ## What does this PR do? A single, perfectly valid `.env` line was being silently corrupted on read and write. When a secret's value happened to contain a known Hermes env var name followed by `=` — for example a webhook or proxy base URL carrying a query parameter like `OPENAI_BASE_URL=https://proxy.example.com/v1?TAVILY_API_KEY=sk-...` — `_sanitize_env_lines()` treated the embedded `KEY=` as a second entry. It truncated the real secret at the inner match and fabricated a bogus second variable. A related path silently dropped any text before the first matched key. Because this runs on every `load_env()`, `save_env_value()`, `remove_env_value()` and `sanitize_env_file()`, the damage was written back to `~/.hermes/.env` and re-applied on every read — persistent loss/corruption of the canonical secrets store. The concatenation splitter now only acts when the line actually begins with a known `KEY=` (so leading text is never dropped) and when every value that precedes a boundary is a plain token. If a preceding value looks structured — a URL/query string (`://`, `?`, `&`) or contains whitespace — the embedded `KEY=` is understood to be part of that value, and the line is kept verbatim. Genuine concatenations of plain-token secrets still split as before. ## Related Issue N/A ## Type of Change - [x] 🐛 Bug fix (non-breaking change that fixes an issue) ## Changes Made - `hermes_cli/config.py`: added `_looks_like_structured_value()` helper and reworked the split logic in `_sanitize_env_lines()` to anchor splits to the line start and skip splitting when a preceding value looks like a URL/query string or holds whitespace. - `tests/hermes_cli/test_config.py`: added two regression tests — a value that embeds a known `KEY=` is preserved verbatim, and leading text before the first key is not dropped. ## How to Test 1. 
Run the sanitizer tests: `pytest tests/hermes_cli/test_config.py -k anitize -q`. 2. Confirm the new cases reproduce the bug on the old code and pass on the new: `OPENAI_BASE_URL=https://proxy.example.com/v1?TAVILY_API_KEY=sk-embedded` is returned unchanged instead of being split into a truncated value plus a fabricated `TAVILY_API_KEY` entry. 3. Run the full file: `pytest tests/hermes_cli/test_config.py -q` (97 passed). ## Checklist ### Code - [x] I've read the Contributing Guide - [x] My commit messages follow Conventional Commits (`fix(scope):`, `feat(scope):`, etc.) - [x] I searched for existing PRs to make sure this isn't a duplicate - [x] My PR contains only changes related to this fix/feature (no unrelated commits) - [x] I've run `pytest tests/ -q` and all tests pass - [x] I've added tests for my changes (required for bug fixes, strongly encouraged for features) - [x] I've tested on my platform: macOS 15 (Darwin 25.5) ### Documentation & Housekeeping - [x] I've updated relevant documentation (README, `docs/`, docstrings) — or N/A - [x] I've updated `cli-config.yaml.example` if I added/changed config keys — or N/A - [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows — or N/A - [x] I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A - [x] I've updated tool descriptions/schemas if I changed tool behavior — or N/A	2026-07-01 01:50:32 -07:00
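The "looks structured" heuristic described in this commit can be approximated in a few lines. This is a hedged reconstruction from the prose alone (URL/query markers `://`, `?`, `&`, or any whitespace); the function below is a hypothetical stand-in and may differ from the real `_looks_like_structured_value()` in `hermes_cli/config.py`.

```python
def looks_like_structured_value(value: str) -> bool:
    """Return True when a value looks like a URL/query string or holds whitespace."""
    if any(ch.isspace() for ch in value):
        return True
    return any(token in value for token in ("://", "?", "&"))


url_value = "https://proxy.example.com/v1?TAVILY_API_KEY=sk-embedded"
plain_value = "sk-plain-token-123"

keep_verbatim = looks_like_structured_value(url_value)
may_split = not looks_like_structured_value(plain_value)
```

When the preceding value is structured, an embedded `KEY=` is treated as part of that value and the line is left alone; plain-token values remain eligible for the concatenation split.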
redactdeveloper	6b21a935af	fix(doctor): ignore disabled toolsets in missing-API-key summary hermes doctor's final 'configure missing API keys' summary counted every toolset with unmet key requirements, including default-off and explicitly disabled ones. Filter the summary to toolsets actually enabled for the CLI platform, with a graceful fallback to prior behavior when config resolution fails. Fixes #11336	2026-07-01 01:25:43 -07:00
teknium1	836732f54f	fix(cron): null-safe deliver in cron list + re-resolve BSM secrets per run Two live cron bugs, both surfaced by @banditburai in #35616 (whose larger watchdog/supervisor work is already superseded by the CronScheduler provider refactor on main): - #32896: `cron list` crashed on a present-but-null `deliver` field — `job.get("deliver", ["local"])` returns None for an explicit null, which then hit `", ".join(None)`. Coalesce with `or ["local"]` (same pitfall the sibling `repeat` line already guards against). - #33465: cron jobs 401'd on Bitwarden/BSM-backed secrets. The per-run env reload used a bare `load_dotenv(override=True)`, which re-applied only the .env placeholder — startup had already recorded this HERMES_HOME in env_loader._APPLIED_HOMES, so the external-secret re-pull no-oped. Route the reload through load_hermes_dotenv() and call reset_secret_source_cache() first to force the re-pull (Bitwarden's 300s value-cache keeps it off the network; override honours secrets.bitwarden.override_existing, mirroring startup). Tests: null-deliver regression guard in test_cron.py; reset-before-reload ordering guard in test_scheduler.py. Migrated 31 scheduler-reload test seams from patching dotenv.load_dotenv to the new load_hermes_dotenv / reset_secret_source_cache seam.	2026-07-01 01:05:33 -07:00
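The `deliver` pitfall in the first bullet is a general property of `dict.get`: the default applies only when the key is missing, not when it is present with a null value. A minimal reproduction, using an invented job record:

```python
# Invented job record with an explicit null, as it would appear after JSON/YAML loading.
job = {"name": "nightly-report", "deliver": None}

# The default is ignored because the key exists, so this is None.
default_only = job.get("deliver", ["local"])

# Coalescing with `or` covers both the missing-key and explicit-null cases.
coalesced = job.get("deliver") or ["local"]
summary = ", ".join(coalesced)
```

Passing `default_only` to `", ".join(...)` would raise a `TypeError`, which matches the crash described in the commit.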
JezzaHehn	54f32af4a7	fix(security): require explicit consent before uploading debug logs `hermes debug share` printed a privacy notice and then uploaded the report to a public paste service in the same breath — the user never got to say yes or no. Add a consent gate: an interactive [y/N] prompt, a --yes/-y flag to skip it, and a hard refusal (exit 1) in non-interactive contexts (no TTY on stdin) so debug data can't be exposed silently in scripts/CI. - New _confirm_upload() helper gates the actual upload after the notice. - Applied to BOTH upload paths: the public paste.rs path and the --nous Nous-S3 path (the latter is a sibling site the original PR missed). - The /debug slash command passes yes=True (typing /debug is itself the consent action, and input() would hang inside prompt_toolkit). - Rewrote the privacy notice for accuracy: secrets (API keys/tokens/ passwords) ARE force-redacted before upload; PII (display name, platform user ID, verbatim message content, filesystem paths) is NOT, and that URL is public. Fixes #22016. Co-authored-by: liuhao1024 <liuhao1024@users.noreply.github.com>	2026-07-01 00:38:17 -07:00
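The consent gate's three branches (explicit yes flag, non-interactive refusal, interactive prompt defaulting to no) can be sketched as below. The signature is hypothetical and differs from the project's `_confirm_upload()`: TTY state and the prompt function are passed in so the logic can be exercised without a terminal.

```python
from typing import Callable


def confirm_upload(
    assume_yes: bool,
    stdin_is_tty: bool,
    ask: Callable[[str], str] = input,
) -> bool:
    """Return True only when the user has explicitly consented to the upload."""
    if assume_yes:
        return True  # --yes/-y, or a caller where invoking the command is the consent
    if not stdin_is_tty:
        return False  # never upload silently from scripts or CI
    answer = ask("Upload debug report? [y/N] ")
    return answer.strip().lower() in ("y", "yes")


flag_consent = confirm_upload(assume_yes=True, stdin_is_tty=False)
ci_refused = confirm_upload(assume_yes=False, stdin_is_tty=False, ask=lambda _p: "y")
default_no = confirm_upload(assume_yes=False, stdin_is_tty=True, ask=lambda _p: "")
```

A caller would typically derive `stdin_is_tty` from `sys.stdin.isatty()` and exit with a non-zero status when the function returns False in a non-interactive context.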
Teknium	8d78be5460	revert: back out prompt_caching.enabled toggle (#56105) for re-evaluation (#56126) * Revert "fix(caching): honor prompt_caching.enabled across model switch + fallback" This reverts commit `36f9f50145`. * Revert "fix: allow disabling prompt caching" This reverts commit `c1c1a12fe6`.	2026-07-01 00:20:32 -07:00
Jan Renz	c1c1a12fe6	fix: allow disabling prompt caching	2026-07-01 00:10:42 -07:00
Teknium	2e8748ed22	feat(moa): opt-in full-turn trace persistence to JSONL (#56101) Adds moa.save_traces (default off). When on, every MoA turn that runs the reference fan-out appends one JSON line to <hermes_home>/moa-traces/<session_id>.jsonl capturing the TRUE FULL turn: each reference model's exact input messages (system advisory prompt + full advisory view, not the truncated display preview) + full output + usage + per-advisor cost, and the aggregator's exact input (including the injected reference-context guidance block) + output. Lets MoA runs be audited and improved offline: what every model saw, said, and cost. - agent/moa_trace.py: config-gated JSONL writer, profile-aware path via get_hermes_home(), best-effort (never breaks a turn), moa.trace_dir override. - agent/moa_loop.py: _RefAccounting now carries full input/output/model/ provider/temperature; create() stashes the full turn on a cache MISS (once per turn, never on the cache-HIT repeat iterations); non-streaming aggregator output captured inline, streaming marked + pointed at the session assistant message. consume_and_save_trace(session_id) flushes it. - agent/conversation_loop.py: flushes the trace with the live session_id right after MoA usage consumption. No-op for non-MoA clients. - hermes_cli/config.py: moa.save_traces + moa.trace_dir defaults. Traces are a side channel: NOT the messages table, never in replay, safe to delete. Off by default; only overhead when off is one config read on a MoA cache-MISS turn. Tests: full-trace-when-enabled (per-ref input+output+cost, aggregator input-with-guidance + output), nothing-when-disabled. Live E2E through run_conversation confirmed the loop wiring writes the file.	2026-07-01 00:09:42 -07:00
Ben	98d550e035	feat(debug): support /debug [nous|local] in the CLI/TUI slash command The --nous flag was only wired into the argparse `hermes debug share` subcommand. The /debug slash command (classic CLI + TUI, both via process_command -> _handle_debug_command) built a hardcoded args namespace with no `nous` attribute, so it always took the default paste.rs path. Pass cmd_original through to _handle_debug_command and parse an optional destination word: /debug -> public paste (default, unchanged) /debug nous -> Nous-internal S3 /debug local -> stdout, no upload local wins over nous (never touches the network); unknown words fall back to the default. Add args_hint="[nous|local]" so help/autocomplete surface it. New TestDebugSlashCommand covers the parsing + dispatch.	2026-06-30 17:29:23 -07:00
Ben	89653db403	feat(debug): drop dead confirm step from --nous upload (stateless NAS) NAS PR #349 (merged) ships a stateless presigned-PUT endpoint: the only route is POST /api/diagnostics/upload-url, and the object's existence in S3 is the only state. There is no /api/diagnostics/confirm route — confirming live against the merged preview returns 404. The client's confirm_upload() therefore fired a guaranteed-404 request on every --nous upload (harmless, since errors were swallowed, but dead). Remove it and simplify share_to_nous() to the 2-step mint + PUT flow that matches the shipped contract. Drop the corresponding TestConfirmUpload class and confirm assertions; add a test that the share succeeds even when the response carries no id (we no longer depend on it). The separately-flagged cross-repo requirement from #349's review -- sizeBytes is now REQUIRED and signed into the presigned URL's ContentLength -- was already satisfied: share_to_nous() sends len(bundle) as sizeBytes and urllib sets a matching Content-Length on the PUT. Verified against the live merged preview (missing sizeBytes -> 400 invalid_body; present -> 503 dark). Tested: pytest tests/hermes_cli/test_diagnostics_upload.py tests/hermes_cli/test_debug.py -> 95 passed.	2026-06-30 17:29:23 -07:00
Ben Barclay	51eeb70cb8	feat(debug): add --nous flag to upload diagnostics to Nous S3 `hermes debug share --nous` uploads the (force-redacted) debug bundle to Nous-internal S3 storage via a presigned URL minted by the Nous account service, instead of a public paste. The bundle is private — viewable only by Nous staff / allowlisted mods through a Google-OAuth-gated viewer — and auto-deletes after 14 days. The paste.rs path is unchanged and remains the default. - hermes_cli/diagnostics_upload.py (new): stdlib-urllib NAS client — request_upload_url(), put_bundle(), confirm_upload() (best-effort), share_to_nous() orchestrator. Base URL via HERMES_DIAGNOSTICS_BASE_URL (default https://portal.nousresearch.com). - hermes_cli/debug.py: extract collect_share_bundle() from build_debug_share() so the Nous path reuses the exact same redaction/collection (paste.rs behaviour unchanged); add build_nous_bundle() producing the gzipped {"format":"hermes-debug-share/1","redacted":...,"files":...} envelope the discord-support viewer parses; add the --nous run path with a privacy notice and a clean fallback (suggest --local) on failure. - hermes_cli/main.py: add the --nous flag + help/epilog entry on `debug share`. - tests: test_diagnostics_upload.py (new) mocks urllib; test_debug.py adds bundle/Nous coverage. 97 passing.	2026-06-30 17:29:23 -07:00
HiddenPuppy	0e4c879a3b	fix: keep plain custom GPT-5 relays on chat completions Generic provider:custom relays were force-routed to the OpenAI Responses API whenever the model matched gpt-5*, and a stale persisted model.api_mode=codex_responses survived /reset and upgrades. Some OpenAI-compatible relays do not implement Responses semantics, which surfaced as malformed function_call.name replay errors in gateway sessions. - runtime_provider: route custom-provider api_mode through _resolve_plain_custom_api_mode(), which drops a stale codex_responses unless the URL is direct OpenAI/xAI - run_agent: _provider_model_requires_responses_api returns False for custom; direct api.openai.com / api.x.ai URLs still upgrade via _is_direct_openai_url() / URL detection - regression coverage for plain relays vs direct OpenAI/xAI URLs Co-authored-by: HiddenPuppy <HiddenPuppy@users.noreply.github.com>	2026-06-30 15:57:52 -07:00
kshitijk4poor	7b12753948	feat(gateway): expose platform_connect_timeout in config.yaml Adds gateway.platform_connect_timeout (default 30s) to DEFAULT_CONFIG and bridges it to the internal HERMES_GATEWAY_PLATFORM_CONNECT_TIMEOUT env var at gateway startup, following the existing gateway_timeout config->env pattern. The env var remains the manual-override escape hatch and wins if set explicitly; otherwise config.yaml supplies the value. This closes the issue's documentation/config-surface request (#19776 suggestion 2) on top of the adapter ready-wait fix, so users no longer need an undocumented env var to raise the Discord connect timeout. Refs #19776	2026-06-30 15:03:25 -07:00
Brooklyn Nicholson	08be8e5ef7	feat(journey): wire list/delete/edit through CLI, RPC, and REST Expose learning_mutations via hermes journey subcommands, TUI gateway learning.detail|delete|edit, and /api/learning/node for the desktop app.	2026-06-30 15:07:22 -05:00
brooklyn!	9f8de4dfbe	Merge pull request #55555 from NousResearch/bb/memory-graph-cli-tui feat(journey): CLI + TUI learning timeline (/journey)	2026-06-30 14:43:10 -05:00
kshitijk4poor	c717be8ded	fix(config): route every migration write through one default-stripping chokepoint A single 'hermes update' / 'hermes -p' could rewrite a hand-curated config.yaml into a near-full DEFAULT_CONFIG dump (the 'you blow up my profile config on one tweak' reports). Root cause: migrate_config() had ~16 independent save_config() call sites, each author deciding ad hoc whether to materialise a value, and many persisted pure schema defaults with strip_defaults=False. Defaults already merge transparently at read time via load_config(), so writing them is pure bloat that also shadows future default changes (see save_config's docstring). Architectural fix (not a per-site patch): introduce a single _persist_migration() chokepoint that enforces one invariant — a migration may persist only values that DIFFER from the current schema default, plus explicit removals/renames of user data; pure defaults are never written. Every migration write (all 17 sites incl. the version-bump finalizer) now routes through it. The invariant is mechanically correct for all cases and verified empirically: - pure-default seeds (timezone='', curator/auxiliary.curator blocks, interim flag, curator.consolidate=False, empty plugins.enabled) are stripped → merged in at read time; - non-default values (write_approval=True, model_catalog.ttl_hours=1) preserved via explicit-raw-path preservation; - behaviour flips (agent.verify_on_stop=False, schema default still 'auto') preserved because False != 'auto'; - data transforms (custom_providers->providers, stt.model relocation, write_mode->write_approval, compression.summary_* removal, MCP-disable) persist their removals/renames. An explicitly user-set non-default value (e.g. matrix.require_mention: false) is preserved across the bump. 
Guard tests lock the architecture: an AST check asserts migrate_config() makes no direct save_config() call (all writes go through _persist_migration), and a full-range v1->latest test asserts a lean config is never dumped. Two existing change-detector tests that froze the on-disk representation of default-valued keys are rewritten to assert the effective value via load_config() (behaviour contract, not snapshot). Validation: lean v1->latest migration drops from ~567 bytes to ~196 bytes; 148 config+setup and 196 profile/curator/migrate tests pass on scripts/run_tests.sh.	2026-06-30 20:30:22 +05:30
Vladimir Smirnov	c080a530ae	fix(cli): redact status API keys with --all	2026-06-30 04:38:43 -07:00
Teknium	e7ca53e6b8	fix(moa): disabled presets no longer hijack a plain model switch (#55598) exact_moa_preset_name matched any bare model name equal to a preset key, regardless of the preset's enabled flag. On the no-explicit-provider switch path (PATH B in model_switch.py), a plain /model switch whose name collided with a preset key (e.g. "default") silently pivoted the session onto the MoA virtual provider — even when the user had set enabled: false to opt out (issue #55187). The LLM driving a routine model switch could land on a broken moa provider with empty default_preset / unconfigured aggregator credentials. Gate the implicit bare-name match on the per-preset enabled flag. Explicit selection via --provider moa / the model picker uses PATH A and does not go through exact_moa_preset_name, so a disabled preset stays reachable when the user explicitly asks for it.	2026-06-30 04:22:32 -07:00
teknium1	bff61f558f	feat(plugins): enable-time consent prompt for tool_override grant Builds on memosr's sink-level opt-in gate (#29249). Enabling a non-bundled plugin now surfaces the privileged allow_tool_override decision at `hermes plugins enable` time instead of leaving the operator to discover the config key after a runtime rejection. - `hermes plugins enable <name>` prompts for non-bundled plugins: 'Allow this plugin to replace built-in tools?' Default is deny (blank Enter / non-interactive stdin / EOF all fail closed). - --allow-tool-override / --no-allow-tool-override flags for non-interactive and scripted use (and a future desktop checkbox). - Bundled plugins are trusted: never prompted, no entry written. - Writes plugins.entries.<key>.allow_tool_override, the same key the sink gate reads (manifest.key == discovery key), so consent and enforcement compose end to end.	2026-06-30 04:00:42 -07:00
memosr	12f5624a76	fix(security): bind tool_override authorization to handler's defining plugin module egilewski found the prior sink gate was transient: it only applied while PluginManager executed register(ctx). A plugin could defer a direct registry.register(..., override=True) to a post-load callback/thread, after the scope was cleared, and still replace a built-in. Make authorization durable by binding it to where the handler is DEFINED (handler.__globals__['__name__']) rather than to call timing. At load, each plugin's module namespace is mapped to its allow_tool_override opt-in in a table that is never cleared. The sink resolves the handler's owning plugin module and rejects an override from any plugin namespace without opt-in, regardless of when or on which thread the call happens. Plugin namespaces with no recorded policy are treated as not-opted-in (fail-closed). Built-in and MCP handlers live outside the plugin namespace and are unaffected. Adds a regression test for the delayed/post-load direct-registry override.	2026-06-30 04:00:42 -07:00
memosr	3101222312	fix(security): enforce tool_override opt-in at registry sink to close direct-import bypass The opt-in gate lived only in PluginContext.register_tool, so a plugin could bypass it by importing tools.registry and calling registry.register(..., override=True) directly. Enforce the same gate at the sink: during plugin load, the registry rejects an override from a plugin without operator opt-in regardless of the path taken. Built-in and MCP registrations (no active plugin scope) are unaffected. Adds a regression test covering the direct-registry bypass.	2026-06-30 04:00:42 -07:00
memosr	179eb8c2a3	fix(security): require operator opt-in for plugin tool_override to prevent silent built-in tool replacement The tool_override flag landed in v0.14.0 (#26759) so plugins can replace a built-in tool with their own implementation. It works as advertised but there is no trust gate, so any enabled third-party plugin can silently override any built-in like shell_exec, write_file, or web_fetch and exfiltrate everything the agent invokes through it. The only trace is a DEBUG-level log line. Compare with ctx.llm (#23194) which does gate the equivalent privilege escalation: overriding the provider requires plugins.entries.<id>.llm.allow_provider_override: true in config.yaml. The policy shape exists, it just was not extended to tool overrides. Fix: * Add PluginToolOverrideError(PermissionError) for the gate failure. * register_tool() now checks _tool_override_allowed(name) when override=True. Bundled plugins (manifest.source == 'bundled') are trusted by default. Every other source requires plugins.entries.<plugin_id>.allow_tool_override: true in config.yaml. * fail-closed: if config.yaml cannot be loaded for any reason, _tool_override_allowed returns False. Same posture as MSGraphWebhookAdapter.connect() in #22353. Backwards compatibility: * Bundled plugins: no change (source == 'bundled' short-circuits the gate). * Third-party plugins not using override: no change (gate is only consulted when override=True). * Third-party plugins using override: registration fails until the operator opts in. The error message includes the exact config path to add, so the fix is one config edit away for legitimate use cases. Same migration path users went through for allow_provider_override after #23194 landed. Regression tests: * tests/hermes_cli/test_plugins.py::test_register_tool_override_replaces_existing and ::test_register_tool_override_on_new_name_is_noop_path were written before the gate existed. 
Updated their test configs to include allow_tool_override: true under plugins.entries.<plugin_id>, mirroring how a legitimate operator would now grant the privilege. * New regression test ::test_register_tool_override_blocked_without_operator_opt_in exercises both the PluginManager-catches-error path (built-in tool is preserved, attacker plugin is skipped) and the direct-call path (PluginToolOverrideError is raised with a message that names the config key to set). Verified the test fails without this fix and passes with it. * All 73 tests in test_plugins.py continue to pass.	2026-06-30 04:00:42 -07:00
teknium1	15e44527ab	fix(copilot): prefer endpoints.api for base URL, guard empty chat base URL Folds @trevorgordon981's #50590 into difujia's #15139: - exchange_copilot_token now prefers the authoritative endpoints.api from the token-exchange response, falling back to the proxy-ep-derived host - resolve_api_key_provider_credentials gains a copilot branch that resolves the account-specific base URL and a non-empty last-resort guard, so chat inference never wedges on an empty base URL (#50252) Co-authored-by: Trevor Gordon <trevorbgordon@gmail.com>	2026-06-30 03:27:41 -07:00
NiuNiu Xia	fbd15e285c	fix(copilot): switch to VS Code client ID and derive enterprise base URL Two changes that complete the Copilot auth story (#7731 parts 3 and 4): 1. Switch OAuth client ID from opencode (Ov23li8tweQw6odWQebz) to VS Code (Iv1.b507a08c87ecfe98). The old ID produces gho_* tokens that return 404 on /copilot_internal/v2/token, making token exchange non-functional. The new ID produces ghu_* tokens that support exchange. 2. Derive enterprise API base URL from the proxy-ep field in the exchanged token. Enterprise accounts get tokens containing e.g. "proxy-ep=proxy.enterprise.githubcopilot.com" which is converted to "https://api.enterprise.githubcopilot.com" and stored in the credential pool. Individual accounts (no proxy-ep) continue using the default URL. The COPILOT_API_BASE_URL env var remains as a user escape hatch. Tested on both Individual and Enterprise Copilot accounts: - Individual: device flow works, exchange succeeds, base_url=None (default) - Enterprise: device flow works, exchange succeeds, 39 models returned including claude-opus-4.6-1m (936K), enterprise base URL derived Parts 3 and 4 of #7731.	2026-06-30 03:27:41 -07:00
Peetwan	ebb81f10cb	fix(tui_gateway): prevent WS disconnect under GIL pressure Three targeted fixes for Desktop GUI WebSocket stability when agent turns starve the uvicorn event loop of CPU (GIL contention): 1. Loosen ws_ping_timeout for loopback binds (QW-1) - Loopback (Desktop): ping 30s interval / 60s timeout - Non-loopback (Cloudflare Tunnel): unchanged 20/20 - A GIL-heavy agent turn can stall the event loop past 20s; uvicorn's keepalive ping runs on that same starved loop, so a 20s timeout kills an otherwise-healthy local connection over a recoverable stall. 60s rides out the stall without affecting half-open detection on public binds. 2. Coalesce streaming token frames in WSTransport (CF-2) - Buffer high-frequency delta frames (message.delta, reasoning.delta, thinking.delta) and flush as a batch every ~33ms (~30fps) - Non-streaming frames (RPC responses, control/tool/completion events) flush pending tokens first — wire ordering preserved - Thread-safe via threading.Lock; worker threads return immediately instead of blocking on per-token loop wakeups - Reduces event-loop wakeup churn by orders of magnitude during model streaming, directly cutting GIL pressure 3. Loop heartbeat watchdog (CF-1) - Self-rearming call_later tick (2s) measures drift between expected and actual fire time using loop.time() (monotonic) - Logs 'event loop stalled Ns (GIL pressure suspected)' when drift >5s - Turns mysterious WS drops into diagnosable log entries - Uses call_later chain (not a task) — dies with the loop, nothing to cancel on shutdown Root cause: uvicorn's ws keepalive ping (20/20s) runs on the same starved event loop as agent turns. Under GIL pressure from heavy agent turns or delegation, the loop can't service the ping within 20s, so the websockets protocol declares the connection dead. Reconnects fail with ready_send_failed because the old process's loop is still wedged. 
None of these fixes touch the model-facing message array, prompt caching, message role alternation, or the wire protocol — they are strictly display-transport improvements plus a config tweak and a diagnostic log. Tests: 762 passed, 17 skipped (0 failures) across test_tui_gateway_ws, test_tui_gateway_server, test_web_server, and tui_gateway/ suites.	2026-06-30 03:11:13 -07:00
teknium1	35a0803a3b	fix(delegation): budget subagent summaries against parent context headroom Batch delegation returned each subagent's full final_response verbatim into the parent's context. A fan-out of N children could dump 60k+ tokens at once, blowing the parent's context window and — on rate-limited providers — triggering a compression/429 death spiral (429 misread as context-too-large -> window step-down -> retry loop -> conversation dies). Cap each summary against the parent's remaining context headroom split across the batch (not a magic char count). When trimming, mirror the web_extract convention: spill the full text to cache/delegation (mounted into remote backends via credential_files._CACHE_DIRS) and return a head+tail window (75/25, line-snapped) plus a footer with the exact read_file offset to page the omitted middle. Both the subagent's opening AND its closing (outcomes / files-changed / issues, which live at the end) survive in-context, and nothing is lost — the parent can read_file the full version on any backend. delegation.max_summary_chars (default 24000) is a static ceiling layered on top as belt-and-suspenders for models that ignore 'be concise'; 0 disables it. Child prompt tightened to lead with outcomes / bullets. Co-authored-by: rc-int <rcint@klaith.com>	2026-06-30 03:07:40 -07:00
kshitij	26f39f7b90	fix(credentials): prefer ~/.hermes/.env over stale os.environ on key rotation (#55528) `_resolve_api_key_provider_secret` resolved API keys via `get_env_value`, which returns the `os.environ` value first and only falls back to `~/.hermes/.env`. After a user rotates a key in `.env`, a stale value still exported in the parent shell (Codex CLI, test runner, login profile) shadows the fresh key on every request, producing persistent 401s. The credential-pool seeding path was already fixed to prefer `.env` (#18254/#18755), but the live request-time resolution path was not — so the pool re-seeded with the fresh key while `_resolve_api_key_provider_secret` kept returning the stale shell export. This closes that remaining path. - config: add `get_env_value_prefer_dotenv()` — checks `~/.hermes/.env` first, then `os.environ`. Distinct from `get_env_value()` (unchanged, os.environ-first) so only Hermes-managed credential resolution flips precedence; the generic helper's many callers are unaffected. - auth: `_resolve_api_key_provider_secret` resolves through the new helper. - tests: regression coverage for both the pool-seeding path and the auth resolution path (a rotated `.env` key must beat a stale shell export). Closes #20591. Co-authored-by: 0xDevNinja <manmit0x@gmail.com>	2026-06-30 09:49:52 +00:00
Brooklyn Nicholson	e971dc1e9d	feat(journey): CLI + TUI learning timeline (/journey) Terminal rendition of the desktop Star Map / Memory Graph: learned skills and memories on a timeline, shared by `hermes journey` and the TUI `/journey` overlay via one size-aware Python renderer (agent/learning_graph_render.py). - TUI overlay mirrors /agents: static chart overview + selectable slice list → slice detail → single skill/memory body, with the shared inverse-row selection treatment and a pinned footer. - Reuse primitives: extract OverlayScrollbar into its own module (now shared with agentsOverlay), scroll the item body via ScrollBox, and unify both lists through one table-driven ListRow. - No animation/playback in the TUI — pure data; the renderer's reveal scrubber stays available in the CLI (`--play`, `--reveal`).	2026-06-30 04:44:58 -05:00
brooklyn!	1d495cfbbf	Merge pull request #55226 from NousResearch/bb/desktop-memory-graph feat(desktop): memory graph — playable timeline of memories + skills over time	2026-06-30 04:36:17 -05:00
Teknium	3f19df2a5b	fix(mcp): late-refresh must see desktop/dashboard discovery thread owner (#55514) MCP tools connected and enabled but never surfaced into the agent's session toolset on the desktop app + dashboard WebUI (#51587). There are two independent background MCP discovery thread owners by surface: tui_gateway.entry (stdio 'hermes --tui') and hermes_cli.mcp_startup (desktop app + dashboard WS sidecar via tui_gateway/ws.py, and 'hermes dashboard'). The late-refresh scheduler gates on tui_gateway.entry.mcp_discovery_in_flight(), which read ONLY the entry thread global. On the desktop/dashboard surfaces that global is None, so a server slower than the bounded build-time wait never triggered a late refresh and its tools stayed invisible for the whole session. Make mcp_discovery_in_flight() / join_mcp_discovery() consult BOTH thread owners. Adds the matching in-flight/join helpers to hermes_cli.mcp_startup and has tui_gateway.entry delegate to them as a second owner.	2026-06-30 02:08:37 -07:00


3201 commits