14165 commits

Author SHA1 Message Date
kshitijk4poor
e2ffbf0cf4 chore(release): add AUTHOR_MAP entries for compression-routing salvage
Map the two contributor emails whose commits are cherry-picked into the
compression-routing-integrity salvage so scripts/contributor_audit.py
attributes them at release time:

- jvsantos.cunha@gmail.com -> plcunha (PR #55300)
- jakepresent1@gmail.com   -> jakepresent (PR #55721)

r266-tech (PR #50517) is already mapped.
2026-07-02 12:49:42 +05:30
Teknium
543d305bbb
feat(moa): add reference_max_tokens to cap advisor output and cut turn latency (#56756)
MoA per-turn latency is dominated by advisor GENERATION: turn wall time
correlates ~0.88 with output tokens and ~-0.03 with input tokens (measured over
52 turns). Each turn waits for the slowest advisor to finish writing, and
advisors were uncapped — writing multi-thousand-token essays the aggregator
only needs the gist of.

Add an opt-in per-preset reference_max_tokens knob (mirrors reference_temperature)
that caps ADVISOR output only; the acting aggregator is never capped. Default
None = uncapped, so existing presets are byte-for-byte unchanged (no regression).
Wired through both MoA execution paths (MoAChatCompletions.create and
aggregate_moa_context).

E2E: same task, closed preset uncapped vs reference_max_tokens=600 -> 59s to 33s
(~44% faster), final answer identical/correct.

- hermes_cli/moa_config.py: _coerce_int_or_none helper + reference_max_tokens
  in _normalize_preset/_default_preset/flattened view
- agent/moa_loop.py: read preset.reference_max_tokens, pass to reference fan-out
- agent/conversation_loop.py: pass reference_max_tokens on the per-turn path
- tests + docs
2026-07-02 00:16:35 -07:00
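The opt-in cap described in 543d305bbb can be sketched roughly as below. This is a minimal illustration, not the code in hermes_cli/moa_config.py: the body of the coercion helper, the rule that non-positive or boolean values count as "uncapped", and the `advisor_request_kwargs` function are all assumptions made for the example.

```python
from typing import Any, Dict, Optional


def coerce_int_or_none(value: Any) -> Optional[int]:
    """Return a positive int, or None when the value is unset or unusable.

    Assumption: the real helper's exact rules are not shown in the commit;
    here bools, junk strings and non-positive numbers all mean "uncapped".
    """
    if value is None or isinstance(value, bool):
        return None
    try:
        number = int(value)
    except (TypeError, ValueError):
        return None
    return number if number > 0 else None


def advisor_request_kwargs(preset: Dict[str, Any]) -> Dict[str, Any]:
    """Build extra request kwargs for advisor (reference) calls only."""
    kwargs: Dict[str, Any] = {}
    cap = coerce_int_or_none(preset.get("reference_max_tokens"))
    if cap is not None:
        # Only set the key when a cap is configured, so presets without
        # the knob produce exactly the same request as before.
        kwargs["max_tokens"] = cap
    return kwargs
```

With the default of None, `advisor_request_kwargs({})` returns an empty dict, which is what keeps existing presets unchanged in this sketch.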
Ben Barclay
9be39de0f2
fix(auth): make HERMES_PORTAL_BASE_URL/NOUS_PORTAL_BASE_URL bypass the Portal host allowlist (#56864)
Ben caught that the initial approach (widening _NOUS_PORTAL_ALLOWED_HOSTS to
include the staging host) was the wrong fix -- env vars are supposed to
override the allowlist, mirroring how NOUS_INFERENCE_BASE_URL already
bypasses _ALLOWED_NOUS_INFERENCE_HOSTS via _nous_inference_env_override().

The actual bug: both resolve_nous_access_token and
resolve_nous_runtime_credentials read
`_optional_base_url(state.get("portal_base_url")) or os.getenv(...) or ...`
-- a plain `or` chain where the STORED state value wins first (short-circuits
before the env vars are even read), and then whichever value won gets run
through the same _NOUS_PORTAL_ALLOWED_HOSTS gate regardless of its source.
So a hosted agent stamped with HERMES_PORTAL_BASE_URL=<staging> in its env
AND a staging portal_base_url already persisted to auth.json would still
get silently rewritten to prod on every refresh, because the env var never
even got a chance to be consulted.

Revert the previous _NOUS_PORTAL_ALLOWED_HOSTS widening entirely --
staying prod-only preserves the allowlist's actual job (rejecting an
untrusted network-provided portal_base_url persisted to auth.json by a
compromised Portal response).

Add _nous_portal_env_override() (mirrors _nous_inference_env_override())
and restructure both call sites so the env override is checked FIRST and,
when set, wins outright and skips the allowlist gate entirely -- the
allowlist only ever runs against the fallback (stored-state-or-default)
path now.

Rewrote tests/hermes_cli/test_nous_portal_staging_allowlist.py to test the
actual fix: the helper function, and an end-to-end
resolve_nous_access_token proof that the env override wins even when state
ALSO has the staging host stored (the exact incident shape), that it wins
over a stored PROD host too, and that the allowlist's heal-to-prod
behaviour for an untrusted stored value is preserved when no override is
set.
2026-07-02 06:52:46 +00:00
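The resolution order that 9be39de0f2 describes (env override first, allowlist only on the fallback path) might look like the sketch below. The host names, the default URL, the order in which the two env vars are consulted, and the function names are placeholders chosen for illustration; only the precedence rule comes from the commit message.

```python
import os
from typing import Any, Mapping, Optional
from urllib.parse import urlparse

# Hypothetical values; the real prod host and default are not in the commit.
DEFAULT_PORTAL_BASE_URL = "https://portal.example.com"
ALLOWED_PORTAL_HOSTS = frozenset({"portal.example.com"})
ENV_VARS = ("HERMES_PORTAL_BASE_URL", "NOUS_PORTAL_BASE_URL")


def portal_env_override(env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty env override, or None."""
    for name in ENV_VARS:
        value = (env.get(name) or "").strip()
        if value:
            return value.rstrip("/")
    return None


def resolve_portal_base_url(
    state: Mapping[str, Any], env: Mapping[str, str] = os.environ
) -> str:
    # 1. The operator-set env override wins outright and skips the allowlist.
    override = portal_env_override(env)
    if override:
        return override
    # 2. The stored value is network-provided, so it must pass the allowlist.
    stored = (state.get("portal_base_url") or "").strip().rstrip("/")
    if stored and urlparse(stored).hostname in ALLOWED_PORTAL_HOSTS:
        return stored
    # 3. Anything else heals to the default.
    return DEFAULT_PORTAL_BASE_URL
```

The contrast with the buggy shape is that a plain `stored or env or default` chain never reaches the env var once a stored value exists, and then gates whichever value won regardless of where it came from.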
kshitij
88d1d6206f
fix(streaming): handle completed responses with empty/None choices (#55933) (#56713)
* fix(streaming): handle completed responses with empty/None choices

The streaming fallback guard added in #55932 recognized a completed
response object only when its `choices` was a non-empty list. But an
adapter can return a completed response whose `choices` is `None` or an
empty list (an error / content-filter / terminal frame) — still a whole,
non-iterable response, not a token stream. Those shapes fell through to
`for chunk in stream` and crashed with

    'types.SimpleNamespace' object is not iterable

which is exactly issue #55933 (MoA `openai-codex` aggregator on
TUI/Desktop, where a stream consumer forces the streaming path).

Broaden the guard to discriminate on the PRESENCE of a `choices`
attribute (a genuine provider Stream object exposes none), disable
streaming for the session, and return the completed object so the outer
loop's normal invalid-response validation handles empty/None choices via
its retry path instead of iterating.

Based on the diagnosis in #56525 by @spiky02plateau (that PR normalized
the MoA aggregator return with a one-shot chunk iterator; the common
text/tool-call crash was already fixed at this seam by #55932, so this
extends the existing guard to cover only the remaining empty/None-choices
gap).

Fixes #55933

* refactor(streaming): simplify empty-choices guard body and parametrize tests

Post-review cleanup (no behavior change):
- Inline the single-use `response_choices` local and drop the redundant
  `if first_choice is not None else None` guard (getattr(None, ...) already
  returns the default safely).
- Collapse the two near-identical empty/None-choices regression tests into
  one `@pytest.mark.parametrize` case.

Mutation-verified: reverting the guard to the old non-empty-list condition
still makes both parametrized cases fail with the historical
'types.SimpleNamespace' object is not iterable.

---------

Co-authored-by: spiky02plateau <155588579+spiky02plateau@users.noreply.github.com>
2026-07-02 06:36:20 +05:30
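The broadened guard from 88d1d6206f discriminates on whether the object has a `choices` attribute at all, rather than on whether `choices` is a non-empty list. A minimal sketch of that idea follows; the function names and the tuple return shape are invented for the example, and the real code additionally disables streaming for the session.

```python
from types import SimpleNamespace
from typing import Any, Tuple


def is_completed_response(obj: Any) -> bool:
    """True when the object exposes `choices`, whatever its value.

    Per the commit message, a genuine provider stream exposes no such
    attribute, so presence alone separates the two shapes.
    """
    return hasattr(obj, "choices")


def consume(result: Any) -> Tuple[str, Any]:
    if is_completed_response(result):
        # Hand the whole object back; the caller's normal validation
        # decides what to do with empty or None choices.
        return ("completed", result)
    # Only genuine streams are iterated.
    return ("stream", list(result))


# A completed response with no choices is no longer iterated.
terminal_frame = SimpleNamespace(choices=None)
kind, _ = consume(terminal_frame)
```

Under the old non-empty-list condition, `terminal_frame` would have fallen through to iteration and raised the `'types.SimpleNamespace' object is not iterable` error quoted in the commit.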
kshitijk4poor
76be770091 test(moa): assert aux cap against model resolver, not frozen literal
Follow-up to the salvaged fix: the regression test asserted a frozen
max_tokens == 128_000 literal, coupling it to the Opus-4-8 model table.
Assert against _get_anthropic_max_output("claude-opus-4-8") plus > 2000
instead, so the test survives model-table churn while still catching a
regression to the old `or 2000` fallback.
2026-07-02 06:31:18 +05:30
helix4u
7951250947 fix(moa): lift hidden Anthropic aux output cap 2026-07-02 06:31:18 +05:30
kshitij
4d5d9fffd0
Merge pull request #56582 from srojk34/fix/vertex-credentials-env-leak
security(terminal): strip VERTEX_CREDENTIALS_PATH/GOOGLE_APPLICATION_CREDENTIALS from subprocess env
2026-07-02 06:08:55 +05:30
srojk34
7f64cce96d security(vertex): route credential/project/region resolution through the profile secret scope
agent/vertex_adapter.py resolved VERTEX_CREDENTIALS_PATH,
GOOGLE_APPLICATION_CREDENTIALS, VERTEX_PROJECT_ID, and VERTEX_REGION via raw
os.environ.get() instead of the profile-scoped get_secret() every other
credential lookup in hermes_cli/runtime_provider.py uses. In a multiplex
gateway serving several profiles from one process, os.environ still holds
whichever profile's .env python-dotenv loaded at boot — so a raw read here
let one profile's turn silently mint a Vertex OAuth2 token from, and get
billed against, a different profile's GCP service account. No error, no
fail-closed guard: the multiplex UnscopedSecretError protection was bypassed
entirely because these reads never went through get_secret().

- _resolve_credentials_path/_resolve_project_override/_resolve_region now
  call agent.secret_scope.get_secret(), matching the _getenv() pattern
  already used for every other provider's credentials.
- get_vertex_credentials()'s ADC fallback (google.auth.default()) reads
  GOOGLE_APPLICATION_CREDENTIALS from os.environ internally, bypassing
  get_secret() entirely — closed with a narrow guard: when multiplexing is
  active and this profile's scope has no Vertex credentials of its own, but
  os.environ still carries a value (left by a different profile's boot-time
  dotenv load), refuse ADC rather than silently authenticate as a stranger.
- Zero behavior change for single-profile installs: get_secret() falls
  through to os.environ transparently whenever multiplexing is off.

Same bug class as the already-fixed _HERMES_OAUTH_FILE/_AUTH_JSON_PATH/
HOOKS_DIR cross-profile leaks, now closed for Vertex's OAuth2 credential
path.
2026-07-02 06:07:56 +05:30
kshitij
2f7c51a3e2
Merge pull request #56605 from simpolism/codex/discord-inline-bot-mentions
fix(discord): ignore reply-ping-only mentions for bot-authored messages
2026-07-02 05:23:44 +05:30
dsad
830860306d Guard browser CDP on private pages 2026-07-02 05:23:23 +05:30
kshitijk4poor
676236bb1d fix(agent): honor custom CA certs on aux client + harden TLS resolution
The salvaged fix wired per-provider ssl_ca_cert / ssl_verify (and
HERMES_CA_BUNDLE) into the MAIN OpenAI client. This follow-up:

- Auxiliary client parity: process_bootstrap.build_keepalive_http_client
  accepts and forwards verify; auxiliary_client._resolve_aux_verify mirrors
  the main-client TLS resolution (via load_config_readonly, the read-only
  fast path) so compression/vision/web_extract/title-gen/session_search
  honor the same per-provider CA. Without this, chat worked against a
  private-CA endpoint but every auxiliary call still failed with APIConnectionError.
- switch_model now reads custom_providers from live config (load_config_readonly)
  instead of the init-time agent._custom_providers snapshot, so ssl_ca_cert /
  ssl_verify edits are honored on mid-session model switch — matching the
  context-length reload (#15779).
- Drop the dead client-level verify= where a custom httpx transport is used
  (httpx ignores it there); verify lives on the transport. Fix docstrings.
  Applies to both run_agent._build_keepalive_http_client and process_bootstrap.
- resolve_httpx_verify: add CURL_CA_BUNDLE to the env chain (consistency with
  agent/ssl_guard._CA_BUNDLE_ENV_VARS) and emit a loud logger.warning naming
  the endpoint whenever ssl_verify:false disables verification.
- get_custom_provider_tls_settings: case-insensitive base_url match (config
  dedup already lowercases; scheme/host are case-insensitive) so a mixed-case
  entry doesn't silently drop its CA. Exact match preserved — no prefix bypass.
- Demote best-effort except Exception: pass in agent_init/switch_model to
  logger.debug(exc_info=True).
- Tests for aux verify forwarding, _resolve_aux_verify, case-insensitive
  match, and prefix-bypass rejection.
2026-07-02 04:51:56 +05:30
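The case-insensitive but exact base_url match mentioned in 676236bb1d can be illustrated as follows. This is a sketch under assumptions: the normalization (trim, drop trailing slash, lowercase the whole URL) and the returned dict shape are guesses for the example, and `find_tls_settings` is a hypothetical stand-in for get_custom_provider_tls_settings.

```python
from typing import Any, Dict, Iterable, Optional


def _normalize(url: str) -> str:
    # Assumption: whole-URL lowercasing, mirroring the "config dedup
    # already lowercases" note in the commit message.
    return url.strip().rstrip("/").lower()


def find_tls_settings(
    base_url: str, custom_providers: Iterable[Dict[str, Any]]
) -> Optional[Dict[str, Any]]:
    """Return TLS settings for an exactly matching provider entry."""
    wanted = _normalize(base_url)
    for entry in custom_providers:
        # Equality, not startswith: a longer URL sharing a prefix must
        # not inherit another endpoint's CA or ssl_verify setting.
        if _normalize(entry.get("base_url", "")) == wanted:
            return {
                "ssl_ca_cert": entry.get("ssl_ca_cert"),
                "ssl_verify": entry.get("ssl_verify", True),
            }
    return None


providers = [
    {"base_url": "https://Ollama.Local:8443/v1", "ssl_ca_cert": "/etc/ca.pem"},
]
```

Here `find_tls_settings("https://ollama.local:8443/v1/", providers)` finds the entry despite the case and trailing-slash differences, while a URL that merely starts with the same prefix does not match.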
HexLab98
3a2ba959ce fix(agent): honor custom CA certs for custom_providers HTTPS endpoints
Wire ssl_ca_cert and ssl_verify through custom_providers config and env
vars into the keepalive httpx client, fixing APIConnectionError against
mkcert/self-signed Ollama proxies behind HTTPS.
2026-07-02 04:51:56 +05:30
HexLab98
7e957cbd0b feat(agent): add resolve_httpx_verify for custom CA bundle TLS
Introduce a shared helper that maps HERMES_CA_BUNDLE, SSL_CERT_FILE, and
per-provider ssl_ca_cert settings to httpx verify contexts.
2026-07-02 04:51:56 +05:30
brooklyn!
b3bc302370
Merge pull request #56641 from NousResearch/bb/journey-cli-robustness
fix(journey): crash on non-dict skill metadata + ANSI leaks in CLI/desktop
2026-07-01 16:34:50 -05:00
Brooklyn Nicholson
89cf65ab63 fix(tui_gateway): strip ANSI from slash-worker output for desktop chat
Desktop chat bubbles render plain text, but a worker-routed command that
builds its own Rich Console (e.g. /journey) picks up truecolor from the
gateway's inherited COLORTERM and leaks raw escapes into the bubble. Strip
ANSI at the single worker-return choke point so every command renders cleanly.
The TUI opens /journey as an overlay, so it never travels this path.
2026-07-01 16:28:34 -05:00
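Stripping ANSI at a single return point, as 89cf65ab63 describes, could be as small as the sketch below. The regex covers CSI sequences only (the colour and cursor codes Rich typically emits); whether the real implementation also handles OSC or other escape families is not stated in the commit, and `strip_ansi` is a name chosen for the example.

```python
import re

# ESC [ , then parameter bytes (0x30-0x3F), intermediate bytes
# (0x20-0x2F), and one final byte (0x40-0x7E).
_CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove CSI escape sequences so plain-text renderers stay clean."""
    return _CSI_RE.sub("", text)


colored = "\x1b[38;2;255;0;0mhello\x1b[0m world"
plain = strip_ansi(colored)
```

Applying this once where worker output is returned, rather than in each command, is what lets every worker-routed command benefit without per-command changes.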
Brooklyn Nicholson
428b9a0c42 fix(cli): render /journey color instead of leaking raw ANSI
In the interactive CLI, /journey dispatched straight to `args.func(args)`,
letting Rich write ANSI to stdout — which patch_stdout's StdoutProxy passes
through as literal `?[38;2;…m` garbage. Route the read-only views (default +
`list`) through a captured, force-color Console and re-emit via `_cprint`
(prompt_toolkit's ANSI parser), matching the `ChatConsole` idiom.
`delete`/`edit` stay on real stdio since they prompt / open `$EDITOR`.
2026-07-01 16:25:48 -05:00
Brooklyn Nicholson
ec319e4e3e fix(learning_graph): guard non-dict metadata so /journey can't crash
parse_frontmatter's malformed-YAML fallback stores every value as a string,
so a skill's `metadata` can be a str. `_category`/`_related` chained
`.get("metadata", {}).get("hermes", {})` and blew up with `'str' object has
no attribute 'get'`, taking down `build_learning_graph()` (and thus /journey
and `hermes journey`) whenever any installed skill had bad frontmatter.

Extract a `_hermes_meta()` helper that returns the nested dict only when it
really is one. Fixes the whole class, not just the two call sites.
2026-07-01 16:25:48 -05:00
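The helper introduced in ec319e4e3e returns the nested dict only when every level really is a dict. A minimal sketch of that guard, with the body assumed rather than copied from the repository:

```python
from typing import Any, Dict


def hermes_meta(frontmatter: Any) -> Dict[str, Any]:
    """Return frontmatter['metadata']['hermes'] if both levels are dicts.

    Any other shape (for example a str produced by a malformed-YAML
    fallback) yields an empty dict instead of raising AttributeError.
    """
    if not isinstance(frontmatter, dict):
        return {}
    metadata = frontmatter.get("metadata")
    if not isinstance(metadata, dict):
        return {}
    hermes = metadata.get("hermes")
    return hermes if isinstance(hermes, dict) else {}


# The crash shape from the commit: metadata parsed as a plain string.
bad = {"name": "skill", "metadata": "hermes: {category: x}"}
category = hermes_meta(bad).get("category")
```

The chained `.get("metadata", {}).get("hermes", {})` form only defaults when the key is missing, not when the value has the wrong type, which is why the type check is needed.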
Teknium
76a468e513
feat(models): add claude-fable-5, claude-sonnet-5, fugu-ultra to curated OpenRouter + Nous lists (#56617)
- claude-fable-5 placed above claude-opus-4.8 in both curated lists
- claude-sonnet-5 replaces claude-sonnet-4.6
- sakana/fugu-ultra added near the bottom (before routers/free tier)
- regenerated website/static/api/model-catalog.json via scripts/build_model_catalog.py (live-pulled by CLI, published on merge — no release needed)
2026-07-01 13:21:42 -07:00
Teknium
7c1a029553
chore: release v0.18.0 (2026.7.1) (#56611) 2026-07-01 13:07:40 -07:00
snav
e9bceb5ae0 fix(discord): ignore reply-ping-only mentions for bot-authored messages
Two Hermes bots sharing a channel could volley replies at each other
indefinitely. Root cause: Discord reply-pings (allowed_mentions
replied_user=true) add the replied-to bot to message.mentions without a
literal <@bot> token in the body, so the existing bot-admission gate
treated a reply chip as an explicit @mention and re-triggered the peer.

Adds opt-in discord.bots_require_inline_mention (default false; env
DISCORD_BOTS_REQUIRE_INLINE_MENTION). When enabled, bot-authored
messages must carry a raw inline <@id>/<@!id> mention in the content;
reply-ping-only mentions no longer admit the message. Human messages and
all existing defaults are unchanged.

The new _self_is_raw_mentioned helper deliberately ignores the resolved
message.mentions list (which reply-ping populates) and checks only the
raw content token via the shared _raw_mentioned_user_ids primitive.
2026-07-01 15:38:34 -04:00
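The distinction e9bceb5ae0 draws between a raw inline mention and a reply-ping can be sketched like this. The function names and signature are illustrative, not the adapter's real ones; the only facts taken from the commit are the `<@id>` / `<@!id>` token forms and the rule that the resolved mentions list is ignored when the option is on.

```python
import re
from typing import Iterable, Set

# Matches <@123> and <@!123>; role mentions such as <@&123> do not match.
_USER_MENTION_RE = re.compile(r"<@!?(\d+)>")


def raw_mentioned_user_ids(content: str) -> Set[int]:
    """IDs that appear as literal mention tokens in the message body."""
    return {int(match) for match in _USER_MENTION_RE.findall(content or "")}


def admit_bot_message(
    content: str,
    self_id: int,
    resolved_mention_ids: Iterable[int],
    require_inline_mention: bool,
) -> bool:
    if require_inline_mention:
        # Ignore the resolved list entirely: a reply-ping populates it
        # without any token in the body.
        return self_id in raw_mentioned_user_ids(content)
    return self_id in set(resolved_mention_ids)
```

A reply with body "sounds good" and the replied-to bot in the resolved list is admitted with the option off and rejected with it on, which is what stops the two bots from re-triggering each other.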
srojk34
1a0d7878c6 security(terminal): strip VERTEX_CREDENTIALS_PATH/GOOGLE_APPLICATION_CREDENTIALS from subprocess env
Vertex AI authenticates via OAuth2 (service-account JSON path / ADC), not
PROVIDER_REGISTRY, and VERTEX_CREDENTIALS_PATH is declared with
password=False (it's a path, not a bare key) under category="provider" —
a category the registry-derived blocklist loop never checks. Both it and
GOOGLE_APPLICATION_CREDENTIALS (the ADC fallback the adapter also reads)
fell through every existing blocklist source and leaked the on-disk
location of a GCP service-account key into every spawned subprocess
(terminal, codex/copilot app-server, browser workers) — the same leak
class already closed for every other provider's credentials in #53503.
2026-07-01 22:05:14 +03:00
kshitij
60b1f6ce3f
Merge pull request #56526 from srojk34/fix/browser-back-private-network-guard
security(browser): re-check private-network guard after browser_back navigation
2026-07-02 00:15:16 +05:30
kshitijk4poor
b225b30d08 fix(kanban): route notifier wake via profile chokepoint; harden review findings
Follow-up review fixes on the salvage of #54872 (original author 张满良/@zmlgit):

1. [HIGH] Adapter selection now goes through the shared
   _authorization_adapter chokepoint (gateway/authz_mixin.py) instead of a
   local inline lookup that fell back to the DEFAULT profile's same-platform
   adapter when the owning profile had a registry entry but no adapter for
   that platform. That fallback re-introduced the exact cross-profile
   mis-delivery ([230002] Bot can NOT be out of the chat) this change exists
   to fix. Adds a mutation-verified guard test
   (test_notifier_owning_profile_adapter_no_default_fallback).

2. [HIGH→documented] The creator-wake SessionSource cannot faithfully
   reconstruct a DM/thread creator's session key because chat_type is neither
   persisted on the subscription nor carried on the session-context bridge.
   Documented the limitation inline; behavior degrades to a fresh group
   session (never an exception). The end-to-end fix (stamp + persist
   chat_type) is a scoped follow-up, not bundled into this salvage.

3. [MED] Documented that archived/unblocked are intentionally claimed (cursor
   hygiene) but silent, and excluded from wake kinds.

4. [MED] Wake-injection failure now logs at WARNING with exc_info=True (the
   cursor has already advanced, so a broken wake must not be a silent no-op).
2026-07-02 00:05:48 +05:30
张满良
3545d74915 fix(kanban): i18n wake messages — address review feedback on #54872
Addresses @tonydwb's review on PR #54872 (12:05 UTC, 2026-06-29):

  > the hardcoded Chinese text in the wake messages (lines 118-128 of
  > the diff) should be replaced with English or internationalized.
  > The rest of the codebase uses English for user-facing messages,
  > and hardcoded Chinese will confuse non-Chinese users. Consider
  > using a constants dict or the existing i18n infrastructure.

Used the existing i18n infrastructure (agent/i18n.py::t()) — the same
surface gateway/run.py and slash_commands.py already use for static
user-facing strings.

## Changes

- gateway/kanban_watchers.py: import `t` from agent.i18n; replace the
  hardcoded Chinese strings in the synthetic wake-up message with
  t("gateway.kanban.wake.*") lookups. Behavior unchanged for zh users
  (zh catalog preserves the original Chinese phrasing).

- locales/en.yaml: new `gateway.kanban.wake.*` baseline keys (English):
  completed / gave_up / crashed / timed_out / blocked / status_default
  / status_joiner / message (with {task_id} {status} {title}
  {assignee} {board} placeholders).

- locales/zh.yaml: Chinese translation of the new keys, preserving the
  exact wording the original code used (so existing zh users see no
  visible change).

- locales/{zh-hant,ja,de,es,fr,tr,uk,af,ko,it,ga,pt,ru,hu}.yaml: added
  the same key set with English fallback values. The i18n invariant
  test (tests/agent/test_i18n.py::test_catalog_keys_match_english)
  requires every catalog to carry the same key set as en.yaml; native
  translations can land incrementally without breaking users (the
  loader falls back to en.yaml per-key when a translation is missing,
  but the key must still exist).

## Verification

- scripts/run_tests.sh tests/agent/test_i18n.py
  tests/gateway/test_kanban_watchers_mixin.py
  tests/gateway/test_kanban_notifier.py
  tests/gateway/test_kanban_notifier_watcher_dispatch_gate.py
  → 60 passed, 0 failed (i18n catalog parity + placeholders parity +
  existing kanban notifier behavior).

- Manual: with HERMES_LANGUAGE=en, t("gateway.kanban.wake.completed")
  returns "completed"; with HERMES_LANGUAGE=zh, returns "已完成";
  with HERMES_LANGUAGE=ja (translation pending), falls back to
  "completed" per-key.
2026-07-02 00:05:48 +05:30
张满良
c69643026a feat(kanban): route notifications via owning profile + wake creator agent
Three connected changes that fix kanban notifications in multiplex_profile
gateways and enable event-driven agent collaboration:

1. Session profile propagation
   - Add HERMES_SESSION_PROFILE ContextVar (session_context.py)
   - Gateway stamps source.profile at dispatch time (run.py)
   - _maybe_auto_subscribe reads profile from ContextVar instead of
     os.environ which is unset in the gateway main process (kanban_tools.py)

2. Notifier profile-aware routing (kanban_watchers.py)
   - Adapter selection: prefer _profile_adapters[sub.notifier_profile]
     so each profile's bot delivers its own task notifications
   - Relax profile skip-filter: process cross-profile subscriptions when
     the gateway has an adapter for the owning profile
   - Extend TERMINAL_KINDS with status/archived/unblocked

3. Creator agent wakeup on terminal events (kanban_watchers.py)
   - After delivering completed/blocked/gave_up/crashed/timed_out
     notifications, inject a synthetic MessageEvent into the creator's
     session via adapter.handle_message to trigger their agent loop
   - SessionSource built from subscription metadata — no session_store
     lookup needed
2026-07-02 00:05:48 +05:30
kshitijk4poor
7322da487f refactor(codex-runtime): tidy reapply-migration control flow
Self-review follow-up (hermes-pr-review Phase 2, non-blocking clarity findings).

- Collapse the reapplying_enable predicate to a single chained comparison
  (new_value == current == "codex_app_server") instead of a two-clause AND
  that re-tested new_value == current.
- Dedent the msg_lines list literals (drop trailing single-element commas).

No behavior change: reapply still falls through to the idempotent migrate()
while skipping set_runtime/persist (prompt cache preserved), and the auto-disable
early-return is unchanged. 31/31 tests green.
2026-07-01 23:51:54 +05:30
snav
35eb93c8df fix(codex-runtime): re-running /codex-runtime codex_app_server when already enabled now triggers migration
The /codex-runtime slash command short-circuits with "openai_runtime
already set" when invoked with the same value as the current config,
and crucially skips the entire migration block below. The check
conflates two things: (a) "the config value is correct" and (b) "the
world state (managed block in ~/.codex/config.toml, hermes-tools MCP
callback, plugin discovery) is converged".

Common footgun this exposes: a user who pre-sets
`model.openai_runtime: codex_app_server` directly in config.yaml
(reasonable thing to do) and then runs /codex-runtime codex_app_server
to trigger migration sees "already set" and silently gets no migration.
~/.codex/config.toml never receives the managed block, the hermes-tools
MCP callback never registers, and codex falls through to its default
runtime instead of the app-server one — visibly successful but
functionally partial setup.

The migration is idempotent by design (it replaces its own managed
block in place between MIGRATION_MARKER and MIGRATION_END_MARKER), so
re-running it is safe and cheap. Fix the short-circuit to fall through
to migration when re-applying codex_app_server while skipping the
config persist (no value-level change needed). The disable case
(re-applying "auto") still short-circuits because disabling doesn't
touch ~/.codex/config.toml at all.

The user-visible message changes to "openai_runtime already set to
codex_app_server — re-applying migration" so re-runs surface what
happened.

Regression test (test_reapply_codex_app_server_runs_migration) asserts:
- migrate() was called when re-applying
- persist_callback was NOT called (no config write on no-op transitions)
- migration output (MCP servers, sandbox default) surfaces in the
  user-visible message
- requires_new_session is True so callers know to /reset

Verified RED→GREEN: the test fails on origin/main with
"migration must run on reapply, not just first enable" and passes with
this fix. Full test_codex_runtime_switch.py suite: 31 passed.
2026-07-01 23:51:54 +05:30
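The control flow after 35eb93c8df and its tidy-up in 7322da487f can be sketched as below. Only the `reapplying_enable` chained comparison and the persist/migrate rules come from the commit messages; the function signature, the returned dict and the callbacks are simplifications invented for the example.

```python
from typing import Callable, Dict

ENABLED = "codex_app_server"


def switch_runtime(
    new_value: str,
    current: str,
    migrate: Callable[[], None],
    persist: Callable[[str], None],
) -> Dict[str, bool]:
    # Single chained comparison: same value AND that value is the enable.
    reapplying_enable = new_value == current == ENABLED

    if new_value == current and not reapplying_enable:
        # e.g. re-applying "auto": nothing on disk to converge.
        return {"persisted": False, "migrated": False}

    persisted = False
    if not reapplying_enable:
        persist(new_value)  # real value change, so write the config
        persisted = True

    migrated = False
    if new_value == ENABLED:
        migrate()  # idempotent, so safe to re-run on reapply
        migrated = True
    return {"persisted": persisted, "migrated": migrated}
```

Re-applying the enabled value therefore runs the migration without a config write, while re-applying "auto" still short-circuits.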
kshitij
118febb4d9
Merge pull request #56530 from kshitijk4poor/chore-authormap-54872
chore: add AUTHOR_MAP entry for zmlgit (PR #54872 salvage)
2026-07-01 23:36:42 +05:30
kshitijk4poor
b23e1c3077 refactor(approval): extract is_approval_bypass_active(); use frozen-env bypass in codex routing
Self-review follow-up on the salvaged approval-routing fix.

The initial adaptation re-read os.getenv("HERMES_YOLO_MODE") at session-build
time. That diverges from the repo's security invariant: HERMES_YOLO_MODE is
frozen into tools.approval._YOLO_MODE_FROZEN at import time precisely so a skill
running mid-process cannot set the env var and instantly flip the approval
bypass (a prompt-injection escalation path). A live re-read re-opened that hole
for the codex routing path.

- Add tools.approval.is_approval_bypass_active() — the canonical three-source
  bypass check (frozen --yolo/HERMES_YOLO_MODE + session /yolo + approvals.mode
  off) in one place. This is the 4th inline copy of that OR-chain (the three
  sites in approval.py and tui_gateway/server.py:3121 all use the same idiom);
  the helper is the shared chokepoint they can collapse onto.
- codex_runtime.py now calls is_approval_bypass_active() instead of the
  hand-rolled mode-or-session check plus a runtime env re-read.
- Update the env-yolo test to patch _YOLO_MODE_FROZEN (the canonical test
  pattern, e.g. tests/tools/test_yolo_mode.py) rather than setenv, which is
  dead-on-arrival against the frozen constant.

Fail-closed default preserved on every branch; 28 integration + 77 session/yolo
tests pass; E2E confirms the real exec decision flips decline->accept only when
bypass is active.
2026-07-01 22:58:37 +05:30
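The frozen-env invariant that b23e1c3077 restores can be illustrated with a small stand-in. In the repository the flag is a module-level constant frozen at import time; the class below, the set of truthy strings, and the attribute names are assumptions made so the example is self-contained and testable.

```python
from typing import Mapping


class ApprovalState:
    """Illustrative stand-in for the three-source bypass check."""

    def __init__(self, env: Mapping[str, str]) -> None:
        # Snapshot once, at construction (import time in the real code).
        # Assumption: which strings count as truthy is not in the commit.
        raw = env.get("HERMES_YOLO_MODE", "")
        self._yolo_frozen = raw.strip().lower() in {"1", "true", "yes"}
        self.session_yolo = False      # the /yolo session toggle
        self.approval_mode = "manual"  # approvals.mode from config

    def is_bypass_active(self) -> bool:
        return (
            self._yolo_frozen
            or self.session_yolo
            or self.approval_mode == "off"
        )


env = {"HERMES_YOLO_MODE": ""}
state = ApprovalState(env)
env["HERMES_YOLO_MODE"] = "1"  # a mid-process write, e.g. by a skill
still_gated = not state.is_bypass_active()
```

Because the flag is read once, the later write to the environment has no effect, which is the property a live `os.getenv` re-read would have broken.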
snav
0b8e81996f fix(codex-app-server): honor approvals.mode/yolo for gateway-context approval routing
On gateway/cron/non-CLI contexts the codex app-server runtime has no UI to
surface codex's exec/apply_patch approval requests, so they fail closed
(silently decline) — the bot appears responsive but cannot write files, with
no approval prompt anywhere ("patch rejected by user").

When the user has explicitly opted out of Hermes approvals (approvals.mode: off,
the /yolo session toggle, or HERMES_YOLO_MODE=1), collapse to codex's own
sandbox permission profile (~/.codex/config.toml) as the policy gate by passing
_ServerRequestRouting(auto_approve_exec=True, auto_approve_apply_patch=True) to
the session. Defaults (manual/smart/unset) preserve the current fail-closed
behavior — a no-op for users who have not opted out.

Reads the mode via the canonical tools.approval._get_approval_mode() (which
already normalizes the YAML-1.1 bare-'off'->False case) at session-build time,
so a mid-session /yolo toggle is honored too.

5 integration tests: each opt-out mechanism (config off, YAML False, env var,
session yolo) plus the default fail-closed regression guard.

Closes #26530

Co-authored-by: snav <jake@nousresearch.com>
2026-07-01 22:58:37 +05:30
kshitijk4poor
148674e27c chore: add AUTHOR_MAP entry for zmlgit (PR #54872 salvage) 2026-07-01 22:42:58 +05:30
srojk34
4612ee9464 security(browser): re-check private-network guard after browser_back navigation
Every other content-returning browser tool entry point
(browser_snapshot/vision/console/eval, and click/type/press via
_blocked_private_page_action) re-checks window.location.href against the
private/internal/cloud-metadata floor after the page could have changed --
because a redirect chain or client-side navigation can land on an address
the initial browser_navigate preflight never saw. browser_back was the one
navigation-triggering entry point missing this: it called
_run_browser_command(..., "back", []) and returned the resulting URL
straight to the model with no re-check.

On a cloud/CDP (non-local) backend, if browser history contains a
private/internal address (e.g. a prior redirect touched an internal host),
browser_back would navigate the live browser there and hand the URL back
to the model with no guard -- the exact class of gap the private-page
guard exists to close, just on the one entry point it hadn't reached yet.

Re-check happens after the navigation succeeds (not before, unlike
click/type/press) since it's the resulting page -- not the one being left
-- whose safety matters. A failed back navigation (no history) skips the
check entirely since nothing changed. Verified live: the new regression
test fails (returns the private URL instead of a blocked payload) on the
pre-fix code and passes after.
2026-07-01 20:01:55 +03:00
Teknium
9be292f1e6
fix(desktop): make MoA preset selection persistent, not one-shot (#54670) (#56417)
The MoA preset section in the composer model dropdown presented presets like
persistent model selections, but selecting one dispatched the one-shot `/moa`
command (command.dispatch name=moa) — it ran a single turn through MoA and then
silently reverted to the prior model. The user saw MoA context for one message,
then it vanished with no indication.

Route MoA preset selection through the same persistent path real provider
selections use: onSelectModel({ model: preset, provider: 'moa' }) →
config.set model="<preset> --provider moa" → the gateway's switch_model. The
check mark now reflects the real current selection (currentProvider === 'moa'
&& currentModel === preset) instead of transient local state, and the
now-unused activeMoaPreset state is removed.

Tests: new model-menu-panel.test.tsx (2) — selecting a preset calls
onSelectModel with provider 'moa' (persistent), and the check renders on the
active preset. tsc -b clean.
2026-07-01 06:40:20 -07:00
nankingjing
5eaccf5802 fix(gateway): queue interrupts during in-flight context compression
With the default busy_input_mode=interrupt, a burst of rapid gateway
messages arriving while context compression is in flight could interrupt
the current turn and start a fresh turn against the pre-rotation parent
session. Because compression is interrupt-immune (#23975), the still-
running compression later rotates the id out from under that new turn,
and if the new turn also grew past the compression threshold it started
its own uncancellable compression on the same stale parent — forking
multiple orphaned one-shot sibling continuations (#56391).

While a state.db compression lock is held for the session, demote
'interrupt' busy-input mode to 'queue' semantics (mirroring the subagent
protection in #30170), so the follow-up message waits for the in-flight
compression + its id rotation to land instead of racing a new turn
against the stale parent. Ack copy explains the compression demotion.

Fixes #56391.
2026-07-01 06:38:24 -07:00
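The demotion rule in 5eaccf5802 reduces to a small decision function. The sketch below assumes the lock state is available as a boolean and that mode names are plain strings; how the gateway actually queries the state.db compression lock is not shown in the commit.

```python
def effective_busy_input_mode(configured: str, compression_lock_held: bool) -> str:
    """Demote 'interrupt' to 'queue' while compression is in flight.

    Other modes pass through unchanged, and 'interrupt' is untouched
    when no compression lock is held for the session.
    """
    if configured == "interrupt" and compression_lock_held:
        return "queue"
    return configured


# A burst message arriving mid-compression waits instead of interrupting.
mode_during_compression = effective_busy_input_mode("interrupt", True)
```

Queuing means the follow-up turn starts only after the in-flight compression and its id rotation have landed, so it never runs against the stale parent session.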
Teknium
1641441837
fix(desktop): don't false-timeout long prompt.submit turns (MoA, deep reasoning) (#56411)
prompt.submit is fire-and-forget — turn completion is signaled by stream /
message.complete events, not the RPC return — but it inherited the generic 30s
default RPC timeout. A turn that legitimately takes >30s to ACK (MoA presets
running references + aggregator in series, deep reasoning, large tool chains)
popped a false 'request timed out: prompt.submit' toast at 30s while the turn
was still running and streamed its real answer in 60-120s later (#55024).

Add PROMPT_SUBMIT_REQUEST_TIMEOUT_MS (1_800_000 = the backend's
agent.gateway_timeout ceiling) and pass it on all four prompt.submit call sites
(submit, resume-recovery retry, regenerate, rewind), mirroring the existing
SESSION_LIST_REQUEST_TIMEOUT_MS opt-out precedent. Widen the GatewayRequest
type (+ the inline requestGateway prop type) to carry the optional timeoutMs the
runtime impl already accepts.

Tests: use-prompt-actions/index.test.tsx 34/34 pass; tsc -b clean.
2026-07-01 06:33:47 -07:00
Teknium
eae3700b16
fix(moa): raise aux timeouts to 900s and give the Codex aux path a stable prompt_cache_key (#56395)
Two independent MoA auxiliary-call fixes:

#53866 — auxiliary.moa_reference.timeout and auxiliary.moa_aggregator.timeout
were 600s while moa_agent was 120s. Raise both to 900s so a genuinely long
reference/aggregator turn (mixed providers, deep reasoning, long tool chains)
has headroom instead of being cut mid-generation.

#53735 — _CodexCompletionsAdapter (the Codex/Responses auxiliary path used by
the MoA acting-aggregator, compression, web_extract, session_search, etc.)
never set prompt_cache_key, so it stayed cache-cold while the MAIN Responses
transport (agent/transports/codex.py) was warm. Derive the same
content-addressed key via the shared _content_cache_key(instructions, tools)
helper and set it on the aux Responses request, with the same host guards the
main transport uses (xAI carries the key in extra_body; GitHub/Copilot opts out
of cache-key routing).
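
A minimal sketch of a content-addressed cache key with a host guard, assuming the key is a hash over the instructions and tool schemas. The prefix, the digest length, the host markers, and the function names are assumptions for illustration; the real `_content_cache_key` helper may differ in all of them.

```python
import hashlib
import json

KEY_PREFIX = "aux-"                        # hypothetical prefix, not from the commit
SKIP_HOST_MARKERS = ("x.ai", "github")     # assumed substrings for the guarded hosts


def content_cache_key(instructions, tools):
    """Hash the stable request prefix so identical prefixes share one key."""
    payload = json.dumps(
        {"instructions": instructions, "tools": tools},
        sort_keys=True,                    # key order must not change the hash
        separators=(",", ":"),
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return KEY_PREFIX + digest[:32]


def with_cache_key(request, host):
    """Return a copy of `request` with prompt_cache_key set unless the host is guarded."""
    out = dict(request)
    if any(marker in host for marker in SKIP_HOST_MARKERS):
        return out                         # guarded hosts get no top-level key
    out["prompt_cache_key"] = content_cache_key(
        request.get("instructions", ""), request.get("tools", [])
    )
    return out
```

Because the key depends only on the prefix content, two auxiliary calls with the same instructions and tools map to the same key, which is what lets the provider route them to a warm cache.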

Tests: 5 new prompt_cache_key cases (set+prefixed, stable across identical
prefix, differs on different instructions, skipped for xai/github hosts).
tests/agent/test_auxiliary_client.py 279 pass; tests/hermes_cli/test_config.py
130 pass.
2026-07-01 06:02:40 -07:00
Teknium
aa605b66c8
fix(moa): price aggregator turn at its real model so session cost isn't advisor-only (#56394)
On the MoA path agent.model/provider are the virtual preset name (e.g.
"closed") and "moa", which have no pricing entry. estimate_usage_cost()
returned None for the aggregator turn, so the `if amount_usd is not None`
guard skipped it and the session's estimated_cost_usd reflected only the
advisor fan-out — a ~50% undercount when the aggregator does the full acting
loop (verified: $0.91 advisor-only vs $1.96 true, aggregator = 54%).

MoAChatCompletions.create() now stashes the resolved aggregator slot as
last_aggregator_slot (exposed via MoAClient); conversation_loop reads it to
price the aggregator turn at its real model/provider. cost_source flips from
'none' to 'provider_models_api'.
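
The undercount and its fix can be sketched as follows. The pricing table, the provider and model names, and `price_aggregator_turn` are hypothetical; only the behavior (a `None` estimate for the virtual preset, then re-pricing at the resolved aggregator slot) follows the message.

```python
# Hypothetical prices in USD per million tokens; not real pricing data.
PRICING = {
    ("example-provider", "example-model"): {"input": 2.0, "output": 8.0},
}


def estimate_usage_cost(provider, model, input_tokens, output_tokens):
    """Return the estimated cost in USD, or None when no pricing entry exists."""
    price = PRICING.get((provider, model))
    if price is None:
        return None
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def price_aggregator_turn(agent_provider, agent_model, last_aggregator_slot,
                          input_tokens, output_tokens):
    """Price the turn at the resolved aggregator slot when the agent is a MoA preset."""
    provider, model = agent_provider, agent_model
    if agent_provider == "moa" and last_aggregator_slot is not None:
        provider = last_aggregator_slot["provider"]
        model = last_aggregator_slot["model"]
    return estimate_usage_cost(provider, model, input_tokens, output_tokens)
```

With the virtual pair ("moa", "closed") the plain estimate is `None` and the turn is skipped; with the slot supplied, the same token counts produce a number that can be added to the session total.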
2026-07-01 06:02:33 -07:00
kshitijk4poor
b795a45b8d fix(compaction): detect and strip merge-into-tail summaries past the delimiter
Follow-up to the END-MARKER reorder: moving the summary prefix after the
[PRIOR CONTEXT] wrapper meant _is_context_summary_content (prefix-at-start)
no longer recognized a merged-tail summary. That silently broke three
consumers — the last-real-user anchor (would pick the merged summary as a
real user turn, causing active-task loss), the carry-forward summary find,
and the auto-focus skip. _strip_summary_prefix would also carry the wrapper
+ stale tail content forward as the next summary body.

Extract the two delimiter strings into _MERGED_PRIOR_CONTEXT_HEADER /
_MERGED_SUMMARY_DELIMITER constants (writer + detector stay in sync), teach
_is_context_summary_content and _strip_summary_prefix to look past the
delimiter, and add a regression test. Standalone summaries unchanged.
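
A rough sketch of looking past the delimiter, assuming a merged message starts with the header and carries the summary after the delimiter. All three string values are stand-ins, not the real constants, and the function names drop the leading underscore to mark them as illustrative.

```python
SUMMARY_PREFIX = "[CONTEXT SUMMARY]"                 # stand-in value
MERGED_PRIOR_CONTEXT_HEADER = "[PRIOR CONTEXT]"      # stand-in value
MERGED_SUMMARY_DELIMITER = "[END OF PRIOR CONTEXT]"  # stand-in value


def _summary_candidate(text):
    """Return the part of `text` where a summary prefix would have to start."""
    if text.startswith(MERGED_PRIOR_CONTEXT_HEADER) and MERGED_SUMMARY_DELIMITER in text:
        # Merged-tail form: skip the wrapped old tail content entirely.
        return text.split(MERGED_SUMMARY_DELIMITER, 1)[1].lstrip()
    return text


def is_context_summary_content(text):
    """True for standalone summaries and for merged-tail summaries."""
    return _summary_candidate(text).startswith(SUMMARY_PREFIX)


def strip_summary_prefix(text):
    """Return only the summary body, dropping the wrapper and stale tail content."""
    candidate = _summary_candidate(text)
    if candidate.startswith(SUMMARY_PREFIX):
        return candidate[len(SUMMARY_PREFIX):].lstrip()
    return text
```

Sharing the constants between the writer and the detector is the point of the extraction: if the wrapper text changes in one place, detection changes with it.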
2026-07-01 18:23:01 +05:30
Gromykoss
a1a8a967e1 fix(compaction): place END MARKER last in merge-into-tail summaries
When the compression summary is merged into the first tail message
(the alternation corner case where a standalone summary role would
collide with both head and tail), the old format was
SUMMARY + END_MARKER + OLD_TAIL_CONTENT — so the preserved tail content
appeared AFTER the end marker and the model could read it as a fresh
message to respond to.

Reorder so the END MARKER is always last: old tail content is wrapped in
[PRIOR CONTEXT ...][END OF PRIOR CONTEXT — COMPACTION SUMMARY BELOW]
delimiters, then the summary, then the END MARKER. _append_text_to_content
handles both string and multimodal-list content.
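
The reordered layout can be sketched like this, assuming text parts in multimodal content are dicts of the form `{"type": "text", "text": ...}`. The marker strings are stand-ins rather than the real ones, and `merge_summary_into_tail` is a hypothetical name.

```python
PRIOR_CONTEXT_HEADER = "[PRIOR CONTEXT]"        # stand-in value
SUMMARY_DELIMITER = "[END OF PRIOR CONTEXT]"    # stand-in value
END_MARKER = "[END OF SUMMARY]"                 # stand-in value


def append_text_to_content(content, text):
    """Append text to string content, or add a text part to multimodal list content."""
    if isinstance(content, str):
        return content + text
    return list(content) + [{"type": "text", "text": text}]


def merge_summary_into_tail(tail_content, summary):
    """Wrap the old tail content first, then the summary, with the end marker last."""
    if isinstance(tail_content, str):
        wrapped = f"{PRIOR_CONTEXT_HEADER}\n{tail_content}"
    else:
        wrapped = [{"type": "text", "text": PRIOR_CONTEXT_HEADER}] + list(tail_content)
    suffix = f"\n{SUMMARY_DELIMITER}\n{summary}\n{END_MARKER}"
    return append_text_to_content(wrapped, suffix)
```

Nothing follows the end marker in either content form, so the preserved tail text can no longer be read as a fresh message after the summary.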

Salvaged from #56372 by @Gromykoss. Only the END-MARKER reorder half is
carried over. The PR's second change (a post-compaction pass that strips
user-role messages before the first summary marker on compression_count>=2)
was dropped: on 2nd+ compactions the protected head decays to system-only
(_effective_protect_first_n -> 0, #11996) so the targeted 'ghost head user'
does not occur, and where the strip does fire it deletes legitimate recent
tail user turns (data loss) and can leave consecutive assistant messages
(role-alternation violation).
2026-07-01 18:23:01 +05:30
teknium1
d00762623b fix(i18n): add gateway.resume.blocked_not_owner to all locales
The salvaged PR added the new key to locales/en.yaml only, so the i18n
catalog-parity test (tests/agent/test_i18n.py::test_catalog_keys_match_english)
failed for all 15 non-English locales. Add the key to every locale with the
English string (matching the existing convention for the untranslated
matrix_cross_room_success key), preserving the {name} placeholder so the
placeholder-parity test also passes.
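
A small sketch of the two parity properties, assuming each catalog is a flat dict of dotted key to string (the real YAML layout may be nested). The English text used in the usage note is invented; only the key name and the {name} placeholder come from the message.

```python
import string


def fill_missing_keys(english, locale):
    """Return a copy of `locale` where keys missing relative to English use the English text."""
    filled = dict(locale)
    for key, text in english.items():
        filled.setdefault(key, text)
    return filled


def placeholders(text):
    """Collect the named {placeholders} in a format string."""
    return {name for _, name, _, _ in string.Formatter().parse(text) if name}
```

For example, with `english = {"gateway.resume.blocked_not_owner": "Session belongs to {name}."}` and an empty locale, the filled locale has the same key set as English and the same placeholder set for that key, which is what the two tests check.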
2026-07-01 05:38:03 -07:00
teknium1
5b3f064259 security(gateway): fail closed on persisted /resume when caller keys on user_id_alt
The persisted (DB-fallback) branch of _resume_target_allowed() compared only
sessions.user_id against source.user_id, but build_session_key() keys the
participant on `user_id_alt or user_id` (Signal/Feishu carry the canonical
participant in user_id_alt). The sessions table has no user_id_alt column, so a
per-user row whose owner shares the caller's user_id but not its user_id_alt
maps to a DIFFERENT live session key, yet the row's user_id matched both
participants: a co-member could resume/enumerate another member's persisted
per-user group or no-chat_id DM session (IDOR, CWE-639).

The live-origin guard (_same_origin_chat) already compares user_id_alt; the
persisted fallback couldn't. Fail closed on both identity-bearing per-user
branches (non-DM per-user group, no-chat_id DM) whenever the caller carries a
user_id_alt. Shared group/thread sessions (no participant scoping) and DMs keyed
on a present chat_id are unaffected; callers keyed on user_id (e.g. Telegram)
still resume their own rows; admin --all override still applies.
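
The mismatch and the fail-closed rule can be sketched as below. The function names are hypothetical and the row is reduced to its user_id; this is not the gateway's actual code.

```python
def participant_id(user_id, user_id_alt):
    """Mirror of the keying rule in the message: `user_id_alt or user_id`."""
    return user_id_alt or user_id


def persisted_per_user_allowed(row_user_id, caller_user_id, caller_user_id_alt):
    """Decide a per-user persisted resume when the row stores only user_id."""
    if caller_user_id_alt:
        # The row cannot prove which user_id_alt it belonged to, so fail closed.
        return False
    return bool(row_user_id) and row_user_id == caller_user_id
```

Two callers with the same user_id but different user_id_alt values get different participant ids, so neither can claim the other's row through the persisted fallback; a caller keyed on user_id alone still matches its own rows.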

Regression: tests/gateway/test_resume_command.py::
test_resume_persisted_fallback_fails_closed_on_user_id_alt.
2026-07-01 05:38:03 -07:00
claudlos
f1e58d8c1a security(gateway): allow shared-group resume in persisted /resume fallback
Addresses egilewski follow-up on PR #52355: the persisted-row fallback required
row_uid == caller_uid for every identity-bearing caller, which wrongly blocked a
legitimately SHARED non-DM group session. With group_sessions_per_user=False,
build_session_key resolves every participant of a chat to one session key, so a
co-member (different user_id) in the same chat shares Bob's session — but the
guard returned "/resume blocked".

Mirror is_shared_multi_user_session() in the fallback, exactly as the live-origin
branch (_same_origin_chat) already does: for a non-DM caller, first require the
same platform + chat + thread provenance (unchanged — blank/mismatching chat
still fails closed), then allow without user-id equality when the session is
shared, and keep requiring the same owner for per-user group/thread sessions.
DM scoping is unchanged (always per-user).
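
A minimal sketch of that decision order for a non-DM caller, assuming the row and caller are dicts with platform, chat_id, thread_id, and user_id. The `group_sessions_per_user` flag stands in for is_shared_multi_user_session(), whose real logic is not shown in the message.

```python
def _norm(value):
    return (value or "").strip()


def persisted_group_resume_allowed(row, caller, group_sessions_per_user):
    """Provenance first, then either shared access or an owner match."""
    if _norm(row.get("platform")) != _norm(caller.get("platform")):
        return False
    row_chat, caller_chat = _norm(row.get("chat_id")), _norm(caller.get("chat_id"))
    if not row_chat or row_chat != caller_chat:
        return False                      # blank or mismatching chat fails closed
    if _norm(row.get("thread_id")) != _norm(caller.get("thread_id")):
        return False
    if not group_sessions_per_user:
        return True                       # shared session: no user-id equality needed
    row_uid = _norm(row.get("user_id"))
    return bool(row_uid) and row_uid == _norm(caller.get("user_id"))
```

The three regression cases in the message map directly onto this: shared group with a different user is allowed, per-user group with a different user is blocked, and a different chat is blocked regardless of the shared flag.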

Adds a regression: shared group → co-member allowed; per-user group → blocked;
different chat → blocked even when shared.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 05:38:03 -07:00
claudlos
599a6391d4 security(gateway): fail closed on no-provenance persisted /resume for non-DM callers
Addresses egilewski/CodeRabbit follow-up on PR #52355: the identity-bearing
persisted fallback compared row_chat == caller_chat, which SUCCEEDS when both
normalize to "" — so a legacy row with no stored chat provenance could still be
resumed by a caller that also has no chat_id (probe: a group caller with
chat_id=None resuming a NULL-chat telegram row on matching user_id).

A non-DM session (group/channel/forum/thread) is keyed by chat_id in
build_session_key, so a blank chat on either side is NOT proof of same-chat.
Require both row and caller chat_id to be non-blank and equal for non-DM
callers; a legacy NULL-chat row (or a caller missing its chat_id) now fails
closed. DMs are unchanged: they are keyed on user_id, so a no-chat_id DM row
stays resumable by the same user (and a mismatching chat_id, when present, is
still rejected).
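
The pitfall is easy to reproduce in isolation. A short sketch with hypothetical helper names, comparing the loose equality that was exploitable with the stricter non-blank check:

```python
def _norm(value):
    return (value or "").strip()


def same_chat_loose(row_chat, caller_chat):
    """Old behavior: two blank values normalize to "" and compare equal."""
    return _norm(row_chat) == _norm(caller_chat)


def same_chat_strict(row_chat, caller_chat):
    """New behavior: both sides must be non-blank and equal."""
    row, caller = _norm(row_chat), _norm(caller_chat)
    return bool(row) and bool(caller) and row == caller
```

`same_chat_loose(None, None)` is true, which is the probe case; `same_chat_strict(None, None)` is false while a real match such as `("chat-a", "chat-a")` still passes.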

Adds the blank-caller-chat group probe and a DM no-chat_id same-user/other-user
regression.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 05:38:03 -07:00
claudlos
5248877c61 security(gateway): prove chat/thread origin for persisted /resume; tighten DM scoping
Addresses the egilewski/CodeRabbit and teknium1 reviews on PR #52355.

1) Persisted-row chat scope (egilewski/CodeRabbit). The sessions table stored
   only source + user_id, so an identity-bearing caller could resume/list an
   INACTIVE persisted row that matched source+user_id but belonged to a
   DIFFERENT chat (probe: same user moves `same_user_chat_b` into chat-a).
   Persist the messaging origin and compare it:
   - schema: sessions gains origin_chat_id / origin_thread_id (declarative
     auto-migration via the existing column reconciler).
   - SessionDB._insert_session_row accepts + writes the two columns.
   - the gateway records them at every origin-bearing creation: both
     SessionStore create paths (get_or_create_session + reset/switch) and the
     /title path that materializes a store-only session into the DB.
   - _resume_target_allowed's identity branch now also requires
     origin_chat_id AND origin_thread_id to match the caller. Legacy rows with
     NULL origin (created before this change) cannot prove chat origin and
     fail closed — resume them via a live session or an admin --all override.
   The /sessions listing inherits the fix (non-Matrix rows route through the
   same helper).

2) DM key-contract mirror (teknium1). _same_origin_chat's DM branch only
   compared user_id and allowed when either side was missing, diverging from
   build_session_key (no-chat_id DM keys are built from user_id_alt or
   user_id). It now: treats an equal non-blank chat_id as sufficient (the DM
   key IS the chat_id when present), and otherwise compares the effective
   participant id (user_id_alt or user_id), failing closed on a
   missing/different participant so two no-chat_id DM origins are never
   conflated.
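
The DM branch in item 2 can be sketched as follows, with origins modeled as dicts. One choice here goes beyond the message and is an assumption: when both sides carry non-blank chat_ids that differ, this sketch rejects rather than falling through to the participant comparison.

```python
def _norm(value):
    return (value or "").strip()


def effective_participant(origin):
    """Participant id as the session key builds it: user_id_alt, else user_id."""
    return _norm(origin.get("user_id_alt")) or _norm(origin.get("user_id"))


def same_origin_dm(live, caller):
    """DM origin check mirroring the key contract."""
    live_chat, caller_chat = _norm(live.get("chat_id")), _norm(caller.get("chat_id"))
    if live_chat and caller_chat:
        # The DM key is the chat_id when present, so equality settles it.
        return live_chat == caller_chat
    live_p, caller_p = effective_participant(live), effective_participant(caller)
    # A missing participant on either side fails closed.
    return bool(live_p) and live_p == caller_p
```

Two no-chat_id origins with no identity at all now compare as different, instead of being allowed because a field was missing.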

Tests: add same-user/different-chat (e2e + unit) and chat-scope unit cases;
add DM no-chat_id / user_id_alt / no-identity / same-chat_id cases; update
existing fixtures to record origin_chat_id like the gateway does; make the
cross-room `/resume --all` listing test run as admin (cross-room listing is
admin-gated) and give the boundary-state resume runner a live same-origin so
its post-resume clearing assertions exercise an authorized resume.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 05:38:03 -07:00
claudlos
33a5090bf6 security(gateway): fail closed on persisted /resume for identity-less callers
Addresses egilewski (Codex/CodeRabbit) follow-up on PR #52355: the no-identity
branch of _resume_target_allowed() returned True after only checking that the
row's source didn't mismatch the caller platform. The sessions table has no
chat_id, so same-platform alone is not ownership proof — a Telegram group
caller in chat-a with user_id=None could resume (and /sessions could list) a
persisted row owned by another chat/user (e.g. victim_chat_b_uid,
source=telegram, user_id=victim).

Fail closed: an identity-less caller can no longer bind to or enumerate a
persisted session by id/title. A legitimate same-chat resume of an ACTIVE
session still works via the live-origin branch (which compares chat_id), and an
operator can use the admin --all override. The listing path inherits the fix
because _resume_row_visible() routes non-Matrix rows through the same helper.

Adds an end-to-end no-identity probe (resume blocked) and a unit-level
persisted-fallback assertion.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 05:38:03 -07:00
claudlos
bb6e216aab security(gateway): scope Matrix /resume by thread, not just room
Addresses egilewski (Codex) CR on PR #52355: the Matrix direct /resume <id>
guard (and the Matrix listing guard) used _same_matrix_room(), which compared
only platform + chat_id. But build_session_key() appends thread_id for every
chat type when present, and Matrix scopes the model's turn to the current
room/thread — so a live session in another thread of the SAME room is a
DIFFERENT session. A caller in thread A could resume a target whose live origin
was in thread B (switch_session fired on the victim session).

Add a thread_id equality check to _same_matrix_room so room scoping also
enforces the thread boundary. Non-threaded rooms have empty thread_id on both
sides ("" == ""), so existing room-level sharing is preserved unchanged; only
cross-thread access is newly blocked. This mirrors the thread handling already
in _same_origin_chat for the non-Matrix adapters.
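
A sketch of the room check with the added thread comparison, assuming origins are dicts with platform, chat_id, and thread_id. The function name drops the leading underscore to mark it as illustrative rather than the gateway's code.

```python
def _norm(value):
    return (value or "").strip()


def same_matrix_room(live, caller):
    """Same platform and room, and now also the same thread."""
    return (
        _norm(live.get("platform")) == _norm(caller.get("platform"))
        and _norm(live.get("chat_id")) == _norm(caller.get("chat_id"))
        and _norm(live.get("thread_id")) == _norm(caller.get("thread_id"))
    )
```

Because a missing thread_id normalizes to an empty string on both sides, unthreaded rooms behave as before, while a threaded caller no longer matches an unthreaded session or a session in another thread.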

Adds regressions replaying the reviewer's thread-a -> thread-b probe (direct
guard + listing path), plus same-thread-shared and thread-vs-no-thread cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 05:38:03 -07:00
claudlos
a0018cafd0 security(gateway): fail closed on blank-source rows in /resume scoping
Addresses egilewski (Codex) CR on PR #52355: the persisted-row fallback in
_resume_target_allowed() skipped the platform/source check when sessions.source
was blank (the row_src guard only rejects a *mismatching* non-blank source),
then accepted the row on user_id equality alone. A legacy/malformed row with a
blank source but a matching user_id was therefore resumable — an identified
caller could bind to a transcript whose origin it can't prove.

Now an identity-bearing caller is allowed only when the row proves BOTH the
same owner (non-blank user_id match) AND the same platform/origin (non-blank
source match). A blank/legacy source fails closed, exactly like a missing
user_id. No-identity (single-user) callers are unaffected.

Adds a regression replaying the reviewer's blank-source same-uid probe.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 05:38:03 -07:00
claudlos
c4f278c021 security(gateway): scope /resume and /sessions to the caller's origin (IDOR)
/resume resolved a persisted session id/title with no ownership check on any
adapter except Matrix, so an authorized caller could bind their gateway session
to another user's/room's transcript and read it. The titled-session listing and
numeric index were also globally enumerable on non-Matrix platforms, exposing
the ids and previews needed to target the IDOR.

Generalize the Matrix-only room guard to an adapter-agnostic ownership check
(live origin when active; DB row source + user_id for persisted-only sessions,
the only fields available), applied to the direct-id/title path and the
listing/numeric paths on every platform. An explicit admin --all override is
honored. The Matrix path is preserved unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 05:38:03 -07:00
teknium1
5d613a5638 fix(terminal): route init_session bootstrap cd through Windows path conversion
The Windows _quote_cwd_for_cd override only reached _wrap_command; the
snapshot bootstrap cd in init_session still used a bare shlex.quote(),
so on Windows the bootstrap cd failed and pwd -P captured the login
shell's dir instead of terminal.cwd. Route it through _quote_cwd_for_cd
too, and add -- for hyphen-safety to match _wrap_command.
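
A rough sketch of routing both code paths through one quoting helper. The drive-letter conversion shown (C:\Users\me to /c/Users/me, as Git Bash style shells expect) is an assumption about what "Windows path conversion" means here, and `to_posix_style` and `bootstrap_cd` are hypothetical names.

```python
import re
import shlex


def to_posix_style(path):
    """Hypothetical conversion of a Windows path to a POSIX-style one."""
    match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
    if not match:
        return path.replace("\\", "/")
    drive, rest = match.group(1).lower(), match.group(2).replace("\\", "/")
    return f"/{drive}/{rest}"


def quote_cwd_for_cd(path, windows):
    """Quote a working directory for the shell, converting first on Windows."""
    target = to_posix_style(path) if windows else path
    return shlex.quote(target)


def bootstrap_cd(path, windows):
    """Build the cd command; `--` keeps a leading hyphen from being read as an option."""
    return f"cd -- {quote_cwd_for_cd(path, windows)}"
```

If the bootstrap used a bare `shlex.quote(path)` instead, the backslashes would reach the shell unconverted, which matches the failure the message describes.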
2026-07-01 05:35:34 -07:00
LeonSGP43
9ed7252a98 fix terminal cwd handling on windows
2026-07-01 05:35:34 -07:00