Attribution audit gate: the salvaged contributor commits carry
kiljadn@gmail.com (Nick Mason / @designnotdrum). Add the mapping so
contributor_audit.py resolves the author on this PR.
Map the two contributor emails whose commits are cherry-picked into the
compression-routing-integrity salvage so scripts/contributor_audit.py
attributes them at release time:
- jvsantos.cunha@gmail.com -> plcunha (PR #55300)
- jakepresent1@gmail.com -> jakepresent (PR #55721)
r266-tech (PR #50517) is already mapped.
With the default busy_input_mode=interrupt, a burst of rapid gateway
messages arriving while context compression is in flight could interrupt
the current turn and start a fresh turn against the pre-rotation parent
session. Because compression is interrupt-immune (#23975), the still-
running compression later rotates the id out from under that new turn,
and if the new turn also grew past the compression threshold it started
its own uncancellable compression on the same stale parent — forking
multiple orphaned one-shot sibling continuations (#56391).
While a state.db compression lock is held for the session, demote
'interrupt' busy-input mode to 'queue' semantics (mirroring the subagent
protection in #30170), so the follow-up message waits for the in-flight
compression + its id rotation to land instead of racing a new turn
against the stale parent. Ack copy explains the compression demotion.
Fixes#56391.
Adds Vertex AI as a first-class provider for Gemini models via Vertex's
OpenAI-compatible endpoint. Vertex authenticates with short-lived OAuth2
access tokens (service-account JSON or ADC), not a static API key — the
missing piece behind the recurring requests (#13484, #12639, #56259).
- agent/vertex_adapter.py: OAuth2 token minting + refresh-on-expiry
(5-min margin), ADC->service-account fallback, global vs regional
endpoint URLs. Config precedence: env var > config.yaml > default.
- plugins/model-providers/vertex/: provider profile (auth_type=vertex),
reuses Gemini's extra_body.google.thinking_config translation.
- runtime_provider: vertex short-circuit BEFORE the credential pool so a
credentials-file path is never mistaken for a static API key; mints a
fresh token + computes base_url per resolve.
- run_agent + conversation_loop: _try_refresh_vertex_client_credentials()
re-mints the token and rebuilds the client on a mid-session 401, so a
long-lived gateway agent survives token expiry (~1h).
- auxiliary_client: vertex auth_type branch for side-LLM tasks.
- config.yaml: vertex.project_id / vertex.region (non-secret, bridged to
env); credential path stays in .env (VERTEX_CREDENTIALS_PATH).
- setup wizard + model picker: dedicated _model_flow_vertex; curated
google/gemini-* model list; --provider choices.
- pricing/metadata: Vertex prices off the gemini docs snapshot; endpoint
host auto-maps to the vertex provider (no probe spam).
- lazy_deps + pyproject [vertex] extra: google-auth, opt-in only.
- docs: guides/google-vertex.md + providers page; tests for adapter +
runtime resolution.
Salvages and modernizes #8427 by @slawt onto current main: rewired from
the legacy PROVIDER_REGISTRY path to the provider-profile architecture,
moved non-secret config out of .env into config.yaml, and added the
per-turn 401 token-refresh the original lacked.
- Correct the exit-75 comment: Hermes-generated units set
StartLimitIntervalSec=0 (rate limiting disabled), so StartLimitBurst
does not bound loops. The real bound is that genuine crashes exit
non-zero-but-not-75, and RestartForceExitStatus=75 only whitelists
the planned code.
- Add randomuser2026x AUTHOR_MAP entry (CI blocks unmapped emails).
Wrap each Telegram initialize() attempt in asyncio.wait_for(HERMES_TELEGRAM_INIT_TIMEOUT,
default 30s). When api.telegram.org and all fallback IPs are unreachable, the connect
chain has no outer bound, so a single initialize() blocks for minutes and the
retry-on-exception loop never fires — the gateway appears to hang after the banner.
The timeout guarantees each attempt is bounded, then retries with backoff, then fails
with an actionable error. Also adds WARNING-level progress logs before DoH discovery
and each connect attempt (visible at default log level).
Salvaged onto plugins/platforms/telegram/adapter.py (Telegram moved from
gateway/platforms/ since the PR was opened). Adds env var to docs + AUTHOR_MAP.
Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>
Follow-up on the salvaged #49830 hardening. The contributor's sensitive
query-param set included bare English words (code, key, auth, session,
sig) that double as ordinary page facets — ?code= on promo/challenge
pages, ?key= as a search facet, ?session= on blogs — so web_extract and
cloud browser_navigate would refuse a large slice of normal browsing.
Narrow the set to unambiguously credential-named params (access_token,
authorization, client_secret, password, token, x-amz-signature, ...).
Prefix-based vendor-key redaction (is_safe_url) still catches recognizable
key shapes; this set is the belt-and-suspenders for opaque secrets carried
under an explicit credential-named parameter.
Also fixes two intra-PR-staleness test breakages surfaced by salvaging onto
current main:
- web_extract_tool() no longer accepts use_llm_processing= (signature
changed since the PR was authored) — dropped the invalid kwarg.
- agent.redact now fully masks keyed 'token=<secret>' to 'token=***'
instead of partial 'sk-...'; the console-redaction test now asserts the
real invariant (secret body gone) rather than the exact mask format.
Added a regression test that generic English-word query params are NOT
blocked by the credential guard.
Add tmp_path symlink regression tests for both generate_systemd_unit and
generate_launchd_plist (~/.local/bin/node -> profile node install must not
leak the profile target into the generated unit PATH). Register
jearnest11's AUTHOR_MAP entry for the salvage cherry-pick.
Register the Matrix room-message, reaction, and invite handlers with
mautrix's wait_sync=True. mautrix's handle_sync() only returns the tasks
for handlers registered as sync-awaited; non-waited handlers are
fire-and-forget via background_task.create() and are NOT returned. Since
_dispatch_sync() awaits only the returned tasks (await asyncio.gather),
the inbound handlers previously had no completion point, so Tuwunel/
mautrix homeservers connected and completed initial sync but dispatched
zero inbound messages.
Fixes#46142.
Co-authored-by: Zeheng Huang <153708448+hunjaiboy@users.noreply.github.com>
Think-enabled models (MiniMax M2.7, DeepSeek, etc.) emit inline
<think>...</think> reasoning even for simple prompts like title
generation, and the raw XML was leaking into session titles. Route the
title-model response through the canonical strip_think_blocks scrubber
before cleanup so every tag variant — closed pairs, unterminated blocks,
orphan closes, mixed case — is handled, not just a single literal
<think> pair.
- 2 regression tests: closed <think> pair stripped, unterminated block
at start yields no title.
Salvaged from PR #44126 by @shawchanshek.
Maps the two plain-email contributors whose PRs are being salvaged so
contributor_audit.py passes:
- info@djimit.nl -> djimit (PR #48034)
- lubos@komfi.health -> lubosxyz (PR #49225)
The other two PRs in the batch (#50405 sasquatch9818, #48764 srojk34)
use users.noreply.github.com emails, which check-attribution auto-skips.
Slack Workflow Builder posts (and other app/bot messages) arrive as
subtype=bot_message with user=None. _is_user_authorized rejected them at
the `if not user_id: return False` guard, which runs *before* the #4466
{PLATFORM}_ALLOW_BOTS bypass — so @mentioning the bot from a Slack
workflow silently did nothing, even with SLACK_ALLOW_BOTS (or
SLACK_ALLOW_ALL_USERS) set. The chat-scoped allowlist for Telegram/QQ
already runs before that guard for the same reason (channel broadcasts
with no from_user); Slack was both missing from the bot-bypass map and
had the bypass running too late.
- gateway/authz_mixin: move the {PLATFORM}_ALLOW_BOTS bypass ahead of the
no-user-id guard and add Platform.SLACK -> SLACK_ALLOW_BOTS.
- plugins/platforms/slack/adapter: set is_bot=True on inbound
bot_message events so the gateway can identify workflow/app senders
(they carry no user_id to match against the allowlist).
Tested: new tests/gateway/test_slack_bot_auth_bypass.py plus the existing
Discord/Feishu bot-auth and gateway authz/gating suites all pass.
Follow-up on the salvaged resume_pending fix: the empty-turn safety net
now emits the same reason-aware recovery note as the _is_resume_pending
branch (reason phrase + 'session restored' guidance + no-re-execute
instruction) instead of a second, differently-worded note. Also adds the
AUTHOR_MAP entry for the salvaged commit.
The standalone thread-pool fallback in _deliver_result() runs inside the
`except RuntimeError:` block (taken when asyncio.run() sees a running loop).
When future.result() raised there (SMTP ConnectionError, timeout, etc.), the
exception was NOT caught by the sibling `except Exception:` — it escaped
_deliver_result() and crashed the whole delivery loop, silently skipping every
remaining target. Multi-target delivery (e.g. deliver: 'email:a,email:b') is a
documented feature, so this broke a promised contract.
Wrap the fallback in its own try/except so a per-target failure is logged with
exc_info and the loop continues to the next target.
Fixes#47163
The streaming think-tag suppressors in cli.py (_stream_delta) and
gateway/stream_consumer.py (_filter_and_accumulate) matched tag names
with case-sensitive str.find(), so only the exact-case literals in the
tag tuples were caught. Mixed-case variants a model may emit — <Think>,
<ThInK>, <REASONING>, <Thought> — slipped through and leaked raw
reasoning into the user-visible stream.
Match against a lowercased view of the buffer with lowercased tag names
at all three sites (open-tag boundary search, partial-tag hold-back,
close-tag search) in both paths. Only KNOWN tag names are matched — no
substring matching — and the block-boundary gating that protects prose
mentions of <think> is preserved.
- 6 parametrized case-insensitive regression tests in each of
tests/gateway/test_stream_consumer.py and
tests/cli/test_stream_delta_think_tag.py.
Salvaged from PR #27289 by @YLChen-007.
When a model emits an inline <think>...</think> block but the opening
tag is dropped upstream (thinking-mode toggle, truncated stream, or
incomplete upstream filtering), the bare </think> close tag leaked
through to the user in the live progressive edit. The agent-side final
scrubber (agent/think_scrubber.py) already had _strip_orphan_close_tags;
this ports the same logic into GatewayStreamConsumer so the streaming
display stays clean too.
- _filter_and_accumulate: strip orphan close tags before appending the
'no-opening-tag' branch text to _accumulated.
- _flush_think_buffer: same on stream end for held-back partials.
- 14 regression tests (TestStripOrphanCloseTags): all 6 close-tag
variants, multi-tag, partial-tag-untouched, trailing whitespace,
and end-to-end through _filter_and_accumulate / _flush_think_buffer.
Only strips KNOWN close-tag names (case-insensitive) — never arbitrary
tag-shaped substrings — so comparison operators and unrelated prose are
preserved.
Salvaged from PR #43192 by @testingbuddies24.
scripts/release.py AUTHOR_MAP is greped by the Contributor Attribution
Check to resolve a commit author's email -> GitHub username. Add
huangsen365@gmail.com -> huangsen365 so this PR's commits pass the check.
(This commit originally also carried a gateway race-test flake fix; that
edit is now dropped because main independently hardened the same test with
a superior server._sessions snapshot/restore isolation, making ours
redundant.)
`@file` / `@folder` context-reference expansion enforced its own narrow
deny-list (`_ensure_reference_path_allowed` in `agent/context_references.py`)
that only covered `~/.ssh` keys, a handful of shell dotfiles, `~/.hermes/.env`,
and `skills/.hub`. It never blocked the credential stores that the canonical
read guard (`agent/file_safety.get_read_block_error`) protects: provider API
keys (`~/.hermes/auth.json`), Anthropic OAuth tokens
(`~/.hermes/.anthropic_oauth.json`), MCP OAuth material (`~/.hermes/mcp-tokens/`),
webhook HMAC secrets, and project-local `.env` files.
This matters because the messaging gateway feeds **untrusted** remote text
straight into reference expansion: `gateway/run.py` calls
`preprocess_context_references_async(..., allowed_root=_msg_cwd)` where
`_msg_cwd` defaults to the operator's HOME when `TERMINAL_CWD` is unset. A chat
peer (Telegram/Discord/Slack/...) could send `@file:~/.hermes/auth.json`, pass
the `allowed_root` check (it resolves under HOME), slip past the narrow list,
and have the operator's live keys read into the agent's context — where the
model would typically echo or act on them.
Rather than duplicate and re-sync a second secret list, this routes the guard
through the existing single source of truth. A reviewer might ask "why not just
add `auth.json` to the local list?" — because the local list has already drifted
once (a prior commit had to add `.config/gh`); anchoring to
`get_read_block_error` means every future addition there protects this path too.
The narrow checks are kept as a fallback since they also cover dirs that guard
does not (`.aws`, `.gnupg`, `.kube`, etc.), and the canonical lookup is wrapped
so it can never crash reference expansion.
N/A
- [x] 🔒 Security fix
- `agent/context_references.py`: `_ensure_reference_path_allowed` now also
consults `agent.file_safety.get_read_block_error` after its existing checks
and refuses the reference when that canonical guard flags the resolved path.
The lookup is wrapped so guard-resolution failures fall back to the explicit
checks instead of breaking expansion.
- `tests/agent/test_context_references.py`: added
`test_blocks_canonical_read_denylist_credential_stores`, asserting that
`@file` attaches for `auth.json`, `.anthropic_oauth.json`, `mcp-tokens/*`, and
a project-local `.env` are all refused and their secret bodies never reach the
expanded message.
- `scripts/release.py`: added the contributor email to `AUTHOR_MAP` (release
gate).
1. `scripts/run_tests.sh tests/agent/test_context_references.py` — all 15 tests
pass, including the new credential-store case.
2. Regression proof: stash `agent/context_references.py`, run the suite with
`-- -k canonical`, and confirm the new test fails (secrets leak into the
message) without the fix; restore and confirm it passes.
3. `ruff check agent/context_references.py tests/agent/test_context_references.py`
and `python scripts/check-windows-footguns.py agent/context_references.py
tests/agent/test_context_references.py` both pass.
- [x] I've read the Contributing Guide
- [x] My commit messages follow Conventional Commits (`fix(scope):`, etc.)
- [x] I searched for existing PRs to make sure this isn't a duplicate
- [x] My PR contains **only** changes related to this fix (plus the AUTHOR_MAP release gate)
- [x] I've run the test suite for the touched area and all tests pass
- [x] I've added tests for my changes (required for bug fixes)
- [x] I've tested on my platform: macOS 15 (Darwin 25.5)
- [x] I've updated relevant documentation (README, `docs/`, docstrings) — or N/A
- [x] I've updated `cli-config.yaml.example` if I added/changed config keys — or N/A
- [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows — or N/A
- [x] I've considered cross-platform impact (Windows, macOS) — or N/A
- [x] I've updated tool descriptions/schemas if I changed tool behavior — or N/A
models_dev.py's fetch uses a synchronous requests.get(timeout=15). Called
from the async gateway message handlers, it blocked the event loop for up
to 15s, starving Discord heartbeats and causing ClientConnectionResetError
disconnects.
Adds get_model_context_length_async() which offloads the entire sync
resolution chain to a worker thread via asyncio.to_thread(), and switches
the two async gateway call sites (_prepare_inbound_message_text,
_handle_message_with_agent) to await it. The loop stays responsive; the
sync path remains the single source of truth for the cache.
Salvaged from PR #22753 by @itenev. Follow-up: dropped the unused
fetch_models_dev_async/lookup_models_dev_context_async aiohttp variants
from the original PR (dead code with zero callers that had drifted from
the sync cache logic) — the to_thread wrapper already runs the sync path
off-loop, so they were redundant.
Path(raw).name reduces '..'/'.'/'' to themselves, so basename
extraction alone still let a Graph-provided display_name of '..' or
'../' escape the temp recording directory (tmp_dir / '..' resolves to
the parent). Reject the dot-only basenames explicitly and fall back to
the artifact id. Extends @outsourc-e's regression coverage with the
dot-only cases.