startUpdatePoller() only called checkBackendUpdates() — never checkUpdates().
The statusbar version pill reads $updateStatus (set by checkUpdates()), so the
commit-behind counter stayed null after restart. It only appeared when the user
manually clicked the pill, which triggered checkUpdates() via openUpdateOverlayFor.
Added void checkUpdates() in three places alongside the existing
checkBackendUpdates() calls:
- On startup in startUpdatePoller()
- In the 30-minute setInterval callback
- In the onFocus handler
checkUpdates() uses the Electron IPC bridge (local git check), not the gateway,
so no mode gating is needed. The existing $updateChecking atom guard prevents
double-fire on overlap.
Fixes#53079
parseSlashCommand used /^(\S+)\s*(.*)$/ where `.` can't cross a newline and
`$` anchors end-of-string, so any slash command whose arg contained a newline
(/goal <multi-line text>, a skill command with a long pasted context) failed
the whole match, parsed as an empty name, and rendered "empty slash command"
while the payload vanished — cleared from the composer and absent from the
Up-arrow history ring, which only derives from sent user messages.
- name now splits on any whitespace ([\s\S]* arg), matching the CLI and the
gateway's split(maxsplit=1); multiline args flow to slash.exec intact
- the residual empty-name branch (bare "/", "/ text") restores the submitted
text to the composer draft instead of eating it
Fixes#41323. Fixes#55510.
Per-session /model overrides (_session_model_overrides) were in-memory only,
so a gateway restart silently reverted every session to the global default
model. Persist the non-secret parts (model/provider/base_url ONLY — never
api_key) into the session entry in sessions.json and lazily rehydrate them
on first use after a restart, re-resolving credentials through the normal
runtime provider resolution.
- gateway/session.py: SessionEntry.model_override field with
sanitize_model_override() (allowlist: model/provider/base_url) applied on
both serialization and deserialization; SessionStore.set_model_override /
get_model_override accessors. reset_session() already creates a fresh entry,
so /new keeps its clear-on-reset semantics — a restart cannot resurrect an
override the user reset away.
- gateway/slash_commands.py: write-through at both /model set sites (text
command + picker) after storing the in-memory override.
- gateway/run.py: _rehydrate_session_model_override() called from
_resolve_session_agent_runtime(); in-memory state always wins, credentials
are re-resolved per provider (credential-less fallback on failure). Session
expiry finalization also drops the persisted override.
- tests/gateway/test_session_model_override_persistence.py: restart
round-trip, /new clearing, api_key-never-serialized (including tampered
sessions.json), rehydration + live-state precedence + credential-failure
degradation.
Salvaged from #3659 by @Git-on-my-level, narrowed to the restart-persistence
gap confirmed in triage.
Named providers / custom_providers entries in config.yaml now accept an
extra_headers dict scoped to that endpoint — for reverse proxies, API
gateways, and custom auth schemes (e.g. Cloudflare Access service tokens).
- hermes_cli/config.py: normalize extra_headers on provider entries
(_normalize_custom_provider_entry + providers-dict translation), add
get_custom_provider_extra_headers /
apply_custom_provider_extra_headers_to_client_kwargs helpers keyed on
base_url (case/trailing-slash insensitive, no substring bypass —
mirrors the TLS helpers)
- hermes_cli/runtime_provider.py: surface extra_headers in the resolved
runtime for named custom providers (providers dict, legacy
custom_providers list, and the credential-pool path)
- run_agent.py / agent/agent_init.py: merge per-provider extra_headers
onto the OpenAI client default_headers at construction and on every
_apply_client_headers_for_base_url re-application (credential swaps,
rebuilds), most-specific level wins; OpenAI-wire only (native
Anthropic/Bedrock scoped out)
- agent/auxiliary_client.py: accept model.extra_headers as an alias of
model.default_headers for the global variant
- cli-config.yaml.example: documented commented example
- Header values are treated as secrets and never logged
Salvaged from PR #3526 by @jneeee, reimplemented against current main.
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
Adds a no-code routing layer to the OpenAI-compatible API server so one
Hermes deployment can map different API clients to different
model/provider backends. Clients pick a backend by sending a configured
alias as the OpenAI 'model' field; unmatched values fall back to the
global model. Configured aliases are listed by GET /v1/models.
Precedence (highest first): session /model override > model_routes
route > global config. Route provider credentials resolve through
_resolve_runtime_agent_kwargs_for_provider (same seam as
channel_overrides); per-route api_key/base_url are upstream provider
credential overrides — never caller auth, never logged.
Salvaged and rebased from PR #3176 by @Mibayy onto current main.
Salvaged from PR #3243 by @Mibayy, reimplemented against current main
(the original diff targeted a removed gateway/run.py handler).
- /compact is now a first-class alias of /compress (CLI, gateway,
Telegram/Slack/Discord command lists, autocomplete) — also fixes the
dangling '/compact' references in gateway error messages
(gateway/run.py context-exhausted banners).
- --preview / --dry-run: report what WOULD be compressed (message
counts, token estimate, 'here [N]' boundary) without touching the
transcript. Flags coexist with the existing 'here [N]' / focus-topic
args on both the CLI and gateway surfaces via shared pure helpers in
hermes_cli/partial_compress.py.
- --aggressive (LLM-free hard truncation) is intentionally NOT
implemented: it would need its own transcript-persistence branch
outside the guarded _compress_context rotation machinery (#44794
data-loss class). The flag is recognized and returns an explanatory
message pointing at '/compress here [N]' and /undo instead of being
mis-parsed as a focus topic.
- locales: gateway.compress.aggressive_unsupported added to all 16
catalogs (parity test enforced).
- release.py: AUTHOR_MAP entry for contributor credit.
Salvage of #3459 by @keslerm, reimplemented against the restructured
progress-callback block in gateway/run.py (resolve_display_setting,
needs_progress_queue, thinking-relay). Duplicate PR #3458 by @dlkakbs was
submitted 4 minutes earlier with the same feature — both credited.
Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>
tool_progress: log keeps the chat silent and appends timestamped tool-call
lines to ~/.hermes/logs/tool_calls.log via a dedicated queue drained by an
async writer (RotatingFileHandler 5MB x 3, RedactingFormatter so secrets
never land on disk). Gateway-only by design; thinking_progress relaying and
the webhook gate are unaffected. /verbose now cycles
off -> new -> all -> verbose -> log.
Salvage of the surviving hunk of #3296 by @Mibayy. The PR's gateway
_handle_provider_command hunk targets code removed on main (/provider was
absorbed into /model + /status, which already read model.base_url); the
hermes status mislabel was the remaining live symptom:
_effective_provider_label() only checked the legacy OPENAI_BASE_URL env var,
so a custom endpoint configured canonically in config.yaml still displayed
as OpenRouter.
WhatsApp has migrated to Linked Identity Device (LID) format for user
IDs (e.g. 244645917392975@lid instead of 18505551234@s.whatsapp.net).
The bridge already resolves LIDs to phone numbers for its own allowlist
check via buildLidMap(), but the senderId field in the message payload
sent to the gateway still contained the raw LID. This caused the
gateway's WHATSAPP_ALLOWED_USERS check to reject all messages as
unauthorized, since the LID numbers don't match the phone numbers in
the allowlist.
Fix: resolve LID → phone in the senderId, senderName, and chatName
fields of the event payload before sending to the gateway, using the
existing lidToPhone mapping.
Replace the plugin-local _IMAGE_MAGIC_MIME table + _sniff_image_mime
body with a delegation to agent.image_routing._sniff_mime_from_bytes,
the canonical magic-byte sniffer already used across the codebase, then
gate its result to the raster formats gpt-image-2's Responses
input_image actually accepts (png/jpeg/gif/webp).
The shared sniffer also recognizes SVG/TIFF/ICO; without the allowlist
those would pass local validation and be rejected server-side with an
opaque HTTP 400. Gating locally fails them cleanly as invalid_image_input.
Adds a regression test for SVG rejection.
Follow-up on top of @CrazyBoyM's #55828.
Follow-up to #56874, which added the Camofox private-page SSRF guard
(_camofox_current_page_private_url) but wired it only into the Camofox
eval path (_camofox_eval). The other Camofox content-read tools —
camofox_snapshot, camofox_get_images, and camofox_vision — still read the
current page's accessibility tree / images / screenshot without the
guard, so on a non-local Camofox backend they can return the content of
an intranet or cloud-metadata page (e.g. 169.254.169.254) that the
terminal itself can't reach.
Apply the same guard, gated on _eval_ssrf_guard_active (non-local
backend, not a local sidecar, allow_private_urls unset) and fail-open on
probe failure, matching the eval-path guard and the main-browser
snapshot/vision guards. camofox_back is intentionally not changed: its
target is unknown until navigation completes, and the subsequent content
read is already guarded.
Adds regression tests covering the three read tools blocking on a private
page, the public-page pass-through, and the guard-inactive no-probe path.
The helper docstring described the typical ~15-25k gateway payload but
read as if that were the trigger range; the floor actually engages above
10k tokens. Clarify the prose to match the gate.
Lower the openai-codex stale-timeout floor from 25k to 10k estimated
tokens so Telegram/gateway sessions (~20k tools+instructions) are not
aborted at the generic 90s cutoff while Codex is still prefilling.
Adds the AUTHOR_MAP entry for CrazyBoyM (ai-lab@foxmail.com) so the
contributor-attribution CI check passes when PR #55828's commits are
rebase-merged with authorship preserved.
Three CLI reliability fixes:
1. Interrupt reliability: chat() only re-queued the user's interrupt
message when the turn result carried interrupted=True. When the agent
thread raced past its last interrupt check (or finished) before the
interrupt landed, the message was silently dropped — and the stale
_interrupt_requested flag left on the agent instantly aborted the
NEXT turn. Un-acknowledged interrupt messages are now re-queued as
the next turn and the stale flag is cleared (only when the agent
thread actually exited). The clarify-race path also parks the message
in _pending_input instead of dropping it.
2. Slow exit (5+ min): stdlib ThreadPoolExecutor workers are non-daemon
and joined unconditionally by concurrent.futures' atexit hook — even
after shutdown(wait=False). One wedged tool worker (abandoned after
interrupt/timeout) held the process open forever. Promoted
async_delegation's daemon executor to a shared tools/daemon_pool
module and adopted it in tool_executor (concurrent tool batches),
memory_manager (background sync), delegate_tool (child timeout wrapper
+ batch fan-out), and skills_hub (source fan-out). Added a 30s exit
watchdog (HERMES_EXIT_WATCHDOG_S) armed at _run_cleanup start as a
backstop for wedged cleanup steps.
3. Exit jank: after prompt_toolkit tears down the input/status bars the
terminal sat silent for the whole cleanup window, looking hung. Print
'Shutting down… (finalizing session)' immediately at exit start.
E2E: live PTY interrupt of a foreground 'sleep 120' terminal tool now
aborts in ~1s and the typed message runs as the next turn; wedged-worker
+ wedged-cleanup subprocess exits in 5.8s (watchdog) instead of hanging.
Salvage of the surviving piece of #2696 by @tarunravi. The PR's other two
changes (tool progress streaming, SSE None-sentinel fix) were independently
superseded on main by the structured hermes.tool.progress SSE events and the
rewritten queue-drain loop.
Remote OpenAI-compatible frontends can't read server-local file paths, so
MEDIA:<path> tags (browser screenshots, generated images) were dead text.
_resolve_media_to_data_urls() now inlines small (<=5MB) local images as
markdown data URLs across all four response surfaces: chat completions
(non-streaming), session chat, session chat stream final event, and the
Responses API. Non-image, missing, or oversized paths pass through
untouched.
Salvage of #2794 by @CharmingGroot, ported to the relocated
plugins/platforms/email/adapter.py:
- Guard raw_email = msg_data[0][1] against IndexError/TypeError and
non-bytes payloads. UIDs are added to _seen_uids before fetch, so an
exception mid-batch permanently skipped every remaining message in
the batch — now the bad message is logged and skipped instead.
- Message-ID domain generation falls back to 'localhost' when
EMAIL_ADDRESS lacks '@' (now via a shared _message_id_domain() helper
covering all 3 send paths; the PR fixed 2 of 3).
- ChannelOverride + channel_overrides on PlatformConfig
- Resolve model/runtime: session /model, then channel_overrides, then global
- Thread/parent channel lookup; bridge discord.channel_overrides from YAML
- Drop unrelated test and delegate_tool changes from PR scope
Salvage of #2863 by @aydnOktay, reimplemented against current main using the
existing utils.env_var_enabled / TRUTHY_STRINGS helper instead of per-site
tuple edits. Covers the 7 gateway/config.py env-flag sites that still rejected
'on' (WHATSAPP_ENABLED, SIGNAL_IGNORE_STORIES, MATRIX_ENCRYPTION,
API_SERVER_ENABLED, WEBHOOK_ENABLED, MSGRAPH_WEBHOOK_ENABLED,
BLUEBUBBLES_SEND_READ_RECEIPTS) plus HERMES_DESKTOP gating in
read_terminal/close_terminal. The PR's approval.py HERMES_YOLO_MODE portion is
already on main via is_truthy_value.
delegation.max_concurrent_children is now the single cap for both a
batch's parallelism and concurrent background delegation units.
- _get_max_async_children() delegates to _get_max_concurrent_children();
a leftover max_async_children key logs a one-time deprecation warning
- config v32→33 migration removes the stale key, folding a raised
max_async_children into max_concurrent_children (max wins, no lost
headroom)
- capacity error messages now point at max_concurrent_children
- pool-at-capacity sync fallback now attaches an explanatory note so
the model/user know why the call blocked instead of dispatching async
Previously users who raised max_concurrent_children (e.g. to 15) still
hit the invisible default-3 async cap: the 4th background delegate_task
silently ran inline, blocking the turn with no signal.
CLAUDE_CODE_OAUTH_TOKEN is set and owned by the user's Claude Code
install (subscription OAuth), not a Hermes-managed inference
credential — Claude subscription auth is not a working Hermes provider
path. Blocklisting it broke agent-spawned claude CLIs: with no token in
the child env, claude fell through to the shared macOS Keychain /
~/.claude/.credentials.json store and, on auth failure, cleared it —
logging the user out of their interactive Claude sessions and the
desktop app.
Exempt it from _HERMES_PROVIDER_ENV_BLOCKLIST (it arrives via the
anthropic registry entry, so discard explicitly with rationale).
ANTHROPIC_API_KEY / ANTHROPIC_TOKEN and every other provider credential
remain stripped, and the GHSA-rhgp-j443-p4rf fail-closed passthrough
guard is unchanged for everything still on the blocklist.
Fixes#55878
Follow-up to the #39227 salvage: config refreshes fire mid-session too
(gateway events, settings saves), so applying terminal.cwd
unconditionally would yank the workspace out from under an attached
session. Gate the override on activeSessionIdRef like the sibling
reasoning/tier settings, keep branch refresh on the live cwd, and add
coverage for the active-session path. Also lint-polish the new test
file (typed config mock, prettier formatting).
Follow-up on the salvaged #56392 guard. The cherry-picked change matched
custom:<name> pool entries against the primary by raw base_url string
equality, which (a) can't disambiguate two named custom providers sharing
one gateway base_url and (b) left a latent bare-"custom" entry bypass.
Route the match through get_custom_provider_pool_key(rt[base_url]) compared
against the entry's custom:<name> key, mirroring the sibling guard in
recover_with_credential_pool. Use CUSTOM_POOL_PREFIX instead of the literal.
Add regression tests for the custom same-endpoint (swap) and cross-endpoint
(skip) branches, plus the plain-provider fallback-pool case from #56885.
Two related hardening fixes for auxiliary calls (which include MoA reference
advisors — a pinned-model path where provider fallback is not a meaningful
recovery):
1. Transient-transport retries: the same-provider retry on a connection reset /
timeout / 5xx / 408 was a single attempt, then fallback. For a pinned aux
call a second blip silently loses the call (root of the run2 double-advisor
'Connection error' collapse — a genuine upstream blip). Now retries N times
with exponential backoff, N = auxiliary.transient_retries (default 2 -> 3
total attempts, clamped [0,6]). Compression-on-timeout fast-fail carve-out
preserved.
2. Per-model client-cache isolation: _client_cache_key excluded the model, so
two concurrent auxiliary calls to the same provider/base_url/key but
different models (e.g. an opus + gpt-5.5 MoA fan-out) shared one cache entry
and could race each other's client lifecycle. Model now participates in the
key -> distinct clients, no cross-call races. Same-model reuse unchanged.
- agent/auxiliary_client.py: _transient_retry_count() + backoff loop; model in
_client_cache_key and both call sites.
- hermes_cli/config.py: auxiliary.transient_retries default (2).
- tests: new retry/isolation tests; updated 2 stale-expectation tests to the
corrected behavior (per-model resolve; N-retry escalation).
Backoff base is overridable (_TRANSIENT_RETRY_BACKOFF_BASE) so tests don't sleep.
Attribution audit gate: the salvaged contributor commits carry
kiljadn@gmail.com (Nick Mason / @designnotdrum). Add the mapping so
contributor_audit.py resolves the author on this PR.
Follow-up on the #56480 salvage: the include_registry=False docstring said
None is returned only for registry/MCP-only toolsets; it also applies to
registry-derived aliases, which have no static TOOLSETS counterpart.
_get_platform_tools reverse-maps a platform composite to configurable
toolsets with an all-tools subset test. Because get_toolset() merges
registry-registered tools into a toolset, a tool added to a toolset
(delegate_cli -> delegation; desktop-only read_terminal -> terminal) that the
static composite never listed made the subset test fail, silently dropping the
entire toolset on api_server and other inference-based platforms. Compare the
toolset's static membership at all three reverse-map sites.
Fixes#49622.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds include_registry=True kwarg to resolve_toolset/get_toolset. When False,
returns only the static TOOLSETS view with no registry-merged tools — the
composite-authored membership platform reverse-mapping must compare against.
Default True preserves all existing behavior; this is the enabling half of
the api_server toolset-drop fix (#49622).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extends the browser private-network eval guard to the Camofox backend.
On main, _browser_eval() returned early in Camofox mode before running the
shared private-URL literal pre-scan and before re-checking the page URL
after eval, leaving Camofox as a sibling backend that could execute
browser_console(expression=...) against private/internal targets.
- move the eval private-URL literal pre-scan before the Camofox early return
- add a Camofox current-page private-URL probe via the evaluate endpoint
- withhold Camofox eval results when the page is now private/internal
Follow-up to browser private-network hardening in #56173, #56526, #56664.
Salvage of #56764 by @rayjun (rayoo), cherry-picked to preserve authorship.