hermes-agent

Author	SHA1	Message	Date
teknium1	49a87bcd1e	chore(release): map SahilRakhaiya05 contributor email for #44073 salvage	2026-07-01 03:56:28 -07:00
SahilRakhaiya05	bb304b4914	fix(gateway): fail-closed external-surface defaults + profile-aware multiplex authz Aligns runtime behaviour with SECURITY.md 2.6: externally reachable messaging adapters must fail closed unless access is explicitly configured. Closes the confirmed multiplex authorization bypass a secondary profile's open dm/group policy no longer inherits the default profile's allowlist trust. - Own-policy adapters (WhatsApp, WeCom, Weixin, QQBot, Yuanbao) default dm_policy/group_policy to pairing/allowlist instead of open; open now requires an explicit GATEWAY_ALLOW_ALL_USERS or per-platform allow-all. - Startup guard (_own_policy_open_startup_violation) refuses to boot when an enabled adapter is open without the allow-all opt-in; the guard now runs for every secondary profile in multiplex mode too. - Profile-aware own-policy authorization: _authorization_adapter / _adapter_for_source resolve the live adapter via SessionSource.profile, so _is_user_authorized and the ingress/pairing/busy/queue paths read the originating profile's adapter policy, not the default profile's. - Fail-closed intake for Email, Feishu P2P, and Discord (blank-principal denial, empty-allowlist deny, missing-interaction.user deny). Salvaged from #44073 (external-surface hardening), split into a focused gateway-authz PR per maintainer request. Follow-up fix by Hermes Agent: the Discord slash-auth channel bypass now matches DISCORD_ALLOWED_CHANNELS by the same name-inclusive keys (id + name + #name + parent) the on_message scope gate uses, so a name-form channel allowlist authorizes slash interactions consistently (was id-only, breaking #name matching). Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-07-01 03:56:28 -07:00
srojk34	8e94e8f882	fix(discord): tag unverified channel-context senders like Slack threads Discord's _fetch_channel_context backfills recent channel/thread activity (from any member who can post there, not just the allowlisted user) into the agent's context with no sender-trust distinction. Slack's equivalent _fetch_thread_context was fixed to prefix non-allowlisted senders with [unverified] and add LLM guidance not to act on their content, mitigating indirect prompt injection from third parties in shared channels/threads. Port the same mechanism to Discord using the already-wired _is_sender_authorized/set_authorization_check plumbing.	2026-07-01 16:25:16 +05:30
kshitijk4poor	23518a5e02	test(review): add integration guards for the two isolation wirings (review) Phase 2c mutation-check found the salvaged tests covered only the pure helpers (_is_background_review_harness_message / _strip_background_review_harness) — the two integration WIRINGS had zero coverage: removing the _persist_disabled guard in _flush_messages_to_session_db, or the _strip call in get_messages_as_conversation, left all 13 tests green. Add: - TestPersistDisabledHardStop: a _persist_disabled agent's flush writes nothing to a live SessionDB (guards the run_agent hard-stop). - TestGetMessagesAsConversationStripsHarness: a session with stray harness rows resumes clean end-to-end through get_messages_as_conversation (guards the hermes_state load-time wiring). Mutation-checked: each new test fails when its wiring is reverted.	2026-07-01 16:21:39 +05:30
arminanton	e2fa509bf3	fix(review): isolate the background-review fork from the canonical session The forked skill/memory review agent shares the parent's session_id for prompt-cache warmth. Without isolation it wrote its harness turn ('Review the conversation above and update the skill library…') plus its curator-mode reply straight into the user's REAL session in state.db; the next live turn re-read that injected user message as a standing instruction and the agent 'became' the curator, refusing the actual task. Root fix: a _persist_disabled flag on the fork that hard-stops every DB write and lazy-open path (_flush_messages_to_session_db, _ensure_db_session, _get_session_db_for_recall) — the review writes only to the skill/memory stores via its tools. Defense-in-depth: _strip_background_review_harness drops any stray harness message (and the assistant reply that followed) at load time in get_messages_as_conversation, so an already-polluted session resumes clean. Salvaged from #50296. Co-authored-by: arminanton <29869547+arminanton@users.noreply.github.com>	2026-07-01 16:21:39 +05:30
Swissly	242c9639a8	fix(cron): prevent multi-target delivery loop crash on per-target failure The standalone thread-pool fallback in _deliver_result() runs inside the `except RuntimeError:` block (taken when asyncio.run() sees a running loop). When future.result() raised there (SMTP ConnectionError, timeout, etc.), the exception was NOT caught by the sibling `except Exception:` — it escaped _deliver_result() and crashed the whole delivery loop, silently skipping every remaining target. Multi-target delivery (e.g. deliver: 'email:a,email:b') is a documented feature, so this broke a promised contract. Wrap the fallback in its own try/except so a per-target failure is logged with exc_info and the loop continues to the next target. Fixes #47163	2026-07-01 03:48:37 -07:00
kshitijk4poor	d3010b74db	test(agent): strengthen id-reuse regression + refresh flush docstring (review) Phase 2c review follow-up on the id()-reuse persistence fix: - test_recycled_id_in_dedup_set_still_persists_new_message seeded an EMPTY dedup set, so it never injected a collision and passed under id-based dedup too (couldn't distinguish the designs). Replace with test_stale_seed_id_from_prior_flush_cannot_suppress_new_message, which asserts the durable invariant: the seed is empty after every flush (mutation-checked: removing the post-flush reset now fails BOTH id-reuse tests). - Refresh the _flush_messages_to_session_db docstring: it still described the old per-session identity tracking; document the intrinsic-marker mechanism, that _flushed_db_message_ids is now a one-shot seed, and the shared-dict mutation safety note.	2026-07-01 16:17:46 +05:30
rrevenanttt	e4c6d1b22b	fix(agent): persist messages by intrinsic marker to stop id() reuse data loss _flush_messages_to_session_db deduped persisted messages with a retained {id(msg)} set (_flushed_db_message_ids) kept across turns. Once a flushed dict is dropped from the live list (scaffolding rewind / in-place compaction) and GC'd, CPython recycles its address onto a new assistant/tool dict whose id() collides with the stale entry — so the real turn is silently never written to state.db. Replace the retained id-set with an intrinsic _DB_PERSISTED_MARKER stamped on each dict. The id-set is demoted to a one-shot seed (valid only while the caller's objects are alive) that is translated to markers and cleared after every flush, so no id() outlives a flush to alias a future message. The marker is _-prefixed so the wire sanitizers strip it before any request leaves. Preserves the existing _is_ephemeral_scaffolding skip. Salvaged from #50372. Co-authored-by: rrevenanttt <290873280+rrevenanttt@users.noreply.github.com>	2026-07-01 16:17:46 +05:30
kshitij	1d6645b17f	Merge pull request #56296 from kshitijk4poor/fix/gateway-force-exit-pidlock-release fix(gateway): release PID file + runtime lock in the force-exit backstop	2026-07-01 16:14:26 +05:30
kshitijk4poor	b7adad1a72	test(error-classifier): parametrize 5xx overflow test over 500/502/503/529 Review nit (helix4u): the fix covers 500/502/503/529 but the positive tests only asserted 500 and 503. Parametrize over all four so 502/529 are covered too; keep the plain-5xx negatives.	2026-07-01 16:14:16 +05:30
pefontana	a04b7024ff	fix(error-classifier): route 5xx context-overflow into compression Local inference servers (llama.cpp/llama-server, vLLM/Ollama behind a Cloudflare/Tailscale hop) report context overflow with HTTP 500/502/503/529 instead of 400/413. _classify_by_status returned server_error/overloaded and retried blindly, then dropped the turn with no compaction. Route explicit _CONTEXT_OVERFLOW_PATTERNS matches on those 5xx codes to context_overflow (should_compress=True); plain 500 stays server_error, plain 503 overloaded.	2026-07-01 16:14:16 +05:30
Teknium	74809b4e94	fix(cli): reap dead-locked worktrees so .worktrees/ can't grow unbounded (#56288 ) hermes -w locks each worktree (reason 'hermes pid=<pid>'). git worktree remove --force (single -f) refuses a locked tree, so a crashed session's lock was never released and its worktree accumulated forever — a real contributor to .worktrees/ bloat. _prune_stale_worktrees now classifies each lock via _worktree_lock_is_live: a live-owner pid is skipped at any age; a dead-owner (or foreign) lock is unlocked first so the aggressive age-based cleanup can actually reap it. The >72h reap tier is kept (that cleanup is intentional) but now guarded so dirty/unpushed work is preserved, and branch deletion is gated on git worktree remove succeeding. New fail-safe helpers _worktree_is_dirty and _worktree_lock_is_live (pid liveness via gateway.status._pid_exists, Windows-safe).	2026-07-01 03:43:20 -07:00
teknium1	5c2dccd06f	chore(release): map kangsoo-bit author for PR #47508 salvage	2026-07-01 03:42:32 -07:00
kangsoo-bit	7a2369718a	fix(telegram): keep polling alive during transient bootstrap outages A transient Bot API network error during gateway bootstrap (deleteWebhook or the initial start_polling) currently raises out of connect() and marks the Telegram adapter fatal, restart-looping the whole gateway even though the right behavior is to degrade the Telegram channel and let the existing reconnect ladder recover in the background. - _delete_webhook_best_effort(): swallow only transient network errors and continue to polling; non-network errors (e.g. auth failures) still raise. - _start_polling_resilient(): on a transient conflict/network error at bootstrap, schedule background recovery and return degraded instead of raising; non-transient errors still propagate. - Track the polling error-callback recovery tasks in _background_tasks so they can't be garbage-collected mid-flight. - Add a second Telegram Bot API seed fallback IP (149.154.166.110). Reconnect keeps its existing 10-retry -> supervisor-restart semantics; this change only fixes the bootstrap raise, it does not alter the retry ladder.	2026-07-01 03:42:32 -07:00
teknium1	9dd6451c80	chore(release): add WXBR to AUTHOR_MAP for #46183 salvage	2026-07-01 03:34:49 -07:00
WXBR	59e7e9d007	fix(agent): persist recovered final responses Close a recovery/fallback final_response with an assistant transcript entry before session persistence so durable history cannot end at a tool/user message after the caller receives a final answer. Adds a regression for a tool-tail transcript with a non-empty final_response. Related to #46071 / #46053, but covers the adjacent case where the assistant message was never appended before persistence.	2026-07-01 03:34:49 -07:00
kshitijk4poor	df27267ed7	fix(gateway): release PID file + runtime lock in the force-exit backstop Follow-up to #54111. That PR routed the early SystemExit exit paths (clean-fatal-config #51228, startup-aborted-before-running) through _exit_after_graceful_shutdown / os._exit. Those paths raise right after runner.start() without going through _stop_impl, so they relied on atexit to release the PID file + runtime lock — and os._exit bypasses atexit, leaking both. Release them explicitly in the backstop (the single guaranteed cleanup chokepoint). Both calls are idempotent: no-op on the normal _stop_impl path, actual cleanup on the early-exit paths. Corrects the now-inaccurate docstring claim that teardown always ran first. Adds a guard test plus the missing str-code->1 coverage. E2E: real PID file written + lock acquired, _exit_after_graceful_shutdown(78) exits code 78 AND removes the PID file (leak confirmed closed).	2026-07-01 15:59:37 +05:30
YLChen-007	e23f723389	fix: make streaming reasoning-tag filter case-insensitive The streaming think-tag suppressors in cli.py (_stream_delta) and gateway/stream_consumer.py (_filter_and_accumulate) matched tag names with case-sensitive str.find(), so only the exact-case literals in the tag tuples were caught. Mixed-case variants a model may emit — <Think>, <ThInK>, <REASONING>, <Thought> — slipped through and leaked raw reasoning into the user-visible stream. Match against a lowercased view of the buffer with lowercased tag names at all three sites (open-tag boundary search, partial-tag hold-back, close-tag search) in both paths. Only KNOWN tag names are matched — no substring matching — and the block-boundary gating that protects prose mentions of <think> is preserved. - 6 parametrized case-insensitive regression tests in each of tests/gateway/test_stream_consumer.py and tests/cli/test_stream_delta_think_tag.py. Salvaged from PR #27289 by @YLChen-007.	2026-07-01 03:25:02 -07:00
pprism13	f049227f31	fix(state): order conversation replay by id, not timestamp get_messages_as_conversation ordered rows by (timestamp, id). append_message stamps each row with time.time(), which is not monotonic — on WSL2, after an NTP step, or when a VM/laptop resumes from sleep the clock can jump backwards mid-conversation. A later row then carries an earlier timestamp than its predecessor, so ORDER BY timestamp sorts an assistant tool_calls row after its tool response, orphaning the tool call and triggering an HTTP 400 on the next completion. Order by the AUTOINCREMENT id (true insertion order) instead. This is the sibling path to `c03acca50`, which already fixed get_messages but missed get_messages_as_conversation. Salvaged from #50356. Co-authored-by: pprism13 <290877921+pprism13@users.noreply.github.com>	2026-07-01 15:52:37 +05:30
kshitijk4poor	cde3ca4ebf	fix(gateway): widen force-exit to SystemExit paths + os._exit regression tests (#53107 ) Builds on the salvaged force-exit fix: - Route the start_gateway() SystemExit paths (clean-fatal-config #51228, planned-restart, service-restart) through the same os._exit backstop. Those paths previously fell through to normal interpreter finalization, leaving them vulnerable to the SAME wedged-non-daemon-thread hang the boolean-return paths now avoid. main() catches SystemExit and converts its code (None->0, int->code, str->1) to os._exit. Every exit path is now wedge-proof. - Document in the helper why bypassing atexit is safe (remove_pid_file + release_gateway_runtime_lock are performed explicitly in start_gateway teardown) and why logging is not flushed (synchronous RotatingFileHandlers). - Tests: assert termination via os._exit not SystemExit (adapted from @AgenticSpark's PR #53122, a duplicate of #53121), plus SystemExit(78) is routed through os._exit(78) and SystemExit(None) maps to os._exit(0).	2026-07-01 15:51:57 +05:30
teknium1	1c350728ec	chore(release): map Lazymonter into AUTHOR_MAP for PR #42914 salvage	2026-07-01 03:21:20 -07:00
HiaHia	8feeb0ccb8	fix(gateway): retry launchd bootstrap after bootout on EIO for install/start On macOS, `launchctl bootstrap` of a label still registered in the domain fails with 5: Input/output error (EIO). That is the already loaded case — a stale registration from an interrupted restart or a bootout that didn't settle — recoverable by booting the leftover out and bootstrapping again, and distinct from the domain being genuinely unmanageable. launchd_install and launchd_start (both bootstrap paths) treated exit 5 as 'launchd cannot manage this macOS version' and silently degraded to a detached process, losing auto-start at login and crash-restart. Centralize bootstrap in _launchctl_bootstrap(), which on EIO boots the stale label out and retries once; only if the retry also fails does the error propagate so callers apply their existing _launchctl_domain_unsupported fallback for a genuinely broken domain. launchd_restart already boots out before bootstrapping (its drained job is almost always still registered, so a plain bootstrap would hit EIO on the common path), so it keeps its explicit pre-bootout rather than routing through the bootstrap-first helper. Corrected the stale exit-5 comment that claimed it always meant an unmanageable domain. Adds TestLaunchctlBootstrapEioRetry covering clean bootstrap (no bootout), EIO -> bootout -> retry success, persistent EIO re-raise, and non-EIO re-raise without a spurious bootout.	2026-07-01 03:21:20 -07:00
teknium1	69f08c2eb5	fix(telegram): guard _post_connect_task access for object.__new__ test pattern disconnect() reads self._post_connect_task, but several tests build a bare TelegramAdapter via object.__new__() without calling __init__ (which sets the attr). Use getattr(..., None) so disconnect() works on those instances too (pitfall #17).	2026-07-01 03:18:57 -07:00
LeonSGP43	3362bdb4e5	fix(telegram): defer post-connect housekeeping off the connect path Command-menu registration (set_my_commands), the status-indicator, and DM-topic setup make Bot API calls that can stall for certain bot tokens. They ran inside connect() before/after _mark_connected() but still within the coroutine the gateway wraps in a connect timeout, so one slow call blew the whole connect and the adapter never came up — even though polling/webhook was already live (getMe works via curl). Fixes #46298. - mark connected as soon as polling/webhook startup succeeds - move command-menu, status-indicator, and DM-topic setup into a cancellable background housekeeping task (_run_post_connect_housekeeping) - cancel that task during disconnect so it can't fire into a torn-down client - harden scope-name lookup with getattr fallback Salvaged onto the relocated plugin adapter (plugins/platforms/telegram/ adapter.py) since the original PR #46404 targeted the pre-migration gateway/platforms/telegram.py path. Co-authored-by: Hermes Agent <teknium@nousresearch.com>	2026-07-01 03:18:57 -07:00
Tranquil-Flow	122e5bc037	fix(agent): retry 413 after stripping vision payloads (#47339 ) When text compression can't reduce a 413 request further, evict base64 image parts from tool messages and retry once instead of dead-ending with 'Payload too large and cannot compress further.' A 413 is a request-body byte-size limit, not a token limit. browser_vision screenshots (2-5MB base64 each) keep the HTTP body oversized even after aggressive summarization. The strip pass passes remember_model=False so a 413 does not poison _no_list_tool_content_models — that set is for providers that reject list-type tool content, a distinct failure mode. Cherry-picked from #47397 by Tranquil-Flow; placed onto main's current token-aware 413 recovery else branch.	2026-07-01 03:18:41 -07:00
Teknium	2b8adb8683	chore(release): map tgmerritt author for PR #43553 salvage	2026-07-01 03:17:48 -07:00
Tyler Merritt	320c587256	fix(context): parse vLLM's token-based output-cap error format vLLM (and other OpenAI-compatible servers) report context overflow with both the window and the prompt in tokens: "This model's maximum context length is 131072 tokens. However, you requested 65536 output tokens and your prompt contains at least 65537 input tokens, for a total of at least 131073 tokens." parse_available_output_tokens_from_error() already classified this as an output-cap error (the "requested N output tokens" gate), but none of the extraction patterns matched the "prompt contains [at least] N input tokens" phrasing, so it returned None. The recovery path then misclassified the failure as prompt-too-long and looped through compression — which frees little while each retry keeps requesting the same oversized max_tokens — terminating in "cannot compress further" even though simply lowering the output cap would have succeeded. Add an extraction branch for the token-based phrasing: available output = window - reported input. When the input alone is at or over the window it still returns None, so the caller correctly falls through to compression. Relates to #43547. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-07-01 03:17:48 -07:00
annguyenNous	a1f62f4777	fix(gateway): freshness-gate resume_pending against per-message zombies A crash-interrupted session marked resume_pending is returned by get_or_create_session so its transcript reloads intact. The idle/daily reset policy (#54442) keys on updated_at, which is bumped to now on every message — so a zombie session that keeps receiving messages never trips it and resumes stale context forever (context bleed reported on Telegram and Feishu). Gate the resume_pending branch on last_resume_marked_at (set once at resume-mark, never bumped per-message) against the auto-continue freshness window. If resume has been pending past the window, fall through to auto-reset with reason "resume_pending_expired". A window <= 0 disables the gate (opt-out for the pre-fix always-fresh behaviour). Also hoist auto_continue_freshness_window() into gateway/session.py as the single source of truth; gateway/run._auto_continue_freshness_window() now delegates to it (keeps the existing import/patch surface). Fixes #46934 Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-07-01 03:17:20 -07:00
teknium1	ac3f4aed96	docs(cron): correct stale 'no new seed code' comments for in_channel The in_channel surface DOES add a seed: _seed_cron_channel_session CREATES the flat (platform, chat_id, None) session and mirrors the brief into it, because mirror_to_session only APPENDS to an existing session and the flat channel row is otherwise absent for a chat_postMessage delivery. Correct the scheduler thread-skip comment and the test class docstring, which still described the earlier 'let the existing mirror seed it' design.	2026-07-01 03:16:13 -07:00
Ben	751a300fca	docs(cron): scope in_channel to channels; document DM continuation knob Live DM testing showed a reply to a DM cron brief did NOT continue the job. Root cause: for a 1:1 DM the governing knob is dm_top_level_threads_as_sessions (default True), NOT reply_in_thread / cron_continuable_surface. Under the default, each top-level DM keys to a per-message session (…:dm:<chat>:<ts>), so a reply mints a new ts and can never converge with the flat …:dm:<chat> session the cron seed creates. A 1:1 DM has no thread-vs-timeline split, so "in_channel" has no coherent meaning for a DM — cron_continuable_surface is a channel concept and is a no-op for DMs. DM continuation is governed entirely by dm_top_level_threads_as_sessions: - false → all top-level DMs share …:dm:<chat> → seed + reply converge → works - true (default) → per-message sessions → no continuation (cron or interactive) Option A (chosen): document the requirement; no code change (the flat-DM seed from the prior commit already lands correctly when the knob is false). Adds a ":::note 1:1 DMs" admonition to cron.md + the zh-Hans mirror. Verification (real inbound handler, not a hard-coded assumption — the mistake that made the earlier DM E2E falsely pass): tests/manual/cron_inchannel_dm_e2e.py drives the REAL _handle_slack_message for a top-level DM under both knob values and asserts false→converges (…:dm:D_TESTDM == seed), true→diverges (…:dm:D_TESTDM:<ts>). See decisions.md D9.	2026-07-01 03:16:13 -07:00
Ben	2c84fb42b0	fix(cron/slack): CREATE the flat session for in_channel (mirror only appends) Live testing exposed a real bug: an in_channel continuable cron delivered flat to the channel (✅) but the reply did NOT continue the job — the bot had no brief in context and confabulated the answer. Root cause: mirror_to_session only APPENDS to a session that already exists (_find_session_id → no-op when none matches); it never CREATEs one. A flat (slack, chat_id, None) row is only created when a human posts a top-level message the bot processes — a cron chat_postMessage delivery never goes through the inbound handler, so the row is absent and the brief is silently dropped. The prior impl relied on the bare mirror (F5/OQ-1 concluded "deletion only" — wrong). Fix: _seed_cron_channel_session mirrors _seed_cron_thread_session — get_or_create_session FIRST (chat_type = "dm" if is_dm else "group", thread_id=None), keyed to the ORIGIN USER'S id, then mirror. The channel session key embeds user_id (…:group:<chat>:<user>), so a system:cron id would key the seed away from the reply; the origin user's id makes seed key == inbound reply key. DM key ignores user_id but needs chat_type=dm to match the prefix. Wired into the in_channel branch after delivery; suppresses the generic mirror to avoid double-write. DM validated (per request): the seeded key equals the inbound DM reply key for a 1:1 DM; continuation works there too. Tests: - Rewrote the in_channel tests to use a real _session_store and the origin user_id; assert get_or_create_session is called with the flat, correctly- keyed source. Prove-fail: (a) reverting the create step and (b) seeding with system:cron each turn a targeted test RED; restore → GREEN. - +2 direct _seed_cron_channel_session unit tests asserting the KEY-MATCH invariant (seed key == inbound reply key) via build_session_key, for both channel and DM. - Rewrote tests/manual/cron_inchannel_e2e.py to drive a REAL SessionStore + real mirror_to_session + real _find_session_id + real build_session_key (no session-layer mocks — the old mocked E2E is exactly why the bug shipped). Asserts the brief lands in the transcript and the reply resolves to the same session, for BOTH channel and 1:1 DM. Full relevant sweep: 283 passed.	2026-07-01 03:16:13 -07:00
Ben	4b4349eb9a	feat(cron/slack): flat in-channel continuable cron delivery surface Add a per-platform `cron_continuable_surface` extra key (`thread` default \| `in_channel`) so a continuable cron job can deliver FLAT into a Slack channel — no dedicated thread — and still be replied-to. In `in_channel` mode the scheduler skips the thread-open branch (leaves `thread_id=None`); the shipped origin-mirror then seeds the `(slack, chat_id, None)` shared-channel session — the same bucket `reply_in_thread: false` routes inbound channel replies to — so a plain channel reply continues the job in context. Design: specs/cron-inchannel-continuable (D1–D7, F5). Model B (shared-channel session), NOT anchoring to the delivery `ts` — on Slack replying to a specific message IS threading, so a `ts` anchor would only relocate the thread, never deliver true threadless continuable. - gateway/platforms/base.py: `supports_inchannel_continuable` capability flag (default False → unsupported platforms fail SAFE to `thread`). - plugins/platforms/slack/adapter.py: flag=True; `_cron_continuable_surface()` resolver (coerces to the two-value enum); `_warn_if_inchannel_without_flat_reply` connect-time warning (D5: warn, not hard-require — the misconfig fails safe). - gateway/config.py: shared-key bridge line (top-level OR nested config). - cron/scheduler.py: read the key generically from platform config, gate the `in_channel` branch on the adapter capability flag, skip thread-open. No new seed function (reuses the existing mirror — G6). Pairing (docs): `in_channel` + `reply_in_thread: false` + `require_mention: false` (or a free-response channel). Missing `reply_in_thread: false` fails safe to a threaded continuation. Gateway-side config flag — `/restart` to apply; NO Slack app reinstall. Tests (from inside the worktree, PYTHONPATH=$PWD): - +6 cron scheduler tests (in_channel skips thread-open; seeds flat channel session with thread_id=None; thread-mode regression; fail-safe on unsupported platform; value coercion). Prove-fail: removing the `and not in_channel_surface` guard turns the two load-bearing tests RED; restore → GREEN. - +10 slack resolver/capability/warning tests; +2 config-bridge tests. - tests/manual/cron_inchannel_e2e.py: offline E2E driving BOTH real legs (delivery seed + inbound reply keying) → both converge on (slack, C, None). - No regressions: test_slack.py 216 passed alone; broader sweep green (4 pre-existing cross-file-ordering failures reproduce identically on pristine origin/main). Docs: cron.md + slack.md + zh-Hans mirrors of both.	2026-07-01 03:16:13 -07:00
kshitijk4poor	daf4f1a7a9	fix(tools): close the same session leak on the hermes_subprocess_env spawn surface (review) Review of the #50531 salvage found the cross-session HERMES_SESSION_* leak also survives on the non-terminal spawn helper hermes_subprocess_env (added by #56202 after #50531 was written), which does os.environ.copy() without the guard. Of its six callers, five re-bind the session identity explicitly (slash_worker/ACP via --session-key argv) and are safe by accident; but tui_gateway cli.exec (server.py) spawns a fresh CLI with NO --session-key under the engaged TUI host, so it inherits a possibly-foreign HERMES_SESSION_* from the last-writer-wins global and would stamp Kanban rows / telemetry with another session's id. Route hermes_subprocess_env through the same _inject_session_context_env chokepoint, restoring the single-uniform-policy-across-every-spawn-surface invariant the codebase already claims for the internal-secret filter. Safe for all six callers: bound ContextVars win (re-binders unaffected), _UNSET strips (closes cli.exec). Adds 3 guard tests; mutation-checked.	2026-07-01 15:42:19 +05:30
PolyphonyRequiem	cc395e8050	fix(gateway): close cross-session HERMES_SESSION_* leak into subprocess env Session vars (HERMES_SESSION_*) have a process-global os.environ mirror written last-writer-wins as a CLI/cron fallback and never cleared. Under a concurrent multi-session host (messaging gateway, ACP adapter, API server, TUI) that global belongs to whichever turn wrote it last. A subprocess spawned from a task whose session ContextVar is _UNSET (a sibling task that never bound, or one that inherited another session's context) inherited the FOREIGN global and acted on another session's identity. Add a session_context_engaged() latch (set once any host calls set_session_vars) and route both terminal spawn paths through a single _inject_session_context_env chokepoint: once engaged, a bound ContextVar (incl. "") is authoritative and an _UNSET var is STRIPPED rather than inheriting the possibly-foreign global. Pure single-process CLI/one-shot (never engaged) keeps the inherited fallback. Salvaged from #50531 (supersedes #49922). local.py hunk re-applied by intent onto the current hermes_subprocess_env refactor. Co-authored-by: PolyphonyRequiem <3107779+PolyphonyRequiem@users.noreply.github.com>	2026-07-01 15:42:19 +05:30
kshitijk4poor	e3819a4143	test(anthropic): add adjacency behavior test for #52145 + fix vacuous refresh-UA test (review) Review follow-up on the anthropic_adapter batch salvage: 1. #52145 shipped no behavior test for the adjacency rewrite. Add test_strips_tool_use_when_result_not_immediately_adjacent (a tool_use whose result appears later but NOT in the immediately-following user message must be stripped — the exact case the old global id-match got wrong) plus an adjacent-pair control. Mutation-checked: reverting to a global match fails the non-adjacent test. 2. test_token_refresh_ua_prefix was vacuous — it bound to _refresh_oauth_token (a wrapper with no urllib.request.Request), so its assert never ran and it did NOT guard the real refresh UA site. Retarget it at refresh_anthropic_oauth_pure (:1048) with the header-scoped check. Mutation- checked: reverting :1048 to claude-cli/ now fails it.	2026-07-01 15:42:15 +05:30
kshitijk4poor	5efbd7cb05	test(anthropic): scope OAuth-UA source check to header lines, not any mention The salvaged test_token_exchange_ua_prefix did a naive whole-function substring check for 'claude-cli/', which false-positives on an explanatory comment that references the old (blocked) UA. Scope it to actual User-Agent header lines — mirroring the sibling test_no_claude_cli_in_source — so a comment documenting why claude-cli/ is avoided doesn't trip it. Mutation-checked: an actual claude-cli/ UA header still fails the test.	2026-07-01 15:42:15 +05:30
DhivinX	49e129e495	fix(anthropic): use claude-code/ UA prefix for OAuth to avoid 404 (#48534 ) Anthropic's OAuth endpoints 404 for the claude-cli/ User-Agent prefix. Switch all three OAuth UA sites (build_anthropic_client, refresh_anthropic_oauth_pure, run_hermes_oauth_login_pure) to the claude-code/ prefix Anthropic expects. Salvaged from #51948. Co-authored-by: DhivinX <20087092+DhivinX@users.noreply.github.com>	2026-07-01 15:42:15 +05:30
fsaad1984	5881791adc	fix(adapter): enforce tool_use/tool_result adjacency in _strip_orphaned_tool_blocks _strip_orphaned_tool_blocks collected tool_result ids across ALL user messages and kept any assistant tool_use whose id appeared anywhere, rather than requiring the result to be in the immediately-following user message. A stale match elsewhere in the transcript could keep a genuinely-orphaned tool_use, which Anthropic rejects. Rewrite to adjacency-checked two-pass logic so a tool_use is kept only when its result immediately follows. Salvaged from #52145. Co-authored-by: fsaad1984 <38867992+fsaad1984@users.noreply.github.com>	2026-07-01 15:42:15 +05:30
kshitijk4poor	ede5c09f3b	docs(disk-cleanup): clarify cron output-root protection is exact-match Review follow-up: the _is_protected_cron_path docstring listed output/ next to jobs.json/.tick.lock as 'the directory itself', which is slightly ambiguous. Spell out that the match is EXACT-path only and must not be 'simplified' into a blanket cron/output/* guard (children stay cleanable) — prevents a future editor from re-introducing the wholesale-delete bug this fix closes.	2026-07-01 15:42:04 +05:30
martinramos002-bot	d173e8c3a7	fix: protect cron output root from cleanup Only classify files below cron/output/ as disposable cron output. The cron/output directory itself is a durable container for retained job history and should not be tracked or deleted wholesale. Add regression coverage for both category detection and cleanup of a stale tracked entry pointing at the output root.	2026-07-01 15:42:04 +05:30
kshitijk4poor	7f71a48a3a	fix(cron): release TERMINAL_CWD lock even when run_job body raises Rework follow-up on the per-job TERMINAL_CWD readers-writer lock. The lock was acquired BEFORE the try: whose finally: is the only release site, with the env-override statements (os.environ[TERMINAL_CWD] = workdir; logger.info) sitting in the unprotected window between acquire and try. Any exception there — a raising log handler, an os.environ error, a thread interrupt — propagated out of run_job WITHOUT running the finally, leaking the lock. A leaked writer permanently deadlocks the whole scheduler (every future cron job blocks on acquire_*); a leaked reader blocks all writers. - Snapshot _prior_terminal_cwd before the acquire (so the finally can always restore env even if the body raises before the override). - Open the try: immediately after acquire and move the env-override lines inside it, so the existing finally always releases the lock. - Add a mutation-verified regression test: a workdir job whose in-window logger.info raises must still release the writer lock (a subsequent acquire_write must not block).	2026-07-01 15:39:48 +05:30
entropy-0x	abc349bd79	fix(cron): isolate per-job TERMINAL_CWD from concurrent cron jobs A cron job with a per-job `workdir` overrides the process-global `os.environ["TERMINAL_CWD"]` for the entire duration of its agent run and restores it afterwards. The scheduler dispatches workdir jobs on a single-thread sequential pool and workdir-less jobs on a separate parallel pool, and the in-code comments claimed this made the override safe. That only prevents two workdir jobs from overlapping each other. The two pools run concurrently in the same process and share `os.environ`, so while a workdir job has `TERMINAL_CWD` pointed at its project directory, any workdir-less job firing in the same window reads that same global through the terminal, file, and code-exec tools and runs its commands in the wrong directory. The corruption window spans the whole workdir-job run, and a file write or delete can land in another job's tree. This serializes the override with a writer-preferring readers-writer lock. Workdir jobs acquire it as writers (exclusive for their whole run); workdir- less jobs acquire it as readers, so they still run in parallel with each other but never alongside a workdir job's override. The guarantee is based on run overlap rather than tick boundaries, so it also holds when a workdir job spans ticks. ## What does this PR do? Fixes a directory-isolation bug in the cron scheduler: a workdir cron job's process-global `TERMINAL_CWD` override could be observed by a concurrently running workdir-less cron job, causing that job's shell/file/code-exec commands to execute in the wrong directory. ## Related Issue N/A ## Type of Change - [x] 🐛 Bug fix (non-breaking change that fixes an issue) - [ ] ✨ New feature (non-breaking change that adds functionality) - [ ] 🔒 Security fix - [ ] 📝 Documentation update - [ ] ✅ Tests (adding or improving test coverage) - [ ] ♻️ Refactor (no behavior change) - [ ] 🎯 New skill (bundled or hub) ## Changes Made - `cron/scheduler.py`: add `_ReadWriteLock` (writer-preferring) and the module-global `_terminal_cwd_lock`. - `cron/scheduler.py`: in `run_job`, acquire the lock as a writer for workdir jobs and as a reader for workdir-less jobs, spanning the `TERMINAL_CWD` override and its restore in the `finally` block. - `cron/scheduler.py`: correct the stale comments in `run_job` and `tick` that claimed the sequential pool alone made the override safe. - `tests/cron/test_terminal_cwd_lock.py`: new tests for reader concurrency, writer exclusion, and the no-cross-observation regression. ## How to Test 1. `python -m pytest tests/cron/test_terminal_cwd_lock.py -q` — the regression test `test_reader_never_observes_writer_override` fails without the lock and passes with it. 2. `python -m pytest tests/cron/test_cron_workdir.py tests/cron/test_parallel_pool.py -q` — confirms the existing `TERMINAL_CWD` set/restore and pool behaviour are unchanged. ## Checklist ### Code - [x] I've read the Contributing Guide - [x] My commit messages follow Conventional Commits (`fix(scope):`, etc.) - [x] I searched for existing PRs to make sure this isn't a duplicate - [x] My PR contains only changes related to this fix - [x] I've run the affected `tests/cron/` suites and all tests pass - [x] I've added tests for my changes (required for bug fixes) - [x] I've tested on my platform: macOS 15 (Darwin 25.5) ### Documentation & Housekeeping - [x] I've updated relevant documentation (docstrings/comments) — or N/A - [x] I've updated `cli-config.yaml.example` if I added/changed config keys — N/A - [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture — N/A - [x] I've considered cross-platform impact (Windows, macOS) — uses stdlib `threading` only - [x] I've updated tool descriptions/schemas if I changed tool behavior — N/A	2026-07-01 15:39:48 +05:30
srojk34	db0fd8f290	fix(security): use caller package root for deregister opt-in policy lookup _plugin_override_policy is keyed by the plugin package root (e.g. hermes_plugins.allowed), but the lookup used caller_mod (the exact leaf module string). A call from hermes_plugins.allowed.cleanup would evaluate _plugin_override_policy.get("hermes_plugins.allowed.cleanup") → False and raise PermissionError even when the plugin registered opt-in under its package root. Switch the policy lookup to caller_root (.join of the first two segments) so submodule callers inherit the package-level allow_tool_override grant. Adds a focused regression test for the opted-in submodule case.	2026-07-01 15:37:58 +05:30
testingbuddies24	e07768a53f	fix(gateway): strip orphan think-tag close tags in progressive stream When a model emits an inline <think>...</think> block but the opening tag is dropped upstream (thinking-mode toggle, truncated stream, or incomplete upstream filtering), the bare </think> close tag leaked through to the user in the live progressive edit. The agent-side final scrubber (agent/think_scrubber.py) already had _strip_orphan_close_tags; this ports the same logic into GatewayStreamConsumer so the streaming display stays clean too. - _filter_and_accumulate: strip orphan close tags before appending the 'no-opening-tag' branch text to _accumulated. - _flush_think_buffer: same on stream end for held-back partials. - 14 regression tests (TestStripOrphanCloseTags): all 6 close-tag variants, multi-tag, partial-tag-untouched, trailing whitespace, and end-to-end through _filter_and_accumulate / _flush_think_buffer. Only strips KNOWN close-tag names (case-insensitive) — never arbitrary tag-shaped substrings — so comparison operators and unrelated prose are preserved. Salvaged from PR #43192 by @testingbuddies24.	2026-07-01 03:04:01 -07:00
amathxbt	6a6fd42111	fix(security): block subshell/brace-group wrappers at the hardline floor Wrapping a catastrophic command in a bare subshell or brace group walked straight past the unconditional hardline floor -- even under --yolo, /yolo, approvals.mode=off, and cron approve mode. The command-substitution forms were already caught; the bare paren / brace-group forms were the gap. Rather than add the paren and brace openers to the flat _CMDPOS pattern class (which cannot tell a real subshell opener from one sitting inside a quoted argument, and would false-positive on ordinary prose such as a PR title that merely mentions the trigger word), teach the existing QUOTE-AWARE command-start tokenizer (_iter_shell_command_starts) to treat the paren and brace openers as command starts, then emit a detection variant that marks each real command start with a newline (already a _CMDPOS separator). Openers inside quotes never register as starts, so quoted arguments are left untouched while real subshell/brace bypasses now anchor. One place covers every _CMDPOS rule (shutdown/reboot/init/ systemctl/telinit and the rm root/home/system floor). Tests: subshell/brace bypasses added to the hardline-block, root-wipe, and yolo-bypass sets; a regression set asserts quoted paren/brace prose is NOT blocked (guards our own gh-pr-create workflow).	2026-07-01 03:03:05 -07:00
teknium1	6d1291f2cc	chore(deps): bump aiohttp to patched 3.14.1 (from 3.14.0) 3.14.1 is the current patched release on the 3.14 line; both CVE-2026-34993 (CookieJar.load RCE) and CVE-2026-47265 (per-request cookie leak on cross-origin redirect) are fixed as of 3.14.0, and 3.14.1 rolls up the subsequent point fixes. Re-locked uv.lock.	2026-07-01 02:51:45 -07:00
Wing Huang	6c37b2c785	security(deps): enforce aiohttp CVE floor on all lazy messaging paths + coverage guard The messaging extra and platform.slack pin aiohttp==3.14.0, but several lazy messaging features listed only their SDK and let aiohttp come in transitively. Each of those SDKs caps aiohttp loosely enough that a vulnerable already-installed aiohttp still satisfies the range, so the eager extras got the patched floor while the lazy paths did not: - discord.py (aiohttp>=3.7.4,<4) - mautrix / aiohttp-socks (aiohttp>=3,<4 / aiohttp>=3.10.0) [Matrix] - microsoft-teams-apps (aiohttp<4) [Teams] (Teams additionally shipped an explicit but stale aiohttp==3.13.4 in both the pyproject `teams` extra and platform.teams.) - tools/lazy_deps.py: add aiohttp==3.14.0 to platform.discord, platform.matrix; bump the stale platform.teams pin 3.13.4 -> 3.14.0. - pyproject.toml: add aiohttp==3.14.0 to the matrix extra; bump the teams extra 3.13.4 -> 3.14.0 (homeassistant/sms/messaging already at 3.14.0). - tests/test_packaging_metadata.py: test_security_pins_present_in_mirrored_lazy_features now covers platform.discord/slack/matrix/teams. The existing agree-guard only compares packages pinned in BOTH sources, so it can't catch a lazy feature that omits a pin entirely; this guard is an explicit coverage contract (security package -> lazy features that must carry it) and fails with 'platform.matrix: aiohttp=MISSING' if a floor is dropped again. - uv.lock: regenerated, zero drift (aiohttp 3.14.0).	2026-07-01 02:51:45 -07:00
Wing Huang	828f33e6b1	fix(ci): map contributor email for attribution check scripts/release.py AUTHOR_MAP is greped by the Contributor Attribution Check to resolve a commit author's email -> GitHub username. Add huangsen365@gmail.com -> huangsen365 so this PR's commits pass the check. (This commit originally also carried a gateway race-test flake fix; that edit is now dropped because main independently hardened the same test with a superior server._sessions snapshot/restore isolation, making ours redundant.)	2026-07-01 02:51:45 -07:00
Wing Huang	6f956d7405	test(deps): guard pyproject<->lazy_deps pin consistency Adds two checks to tests/test_packaging_metadata.py: 1. No package is exact-pinned to two different versions across pyproject.toml's [project.dependencies] / extras. 2. Every package pinned in BOTH the pyproject extras and the LAZY_DEPS allowlist in tools/lazy_deps.py uses the same version. This is the regression guard for the drift the rest of this PR fixes: the two pin sources are hand-maintained mirrors (lazy_deps even documents "update both this map AND the corresponding extra"), and they have silently diverged on aiohttp and anthropic. Run against the pre-fix tree, check (2) fails on `anthropic: pyproject=['0.86.0'] lazy_deps=['0.87.0']`. The lazy_deps side is parsed via AST (not imported) so the test stays free of tools/lazy_deps.py runtime imports; only exact `==` pins are compared.	2026-07-01 02:51:45 -07:00
Wing Huang	db57cbbaf6	security(deps): bump aiohttp to 3.14.0, anthropic to 0.87.0; pin cryptography floor - aiohttp 3.13.4 -> 3.14.0 (messaging/slack/homeassistant/sms extras + lazy_deps platform.slack) — picks up CVE-2026-34993 (RCE via CookieJar.load deserialization) and CVE-2026-47265 (per-request cookie leak on cross-origin redirect). Both are fixed only in 3.14.0; there is no 3.13.x backport. - anthropic 0.86.0 -> 0.87.0 (anthropic extra) — CVE-2026-34450 / CVE-2026-34452. lazy_deps provider.anthropic was already 0.87.0; the extra pin had drifted back to the vulnerable 0.86.0, so this realigns it. - cryptography pinned explicitly at 46.0.7 in core deps — CVE-2026-39892, CVE-2026-34073. It only arrives transitively via PyJWT[crypto]; the explicit floor keeps the WeCom/Weixin crypto paths from drifting below the fix. uv.lock regenerated; only aiohttp / anthropic moved (cryptography already resolved to 46.0.7). Verified 3.14.0 satisfies discord.py 2.7.1 (aiohttp>=3.7.4,<4) and slack-sdk 3.40.1 (aiohttp>=3.7.3,<4).	2026-07-01 02:51:45 -07:00

1 2 3 4 5 ...

14043 commits