hermes-agent

Author	SHA1	Message	Date
SahilRakhaiya05	bb304b4914	fix(gateway): fail-closed external-surface defaults + profile-aware multiplex authz Aligns runtime behaviour with SECURITY.md 2.6: externally reachable messaging adapters must fail closed unless access is explicitly configured. Closes the confirmed multiplex authorization bypass: a secondary profile's open dm/group policy no longer inherits the default profile's allowlist trust. - Own-policy adapters (WhatsApp, WeCom, Weixin, QQBot, Yuanbao) default dm_policy/group_policy to pairing/allowlist instead of open; open now requires an explicit GATEWAY_ALLOW_ALL_USERS or per-platform allow-all. - Startup guard (_own_policy_open_startup_violation) refuses to boot when an enabled adapter is open without the allow-all opt-in; the guard now runs for every secondary profile in multiplex mode too. - Profile-aware own-policy authorization: _authorization_adapter / _adapter_for_source resolve the live adapter via SessionSource.profile, so _is_user_authorized and the ingress/pairing/busy/queue paths read the originating profile's adapter policy, not the default profile's. - Fail-closed intake for Email, Feishu P2P, and Discord (blank-principal denial, empty-allowlist deny, missing-interaction.user deny). Salvaged from #44073 (external-surface hardening), split into a focused gateway-authz PR per maintainer request. Follow-up fix by Hermes Agent: the Discord slash-auth channel bypass now matches DISCORD_ALLOWED_CHANNELS by the same name-inclusive keys (id + name + #name + parent) the on_message scope gate uses, so a name-form channel allowlist authorizes slash interactions consistently (was id-only, breaking #name matching). Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-07-01 03:56:28 -07:00
srojk34	8e94e8f882	fix(discord): tag unverified channel-context senders like Slack threads Discord's _fetch_channel_context backfills recent channel/thread activity (from any member who can post there, not just the allowlisted user) into the agent's context with no sender-trust distinction. Slack's equivalent _fetch_thread_context was fixed to prefix non-allowlisted senders with [unverified] and add LLM guidance not to act on their content, mitigating indirect prompt injection from third parties in shared channels/threads. Port the same mechanism to Discord using the already-wired _is_sender_authorized/set_authorization_check plumbing.	2026-07-01 16:25:16 +05:30
kangsoo-bit	7a2369718a	fix(telegram): keep polling alive during transient bootstrap outages A transient Bot API network error during gateway bootstrap (deleteWebhook or the initial start_polling) currently raises out of connect() and marks the Telegram adapter fatal, restart-looping the whole gateway even though the right behavior is to degrade the Telegram channel and let the existing reconnect ladder recover in the background. - _delete_webhook_best_effort(): swallow only transient network errors and continue to polling; non-network errors (e.g. auth failures) still raise. - _start_polling_resilient(): on a transient conflict/network error at bootstrap, schedule background recovery and return degraded instead of raising; non-transient errors still propagate. - Track the polling error-callback recovery tasks in _background_tasks so they can't be garbage-collected mid-flight. - Add a second Telegram Bot API seed fallback IP (149.154.166.110). Reconnect keeps its existing 10-retry -> supervisor-restart semantics; this change only fixes the bootstrap raise, it does not alter the retry ladder.	2026-07-01 03:42:32 -07:00
teknium1	69f08c2eb5	fix(telegram): guard _post_connect_task access for object.__new__ test pattern disconnect() reads self._post_connect_task, but several tests build a bare TelegramAdapter via object.__new__() without calling __init__ (which sets the attr). Use getattr(..., None) so disconnect() works on those instances too (pitfall #17).	2026-07-01 03:18:57 -07:00
LeonSGP43	3362bdb4e5	fix(telegram): defer post-connect housekeeping off the connect path Command-menu registration (set_my_commands), the status-indicator, and DM-topic setup make Bot API calls that can stall for certain bot tokens. They ran inside connect() before/after _mark_connected() but still within the coroutine the gateway wraps in a connect timeout, so one slow call blew the whole connect and the adapter never came up — even though polling/webhook was already live (getMe works via curl). Fixes #46298. - mark connected as soon as polling/webhook startup succeeds - move command-menu, status-indicator, and DM-topic setup into a cancellable background housekeeping task (_run_post_connect_housekeeping) - cancel that task during disconnect so it can't fire into a torn-down client - harden scope-name lookup with getattr fallback Salvaged onto the relocated plugin adapter (plugins/platforms/telegram/adapter.py) since the original PR #46404 targeted the pre-migration gateway/platforms/telegram.py path. Co-authored-by: Hermes Agent <teknium@nousresearch.com>	2026-07-01 03:18:57 -07:00
Ben	4b4349eb9a	feat(cron/slack): flat in-channel continuable cron delivery surface Add a per-platform `cron_continuable_surface` extra key (`thread` default | `in_channel`) so a continuable cron job can deliver FLAT into a Slack channel — no dedicated thread — and still be replied-to. In `in_channel` mode the scheduler skips the thread-open branch (leaves `thread_id=None`); the shipped origin-mirror then seeds the `(slack, chat_id, None)` shared-channel session — the same bucket `reply_in_thread: false` routes inbound channel replies to — so a plain channel reply continues the job in context. Design: specs/cron-inchannel-continuable (D1–D7, F5). Model B (shared-channel session), NOT anchoring to the delivery `ts` — on Slack replying to a specific message IS threading, so a `ts` anchor would only relocate the thread, never deliver true threadless continuable. - gateway/platforms/base.py: `supports_inchannel_continuable` capability flag (default False → unsupported platforms fail SAFE to `thread`). - plugins/platforms/slack/adapter.py: flag=True; `_cron_continuable_surface()` resolver (coerces to the two-value enum); `_warn_if_inchannel_without_flat_reply` connect-time warning (D5: warn, not hard-require — the misconfig fails safe). - gateway/config.py: shared-key bridge line (top-level OR nested config). - cron/scheduler.py: read the key generically from platform config, gate the `in_channel` branch on the adapter capability flag, skip thread-open. No new seed function (reuses the existing mirror — G6). Pairing (docs): `in_channel` + `reply_in_thread: false` + `require_mention: false` (or a free-response channel). Missing `reply_in_thread: false` fails safe to a threaded continuation. Gateway-side config flag — `/restart` to apply; NO Slack app reinstall.
Tests (from inside the worktree, PYTHONPATH=$PWD): - +6 cron scheduler tests (in_channel skips thread-open; seeds flat channel session with thread_id=None; thread-mode regression; fail-safe on unsupported platform; value coercion). Prove-fail: removing the `and not in_channel_surface` guard turns the two load-bearing tests RED; restore → GREEN. - +10 slack resolver/capability/warning tests; +2 config-bridge tests. - tests/manual/cron_inchannel_e2e.py: offline E2E driving BOTH real legs (delivery seed + inbound reply keying) → both converge on (slack, C, None). - No regressions: test_slack.py 216 passed alone; broader sweep green (4 pre-existing cross-file-ordering failures reproduce identically on pristine origin/main). Docs: cron.md + slack.md + zh-Hans mirrors of both.	2026-07-01 03:16:13 -07:00
skyzh	cc7d20d683	feat(raft): add gateway setup wizard Add an interactive Raft setup flow for hermes gateway setup. The wizard follows the existing platform adapter setup pattern, persists RAFT_PROFILE to the Hermes env file, preserves an existing profile when the user declines reconfiguration, and registers the flow via setup_fn. Add focused Raft adapter coverage for saving RAFT_PROFILE, keeping an existing profile, and registering setup_fn. Signed-off-by: skyzh <skyzh@mail.build> Signed-off-by: HaoHao <HaoHao@mail.build>	2026-07-01 02:45:11 -07:00
heathley	a8a97c358f	fix(matrix): block unsafe image redirects per-hop Matrix outbound image downloads validated only the final URL after following redirects, so a public URL that 302-redirects to loopback / private-network / cloud-metadata endpoints had already connected to the unsafe hop before the check ran. Re-validate every redirect hop before following it: - aiohttp path resolves redirects manually with allow_redirects=False, validating each Location via is_safe_url (aiohttp can't use the httpx response event hook). - httpx fallback installs the shared _ssrf_redirect_guard event hook. Regression tests cover per-hop blocking of an unsafe redirect, following a safe redirect chain, and httpx guard wiring.	2026-07-01 02:44:57 -07:00
Teknium	275e293f54	fix(matrix): decline dead/abandoned invites instead of retrying forever (#56222) An invite to a room with no remaining members surfaces as "no servers in the room have been provided" or "room not found" on join. The pending invite was never cleared, so every gateway startup re-attempted the join and re-emitted the warning indefinitely. Detect that specific failure mode by narrow error-message match and call leave_room to decline the invite; transient/network errors leave the invite untouched for the next sync. Adds 5 tests. Reimplements the matrix portion of #33953 onto the current plugin adapter (gateway/platforms/matrix.py was relocated to plugins/platforms/matrix/adapter.py since the PR was opened). The two gateway/status.py fixes from that PR (wrapper-subcommand rejection, psutil start-time fallback) already landed on main independently. Reported by @Bougey; original patch authored by @KiraKatana.	2026-07-01 02:44:18 -07:00
teknium1	43edbae638	fix(telegram): widen NoneType reconnect guard to the conflict-retry path The network-error reconnect ladder (#55992) captured a stable self._app local across its awaits and failed fast when the adapter was torn down mid-sleep. The 409-conflict retry path had the identical unguarded self._app.updater.start_polling() deref — a concurrent disconnect() during its RETRY_DELAY sleep would raise the same 'NoneType' object has no attribute 'updater' and, on a non-final retry, land in limbo. Apply the same stable-local + fail-fast pattern so the existing except block reschedules or escalates to fatal.	2026-07-01 02:03:58 -07:00
joaomarcos	a682091044	fix(telegram): close reconnect races that leave adapter half-destroyed _handle_polling_network_error's chained retry never updated self._polling_error_task, so the reentrancy guard shared with the heartbeat loop and the pending-updates probe went stale mid-recovery, letting more than one recovery attempt run concurrently against the same adapter. Combined with a TOCTOU window in _handle_adapter_fatal_error (the adapter was only removed from self.adapters in a finally block after awaiting disconnect()), two concurrent fatal notifications for the same adapter could both pass the "still installed" check and call disconnect() twice, which is where the reported "'NoneType' object has no attribute 'updater'" originates once self._app is cleared by the first call. - Reassign the chained retry task to self._polling_error_task so the guard reflects an in-flight recovery. - Capture self._app in a local variable across the stop/start_polling sequence instead of re-reading self._app between awaits. - Claim (pop) the adapter from self.adapters before awaiting disconnect() in _handle_adapter_fatal_error, not after, closing the TOCTOU window for a concurrent notification on the same adapter.	2026-07-01 02:03:58 -07:00
Glen Workman	5505dbbf43	fix(telegram): accept both list and mapping shapes for group_topics config The forum-topic skill-binding lookup assumed config.extra['group_topics'] was always a list of {chat_id, topics} entries. When an operator writes the natural mapping shape ({"-100...": [...]}), iterating yields string keys and chat_entry.get(...) raises AttributeError, breaking dispatch for that group. Normalize both shapes to a common iterator and guard non-dict/non-list entries so malformed config falls through cleanly instead of crashing.	2026-07-01 01:20:14 -07:00
zapabob	500c2b1e46	fix(security): close SSRF redirect-guard bypass across all httpx download hooks Inside httpx AsyncClient response event hooks, response.next_request is often None even for a genuine redirect, so guards keyed on `if response.is_redirect and response.next_request` silently never fire. A public URL that 302s to http://169.254.169.254/ was followed anyway, defeating the pre-flight is_safe_url() check. Resolve the redirect target from the Location header (via urljoin, so relative Locations work too), falling back to next_request only when no Location is present. Extracted as tools.url_safety.redirect_target_from_response and wired into every SSRF redirect guard: - gateway/platforms/base.py (shared image + audio download for all platforms) - tools/vision_tools.py (two download hooks) - plugins/platforms/slack/adapter.py Original fix by @zapabob (PR #35940), which targeted the since-refactored gateway/platforms/slack.py; reconstructed onto the current shared sites and widened to the whole bug class.	2026-07-01 01:18:53 -07:00
0xsir0000	50a7dce6bd	fix(discord): auto-thread failure must not silently fall back to inline reply When discord.auto_thread is enabled and a top-level server-channel message should be routed to a new thread, a transient thread-create failure (e.g. Cannot connect to host discord.com:443) returned None and _handle_message fell through to an inline parent-channel reply — dumping a new task into a shared channel and breaking thread-first workflows. - _auto_create_thread retries the primary + seed-message paths once after a 750ms backoff for transient connect errors. - _handle_message treats None as a hard failure: posts a short visible notice in the parent channel and returns without invoking the agent. The notify send is wrapped so a secondary connect error can't raise. Fixes #20243	2026-07-01 00:12:17 -07:00
nocturnum91	cc1e4c32c0	fix(telegram): normalize thread id in group gating via shared helper Group gating (_should_process_message) read the raw message_thread_id, while event routing (_build_message_event) normalized it. A plain non-forum group reply's message_thread_id is a reply-UI anchor, not a topic, so an anchor id matching an ignored_threads entry wrongly dropped the message, and the anchor was treated as a routable topic under allowed_topics. Extract _effective_message_thread_id and route both gating and event-building through it, so gating and session routing agree on one normalized value: real topic/forum messages keep their thread id, reply anchors are dropped, and forum General-topic messages normalize to the General-topic id.	2026-07-01 00:11:46 -07:00
teknium1	88c9dfecb2	docs(slack): correct block_kit docstrings to reflect native table blocks The renderer now emits native Block Kit table blocks; the module and _rich_blocks_enabled docstrings still described the earlier monospace-only approach.	2026-07-01 00:10:12 -07:00
Ben	7c7b489813	feat(slack): render markdown tables as native Block Kit table blocks Replace the interim monospace table fallback with Slack's native `table` block (rows of rich_text cells). Addresses the core ask in #18918. - _table_block(): builds type:"table" with rich_text cells, so inline formatting (bold, links, code) renders inside cells. - Column alignment parsed from the markdown separator row (:---, :-:, --:) into column_settings (left = default/null-skip, center/right emitted). - Escaped pipes (\|) are not treated as column separators. - Respects Slack's table limits (100 rows / 20 cols / 10k aggregate chars); oversized or unparseable tables gracefully fall back to aligned monospace (rich_text_preformatted), so a big table never breaks the message. Docs (EN + zh-Hans) updated to describe native tables + the fallback. Tests: native table shape, alignment->column_settings, inline-formatted cells, oversized/too-wide monospace fallback, escaped-pipe cell. Prove-failed against a stubbed _table_block (native-table tests fail, fallback tests stay green). All existing Slack tests still pass.	2026-07-01 00:10:12 -07:00
Ben	b080b93ad8	feat(slack): opt-in Block Kit rendering for agent messages Add platforms.slack.extra.rich_blocks (default off). When enabled, the final agent message is sent as Slack Block Kit blocks — section headers, dividers, and true nested lists via rich_text — instead of flat mrkdwn. - New plugins/platforms/slack/block_kit.py: pure markdown->blocks renderer (headers, dividers, nested ordered/bullet lists, blockquotes, fenced code; pipe-tables as aligned monospace since Block Kit has no robust table block). Enforces Slack's 50-block / 3000-char section limits and returns None to fall back to plain text on empty/oversized/unexpected input. Never raises. - adapter.send(): render blocks on the single-chunk primary message; a text= fallback is ALWAYS sent alongside (notifications/accessibility). - adapter.edit_message(): blocks only on finalize=True, so intermediate streaming edits stay plain mrkdwn (no per-flush block re-derivation). - Docs (EN + zh-Hans) + config example. Send-side only: no app reinstall. Tests: pure-renderer unit suite + adapter integration suite (blocks present when on, plain text when off, text fallback always set, finalize gating, multi-chunk fallback). Prove-failed against a stubbed renderer.	2026-07-01 00:10:12 -07:00
syahidfrd	0198713c33	fix(security): reuse auth chain when tagging unverified senders in Slack threads Mitigates indirect prompt injection (CWE-863) in Slack thread context. When the bot is mentioned mid-thread for the first time, _fetch_thread_context pulls the full thread via conversations.replies and prepends every reply to the LLM prompt. Replies from senders not on the allowlist were rendered identically to authorised senders, letting a third party in a shared channel inject instructions the model might act on when answering the next authorised message. - BasePlatformAdapter.set_authorization_check / _is_sender_authorized, registered by GatewayRunner._make_adapter_auth_check() with a closure over the existing _is_user_authorized chain (platform/global/group allowlists, allow-all flags, pairing store all stay the single source of truth — no env-var re-parsing). - Tags non-bot thread messages whose sender fails the auth check with an [unverified] prefix; strengthens the header with soft guidance only when at least one unverified message is present, so setups without an allowlist see no behaviour change. - Wired into all three adapter-init sites in run.py (start, reconnect watcher, restart) so the reconnect path is covered too. Softened wording: adapted from the original [untrusted] tag to [unverified] and non-accusatory header framing — the label reflects allowlist status, not a judgment about the person. Adapter relocated to plugins/platforms/slack/ since the PR was authored. Salvaged from #17059.	2026-06-30 18:05:43 -07:00
CRWuTJ	8ad15ff7dd	fix(telegram): cancel delayed deliveries on disconnect Buffered text/photo/media-group flushes and the polling-error recovery task sit behind an asyncio.sleep(). On disconnect they kept running and dispatched handle_message() into a torn-down session, producing stale or duplicate deliveries. disconnect() only cancelled media-group and photo batch tasks — text batches and the polling-error task leaked. Set a _drop_delayed_deliveries flag from _mark_disconnected/_set_fatal_error (cleared by _mark_connected) and check it in all enqueue+flush paths so a flush that wins the race against teardown drops instead of dispatching. _cancel_pending_delivery_tasks() now cancels+clears all four task maps, skipping the current task. Media-group flush finally-block guarded so a cancelled stale flush cannot erase a replacement task handle.	2026-06-30 17:39:30 -07:00
teknium1	36bfe3a449	fix(anthropic+feishu): model-gate max_tokens fallback; wire Feishu channel_prompt Two independent fixes salvaged from #12811 (closing it; one of its three bundled fixes — Discord free_response — is already on main). Anthropic max_tokens (#12790): the chat-completions max_tokens fallback only fired for OpenRouter/Nous URLs, so any other proxy serving a Claude model (AWS Bedrock, NVIDIA, LiteLLM, vLLM, corporate gateways) shipped requests with no max_tokens and inherited the proxy's low default (Bedrock: 4096), exhausting on thinking + large tool calls. Changed the gate in chat_completion_helpers.build_api_kwargs from URL-gated to model-gated: fires whenever the model matches an _ANTHROPIC_OUTPUT_LIMITS key. This also fixes a latent miss — the old 'claude' substring gate skipped MiniMax and Qwen3 even on OpenRouter. Remains a last-resort fallback (build_kwargs only applies it after ephemeral/user/profile max_tokens), so it never overrides an explicit value, and only touches the chat-completions transport (native Anthropic Messages API is a separate path). Feishu channel_prompt (#12805): the Feishu adapter never resolved channel_prompts config, unlike Discord/Slack, so per-channel role prompts were silently ignored. Added _resolve_channel_prompt() (delegating to the shared gateway.platforms.base.resolve_channel_prompt) and wired it into all three MessageEvent construction sites — inbound message, reaction routing, and card-action routing. Tests: tests/gateway/test_feishu_channel_prompts.py (6 cases) covering exact match, parent-thread fallback, no-match, missing-config safety, and event propagation.	2026-06-30 17:20:41 -07:00
codexGW	608e8a6062	fix(discord): accept raw direct bot mentions and ignore bare mention-only pings Some legitimate @bot pings were dropped because the mention gates relied on message.mentions alone, which does not always populate raw <@ID> / <@!ID> forms (mobile, edited, relayed messages). A bare @bot with no other text could also spawn a fake empty-text turn. - add _self_is_explicitly_mentioned() / _raw_mentioned_user_ids() helpers that treat the bot as mentioned via resolved mentions OR raw content forms - use them at the allow_bots=mentions gate, multi-agent bot filtering, the mention-strip/mention_prefix step, and the require_mention gate - drop bare mention-only pings (no text, no media, no injection, no backfill context) instead of injecting a placeholder empty turn Co-authored-by: Teknium <teknium1@gmail.com>	2026-06-30 16:38:31 -07:00
PRATHAMESH75	e55e9fad2c	fix(telegram): recover when polling updater stops while process stays alive The polling heartbeat's pending-update probe treated a stopped updater (running=False) as "someone else's job" and silently reset its counter, so a long-poll task that disappears with no reconnect in flight was never recovered. get_me() on the general request path stays healthy, so neither PTB's error_callback nor the connectivity probe ever fires — the gateway keeps running but stops receiving messages indefinitely (#55769). Detect the stopped-updater case directly in _probe_pending_updates and feed it into the existing _handle_polling_network_error ladder, debounced over two consecutive probes so a just-starting updater or the brief stop()->start_polling() window of an in-flight reconnect never trips it.	2026-06-30 15:36:58 -07:00
Erosika	1f1d346ced	fix(profile): resolve WhatsApp media-path cache roots per-call The inbound-media validator _is_allowed_bridge_path() checked against IMAGE_CACHE_DIR / AUDIO_CACHE_DIR / VIDEO_CACHE_DIR / DOCUMENT_CACHE_DIR value-imported at module load. After the base.py cache-dir getters became per-call resolvers, the bridge writes media into the active profile's cache while the validator still matched the frozen launch-profile constants — so media was rejected under a profile override (multi-profile gateway). Resolve the cache roots per-call via the get_*_cache_dir() getters and drop the now-unused frozen value-imports. Caught by automated review on #55867.	2026-06-30 15:30:06 -07:00
konsisumer	46ab06c238	fix(gateway): honor Discord connect timeout for ready wait	2026-06-30 15:03:25 -07:00
teknium1	af5cea04ab	fix(discord): split oversized final edits, truncate mid-stream previews (#27881) DiscordAdapter.edit_message clipped any formatted payload over the 2,000-char cap to [:1997]+"..." and returned success=True, so the stream consumer believed the full reply landed and stopped — the user lost everything past the boundary and perceived the agent as quitting mid-task. edit_message is now overflow-aware, mirroring Telegram's proven contract: - finalize=True: split-and-deliver via _edit_overflow_split — edit chunk 1 in place, send chunks 2..N as reply-threaded continuations, return the last visible id in message_id plus continuation_message_ids so the stream consumer keeps editing the most recent chunk and can clean them all up. - finalize=False (mid-stream): truncate a one-message preview in place, never split. A mid-stream split moves the edit target to a continuation and the next accumulated-token tick re-splits, looping forever (the Telegram #48648 lesson the original port predated). - Reactive 50035 '2000 or fewer in length' on edit runs the same branch logic. - Partial continuation failure still reports success with a partial_overflow raw_response so the consumer retries the tail instead of marking a clipped reply complete. Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com> Co-authored-by: AhmetArif0 <147827411+AhmetArif0@users.noreply.github.com>	2026-06-30 03:49:52 -07:00
Keira Voss	a61cf774ce	feat(whatsapp): tag owner-typed inbound text with [owner reply] prefix When WHATSAPP_FORWARD_OWNER_MESSAGES is enabled and the bridge marks an inbound message with fromOwner=true, also prefix MessageEvent.text with "[owner reply] " at construction time. This makes the disambiguation survive any downstream plugin failure (e.g. handover-rule errors that bypass silent_ingest), so transcripts never misattribute owner-typed text to the customer. Idempotent: re-applies are guarded so a future producer that pre-tags text won't be double-prefixed.	2026-06-30 03:41:43 -07:00
keiravoss94	84f350efe0	feat(whatsapp): opt-in forwarding of owner-typed messages in bot mode In `WHATSAPP_MODE=bot` the bridge currently drops every fromMe inbound message — they are all assumed to be echoes of our own /send calls. That makes it impossible for plugins / agents to detect when a human owner has typed directly into a customer chat from the same WhatsApp Business account (e.g. via a linked phone or WhatsApp Web). This adds an opt-in `WHATSAPP_FORWARD_OWNER_MESSAGES` env var. When true, the bridge classifies fromMe inbound by looking up `key.id` in a bounded LRU of recently-sent message IDs (the existing 50-entry echo suppressor, bumped to 512 and extracted to a testable `outbound_ids.js` helper). Hits in the LRU are still dropped (echoes); misses are forwarded to the Python adapter with `fromOwner: true`. The Python adapter lifts that flag onto `MessageEvent.metadata["whatsapp_from_owner"]`. `metadata` is a new free-form dict on the event so future per-platform signals don't each need their own field. Default behaviour is unchanged: with the env flag unset, bot mode still drops every fromMe message exactly as before. Use cases for downstream consumers: - Implicit handover activation when the owner replies manually - Sliding TTL on owner activity (keep an active session alive while the owner is engaged) - Audit trails of owner interventions - Analytics on human-vs-bot reply ratios Heuristic limitation (documented in code): the LRU is in-memory. After a bridge restart, in-flight delivery receipts of pre-restart sends will briefly look like owner-typed for a few seconds until the set is repopulated. Persisting isn't worth the disk churn — downstream consumers should treat the flag as best-effort. Tests: - tests/gateway/test_whatsapp_from_owner.py (new): adapter sets the metadata flag iff the bridge payload has `fromOwner: true`; absent otherwise. - scripts/whatsapp-bridge/outbound_ids.test.mjs (new): LRU bounds, eviction order, falsy-id handling. 
Backwards compatibility: with the env flag unset, every code path is identical to before. No existing deployment is affected.	2026-06-30 03:41:43 -07:00
teknium1	f5eb4c307b	fix(gateway): stop Matrix upload fallback from leaking host path The Matrix adapter's _upload_file fell back to sending "(file not found: {file_path})" directly into the room — the same host-path leak class fixed for the base adapter and Slack in the previous commit. Replace it with a friendly notice, log the path at WARN for operators, and preserve any caller-supplied caption.	2026-06-30 03:24:36 -07:00
UgwujaGeorge	cb9d18c759	fix(gateway): stop media-send fallbacks from leaking host paths into chat The base BasePlatformAdapter implementations of send_voice, send_video, send_document, and send_image_file forwarded their _path argument verbatim into the chat text (e.g. "🎬 Video: /home/.../hermes/cache/..."). Telegram, Discord, and Slack adapters all fall back to those base methods when their native send raises — so a rejected video on Telegram surfaced the host filesystem layout to the user instead of a useful message. Replace the path-echo with a friendly notice, log the path for operator diagnostics, and keep the user-supplied caption intact. The Slack adapter had three identical sites that fell through to the same path-echo on its own native upload failures; fix those too. send_document still surfaces the caller-provided file_name (or the basename derived from it) since that is the user-facing filename, not a host path. Add regression tests asserting the _path argument never appears in the fallback content while caption text and explicit file_name still do.	2026-06-30 03:24:36 -07:00
teknium1	b6045170bb	fix(discord): extend channel-name matching to slash-command auth; clamp flush deadline to disconnect budget Follow-up to the salvaged #8008 fix: - Sibling-site fix: _evaluate_slash_authorization gated DISCORD_ALLOWED_CHANNELS / DISCORD_IGNORED_CHANNELS on numeric IDs only, so name/#name config that now works for on_message still silently failed for slash-command interactions. Refactor the channel-key helper to _discord_channel_keys_from_channel(channel, parent) and reuse it at the interaction gate. Fail-closed on missing channel id is preserved. - The contributor's hardcoded 8s flush deadline could be hard-cancelled mid-flush: _teardown_adapter already wraps cancel_background_tasks() in the per-adapter disconnect budget (HERMES_GATEWAY_ADAPTER_DISCONNECT_TIMEOUT, default 5s). The flush deadline now derives from that budget with headroom so it always completes inside it. - AUTHOR_MAP: map cypher@augmentl.com -> Nickperillo for CI. - Tests: slash-auth name/#name allow + name ignore matching.	2026-06-30 02:48:42 -07:00
Cypher	cb9308f0a6	fix(discord): channel name matching and flush pending sends on shutdown Two related fixes to the Discord gateway adapter: 1. Channel name matching (free-response, allowed, ignored, no-thread channels) Previously these config values only matched against numeric channel IDs. If a user configured free_response_channels: cypher (by name), the adapter would silently ignore it because it only intersected against channel_ids. Now the adapter builds a channel_keys set that includes the channel ID, channel name, and #channel-name form, and checks all three for each gate. 2. Flush pending text-batch tasks before shutdown The Discord adapter uses _pending_text_batch_tasks (its own dict) for merging rapid successive message chunks. These tasks were NOT added to self._background_tasks (the base class list), so the base cancel_background_tasks() never awaited them on restart/shutdown. This caused a race: in-flight response deliveries were cancelled before Discord had a chance to send them, resulting in silent dropped messages visible to users as tool-log-only replies with no text body. Fix: override cancel_background_tasks() in DiscordAdapter to await all pending text-batch tasks (8s deadline) before delegating to the base class.	2026-06-30 02:48:42 -07:00
Ben	184c10cf97	fix(slack): warn when configured token is a user token, not a bot token A Slack user/legacy token (xoxp-...) makes auth.test resolve to the installing human's member ID with no bot_id, so the adapter binds its identity (_bot_user_id / _team_bot_user_ids) to that human. Every "is this the bot?" check then misfires: that person's <@...> mentions wake the bot and are stripped as the bot's own mention, so the agent is genuinely told it was @mentioned and replies to messages merely addressed to that human (symptom: bot responds to "@trevor ..." and insists it was explicitly mentioned). There is no runtime API error to catch — a user token still sends/receives — so the only detectable moment is connect time. Add a warning-only nudge (_warn_if_not_bot_token) alongside the existing group-DM scope nudge: when auth.test resolves a user_id but no bot_id, log that the token is a user token and to use the xoxb-... Bot User OAuth Token. Warning-only: does not block a working-but-misconfigured install. Fires once per workspace per process.	2026-06-29 20:57:43 -07:00
Mibayy	9e490138a0	fix(security): fail-closed feishu webhook rate limiter + whatsapp bridge path guard Salvages the two still-valid hardenings from #5381 onto the relocated plugin adapters (the discord/feishu/whatsapp adapters moved to plugins/platforms/ since the PR was opened, and 4 of its 6 hunks are already on main or superseded). - feishu: rate limiter now denies untracked keys when the tracking table is at capacity after pruning stale entries (was: allow through without tracking). At-capacity-with-all-fresh-entries only happens under abuse, so allowing untracked requests let an attacker who flooded the table bypass the limiter entirely. Already-tracked keys and post-prune room are unaffected. - whatsapp: absolute file paths handed back by the Baileys bridge are now validated to resolve inside a known media cache dir before being attached. A compromised/buggy bridge could otherwise return an arbitrary path (e.g. /etc/passwd) that would be sent verbatim to the model. Guard resolves symlinks and accepts both the canonical cache/<kind> and legacy <kind>_cache layouts.	2026-06-29 04:25:31 -07:00
teknium1	34e616e778	feat(slack): nudge stale installs to add mpim scopes; mark message.mpim required Follow-up to the group-DM manifest fix. The manifest change only helps NEW installs; existing apps keep their old (mpim-less) scopes until the admin reinstalls. Since a missing message.mpim event delivers nothing (no runtime API error to catch), detect stale installs at connect time from the auth.test x-oauth-scopes header and log an actionable reinstall nudge when im:history is granted but mpim:history is not. Also promote message.mpim from Recommended to Required in the docs event tables so the default setup path can't drop it.	2026-06-29 01:02:53 -07:00
Teknium	74541beb9c	fix(security): cap WeCom callback body size before pre-auth XML parse (#54615) The WeCom callback endpoint (internet-facing, 0.0.0.0) parsed untrusted request bodies before signature verification. defusedxml already guards the entity-expansion class on main, but there was no cap on raw body size, so an unauthenticated POST could still force unbounded read work pre-auth. Set client_max_size=64KB on the aiohttp app (413 at the framework layer) plus an explicit length guard in _handle_callback as defense in depth. WeCom callbacks are small encrypted XML envelopes (media is delivered out-of-band via MediaId, never inline), so 64KB is ample for legitimate traffic. Adds tests for oversized (413) and normal-sized (not 413) bodies. Salvaged from #10192 by @memosr (body-size limit half; defusedxml half already superseded on main).	2026-06-28 22:35:43 -07:00
aaronagent	d836b2bac4	fix(matrix,mattermost): invite auth check + API path traversal guard Two platform-security hardenings: - Matrix: _on_invite now checks the inviter against the existing allow-list (_allowed_user_ids / GATEWAY_ALLOW_ALL_USERS) before auto-joining. Without this any federated Matrix user could invite the bot into arbitrary rooms, exposing its presence and metadata. The message and reaction paths already enforce this allow-list; the invite path bypassed it. - Mattermost: _api_get / _api_post / _api_put reject any path containing '..'. WebSocket-event values (channel_id, post_id, file_id) are interpolated directly into API paths, so a malicious or compromised server could craft traversal payloads to make the bot issue authenticated requests to arbitrary endpoints with its bearer token. The configurable-E2EE-passphrase change from the original PR is dropped: the matrix adapter was rewritten onto mautrix and the passphrase-protected key-export file no longer exists.	2026-06-28 20:47:33 -07:00
teknium1	c648ecdca5	fix(telegram): reject unauthorized users before event construction (#40863) Removed/unauthorized Telegram users could inject prompt content before the per-user auth gate fired. The adapter ran `_should_process_message`, `_build_message_event`, and text/photo batching, and dispatched to the runner, before `_is_user_authorized()` (gateway/authz_mixin.py) rejected the sender. Unmentioned group chatter from a removed user was also persisted into the session transcript via `_observe_unmentioned_group_message`, leaking into the agent's observed context independent of dispatch. Add `_is_user_authorized_from_message()` as an intake prefilter that runs in `_handle_text_message`, `_handle_command`, `_handle_location_message`, and `_handle_media_message` BEFORE batching, event construction, and the unmentioned-group observe branch. It reuses the runner's `_is_user_authorized()` with a correctly-shaped SessionSource (group vs forum vs dm, real chat_id for TELEGRAM_GROUP_ALLOWED_* allowlists), falls back to env allowlists, and only rejects when an allowlist actually exists; unknown DMs with no allowlist still reach the pairing flow. Channel posts authorize via `sender_chat` identity when `from_user` is absent. Co-authored-by: liuhao1024 <sunsky.lau@gmail.com> Co-authored-by: Carlos Manuel Cejas <carlosmcejas@gmail.com>	2026-06-28 14:25:15 -07:00
Brooklyn Nicholson	eeca59f489	fix(windows): hide remaining backend console-flash legs missed on main main (`cb982ad99`) wired windows_hide_flags() into the auxiliary git/gh/wmic/bash/powershell/taskkill legs but left two it didn't reach, plus the Electron backend-launch leg it explicitly deferred. Cover them the same way: - apps/desktop/electron/main.cjs: getNoConsoleVenvPython resolves the BASE pythonw.exe instead of the venv Scripts\pythonw.exe shim, which re-execs a console python.exe and flashes a conhost the desktop backend can't suppress. Both backend creators put the venv site-packages on PYTHONPATH so imports still resolve under the base interpreter. (main's commit said this Electron leg "needs a Windows-tested change of its own".) - tools/tts_tool.py, tools/transcription_tools.py, plugins/platforms/discord: ffmpeg conversions (voice notes / TTS / STT) via windows_hide_flags(). - plugins/platforms/whatsapp: netstat + taskkill bridge-port cleanup via windows_hide_flags(). All no-ops on POSIX. Tests assert the base-pythonw preference and the ffmpeg legs pass CREATE_NO_WINDOW.	2026-06-28 10:19:21 -05:00
teknium1	d5ba374c03	fix(telegram): detect wedged getUpdates consumer via pending_update_count The merged CLOSE-WAIT heartbeat (#52744) only probes get_me(), which uses the general request path and stays healthy while PTB's getUpdates consumer is silently wedged (updater.running=True but the long-poll task is stuck, observed on WSL2). DMs then queue in the Bot API and never reach handlers (#42909). Augment the existing _polling_heartbeat_loop to also probe get_webhook_info().pending_update_count. After two consecutive probes that see a non-draining queue while the updater claims to be running, escalate into the existing _handle_polling_network_error recovery ladder — no new restart machinery. No-ops in webhook mode, when the updater is not running, or when a reconnect is already in flight. Credit to @gazzumatteo, whose PR #42959 identified the pending_update_count signal as the missing liveness probe. This reuses the existing heartbeat + recovery path rather than adding a parallel watchdog. Fixes #42909.	2026-06-28 02:44:17 -07:00
liuhao1024	14baeefe1d	fix(matrix): record DM rooms in m.direct on invite to prevent group misclassification Rebase onto plugins/platforms/matrix/adapter.py (code moved from gateway/platforms/matrix.py). Same logic: _on_invite checks is_direct on invite events and calls _record_dm_room to persist in m.direct account data. Fixes #44679	2026-06-28 02:37:52 -07:00
yungchentang	7e2ca7f68d	fix(telegram): reset send pool after pool timeouts	2026-06-28 02:34:17 -07:00
Teknium	2ecb6f7fe6	fix(telegram): clear send_path_degraded on successful reconnect (#35205) (#54076) * fix(telegram): clear send_path_degraded on successful reconnect _send_path_degraded was cleared only in _verify_polling_after_reconnect, 60s after reconnect and only if scheduled. A clean start_polling() reconnect left the flag stuck True, short-circuiting send() and blocking all outbound messages until the deferred probe ran (or forever if it never did). Clear the flag the moment start_polling() succeeds; that is the recovery signal. The deferred probe remains a defensive re-check that re-enters the reconnect ladder (re-setting the flag) if it detects a silent wedge. Fixes #35205. * docs: add infographic for #35205 telegram send-path fix	2026-06-28 01:38:17 -07:00
konsisumer	3f543229f2	fix(telegram): notify user when clarify button tap arrives after expiry	2026-06-28 01:07:53 -07:00
sweetcornna	fc70d023d8	fix(telegram): apply bot auth policy to Telegram sources # Conflicts: # gateway/config.py	2026-06-28 00:57:03 -07:00
Teknium	f03823014b	fix(telegram): kill 409 polling conflict loop by disarming PTB retry synchronously (#53941 ) Telegram polling entered a self-inflicted ~31s loop of 409 Conflict -> retry -> resume -> Conflict. The error_callback PTB invokes synchronously inside its internal network_retry_loop only scheduled our async recovery task (loop.create_task) and returned, so PTB kept polling getUpdates on its own while our handler concurrently ran stop -> sleep -> start_polling. The two polling sessions overlapped and Telegram returned a fresh 409. Fix: in the conflict branch of the error_callback, synchronously set PTB's private polling stop_event before scheduling recovery. PTB's loop exits on its next tick (it races that event in do_action), so our handler owns polling alone. The handler's await updater.stop() drains the task and PTB clears the event, so the subsequent start_polling() builds a fresh event and is not poisoned. Keeps the existing reconnect ladder intact (option B) — fixes only the race. Defensive: probes mangled + unmangled stop_event spellings and no-ops (prior behaviour) if neither exists; never flips _running, which would make the handler skip stop() and leave the loop wedged.	2026-06-27 20:46:08 -07:00
konsisumer	11b0be8d15	fix(gateway): avoid Matrix pending invite boot loops	2026-06-27 20:45:51 -07:00
xxxigm	6f1a176b33	fix(gateway/discord): REST liveness probe to detect zombie clients (#26656) The Discord adapter could enter a silent zombie state after a network outage / proxy stall: the process is alive, _client looks open, but the underlying socket is dead. discord.py's WebSocket reconnect never sees a RST through a wedged proxy/NAT, so client.start() spins forever without exiting, which means the bot-task done callback (which only fires on task completion) never trips either. The bot stays "offline" in Discord until a manual `hermes gateway restart`. Reported offline for 13-17h. Adds an out-of-band REST liveness probe in DiscordAdapter. Every `discord.liveness_interval_seconds` (default 60s) the adapter issues a cheap fetch_user(bot_id), which uses the same REST path as message delivery, so it fails when the proxy/NAT is wedged. After `discord.liveness_failure_threshold` consecutive failures (default 3) the probe closes the wedged client and surfaces a retryable fatal error, which trips the gateway's existing _platform_reconnect_watcher and rebuilds the adapter. Operators disable it by setting either knob to 0. Config lives in config.yaml (discord.liveness_*) per the .env-is-secrets policy; _apply_yaml_config bridges it to internal env vars the adapter reads, matching the existing HERMES_DISCORD_TEXT_BATCH_* pattern. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-27 19:30:32 -07:00
Teknium	db16854f34	fix(telegram): surface failed media downloads to user and agent, not a silent empty turn (#53912) When a Telegram attachment download/cache fails (typically a transient httpx.ConnectError to Telegram's CDN), the except handler logged a warning and fell through to handle_message() with empty media and no text: the user thought the file was delivered, the agent saw a content-less turn with no signal an attachment was attempted, and the only record was a buried log line. Adds _surface_media_cache_failure(): replies to the user in Telegram so they know to retry, and appends an agent-visible notice to event.text via the existing _append_observed_note channel so the agent knows an attachment was attempted and failed. No new event fields (structured-event refactor is out of scope per #23045). Wired into all five cache-failure sites (photo, voice, audio, video, document), since they shared the identical silent fall-through. Bug 1 from #23045 (unsupported types routed as fake user messages) no longer exists on main: the document handler now accepts any file type, so there is no rejection branch to fix. Closes #23045	2026-06-27 19:12:57 -07:00
bykim0119	851f75d4df	fix(discord): honor "*" wildcard in DISCORD_ALLOWED_USERS (#22334) DISCORD_ALLOWED_USERS="*" now means "allow everyone", matching the SIGNAL_ALLOWED_USERS / DISCORD_ALLOWED_CHANNELS wildcard convention and the value `claw migrate` emits. Previously _is_allowed_user did exact ID matching only, so "*" matched no user and blocked every non-self sender, a P1 with no workaround. Three sites, all required for the fix to hold at runtime: - _is_allowed_user: short-circuit when "*" is in the allowlist. - connect(): exclude "*" from the intents.members trigger so the wildcard does not request the privileged Server Members intent (which can block the bot from coming online). - _resolve_allowed_usernames: preserve "*" verbatim; otherwise it lands in the username-resolution bucket, matches no member, and is silently dropped from the set and env var on the first on_ready, quietly undoing the fix. Slash auth delegates to _is_allowed_user (auto-covered); component auth already honors "*" on main.	2026-06-27 19:11:30 -07:00
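
The wildcard handling described in the last entry (851f75d4df) can be sketched roughly as follows. This is a minimal illustration, not the adapter's code: the standalone functions parse_allowlist, is_allowed_user, and entries_needing_resolution are hypothetical names, and treating "non-numeric, non-wildcard" entries as the ones needing username resolution is an assumption made for the example.

```python
from __future__ import annotations

WILDCARD = "*"


def parse_allowlist(raw: str) -> set[str]:
    """Split a comma-separated value into trimmed, non-empty entries."""
    return {entry.strip() for entry in raw.split(",") if entry.strip()}


def is_allowed_user(user_id: str, allowlist: set[str]) -> bool:
    """Return True when the sender passes the allowlist."""
    # Wildcard short-circuit: "*" admits every sender.
    if WILDCARD in allowlist:
        return True
    # Otherwise fall back to exact ID matching.
    return user_id in allowlist


def entries_needing_resolution(allowlist: set[str]) -> set[str]:
    """Entries that are neither numeric IDs nor the wildcard.

    Assumption (hypothetical): only these would need username lookup,
    so the wildcard is kept out of that bucket and never dropped.
    """
    return {e for e in allowlist if e != WILDCARD and not e.isdigit()}


if __name__ == "__main__":
    allow = parse_allowlist("*")
    print(is_allowed_user("999", allow))
    print(entries_needing_resolution(allow))
```

Keeping the wildcard out of the resolution bucket is the point of the third site in that commit: if "*" were treated as a username it would match no member and silently disappear from the set.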

247 commits