Commit graph

691 commits

Author SHA1 Message Date
Victor Kyriazakos
accd672054 fix(slack): MPIMs (group DMs) obey shared-surface mention gating + reaction guard
Group DMs (MPIMs) were classified as DMs and thereby exempted from every
operator control that shared surfaces are supposed to honor: allowed_channels,
require_mention, strict_mention, free_response_channels, and the reaction
guard. Symptom: the bot added 👀/ to unmentioned MPIM
messages and still invoked the agent (which then returned NO_REPLY) instead of
the gateway dropping the event before model execution. Removing an MPIM from
allowed_channels did not disable it.

Root cause is the DM classification at adapter.py:
    is_dm = channel_type in {"im", "mpim"}
used for BOTH routing exemptions and reaction gating. An MPIM is a shared
surface (multiple humans can see and trigger the bot), not a private 1:1 DM,
so it must be gated like a channel.

This behavior was introduced/reinforced by a trail of Slack group-DM PRs:
- #4633  fix(slack): treat group DMs (mpim) like DMs + reaction guard
- #54632 fix(slack): subscribe to message.mpim + mpim scopes so group DMs work
- #54663 fix(slack): group DMs work OOTB + reinstall nudge
#54632/#54663 correctly made MPIM messages *reachable*; #4633 over-reached by
giving them the DM mention/reaction *exemptions*. This corrects only that
over-reach.

Fix (minimal): introduce `is_one_to_one_dm = channel_type == "im"` and key the
two EXEMPTION sites off it instead of `is_dm`:
- mention/allowlist gating block (`if not is_one_to_one_dm and bot_uid:`)
- reaction guard (`(is_one_to_one_dm or is_mentioned)`)
`is_dm` is intentionally retained for session/thread scoping and chat_type
labeling, where treating an MPIM as a persistent multi-party conversation is
correct — only the mention/reaction exemptions were wrong.

Docs: slack.md now distinguishes 1:1 DMs (mention-exempt) from group DMs
(shared surface; obey require_mention/strict_mention/allowed_channels/
free_response_channels; reactions only when @mentioned).

Tests: +7 in test_slack_mention.py (MPIM unmentioned dropped under
require_mention and strict_mention; MPIM mentioned processed; MPIM off
allowed_channels dropped; MPIM in free_response opted in; 1:1 IM still exempt;
reaction guard drops unmentioned MPIM). Updated _would_process to model the
is_one_to_one_dm gating + strict_mention. 72 passed.
2026-07-03 12:34:53 +05:30
kshitijk4poor
9f60467426 refactor(slack): extract _is_list_line helper for list-marker checks
Deduplicate the '_BULLET_RE.match or _ORDERED_RE.match' idiom used at the
list-run entry guard and the blank-line lookahead into a single helper, so
adding future marker types is a one-point change. Pure refactor, no
behavior change (22 block_kit tests still pass).
2026-07-03 02:55:22 +05:30
kshitijk4poor
033d7bf259 fix(slack): guard blank-line list continuation on next-item lookahead
Refine the blank-line handling so a blank line only continues a list run
when the next non-blank line is another list item. This keeps a list ->
paragraph -> list sequence as three separate blocks and matches the
contiguous-list layout for mixed/nested lists (one rich_text block, split
into sub-lists by (indent, ordered)), rather than emitting a separate
block per item.

Adds regression tests for the mixed blank-separated layout and the
list->paragraph->list boundary.
2026-07-03 02:55:22 +05:30
liuhao1024
d3c8a155cb fix(slack): keep blank-line-separated ordered items in one rich_text_list
When a Markdown ordered list has blank lines between items (common in
LLM-authored content), the list run loop breaks on each blank line.
Slack numbers each rich_text_list independently, so N items produce N
lists each starting at 1.

Skip blank lines inside the list run as soft separators instead of
breaking, so ordered items stay in one rich_text_list and Slack renders
the correct numbering.

Fixes #57076
2026-07-03 02:55:22 +05:30
Brooklyn Nicholson
5a6720b884 fix(desktop,tui-gateway,zai): stop thinking-off from reverting to medium
A Z.ai desktop user reported thinking reverting to medium after one turn,
burning ~200% of a week's credits in 4 days despite reasoning_effort: false
in config.yaml. Four compounding bugs:

- _session_info reported reasoning_effort "" for disabled reasoning,
  indistinguishable from unset — the desktop adopted it after the first
  turn, wiping its sticky "thinking off" pick so every later chat
  reverted to the default effort.
- config.set key=reasoning always wrote agent.reasoning_effort to global
  config.yaml, so every desktop model-menu selection (preset.effort ??
  'medium') clobbered the user's configured value. Now session-scoped
  like the messaging gateway's /reasoning, landing on
  create_reasoning_override so lazily-built sessions keep it too.
- YAML `reasoning_effort: false`/`off`/`no` (boolean False) was coerced
  to "" by every loader's `str(x or "")`, silently re-enabling thinking.
  parse_reasoning_effort now treats False/"false"/"disabled" as
  {"enabled": False}; loaders (tui gateway, gateway, cli, cron,
  delegate) pass the raw value through. The desktop config reader also
  crashed on the boolean (false.trim()), aborting voice/STT settings.
- The zai provider profile never sent thinking on the wire, and GLM-4.5+
  defaults to thinking ON server-side — so disabling reasoning was a
  silent no-op on direct Z.ai, the actual token burner. The profile now
  emits extra_body.thinking {"type": "enabled"|"disabled"} for
  thinking-capable GLM models, mirroring the DeepSeek profile.

Also: /new (session reset) now carries reasoning_config across the
rebuild like model_override; config.get reasoning prefers the session's
live value and maps a config False to "none"; Settings shows "Off"
instead of a blank select for hand-written false.
2026-07-02 15:23:47 -05:00
kshitijk4poor
019950560d refactor(image-gen): reuse shared image sniffer + raster allowlist in codex backend
Replace the plugin-local _IMAGE_MAGIC_MIME table + _sniff_image_mime
body with a delegation to agent.image_routing._sniff_mime_from_bytes,
the canonical magic-byte sniffer already used across the codebase, then
gate its result to the raster formats gpt-image-2's Responses
input_image actually accepts (png/jpeg/gif/webp).

The shared sniffer also recognizes SVG/TIFF/ICO; without the allowlist
those would pass local validation and be rejected server-side with an
opaque HTTP 400. Gating locally fails them cleanly as invalid_image_input.
Adds a regression test for SVG rejection.

Follow-up on top of @CrazyBoyM's #55828.
2026-07-02 17:12:24 +05:30
CrazyBoyM
460235d584 test(image-gen): cap Codex reference inputs 2026-07-02 17:12:24 +05:30
CrazyBoyM
ecffd290a3 feat(image-gen): support Codex image inputs 2026-07-02 17:12:24 +05:30
CharmingGroot
88bd1c01e1 fix(email): harden adapter against malformed IMAP responses
Salvage of #2794 by @CharmingGroot, ported to the relocated
plugins/platforms/email/adapter.py:

- Guard raw_email = msg_data[0][1] against IndexError/TypeError and
  non-bytes payloads. UIDs are added to _seen_uids before fetch, so an
  exception mid-batch permanently skipped every remaining message in
  the batch — now the bad message is logged and skipped instead.
- Message-ID domain generation falls back to 'localhost' when
  EMAIL_ADDRESS lacks '@' (now via a shared _message_id_domain() helper
  covering all 3 send paths; the PR fixed 2 of 3).
2026-07-02 03:12:53 -07:00
snav
e9bceb5ae0 fix(discord): ignore reply-ping-only mentions for bot-authored messages
Two Hermes bots sharing a channel could volley replies at each other
indefinitely. Root cause: Discord reply-pings (allowed_mentions
replied_user=true) add the replied-to bot to message.mentions without a
literal <@bot> token in the body, so the existing bot-admission gate
treated a reply chip as an explicit @mention and re-triggered the peer.

Adds opt-in discord.bots_require_inline_mention (default false; env
DISCORD_BOTS_REQUIRE_INLINE_MENTION). When enabled, bot-authored
messages must carry a raw inline <@id>/<@!id> mention in the content;
reply-ping-only mentions no longer admit the message. Human messages and
all existing defaults are unchanged.

The new _self_is_raw_mentioned helper deliberately ignores the resolved
message.mentions list (which reply-ping populates) and checks only the
raw content token via the shared _raw_mentioned_user_ids primitive.
2026-07-01 15:38:34 -04:00
Steve Lawton
c73e74386b feat(vertex): add Google Vertex AI provider for Gemini (OAuth2)
Adds Vertex AI as a first-class provider for Gemini models via Vertex's
OpenAI-compatible endpoint. Vertex authenticates with short-lived OAuth2
access tokens (service-account JSON or ADC), not a static API key — the
missing piece behind the recurring requests (#13484, #12639, #56259).

- agent/vertex_adapter.py: OAuth2 token minting + refresh-on-expiry
  (5-min margin), ADC->service-account fallback, global vs regional
  endpoint URLs. Config precedence: env var > config.yaml > default.
- plugins/model-providers/vertex/: provider profile (auth_type=vertex),
  reuses Gemini's extra_body.google.thinking_config translation.
- runtime_provider: vertex short-circuit BEFORE the credential pool so a
  credentials-file path is never mistaken for a static API key; mints a
  fresh token + computes base_url per resolve.
- run_agent + conversation_loop: _try_refresh_vertex_client_credentials()
  re-mints the token and rebuilds the client on a mid-session 401, so a
  long-lived gateway agent survives token expiry (~1h).
- auxiliary_client: vertex auth_type branch for side-LLM tasks.
- config.yaml: vertex.project_id / vertex.region (non-secret, bridged to
  env); credential path stays in .env (VERTEX_CREDENTIALS_PATH).
- setup wizard + model picker: dedicated _model_flow_vertex; curated
  google/gemini-* model list; --provider choices.
- pricing/metadata: Vertex prices off the gemini docs snapshot; endpoint
  host auto-maps to the vertex provider (no probe spam).
- lazy_deps + pyproject [vertex] extra: google-auth, opt-in only.
- docs: guides/google-vertex.md + providers page; tests for adapter +
  runtime resolution.

Salvages and modernizes #8427 by @slawt onto current main: rewired from
the legacy PROVIDER_REGISTRY path to the provider-profile architecture,
moved non-secret config out of .env into config.yaml, and added the
per-turn 401 token-refresh the original lacked.
2026-07-01 05:25:33 -07:00
Brett
9f03095044 fix(telegram): cap initialize() with per-attempt timeout so unreachable fallback IPs can't hang startup
Wrap each Telegram initialize() attempt in asyncio.wait_for(HERMES_TELEGRAM_INIT_TIMEOUT,
default 30s). When api.telegram.org and all fallback IPs are unreachable, the connect
chain has no outer bound, so a single initialize() blocks for minutes and the
retry-on-exception loop never fires — the gateway appears to hang after the banner.
The timeout guarantees each attempt is bounded, then retries with backoff, then fails
with an actionable error. Also adds WARNING-level progress logs before DoH discovery
and each connect attempt (visible at default log level).

Salvaged onto plugins/platforms/telegram/adapter.py (Telegram moved from
gateway/platforms/ since the PR was opened). Adds env var to docs + AUTHOR_MAP.

Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>
2026-07-01 05:07:10 -07:00
Zeheng Huang
4c2c54c78c fix(matrix): await inbound sync handlers
Register the Matrix room-message, reaction, and invite handlers with
mautrix's wait_sync=True. mautrix's handle_sync() only returns the tasks
for handlers registered as sync-awaited; non-waited handlers are
fire-and-forget via background_task.create() and are NOT returned. Since
_dispatch_sync() awaits only the returned tasks (await asyncio.gather),
the inbound handlers previously had no completion point, so Tuwunel/
mautrix homeservers connected and completed initial sync but dispatched
zero inbound messages.

Fixes #46142.

Co-authored-by: Zeheng Huang <153708448+hunjaiboy@users.noreply.github.com>
2026-07-01 04:42:33 -07:00
Gary Walker
09dbe76955 fix(matrix): reset _device_id_unverified at start of connect()
Per review feedback on #53997 from @teknium1: the flag was set True
on failed device_id resolution but never reset, so a same-adapter
reconnect that successfully resolves a real device_id would keep
skipping server-side key verification indefinitely.

Reset now happens at the top of connect(), before resolution runs,
so every connect() attempt starts clean. A repeat failure re-sets
the flag (unchanged behavior); a recovery correctly clears it.

Adds TestDeviceIdRecoveryOnReconnect to cover the transition.
2026-07-01 16:46:40 +05:30
Gary Walker
9048457eab fix(matrix): device_id fallback prevents E2EE init failure on fresh bot accounts
- Resolve device_id via query_keys({mxid: []}) when whoami() returns None
- Guard _verify_device_keys_on_server and _reverify_keys_after_upload
  against None/unverified device_id to prevent 'device_keys values must
  be a list of strings' serialization failure
- Disconnect existing client before reconnect to prevent dual OlmMachine
  instances on the same crypto store

Re-targeted from #39779 (legacy gateway/platforms/matrix.py) onto the
migrated plugins/platforms/matrix/adapter.py path following the
2026-06-20 adapter migration. Logic unchanged from original fix.

242 tests passing (233 upstream + 9 new).
2026-07-01 16:46:40 +05:30
Tao Chen
d3c8667462 fix(slack): authorize bot/workflow senders before the no-user-id guard
Slack Workflow Builder posts (and other app/bot messages) arrive as
subtype=bot_message with user=None. _is_user_authorized rejected them at
the `if not user_id: return False` guard, which runs *before* the #4466
{PLATFORM}_ALLOW_BOTS bypass — so @mentioning the bot from a Slack
workflow silently did nothing, even with SLACK_ALLOW_BOTS (or
SLACK_ALLOW_ALL_USERS) set. The chat-scoped allowlist for Telegram/QQ
already runs before that guard for the same reason (channel broadcasts
with no from_user); Slack was both missing from the bot-bypass map and
had the bypass running too late.

- gateway/authz_mixin: move the {PLATFORM}_ALLOW_BOTS bypass ahead of the
  no-user-id guard and add Platform.SLACK -> SLACK_ALLOW_BOTS.
- plugins/platforms/slack/adapter: set is_bot=True on inbound
  bot_message events so the gateway can identify workflow/app senders
  (they carry no user_id to match against the allowlist).

Tested: new tests/gateway/test_slack_bot_auth_bypass.py plus the existing
Discord/Feishu bot-auth and gateway authz/gating suites all pass.
2026-07-01 16:32:32 +05:30
SahilRakhaiya05
bb304b4914 fix(gateway): fail-closed external-surface defaults + profile-aware multiplex authz
Aligns runtime behaviour with SECURITY.md 2.6: externally reachable
messaging adapters must fail closed unless access is explicitly
configured. Closes the confirmed multiplex authorization bypass a
secondary profile's open dm/group policy no longer inherits the default
profile's allowlist trust.

- Own-policy adapters (WhatsApp, WeCom, Weixin, QQBot, Yuanbao) default
  dm_policy/group_policy to pairing/allowlist instead of open; open now
  requires an explicit GATEWAY_ALLOW_ALL_USERS or per-platform allow-all.
- Startup guard (_own_policy_open_startup_violation) refuses to boot when
  an enabled adapter is open without the allow-all opt-in; the guard now
  runs for every secondary profile in multiplex mode too.
- Profile-aware own-policy authorization: _authorization_adapter /
  _adapter_for_source resolve the live adapter via SessionSource.profile,
  so _is_user_authorized and the ingress/pairing/busy/queue paths read the
  originating profile's adapter policy, not the default profile's.
- Fail-closed intake for Email, Feishu P2P, and Discord (blank-principal
  denial, empty-allowlist deny, missing-interaction.user deny).

Salvaged from #44073 (external-surface hardening), split into a focused
gateway-authz PR per maintainer request. Follow-up fix by Hermes Agent:
the Discord slash-auth channel bypass now matches DISCORD_ALLOWED_CHANNELS
by the same name-inclusive keys (id + name + #name + parent) the on_message
scope gate uses, so a name-form channel allowlist authorizes slash
interactions consistently (was id-only, breaking #name matching).

Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-07-01 03:56:28 -07:00
srojk34
8e94e8f882 fix(discord): tag unverified channel-context senders like Slack threads
Discord's _fetch_channel_context backfills recent channel/thread activity
(from any member who can post there, not just the allowlisted user) into
the agent's context with no sender-trust distinction. Slack's equivalent
_fetch_thread_context was fixed to prefix non-allowlisted senders with
[unverified] and add LLM guidance not to act on their content, mitigating
indirect prompt injection from third parties in shared channels/threads.
Port the same mechanism to Discord using the already-wired
_is_sender_authorized/set_authorization_check plumbing.
2026-07-01 16:25:16 +05:30
kangsoo-bit
7a2369718a fix(telegram): keep polling alive during transient bootstrap outages
A transient Bot API network error during gateway bootstrap (deleteWebhook
or the initial start_polling) currently raises out of connect() and marks
the Telegram adapter fatal, restart-looping the whole gateway even though
the right behavior is to degrade the Telegram channel and let the existing
reconnect ladder recover in the background.

- _delete_webhook_best_effort(): swallow only transient network errors and
  continue to polling; non-network errors (e.g. auth failures) still raise.
- _start_polling_resilient(): on a transient conflict/network error at
  bootstrap, schedule background recovery and return degraded instead of
  raising; non-transient errors still propagate.
- Track the polling error-callback recovery tasks in _background_tasks so
  they can't be garbage-collected mid-flight.
- Add a second Telegram Bot API seed fallback IP (149.154.166.110).

Reconnect keeps its existing 10-retry -> supervisor-restart semantics; this
change only fixes the bootstrap raise, it does not alter the retry ladder.
2026-07-01 03:42:32 -07:00
teknium1
69f08c2eb5 fix(telegram): guard _post_connect_task access for object.__new__ test pattern
disconnect() reads self._post_connect_task, but several tests build a bare
TelegramAdapter via object.__new__() without calling __init__ (which sets the
attr). Use getattr(..., None) so disconnect() works on those instances too
(pitfall #17).
2026-07-01 03:18:57 -07:00
LeonSGP43
3362bdb4e5 fix(telegram): defer post-connect housekeeping off the connect path
Command-menu registration (set_my_commands), the status-indicator, and
DM-topic setup make Bot API calls that can stall for certain bot tokens.
They ran inside connect() before/after _mark_connected() but still within
the coroutine the gateway wraps in a connect timeout, so one slow call blew
the whole connect and the adapter never came up — even though polling/webhook
was already live (getMe works via curl). Fixes #46298.

- mark connected as soon as polling/webhook startup succeeds
- move command-menu, status-indicator, and DM-topic setup into a cancellable
  background housekeeping task (_run_post_connect_housekeeping)
- cancel that task during disconnect so it can't fire into a torn-down client
- harden scope-name lookup with getattr fallback

Salvaged onto the relocated plugin adapter (plugins/platforms/telegram/
adapter.py) since the original PR #46404 targeted the pre-migration
gateway/platforms/telegram.py path.

Co-authored-by: Hermes Agent <teknium@nousresearch.com>
2026-07-01 03:18:57 -07:00
Ben
4b4349eb9a feat(cron/slack): flat in-channel continuable cron delivery surface
Add a per-platform `cron_continuable_surface` extra key
(`thread` default | `in_channel`) so a continuable cron job can deliver
FLAT into a Slack channel — no dedicated thread — and still be
replied-to. In `in_channel` mode the scheduler skips the thread-open
branch (leaves `thread_id=None`); the shipped origin-mirror then seeds
the `(slack, chat_id, None)` shared-channel session — the same bucket
`reply_in_thread: false` routes inbound channel replies to — so a plain
channel reply continues the job in context.

Design: specs/cron-inchannel-continuable (D1–D7, F5). Model B
(shared-channel session), NOT anchoring to the delivery `ts` — on Slack
replying to a specific message IS threading, so a `ts` anchor would only
relocate the thread, never deliver true threadless continuable.

- gateway/platforms/base.py: `supports_inchannel_continuable` capability
  flag (default False → unsupported platforms fail SAFE to `thread`).
- plugins/platforms/slack/adapter.py: flag=True; `_cron_continuable_surface()`
  resolver (coerces to the two-value enum); `_warn_if_inchannel_without_flat_reply`
  connect-time warning (D5: warn, not hard-require — the misconfig fails safe).
- gateway/config.py: shared-key bridge line (top-level OR nested config).
- cron/scheduler.py: read the key generically from platform config, gate
  the `in_channel` branch on the adapter capability flag, skip thread-open.
  No new seed function (reuses the existing mirror — G6).

Pairing (docs): `in_channel` + `reply_in_thread: false` +
`require_mention: false` (or a free-response channel). Missing
`reply_in_thread: false` fails safe to a threaded continuation.

Gateway-side config flag — `/restart` to apply; NO Slack app reinstall.

Tests (from inside the worktree, PYTHONPATH=$PWD):
- +6 cron scheduler tests (in_channel skips thread-open; seeds flat
  channel session with thread_id=None; thread-mode regression;
  fail-safe on unsupported platform; value coercion). Prove-fail:
  removing the `and not in_channel_surface` guard turns the two
  load-bearing tests RED; restore → GREEN.
- +10 slack resolver/capability/warning tests; +2 config-bridge tests.
- tests/manual/cron_inchannel_e2e.py: offline E2E driving BOTH real
  legs (delivery seed + inbound reply keying) → both converge on
  (slack, C, None).
- No regressions: test_slack.py 216 passed alone; broader sweep green
  (4 pre-existing cross-file-ordering failures reproduce identically on
  pristine origin/main).

Docs: cron.md + slack.md + zh-Hans mirrors of both.
2026-07-01 03:16:13 -07:00
kshitijk4poor
ede5c09f3b docs(disk-cleanup): clarify cron output-root protection is exact-match
Review follow-up: the _is_protected_cron_path docstring listed output/ next
to jobs.json/.tick.lock as 'the directory itself', which is slightly
ambiguous. Spell out that the match is EXACT-path only and must not be
'simplified' into a blanket cron/output/* guard (children stay cleanable) —
prevents a future editor from re-introducing the wholesale-delete bug this
fix closes.
2026-07-01 15:42:04 +05:30
martinramos002-bot
d173e8c3a7 fix: protect cron output root from cleanup
Only classify files below cron/output/ as disposable cron output.
The cron/output directory itself is a durable container for retained
job history and should not be tracked or deleted wholesale.

Add regression coverage for both category detection and cleanup of a
stale tracked entry pointing at the output root.
2026-07-01 15:42:04 +05:30
skyzh
cc7d20d683 feat(raft): add gateway setup wizard
Add an interactive Raft setup flow for hermes gateway setup. The wizard follows the existing platform adapter setup pattern, persists RAFT_PROFILE to the Hermes env file, preserves an existing profile when the user declines reconfiguration, and registers the flow via setup_fn.

Add focused Raft adapter coverage for saving RAFT_PROFILE, keeping an existing profile, and registering setup_fn.

Signed-off-by: skyzh <skyzh@mail.build>
Signed-off-by: HaoHao <HaoHao@mail.build>
2026-07-01 02:45:11 -07:00
heathley
a8a97c358f fix(matrix): block unsafe image redirects per-hop
Matrix outbound image downloads validated only the final URL after
following redirects, so a public URL that 302-redirects to loopback /
private-network / cloud-metadata endpoints had already connected to the
unsafe hop before the check ran.

Re-validate every redirect hop before following it:
- aiohttp path resolves redirects manually with allow_redirects=False,
  validating each Location via is_safe_url (aiohttp can't use the httpx
  response event hook).
- httpx fallback installs the shared _ssrf_redirect_guard event hook.

Regression tests cover per-hop blocking of an unsafe redirect, following
a safe redirect chain, and httpx guard wiring.
2026-07-01 02:44:57 -07:00
Teknium
275e293f54
fix(matrix): decline dead/abandoned invites instead of retrying forever (#56222)
An invite to a room with no remaining members surfaces as "no servers
in the room have been provided" or "room not found" on join. The pending
invite was never cleared, so every gateway startup re-attempted the join
and re-emitted the warning indefinitely.

Detect that specific failure mode by narrow error-message match and call
leave_room to decline the invite; transient/network errors leave the
invite untouched for the next sync. Adds 5 tests.

Reimplements the matrix portion of #33953 onto the current plugin adapter
(gateway/platforms/matrix.py was relocated to
plugins/platforms/matrix/adapter.py since the PR was opened). The two
gateway/status.py fixes from that PR (wrapper-subcommand rejection,
psutil start-time fallback) already landed on main independently.

Reported by @Bougey; original patch authored by @KiraKatana.
2026-07-01 02:44:18 -07:00
teknium1
43edbae638 fix(telegram): widen NoneType reconnect guard to the conflict-retry path
The network-error reconnect ladder (#55992) captured a stable self._app
local across its awaits and failed fast when the adapter was torn down
mid-sleep. The 409-conflict retry path had the identical unguarded
self._app.updater.start_polling() deref — a concurrent disconnect()
during its RETRY_DELAY sleep would raise the same 'NoneType' object has
no attribute 'updater' and, on a non-final retry, land in limbo. Apply
the same stable-local + fail-fast pattern so the existing except block
reschedules or escalates to fatal.
2026-07-01 02:03:58 -07:00
joaomarcos
a682091044 fix(telegram): close reconnect races that leave adapter half-destroyed
_handle_polling_network_error's chained retry never updated
self._polling_error_task, so the reentrancy guard shared with the
heartbeat loop and the pending-updates probe went stale mid-recovery,
letting more than one recovery attempt run concurrently against the
same adapter. Combined with a TOCTOU window in
_handle_adapter_fatal_error (the adapter was only removed from
self.adapters in a finally block after awaiting disconnect()), two
concurrent fatal notifications for the same adapter could both pass
the "still installed" check and call disconnect() twice, which is
where the reported "'NoneType' object has no attribute 'updater'"
originates once self._app is cleared by the first call.

- Reassign the chained retry task to self._polling_error_task so the
  guard reflects an in-flight recovery.
- Capture self._app in a local variable across the stop/start_polling
  sequence instead of re-reading self._app between awaits.
- Claim (pop) the adapter from self.adapters before awaiting
  disconnect() in _handle_adapter_fatal_error, not after, closing the
  TOCTOU window for a concurrent notification on the same adapter.
2026-07-01 02:03:58 -07:00
Teknium
259e6b87a7 fix(teams-pipeline): reject dot-only recording display_name
Path(raw).name reduces '..'/'.'/'' to themselves, so basename
extraction alone still let a Graph-provided display_name of '..' or
'../' escape the temp recording directory (tmp_dir / '..' resolves to
the parent). Reject the dot-only basenames explicitly and fall back to
the artifact id. Extends @outsourc-e's regression coverage with the
dot-only cases.
2026-07-01 02:03:48 -07:00
memosr
3590543312 fix(security): strip directory components from Teams recording display_name to prevent path traversal 2026-07-01 02:03:48 -07:00
Glen Workman
5505dbbf43 fix(telegram): accept both list and mapping shapes for group_topics config
The forum-topic skill-binding lookup assumed config.extra['group_topics']
was always a list of {chat_id, topics} entries. When an operator writes the
natural mapping shape ({"-100...": [...]}), iterating yields string keys and
chat_entry.get(...) raises AttributeError, breaking dispatch for that group.

Normalize both shapes to a common iterator and guard non-dict/non-list
entries so malformed config falls through cleanly instead of crashing.
2026-07-01 01:20:14 -07:00
zapabob
500c2b1e46 fix(security): close SSRF redirect-guard bypass across all httpx download hooks
Inside httpx AsyncClient response event hooks, response.next_request is
often None even for a genuine redirect, so guards keyed on
`if response.is_redirect and response.next_request` silently never fire.
A public URL that 302s to http://169.254.169.254/ was followed anyway,
defeating the pre-flight is_safe_url() check.

Resolve the redirect target from the Location header (via urljoin, so
relative Locations work too), falling back to next_request only when no
Location is present. Extracted as tools.url_safety.redirect_target_from_response
and wired into every SSRF redirect guard:

  - gateway/platforms/base.py  (shared image + audio download for all platforms)
  - tools/vision_tools.py       (two download hooks)
  - plugins/platforms/slack/adapter.py

Original fix by @zapabob (PR #35940), which targeted the since-refactored
gateway/platforms/slack.py; reconstructed onto the current shared sites and
widened to the whole bug class.
2026-07-01 01:18:53 -07:00
zapabob
2e12401ed4 fix(web): re-check Firecrawl final URLs for SSRF 2026-07-01 00:49:38 -07:00
0xsir0000
50a7dce6bd fix(discord): auto-thread failure must not silently fall back to inline reply
When discord.auto_thread is enabled and a top-level server-channel message
should be routed to a new thread, a transient thread-create failure (e.g.
Cannot connect to host discord.com:443) returned None and _handle_message
fell through to an inline parent-channel reply — dumping a new task into a
shared channel and breaking thread-first workflows.

- _auto_create_thread retries the primary + seed-message paths once after a
  750ms backoff for transient connect errors.
- _handle_message treats None as a hard failure: posts a short visible notice
  in the parent channel and returns without invoking the agent. The notify
  send is wrapped so a secondary connect error can't raise.

Fixes #20243
2026-07-01 00:12:17 -07:00
nocturnum91
cc1e4c32c0 fix(telegram): normalize thread id in group gating via shared helper
Group gating (_should_process_message) read the raw message_thread_id,
while event routing (_build_message_event) normalized it. A plain
non-forum group reply's message_thread_id is a reply-UI anchor, not a
topic, so an anchor id matching an ignored_threads entry wrongly
dropped the message, and the anchor was treated as a routable topic
under allowed_topics.

Extract _effective_message_thread_id and route both gating and
event-building through it, so gating and session routing agree on one
normalized value: real topic/forum messages keep their thread id, reply
anchors are dropped, and forum General-topic messages normalize to the
General-topic id.
2026-07-01 00:11:46 -07:00
teknium1
88c9dfecb2 docs(slack): correct block_kit docstrings to reflect native table blocks
The renderer now emits native Block Kit table blocks; the module and
_rich_blocks_enabled docstrings still described the earlier monospace-only
approach.
2026-07-01 00:10:12 -07:00
Ben
7c7b489813 feat(slack): render markdown tables as native Block Kit table blocks
Replace the interim monospace table fallback with Slack's native `table`
block (rows of rich_text cells). Addresses the core ask in #18918.

- _table_block(): builds type:"table" with rich_text cells, so inline
  formatting (bold, links, code) renders inside cells.
- Column alignment parsed from the markdown separator row (:---, :-:, --:)
  into column_settings (left = default/null-skip, center/right emitted).
- Escaped pipes (\\|) are not treated as column separators.
- Respects Slack's table limits (100 rows / 20 cols / 10k aggregate chars);
  oversized or unparseable tables gracefully fall back to aligned monospace
  (rich_text_preformatted), so a big table never breaks the message.

Docs (EN + zh-Hans) updated to describe native tables + the fallback.
Tests: native table shape, alignment->column_settings, inline-formatted
cells, oversized/too-wide monospace fallback, escaped-pipe cell. Prove-
failed against a stubbed _table_block (native-table tests fail, fallback
tests stay green). All existing Slack tests still pass.
2026-07-01 00:10:12 -07:00
Ben
b080b93ad8 feat(slack): opt-in Block Kit rendering for agent messages
Add platforms.slack.extra.rich_blocks (default off). When enabled, the
final agent message is sent as Slack Block Kit blocks — section headers,
dividers, and true nested lists via rich_text — instead of flat mrkdwn.

- New plugins/platforms/slack/block_kit.py: pure markdown->blocks renderer
  (headers, dividers, nested ordered/bullet lists, blockquotes, fenced code;
  pipe-tables as aligned monospace since Block Kit has no robust table block).
  Enforces Slack's 50-block / 3000-char section limits and returns None to
  fall back to plain text on empty/oversized/unexpected input. Never raises.
- adapter.send(): render blocks on the single-chunk primary message; a
  text= fallback is ALWAYS sent alongside (notifications/accessibility).
- adapter.edit_message(): blocks only on finalize=True, so intermediate
  streaming edits stay plain mrkdwn (no per-flush block re-derivation).
- Docs (EN + zh-Hans) + config example. Send-side only: no app reinstall.

Tests: pure-renderer unit suite + adapter integration suite (blocks present
when on, plain text when off, text fallback always set, finalize gating,
multi-chunk fallback). Prove-failed against a stubbed renderer.
2026-07-01 00:10:12 -07:00
syahidfrd
0198713c33 fix(security): reuse auth chain when tagging unverified senders in Slack threads
Mitigates indirect prompt injection (CWE-863) in Slack thread context.
When the bot is mentioned mid-thread for the first time, _fetch_thread_context
pulls the full thread via conversations.replies and prepends every reply to
the LLM prompt. Replies from senders not on the allowlist were rendered
identically to authorised senders, letting a third party in a shared channel
inject instructions the model might act on when answering the next authorised
message.

- BasePlatformAdapter.set_authorization_check / _is_sender_authorized, registered
  by GatewayRunner._make_adapter_auth_check() with a closure over the existing
  _is_user_authorized chain (platform/global/group allowlists, allow-all flags,
  pairing store all stay the single source of truth — no env-var re-parsing).
- Tags non-bot thread messages whose sender fails the auth check with an
  [unverified] prefix; strengthens the header with soft guidance only when at
  least one unverified message is present, so setups without an allowlist see
  no behaviour change.
- Wired into all three adapter-init sites in run.py (start, reconnect watcher,
  restart) so the reconnect path is covered too.

Softened wording: adapted from the original [untrusted] tag to [unverified]
and non-accusatory header framing — the label reflects allowlist status, not
a judgment about the person. Adapter relocated to plugins/platforms/slack/
since the PR was authored.

Salvaged from #17059.
2026-06-30 18:05:43 -07:00
CRWuTJ
8ad15ff7dd fix(telegram): cancel delayed deliveries on disconnect
Buffered text/photo/media-group flushes and the polling-error recovery
task sit behind an asyncio.sleep(). On disconnect they kept running and
dispatched handle_message() into a torn-down session, producing stale or
duplicate deliveries. disconnect() only cancelled media-group and photo
batch tasks — text batches and the polling-error task leaked.

Set a _drop_delayed_deliveries flag from _mark_disconnected/_set_fatal_error
(cleared by _mark_connected) and check it in all enqueue+flush paths so a
flush that wins the race against teardown drops instead of dispatching.
_cancel_pending_delivery_tasks() now cancels+clears all four task maps,
skipping the current task. Media-group flush finally-block guarded so a
cancelled stale flush cannot erase a replacement task handle.
2026-06-30 17:39:30 -07:00
teknium1
36bfe3a449 fix(anthropic+feishu): model-gate max_tokens fallback; wire Feishu channel_prompt
Two independent fixes salvaged from #12811 (closing it; one of its three
bundled fixes — Discord free_response — is already on main).

Anthropic max_tokens (#12790): the chat-completions max_tokens fallback only
fired for OpenRouter/Nous URLs, so any other proxy serving a Claude model
(AWS Bedrock, NVIDIA, LiteLLM, vLLM, corporate gateways) shipped requests
with no max_tokens and inherited the proxy's low default (Bedrock: 4096),
exhausting on thinking + large tool calls. Changed the gate in
chat_completion_helpers.build_api_kwargs from URL-gated to model-gated:
fires whenever the model matches an _ANTHROPIC_OUTPUT_LIMITS key. This also
fixes a latent miss — the old 'claude' substring gate skipped MiniMax and
Qwen3 even on OpenRouter. Remains a last-resort fallback (build_kwargs only
applies it after ephemeral/user/profile max_tokens), so it never overrides
an explicit value, and only touches the chat-completions transport (native
Anthropic Messages API is a separate path).

Feishu channel_prompt (#12805): the Feishu adapter never resolved
channel_prompts config, unlike Discord/Slack, so per-channel role prompts
were silently ignored. Added _resolve_channel_prompt() (delegating to the
shared gateway.platforms.base.resolve_channel_prompt) and wired it into all
three MessageEvent construction sites — inbound message, reaction routing,
and card-action routing.

Tests: tests/gateway/test_feishu_channel_prompts.py (6 cases) covering exact
match, parent-thread fallback, no-match, missing-config safety, and event
propagation.
2026-06-30 17:20:41 -07:00
codexGW
608e8a6062 fix(discord): accept raw direct bot mentions and ignore bare mention-only pings
Some legitimate @bot pings were dropped because the mention gates relied on
message.mentions alone, which does not always populate raw <@ID> / <@!ID>
forms (mobile, edited, relayed messages). A bare @bot with no other text
could also spawn a fake empty-text turn.

- add _self_is_explicitly_mentioned() / _raw_mentioned_user_ids() helpers that
  treat the bot as mentioned via resolved mentions OR raw content forms
- use them at the allow_bots=mentions gate, multi-agent bot filtering, the
  mention-strip/mention_prefix step, and the require_mention gate
- drop bare mention-only pings (no text, no media, no injection, no backfill
  context) instead of injecting a placeholder empty turn

Co-authored-by: Teknium <teknium1@gmail.com>
2026-06-30 16:38:31 -07:00
teknium1
638d2e7bfc fix(memory/holographic): apply FTS5 sanitizer to search_facts sibling
The store-level search_facts() shared the same raw-MATCH bug class as
_fts_candidates (FTS5 AND-joins tokens, zeroing prose recall). Route it
through FactRetriever._sanitize_fts_query via a lazy import to keep the
store->retrieval layering acyclic. Also add cyb3rwr3n to release AUTHOR_MAP.
2026-06-30 15:55:11 -07:00
cyb3rwr3n
cb6d6d46ab fix(memory/holographic): sanitize FTS5 queries for natural-language recall
The FactRetriever's _fts_candidates passed the raw query string directly
to FTS5's MATCH operator. FTS5 defaults to AND-between-tokens, which
means any multi-word prose query like 'what happened with the deployment
rollback' required every single token to co-occur in a fact — dropping
recall to zero on the kind of queries agents actually issue via prefetch().

Fix: add _sanitize_fts_query() that:
- tokenizes the query and drops English stopwords
- strips FTS5 operator characters per token
- OR-joins the remaining content tokens as phrase literals

For pathological inputs (all stopwords, empty), falls back to the raw
query so the caller sees zero results instead of a SQL error.

This is a pure-retrieval-quality fix — the HRR + Jaccard reranking
stages still keep precision high. Ships with 10 tests covering the
sanitizer and retrieval integration.
2026-06-30 15:55:11 -07:00
PRATHAMESH75
e55e9fad2c fix(telegram): recover when polling updater stops while process stays alive
The polling heartbeat's pending-update probe treated a stopped updater
(running=False) as "someone else's job" and silently reset its counter,
so a long-poll task that disappears with no reconnect in flight was never
recovered. get_me() on the general request path stays healthy, so neither
PTB's error_callback nor the connectivity probe ever fires — the gateway
keeps running but stops receiving messages indefinitely (#55769).

Detect the stopped-updater case directly in _probe_pending_updates and feed
it into the existing _handle_polling_network_error ladder, debounced over two
consecutive probes so a just-starting updater or the brief stop()->start_polling()
window of an in-flight reconnect never trips it.
2026-06-30 15:36:58 -07:00
Erosika
1f1d346ced fix(profile): resolve WhatsApp media-path cache roots per-call
The inbound-media validator _is_allowed_bridge_path() checked against
IMAGE_CACHE_DIR / AUDIO_CACHE_DIR / VIDEO_CACHE_DIR / DOCUMENT_CACHE_DIR
value-imported at module load. After the base.py cache-dir getters became
per-call resolvers, the bridge writes media into the active profile's cache
while the validator still matched the frozen launch-profile constants — so
media was rejected under a profile override (multi-profile gateway).

Resolve the cache roots per-call via the get_*_cache_dir() getters and drop
the now-unused frozen value-imports. Caught by automated review on #55867.
2026-06-30 15:30:06 -07:00
konsisumer
46ab06c238 fix(gateway): honor Discord connect timeout for ready wait 2026-06-30 15:03:25 -07:00
kshitijk4poor
1b3768558e docs(image-gen): align OpenRouter model-resolution docstrings with new precedence
The cherry-picked fix added explicit-kwarg and top-level image_gen.model
resolution but left _resolve_model / _resolve_model_chain docstrings stating
the old 'env override -> config -> DEFAULT_MODEL' order. Document the full
precedence (explicit kwarg -> env -> scoped -> top-level -> default chain) to
match the sibling krea/openai providers.
2026-06-30 19:11:49 +05:30
xxxigm
1324add563 fix(image-gen): honor top-level image_gen.model for Nous/OpenRouter
hermes tools persists the selected model to image_gen.model, but the
OpenRouter-compatible provider only read scoped image_gen.<provider>.model
and ignored the dispatch model kwarg — so Nous users always hit the default
quality-first chain and fell back to Gemini.
2026-06-30 19:11:49 +05:30