hermes-agent

Author	SHA1	Message	Date
Teknium	87ae4ae94b	fix(update): harden #57659 follow-ups — task restore on failure, --force-venv split, trampoline detection, managed-install health (#57680) Five follow-ups to #57659 from post-merge review: 1. install.ps1: gateway scheduled-task re-enable now runs in a finally (a thrown Remove-Item/uv venv failure previously stranded the user's gateway autostart disabled), and tasks that were already disabled before the install are no longer blindly re-enabled. 2. The venv-python holder guard is no longer bypassed by plain --force (which the desktop bootstrap passes on every update while its lock probe only checks hermes.exe/app.asar). New explicit --force-venv is the escape hatch; --force keeps bypassing only the hermes.exe shim guard. 3. _detect_venv_python_processes now also catches uv/base-interpreter trampolines whose exe is outside the venv, via cmdline (venv path or '-m hermes_cli.main' tied to this install root) and cwd. 4. Missing venv python is now UNHEALTHY on managed installs (.hermes-bootstrap-complete / .update-incomplete markers) so the repair lane runs instead of 'Already up to date!'; the repair branch recreates the venv first when it's gone entirely. Dev checkouts keep reporting healthy. 5. install.ps1 comment no longer claims a Startup-folder disarm the code doesn't perform (logon-only, not a mid-install respawner).	2026-07-03 04:08:37 -07:00
Teknium	372f8195c7	fix(moa): default temperatures to unset — provider default, like single-model agents (#57440) A single-model Hermes agent never sends temperature; the provider default applies. MoA hardcoded reference_temperature=0.6 / aggregator_temperature=0.4, and the coercion float(preset.get(key, 0.6) or 0.6) made unset IMPOSSIBLE to express: absent, null, empty, and even an explicit 0 all collapsed to the baked-in default. Every MoA advisor and aggregator therefore ran at 0.6/0.4 while the same model running solo used the provider default — silently skewing solo-vs-MoA comparisons and overriding provider-tuned defaults. - moa_config normalization: temperatures coerce to None when absent/blank/invalid (new _coerce_float_or_none); explicit values incl. 0 honored. - moa_loop: _preset_temperature() resolves preset values; None flows to call_llm, which already omits the parameter when None (same contract as max_tokens). Aggregator still inherits the acting agent's own configured temperature when the preset doesn't pin one. - conversation_loop (context-mode MoA): same resolution, no more hardcoded 0.6/0.4 at the call site. - DEFAULT_CONFIG preset + web_server payload models + docs updated: unset is the default, pinning stays available.	2026-07-03 00:22:49 -07:00
Victor Kyriazakos	accd672054	fix(slack): MPIMs (group DMs) obey shared-surface mention gating + reaction guard Group DMs (MPIMs) were classified as DMs and thereby exempted from every operator control that shared surfaces are supposed to honor: allowed_channels, require_mention, strict_mention, free_response_channels, and the reaction guard. Symptom: the bot added 👀/✅ to unmentioned MPIM messages and still invoked the agent (which then returned NO_REPLY) instead of the gateway dropping the event before model execution. Removing an MPIM from allowed_channels did not disable it. Root cause is the DM classification at adapter.py: is_dm = channel_type in {"im", "mpim"} used for BOTH routing exemptions and reaction gating. An MPIM is a shared surface (multiple humans can see and trigger the bot), not a private 1:1 DM, so it must be gated like a channel. This behavior was introduced/reinforced by a trail of Slack group-DM PRs: - #4633 fix(slack): treat group DMs (mpim) like DMs + reaction guard - #54632 fix(slack): subscribe to message.mpim + mpim scopes so group DMs work - #54663 fix(slack): group DMs work OOTB + reinstall nudge #54632/#54663 correctly made MPIM messages reachable; #4633 over-reached by giving them the DM mention/reaction exemptions. This corrects only that over-reach. Fix (minimal): introduce `is_one_to_one_dm = channel_type == "im"` and key the two EXEMPTION sites off it instead of `is_dm`: - mention/allowlist gating block (`if not is_one_to_one_dm and bot_uid:`) - reaction guard (`(is_one_to_one_dm or is_mentioned)`) `is_dm` is intentionally retained for session/thread scoping and chat_type labeling, where treating an MPIM as a persistent multi-party conversation is correct — only the mention/reaction exemptions were wrong. Docs: slack.md now distinguishes 1:1 DMs (mention-exempt) from group DMs (shared surface; obey require_mention/strict_mention/allowed_channels/free_response_channels; reactions only when @mentioned).
Tests: +7 in test_slack_mention.py (MPIM unmentioned dropped under require_mention and strict_mention; MPIM mentioned processed; MPIM off allowed_channels dropped; MPIM in free_response opted in; 1:1 IM still exempt; reaction guard drops unmentioned MPIM). Updated _would_process to model the is_one_to_one_dm gating + strict_mention. 72 passed.	2026-07-03 12:34:53 +05:30
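The gating change in accd672054 can be illustrated with a reduced model. Only `is_one_to_one_dm`, `is_dm`, and the two quoted conditions come from the commit message; the function names and the simplified `require_mention` handling are hypothetical, and `allowed_channels`, `strict_mention`, and `free_response_channels` are left out.

```python
def should_process(channel_type: str, is_mentioned: bool,
                   require_mention: bool = True) -> bool:
    # Hypothetical reduction of the mention gate: only a 1:1 DM ("im")
    # is exempt; an MPIM is gated like any other shared surface.
    is_one_to_one_dm = channel_type == "im"
    if is_one_to_one_dm:
        return True
    return is_mentioned or not require_mention


def should_react(channel_type: str, is_mentioned: bool) -> bool:
    # Reaction guard as quoted: (is_one_to_one_dm or is_mentioned).
    is_one_to_one_dm = channel_type == "im"
    return is_one_to_one_dm or is_mentioned


def is_dm(channel_type: str) -> bool:
    # Retained classification, used for session scoping and labeling only.
    return channel_type in {"im", "mpim"}
```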
Jaaneek	5ef0b8acb0	feat(auth): make xAI Grok OAuth device-code-only, drop loopback login Replace the loopback/PKCE-callback server and manual-paste fallback with the RFC 8628 device-code flow as the only xAI Grok OAuth login path. The flow works in headless/SSH/container sessions with no 127.0.0.1 listener, shrinking the local attack surface. - Poll the token endpoint with server-provided interval, honoring slow_down and expires_in; store tokens with auth_mode oauth_device_code. - Adaptive proactive refresh skew for short-lived device-code JWTs; rotated tokens sync back to auth.json, the global root store, and the credential pool (no refresh-token replay). - Clear source suppression on successful re-login (CLI + dashboard) and drop the duplicate dashboard pool entry so exactly one seeded device_code entry exists. - Use the shared device_code source name for consistency with the nous/codex device-code providers. - Desktop: remove the loopback OAuth flow states and dead type variants; pkce providers' sign-in URL selection is unchanged. - Docs (EN + zh-Hans) rewritten for device-code login; drop the deleted --manual-paste flag from documented commands.	2026-07-02 13:17:41 -07:00
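For readers unfamiliar with the RFC 8628 polling rules mentioned in 5ef0b8acb0, a minimal sketch follows. It is not the repository's implementation: `poll_for_token` and its injected `post_token`, `sleep`, and `clock` parameters are hypothetical, chosen so the loop can run without a network. The error codes and the 5-second `slow_down` increment are taken from the RFC.

```python
import time
from typing import Callable, Dict


def poll_for_token(post_token: Callable[[], Dict], interval: float,
                   expires_in: float, sleep=time.sleep,
                   clock=time.monotonic) -> Dict:
    """Poll the token endpoint until success, failure, or expiry."""
    deadline = clock() + expires_in
    while clock() < deadline:
        sleep(interval)  # wait the server-provided interval before each poll
        resp = post_token()
        error = resp.get("error")
        if error is None:
            return resp  # token response
        if error == "authorization_pending":
            continue  # user has not approved yet
        if error == "slow_down":
            interval += 5  # RFC 8628: increase the interval by 5 seconds
            continue
        raise RuntimeError(f"device-code login failed: {error}")
    raise TimeoutError("device code expired before authorization")
```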
CrazyBoyM	ecffd290a3	feat(image-gen): support Codex image inputs	2026-07-02 17:12:24 +05:30
Teknium	543d305bbb	feat(moa): add reference_max_tokens to cap advisor output and cut turn latency (#56756) MoA per-turn latency is dominated by advisor GENERATION: turn wall time correlates ~0.88 with output tokens and ~-0.03 with input tokens (measured over 52 turns). Each turn waits for the slowest advisor to finish writing, and advisors were uncapped — writing multi-thousand-token essays the aggregator only needs the gist of. Add an opt-in per-preset reference_max_tokens knob (mirrors reference_temperature) that caps ADVISOR output only; the acting aggregator is never capped. Default None = uncapped, so existing presets are byte-for-byte unchanged (no regression). Wired through both MoA execution paths (MoAChatCompletions.create and aggregate_moa_context). E2E: same task, closed preset uncapped vs reference_max_tokens=600 -> 59s to 33s (~44% faster), final answer identical/correct. - hermes_cli/moa_config.py: _coerce_int_or_none helper + reference_max_tokens in _normalize_preset/_default_preset/flattened view - agent/moa_loop.py: read preset.reference_max_tokens, pass to reference fan-out - agent/conversation_loop.py: pass reference_max_tokens on the per-turn path - tests + docs	2026-07-02 00:16:35 -07:00
Teknium	76a468e513	feat(models): add claude-fable-5, claude-sonnet-5, fugu-ultra to curated OpenRouter + Nous lists (#56617) - claude-fable-5 placed above claude-opus-4.8 in both curated lists - claude-sonnet-5 replaces claude-sonnet-4.6 - sakana/fugu-ultra added near the bottom (before routers/free tier) - regenerated website/static/api/model-catalog.json via scripts/build_model_catalog.py (live-pulled by CLI, published on merge — no release needed)	2026-07-01 13:21:42 -07:00
Teknium	ba0bc01d1f	feat(delegate): remove model-facing toolsets arg — subagents always inherit parent's (#56386) The model could pass `toolsets` (top-level and per-task) to delegate_task, letting it choose which toolsets a subagent got. Toolset selection is a capability-scoping decision the model should not control; subagents inherit the parent's enabled toolsets, period. - Remove `toolsets` from the delegate_task() signature, the registry handler, the top-level + per-task JSON schema, and the live dispatch path (run_agent._dispatch_delegate_task — this forwarded it on every model call). - Single-task and per-task child builds now pass toolsets=None so _build_child_agent resolves to pure parent inheritance. - Drop the now-dead _SUBAGENT_TOOLSETS / _TOOLSET_LIST_STR schema-hint block. - _build_child_agent keeps its internal toolsets param + intersection helpers (internal API; fed the inherited value only). - Tests: schema assertions flipped to assertNotIn; added a regression test proving the dispatch path never forwards a smuggled model `toolsets`. - Docs: update delegate_task signature refs in the autonomous-ai-agents skill.	2026-07-01 05:35:26 -07:00
Steve Lawton	c73e74386b	feat(vertex): add Google Vertex AI provider for Gemini (OAuth2) Adds Vertex AI as a first-class provider for Gemini models via Vertex's OpenAI-compatible endpoint. Vertex authenticates with short-lived OAuth2 access tokens (service-account JSON or ADC), not a static API key — the missing piece behind the recurring requests (#13484, #12639, #56259). - agent/vertex_adapter.py: OAuth2 token minting + refresh-on-expiry (5-min margin), ADC->service-account fallback, global vs regional endpoint URLs. Config precedence: env var > config.yaml > default. - plugins/model-providers/vertex/: provider profile (auth_type=vertex), reuses Gemini's extra_body.google.thinking_config translation. - runtime_provider: vertex short-circuit BEFORE the credential pool so a credentials-file path is never mistaken for a static API key; mints a fresh token + computes base_url per resolve. - run_agent + conversation_loop: _try_refresh_vertex_client_credentials() re-mints the token and rebuilds the client on a mid-session 401, so a long-lived gateway agent survives token expiry (~1h). - auxiliary_client: vertex auth_type branch for side-LLM tasks. - config.yaml: vertex.project_id / vertex.region (non-secret, bridged to env); credential path stays in .env (VERTEX_CREDENTIALS_PATH). - setup wizard + model picker: dedicated _model_flow_vertex; curated google/gemini-* model list; --provider choices. - pricing/metadata: Vertex prices off the gemini docs snapshot; endpoint host auto-maps to the vertex provider (no probe spam). - lazy_deps + pyproject [vertex] extra: google-auth, opt-in only. - docs: guides/google-vertex.md + providers page; tests for adapter + runtime resolution. Salvages and modernizes #8427 by @slawt onto current main: rewired from the legacy PROVIDER_REGISTRY path to the provider-profile architecture, moved non-secret config out of .env into config.yaml, and added the per-turn 401 token-refresh the original lacked.	2026-07-01 05:25:33 -07:00
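The refresh-on-expiry behaviour with a 5-minute margin described in c73e74386b can be sketched as a small cache. The `TokenCache` class and its `mint` callback are hypothetical stand-ins; the actual adapter mints tokens through google-auth, which is not modelled here.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Re-mint an access token once it is within `margin_seconds` of expiry."""

    def __init__(self, mint: Callable[[], Tuple[str, float]],
                 margin_seconds: float = 300.0, clock=time.time):
        self._mint = mint          # returns (token, absolute expiry timestamp)
        self._margin = margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Refresh when no token is cached or the expiry margin is reached.
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._mint()
        return self._token
```

A mid-session 401, as the commit describes, would be handled separately by forcing a re-mint and rebuilding the client; the margin only reduces how often that path is hit.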
Brett	9f03095044	fix(telegram): cap initialize() with per-attempt timeout so unreachable fallback IPs can't hang startup Wrap each Telegram initialize() attempt in asyncio.wait_for(HERMES_TELEGRAM_INIT_TIMEOUT, default 30s). When api.telegram.org and all fallback IPs are unreachable, the connect chain has no outer bound, so a single initialize() blocks for minutes and the retry-on-exception loop never fires — the gateway appears to hang after the banner. The timeout guarantees each attempt is bounded, then retries with backoff, then fails with an actionable error. Also adds WARNING-level progress logs before DoH discovery and each connect attempt (visible at default log level). Salvaged onto plugins/platforms/telegram/adapter.py (Telegram moved from gateway/platforms/ since the PR was opened). Adds env var to docs + AUTHOR_MAP. Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>	2026-07-01 05:07:10 -07:00
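The per-attempt bound from 9f03095044 follows a common asyncio pattern, sketched below. The function name, the retry count, and the backoff schedule are assumptions for illustration; only `asyncio.wait_for` and the 30-second default come from the commit message.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def initialize_with_timeout(initialize: Callable[[], Awaitable[T]],
                                  attempts: int = 3, timeout: float = 30.0,
                                  base_delay: float = 1.0) -> T:
    """Bound each attempt, retry with exponential backoff, then fail loudly."""
    last_error: Exception = RuntimeError("no attempts made")
    for attempt in range(1, attempts + 1):
        try:
            # wait_for cancels the attempt if it exceeds the timeout,
            # so a hung connect chain cannot block startup indefinitely.
            return await asyncio.wait_for(initialize(), timeout=timeout)
        except (asyncio.TimeoutError, OSError) as exc:
            last_error = exc
            if attempt < attempts:
                await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(
        f"initialize() did not complete after {attempts} attempts"
    ) from last_error
```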
Ben	751a300fca	docs(cron): scope in_channel to channels; document DM continuation knob Live DM testing showed a reply to a DM cron brief did NOT continue the job. Root cause: for a 1:1 DM the governing knob is dm_top_level_threads_as_sessions (default True), NOT reply_in_thread / cron_continuable_surface. Under the default, each top-level DM keys to a per-message session (…:dm:<chat>:<ts>), so a reply mints a new ts and can never converge with the flat …:dm:<chat> session the cron seed creates. A 1:1 DM has no thread-vs-timeline split, so "in_channel" has no coherent meaning for a DM — cron_continuable_surface is a channel concept and is a no-op for DMs. DM continuation is governed entirely by dm_top_level_threads_as_sessions: - false → all top-level DMs share …:dm:<chat> → seed + reply converge → works - true (default) → per-message sessions → no continuation (cron or interactive) Option A (chosen): document the requirement; no code change (the flat-DM seed from the prior commit already lands correctly when the knob is false). Adds a ":::note 1:1 DMs" admonition to cron.md + the zh-Hans mirror. Verification (real inbound handler, not a hard-coded assumption — the mistake that made the earlier DM E2E falsely pass): tests/manual/cron_inchannel_dm_e2e.py drives the REAL _handle_slack_message for a top-level DM under both knob values and asserts false→converges (…:dm:D_TESTDM == seed), true→diverges (…:dm:D_TESTDM:<ts>). See decisions.md D9.	2026-07-01 03:16:13 -07:00
Ben	4b4349eb9a	feat(cron/slack): flat in-channel continuable cron delivery surface Add a per-platform `cron_continuable_surface` extra key (`thread` default | `in_channel`) so a continuable cron job can deliver FLAT into a Slack channel — no dedicated thread — and still be replied-to. In `in_channel` mode the scheduler skips the thread-open branch (leaves `thread_id=None`); the shipped origin-mirror then seeds the `(slack, chat_id, None)` shared-channel session — the same bucket `reply_in_thread: false` routes inbound channel replies to — so a plain channel reply continues the job in context. Design: specs/cron-inchannel-continuable (D1–D7, F5). Model B (shared-channel session), NOT anchoring to the delivery `ts` — on Slack replying to a specific message IS threading, so a `ts` anchor would only relocate the thread, never deliver true threadless continuable. - gateway/platforms/base.py: `supports_inchannel_continuable` capability flag (default False → unsupported platforms fail SAFE to `thread`). - plugins/platforms/slack/adapter.py: flag=True; `_cron_continuable_surface()` resolver (coerces to the two-value enum); `_warn_if_inchannel_without_flat_reply` connect-time warning (D5: warn, not hard-require — the misconfig fails safe). - gateway/config.py: shared-key bridge line (top-level OR nested config). - cron/scheduler.py: read the key generically from platform config, gate the `in_channel` branch on the adapter capability flag, skip thread-open. No new seed function (reuses the existing mirror — G6). Pairing (docs): `in_channel` + `reply_in_thread: false` + `require_mention: false` (or a free-response channel). Missing `reply_in_thread: false` fails safe to a threaded continuation. Gateway-side config flag — `/restart` to apply; NO Slack app reinstall.
Tests (from inside the worktree, PYTHONPATH=$PWD): - +6 cron scheduler tests (in_channel skips thread-open; seeds flat channel session with thread_id=None; thread-mode regression; fail-safe on unsupported platform; value coercion). Prove-fail: removing the `and not in_channel_surface` guard turns the two load-bearing tests RED; restore → GREEN. - +10 slack resolver/capability/warning tests; +2 config-bridge tests. - tests/manual/cron_inchannel_e2e.py: offline E2E driving BOTH real legs (delivery seed + inbound reply keying) → both converge on (slack, C, None). - No regressions: test_slack.py 216 passed alone; broader sweep green (4 pre-existing cross-file-ordering failures reproduce identically on pristine origin/main). Docs: cron.md + slack.md + zh-Hans mirrors of both.	2026-07-01 03:16:13 -07:00
Teknium	01e681aa48	docs: unify /new and /reset rows in gateway slash-commands table (#56235) The messaging gateway table still listed /new ("Start a new conversation") and /reset ("Reset conversation history") as two separate commands with divergent descriptions. /reset is an alias of /new (see COMMAND_REGISTRY in hermes_cli/commands.py) — same handler, fresh session ID + history. Collapse them into one row matching the registry wording and the CLI table already on line 39. Closes #42829.	2026-07-01 02:39:39 -07:00
Teknium	12556a9a77	chore(scripts): drop Open WebUI local bootstrap script (#56178) Remove scripts/setup_open_webui.sh and its 'one-command local bootstrap' doc sections (EN + zh-Hans). The script pip-installed the third-party Open WebUI frontend into ~/.local and managed a launchd/systemd user service — a maintenance liability for downstream software we don't own, and the source of the LAN first-admin signup footgun in #36121. The Open WebUI integration via the OpenAI-compatible API server is unaffected: the Docker/Docker-Compose setup, multi-user profile guide, and troubleshooting in open-webui.md stay, and Open WebUI remains a listed supported frontend. Only the install-and-service bootstrapper is gone.	2026-07-01 01:30:40 -07:00
Teknium	8d78be5460	revert: back out prompt_caching.enabled toggle (#56105) for re-evaluation (#56126) * Revert "fix(caching): honor prompt_caching.enabled across model switch + fallback" This reverts commit `36f9f50145`. * Revert "fix: allow disabling prompt caching" This reverts commit `c1c1a12fe6`.	2026-07-01 00:20:32 -07:00
teknium1	36f9f50145	fix(caching): honor prompt_caching.enabled across model switch + fallback @janrenz's PR #35862 added prompt_caching.enabled=false at init only. But _anthropic_prompt_cache_policy re-derives _use_prompt_caching on every /model switch (agent_runtime_helpers) and fallback-model swap (chat_completion_helpers), which re-enabled markers and re-broke the strict proxy the toggle was meant to fix. Move the kill switch into anthropic_prompt_cache_policy so it returns (False, False) on every path. Drop the now-redundant init-time override (kept @janrenz's isinstance hardening on the cache_ttl read). Add policy-level tests + docs for the toggle. Follow-up to salvaged PR #35862.	2026-07-01 00:10:42 -07:00
Ben	7c7b489813	feat(slack): render markdown tables as native Block Kit table blocks Replace the interim monospace table fallback with Slack's native `table` block (rows of rich_text cells). Addresses the core ask in #18918. - _table_block(): builds type:"table" with rich_text cells, so inline formatting (bold, links, code) renders inside cells. - Column alignment parsed from the markdown separator row (:---, :-:, --:) into column_settings (left = default/null-skip, center/right emitted). - Escaped pipes (\|) are not treated as column separators. - Respects Slack's table limits (100 rows / 20 cols / 10k aggregate chars); oversized or unparseable tables gracefully fall back to aligned monospace (rich_text_preformatted), so a big table never breaks the message. Docs (EN + zh-Hans) updated to describe native tables + the fallback. Tests: native table shape, alignment->column_settings, inline-formatted cells, oversized/too-wide monospace fallback, escaped-pipe cell. Prove-failed against a stubbed _table_block (native-table tests fail, fallback tests stay green). All existing Slack tests still pass.	2026-07-01 00:10:12 -07:00
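The separator-row alignment parsing mentioned in 7c7b489813 can be sketched as follows. `parse_alignments` is a hypothetical helper, not the repository's `_table_block()`; it assumes the separator row contains no escaped pipes and returns `None` for left-aligned columns to mirror the "left = default/null-skip" note.

```python
from typing import List, Optional


def parse_alignments(separator_row: str) -> List[Optional[str]]:
    """Map a markdown separator row to per-column alignment values."""
    cells = separator_row.strip().strip("|").split("|")
    alignments: List[Optional[str]] = []
    for cell in cells:
        cell = cell.strip()
        starts, ends = cell.startswith(":"), cell.endswith(":")
        if starts and ends:
            alignments.append("center")   # :-:
        elif ends:
            alignments.append("right")    # --:
        else:
            alignments.append(None)       # :--- or ---, left is the default
    return alignments
```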
Ben	b080b93ad8	feat(slack): opt-in Block Kit rendering for agent messages Add platforms.slack.extra.rich_blocks (default off). When enabled, the final agent message is sent as Slack Block Kit blocks — section headers, dividers, and true nested lists via rich_text — instead of flat mrkdwn. - New plugins/platforms/slack/block_kit.py: pure markdown->blocks renderer (headers, dividers, nested ordered/bullet lists, blockquotes, fenced code; pipe-tables as aligned monospace since Block Kit has no robust table block). Enforces Slack's 50-block / 3000-char section limits and returns None to fall back to plain text on empty/oversized/unexpected input. Never raises. - adapter.send(): render blocks on the single-chunk primary message; a text= fallback is ALWAYS sent alongside (notifications/accessibility). - adapter.edit_message(): blocks only on finalize=True, so intermediate streaming edits stay plain mrkdwn (no per-flush block re-derivation). - Docs (EN + zh-Hans) + config example. Send-side only: no app reinstall. Tests: pure-renderer unit suite + adapter integration suite (blocks present when on, plain text when off, text fallback always set, finalize gating, multi-chunk fallback). Prove-failed against a stubbed renderer.	2026-07-01 00:10:12 -07:00
Teknium	97e0bbef53	feat(lsp): add PowerShellEditorServices language server (#55930) Registers PowerShell (.ps1/.psm1/.psd1) in the LSP server registry, spawning PowerShellEditorServices over stdio via a pwsh/powershell host. PSES ships as a GitHub release zip (no npm/go/pip recipe), so it sits in the manual install tier alongside rust-analyzer and clangd. The spawn builder resolves the module bundle from (in order) the lsp.servers.powershell.command override, init bundlePath, the PSES_BUNDLE_PATH env var, or <HERMES_HOME>/lsp/PowerShellEditorServices, then launches Start-EditorServices.ps1 -Stdio with a non-interactive, no-profile host. hermes lsp status/list report it as manual-only until pwsh is present. Docs and tests included.	2026-06-30 16:22:18 -07:00
kshitijk4poor	7b12753948	feat(gateway): expose platform_connect_timeout in config.yaml Adds gateway.platform_connect_timeout (default 30s) to DEFAULT_CONFIG and bridges it to the internal HERMES_GATEWAY_PLATFORM_CONNECT_TIMEOUT env var at gateway startup, following the existing gateway_timeout config->env pattern. The env var remains the manual-override escape hatch and wins if set explicitly; otherwise config.yaml supplies the value. This closes the issue's documentation/config-surface request (#19776 suggestion 2) on top of the adapter ready-wait fix, so users no longer need an undocumented env var to raise the Discord connect timeout. Refs #19776	2026-06-30 15:03:25 -07:00
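The config-to-env bridge in 7b12753948, where an explicitly set environment variable wins over config.yaml, might look roughly like this. The function name and the dict-shaped `config` argument are assumptions; the key name, env var name, and 30-second default are from the commit message.

```python
import os
from typing import MutableMapping, Optional

ENV_KEY = "HERMES_GATEWAY_PLATFORM_CONNECT_TIMEOUT"


def bridge_connect_timeout(config: dict,
                           environ: Optional[MutableMapping[str, str]] = None) -> str:
    """Populate the env var from config unless it is already set."""
    environ = os.environ if environ is None else environ
    if environ.get(ENV_KEY):
        return environ[ENV_KEY]  # manual override wins
    value = config.get("gateway", {}).get("platform_connect_timeout", 30)
    environ[ENV_KEY] = str(value)
    return environ[ENV_KEY]
```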
Teknium	643b0dc678	fix(cron): raise default pre-run script timeout from 120s to 1h (#55489) Cron pre-run scripts were capped at 120s by default, which surprised users running long data-collection scripts on crons (the whole point of crons being to offload long work). Raise _DEFAULT_SCRIPT_TIMEOUT to 3600s (1 hour). This bounds the script only — skill/agent jobs already run on a separate inactivity budget (HERMES_CRON_TIMEOUT, default 600s idle, 0=unlimited), not a wall-clock cap. Scripts dispatch to a persistent thread pool and do not hold the tick lock, so a long script doesn't starve other due jobs. Docs clarified to make the script-vs-agent timeout distinction explicit. env/config overrides (HERMES_CRON_SCRIPT_TIMEOUT, cron.script_timeout_seconds) unchanged and still take precedence.	2026-06-30 01:00:39 -07:00
Brooklyn Nicholson	a10113658b	feat(agent): add pre_verify hook and verify-on-stop coding guidance Add a `pre_verify` user/plugin/shell hook fired once per turn when the agent edited code and is about to finish, after the existing verify-on-stop guard. A hook can keep the agent going one more turn (run a check, defer it, tidy the diff) by returning {"action":"continue","message":...} (the Claude-Code Stop shape {"decision":"block","reason":...} is accepted too). Hooks receive coding, attempt, final_response, and sorted changed_paths so they can self-scope and self-throttle; the path is bounded by agent.max_verify_nudges and preserves message-role alternation. Hermes still ships its default coding guidance (agent.verify_guidance, on by default), but it now rides the evidence-based verify-on-stop missing-evidence nudge instead of a separate default pre_verify continuation, so it costs no extra model turn of its own. Guidance reuses the shared utils.is_truthy_value parser rather than a local copy.	2026-06-30 00:59:29 -05:00
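The two hook return shapes accepted by `pre_verify` in a10113658b can be normalized with a small helper. `continuation_message` is a hypothetical name, and treating every other result as "let the agent stop" is an assumption; only the two JSON shapes are taken from the commit message.

```python
from typing import Any, Optional


def continuation_message(result: Any) -> Optional[str]:
    """Return the nudge text if the hook asks for one more turn, else None."""
    if not isinstance(result, dict):
        return None
    if result.get("action") == "continue":
        return str(result.get("message") or "")
    # Claude-Code Stop shape, accepted as an equivalent.
    if result.get("decision") == "block":
        return str(result.get("reason") or "")
    return None
```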
Ben Barclay	05ac16778b	feat(gateway): per-platform typing_indicator toggle Add a generic per-platform PlatformConfig.typing_indicator flag (default True) that gates the _keep_typing refresh loop in _process_message_background. When false, the loop is never spawned, so no typing/"is thinking…" status is shown on that platform — message delivery is otherwise unchanged. Mirrors the gateway_restart_notification contract exactly: dataclass field + to_dict/from_dict (with extra-fallback resolution) + shared-key bridge in load_gateway_config, so 'slack: typing_indicator: false' under platforms works without a separate block. Generic by design — the same key works for every platform (Slack 'is thinking…', Telegram/Discord/Signal typing). Motivated by users who find Slack's assistant 'is thinking…' status noisy (it also briefly disables the compose box, via the Assistant API).	2026-06-29 21:12:57 -07:00
Teknium	d4c14011eb	feat(claude-design): add surface-first conditioning + slop diagnostic (#55399) Port the two genuinely-novel ideas from Command Code's /design skill into our existing claude-design skill (skill-only, zero model-tool footprint): - Surface-First: commit to one of 7 surface archetypes (Monitor/Operate/Compare/Configure/Decide/Explore/Command) before any visual tokens. Most AI design slop is compositional, not cosmetic — conditioning generation on a surface choice collapses entropy the way a CoT step does. Workflow step 3. - Slop Diagnostic: the ~10 tells that account for ~90% of the 'this is AI' signal, as a score-out-of-10 self-audit. Diagnose-then-treat: the report is context not a to-do list; repair only what fired, matched to the tell (re-layout vs recolor vs de-decorate). Workflow step 7 (Verify). Did NOT clone /design's 16-mode CLI, proprietary reference corpus, or make it a core tool. Docs page regenerated via generate-skill-docs.py.	2026-06-29 21:12:29 -07:00
Teknium	c6c1fd8b6b	docs: create dev venv outside the source tree (root-cause fix for #7779) (#54862) A manually-installed venv inside the cloned repo can be destroyed by the agent running a relative-path command against its own checkout (rm -rf venv, uv venv venv, etc.), silently wiping the running runtime mid-session. Moving the canonical manual-install venv to ~/.hermes/venvs/hermes-dev means no relative path from the agent's workspace resolves to its own runtime, making the bug class impossible without any command-detection code. Closes the root cause of #7779. The managed install.sh layout is unchanged.	2026-06-29 10:00:37 -07:00
teknium1	75317d82d0	fix(vision): narrow the fan-out cap to the CPU encode burst only The original cap held a process-global slot across the WHOLE vision analysis (image load + encode + LLM call) with a default of min(CPUs, 4). That serialized legitimate multi-image workflows — "compare these 6 screenshots", "read this 10-page scan", "analyze every frame" — behind a 4-wide gate, and on the native fast path it even throttled calls that make no LLM request at all. Excess calls queued (blocking acquire, nothing dropped), but the latency hit on real fan-out was the wrong tradeoff. The incident was CPU exhaustion, not call count: concurrent base64/resize bursts saturated every core and left none to service the shared event loop serving /api/status. So cap ONLY that: - A dedicated, bounded ThreadPoolExecutor (_vision_cpu_executor) runs the encode/resize/dimension-check off the caller's loop, sized to the host's usable core count with NO fixed ceiling — the cap tracks the actual exhausted resource (cores), not a magic number. Excess encodes queue on the executor; cores stay free for the loop. - The LLM call is deliberately OUTSIDE the executor, so multi-image workflows keep full request concurrency. - Override via auxiliary.vision.max_concurrency / HERMES_VISION_MAX_CONCURRENCY (honored verbatim, including above core count); sub-1 ignored. - _vision_concurrency_slot() is now a no-op shim for back-compat. Tests assert: resolver defaults to host cores with no ceiling; env/config override (incl. above cores); sub-1 rejection; the executor is dedicated and core-sized; encode runs on a vision-encode thread; and crucially that encode bursts are bounded to the cap while the analyses themselves stay fully concurrent (calls_peak > cap).	2026-06-29 01:27:10 -07:00
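The design in 75317d82d0 (bound only the CPU-heavy encode step, leave the LLM call unbounded) can be sketched as below. The executor name mirrors the commit message, but everything else is an assumption: `resolve_vision_concurrency`, `_encode`, `analyze`, and the injected `call_llm` are hypothetical, and `os.cpu_count()` is only an approximation of the host's usable core count.

```python
import asyncio
import base64
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Awaitable, Callable, Optional


def resolve_vision_concurrency(override: Optional[int] = None) -> int:
    # Overrides are honored verbatim; values below 1 are ignored.
    if override is not None and override >= 1:
        return int(override)
    return os.cpu_count() or 1


# Dedicated pool: excess encodes queue here instead of saturating every core.
_vision_cpu_executor = ThreadPoolExecutor(
    max_workers=resolve_vision_concurrency(),
    thread_name_prefix="vision-encode",
)


def _encode(image_bytes: bytes) -> str:
    return base64.b64encode(image_bytes).decode("ascii")


async def analyze(image_bytes: bytes,
                  call_llm: Callable[[str], Awaitable[str]]) -> str:
    loop = asyncio.get_running_loop()
    # Only the encode burst runs on the bounded executor.
    encoded = await loop.run_in_executor(_vision_cpu_executor, _encode, image_bytes)
    # The LLM call stays outside the executor, so requests remain concurrent.
    return await call_llm(encoded)
```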
Ben Barclay	eddfecd2ce	fix(vision): cap vision_analyze fan-out concurrency process-wide A single agent turn can fan out N vision_analyze calls at once — the classic trigger is "analyze every frame of this video", where ffmpeg explodes a clip into dozens of frames and the model calls vision_analyze on each. Every call does a CPU-heavy base64-encode/resize burst AND holds a long-lived LLM stream open. The tool executor runs concurrent tool calls on a per-session ThreadPoolExecutor (_MAX_TOOL_WORKERS=8), and multiple agent sessions share one process (the dashboard runs the agent in-process), so there was no global ceiling. In prod (June 2026) a video-frame fan-out pinned a worker thread at ~100% CPU and starved the shared asyncio event loop that also serves the dashboard's /api/status liveness probe, flapping the instance to UNHEALTHY even though nothing had crashed. Add a process-global threading.BoundedSemaphore that bounds how many vision analyses run concurrently across the whole process, held across the entire analysis (image load + encode + LLM call) in the single _handle_vision_analyze chokepoint (covers both the native fast path and the legacy aux-LLM path). It is a threading semaphore, NOT asyncio: each vision call is dispatched through model_tools._run_async on a per-thread event loop, so an asyncio primitive bound to one loop cannot coordinate across them. The acquire is offloaded via run_in_executor so waiting for a slot never blocks the calling loop. Default: min(host CPUs, 4), floored at 1 — respect the host's concurrency, or lower. Override via auxiliary.vision.max_concurrency (config.yaml) or HERMES_VISION_MAX_CONCURRENCY (env). Values < 1 are ignored so the cap can never be disabled into an unbounded fan-out. Tests: bounded-fan-out regression guard + a control proving it would fail without the cap; resolver tests for host-cpu default, ceiling clamp, low-cpu host, env override, and sub-1 rejection. 
Pre-existing handler tests updated for the now-async _handle_vision_analyze. Verified via the real registry.dispatch -> _run_async per-thread-loop path (16 concurrent calls, peak bounded to cap).	2026-06-29 01:27:10 -07:00
teknium1	34e616e778	feat(slack): nudge stale installs to add mpim scopes; mark message.mpim required Follow-up to the group-DM manifest fix. The manifest change only helps NEW installs; existing apps keep their old (mpim-less) scopes until the admin reinstalls. Since a missing message.mpim event delivers nothing (no runtime API error to catch), detect stale installs at connect time from the auth.test x-oauth-scopes header and log an actionable reinstall nudge when im:history is granted but mpim:history is not. Also promote message.mpim from Recommended to Required in the docs event tables so the default setup path can't drop it.	2026-06-29 01:02:53 -07:00
Ben	4125cc3b7c	fix(slack): subscribe to message.mpim + mpim scopes so group DMs work Group DMs (multi-person DMs, channel_type=mpim) were never delivered to the Slack bot. The adapter already classifies mpim as a DM and replies ambiently (adapter.py:2526, is_dm = channel_type in {im, mpim}), but the generated app manifest only subscribed to message.im / im:history — the 1:1 DM pair. Without the message.mpim event subscription Slack drops group-DM messages before the adapter ever sees them, so 1:1 DMs worked while group-DM ambient mode was dead. Add message.mpim to bot_events and mpim:history (the scope that event requires per Slack docs) + mpim:read (mirrors im:read for the conversations.info classification call) to bot_scopes. Update the SLACK_BOT_TOKEN / SLACK_APP_TOKEN setup-help strings and the Slack docs (EN + zh-Hans: scope table, event table, troubleshooting) so existing installs are told to add the new scopes and reinstall. Reported by an enterprise customer. Note: this is a manifest/scope change, so it only takes effect after the app is reinstalled and the new scopes are accepted. Tests: assert message.mpim + mpim:history + mpim:read are in the manifest (with and without assistant mode); both fail on current main and pass with this change.	2026-06-29 01:02:53 -07:00
Ben Barclay	e1f4098b9f	docs(cron): document explicit per-channel delivery targets for all platforms (#54630) The cron delivery table only showed Discord/Telegram with explicit target syntax and described Slack and every other platform as home-channel-only. In fact the generic platform:<target> routing in _resolve_single_delivery_target resolves explicit targets for every platform: Slack (#channel / channel ID / channel:thread_ts), Matrix (room/user IDs), Feishu (chat:thread), WhatsApp (JID / E.164), Signal (group / E.164), SMS, Email, and Weixin all have dedicated explicit-target branches in _parse_target_ref; the remaining platforms accept a generic platform:<chat_id> passthrough. Update the Delivery Model table (en + zh-Hans) to show the real per-platform syntax, document #channel name resolution via the channel directory, and note the Slack thread_ts nuance. Docs-only.	2026-06-29 15:23:16 +10:00
Brooklyn Nicholson	e684b808ad	fix(desktop): route old runtimes through `dashboard` when `serve` is absent `hermes serve` is newer than the desktop binary's release cadence, so a new app launched against an un-upgraded managed install / PATH `hermes` would crash on an unknown subcommand and brick the user mid-upgrade. Detect whether the resolved runtime registers `serve` (fast source read of its dashboard.py, with a one-time CLI probe fallback) and rewrite the backend argv to the legacy `dashboard --no-open` only when it does not. Happy path (current runtimes) pays nothing and still spawns `serve`. - electron/backend-command.cjs: pure serve/dashboard argv helpers + serve-source detection (unit-tested in backend-command.test.cjs) - main.cjs: backendSupportsServe() cache + getBackendArgsForRuntime() guard at both backend spawn sites; expose `root` from the Windows venv unwrap so the fast source check covers Windows too - docs: note the backward-compat fallback in README, desktop.md, AGENTS.md	2026-06-28 22:10:42 -05:00
Brooklyn Nicholson	dff491a2b9	feat(cli): add headless `hermes serve` backend; desktop no longer launches `dashboard` The desktop app spawned `hermes dashboard --no-open` as its backend, which made the dashboard look like a desktop prerequisite. Add a dedicated headless `hermes serve` command that boots the same gateway (shared cmd_dashboard / start_server) but never opens a browser, and point the desktop backend spawn exclusively at it. dashboard and serve are now independent surfaces — neither launches the other. - subcommands/dashboard.py: factor shared server args; add `serve` parser (always headless; accepts legacy --no-open as a no-op) - main.py: register serve in _BUILTIN_SUBCOMMANDS + coalesce set + gui-log detection; extend stale-backend reaper patterns to match `serve` - desktop electron: spawn `serve`, rename dashboardArgs -> backendArgs, update comments + windows-child-process test assertions - docs: desktop README, desktop.md (incl. remote-backend), AGENTS.md, and cli-commands.md now describe `hermes serve` as the desktop/headless backend	2026-06-28 22:04:22 -05:00
Brooklyn Nicholson	f019a999d8	docs: clarify desktop is self-contained, not dependent on the dashboard The desktop app spawns a headless `hermes dashboard --no-open` backend and talks to it through the shared @hermes/shared WebSocket client — it never runs or requires the browser dashboard UI. Spell this out in the desktop README, the desktop docs page, and AGENTS.md so "dashboard" stops reading as a desktop prerequisite.	2026-06-28 21:50:33 -05:00
Teknium	b31b0b9d95	docs: reconcile docs with code across last 3 releases (#54254 ) Audited the last 3 releases (v2026.5.28..main) against the docs site and fixed code-vs-docs drift: - slash-commands: add /moa, /prompt, /pet, /hatch, /timestamps - cli-commands: add hermes pets / project / desktop / whatsapp-cloud + dashboard register; correct --insecure (now a deprecated no-op); add gateway migrate-legacy + enroll --wake-url + dashboard --skip-build - environment-variables: document the remaining ~48 env vars (SimpleX, Photon, Teams adapter, per-platform _ALLOW_ALL_USERS, home-channel vars, IRC, Brave/Krea/Notion/Linear/Airtable/Tenor keys, QQ_SANDBOX) — full OPTIONAL_ENV_VARS (265) now covered - configuration: document tool_loop_guardrails, goals, prompt_caching, network, onboarding, dashboard config blocks - toolsets/tools-reference + tools.md: add coding/project toolsets and read_terminal/project_ tools; remove the stale messaging toolset and send_message agent tool (removed in #47856); drop stale RL-training prose - messaging: new IRC channel page (adapter shipped without docs) + index row + sidebar + env vars - pets: document the /hatch AI generation pipeline + Nous/OpenRouter image backend - web-dashboard: document the bearer-token / TokenPrincipal service auth path - purge agent-callable send_message references across guides/features and the research-paper-writing skill (tool removed in #47856) Verified: docusaurus build succeeds; all authored internal links resolve.	2026-06-28 12:47:50 -07:00
Christian Persico	135f235165	docs: fix incorrect web search instructions	2026-06-28 02:54:27 -07:00
Teknium	de6e9ac760	docs(discord): document bot-to-bot comms as unsupported (#32791) (#54063) * docs(discord): document bot-to-bot comms as unsupported (#32791) Multi-profile bot-to-bot conversation is not a supported topology. DISCORD_ALLOW_BOTS=none (the default) blocks all bot-originated messages; setting mentions/all across multiple Hermes profiles to make them reply to each other ack-loops because Discord's reply auto-mention satisfies the mention gate every turn. Document the safe default and the loop hazard so operators don't wire it up. * docs(discord): infographic for bot-to-bot unsupported stance (#32791)	2026-06-28 01:15:34 -07:00
Teknium	1b70a91844	docs: third-party-product plugins ship standalone, not into core tree (#54001 ) * docs: third-party-product plugins ship standalone, not into core tree Generalizes the closed-set memory-provider policy to any plugin that integrates someone else's product/project (observability backends, vendor SaaS, analytics dashboards, paid-service tie-ins). These create an open-ended maintenance burden on us for backends we don't own, so they ship as standalone plugin repos installed into ~/.hermes/plugins/ and are promoted in #plugins-skills-and-skins — not merged into core. - AGENTS.md: new 'what we don't want' bullet + generalized policy note beside the memory-provider closed-set rule - CONTRIBUTING.md: new 'Third-Party Product Integrations' section - build-a-hermes-plugin.md: caution callout at the top of the guide It's a coupling decision, not a quality bar — a plugin can clear review and still be a close. * docs: add infographic for standalone-plugin policy	2026-06-27 22:23:50 -07:00
teknium1	a1ac6baac4	fix(gateway): make bg-process reset TTL configurable + surface session-scoped processes Follow-up to the cherry-picked #29212 (#29177): - Promote the 24h stale-process threshold to config.yaml (session_reset.bg_process_max_age_hours) instead of a hardcoded constant. 0 disables the cutoff (legacy: any live process blocks reset). Wired through GatewayConfig.default_reset_policy in gateway/run.py. - Bug 2: process(action=list) now resolves the gateway session_key from the contextvar and surfaces session-scoped background processes (a forgotten preview server under a different task), flagged session_scoped — so the agent/user can discover and kill the blocker. Previously the task-scoped list returned [] and the blocker was invisible. - Tests: config round-trip for the new field, cross-task list visibility. - Docs: messaging session-reset section.	2026-06-27 20:45:43 -07:00
xxxigm	6f1a176b33	fix(gateway/discord): REST liveness probe to detect zombie clients (#26656 ) The Discord adapter could enter a silent zombie state after a network outage / proxy stall: the process is alive, _client looks open, but the underlying socket is dead. discord.py's WebSocket reconnect never sees a RST through a wedged proxy/NAT, so client.start() spins forever without exiting — which means the bot-task done callback (which only fires on task completion) never trips either. The bot stays "offline" in Discord until a manual `hermes gateway restart`. Reported offline for 13-17h. Adds an out-of-band REST liveness probe in DiscordAdapter. Every `discord.liveness_interval_seconds` (default 60s) the adapter issues a cheap fetch_user(bot_id) — the same REST path as message delivery, so it fails when the proxy/NAT is wedged. After `discord.liveness_failure_threshold` consecutive failures (default 3) the probe closes the wedged client and surfaces a retryable fatal error, which trips the gateway's existing _platform_reconnect_watcher and rebuilds the adapter. Operators disable it by setting either knob to 0. Config lives in config.yaml (discord.liveness_) per the .env-is-secrets policy; _apply_yaml_config bridges it to internal env vars the adapter reads, matching the existing HERMES_DISCORD_TEXT_BATCH_ pattern. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-27 19:30:32 -07:00
Teknium	6717cfc805	docs(gateway): warn against custom ExecStopPost kill drop-in (restart loop) (#53903 ) A user-added systemd drop-in like ExecStopPost=/bin/kill -9 $MAINPID fires on every stop, including clean restarts — it SIGKILLs the freshly spawned gateway before it stabilizes and Restart=always respawns it, producing an infinite restart loop (issue #23272). The unit Hermes installs already shuts down cleanly via KillMode=mixed + KillSignal=SIGTERM with Restart=always + RestartForceExitStatus, so no extra kill is needed. Document this as a danger callout in the gateway service-management section.	2026-06-27 19:04:29 -07:00
Teknium	789f8b7dc2	docs(webhook): clarify authenticated != trusted-content trust model (#53562) HMAC validation authenticates the webhook sender, not the business fields inside the payload (PR titles, commit messages, issue bodies), which are authored by untrusted third parties. Expand the prompt-injection section to make the trust boundary explicit: the agent's capability surface, not the input channel. Document the hardening levers (sandbox the runtime, scope the toolset, keep approvals on, template narrowly) instead of pretending to sanitize untrusted text. Refs #8820.	2026-06-27 03:43:33 -07:00
teknium1	50f6855217	feat(moa): make /moa one-shot only; route preset switching through the model picker /moa no longer does a sticky model switch. It now always runs a single prompt through the default MoA preset and restores the prior model afterward; the whole argument is the prompt (no preset-name matching). To switch to a MoA preset for the session, select it from the model picker, where presets already surface under a virtual Mixture of Agents provider on every model-selection surface. Also fixes #53444: the TUI one-shot only set session[model_override], which the already-built cached agent ignored, so MoA silently never ran and the turn used the original model. The TUI now does a real in-place agent.switch_model() via _apply_model_switch() when a live agent exists (with a proper restore after the turn), and falls back to a model_override for lazy/unbuilt sessions. Removes the redundant sticky-switch branch from the CLI, gateway, and TUI /moa handlers; updates the command description, usage string, and docs.	2026-06-27 03:09:09 -07:00
Mahesh Sanikommu	1b75b3fd90	feat(memory): add Supermemory setup connection summary Add post_setup() and get_status_config() to the Supermemory memory provider so `hermes memory setup` and `hermes memory status` print a one-line connection summary (container, profile fact count, auto_recall/auto_capture). Point API-key onboarding at the Hermes connect URL (app.supermemory.ai/integrations?connect=hermes). Salvage of #52988. Two fixes folded in: - Test isolation: the new probe/status tests mocked _SupermemoryClient but not the __import__("supermemory") guard inside _probe_supermemory_connection, so they passed only where the optional supermemory package was installed and failed on a clean checkout / CI (the PR shipped with red CI). Added _stub_supermemory_importable() mirroring the existing test_is_available_false_when_import_missing pattern; the suite now passes with supermemory absent. - post_setup: `if api_key and api_key not in os.environ` checked whether the key's value named an env var (always false in practice). Fixed to compare the value: `os.environ.get("SUPERMEMORY_API_KEY") != api_key`. Verified: 38/38 in test_supermemory_provider.py and the full tests/plugins/memory/ suite green with supermemory not installed. Closes #52988	2026-06-27 15:07:34 +05:30
kshitijk4poor	cdb1dfbc49	fix: use os.pathsep, add tests, update tips for multi-root support - Use os.pathsep instead of literal ':' so Windows paths (C:\dir) and the Windows separator ';' work correctly. - Add 9 tests covering multi-root behavior: writes inside first/second root, writes outside all roots, trailing/leading/double separators, all-separators edge case, static deny priority, duplicate dedup. - Update hermes_cli/tips.py tip string to mention multiple paths. - Update docs to mention os.pathsep / ; on Windows. Follow-up for salvaged PR #49557.	2026-06-27 04:01:12 +05:30
Zheng Tao	d15cc9bc83	docs: update HERMES_WRITE_SAFE_ROOT docs with multi-path format Add note about colon-separated multiple directories support.	2026-06-27 04:01:12 +05:30
Teknium	9b2af36d5a	docs(moa): document prompt-caching behavior for references and aggregator (#53218) * docs(moa): document prompt-caching behavior for references and aggregator * docs(moa): clarify references preserve cache, only aggregator trades reuse * docs(moa): correct caching prose: tail-append preserves aggregator cache too	2026-06-26 12:58:05 -07:00
ethernet	ba7026c376	feat(docs): clarify termux/nix as t2 platforms	2026-06-26 11:37:56 -07:00
ethernet	772cf847b0	feat(docs): clarify platform support	2026-06-26 11:37:56 -07:00
Teknium	2d3071f9d4	docs(moa): clarify MoA presets are selectable on every surface (CLI, hermes model, Dashboard, Desktop, TUI) (#53211)	2026-06-26 11:16:14 -07:00
Teknium	9dd56f0dfb	docs(moa): add HermesBench results to Mixture of Agents page (#53206)	2026-06-26 11:05:07 -07:00

1270 commits