hermes-agent/website/docs/user-guide
Teknium 543d305bbb
feat(moa): add reference_max_tokens to cap advisor output and cut turn latency (#56756)
MoA per-turn latency is dominated by advisor GENERATION: turn wall time
correlates ~0.88 with output tokens and ~-0.03 with input tokens (measured over
52 turns). Each turn waits for the slowest advisor to finish writing, and
advisors were uncapped, so they wrote multi-thousand-token essays when the
aggregator only needs the gist.

Add an opt-in per-preset reference_max_tokens knob (mirrors reference_temperature)
that caps ADVISOR output only; the acting aggregator is never capped. Default
None = uncapped, so existing presets are byte-for-byte unchanged (no regression).
Wired through both MoA execution paths (MoAChatCompletions.create and
aggregate_moa_context).
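
A minimal sketch of the capping rule described above, not the actual code in
agent/moa_loop.py. The helper name build_request_kwargs, the "role" argument,
and the max_tokens/temperature request keys are assumptions made for
illustration; only the preset keys reference_max_tokens and
reference_temperature come from this change.

```python
from typing import Any, Dict, Optional


def build_request_kwargs(role: str, preset: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: per-call kwargs for an advisor or the aggregator."""
    kwargs: Dict[str, Any] = {}
    if role == "advisor":
        # The cap applies to advisor (reference) calls only.
        cap: Optional[int] = preset.get("reference_max_tokens")
        if cap is not None:  # None means uncapped, so the request is unchanged
            kwargs["max_tokens"] = cap
        temperature = preset.get("reference_temperature")
        if temperature is not None:
            kwargs["temperature"] = temperature
    # The aggregator branch adds nothing: the acting model is never capped.
    return kwargs
```

With a preset of {"reference_max_tokens": 600}, the advisor call would carry
max_tokens=600 while the aggregator call carries no cap; with the key absent
or None, both calls are identical to the pre-change behavior.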

E2E: same task, closed preset, uncapped vs reference_max_tokens=600: 59s -> 33s
(~44% less wall time, ~1.8x speedup); final answer identical and correct.

- hermes_cli/moa_config.py: _coerce_int_or_none helper + reference_max_tokens
  in _normalize_preset/_default_preset/flattened view
- agent/moa_loop.py: read preset.reference_max_tokens, pass to reference fan-out
- agent/conversation_loop.py: pass reference_max_tokens on the per-turn path
- tests + docs
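
One plausible shape for the coercion helper named in the list above. This is a
sketch under stated assumptions, not the code in hermes_cli/moa_config.py: the
function name coerce_int_or_none and the choice to treat booleans, unparsable
values, and non-positive numbers as "uncapped" (None) are assumptions.

```python
from typing import Any, Optional


def coerce_int_or_none(value: Any) -> Optional[int]:
    """Hypothetical: turn a config value into a positive int, else None."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if value is None or isinstance(value, bool):
        return None
    try:
        number = int(value)
    except (TypeError, ValueError, OverflowError):
        return None  # unparsable input falls back to "uncapped"
    # Assumption: a zero or negative cap is meaningless, so treat it as unset.
    return number if number > 0 else None
```

Falling back to None on bad input keeps a mistyped preset value from changing
behavior, which matches the stated goal that the default stays uncapped.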
2026-07-02 00:16:35 -07:00
features feat(moa): add reference_max_tokens to cap advisor output and cut turn latency (#56756) 2026-07-02 00:16:35 -07:00
messaging feat(cron/slack): flat in-channel continuable cron delivery surface 2026-07-01 03:16:13 -07:00
secrets feat(secrets/bitwarden): EU Cloud + self-hosted server URL support (#31378) 2026-05-24 02:19:57 -07:00
skills feat(delegate): remove model-facing toolsets arg — subagents always inherit parent's (#56386) 2026-07-01 05:35:26 -07:00
_category_.json feat: add documentation website (Docusaurus) 2026-03-05 05:24:55 -08:00
checkpoints-and-rollback.md feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709) 2026-05-06 05:44:35 -07:00
cli.md docs: deep audit — registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952) 2026-06-07 01:39:06 -07:00
configuration.md revert: back out prompt_caching.enabled toggle (#56105) for re-evaluation (#56126) 2026-07-01 00:20:32 -07:00
configuring-models.md fix(cli): warn when in-session model switch will preflight-compress 2026-06-21 16:29:31 +05:30
desktop.md fix(desktop): route old runtimes through dashboard when serve is absent 2026-06-28 22:10:42 -05:00
docker.md fix(docker): replace dashboard --insecure with basic-auth provider 2026-06-21 19:05:27 -07:00
git-worktrees.md docs: deep audit — registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952) 2026-06-07 01:39:06 -07:00
managed-scope.md docs: add managed scope admin guide + cross-link from configuration 2026-06-19 07:46:33 -07:00
multi-profile-gateways.md docs(gateway): document multiplexing opt-in + contract changes 2026-06-19 07:34:15 -07:00
profile-distributions.md Expand .gitignore example 2026-06-20 20:42:49 -07:00
profiles.md fix: make profile subprocess HOME policy explicit 2026-06-14 03:20:21 -07:00
security.md Make email pairing opt-in 2026-06-21 22:43:57 -07:00
sessions.md docs(sessions): clarify sessions.json is the gateway routing index, not the session list (#51726) 2026-06-23 23:56:36 -07:00
tui.md docs(tui): correct HERMES_TUI_GATEWAY_URL — dashboard-internal, not remote-attach (#42162) 2026-06-08 09:37:03 -07:00
windows-native.md docs(windows): correct native data dir to %LOCALAPPDATA%\hermes (#42856) 2026-06-09 14:11:20 -05:00
windows-wsl-quickstart.md fix(docs): update all install instructions everywhere 2026-06-04 21:07:45 -04:00