hermes-agent/website
Teknium 543d305bbb
feat(moa): add reference_max_tokens to cap advisor output and cut turn latency (#56756)
MoA per-turn latency is dominated by advisor GENERATION: turn wall time
correlates ~0.88 with output tokens and ~-0.03 with input tokens (measured over
52 turns). Each turn waits for the slowest advisor to finish writing, and
advisors were uncapped — writing multi-thousand-token essays the aggregator
only needs the gist of.

Add an opt-in per-preset reference_max_tokens knob (mirrors reference_temperature)
that caps ADVISOR output only; the acting aggregator is never capped. Default
None = uncapped, so existing presets are byte-for-byte unchanged (no regression).
Wired through both MoA execution paths (MoAChatCompletions.create and
aggregate_moa_context).

E2E: same task, closed preset uncapped vs reference_max_tokens=600 -> 59s to 33s
(~44% faster), final answer identical/correct.

- hermes_cli/moa_config.py: _coerce_int_or_none helper + reference_max_tokens
  in _normalize_preset/_default_preset/flattened view
- agent/moa_loop.py: read preset.reference_max_tokens, pass to reference fan-out
- agent/conversation_loop.py: pass reference_max_tokens on the per-turn path
- tests + docs
2026-07-02 00:16:35 -07:00
..
docs feat(moa): add reference_max_tokens to cap advisor output and cut turn latency (#56756) 2026-07-02 00:16:35 -07:00
i18n/zh-Hans/docusaurus-plugin-content-docs/current feat(delegate): remove model-facing toolsets arg — subagents always inherit parent's (#56386) 2026-07-01 05:35:26 -07:00
scripts refactor(cron): rebrand Cron Recipes -> Automation Blueprints 2026-06-11 10:49:47 -07:00
src refactor(cron): rebrand Cron Recipes -> Automation Blueprints 2026-06-11 10:49:47 -07:00
static feat(models): add claude-fable-5, claude-sonnet-5, fugu-ultra to curated OpenRouter + Nous lists (#56617) 2026-07-01 13:21:42 -07:00
.gitignore feat(skills-hub): health checks, freshness badge, and a watchdog cron (#32345) 2026-05-25 23:10:45 -07:00
docusaurus.config.ts docs: point desktop download links to site root (deprecate /desktop) (#46795) 2026-06-15 15:02:24 -04:00
package-lock.json docs(website): redirect old automation-templates URL to automation-blueprints 2026-06-12 09:46:27 -07:00
package.json docs(website): redirect old automation-templates URL to automation-blueprints 2026-06-12 09:46:27 -07:00
README.md
sidebars.ts feat(vertex): add Google Vertex AI provider for Gemini (OAuth2) 2026-07-01 05:25:33 -07:00
tsconfig.json change(tooling): typecheck in CI, update ts to 6 2026-06-10 11:59:34 -04:00

Website

This website is built using Docusaurus, a modern static website generator.

Installation

yarn

Local Development

yarn start

This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

Build

yarn build

This command generates static content into the build directory and can be served using any static contents hosting service.

Deployment

Using SSH:

USE_SSH=true yarn deploy

Not using SSH:

GIT_USER=<Your GitHub username> yarn deploy

If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the gh-pages branch.

Diagram Linting

CI runs ascii-guard to lint docs for ASCII box diagrams. Use Mermaid (````mermaid`) or plain lists/tables instead of ASCII boxes to avoid CI failures.