Brooklyn Nicholson 5a6720b884 fix(desktop,tui-gateway,zai): stop thinking-off from reverting to medium	2026-07-02 15:23:47 -05:00

A Z.ai desktop user reported thinking reverting to medium after one turn, burning ~200% of a week's credits in 4 days despite reasoning_effort: false in config.yaml.

Four compounding bugs:

- _session_info reported reasoning_effort "" for disabled reasoning, indistinguishable from unset; the desktop adopted it after the first turn, wiping its sticky "thinking off" pick so every later chat reverted to the default effort.
- config.set key=reasoning always wrote agent.reasoning_effort to global config.yaml, so every desktop model-menu selection (preset.effort ?? 'medium') clobbered the user's configured value. Now session-scoped like the messaging gateway's /reasoning, landing on create_reasoning_override so lazily-built sessions keep it too.
- YAML `reasoning_effort: false`/`off`/`no` (boolean False) was coerced to "" by every loader's `str(x or "")`, silently re-enabling thinking. parse_reasoning_effort now treats False/"false"/"disabled" as {"enabled": False}; loaders (tui gateway, gateway, cli, cron, delegate) pass the raw value through. The desktop config reader also crashed on the boolean (false.trim()), aborting voice/STT settings.
- The zai provider profile never sent thinking on the wire, and GLM-4.5+ defaults to thinking ON server-side, so disabling reasoning was a silent no-op on direct Z.ai, the actual token burner. The profile now emits extra_body.thinking {"type": "enabled"|"disabled"} for thinking-capable GLM models, mirroring the DeepSeek profile.

Also: /new (session reset) now carries reasoning_config across the rebuild like model_override; config.get reasoning prefers the session's live value and maps a config False to "none"; Settings shows "Off" instead of a blank select for hand-written false.
alibaba
alibaba-coding-plan	chore(model-picker): refresh provider picker descriptions	2026-05-31 15:02:26 -07:00
anthropic	fix(models): pass model.base_url to fetch_models in /model picker	2026-06-16 13:09:40 -07:00
arcee
azure-foundry	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
bedrock	fix(models): pass model.base_url to fetch_models in /model picker	2026-06-16 13:09:40 -07:00
copilot
copilot-acp	fix(models): pass model.base_url to fetch_models in /model picker	2026-06-16 13:09:40 -07:00
custom	fix(models): pass model.base_url to fetch_models in /model picker	2026-06-16 13:09:40 -07:00
deepseek	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)	2026-05-17 02:29:41 -07:00
gemini	feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492)	2026-06-21 19:53:27 -07:00
gmi	refactor(gmi): move User-Agent to profile.default_headers	2026-05-08 03:22:11 -07:00
huggingface
kilocode
kimi-coding	fix(kimi): send thinking xor reasoning_effort, never both	2026-06-07 01:24:29 -07:00
minimax	fix: route minimax m3 reasoning controls through profile	2026-06-15 07:08:43 -07:00
nous	feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779)	2026-05-12 20:49:20 -07:00
novita	docs: update NovitaAI provider positioning (#25532)	2026-05-14 01:31:12 -07:00
nvidia
ollama-cloud	feat: add reasoning_effort support to ollama-cloud provider	2026-06-23 11:51:43 -07:00
openai-codex
opencode-zen	fix(opencode-go): gate thinking when reasoning_effort set to avoid HTTP 400	2026-06-07 01:24:29 -07:00
openrouter	fix(models): pass model.base_url to fetch_models in /model picker	2026-06-16 13:09:40 -07:00
qwen-oauth
stepfun
vertex	feat(vertex): add Google Vertex AI provider for Gemini (OAuth2)	2026-07-01 05:25:33 -07:00
xai
xiaomi	fix(vision): proactive downgrade for providers rejecting list-type tool content (#41072)	2026-06-07 21:50:57 -07:00
zai	fix(desktop,tui-gateway,zai): stop thinking-off from reverting to medium	2026-07-02 15:23:47 -05:00
README.md

Model Provider Plugins

Each subdirectory is a self-contained provider profile plugin. The directory layout mirrors plugins/platforms/:

plugins/model-providers/
├── openrouter/
│   ├── __init__.py      # registers the ProviderProfile
│   └── plugin.yaml      # manifest: name, kind, version, description
├── anthropic/
│   ├── __init__.py
│   └── plugin.yaml
└── ...

How discovery works

providers/__init__.py._discover_providers() scans this directory (and $HERMES_HOME/plugins/model-providers/) the first time anything calls get_provider_profile() or list_providers(). Each __init__.py is imported and expected to call providers.register_provider(profile).
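The scan described above can be sketched roughly as follows. This is a minimal illustration, not the actual body of _discover_providers(): the function name discover_plugins, the module-naming scheme, and the return value are all assumptions made for the example. It only shows the general technique of importing each plugin's __init__.py so that the plugin can register itself as a side effect.

```python
import importlib.util
from pathlib import Path


def discover_plugins(roots):
    """Import every <root>/<name>/__init__.py, visiting roots in order.

    Hypothetical helper: returns the plugin directory names it imported.
    Roots listed later are imported later, so a registry that keeps the
    most recent registration lets them override earlier ones.
    """
    loaded = []
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue  # a missing user plugin directory is not an error
        for plugin_dir in sorted(p for p in root.iterdir() if p.is_dir()):
            init_file = plugin_dir / "__init__.py"
            if not init_file.is_file():
                continue  # not a plugin directory
            # Unique module name so two plugins named alike do not collide.
            safe_name = plugin_dir.name.replace("-", "_")
            module_name = f"_provider_plugin_{len(loaded)}_{safe_name}"
            spec = importlib.util.spec_from_file_location(module_name, init_file)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)  # the plugin registers itself here
            loaded.append(plugin_dir.name)
    return loaded
```

Under these assumptions, a caller would pass the bundled directory first and the $HERMES_HOME directory second, and would guard the call so it runs only once, on the first registry lookup.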

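User plugins at $HERMES_HOME/plugins/model-providers/<name>/ override bundled plugins of the same name: register_provider() is last-writer-wins, so the later registration replaces the earlier one. To replace a built-in, drop a plugin directory with the same layout (an __init__.py plus a plugin.yaml) there.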

Adding a new provider

Create plugins/model-providers/<your_provider>/__init__.py:

from providers import register_provider
from providers.base import ProviderProfile

my_provider = ProviderProfile(
    name="your-provider",
    aliases=("alias1", "alias2"),
    display_name="Your Provider",
    description="One-line description shown in the setup picker",
    signup_url="https://your-provider.example.com/keys",
    env_vars=("YOUR_PROVIDER_API_KEY", "YOUR_PROVIDER_BASE_URL"),
    base_url="https://api.your-provider.example.com/v1",
    default_aux_model="your-cheap-model",
)

register_provider(my_provider)

Create plugins/model-providers/<your_provider>/plugin.yaml:

name: your-provider-profile
kind: model-provider
version: 1.0.0
description: Short sentence about the provider
author: Your Name

Nothing else needs to change. auth.py, config.py, models.py, doctor.py, model_metadata.py, runtime_provider.py, and the chat_completions transport all auto-wire from the registry.

Non-trivial profiles

Override the ProviderProfile hooks in a subclass for per-provider quirks — see plugins/model-providers/openrouter/__init__.py for build_extra_body and build_api_kwargs_extras examples, and plugins/model-providers/gemini/__init__.py for thinking_config translation.
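As a rough illustration of such a hook override, the sketch below sends an explicit thinking flag in extra_body, the kind of quirk the zai and DeepSeek profiles are described as handling. Everything here is an assumption made for the example: the base class is a stand-in rather than the real providers.base.ProviderProfile, and the build_extra_body signature and the shape of the reasoning argument are guesses. Check the openrouter plugin for the actual hook signatures before copying this.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ProviderProfile:
    """Stand-in base class; the real one lives in providers.base."""
    name: str
    base_url: str = ""

    def build_extra_body(self, model: str, reasoning: Optional[dict]) -> dict[str, Any]:
        # Hypothetical hook signature. By default nothing is added to the request.
        return {}


class ThinkingToggleProfile(ProviderProfile):
    """Sends an explicit thinking flag so "off" is never left to a server default."""

    def build_extra_body(self, model: str, reasoning: Optional[dict]) -> dict[str, Any]:
        if reasoning is None:
            return {}  # unset: leave the decision to the provider
        enabled = reasoning.get("enabled", True)
        return {"thinking": {"type": "enabled" if enabled else "disabled"}}


profile = ThinkingToggleProfile(name="your-provider")
print(profile.build_extra_body("your-model", {"enabled": False}))
# {'thinking': {'type': 'disabled'}}
```

Keeping "unset" (None) separate from "disabled" matters here: collapsing the two is what lets a server-side default silently turn thinking back on.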