Harden hosted Docker install tree against self-modification (#47490)

* Harden hosted Docker install tree

* Document hosted Docker immutable install tree
This commit is contained in:
shannonsands 2026-06-18 09:09:21 +10:00 committed by GitHub
parent f8098c6b6f
commit 6092be413d
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
10 changed files with 240 additions and 289 deletions

View file

@ -28,14 +28,13 @@ as_hermes() { [ "$(id -u)" = 0 ] || { "$@"; return; }; s6-setuidgid hermes "$@";
# arbitrary host UID (the classic `--user $(id -u):$(id -g)` invocation people
# used in the tini era to make container-written files match their host user).
#
# Under s6-overlay this no longer works: the bootstrap (UID remap, volume +
# build-tree chown, config seeding) all require root, and they're skipped when
# the container starts non-root. The baked image trees (/opt/data, /opt/hermes/
# .venv, ui-tui, node_modules) stay owned by the hermes build UID (10000), so an
# arbitrary `--user` UID can't write them — the runtime then fails with EACCES
# on a bind mount, or hard-crashes on a named volume (Docker initialises the
# volume from the image as UID 10000, and the non-root start can't even `cd`
# into $HERMES_HOME). See #34837 for the supervision-tree side of this.
# Under s6-overlay this no longer works: the bootstrap (UID remap, data-volume
# ownership, config seeding) requires root, and it is skipped when the container
# starts non-root. The baked install tree under /opt/hermes is intentionally
# root-owned and non-writable; mutable runtime state must live under
# $HERMES_HOME. An arbitrary `--user` UID therefore cannot repair or populate
# the data volume, and startup fails with EACCES. See #34837 for the
# supervision-tree side of this.
#
# The supported way to match host-side ownership is to start as root (the image
# default) and pass HERMES_UID/HERMES_GID — or the PUID/PGID aliases — which the
@ -53,9 +52,10 @@ if [ "$cur_uid" != 0 ] && [ "$cur_uid" != "$(id -u hermes)" ]; then
[stage2] ERROR: container started with --user $cur_uid (an arbitrary, non-hermes UID).
This is not supported under the s6-overlay image. The container bootstrap
(UID remap, volume ownership, dependency installs) needs to start as root,
and the baked image directories are owned by the hermes user (UID $(id -u hermes)),
so a pinned --user UID cannot write them — startup will fail.
(UID remap, data-volume ownership, config seeding) needs to start as root,
and the baked /opt/hermes install tree is intentionally root-owned and
non-writable, so a pinned --user UID cannot repair startup state — startup
will fail.
To make container-written files match your HOST user, DON'T use --user.
Start the container as root (the default) and pass your host UID/GID instead:
@ -207,49 +207,13 @@ if [ "$needs_chown" = true ]; then
done
fi
# --- Fix ownership of build trees under $INSTALL_DIR ---
# Hermes-owned trees under $INSTALL_DIR must be re-chowned whenever the
# runtime hermes UID no longer owns them — otherwise:
# - .venv: lazy_deps.py cannot install platform packages (discord.py,
# telegram, slack, etc.) with EACCES (#15012, #21100)
# - ui-tui: esbuild rebuilds dist/entry.js on every TUI launch (when
# the source mtime is newer than dist/ or when HERMES_TUI_FORCE_BUILD
# is set) and writes to ui-tui/dist/. Without this chown the new
# hermes UID can't write the build output (#28851).
# - gateway: Python writes __pycache__ and runtime artifacts beneath the
# gateway package on first import. After a UID remap those source-owned
# paths still belong to the build-time UID (10000) unless repaired here,
# producing EACCES for the supervised gateway (#27221).
# - node_modules: root-level dependencies (puppeteer, web tooling)
# that runtime code may walk/update.
# The set mirrors the build-time `chown -R hermes:hermes` line in the
# Dockerfile — keep them in sync if the Dockerfile chown set changes.
# These are under $INSTALL_DIR (not $HERMES_HOME), so the bind-mount
# concern doesn't apply — recursive is fine.
#
# This MUST be gated independently of the $HERMES_HOME ownership check
# above. `usermod -u <new> hermes` re-chowns the hermes home dir
# ($HERMES_HOME == /opt/data) to the new UID as a side effect, so after a
# HERMES_UID/PUID remap `stat $HERMES_HOME` always already matches the new
# UID and `needs_chown` is false — but the build trees under /opt/hermes
# are NOT touched by usermod and remain owned by the build-time UID
# (10000). Gating them on $HERMES_HOME ownership (as #35027 did) silently
# skipped this chown on the common PUID/NAS path, regressing lazy installs
# and TUI rebuilds. Probe the build trees directly instead: chown only
# when the venv is not already owned by the runtime hermes UID. Idempotent
# and skips the expensive recursive chown on every restart once ownership
# is settled.
venv_owner=$(stat -c %u "$INSTALL_DIR/.venv" 2>/dev/null || echo "")
if [ -n "$venv_owner" ] && [ "$venv_owner" != "$actual_hermes_uid" ]; then
echo "[stage2] Fixing ownership of build trees under $INSTALL_DIR to hermes ($actual_hermes_uid)"
chown -R hermes:hermes \
"$INSTALL_DIR/.venv" \
"$INSTALL_DIR/ui-tui" \
"$INSTALL_DIR/gateway" \
"$INSTALL_DIR/node_modules" \
2>/dev/null || \
echo "[stage2] Warning: chown of build trees failed (rootless container?) — continuing"
fi
# --- Immutable install tree ---
# Do not chown runtime code or dependency trees under $INSTALL_DIR back to the
# hermes user. Hosted/container instances keep mutable state under
# $HERMES_HOME (/opt/data) and run with PYTHONDONTWRITEBYTECODE plus
# HERMES_DISABLE_LAZY_INSTALLS=1. Keeping /opt/hermes root-owned and
# non-writable prevents an agent session from self-modifying the installed
# source, venv, TUI bundle, or node_modules and bricking the gateway.
# Always reset ownership of $HERMES_HOME/profiles to hermes on every
# boot. Profile dirs and files can land owned by root when commands