Hermes Agent Debugging Guide
This guide helps diagnose why Hermes Agent may not be running after Terraform deployment.
Quick Diagnostic Checklist
1. Service Status
# Check systemd service status
systemctl status hermes.service
# View service logs
journalctl -u hermes.service -f
# Check if container exists
docker ps -a | grep hermes
# View container logs
docker logs hermes
2. Docker Health
# Verify Docker is running
systemctl status docker
# List containers
docker ps -a
# Check Docker events (watch real-time)
docker events
# Check docker socket permissions
ls -la /var/run/docker.sock
3. Directory and File Permissions
# Check .hermes directory
ls -la ~/.hermes/
ls -la ~/.hermes/.env
ls -la ~/docker-compose.yml
# Check file contents
cat ~/.hermes/.env
cat ~/.hermes/config.yaml
cat ~/docker-compose.yml
Common Issues and Fixes
Issue 1: "Hermes container not running"
Symptoms:
- docker ps shows no hermes container
- .hermes folder exists but docker container won't start
Diagnosis:
# Check service status
systemctl status hermes.service
# Check recent logs
journalctl -u hermes.service -n 50
# Check docker logs more verbosely
docker logs hermes 2>&1 | tail -50
Root Causes:
- Docker image not pulled properly → Pull manually:
  docker pull nousresearch/hermes-agent:latest
- Missing .env file → Check if it exists and has content:
  ls -la ~/.hermes/.env
  cat ~/.hermes/.env
- Directory permission issues → Fix permissions:
  sudo chown -R $(whoami):$(whoami) ~/.hermes
  chmod 755 ~/.hermes
  chmod 600 ~/.hermes/.env
- Docker compose file not found → Verify location:
  ls -la ~/docker-compose.yml
  cat ~/docker-compose.yml
- Port 18789 already in use → Check:
  lsof -i :18789
  If occupied, either:
  - Kill the process using it
  - Change the port in docker-compose.yml
Issue 2: "Container starts but immediately exits"
Symptoms:
- docker ps is empty but docker ps -a shows the container with "Exited" status
- Container stops within seconds of starting
Diagnosis:
# View the exit code
docker ps -a | grep hermes
# Get more detailed error logs
docker logs hermes
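The exit code shown in the STATUS column (for example "Exited (137)") often narrows down the cause before you read the logs. The following is a minimal sketch, assuming the container is named hermes as in the commands above. explain_exit_code is a hypothetical helper, and the meanings it prints are general Docker and shell conventions, not anything specific to Hermes:

```shell
# Hypothetical helper: translate a container exit code into a likely cause.
# The mapping follows common Docker/shell conventions and is only a hint.
explain_exit_code() {
  case "$1" in
    0)   echo "exited cleanly; the main process finished on its own" ;;
    1)   echo "application error; check docker logs for a config or startup error" ;;
    125) echo "docker itself failed to run the container" ;;
    126) echo "container command could not be invoked (not executable)" ;;
    127) echo "container command not found" ;;
    137) echo "killed by SIGKILL; often the out-of-memory killer or docker kill" ;;
    139) echo "segmentation fault (SIGSEGV)" ;;
    143) echo "terminated by SIGTERM; usually a normal stop request" ;;
    *)   echo "unrecognised exit code; check docker logs" ;;
  esac
}

# Usage (needs Docker and an existing container named hermes):
#   explain_exit_code "$(docker inspect --format '{{.State.ExitCode}}' hermes)"
```

Exit code 1 is the most common result for the configuration problems listed under Common Fixes, so the container logs remain the primary source of detail.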
Common Fixes:
- Invalid YAML in config.yaml → Validate syntax (the path is passed as an argument so the shell expands ~; Python's open() does not):
  python3 -c "import sys, yaml; yaml.safe_load(open(sys.argv[1]))" ~/.hermes/config.yaml
- Missing API keys → Check:
  grep -E "OPENROUTER|DISCORD_BOT|BRAVE" ~/.hermes/.env
- Invalid gateway token → Verify:
  echo $HERMES_GATEWAY_TOKEN
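The grep for API keys prints the secret values to the terminal. As a hedged alternative, the sketch below reports only whether each key is present and non-empty. check_env_keys is a hypothetical helper, and it assumes the .env file uses plain NAME=value lines; the patterns passed to it are the same ones used in the grep above:

```shell
# Hypothetical helper: check_env_keys FILE PATTERN...
# For each pattern, look for a line NAME=value where NAME contains the
# pattern and the value is non-empty. Values are never printed.
# Returns 0 if every pattern matched, 1 otherwise.
check_env_keys() {
  env_file=$1
  shift
  missing=0
  for pattern in "$@"; do
    if grep -Eq "^[A-Za-z0-9_]*${pattern}[A-Za-z0-9_]*=.+" "$env_file"; then
      echo "ok: $pattern"
    else
      echo "missing or empty: $pattern"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_env_keys ~/.hermes/.env OPENROUTER DISCORD_BOT BRAVE
```

A quoted empty value such as NAME="" still counts as non-empty in this sketch, so it catches missing keys rather than every form of blank value.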
Issue 3: "Docker daemon won't start"
Symptoms:
- systemctl status docker shows failed/inactive
- docker ps returns "Cannot connect to Docker daemon"
Fixes:
# Start Docker
sudo systemctl start docker
# Enable on boot
sudo systemctl enable docker
# Check Docker health
docker ps
Issue 4: "Discord bot shows offline"
Symptoms:
- Hermes is running (docker ps shows container)
- But Discord bot doesn't show "online" status in your server
Diagnosis:
# Check if Discord configuration is loaded
grep -i discord ~/.hermes/.env
grep -i discord ~/.hermes/config.yaml
# View container logs for Discord errors
docker logs hermes | grep -i discord
Root Causes:
- Invalid bot token → Verify in .env:
  grep DISCORD_BOT_TOKEN ~/.hermes/.env
- Wrong server ID → Check config:
  grep -A 5 "discord_server_id" ~/.hermes/config.yaml
- User IDs not in server → Verify in allowlist:
  grep -A 10 "users:" ~/.hermes/config.yaml
- Gateway not running → Check port:
  lsof -i :18789
- Bot not in server → Manual fix:
  - Go to Discord Developer Portal
  - Select your bot
  - Copy OAuth2 URL with scopes: bot,applications.commands
  - Open the URL to invite the bot to your server
Issue 5: "Container gets killed after startup"
Symptoms:
- Service shows active but container keeps restarting
- docker logs shows memory or resource errors
Fixes:
# Check Docker stats
docker stats hermes
# Check docker-compose.yml resource limits
grep -A 5 "deploy:" ~/docker-compose.yml
# Increase memory limit if needed
# Edit ~/docker-compose.yml and increase memory value
nano ~/docker-compose.yml
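Before raising the limit, it can help to confirm that memory was the actual reason the container stopped. Docker records this in the container state. The sketch below is one way to read it, assuming the container is named hermes; was_oom_killed is a hypothetical helper name:

```shell
# Hypothetical helper: was_oom_killed [NAME]
# Reads the OOMKilled flag and exit code that Docker stores for a container.
# Returns 0 if the container was OOM-killed, 1 if not, 2 if inspect failed.
was_oom_killed() {
  name=${1:-hermes}
  state=$(docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' "$name") || return 2
  oom=${state%% *}    # first word: true or false
  code=${state##* }   # second word: numeric exit code
  if [ "$oom" = "true" ]; then
    echo "$name was OOM-killed (exit code $code); raise the memory limit"
    return 0
  fi
  echo "$name was not OOM-killed (exit code $code)"
  return 1
}

# Usage:
#   was_oom_killed hermes
```

If the flag is false but the container still restarts repeatedly, the cause is more likely one of the configuration problems described under Issue 2.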
Verification Steps
Once you believe Hermes is running, verify with:
# Health check script (if it exists)
bash /usr/local/bin/hermes-health-check.sh
# Manual health checks
echo "1. Service status:"
systemctl is-active hermes.service
echo "2. Container running:"
docker ps | grep hermes
echo "3. Port listening:"
netstat -tlnp | grep 18789
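If the health check script is not present, the three manual checks can be wrapped so that each prints a clear PASS or FAIL. This is a rough sketch rather than a replacement for the shipped script; run_check and hermes_verify are hypothetical names, and the container name hermes and port 18789 are taken from the commands in this guide:

```shell
failures=0

# run_check LABEL COMMAND [ARGS...]
# Runs the command quietly and prints PASS or FAIL for the label.
run_check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
    failures=$((failures + 1))
  fi
}

# Hypothetical wrapper around the three manual checks above.
# Returns 0 only if every check passed.
hermes_verify() {
  failures=0
  run_check "service active"    systemctl is-active --quiet hermes.service
  run_check "container running" sh -c 'docker ps --format "{{.Names}}" | grep -qx hermes'
  run_check "port 18789 open"   sh -c 'netstat -tln | grep -q ":18789 "'
  [ "$failures" -eq 0 ]
}

# Usage:
#   hermes_verify || echo "one or more checks failed"
```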
Manual Start/Stop
If the systemd service isn't working:
# Manual start
cd ~/
docker compose -f docker-compose.yml up -d
# Manual stop
cd ~/
docker compose -f docker-compose.yml down
# Manual logs
cd ~/
docker compose -f docker-compose.yml logs -f
Rebuilding from Scratch
If nothing else works:
# Stop everything
sudo systemctl stop hermes.service
docker compose -f ~/docker-compose.yml down
# Remove container and image
docker rm hermes 2>/dev/null || true
docker rmi nousresearch/hermes-agent:latest 2>/dev/null || true
# Pull fresh image
docker pull nousresearch/hermes-agent:latest
# Start service again
sudo systemctl start hermes.service
# Monitor startup
journalctl -u hermes.service -f
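Instead of watching the journal by hand, a rebuild script can poll until the container reports that it is running. The following is a minimal sketch under the assumption that the container is named hermes; wait_for_container is a hypothetical helper, and the timeout values are arbitrary:

```shell
# Hypothetical helper: wait_for_container NAME TIMEOUT_SECONDS [INTERVAL_SECONDS]
# Polls docker inspect until the container reports Running=true.
# Returns 0 once it is running, 1 if the timeout passes first.
# INTERVAL_SECONDS should be at least 1 so the loop can reach the timeout.
wait_for_container() {
  name=$1
  timeout=$2
  interval=${3:-2}
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    running=$(docker inspect --format '{{.State.Running}}' "$name" 2>/dev/null || echo "false")
    if [ "$running" = "true" ]; then
      echo "$name is running after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "$name did not start within ${timeout}s" >&2
  return 1
}

# Usage:
#   wait_for_container hermes 60 || docker logs --tail=50 hermes
```

This only shows that the container was running at the moment of the check. A container that starts and then exits a few seconds later (Issue 2) can still pass, so a second check after a short delay is worthwhile.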
Debug Mode
For more verbose logging:
# Watch service logs, showing all fields in full (timestamps are included by default)
journalctl -u hermes.service -f --all
# Watch docker logs continuously
docker logs -f --tail=50 hermes
# Run docker compose in the foreground (stop hermes.service first so the two do not conflict)
cd ~/
docker compose -f docker-compose.yml up
Testing Discord Connectivity
Once Hermes is running:
# Send a test message to your Discord bot
# The bot should respond in the channel or via DM
# Check if bot is responding to mentions by posting this in a Discord channel (not in the shell):
#   @hermes help
# Or check logs for Discord activity
docker logs hermes | tail -100
Terraform Logs
Check cloud-init logs on the server for deployment issues:
# View cloud-init output
sudo cloud-init status
sudo cat /var/log/cloud-init-output.log
# Check for specific errors
grep -i error /var/log/cloud-init-output.log
grep -i docker /var/log/cloud-init.log
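Cloud-init logs can be long, so a short summary is often easier to scan than raw grep output. The sketch below counts the matching lines and prints only the most recent ones with their line numbers. summarize_log_errors is a hypothetical helper; reading the cloud-init logs may require sudo on some systems:

```shell
# Hypothetical helper: summarize_log_errors FILE [COUNT]
# Prints how many lines contain "error" (case-insensitive), then the
# last COUNT of those lines prefixed with their line numbers.
summarize_log_errors() {
  log_file=$1
  count=${2:-5}
  total=$(grep -ci 'error' "$log_file" || true)
  echo "$total matching line(s) in $log_file"
  grep -in 'error' "$log_file" | tail -n "$count" || true
}

# Usage:
#   summarize_log_errors /var/log/cloud-init-output.log 10
```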
Getting Help
If stuck, provide:
- Output of systemctl status hermes.service
- Output of docker ps -a
- Last 50 lines of docker logs hermes
- Contents of ~/.hermes/.env (redact secrets)
- Contents of ~/.hermes/config.yaml
- Output of cloud-init status
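Redacting secrets by hand is easy to get wrong. One possible approach is to blank out every value in the .env file before sharing it, keeping only the variable names. The sketch below assumes plain NAME=value lines, optionally prefixed with export; redact_env is a hypothetical helper:

```shell
# Hypothetical helper: redact_env FILE
# Prints the file with every NAME=value line rewritten as NAME=<redacted>.
# Comments and blank lines pass through unchanged; the file is not modified.
redact_env() {
  sed -E 's/^([[:space:]]*(export[[:space:]]+)?[A-Za-z_][A-Za-z0-9_]*)=.*/\1=<redacted>/' "$1"
}

# Usage:
#   redact_env ~/.hermes/.env
```

config.yaml may also contain tokens or IDs, and this sketch does not cover it, so review that file manually before sharing.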