Initial commit - Clean public release

Sanitized for public release:
- Removed all API keys, tokens, and secrets
- Removed personal Discord IDs from hermes-openclaw.json
- Updated git URLs to be generic placeholders
- All sensitive data uses environment variable interpolation
CeeLo Greenheart 2026-04-22 19:13:28 +00:00
commit a593af9b27
34 changed files with 5646 additions and 0 deletions

185
docs/DIGITALOCEAN_SETUP.md Normal file

@ -0,0 +1,185 @@
# DigitalOcean Setup
Detailed guide for deploying OpenBoatmobile to DigitalOcean.
## When to Use DigitalOcean
| Factor | Hetzner | DigitalOcean |
|--------|---------|--------------|
| Price | €4.49/mo (cx23) | $24/mo (s-2vcpu-4gb) |
| US West Coast | No | Yes (SFO2, SFO3) |
| Documentation | Good | Excellent |
| One-click apps | Limited | Extensive |
| Support | Ticket | Ticket + Premium |
Use DigitalOcean if:
- You're on the US West Coast (SFO has better latency than Ashburn)
- You already have DO credits/promo codes
- You prefer DO's documentation and ecosystem
## Create DigitalOcean Account
1. Go to [DigitalOcean](https://www.digitalocean.com/)
2. Sign up
3. Add a payment method ($5 minimum)
## Create API Token
1. Go to [DO API Settings](https://cloud.digitalocean.com/account/api/tokens)
2. Click **Generate New Token**
3. Name it (e.g., "openclaw-terraform")
4. Permissions: **Read & Write**
5. Copy the token immediately (shown only once)
## Add SSH Key
1. Go to [DO Security Settings](https://cloud.digitalocean.com/account/security)
2. Click **Add SSH Key**
3. Paste your public key contents:
```bash
cat ~/.ssh/id_ed25519.pub
```
4. Give it a name
5. Click **Add SSH Key**
### Get the Fingerprint
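Terraform needs the fingerprint, not the name. DigitalOcean identifies keys by their MD5 fingerprint (colon-separated hex), so ask `ssh-keygen` for that format instead of its default SHA256 output:
```bash
ssh-keygen -E md5 -lf ~/.ssh/id_ed25519.pub
# Output: 256 MD5:aa:bb:cc:... your@email.com (ED25519)
```
The fingerprint is the part after `MD5:` and before the email.
```bash
TF_VAR_ssh_key_fingerprints='["aa:bb:cc:..."]'
```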
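The extraction step can also be scripted. The following is a minimal sketch, assuming `ssh-keygen -lf` prints one line of the form `<bits> <ALGO>:<fingerprint> <comment> (<type>)`; `fingerprint_from_line` is a hypothetical helper name and is not part of OpenBoatmobile.

```bash
# Hypothetical helper: extract the bare fingerprint from one line of
# `ssh-keygen -lf` output by taking the second field and dropping the
# hash-algorithm prefix (everything up to and including the first colon).
fingerprint_from_line() {
    printf '%s\n' "$1" | awk '{ sub(/^[^:]*:/, "", $2); print $2 }'
}

# Usage (assumes the key path used earlier in this guide):
#   line=$(ssh-keygen -lf ~/.ssh/id_ed25519.pub)
#   export TF_VAR_ssh_key_fingerprints="[\"$(fingerprint_from_line "$line")\"]"
```

Taking the second whitespace-separated field keeps the helper independent of the comment at the end of the line, which may itself contain spaces.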
## Choose a Region
| Code | Location | Notes |
|------|----------|-------|
| `nyc1` | New York | US East |
| `nyc3` | New York | US East (recommended) |
| `sfo2` | San Francisco | US West |
| `sfo3` | San Francisco | US West |
| `ams3` | Amsterdam | Europe |
| `lon1` | London | Europe |
| `sgp1` | Singapore | Asia |
## Configure OpenBoatmobile
### Minimal Configuration
In `terraform.tfvars`:
```hcl
cloud_provider = "digitalocean"
server_name = "my-agent"
droplet_size_digitalocean = "s-2vcpu-4gb"
region_digitalocean = "nyc3"
# These come from environment:
# TF_VAR_do_token
# TF_VAR_venice_api_key
# TF_VAR_ssh_key_fingerprints
```
### Droplet Sizes
| Size | vCPU | RAM | Disk | Price |
|------|------|-----|------|-------|
| s-1vcpu-2gb | 1 | 2 GB | 50 GB | $12/mo |
| **s-2vcpu-4gb** | 2 | 4 GB | 80 GB | **$24/mo** (recommended) |
| s-2vcpu-8gb | 2 | 8 GB | 160 GB | $48/mo |
| s-4vcpu-8gb | 4 | 8 GB | 160 GB | $64/mo |
The s-2vcpu-4gb is the sweet spot for OpenClaw.
## Deploy
```bash
# Load secrets
source .env
# Initialize (first time only)
terraform init
# Preview changes
terraform plan
# Deploy
terraform apply
```
## Post-Deployment
Terraform outputs:
```
server_ip = "123.45.67.89"
ssh_command = "ssh openclaw@123.45.67.89" # or "ssh hermes@123.45.67.89" for Hermes
```
### Connect
```bash
# Username is 'openclaw' or 'hermes' depending on framework
ssh <USERNAME>@123.45.67.89
```
### Run OpenClaw Onboarding
```bash
openclaw onboard --install-daemon
```
## Firewall Rules
OpenBoatmobile creates a DigitalOcean firewall with:
| Direction | Port | Source |
|-----------|------|--------|
| Inbound | 22 (SSH) | Configured IPs |
| Outbound | All | Any |
To restrict SSH to your IP:
```bash
TF_VAR_ssh_allowed_ips='["your.public.ip/32"]'
```
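To avoid hand-writing the JSON list, the value can be built from one or more addresses. This is a minimal sketch, assuming bare IPv4 addresses without a prefix length; `ssh_allowed_ips_json` is a hypothetical helper, not something OpenBoatmobile ships.

```bash
# Hypothetical helper: turn bare IPv4 addresses into the JSON list of /32
# CIDR blocks that TF_VAR_ssh_allowed_ips expects.
# The body runs in a subshell so its variables do not leak into the caller.
ssh_allowed_ips_json() (
    out=""
    for ip in "$@"; do
        # Prefix every entry except the first with ", ".
        out="${out:+$out, }\"$ip/32\""
    done
    printf '[%s]\n' "$out"
)

# Usage:
#   export TF_VAR_ssh_allowed_ips="$(ssh_allowed_ips_json 203.0.113.7)"
```

The helper does no validation; it only formats whatever addresses it is given.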
## Cleanup
```bash
terraform destroy
```
## Troubleshooting
### "SSH Key fingerprint not found"
- Use the fingerprint, not the name
- The fingerprint is shown in DO Console under Security
- Make sure there are no extra spaces
### "API Token invalid"
- Regenerate the token
- Copy immediately (shown only once)
- Check for trailing spaces in `.env`
### Droplet created but can't SSH
- Wait 2-3 minutes for cloud-init
- Verify your key fingerprint is correct
- Check firewall allows your IP
### "Rate limit exceeded"
- DO has API rate limits
- Wait a few minutes and retry
- Use `terraform plan` sparingly before `apply`

197
docs/DISCORD_SETUP.md Normal file

@ -0,0 +1,197 @@
# Discord Setup
OpenBoatmobile can configure Discord integration during deployment.
## Why Discord Integration?
| Channel | Pros | Cons |
|---------|------|------|
| Discord | Real-time, familiar interface, mobile push | Requires bot setup |
| Control UI | Full featured, direct | No push notifications |
| CLI | Scriptable | No mobile access |
**Recommended:** Discord for mobile notifications and quick interactions.
## Prerequisites
- A Discord account
- A Discord server where you can add bots
- Permission to create bots in that server
## Step 1: Create Discord Application
1. Go to [Discord Developer Portal](https://discord.com/developers/applications)
2. Click **New Application**
3. Name it (e.g., "OpenClaw Agent")
4. Click **Create**
## Step 2: Create Bot User
1. In your application, go to **Bot** in the left sidebar
2. Click **Add Bot**
3. Confirm the popup
4. **Copy the token** immediately (click "Reset Token" if needed)
5. Save this token — you'll need it for `.env`
### Bot Permissions
Under **Privileged Gateway Intents**, enable:
- **Message Content Intent** (required to read messages)
- **Server Members Intent** (optional, for user info)
## Step 3: Invite Bot to Server
1. Go to **OAuth2** → **URL Generator** in the left sidebar
2. Under **Scopes**, check:
   - `bot`
   - `applications.commands`
3. Under **Bot Permissions**, check:
   - `Send Messages`
   - `Read Messages/View Channels`
   - `Read Message History`
   - `Mention Everyone` (optional)
   - `Use Slash Commands`
4. Copy the generated URL at the bottom
5. Open the URL in your browser
6. Select your server and authorize
## Step 4: Get Server and User IDs
### Server ID
1. In Discord, go to **User Settings** (gear icon)
2. Go to **Advanced** → Enable **Developer Mode**
3. Right-click your server name
4. Click **Copy Server ID**
### User ID
1. Right-click your username in Discord
2. Click **Copy User ID**
## Step 5: Configure OpenBoatmobile
In `.env`:
```bash
TF_VAR_discord_bot_token=your-bot-token-here
TF_VAR_discord_server_id=123456789012345678
TF_VAR_discord_user_id='["123456789012345678", "another-user-id"]'
```
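A pasted username or placeholder in place of a numeric ID is an easy mistake to make here. The check below is a minimal sketch for catching it before deployment; `is_discord_id` is a hypothetical helper, and the 17 to 20 digit range is an assumption based on the IDs Discord currently issues.

```bash
# Hypothetical helper: succeed only if the argument looks like a Discord ID,
# i.e. a purely numeric string of plausible length (assumed 17-20 digits).
is_discord_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "${#1}" -ge 17 ] && [ "${#1}" -le 20 ]
}

# Usage:
#   is_discord_id "$my_id" || echo "not a numeric Discord ID: $my_id" >&2
```

A placeholder such as `another-user-id` fails this check, so it has to be replaced with a real ID first.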
## Step 6: Deploy (or Update)
```bash
# Initial deployment
terraform apply
# If already deployed, update:
terraform apply -var="discord_bot_token=..." -var="discord_server_id=..." -var="discord_user_id=[\"user1\", \"user2\"]"
```
## Step 7: Pair the Gateway
After deployment, the gateway needs to be paired with Discord:
### If Tailscale is Enabled
1. Visit `https://<hostname>.<tailnet>.ts.net/`
2. If device pairing is required:
   - You'll see a pairing code
   - On the server: `openclaw pairing approve device <CODE>`
### If Using SSH Tunnel
```bash
# Create tunnel
ssh -L 18789:localhost:18789 openclaw@<server-ip>
# Open browser
# http://localhost:18789
```
## Channel Configuration
By default, the bot is configured for:
- All channels in the server (using wildcard `*`)
- No mention required (bot responds to all messages)
- Only your user ID in allowlist
To customize, edit `openclaw.json` after deployment:
```json
{
  "channels": {
    "discord": {
      "enabled": true,
      "token": "${DISCORD_BOT_TOKEN}",
      "groupPolicy": "allowlist",
      "guilds": {
        "SERVER_ID": {
          "requireMention": false,
          "users": ["YOUR_USER_ID"],
          "channels": {
            "*": { "allow": true }
          }
        }
      }
    }
  }
}
```
## Testing
### Test Bot is Working
1. In Discord, go to any channel in your server
2. Type a message
3. The bot should respond (if `requireMention` is false)
4. Or: Mention the bot with `@OpenClaw Agent hello`
### Check Gateway Logs
On the server:
```bash
# Check gateway is running
systemctl status openclaw-gateway
# View logs
journalctl -u openclaw-gateway -f
```
## Troubleshooting
### Bot doesn't respond
1. Check bot token is correct
2. Verify bot has **Message Content Intent** enabled
3. Check server ID and user IDs are correct
4. Verify bot is in your server
### "Unauthorized" in gateway logs
- Verify `discord_user_id` list contains your actual Discord IDs
- Check each user ID is in the server's member list
### Gateway shows pairing code
If you see a pairing code:
1. SSH into the server
2. Run: `openclaw pairing approve device <CODE>`
3. Refresh the browser
### Bot joins but doesn't respond
- Check `requireMention` setting
- Verify your user ID is in the allowlist
- Check gateway logs for errors
## Security Notes
- The bot token provides full access to the bot — keep it secret
- Regenerate the token if compromised: Discord Dev Portal → Bot → Reset Token
- The user ID allowlist ensures only you can interact with the agent
- For team access, add multiple user IDs to the `users` array

255
docs/DOCKER_VS_DIRECT.md Normal file

@ -0,0 +1,255 @@
# Docker vs Direct Installation Guide
## Overview
OpenBoatmobile now supports two deployment modes for Hermes Agent:
1. **Docker Container** (default, `docker_enabled = true`)
   - Runs Hermes in a Docker container
   - Isolated environment, easier updates
   - Slightly higher resource usage
2. **Direct Installation** (`docker_enabled = false`)
   - Installs Hermes directly on the host system
   - Lower resource usage, faster startup
   - `hermes` command available in PATH
   - Better for dedicated VPS environments
## Configuration
### Enable Direct Installation
**In `.env` file:**
```bash
TF_VAR_docker_enabled=false
```
**In `terraform.tfvars`:**
```hcl
docker_enabled = false
```
### Default Behavior
- `docker_enabled = true` (Docker container) - **Default**
- `docker_enabled = false` (Direct installation)
## Deployment Differences
### Docker Mode (`docker_enabled = true`)
**Installation:**
- Installs Docker and docker-compose
- Pulls `nousresearch/hermes-agent:latest`
- Runs in container with volume mounts
**Management:**
```bash
# Check status
docker ps | grep hermes
# View logs
docker logs hermes
# Restart
docker restart hermes
# Access hermes CLI
docker exec hermes hermes --help
```
**Resource Usage:**
- ~200MB additional RAM for Docker daemon
- Container overhead (~50MB RAM)
- Isolated filesystem
### Direct Mode (`docker_enabled = false`)
**Installation:**
- Installs `uv` package manager from Astral
- Clones `github.com/NousResearch/hermes-agent` repository
- Creates Python 3.11 virtual environment
- Installs with `uv pip install -e ".[messaging]"` (Discord/Slack/Telegram support)
- Creates `/usr/local/bin/hermes` wrapper script
**Management:**
```bash
# Check status
systemctl status hermes.service
# View logs
journalctl -u hermes.service -f
# Restart
systemctl restart hermes.service
# Access hermes CLI directly
hermes --help
hermes gateway status
```
**Resource Usage:**
- Minimal overhead (~20MB RAM for venv)
- Direct process execution
- Shared filesystem with host
## File Locations
### Docker Mode
```
/home/hermes/.hermes/ # Config and data (host)
/var/lib/docker/ # Container runtime
```
### Direct Mode
```
/home/hermes/.hermes/ # Config and data
/home/hermes/hermes-agent/ # Git repository
/home/hermes/hermes-agent/venv/ # Python virtual environment
/usr/local/bin/hermes # CLI wrapper script
/root/.local/bin/uv # uv package manager
```
## Command Line Access
### Docker Mode
```bash
# Run hermes commands
docker exec hermes hermes --version
docker exec hermes hermes gateway status
# Or create alias for convenience
echo "alias hermes='docker exec hermes hermes'" >> ~/.bashrc
```
### Direct Mode
```bash
# hermes command is directly available
hermes --version
hermes gateway status
hermes --help
```
## Health Checks
### Docker Mode
```bash
/usr/local/bin/hermes-health-check.sh
# Checks: Docker daemon, container status, port 18789, config files
```
### Direct Mode
```bash
/usr/local/bin/hermes-health-check.sh
# Checks: hermes binary, venv, process status, port 18789, config files
```
## Troubleshooting
### Docker Mode Issues
```bash
# Docker daemon not running
sudo systemctl start docker
# Container crashed
docker logs hermes
docker restart hermes
# Permission issues
sudo usermod -aG docker $USER
newgrp docker
```
### Direct Mode Issues
```bash
# hermes command not found
which hermes
ls -la /usr/local/bin/hermes
cat /usr/local/bin/hermes # Check wrapper script content
# Virtual environment issues
ls -la ~/hermes-agent/venv/
ls -la ~/hermes-agent/venv/bin/hermes
# Check if repo was cloned
ls -la ~/hermes-agent/
# Check if uv is installed
ls -la /root/.local/bin/uv
# Service not starting
journalctl -u hermes.service -n 20
systemctl status hermes.service
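# Reinstall manually (the clone fails if ~/hermes-agent already exists; move it aside first)
cd ~
git clone --recurse-submodules https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
/root/.local/bin/uv venv venv --python 3.11
# Point uv at the venv explicitly: it only auto-detects an activated venv or one named .venv
/root/.local/bin/uv pip install --python venv/bin/python -e '.[messaging]'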
```
## Migration Between Modes
### From Docker to Direct
1. Set `docker_enabled = false`
2. Run `terraform apply`
3. Data in `~/.hermes/` is preserved
4. `hermes` command becomes available
### From Direct to Docker
1. Set `docker_enabled = true`
2. Run `terraform apply`
3. Data in `~/.hermes/` is preserved
4. Use `docker exec hermes hermes` for CLI access
## Performance Comparison
| Metric | Docker Mode | Direct Mode | Difference |
|--------|-------------|-------------|------------|
| RAM Usage | ~400MB | ~200MB | -50% |
| Startup Time | ~15s | ~5s | -67% |
| Disk Usage | ~2GB | ~1GB | -50% |
| hermes CLI | `docker exec` | Direct | Simpler in Direct |
| Isolation | Full | None | Docker more secure |
## Recommendations
### Use Docker Mode When:
- Running multiple services on the same server
- Wanting easy rollback/updates
- Security isolation is important
- Using cloud environments with limited control
### Use Direct Mode When:
- Dedicated VPS for Hermes only
- Wanting minimal resource usage
- Needing fastest possible startup
- Wanting direct CLI access without `docker exec`
## Examples
### Minimal Direct Installation
```hcl
# terraform.tfvars
cloud_provider = "hetzner"
agent_framework = "hermes"
docker_enabled = false
venice_api_key = "your-key"
ssh_key_names = ["your-key"]
```
### Docker Installation with Custom User
```hcl
# terraform.tfvars
cloud_provider = "hetzner"
agent_framework = "hermes"
docker_enabled = true
admin_user = "ai-admin" # Override default 'hermes'
venice_api_key = "your-key"
ssh_key_names = ["your-key"]
```
## Support
Both modes are fully supported. The direct mode is recommended for dedicated VPS deployments where you want the `hermes` command directly available in your PATH.

166
docs/GETTING-STARTED.md Normal file

@ -0,0 +1,166 @@
# Getting Started with OpenBoatmobile
This guide walks you through deploying an OpenClaw agent in about 15 minutes.
## Prerequisites
Before you start, you need:
| Requirement | How to Get It |
|-------------|---------------|
| Terraform >= 1.5.4 | [Install guide](https://developer.hashicorp.com/terraform/install) |
| SSH key pair | `ssh-keygen -t ed25519 -C "your@email.com"` |
| Hetzner Cloud API token | [Hetzner Console](https://console.hetzner.cloud/) → Security → API Tokens |
| Venice AI API key | [Venice.ai](https://venice.ai) → Settings → API Keys |
| Tailscale auth key (recommended) | [Tailscale Admin](https://login.tailscale.com/admin/settings/keys) |
**Optional:**
- DigitalOcean API token (if using DO instead of Hetzner)
- Discord bot token (for Discord integration)
- Brave Search API key (for web search)
## Step 1: Clone the Repository
```bash
git clone https://github.com/YOUR_USERNAME/openboatmobile-ai.git
cd openboatmobile-ai
```
## Step 2: Configure Secrets
OpenBoatmobile uses environment variables for secrets. This keeps sensitive data out of git.
```bash
# Copy the example
cp .env.example .env
# Edit with your values
$EDITOR .env
```
**Required secrets:**
```bash
# Choose your provider
TF_VAR_cloud_provider=hetzner # or digitalocean
# Provider API token (one of these)
TF_VAR_hcloud_token=your-hetzner-api-token-here
# TF_VAR_do_token=your-digitalocean-api-token-here
# Venice AI (required for inference)
TF_VAR_venice_api_key=your-venice-api-key-here
# SSH key name (as shown in your cloud provider's console)
TF_VAR_ssh_key_names='["my-ssh-key-name"]'
```
**Recommended:**
```bash
# Tailscale for secure remote access
TF_VAR_enable_tailscale=true
TF_VAR_tailscale_auth_key=tskey-auth-xxxxx
```
## Step 3: Source the Environment
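```bash
set -a        # export every variable assigned while sourcing
source .env
set +a
```
This loads your secrets into the shell and exports them. Terraform reads `TF_VAR_*` variables from its environment automatically, so they must be exported: a plain `source .env` only sets shell variables unless each line in `.env` starts with `export`.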
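A quick way to confirm that the variables are visible to child processes such as Terraform is to query the environment itself, not the shell's own variables. The following is a minimal sketch; `missing_env_vars` is a hypothetical helper name.

```bash
# Hypothetical helper: print the names of any listed variables that are NOT
# present in the environment handed to child processes. `printenv` is an
# external command, so it only sees variables that were actually exported.
missing_env_vars() (
    for name in "$@"; do
        printenv "$name" >/dev/null || printf '%s\n' "$name"
    done
)

# Usage (variable names taken from the "Required secrets" block above):
#   missing_env_vars TF_VAR_cloud_provider TF_VAR_venice_api_key TF_VAR_ssh_key_names
```

Empty output means every listed variable will reach `terraform plan` and `terraform apply`.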
## Step 4: Initialize and Plan
```bash
terraform init
terraform plan
```
Review the plan. You should see:
- 1 server (Hetzner) or 1 droplet (DigitalOcean)
- 1 firewall
- Cloud-init configuration
## Step 5: Deploy
```bash
terraform apply
```
Type `yes` when prompted. Deployment takes 2-5 minutes.
## Step 6: Connect
Terraform outputs the SSH command (username depends on framework):
```bash
# Example output for OpenClaw:
ssh_command = "ssh openclaw@123.45.67.89"
# Example output for Hermes:
ssh_command = "ssh hermes@123.45.67.89"
```
SSH into your server:
```bash
# The username will be either 'openclaw' or 'hermes' based on your framework
ssh <USERNAME>@<YOUR_SERVER_IP>
```
## Step 7: Run OpenClaw Onboarding
On the server:
```bash
openclaw onboard --install-daemon
```
This configures the OpenClaw gateway and starts the service.
## Step 8: Configure Tailscale (if enabled)
If you're using Tailscale:
```bash
# On the server
sudo tailscale serve --bg 18789
```
Then visit: `https://<hostname>.<tailnet>.ts.net/`
## Step 9: Configure Discord (Optional)
See [DISCORD_SETUP.md](./DISCORD_SETUP.md) for Discord bot configuration.
## Troubleshooting
### SSH Connection Refused
- Wait 2-3 minutes after deployment for cloud-init to complete
- Check firewall allows your IP: `TF_VAR_ssh_allowed_ips='["your.ip.here/32"]'`
### Terraform Error: "SSH key not found"
- Hetzner: Key name must match exactly as shown in Console
- DigitalOcean: Use the fingerprint, not the name
### OpenClaw command not found
- Cloud-init installs Node.js and OpenClaw
- Wait a few minutes, then try: `which openclaw`
- Check logs: `tail -f /var/log/cloud-init-output.log`
### Tailscale not working
- Verify auth key is valid and unused
- Check Tailscale status: `sudo tailscale status`
- Enable Serve in Tailscale Admin Console
## Next Steps
- [HETZNER_SETUP.md](./HETZNER_SETUP.md) - Detailed Hetzner configuration
- [DIGITALOCEAN_SETUP.md](./DIGITALOCEAN_SETUP.md) - Detailed DO configuration
- [SECRETS.md](./SECRETS.md) - Advanced secrets management
- [TROUBLESHOOTING.md](./TROUBLESHOOTING.md) - Common issues and fixes

203
docs/HERMES_AUDIT_REPORT.md Normal file

@ -0,0 +1,203 @@
# Hermes Deployment Audit Report
## Issues Found
During the audit of the Terraform project for Hermes Agent deployment, several critical issues were identified that would prevent Hermes from running properly:
### 1. **Systemd Service Configuration Error** (CRITICAL)
**Problem:** The systemd service didn't specify the docker-compose file path
- `ExecStart=/usr/bin/docker compose up` without the `-f` flag
- The service couldn't find docker-compose.yml when running from an arbitrary directory
- No guarantee the service would change to the correct working directory
**Impact:** Service would start but immediately fail or not find the compose file.
**Fix:** Updated to:
```ini
ExecStart=/bin/sh -c 'cd /home/${admin_user} && exec docker compose -f docker-compose.yml up'
ExecStop=/bin/sh -c 'cd /home/${admin_user} && exec docker compose -f docker-compose.yml down'
```
### 2. **User Permissions Issue** (CRITICAL)
**Problem:** Service was configured to run as `User=${admin_user}` (non-root)
- Adding a user to the docker group with `usermod -aG docker` doesn't take effect for existing sessions
- The systemd service tries to use docker before the hermes user has proper permissions
- Would require a re-login to apply the docker group permissions
**Impact:** Service runs as hermes user without the necessary docker group permissions, causing "permission denied" errors.
**Fix:** Changed service to run as root (necessary for Docker):
```ini
User=root
```
And ensured proper file ownership:
```bash
chown ${admin_user}:${admin_user} /home/${admin_user}/docker-compose.yml
chmod 644 /home/${admin_user}/docker-compose.yml
```
### 3. **Installation Order Issue**
**Problem:** Docker image was pulled before docker-compose-plugin was installed
- `docker pull` command succeeded (using legacy docker)
- But `docker compose` (the plugin) comes later
- If the pull failed, docker-compose-plugin wouldn't have been installed yet
**Impact:** Potential race condition during bootstrap.
**Fix:** Reordered runcmd to install docker-compose-plugin immediately after Docker:
```yaml
1. curl docker installer
2. apt-get install docker-compose-plugin # BEFORE pulling image
3. docker pull nousresearch/hermes-agent:latest
```
### 4. **No Docker Daemon Ready Check** (HIGH)
**Problem:** Script tried to pull images immediately after Docker installation
- Docker socket might not be ready
- Starting services before Docker is fully operational
**Impact:** Timing-dependent failures, especially on slower systems.
**Fix:** Added health checks and delays:
```bash
# Wait for Docker daemon to be ready
sleep 5
docker ps > /dev/null || (sleep 10 && docker ps)
```
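A fixed sleep can still be too short on a slow machine. As an alternative, a bounded retry loop keeps polling until the daemon answers. The following is only a sketch of that idea, not the code in the template; `wait_for` is a hypothetical helper name.

```bash
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts, and succeed as soon as the command succeeds.
# The body runs in a subshell, so `exit` only ends the function.
wait_for() (
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            exit 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    exit 1
)

# Usage in a bootstrap script, in place of the fixed sleep:
#   wait_for 12 5 docker info || echo "Docker daemon did not become ready" >&2
```

With 12 attempts and a 5 second delay the loop gives up after roughly a minute, and it returns immediately on a machine where Docker is already up.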
### 5. **No Service Startup Verification** (MEDIUM)
**Problem:** Service was started with no check that it actually came up
- If the service failed to start, deployment would complete successfully anyway
- User wouldn't know until they SSH in
**Impact:** Silent failures that only become apparent when checking the server.
**Fix:** Added verification:
```bash
# Verify service started
systemctl is-active hermes.service || systemctl status hermes.service
```
### 6. **Poor Error Logging** (MEDIUM)
**Problem:** systemd service logged to stdout but nothing captured the startup errors
- No journal entries with what went wrong
- No way to see Docker errors in the cloud-init logs
**Impact:** Difficult to diagnose why the service failed.
**Fix:** Added proper journal logging:
```ini
StandardOutput=journal
StandardError=journal
SyslogIdentifier=hermes
```
## Changes Made
### Terraform Files Modified
1. **templates/userdata-hermes.tpl**
- Fixed systemd service configuration
- Reordered runcmd operations
- Added Docker readiness checks and delays
- Enhanced health check script
- Added service startup verification
- Improved completion messages
2. **docs/HERMES_DEBUGGING.md** (NEW)
- Comprehensive troubleshooting guide
- Common issues and solutions
- Diagnostic commands
- Manual start/stop procedures
- Discord connectivity testing
3. **README.md**
- Added reference to HERMES_DEBUGGING.md documentation
## Testing These Changes
To test the fixes, you need to redeploy:
```bash
# Option 1: Destroy and redeploy (cleanest)
terraform destroy
# Answer yes when prompted
source .env && terraform init && terraform apply
# Option 2: Update existing (if keeping infrastructure)
source .env && terraform apply -auto-approve
```
After deployment, verify Hermes is running:
```bash
# SSH into the server (username is 'hermes' or your override)
ssh hermes@<SERVER_IP>
# Run the health check
/usr/local/bin/hermes-health-check.sh
# Or manually verify
systemctl status hermes.service
docker ps
docker logs hermes
```
## Deployment Flow Now
With the fixes, the cloud-init deployment flow is now:
1. ✓ Update system packages
2. ✓ Create hermes user
3. ✓ Write configuration files (.env, config.yaml, docker-compose.yml, SOUL.md)
4. ✓ Write health check script
5. ✓ Write systemd service unit
6. ✓ Install Docker
7. ✓ Install docker-compose-plugin
8. ✓ Wait for Docker daemon to be ready
9. ✓ Pull Hermes image
10. ✓ Set proper permissions
11. ✓ Reload systemd
12. ✓ Enable hermes.service
13. ✓ Start systemd service (which runs docker-compose up)
14. ✓ Wait for startup
15. ✓ Verify service is active
## Expected Behavior After Fix
When you SSH into the server after deployment:
```bash
$ systemctl status hermes.service
● hermes.service - Hermes Agent Service
Loaded: loaded (/etc/systemd/system/hermes.service; enabled; vendor preset: enabled)
Active: active (running) since ...
$ docker ps
CONTAINER ID IMAGE STATUS
abc123 nousresearch/hermes-agent:latest Up 2 minutes
$ docker logs hermes
[INFO] Hermes Agent starting...
[INFO] Discord bot initialized
...
```
And in Discord:
- Bot shows "online" status
- Responds to mentions in configured channels
- Respects user allowlist
## Next Steps
1. **Redeploy** with the fixed template
2. **Verify** using the health checks documented in HERMES_DEBUGGING.md
3. **Test Discord** connectivity by mentioning the bot in a channel
4. **Monitor logs** using `docker logs -f hermes` if issues occur
## Additional Notes
- The audit identified these issues by analyzing the template configuration and deployment flow
- Similar fixes should be applied if you have OpenClaw deployments
- The systemd service is now production-ready with proper error handling
- Health check script was significantly enhanced for better diagnostics

330
docs/HERMES_DEBUGGING.md Normal file

@ -0,0 +1,330 @@
# Hermes Agent Debugging Guide
This guide helps diagnose why Hermes Agent may not be running after Terraform deployment.
## Quick Diagnostic Checklist
### 1. Service Status
```bash
# Check systemd service status
systemctl status hermes.service
# View service logs
journalctl -u hermes.service -f
# Check if container exists
docker ps -a | grep hermes
# View container logs
docker logs hermes
```
### 2. Docker Health
```bash
# Verify Docker is running
systemctl status docker
# List containers
docker ps -a
# Check Docker events (watch real-time)
docker events
# Check docker socket permissions
ls -la /var/run/docker.sock
```
### 3. Directory and File Permissions
```bash
# Check .hermes directory
ls -la ~/.hermes/
ls -la ~/.hermes/.env
ls -la ~/docker-compose.yml
# Check file contents
cat ~/.hermes/.env
cat ~/.hermes/config.yaml
cat ~/docker-compose.yml
```
## Common Issues and Fixes
### Issue 1: "Hermes container not running"
**Symptoms:**
- `docker ps` shows no hermes container
- `.hermes` folder exists but docker container won't start
**Diagnosis:**
```bash
# Check service status
systemctl status hermes.service
# Check recent logs
journalctl -u hermes.service -n 50
# Check docker logs more verbosely
docker logs hermes 2>&1 | tail -50
```
**Root Causes:**
1. **Docker image not pulled properly** → Pull manually:
```bash
docker pull nousresearch/hermes-agent:latest
```
2. **Missing .env file** → Check if it exists and has content:
```bash
ls -la ~/.hermes/.env
cat ~/.hermes/.env
```
3. **Directory permission issues** → Fix permissions:
```bash
sudo chown -R $(whoami):$(whoami) ~/.hermes
chmod 755 ~/.hermes
chmod 600 ~/.hermes/.env
```
4. **Docker compose file not found** → Verify location:
```bash
ls -la ~/docker-compose.yml
cat ~/docker-compose.yml
```
5. **Port 18789 already in use** → Check:
```bash
lsof -i :18789
```
If occupied, either:
- Kill the process using it
- Change the port in docker-compose.yml
### Issue 2: "Container starts but immediately exits"
**Symptoms:**
- `docker ps` is empty but `docker ps -a` shows the container with "Exited" status
- Container stops within seconds of starting
**Diagnosis:**
```bash
# View the exit code
docker ps -a | grep hermes
# Get more detailed error logs
docker logs hermes
```
**Common Fixes:**
1. **Invalid YAML in config.yaml** → Validate syntax:
```bash
python3 -c "import os, yaml; yaml.safe_load(open(os.path.expanduser('~/.hermes/config.yaml')))"
```
2. **Missing API keys** → Check:
```bash
grep -E "OPENROUTER|DISCORD_BOT|BRAVE" ~/.hermes/.env
```
3. **Invalid gateway token** → Verify:
```bash
echo $HERMES_GATEWAY_TOKEN
```
### Issue 3: "Docker daemon won't start"
**Symptoms:**
- `systemctl status docker` shows failed/inactive
- `docker ps` returns "Cannot connect to Docker daemon"
**Fixes:**
```bash
# Start Docker
sudo systemctl start docker
# Enable on boot
sudo systemctl enable docker
# Check Docker health
docker ps
```
### Issue 4: "Discord bot shows offline"
**Symptoms:**
- Hermes is running (docker ps shows container)
- But Discord bot doesn't show "online" status in your server
**Diagnosis:**
```bash
# Check if Discord configuration is loaded
grep -i discord ~/.hermes/.env
grep -i discord ~/.hermes/config.yaml
# View container logs for Discord errors
docker logs hermes | grep -i discord
```
**Root Causes:**
1. **Invalid bot token** → Verify in .env:
```bash
grep DISCORD_BOT_TOKEN ~/.hermes/.env
```
2. **Wrong server ID** → Check config:
```bash
grep -A 5 "discord_server_id" ~/.hermes/config.yaml
```
3. **User IDs not in server** → Verify in allowlist:
```bash
grep -A 10 "users:" ~/.hermes/config.yaml
```
4. **Gateway not running** → Check port:
```bash
lsof -i :18789
```
5. **Bot not in server** → Manual fix:
1. Go to Discord Developer Portal
2. Select your bot
3. Copy OAuth2 URL with scopes: `bot`, `applications.commands`
4. Click the URL to invite bot to your server
### Issue 5: "Container gets killed after startup"
**Symptoms:**
- Service shows active but container keeps restarting
- `docker logs` shows memory or resource errors
**Fixes:**
```bash
# Check Docker stats
docker stats hermes
# Check docker-compose.yml resource limits
grep -A 5 "deploy:" ~/docker-compose.yml
# Increase memory limit if needed
# Edit ~/docker-compose.yml and increase memory value
nano ~/docker-compose.yml
```
## Verification Steps
Once you believe Hermes is running, verify with:
```bash
# Health check script (if it exists)
bash /usr/local/bin/hermes-health-check.sh
# Manual health checks
echo "1. Service status:"
systemctl is-active hermes.service
echo "2. Container running:"
docker ps | grep hermes
echo "3. Port listening:"
ss -tlnp | grep 18789   # or: netstat -tlnp | grep 18789, if net-tools is installed
```
## Manual Start/Stop
If the systemd service isn't working:
```bash
# Manual start
cd ~/
docker compose -f docker-compose.yml up -d
# Manual stop
cd ~/
docker compose -f docker-compose.yml down
# Manual logs
cd ~/
docker compose -f docker-compose.yml logs -f
```
## Rebuilding from Scratch
If nothing else works:
```bash
# Stop everything
systemctl stop hermes.service
docker compose -f ~/docker-compose.yml down
# Remove container and image
docker rm hermes 2>/dev/null || true
docker rmi nousresearch/hermes-agent:latest 2>/dev/null || true
# Pull fresh image
docker pull nousresearch/hermes-agent:latest
# Start service again
systemctl start hermes.service
# Monitor startup
journalctl -u hermes.service -f
```
## Debug Mode
For more verbose logging:
```bash
# Follow service logs, printing all fields in full (including long or unprintable ones)
journalctl -u hermes.service -f --all
# Watch docker logs continuously
docker logs -f --tail=50 hermes
# Run docker compose in foreground (stops automated service)
cd ~/
docker compose -f docker-compose.yml up
```
## Testing Discord Connectivity
Once Hermes is running:
```bash
# Send a test message to your Discord bot, for example by mentioning it
# in a channel (this is typed in Discord, not in the shell):
#   @hermes help
# The bot should respond in the channel or via DM.
# Then check the logs for Discord activity:
docker logs hermes | tail -100
```
## Terraform Logs
Check cloud-init logs on the server for deployment issues:
```bash
# View cloud-init output
sudo cloud-init status
sudo cat /var/log/cloud-init-output.log
# Check for specific errors
grep -i error /var/log/cloud-init-output.log
grep -i docker /var/log/cloud-init.log
```
## Getting Help
If stuck, provide:
1. Output of `systemctl status hermes.service`
2. Output of `docker ps -a`
3. Last 50 lines of `docker logs hermes`
4. Contents of `~/.hermes/.env` (redact secrets)
5. Contents of `~/.hermes/config.yaml`
6. Output of `cloud-init status`
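Redacting `.env` by hand is error-prone, so the values can be stripped mechanically before sharing. This is a minimal sketch, assuming the file holds simple `KEY=VALUE` lines, optionally prefixed with `export`; `redact_env` is a hypothetical helper name.

```bash
# Hypothetical helper: read KEY=VALUE lines on stdin and print them with every
# value replaced by a placeholder. Comments and blank lines pass through as-is.
redact_env() {
    sed -E 's/^([[:space:]]*(export[[:space:]]+)?[A-Za-z_][A-Za-z0-9_]*)=.*/\1=<redacted>/'
}

# Usage:
#   redact_env < ~/.hermes/.env
```

The output keeps the variable names, which is usually what a helper needs to see, and drops everything after the first `=`. Values that continue across several lines are not handled by this sketch.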

194
docs/HETZNER_SETUP.md Normal file

@ -0,0 +1,194 @@
# Hetzner Cloud Setup
Detailed guide for deploying OpenBoatmobile to Hetzner Cloud.
## Why Hetzner?
| Spec | Hetzner cx23 | DigitalOcean s-2vcpu-4gb |
|------|-------------|-------------------------|
| vCPU | 2 | 2 |
| RAM | 4 GB | 4 GB |
| Disk | 80 GB NVMe | 80 GB SSD |
| Bandwidth | 20 TB included | 4 TB included |
| **Price** | **€4.49/mo** | **$24/mo** |
At these list prices, Hetzner costs roughly one fifth as much for comparable specs (the exact ratio depends on the EUR/USD exchange rate).
## Create Hetzner Account
1. Go to [Hetzner Cloud](https://www.hetzner.com/cloud)
2. Sign up (email verification required)
3. Add a payment method
## Create API Token
1. Go to [Hetzner Console](https://console.hetzner.cloud/)
2. Click your project (or create one)
3. Navigate to **Security** → **API Tokens**
4. Click **Create API Token**
5. Name it (e.g., "openclaw-terraform")
6. Permissions: **Read & Write**
7. Copy the token immediately (shown only once)
## Add SSH Key
1. In Hetzner Console, go to **Security** → **SSH Keys**
2. Click **Add SSH Key**
3. Paste your public key contents:
```bash
cat ~/.ssh/id_ed25519.pub
```
4. Give it a name you can remember (e.g., "laptop-2024")
5. Click **Add SSH Key**
## Choose a Location
Hetzner locations:
| Code | Location | Continent |
|------|----------|-----------|
| `nbg1` | Nuremberg | Europe |
| `fsn1` | Falkenstein | Europe |
| `hel1` | Helsinki | Europe |
| `ash` | Ashburn, VA | North America |
For US users: `ash` (Ashburn) has the best latency.
## Configure OpenBoatmobile
### Minimal Configuration
In `terraform.tfvars`:
```hcl
provider = "hetzner"
server_name = "my-agent"
server_type_hetzner = "cx23"
location_hetzner = "ash"
# These come from environment:
# TF_VAR_hcloud_token
# TF_VAR_venice_api_key
# TF_VAR_ssh_key_names
```
### Server Types
| Type | vCPU | RAM | Disk | Price |
|------|------|-----|------|-------|
| cx22 | 2 | 4 GB | 40 GB | €3.79/mo |
| **cx23** | 2 | 4 GB | 80 GB | **€4.49/mo** (recommended) |
| cpx21 | 3 | 4 GB | 80 GB | €5.99/mo |
| cpx31 | 4 | 8 GB | 160 GB | €8.99/mo |
The cx23 is the sweet spot for OpenClaw: enough RAM for Node.js + LLM contexts, affordable price.
## Deploy
```bash
# Load secrets (set -a exports every variable, even without a leading "export")
set -a; source .env; set +a
# Initialize (first time only)
terraform init
# Preview changes
terraform plan
# Deploy
terraform apply
```
## Post-Deployment
Terraform outputs your server IP:
```
server_ip = "123.45.67.89"
ssh_command = "ssh openclaw@123.45.67.89" # or "ssh hermes@123.45.67.89" for Hermes
```
### Connect
```bash
# Username is 'openclaw' or 'hermes' depending on framework
ssh <USERNAME>@123.45.67.89
```
### Check Cloud-Init Status
On the server:
```bash
# Check if cloud-init is still running
cloud-init status
# If waiting, you can watch progress:
tail -f /var/log/cloud-init-output.log
```
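
If you are scripting the deployment, the manual check above can be turned into a polling loop with a timeout. This is a sketch only: the function name `wait_for_cloud_init` and its parameters are hypothetical, and it assumes `cloud-init status` prints a line such as `status: done`, `status: running`, or `status: error`. The built-in `cloud-init status --wait` does the same job without a timeout.

```bash
# Hypothetical helper: poll a status command until cloud-init finishes or time runs out.
#   $1 = timeout in seconds (default 300)
#   $2 = polling interval in seconds (default 5)
#   $3 = status command (default "cloud-init status"); replaceable for testing
wait_for_cloud_init() {
  local timeout="${1:-300}" interval="${2:-5}" status_cmd="${3:-cloud-init status}"
  local waited=0 out
  while [ "$waited" -lt "$timeout" ]; do
    out="$($status_cmd 2>/dev/null)" || true
    case "$out" in
      *"status: done"*)  echo "done";  return 0 ;;
      *"status: error"*) echo "error"; return 1 ;;
    esac
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timeout"
  return 2
}

# Usage (on the server):
#   wait_for_cloud_init 600 10 && echo "server is ready"
```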
### Run OpenClaw Onboarding
```bash
openclaw onboard --install-daemon
```
### Verify Gateway
```bash
systemctl status openclaw-gateway
```
## Firewall Rules
OpenBoatmobile creates a Hetzner firewall with:
| Direction | Port | Source |
|-----------|------|--------|
| Inbound | 22 (SSH) | Configured IPs |
| Outbound | All | Any |
To restrict SSH to your IP:
```bash
TF_VAR_ssh_allowed_ips='["your.public.ip/32", "another.ip/32"]'
```
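
The value is a JSON-style list of CIDR strings, which is easy to get wrong by hand. A small sketch of a helper that builds it from bare IPv4 addresses; the name `ssh_allowed_ips_json` is hypothetical, and the addresses in the usage line are documentation-range placeholders:

```bash
# Hypothetical helper: turn bare IPv4 addresses into the list of /32 CIDRs
# that TF_VAR_ssh_allowed_ips expects.
ssh_allowed_ips_json() {
  local out="" ip
  for ip in "$@"; do
    # Append a comma separator only when the list already has an entry.
    out="${out:+$out, }\"$ip/32\""
  done
  printf '[%s]\n' "$out"
}

# Usage:
#   export TF_VAR_ssh_allowed_ips="$(ssh_allowed_ips_json 203.0.113.5 198.51.100.7)"
```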
## Cleanup
To destroy your deployment:
```bash
terraform destroy
```
**Note:** This deletes the server and all data. Back up anything important first.
## Troubleshooting
### "API Token invalid"
- Tokens are shown only once; if you did not save yours, create a new one
- Check for trailing spaces in `.env`
- Verify token has Read & Write permissions
### "SSH Key not found"
- The key name must match exactly what you entered in Hetzner Console
- Case-sensitive
- Use the name, not the fingerprint
### Server shows but can't SSH
- Wait 2-3 minutes for cloud-init
- Check your IP is in `ssh_allowed_ips`
- Verify the key is added to your agent: `ssh-add -l`
### Cloud-init stuck
```bash
# On the server
cloud-init status --wait
# Or check logs
tail -f /var/log/cloud-init-output.log
```

138
docs/SECRETS.md Normal file

@ -0,0 +1,138 @@
# Secrets Management
OpenBoatmobile uses Terraform's native secrets handling: environment variables with the `TF_VAR_` prefix.
## Why Environment Variables?
| Approach | Pros | Cons |
|----------|------|------|
| `TF_VAR_*` env vars | Standard Terraform, never in git, works with CI/CD | Must source before each session |
| `.tfvars` file | Easy to edit | Easy to accidentally commit secrets |
| HashiCorp Vault | Enterprise-grade | Complex setup, overkill for solo use |
| SOPS (encrypted files) | Git-tracked encrypted secrets | Extra tooling required |
We use `TF_VAR_*` because it's the Terraform standard and keeps secrets out of git by default.
## The .env File
The `.env.example` template lists all configurable variables:
```bash
# Copy to .env and fill in your values
cp .env.example .env
```
**Never commit `.env`:** It's in `.gitignore` by default.
## Loading Secrets
Before running Terraform:
```bash
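set -a; source .env; set +a
```
`set -a` marks every variable assigned while `.env` is sourced for export, so entries written without a leading `export` still reach Terraform, which runs as a child process and automatically reads `TF_VAR_*` variables.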
## Required Secrets
| Variable | Description | How to Get |
|----------|-------------|------------|
| `TF_VAR_hcloud_token` | Hetzner API token | [Hetzner Console](https://console.hetzner.cloud/) → Security → API Tokens → Create Token |
| `TF_VAR_venice_api_key` | Venice AI API key | [Venice.ai](https://venice.ai) → Settings → API Keys |
| `TF_VAR_ssh_key_names` | SSH key name(s) | Name you gave the key in Hetzner Console |
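
A minimal pre-flight sketch that loads `.env` and reports any of the three variables above that are still empty. The function names are hypothetical, and the list assumes a Hetzner deployment where these three are the only required secrets:

```bash
# Hypothetical helper: source a .env file so that every variable in it is exported.
load_env() {
  # set -a marks each assignment for export; pass a path containing a slash
  # (for example ./.env) so the shell does not search PATH for the file.
  set -a
  . "$1"
  set +a
}

# Hypothetical helper: print the names of required variables that are unset or empty.
missing_required_vars() {
  local name value missing=""
  for name in TF_VAR_hcloud_token TF_VAR_venice_api_key TF_VAR_ssh_key_names; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="${missing:+$missing }$name"
  done
  printf '%s\n' "$missing"
}

# Usage:
#   load_env ./.env
#   missing="$(missing_required_vars)"
#   [ -z "$missing" ] || echo "Missing: $missing"
```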
## Optional Secrets
| Variable | Description | How to Get |
|----------|-------------|------------|
| `TF_VAR_tailscale_auth_key` | Tailscale auth key | [Tailscale Admin](https://login.tailscale.com/admin/settings/keys) → Create Key |
| `TF_VAR_discord_bot_token` | Discord bot token | [Discord Dev Portal](https://discord.com/developers/applications) |
| `TF_VAR_brave_search_api_key` | Brave Search API key | [Brave Search API](https://api.search.brave.com/app/keys) |
| `TF_VAR_do_token` | DigitalOcean API token | [DO API Settings](https://cloud.digitalocean.com/account/api/tokens) |
## SSH Key Setup
### Hetzner
1. Generate a key (if you don't have one):
```bash
ssh-keygen -t ed25519 -C "your@email.com"
```
2. Add to Hetzner Console:
- Go to [Hetzner Console](https://console.hetzner.cloud/) → Security → SSH Keys
- Click "Add SSH Key"
- Paste the contents of `~/.ssh/id_ed25519.pub`
- Give it a memorable name (e.g., "laptop-ed25519")
3. Use the name in your config:
```bash
TF_VAR_ssh_key_names='["laptop-ed25519"]'
```
### DigitalOcean
1. Same key generation as above
2. Add to DigitalOcean:
- Go to [DO Settings](https://cloud.digitalocean.com/account/security)
- Click "Add SSH Key"
- Paste the public key contents
3. Use the fingerprint:
```bash
# Get the MD5 fingerprint (DigitalOcean identifies keys by MD5, not SHA256)
ssh-keygen -E md5 -lf ~/.ssh/id_ed25519.pub
# Example output: 256 MD5:aa:bb:cc:... your@email.com (ED25519)
# The fingerprint is the colon-separated part after MD5:
TF_VAR_ssh_key_fingerprints='["aa:bb:cc:..."]'
```
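
Picking the fingerprint out of the `ssh-keygen -l` output by eye is error-prone. The sketch below takes the second field of that output and strips the hash-algorithm prefix, whichever one it is; the function name `fingerprint_from_line` is hypothetical:

```bash
# Hypothetical helper: extract the bare fingerprint from one line of `ssh-keygen -l` output.
# Expected input shape: "<bits> <HASH>:<fingerprint> <comment> (<type>)"
fingerprint_from_line() {
  # Field 2 is "<HASH>:<fingerprint>"; remove everything up to the first colon.
  awk '{ sub(/^[^:]*:/, "", $2); print $2 }'
}

# Usage:
#   ssh-keygen -lf ~/.ssh/id_ed25519.pub | fingerprint_from_line
```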
## CI/CD Integration
For GitHub Actions or similar:
```yaml
# .github/workflows/deploy.yml
env:
  TF_VAR_hcloud_token: ${{ secrets.HCLOUD_TOKEN }}
  TF_VAR_venice_api_key: ${{ secrets.VENICE_API_KEY }}
  TF_VAR_ssh_key_names: '["deploy-key"]'
```
## Security Best Practices
1. **Never commit `.env` or `.tfvars` with secrets**
   - These files are in `.gitignore` by default
   - Double-check before committing
2. **Use least-privilege API tokens**
   - Hetzner: Create project-specific tokens
   - Venice: Regenerate keys periodically
3. **Rotate secrets if compromised**
   - Hetzner: Delete old token, create new one
   - Venice: Regenerate in settings
4. **Use Tailscale for remote access**
   - No public HTTPS exposure
   - Tailnet provides encryption and auth
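
The "double-check before committing" step can be partly automated. The sketch below is a plain text match against an ignore file and does not implement full gitignore semantics; the function name is hypothetical. Inside a repository, `git check-ignore -q .env` is the more authoritative test.

```bash
# Hypothetical pre-commit check: succeed only if the ignore file lists the given
# name on a line of its own (exact, literal match).
is_listed_in_gitignore() {
  local name="$1" ignore_file="${2:-.gitignore}"
  [ -f "$ignore_file" ] && grep -qxF -- "$name" "$ignore_file"
}

# Usage:
#   is_listed_in_gitignore .env || echo "WARNING: .env is not listed in .gitignore"
```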
## Advanced: SOPS Integration
For teams that want git-tracked encrypted secrets:
1. Install SOPS: `brew install sops` or `apt install sops`
2. Create an encrypted tfvars:
```bash
sops --encrypt --input-type binary --output-type binary secrets.tfvars > secrets.tfvars.encrypted
```
3. Decrypt at apply time:
```bash
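# Terraform cannot read a var file from stdin, so decrypt to a temporary file first
tmp="$(mktemp)" && sops --decrypt secrets.tfvars.encrypted > "$tmp" && terraform apply -var-file="$tmp"; rm -f "$tmp"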
```
This is overkill for solo use but useful for teams.

274
docs/SSH_GUIDE.md Normal file

@ -0,0 +1,274 @@
# SSH for Clients
**A simple guide to connecting to your server remotely.**
## What is SSH?
SSH (Secure Shell) is a way to control a computer from somewhere else. Think of it like remotely driving a car — you're in the driver's seat, but the car is somewhere else.
When you SSH into a server, you get a command line on that server. You can run commands, install software, check logs — everything you could do if you were physically sitting at that computer.
## Why Do You Need It?
For your OpenBoatmobile deployment, SSH is how you:
- Check if everything is running correctly
- View logs when something goes wrong
- Run maintenance commands
- Update configurations
## The Key Concept: Lock and Key
SSH uses two files that work together:
| File | Analogy | Where it lives |
|------|---------|----------------|
| **Private key** | Your house key | Your computer, never share |
| **Public key** | Your lock | The server, you can share |
**The private key stays with you.** The public key goes on the server.
When you connect, SSH checks: *Does your private key match the public key on the server?* If yes, you're allowed in. If no, access denied.
**Important:** Your private key is like your house key. Don't give it to anyone. Don't email it. Don't upload it anywhere.
## Step-by-Step: Setting Up SSH
### macOS / Linux
**1. Generate your keys:**
Open Terminal and run:
```bash
ssh-keygen -t ed25519 -C "your-email@example.com"
```
When prompted:
- Press Enter to accept the default location (`~/.ssh/id_ed25519`)
- Press Enter twice for no passphrase (or set one if you want extra security)
**2. See your public key:**
```bash
cat ~/.ssh/id_ed25519.pub
```
Copy the entire output — it starts with `ssh-ed25519` and ends with your email.
**3. Add your key to the cloud provider:**
**Hetzner:**
1. Go to [console.hetzner.cloud](https://console.hetzner.cloud/)
2. Navigate to Security → SSH Keys
3. Click "Add SSH Key"
4. Paste your public key
5. Give it a name (like "my-laptop")
6. Click "Add SSH Key"
**DigitalOcean:**
1. Go to [cloud.digitalocean.com](https://cloud.digitalocean.com/)
2. Navigate to Account → Security
3. Click "Add SSH Key"
4. Paste your public key
5. Give it a name
6. Click "Add SSH Key"
**4. Test your connection:**
After your server is deployed (via Terraform), connect:
```bash
# Username is 'openclaw' or 'hermes' depending on your framework
ssh <USERNAME>@your-server-ip
```
If successful, you'll see a command prompt from the remote server.
### Windows
**Option 1: PowerShell (Windows 10/11)**
Open PowerShell and follow the macOS/Linux steps above. Windows now includes OpenSSH by default.
**Option 2: PuTTY (older Windows)**
1. Download PuTTYgen from the [official PuTTY download page](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html)
2. Open PuTTYgen
3. Click "Generate" and move your mouse randomly
4. Click "Save private key" — save as `my-key.ppk`
5. Copy the text in "Public key for pasting" — this is your public key
6. Add this public key to your cloud provider (steps above)
To connect:
1. Open PuTTY
2. In "Host Name", enter: `<USERNAME>@your-server-ip` (username is 'openclaw' or 'hermes' depending on framework)
3. Go to Connection → SSH → Auth
4. Browse to your `.ppk` file
5. Click "Open"
### Key Already Exists?
If you've used SSH before (for GitHub, GitLab, etc.), you might already have a key:
```bash
# Check for existing keys
ls ~/.ssh
# If you see id_ed25519.pub, you're good
cat ~/.ssh/id_ed25519.pub
```
Use this existing key — no need to generate a new one.
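
The same check can be scripted. This sketch looks for a public key under a directory, trying a few common default file names in order; the function name `find_public_key` and the preference order are assumptions for illustration:

```bash
# Hypothetical helper: print the first public key found under a directory
# (default ~/.ssh), preferring ed25519 over older key types.
find_public_key() {
  local dir="${1:-$HOME/.ssh}" name
  for name in id_ed25519 id_ecdsa id_rsa; do
    if [ -f "$dir/$name.pub" ]; then
      printf '%s\n' "$dir/$name.pub"
      return 0
    fi
  done
  return 1
}

# Usage:
#   key="$(find_public_key)" && cat "$key" || echo "No key found; generate one first"
```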
## Connecting to Your Server
When Terraform finishes, it outputs your server IP:
```
server_ip = "123.45.67.89"
ssh_command = "ssh openclaw@123.45.67.89" # or "ssh hermes@123.45.67.89"
```
**Connect (username is 'openclaw' or 'hermes' based on framework):**
```bash
ssh <USERNAME>@123.45.67.89
```
**First time?** You'll see:
```
The authenticity of host '123.45.67.89' can't be established.
ED25519 key fingerprint is SHA256:xxxxx...
Are you sure you want to continue connecting (yes/no/[fingerprint])?
```
Type `yes` and press Enter. This happens once per server.
**Successful connection looks like:**
```
Welcome to Ubuntu 24.04 LTS
openclaw@openclaw-gateway:~$
```
You're now on the server! The prompt shows `username@hostname`.
## Common Commands
Once connected, here are useful commands:
```bash
# Check if OpenClaw is running
systemctl status openclaw-gateway
# View logs in real-time
journalctl -u openclaw-gateway -f
# Check Tailscale status (if using Tailscale)
sudo tailscale status
# Check disk space
df -h
# Check memory
free -h
# Exit the server
exit
```
## Troubleshooting
### "Permission denied (publickey)"
**Cause:** Your public key isn't on the server, or you're using the wrong username.
**Fix:**
1. Check your public key is added to the cloud provider
2. Make sure you're using `openclaw` or `hermes` as the username, depending on your framework (not your personal username)
3. If your key is in a non-standard location: `ssh -i ~/.ssh/my-key openclaw@server-ip`
### "Connection timed out"
**Cause:** Server isn't running, or firewall is blocking you.
**Fix:**
1. Check the server is running in your cloud console
2. Wait 2-3 minutes after deployment (cloud-init takes time)
3. Check your IP is in `ssh_allowed_ips` (or use `["0.0.0.0/0"]` to allow any IP, which exposes SSH to the whole internet)
### "Host key verification failed"
**Cause:** You've connected to this IP before, but the server was replaced.
**Fix:**
```bash
ssh-keygen -R 123.45.67.89
ssh openclaw@123.45.67.89
```
### "No such file or directory" for key
**Cause:** Your key is in a different location.
**Fix:**
```bash
# Find your key
find ~ -name "id_ed25519*" 2>/dev/null
# Use the correct path
ssh -i /path/to/your/key openclaw@server-ip
```
## Security Best Practices
| Practice | Why |
|----------|-----|
| Never share your private key | It's your identity. Anyone with it can access your servers. |
| Don't email your private key | Email isn't secure. |
| Use different keys for different purposes | If one is compromised, others remain safe. |
| Use a passphrase (optional) | Extra layer of protection if someone gets your key file. |
| Disable password login | Passwords can be guessed. Keys are far too long to guess. |
## What if I Lose My Key?
If you lose your private key, you can't SSH in. Your options:
1. **Use the cloud console** — Most providers have a "Console" or "VNC" option in the web interface. This gives you direct access.
2. **Add a new key** — Through the cloud console, you can add a new SSH key.
3. **Recreate the server** — Use `terraform destroy` and `terraform apply` again. Data will be lost.
## Need Help?
- Check the server logs: `journalctl -u openclaw-gateway -n50`
- Check cloud-init logs: `tail -f /var/log/cloud-init-output.log`
- See [TROUBLESHOOTING.md](./TROUBLESHOOTING.md) for more common issues
## Quick Reference
```bash
# Generate a new key
ssh-keygen -t ed25519 -C "your-email@example.com"
# View your public key
cat ~/.ssh/id_ed25519.pub
# Connect to server
ssh openclaw@server-ip
# Use a specific key file
ssh -i ~/.ssh/my-key openclaw@server-ip
# Remove a server from known hosts
ssh-keygen -R server-ip
# Copy files to server
scp myfile.txt openclaw@server-ip:/home/openclaw/
# Copy files from server
scp openclaw@server-ip:/home/openclaw/file.txt ./
```

167
docs/TAILSCALE_SETUP.md Normal file

@ -0,0 +1,167 @@
# Tailscale Setup
Tailscale provides secure remote access without exposing ports to the internet.
## Why Tailscale?
| Approach | Pros | Cons |
|----------|------|------|
| Tailscale | Free for personal use, encrypted, no port forwarding | Requires Tailscale account |
| SSH tunnel | No dependencies | Local only, manual setup |
| Public HTTPS | Works anywhere | Requires domain, SSL cert, security maintenance |
**Recommended:** Use Tailscale for production deployments.
## Prerequisites
- A Tailscale account ([sign up free](https://tailscale.com/))
- An auth key from the admin console
## Create Auth Key
1. Go to [Tailscale Admin](https://login.tailscale.com/admin/settings/keys)
2. Click **Generate auth key**
3. Settings:
- **Description:** "OpenBoatmobile-2024"
- **Reusable:** No (one server per key)
- **Ephemeral:** No (server should persist)
- **Tags:** Optional (e.g., `tag:servers`)
4. Click **Generate key**
5. Copy the key immediately (starts with `tskey-auth-`)
## Add to Configuration
In `.env`:
```bash
TF_VAR_enable_tailscale=true
TF_VAR_tailscale_auth_key=tskey-auth-xxxxx
```
Or in `terraform.tfvars`:
```hcl
enable_tailscale = true
tailscale_auth_key = "tskey-auth-xxxxx"
```
## Post-Deployment
After Terraform completes:
### 1. Enable Tailscale Serve
SSH into your server and run:
```bash
sudo tailscale serve --bg 18789
```
This exposes the OpenClaw gateway on your tailnet.
### 2. Enable "Serve" in Tailscale Admin
1. Go to [Tailscale Admin → Serve](https://login.tailscale.com/admin/settings/serve)
2. Enable the **Serve** feature
3. This allows serving HTTPS on your tailnet
### 3. Access Your Gateway
Visit: `https://<hostname>.<tailnet>.ts.net/`
Where:
- `<hostname>` is your server name (default: `openclaw-gateway`)
- `<tailnet>` is your tailnet name (e.g., `dragonfish-basilisk`)
Example: `https://openclaw-gateway.dragonfish-basilisk.ts.net/`
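
For scripts that need the address, the URL pattern above can be assembled from its two parts. A tiny sketch with a hypothetical function name; the default hostname is the one named above:

```bash
# Hypothetical helper: build the tailnet URL for the gateway.
#   $1 = server hostname (default: openclaw-gateway)
#   $2 = tailnet name (required)
gateway_url() {
  local hostname="${1:-openclaw-gateway}" tailnet="${2:?tailnet name required}"
  printf 'https://%s.%s.ts.net/\n' "$hostname" "$tailnet"
}

# Usage:
#   gateway_url openclaw-gateway dragonfish-basilisk
```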
## Verify Connection
### On the Server
```bash
# Check Tailscale status
sudo tailscale status
# Check serve status
sudo tailscale serve status
# Resolve a tailnet identity
tailscale whois <your-tailnet-ip>
```
### From Your Machine
1. Install Tailscale: [tailscale.com/download](https://tailscale.com/download)
2. Log in to the same account
3. Ping your server: `tailscale ping <hostname>`
4. Open the gateway in your browser
## Security Model
### Solo Tailnet (Recommended)
If you're the only person on your tailnet:
```hcl
# In terraform.tfvars (or via openclaw.json after deployment)
# The cloud-init config sets this automatically
```
- Your tailnet = your trust boundary
- No per-browser pairing required
- Only devices you authorize can access
### Multi-User Tailnet
If you share your tailnet with others:
1. Remove `dangerouslyDisableDeviceAuth` from the gateway config
2. Each browser must complete device pairing
3. Pairing requires approval: `openclaw pairing approve device <CODE>`
## Troubleshooting
### "Tailscale serve failed"
```bash
# Check if Tailscale is running
sudo tailscale status
# If not connected, reconnect
sudo tailscale up --authkey=tskey-auth-xxxxx
```
### "Serve platform not enabled"
- Go to [Tailscale Admin → Serve](https://login.tailscale.com/admin/settings/serve)
- Enable the Serve feature
### "Connection refused on tailnet"
```bash
# Verify gateway is listening
sudo lsof -i :18789
# If not listening, restart
sudo systemctl restart openclaw-gateway
```
### Gateway not accessible from browser
1. Verify Tailscale serve is running: `sudo tailscale serve status`
2. Check allowed origins in gateway config
3. Try accessing via `http://100.x.x.x:18789` (Tailscale IP)
## Advanced: Funnel (Public Access)
If you need public access (not recommended for most use cases):
```bash
# Enable Funnel for public HTTPS
sudo tailscale funnel --bg 18789
```
This creates a public URL: `https://<hostname>.<tailnet>.ts.net/`
**Warning:** This exposes your gateway to the internet. Use with caution.

304
docs/TROUBLESHOOTING.md Normal file

@ -0,0 +1,304 @@
# Troubleshooting
Common issues and their solutions.
## Deployment Issues
### Terraform Error: "Provider produced inconsistent result"
**Cause:** State file conflicts or provider version mismatch.
**Solution:**
```bash
terraform init -upgrade
terraform plan -refresh=false
```
### Terraform Error: "API Token invalid"
**Hetzner:**
- Token must have Read & Write permissions
- Copy immediately after creation (shown only once)
- Check for trailing spaces in `.env`
**DigitalOcean:**
- Regenerate token in DO Console
- Verify token has Read & Write scope
### Terraform Error: "SSH Key not found"
**Hetzner:**
- Key name must match exactly as shown in Console
- Case-sensitive
- Use the name: `TF_VAR_ssh_key_names='["my-key-name"]'`
**DigitalOcean:**
- Use the fingerprint, not the name
- Get the MD5 fingerprint, which is the form DigitalOcean uses: `ssh-keygen -E md5 -lf ~/.ssh/id_ed25519.pub` (drop the `MD5:` prefix)
- Format: `TF_VAR_ssh_key_fingerprints='["aa:bb:cc:..."]'`
### Terraform State Locked
**Cause:** Previous `terraform apply` crashed or is still running.
**Solution:**
```bash
# Force unlock (if sure no other apply is running)
terraform force-unlock <LOCK_ID>
```
## Connection Issues
### SSH Connection Refused
**Causes:**
1. Cloud-init still running
2. Firewall blocking your IP
3. Wrong SSH key
**Solutions:**
1. Wait 2-3 minutes after deployment, then retry
2. Check cloud-init status:
```bash
# On the server
cloud-init status
tail -f /var/log/cloud-init-output.log
```
3. Make sure the firewall allows your current public IP:
```bash
TF_VAR_ssh_allowed_ips='["your.public.ip/32"]'
```
4. Verify SSH key:
```bash
ssh-add -l # Should show your key
ssh -v openclaw@<ip> # Verbose output
```
### SSH Permission Denied
**Causes:**
1. Wrong username
2. Wrong SSH key
3. Key not added to agent
**Solutions:**
1. Username is `openclaw` or `hermes`, depending on framework (not `root`):
```bash
ssh <username>@<ip> # username is 'openclaw' or 'hermes' depending on framework
```
2. Verify key is correct:
```bash
ssh -i ~/.ssh/id_ed25519 openclaw@<ip>
```
3. Add key to agent:
```bash
ssh-add ~/.ssh/id_ed25519
```
### Connection Times Out
**Causes:**
1. Wrong IP
2. Server not running
3. Network issues
**Solutions:**
1. Verify IP from Terraform output:
```bash
terraform output server_ip
```
2. Check server status in cloud console
3. Try from different network (e.g., mobile hotspot)
## Cloud-Init Issues
### Cloud-init Stuck
**Check status:**
```bash
cloud-init status --wait
```
**Check logs:**
```bash
tail -f /var/log/cloud-init-output.log
```
**Common issues:**
- Network timeout downloading packages
- Package repository issues
- Disk space exhaustion
### OpenClaw Command Not Found
**Cause:** Cloud-init hasn't finished or failed.
**Solution:**
```bash
# Check if Node.js is installed
node --version
# Check if Node.js setup ran
ls /etc/apt/sources.list.d/nodesource.list
# Manually install if needed
curl -fsSL https://deb.nodesource.com/setup_24.x | sudo bash -
sudo apt-get install -y nodejs
```
### Disk Full
**Cause:** Small instance with lots of logs.
**Solution:**
```bash
# Check disk usage
df -h
# Clean package cache
sudo apt-get clean
# Remove old logs
sudo journalctl --vacuum-size=100M
```
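
To catch a full disk before it causes failures, the `df` check can be scripted. A sketch, assuming the POSIX `df -P` output layout in which the fifth column of the second line is the use percentage; both function names are hypothetical:

```bash
# Hypothetical helper: read `df -P <path>` output on stdin and print the
# use percentage as a bare number.
disk_use_percent() {
  awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed if usage (read from stdin) is at or above the
# threshold given in $1 (default 90).
disk_is_nearly_full() {
  local threshold="${1:-90}" used
  used="$(disk_use_percent)"
  [ -n "$used" ] && [ "$used" -ge "$threshold" ]
}

# Usage:
#   df -P / | disk_is_nearly_full 90 && sudo journalctl --vacuum-size=100M
```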
## Tailscale Issues
### Tailscale Not Connected
**Check status:**
```bash
sudo tailscale status
```
**Reconnect:**
```bash
sudo tailscale up --authkey=tskey-auth-xxxxx
```
### "Serve platform not enabled"
**Solution:**
1. Go to [Tailscale Admin → Serve](https://login.tailscale.com/admin/settings/serve)
2. Enable the Serve feature
### Gateway Not Accessible on Tailnet
**Check gateway:**
```bash
sudo lsof -i :18789
sudo systemctl status openclaw-gateway
```
**Check serve:**
```bash
sudo tailscale serve status
```
**Verify firewall:**
```bash
sudo ufw status
# Should show 18789 allowed on tailscale0
```
## Discord Issues
### Bot Doesn't Respond
**Check:**
1. Bot token is correct
2. Message Content Intent is enabled
3. Bot is in your server
4. Server ID and User ID are correct
**Debug:**
```bash
# Check gateway logs
journalctl -u openclaw-gateway -f | grep -i discord
```
### "Unauthorized" in Logs
**Cause:** Your user ID is not in the allowlist.
**Solution:**
Edit `~/.openclaw/openclaw.json` and add your Discord user ID:
```json
{
  "channels": {
    "discord": {
      "guilds": {
        "SERVER_ID": {
          "users": ["YOUR_USER_ID"]
        }
      }
    }
  }
}
```
### Gateway Shows Pairing Code
**Solution:**
```bash
# On the server
openclaw pairing approve device <CODE>
```
## Performance Issues
### Gateway Slow to Respond
**Causes:**
1. High model load
2. Network latency
3. Instance too small
**Solutions:**
1. Check CPU and memory load on the server:
```bash
top
htop
```
2. Check network:
```bash
ping api.venice.ai
```
3. Upgrade instance:
```bash
# Edit terraform.tfvars
server_type_hetzner = "cpx21" # 3 vCPU, 4 GB RAM; choose cpx31 for 8 GB RAM
terraform apply
```
### Memory Exhaustion
**Check:**
```bash
free -h
```
**Solution:**
```bash
# Add swap (if not present)
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```
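
Running the commands above twice would fail on the existing file and append a duplicate line to `/etc/fstab`, so the "if not present" condition is worth making explicit. A sketch of such a guard, assuming the Linux `/proc/swaps` layout (one header line, then one line per active swap area); the function name is hypothetical:

```bash
# Hypothetical guard: succeed if the swaps table (default /proc/swaps) lists
# at least one active swap area after its header line.
has_active_swap() {
  local table="${1:-/proc/swaps}"
  [ -r "$table" ] && awk 'NR > 1 { found = 1 } END { exit !found }' "$table"
}

# Usage:
#   has_active_swap || echo "No swap configured; run the commands above"
```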
## Getting Help
1. Check OpenClaw docs: [docs.openclaw.ai](https://docs.openclaw.ai)
2. Search GitHub issues: [github.com/openclaw/openclaw](https://github.com/openclaw/openclaw)
3. Community Discord: [discord.com/invite/clawd](https://discord.com/invite/clawd)