# DETAILS.md — Technical Reference
Companion to [README.md](README.md). This file contains the full technical details that would normally live in a professional README — architecture, API surface, configuration schema, build system, and development workflows. Intended for developers, contributors, and automated tooling.
---
## Overview
`obm` is a Go CLI that generates Terraform-compatible `.env` files through an interactive walkthrough or a YAML config file. It validates API credentials against live endpoints before writing config, and wraps the OpenTofu/Terraform lifecycle commands (init, apply, destroy), preferring `tofu` and falling back to `terraform`.
- **Language:** Go 1.22
- **Module:** `github.com/openboatmobile/obm`
- **Dependencies:** `gopkg.in/yaml.v3` (sole external dependency)
- **Binary:** Single statically-linked binary, zero runtime dependencies
- **Current version:** See `VERSION` file (0.1.0 at time of writing)
---
## Architecture
```
obm/
├── cmd/obm/main.go           # CLI entry point, subcommand routing
├── internal/
│   ├── config/
│   │   ├── config.go         # Config struct, GetValue/SetValue
│   │   ├── deployment.go     # DeploymentConfig, AdminUser(), MonthlyCostEstimate()
│   │   ├── dotenv.go         # DotEnvFile parser (round-trip .env read)
│   │   ├── dotenv_writer.go  # WriteDotEnv — grouped, commented .env output
│   │   ├── schema.go         # VarDef schema (all TF_VAR_ variables), VarGroup enum
│   │   ├── tfvars.go         # WriteTfVars — HCL-format tfvars output
│   │   ├── yaml.go           # YAMLConfig struct, LoadYAMLConfig(), ToDeploymentConfig()
│   │   ├── config_test.go
│   │   └── yaml_test.go
│   ├── deploy/
│   │   └── deploy.go         # Walkthrough orchestrator (Run, RunFromFile, RunWithConfig)
│   ├── destroy/
│   │   ├── destroy.go        # Terraform destroy with state parsing, confirmation
│   │   └── destroy_test.go
│   ├── inference/
│   │   ├── client.go         # HTTP client, ValidateAPIKey(), ValidationResult
│   │   ├── inference.go      # Provider enum, ProviderConfig, FallbackChain, DefaultGLMConfig
│   │   ├── client_test.go
│   │   └── inference_test.go
│   ├── prompt/
│   │   ├── prompt.go         # Terminal I/O: Select, Confirm, Input, Password, color helpers
│   │   └── prompt_test.go
│   ├── provider/
│   │   ├── provider.go       # Provider interface, BaseProvider, Registry, Register/Get
│   │   ├── import.go         # Provider registration (blank import guidance)
│   │   ├── hetzner/
│   │   │   ├── hetzner.go    # HetznerProvider: API validation, SSH key listing
│   │   │   └── hetzner_test.go
│   │   └── provider_test.go
│   ├── terraform/
│   │   └── terraform.go      # Runner: Init, Plan, Apply, Destroy (OpenTofu/Terraform)
│   └── validation/
│       ├── validation.go     # Check interface, Runner, CheckResult, Status enum
│       └── validation_test.go
├── scripts/
│   ├── install.sh            # curl | sh installer
│   └── release.sh            # Tag + push release automation
├── .github/workflows/
│   ├── ci.yml                # Test + build on push/PR
│   └── release.yml           # Cross-compile + GitHub Release on tag
├── Makefile                  # Build, test, lint, cross-compile targets
├── Dockerfile                # Multi-stage: golang:1.22-alpine → alpine:3.20
├── deploy.yaml.example       # Full YAML config reference
├── CHANGELOG.md
├── CONTRIBUTING.md
└── VERSION                   # Single line, e.g. "0.1.0"
```
---
## CLI Interface
### Subcommands
| Command | Flags | Description |
|---------|-------|-------------|
| `obm deploy` | `--config <path>` | Interactive walkthrough (default) or non-interactive from YAML |
| `obm validate` | `--env-file <path>` | Load `.env`, check required vars, validate API keys |
| `obm status` | — | Show deployment state (not yet implemented) |
| `obm destroy` | `--keep-state` | Confirmation prompt → `tofu destroy` → state cleanup (skipped with `--keep-state`) |
| `obm version` | — | Print version with commit hash and build time |
| `obm help` | — | Print usage |
### Build-time variables
Injected via `-ldflags` at build time:
| Variable | Flag | Example |
|----------|------|---------|
| `main.version` | `-X main.version=0.1.0` | Semver from `VERSION` file |
| `main.gitCommit` | `-X main.gitCommit=abc1234` | Short commit SHA |
| `main.buildTime` | `-X main.buildTime=2026-05-22T15:30:00Z` | UTC ISO timestamp |
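As a rough illustration of how these three variables could be declared and consumed, here is a minimal sketch. The default values (`dev`, `unknown`) and the `versionString` helper with its output format are assumptions for the example, not taken from `cmd/obm/main.go`.

```go
package main

import "fmt"

// Package-level string variables can be overridden at link time with
// -ldflags "-X main.version=... -X main.gitCommit=... -X main.buildTime=...".
// The defaults below are assumed placeholders for un-stamped local builds.
var (
	version   = "dev"
	gitCommit = "unknown"
	buildTime = "unknown"
)

// versionString is a hypothetical helper; the real output format may differ.
func versionString() string {
	return fmt.Sprintf("obm %s (commit %s, built %s)", version, gitCommit, buildTime)
}

func main() {
	fmt.Println(versionString())
}
```

The `-X` flag only affects string variables, which is why all three values are plain strings rather than a struct or a `time.Time`.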
---
## Deploy Walkthrough Flow
The `obm deploy` interactive flow runs 8 steps in sequence:
1. **Framework** — Select Hermes or OpenClaw. Sets framework-specific defaults on `DeploymentConfig`.
2. **Cloud Provider** — Hetzner or DigitalOcean.
3. **Provider Token** — Enter API token. For Hetzner, validates against `/server_types` endpoint and lists SSH keys.
4. **SSH Key** — Select from keys found on the provider, or enter manually.
5. **Server Config** — Name, location/region, server type/droplet size.
6. **Inference Provider** — ZAI, Venice, OpenRouter. Enter API key. Validates against `/models` endpoint.
7. **Tailscale** — Optional VPN setup. Auth key and tailnet domain.
8. **Discord** — Optional bot integration. Bot token, server ID, user IDs.
After the last step, `obm` displays a summary with a cost estimate, asks for confirmation, writes `.env`, and optionally runs `tofu init && tofu apply` (falling back to `terraform` when `tofu` is not installed).
### Framework-specific defaults
**Hermes:**
- `DockerEnabled = true`
- `VeniceBaseURL = "https://api.venice.ai/api/v1"`
- `GatewayAllowAllUsers = true`
- `DiscordAutoThread = true`
**OpenClaw:**
- `OpenClawVersion = "lts"`
- `NodeVersion = "22"`
- `EnableSwap = true`, `SwapSizeGB = 2`
- `EnableFail2ban = true`, `EnableUnattendedUpgrades = true`
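The defaults listed above could be applied along the following lines. This is a sketch only: the struct is a trimmed stand-in for `DeploymentConfig`, the field types are guessed from the listed values, and `applyFrameworkDefaults` is a hypothetical function name.

```go
package main

import "fmt"

// DeploymentConfig is a trimmed stand-in holding only the fields named above.
// Field types are assumptions inferred from the documented default values.
type DeploymentConfig struct {
	Framework string

	// Hermes
	DockerEnabled        bool
	VeniceBaseURL        string
	GatewayAllowAllUsers bool
	DiscordAutoThread    bool

	// OpenClaw
	OpenClawVersion          string
	NodeVersion              string
	EnableSwap               bool
	SwapSizeGB               int
	EnableFail2ban           bool
	EnableUnattendedUpgrades bool
}

// applyFrameworkDefaults (hypothetical name) sets the per-framework defaults.
func applyFrameworkDefaults(cfg *DeploymentConfig) error {
	switch cfg.Framework {
	case "hermes":
		cfg.DockerEnabled = true
		cfg.VeniceBaseURL = "https://api.venice.ai/api/v1"
		cfg.GatewayAllowAllUsers = true
		cfg.DiscordAutoThread = true
	case "openclaw":
		cfg.OpenClawVersion = "lts"
		cfg.NodeVersion = "22"
		cfg.EnableSwap = true
		cfg.SwapSizeGB = 2
		cfg.EnableFail2ban = true
		cfg.EnableUnattendedUpgrades = true
	default:
		return fmt.Errorf("unknown framework %q", cfg.Framework)
	}
	return nil
}

func main() {
	cfg := &DeploymentConfig{Framework: "openclaw"}
	if err := applyFrameworkDefaults(cfg); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *cfg)
}
```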
---
## Key Types
### DeploymentConfig (`internal/config/deployment.go`)
Central struct holding all walkthrough choices. 30+ fields covering framework, provider, server, inference, Tailscale, Discord, and gateway configuration.
Key methods:
- `AdminUser() string` — returns framework name ("hermes" or "openclaw")
- `MonthlyCostEstimate() string` — returns price string based on server type/droplet size
Package-level helper functions (not methods, because `DeploymentConfig` is in `config` but called from `deploy`):
- `config.LocationOrRegion(cfg)` — Hetzner location or DO region
- `config.ServerTypeOrDroplet(cfg)` — server type or droplet size
- `config.SSHKeySummary(cfg)` — masked SSH key display
### DotEnvFile (`internal/config/dotenv.go`)
Round-trip parser for `.env` files:
- `ParseDotEnv(path)` — parse `TF_VAR_`-prefixed env file
- `env.GetVar(name)` — lookup with and without `TF_VAR_` prefix
- `env.Values` — raw `map[string]string`
### VarDef Schema (`internal/config/schema.go`)
Complete schema of all Terraform variables with metadata:
```go
type VarDef struct {
Name string // TF variable name (e.g. "cloud_provider")
Type ValueType // string, number, bool, list
Default string
Required bool
Sensitive bool
Description string
Group VarGroup // Section for organized output
EnvComment string // Additional .env hint
}
```
Variables are grouped: `PROVIDER`, `PROVIDER — Hetzner`, `PROVIDER — DigitalOcean`, `SERVER CONFIGURATION`, `SSH CONFIGURATION`, `API KEYS`, `MODEL CONFIGURATION`, `DISCORD`, `TAILSCALE`, `HERMES-SPECIFIC`, `OPENCLAW-SPECIFIC`, `SECURITY`, `PROJECT METADATA`.
### InferenceClient (`internal/inference/client.go`)
HTTP client for validating inference API keys:
- `ValidateAPIKey(ctx, provider, apiKey)` — hits `/models` endpoint, checks for HTTP 200
- Returns `ValidationResult{Valid, ErrorMessage, ModelCount, Latency}`
- 30-second default timeout
### Provider Interface (`internal/provider/provider.go`)
```go
type Provider interface {
	Name() string
	ProviderName() string
	Validate(ctx context.Context) error
	Checks(ctx context.Context) []validation.Check
	TokenEnvKey() string
	SetToken(token string)
	GetToken() string
}
```
Provider registry pattern: `Register(name, factory)` / `Get(name)`. Hetzner implementation at `internal/provider/hetzner/`.
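One common way to build such a registry is a mutex-guarded map of factory functions populated from `init()`, which fits the blank-import guidance mentioned for `import.go`. The sketch below assumes that shape; the `Factory` type, the error return of `Get`, and the trimmed two-method interface are illustrative choices, not the package's confirmed API.

```go
package main

import (
	"fmt"
	"sync"
)

// Provider is trimmed to two methods here; the full interface is shown above.
type Provider interface {
	Name() string
	SetToken(token string)
}

// Factory builds a fresh Provider instance (assumed signature).
type Factory func() Provider

var (
	mu       sync.RWMutex
	registry = map[string]Factory{}
)

// Register associates a provider name with its factory.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	registry[name] = f
}

// Get returns a new instance of the named provider, or an error if unknown.
func Get(name string) (Provider, error) {
	mu.RLock()
	f, ok := registry[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("provider %q is not registered", name)
	}
	return f(), nil
}

// hetzner is a toy implementation used only to exercise the registry.
type hetzner struct{ token string }

func (h *hetzner) Name() string      { return "hetzner" }
func (h *hetzner) SetToken(t string) { h.token = t }

// In a real layout this init would live in the provider's own package and
// run when that package is blank-imported.
func init() {
	Register("hetzner", func() Provider { return &hetzner{} })
}

func main() {
	p, err := Get("hetzner")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.Name())
}
```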
### Validation Framework (`internal/validation/validation.go`)
Structured check system with `Runner`:
```go
type Check interface {
	Name() string
	Category() CheckCategory
	Run(ctx context.Context) CheckResult
}
```
Status values: `PASS`, `FAIL`, `WARN`, `SKIP`, `ERROR`.
Categories: `Credentials`, `Connectivity`, `SSH Keys`, `Server Config`, `Quotas`, `Account`.
---
## Inference Providers
Currently supported:
| Provider | Enum | Base URL | Auth |
|----------|------|----------|------|
| Z.ai | `ProviderZAI` | `https://api.z.ai/api/coding/paas/v4` | `GLM_API_KEY` env |
| Venice.ai | `ProviderVenice` | `https://api.venice.ai/api/v1` | `VENICE_API_KEY` env |
| OpenRouter | `ProviderOpenRouter` | `https://openrouter.ai/api/v1` | `OPENROUTER_API_KEY` env |
Fallback chains: ZAI → Venice → OpenRouter (for GLM models); Venice → OpenRouter.
`DefaultGLMConfig()` sets `MaxTokens=16384` to prevent the over-compression bug where Venice defaults to 131K.
---
## .env Generation
`WriteDotEnv()` in `internal/config/dotenv_writer.go` generates the `.env` file from a `Config`. Output format:
- Header comment with usage instructions
- Variables grouped by `VarGroup`, each with description comment
- `TF_VAR_` prefix on all variable names
- JSON arrays for SSH keys: `TF_VAR_ssh_key_names='["key-name"]'` (single-quoted shell string containing JSON)
- Sensitive values get `YOUR_..._HERE` placeholders if empty
- `WriteTfVars()` generates HCL-format `terraform.tfvars` as an alternative
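The JSON-array convention for list values can be sketched as follows. The helper names `envLine` and `envListLine` are hypothetical, and the single-quote escaping is an extra precaution added for the example rather than documented `WriteDotEnv()` behaviour.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// envLine renders one assignment with the TF_VAR_ prefix.
func envLine(name, value string) string {
	return fmt.Sprintf("TF_VAR_%s=%s", name, value)
}

// envListLine encodes items as a JSON array and wraps it in single quotes so
// the shell passes the JSON through unchanged when the file is sourced.
func envListLine(name string, items []string) (string, error) {
	if items == nil {
		items = []string{} // encode as [] rather than null
	}
	b, err := json.Marshal(items)
	if err != nil {
		return "", err
	}
	// Escape embedded single quotes for POSIX shells: ' becomes '\''.
	quoted := "'" + strings.ReplaceAll(string(b), "'", `'\''`) + "'"
	return envLine(name, quoted), nil
}

func main() {
	line, err := envListLine("ssh_key_names", []string{"key-name"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(line)
}
```

Encoding lists as JSON works because Terraform and OpenTofu accept JSON-compatible syntax for list-typed variables supplied through `TF_VAR_` environment variables.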
### Variable flow
User input → `DeploymentConfig` → `.env` (TF_VAR_ prefixed) → `source .env` → Terraform reads env vars → `templatefile()` → cloud-init → server provisioning.
---
## YAML Config (Non-interactive Mode)
`LoadYAMLConfig(path)` parses a YAML file into `YAMLConfig`, then `ToDeploymentConfig()` converts to `DeploymentConfig`. Schema:
```yaml
framework: hermes | openclaw
provider:
  name: hetzner | digitalocean
  token: "..."
ssh:
  names: [...]
  fingerprints: [...]
server:
  name: "..."
  location: "ash" | "fsn1" | "nbg1" | "hel1"
  type: "cpx21" | ...
inference:
  provider: venice | openrouter | openai | anthropic | custom
  api_key: "..."
  primary_model: "..."
tailscale:
  enabled: true
  auth_key: "..."
  tailnet: "..."
discord:
  enabled: true
  bot_token: "..."
  server_id: "..."
```
Full example: [`deploy.yaml.example`](deploy.yaml.example).
---
## Build System
### Makefile targets
| Target | Description |
|--------|-------------|
| `make build` | Build binary for current platform |
| `make test` | Run tests with race detection and coverage |
| `make lint` | `go vet` + `gofmt` |
| `make vet` | Run `go vet` |
| `make fmt` | Format with `gofmt` |
| `make clean` | Remove binary and coverage files |
| `make cross-compile` | Build for linux-amd64, linux-arm64, darwin-arm64, windows-amd64, windows-arm64 |
| `make version` | Print VERSION file contents |
### Version injection
```bash
VERSION=$(cat VERSION)
GIT_COMMIT=$(git rev-parse --short HEAD)
BUILD_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
LDFLAGS="-s -w -X main.version=$VERSION -X main.gitCommit=$GIT_COMMIT -X main.buildTime=$BUILD_TIME"
# Linker flags must be passed through -ldflags as a single quoted argument.
go build -ldflags "$LDFLAGS" -o obm ./cmd/obm
```
### Docker
Multi-stage Dockerfile: `golang:1.22-alpine` (build) → `alpine:3.20` (runtime). Non-root user `obm:1000`. Entry point: `obm --help`.
---
## CI/CD
### CI (`ci.yml`)
Triggers on push/PR to main. Runs: `go vet` → `go test` → `gofmt` check → build. Build matrix: linux/darwin/windows × amd64/arm64 (excludes darwin/amd64).
### Release (`release.yml`)
Triggers on tag push (`v*`). Builds cross-compiled binaries, creates archives (.tar.gz for Unix, .zip for Windows), generates SHA256 checksums, and creates a GitHub Release with the archives and checksums attached.
### Release process
```bash
./scripts/release.sh v0.2.0
# This: validates version → runs tests → updates VERSION → commits → tags → pushes tag
```
Pre-release versions (containing hyphen, e.g. `v1.0.0-beta.1`) are marked as pre-release on GitHub.
---
## Cost Estimation
`DeploymentConfig.MonthlyCostEstimate()` maps server types to price strings.
Hetzner prices (current at time of writing):
| Type | Price |
|------|-------|
| cx22 | €3.79/mo |
| cx23 | €5.83/mo |
| cpx21 | €4.49/mo |
| cpx31 | €8.98/mo |
| cpx41 | €17.96/mo |
DigitalOcean prices:
| Size | Price |
|------|-------|
| s-1vcpu-1gb | $6/mo |
| s-1vcpu-2gb | $12/mo |
| s-2vcpu-4gb | $24/mo |
| s-4vcpu-8gb | $48/mo |
| g-2vcpu-8gb | $63/mo |
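A table-driven lookup is one straightforward way to implement this mapping. The sketch below copies the prices from the tables above; the free-function signature and the `"unknown"` fallback for unlisted sizes are assumptions, since the real method hangs off `DeploymentConfig`.

```go
package main

import "fmt"

// Price tables copied from the documentation above.
var hetznerPrices = map[string]string{
	"cx22":  "€3.79/mo",
	"cx23":  "€5.83/mo",
	"cpx21": "€4.49/mo",
	"cpx31": "€8.98/mo",
	"cpx41": "€17.96/mo",
}

var digitalOceanPrices = map[string]string{
	"s-1vcpu-1gb": "$6/mo",
	"s-1vcpu-2gb": "$12/mo",
	"s-2vcpu-4gb": "$24/mo",
	"s-4vcpu-8gb": "$48/mo",
	"g-2vcpu-8gb": "$63/mo",
}

// monthlyCostEstimate returns the price string for a provider/size pair, or
// "unknown" (an assumed fallback) when the pair is not in the tables.
func monthlyCostEstimate(provider, size string) string {
	var table map[string]string
	switch provider {
	case "hetzner":
		table = hetznerPrices
	case "digitalocean":
		table = digitalOceanPrices
	}
	if price, ok := table[size]; ok { // reading a nil map is safe in Go
		return price
	}
	return "unknown"
}

func main() {
	fmt.Println(monthlyCostEstimate("hetzner", "cpx21"))
	fmt.Println(monthlyCostEstimate("digitalocean", "s-2vcpu-4gb"))
}
```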
---
## Terraform / OpenTofu Integration
`obm` generates the `.env` file that OpenTofu (or Terraform) expects. The actual infrastructure configs live in the separate [openboatmobile-ai](https://github.com/openboatmobile/openboatmobile-ai) repo.
The `internal/terraform/terraform.go` wrapper auto-detects the binary (OpenTofu first, Terraform as fallback) and provides:
- `Runner.Init()` → `tofu init -input=false`
- `Runner.Plan(destroy bool)` → `tofu plan` (with optional `-destroy` flag)
- `Runner.Apply()` → `tofu apply -auto-approve`
- `Runner.Destroy()` → `tofu destroy -auto-approve`
All commands run in the `WorkDir` and capture combined output.
### Variable flow to cloud-init
User sets `TF_VAR_*` env vars → sourced from `.env` → Terraform reads them → injected into cloud-init templates via `templatefile()` → written to server during provisioning.
### Cloud-init outputs
**Hermes** (`userdata-hermes.tpl`):
- `/home/<admin_user>/.hermes/.env` — API keys, Discord token, gateway token
- `/home/<admin_user>/.hermes/config.yaml` — model config, Discord channels
- `/home/<admin_user>/.hermes/SOUL.md` — agent personality template
- `/home/<admin_user>/docker-compose.yml` — Docker mode only
- `/etc/systemd/system/hermes.service` — systemd unit
- `/usr/local/bin/hermes-health-check.sh` — diagnostic script
**OpenClaw** (`userdata-openclaw.tpl`):
- `/etc/openclaw.env` — secrets (0600, root-owned)
- `/home/<admin_user>/.openclaw/openclaw.json` — full config
- `/etc/systemd/system/openclaw-gateway.service`
- `/usr/local/bin/openclaw-health-check.sh`
---
## Destroy Flow
`obm destroy` reads `terraform.tfstate` to list resources, shows them, asks for confirmation, runs `tofu destroy`, then cleans up state files unless `--keep-state` is set.
---
## Testing
```bash
make test # Full suite with race detection + coverage
go test ./... # Without make
go test -v -race -coverprofile=coverage.out ./...
```
Test coverage includes: config parsing, dotenv round-tripping, YAML loading, inference client validation, Hetzner provider validation, destroy workflow, prompt helpers.
---
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md) for the full guide including development setup, PR checklist, and release process.
Key points:
- Go 1.22+ required
- Run `make lint test` before pushing
- Never push directly to `main`
- Feature branches: `git checkout -b <task-name>`