Comprehensive guide covering: - llama.cpp and MLX (omlx) setup on Apple Silicon - Model selection and memory optimization (quantized KV cache) - Real benchmarks on M5 Max comparing both backends - Hermes connection instructions Cherry-picked from PR #2590. |
||
|---|---|---|
| .. | ||
| developer-guide | ||
| getting-started | ||
| guides | ||
| integrations | ||
| reference | ||
| user-guide | ||
| index.md | ||