## Kani-TTS-2 Research

- Evaluated Kani-TTS-2 as potential TTS upgrade (3-4x faster, RTF 0.2)
- Documented benefits: zero-shot voice cloning, Apache 2.0 license, 3GB VRAM
- Identified Windows compatibility issues (pynini compilation failures)
- Created test script for future evaluation when Windows support improves

## RTX 5090 Critical Finding

- Discovered RTX 5090 (Blackwell sm_120) not supported by PyTorch
- Tested stable (2.6.0) and nightly (2.7.0.dev) - both lack sm_120 support
- Documented impact: GPU acceleration unavailable for STT/TTS
- Performance degradation: 3.5s target → 10-15s actual (CPU-only)

## Files Added

- KANI_TTS_EVALUATION.md - Comprehensive Kani-TTS-2 analysis
- RTX_5090_BLOCKER.md - GPU compatibility report with solutions
- test_kani_tts.py - Benchmark script for future testing
- fix_pytorch_cuda.bat - GPU setup script (for when support lands)

## Recommendations

- Wait 1-3 months for PyTorch sm_120 support
- Monitor PyTorch releases weekly
- Alternative: Cloud GPU (RTX 4090) or different local GPU
- Current: CPU-only mode functional but slow

## Next Steps

- Monitor: https://github.com/pytorch/pytorch/releases
- Test when available: pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu124
- Re-evaluate Kani-TTS-2 after GPU support

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

# RTX 5090 Compatibility Blocker

**Date:** February 16, 2026
**GPU:** NVIDIA GeForce RTX 5090 (32GB VRAM, Blackwell sm_120)
**Status:** ❌ **BLOCKED - No PyTorch Support**

---

## Critical Finding

The **RTX 5090 is too new** for the PyTorch builds tested. Both the stable and the nightly build, each a CUDA 12.4 (`cu124`) wheel, fail with:

```
RuntimeError: CUDA error: no kernel image is available for execution on the device

NVIDIA GeForce RTX 5090 with CUDA capability sm_120 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_50 sm_60 sm_61 sm_70 sm_75 sm_80 sm_86 sm_90.
```

**Tested Versions:**
- ❌ PyTorch 2.6.0+cu124 (Stable) - No sm_120 support
- ❌ PyTorch 2.7.0.dev20250310+cu124 (Nightly) - No sm_120 support

Both builds target CUDA 12.4. Blackwell (sm_120) support first appears in CUDA Toolkit 12.8, so a `cu124` wheel cannot contain sm_120 kernels regardless of the PyTorch version.

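
The mismatch reported in the error message can also be checked before running a model. The following is a minimal sketch, assuming an exact `sm_XY` match is what matters (it ignores any PTX `compute_XY` entries a build might also list). `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list` are real PyTorch calls; the helper names `capability_to_arch`, `build_supports`, and `check_installed_torch` are illustrative and not part of the bot.

```python
from typing import Sequence, Tuple


def capability_to_arch(capability: Tuple[int, int]) -> str:
    """Convert a (major, minor) compute capability into an 'sm_XY' name."""
    major, minor = capability
    return f"sm_{major}{minor}"


def build_supports(capability: Tuple[int, int], arch_list: Sequence[str]) -> bool:
    """Return True if the build lists kernels for this exact architecture."""
    return capability_to_arch(capability) in arch_list


def check_installed_torch() -> None:
    """Print whether the installed PyTorch build targets the local GPU."""
    import torch  # imported lazily so the helpers above work without PyTorch

    if not torch.cuda.is_available():
        print("CUDA is not available in this build, or no GPU was detected")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    status = "supported" if build_supports(capability, arch_list) else "NOT supported"
    print(f"{capability_to_arch(capability)} is {status}; build targets: {arch_list}")
```

Calling `check_installed_torch()` after each PyTorch upgrade gives a quick yes/no answer without loading the STT or TTS models.
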
---

## Impact on Your Voice Bot

### Currently Affected

No component can use GPU acceleration. STT and TTS fall back to CPU, and Kani-TTS-2 cannot be tested at all:

| Component | Current Status | Impact |
|-----------|---------------|--------|
| **faster-whisper STT** | CPU-only | 3-5x slower (550ms → ~2s) |
| **Coqui XTTS v2 TTS** | CPU-only | 2-3x slower (1.6s → ~4-5s) |
| **Kani-TTS-2 testing** | Blocked | Cannot evaluate |
| **Total latency** | ~10-15s | vs target 3.5s ❌ |

### What Still Works

- ✅ Discord bot (voice receiving/sending)
- ✅ OpenClaw Gateway (LLM inference)
- ✅ VAD (Silero, CPU-based)
- ✅ Smart Turn v3 (ONNX, CPU-based)
- ⚠️ STT/TTS (fallback to CPU, very slow)

---

## Solutions

### Option 1: Wait for PyTorch Support (Recommended)

**Timeline:** 1-3 months (estimated)

**Reason:** The RTX 5090 was released in January 2025, and PyTorch typically adds support for a new GPU architecture within 2-4 months.

**Monitor:**
- [PyTorch Releases](https://github.com/pytorch/pytorch/releases)
- [PyTorch CUDA Support](https://pytorch.org/get-started/locally/)

**Action:**
- Check weekly for PyTorch updates
- Subscribe to PyTorch announcements
- Test with a CUDA 12.8 build (`cu124` wheels cannot include sm_120 kernels): `pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128`

### Option 2: Build PyTorch from Source (Advanced)

**Difficulty:** High
**Time:** 4-8 hours
**Risk:** The build fails if the installed CUDA Toolkit cannot target sm_120

**Steps:**
1. Install CUDA Toolkit 12.8 or newer (the first release that can target sm_120)
2. Clone PyTorch:
   ```bash
   git clone --recursive https://github.com/pytorch/pytorch
   cd pytorch
   ```
3. Build with sm_120:
   ```bash
   pip install -r requirements.txt
   # Architectures to compile kernels for; 12.0 is the RTX 5090 (sm_120)
   export TORCH_CUDA_ARCH_LIST="5.0;6.0;7.0;7.5;8.0;8.6;9.0;12.0"
   python setup.py install
   ```
4. Test with the verification commands listed under "When PyTorch Supports sm_120"

**Resources:**
- [Building PyTorch from Source](https://github.com/pytorch/pytorch#from-source)

### Option 3: Use Different GPU

**If available**, use an older GPU for development:

| GPU | CUDA Capability | PyTorch Support | Recommendation |
|-----|-----------------|-----------------|----------------|
| RTX 4090 | sm_89 | ✅ Full support | ✅ Ideal for development |
| RTX 4080 | sm_89 | ✅ Full support | ✅ Good alternative |
| RTX 4070 Ti | sm_89 | ✅ Full support | ✅ Sufficient for voice bot |
| RTX 3090 | sm_86 | ✅ Full support | ✅ Works well |

**Action:**
- Check whether you have access to an RTX 40-series or 30-series GPU
- Use it for development until RTX 5090 support lands

### Option 4: Run in Cloud with Supported GPU

**Platforms:**
- **RunPod** - RTX 4090 @ $0.79/hr
- **Vast.ai** - RTX 4090 @ $0.40-0.60/hr
- **Google Colab Pro** - A100/V100 @ $10/month

**Pros:**
- Immediate GPU access
- Supported hardware
- Test optimizations quickly

**Cons:**
- Ongoing cost
- Need to upload code/data
- Network latency for Discord bot

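
To weigh the ongoing cost against waiting, the hourly rates above can be turned into a rough total. This is a back-of-the-envelope sketch: the rates come from the platform list, while `HOURS_PER_WEEK` and `WEEKS` are assumed usage figures that should be replaced with your own.

```python
def cloud_cost(rate_per_hour: float, hours_per_week: float, weeks: float) -> float:
    """Total rental cost in dollars for a given usage pattern."""
    return rate_per_hour * hours_per_week * weeks


# Usage figures below are assumptions, not measurements.
HOURS_PER_WEEK = 10  # assumed GPU testing time per week
WEEKS = 12           # roughly the upper end of the 1-3 month wait

for name, rate in [("RunPod", 0.79), ("Vast.ai (low)", 0.40), ("Vast.ai (high)", 0.60)]:
    print(f"{name}: ${cloud_cost(rate, HOURS_PER_WEEK, WEEKS):.2f}")
```

Under these assumptions the hourly platforms land between roughly $48 and $95 for the whole waiting period.
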
### Option 5: CPU-Only (Temporary Workaround)

**Use case:** Testing logic while waiting for GPU support

**Current setup** (already done):
```bash
# CPU-only wheels; the explicit index avoids pulling a CUDA build
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
```

**Performance:**
- STT: ~2-3s (vs 0.3s target)
- TTS: ~4-5s (vs 0.9s target)
- Total: ~10-15s (vs 3.5s target)

**Acceptable for:**
- Testing conversation flow
- Debugging bot logic
- Development (not production)

---

## Recommended Action Plan

### Immediate (This Week)

1. ✅ **Roll back to CPU PyTorch** for development:
   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
   ```

2. ✅ **Focus on non-GPU optimizations**:
   - Query routing (Haiku vs Sonnet vs Opus)
   - TTS caching
   - Sentence-level streaming
   - Response filtering

3. ✅ **Test bot functionality** with CPU (slow but works)

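
Of the non-GPU optimizations in step 2, TTS caching is the simplest to illustrate: repeated phrases skip the 4-5s CPU synthesis entirely. The sketch below is one possible shape, not the bot's actual implementation. `TTSCache` and the `synthesize(text, voice)` callable are hypothetical, and normalizing case and whitespace in the key is an assumption that may not suit a voice model whose output depends on capitalization.

```python
import hashlib
from typing import Callable, Dict


class TTSCache:
    """In-memory cache of synthesized audio keyed by voice and text."""

    def __init__(self, synthesize: Callable[[str, str], bytes]) -> None:
        self._synthesize = synthesize  # hypothetical (text, voice) -> audio bytes
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(text: str, voice: str) -> str:
        # Normalize case and whitespace so trivially different strings share an entry
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(f"{voice}\x00{normalized}".encode("utf-8")).hexdigest()

    def get(self, text: str, voice: str = "default") -> bytes:
        """Return cached audio, synthesizing and storing it on a miss."""
        key = self._key(text, voice)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        audio = self._synthesize(text, voice)
        self._store[key] = audio
        return audio
```

Wrapping the existing TTS call in `TTSCache(...)` leaves the rest of the pipeline unchanged, and the `hits` and `misses` counters show whether the cache is worth keeping.
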
### Short-term (Next 2-4 Weeks)

4. 🔄 **Monitor PyTorch releases** for sm_120 support

5. 🧪 **Evaluate cloud GPU** options:
   - Test on RunPod/Vast.ai with RTX 4090
   - Measure actual performance gains
   - Compare cost vs waiting

6. 📊 **Benchmark CPU baseline** to quantify GPU improvement later

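
For the CPU baseline in step 6, a small timing harness is enough to produce numbers that can be compared once GPU support lands. This is a generic sketch using only the standard library; `benchmark` is an illustrative name, and the callable passed in would be whatever wraps the bot's STT or TTS call.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], runs: int = 5, warmup: int = 1) -> Dict[str, float]:
    """Time fn() over several runs and return latency statistics in seconds."""
    for _ in range(warmup):
        fn()  # warm-up absorbs one-time costs such as model loading
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
    }
```

Recording the median rather than a single run keeps one slow outlier from skewing the later CPU versus GPU comparison.
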
### Long-term (Next 1-3 Months)

7. ⏳ **Wait for PyTorch sm_120 support**

8. 🚀 **Deploy with GPU** when support lands

9. 🔍 **Re-evaluate Kani-TTS-2** once GPU works

---

## Current Bot Configuration

**For now, use CPU-only mode:**

```yaml
# config.yaml
pipeline:
  stt:
    model_size: "small"  # Smaller = faster on CPU
    device: "cpu"        # Force CPU
    beam_size: 1         # Faster decoding

  tts:
    device: "cpu"        # Force CPU
```

**.env overrides:**
```bash
PIPELINE__STT__DEVICE=cpu
PIPELINE__STT__MODEL_SIZE=small
PIPELINE__TTS__DEVICE=cpu
```

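
The `.env` names mirror the YAML nesting, with a double underscore separating each level. How the bot's loader applies them is not shown here, so the following is only a sketch of that mapping, similar in spirit to the nested-delimiter option in settings libraries. `apply_env_overrides` is a hypothetical helper, and it keeps every override as a string.

```python
import copy
from typing import Any, Dict, Mapping


def apply_env_overrides(config: Dict[str, Any], environ: Mapping[str, str],
                        delimiter: str = "__") -> Dict[str, Any]:
    """Return a copy of config with SECTION__KEY=value style entries applied."""
    result = copy.deepcopy(config)
    for name, value in environ.items():
        *parents, leaf = name.lower().split(delimiter)
        if not parents or parents[0] not in result:
            continue  # not an override for a known top-level section
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value  # values stay strings; convert types separately if needed
    return result


config = {"pipeline": {"stt": {"device": "cuda", "model_size": "medium"},
                       "tts": {"device": "cuda"}}}
overrides = {"PIPELINE__STT__DEVICE": "cpu",
             "PIPELINE__STT__MODEL_SIZE": "small",
             "PIPELINE__TTS__DEVICE": "cpu"}
print(apply_env_overrides(config, overrides))
```

Variables without the delimiter, or whose first segment is not a known section, are ignored, so unrelated environment entries do not leak into the config.
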
---

## When PyTorch Supports sm_120

**Test with:**
```bash
# Uninstall current
pip uninstall torch torchaudio torchvision -y

# Install latest CUDA 12.8 build (cu124 wheels cannot include sm_120 kernels)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

# Verify
python -c "import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"

# Test computation (synchronize so any kernel launch error surfaces here)
python -c "import torch; x=torch.rand(100,100,device='cuda'); torch.cuda.synchronize(); print('GPU OK')"
```

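
The same verification can run at bot startup so that a broken GPU build degrades to CPU instead of crashing. On an unsupported build `torch.cuda.is_available()` can still return `True`, which is why the probe launches a real kernel. This is a sketch under that assumption; `cuda_probe` and `resolve_device` are hypothetical helpers, not existing bot functions.

```python
from typing import Callable


def cuda_probe() -> bool:
    """Return True only if a real kernel runs on the GPU."""
    try:
        import torch

        if not torch.cuda.is_available():
            return False
        x = torch.rand(64, 64, device="cuda")
        (x @ x).sum().item()  # .item() synchronizes, surfacing kernel launch errors
        return True
    except (ImportError, RuntimeError):
        return False


def resolve_device(requested: str, probe: Callable[[], bool] = cuda_probe) -> str:
    """Return 'cuda' if it was requested and is usable, otherwise 'cpu'."""
    if requested == "cuda" and probe():
        return "cuda"
    return "cpu"
```

Passing the configured `device` value through `resolve_device` means the config can be switched to `cuda` ahead of time and the bot will pick it up once a working build is installed.
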
**Then update config:**
```yaml
pipeline:
  stt:
    device: "cuda"
    model_size: "medium"  # Can use larger model on GPU
    beam_size: 5          # Better quality

  tts:
    device: "cuda"
```

**Expected improvement:**
- STT: ~2s → ~0.35s (6x faster)
- TTS: ~4-5s → ~0.9s (5x faster)
- Total: ~10-15s → ~4s (roughly 3x faster, near the 3.5s target)

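
The speedup factors above are simple ratios of the CPU and GPU latencies. The short calculation below reproduces them from the quoted figures, which are the document's estimates and not new measurements.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Ratio of old latency to new latency."""
    return before_s / after_s


print(f"STT:   {speedup(2.0, 0.35):.1f}x")
print(f"TTS:   {speedup(4.0, 0.9):.1f}x to {speedup(5.0, 0.9):.1f}x")
print(f"Total: {speedup(10.0, 4.0):.1f}x to {speedup(15.0, 4.0):.1f}x")
```

The total improves less than STT or TTS alone because the ~4s figure still includes time that GPU acceleration does not change, such as LLM inference.
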
---

## Resources

- [PyTorch GitHub](https://github.com/pytorch/pytorch)
- [NVIDIA CUDA Compatibility](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities)
- [RTX 5090 Specs](https://www.nvidia.com/en-us/geforce/graphics-cards/50-series/rtx-5090/)
- [RunPod Cloud GPU](https://www.runpod.io/)
- [Vast.ai GPU Marketplace](https://vast.ai/)

---

**Summary:** PyTorch support for the RTX 5090 is coming but has not landed in the builds tested. Use CPU mode for development now, monitor PyTorch updates, or use a cloud GPU for testing in the meantime.