psb-thinking-2026-04-13

Source

  • Type: local-file
  • Path: /home/topher/.openclaw/workspace-psb-thinking/memory/2026-04-13.md
  • Bytes: 2828
  • Updated: 2026-05-03T01:55:56.660Z

Content

# 2026-04-13
 
## GPU Install Attempt — CasaOS Goes Down
 
### What happened
- Topher attempted to install the Nvidia P102-100 10GB GPU (passive mining card, GTX 1080 Ti chip) into the CasaOS server (media, 100.91.1.57)
- With GPU installed: system boots all the way to terminal login prompt (media login:) — so POST works
- BUT: CasaOS dashboard does not load in browser
- Without GPU: system presumably works normally (pulled card to reboot)
 
### Driver state on media (pre-GPU install, checked via exec)
- Driver 560.35.03 installed (nvidia-driver-560, linux-modules-nvidia-560-6.11.0-24-generic)
- /dev/nvidiactl exists — kernel driver loaded
- Ollama container: big-bear-ollama-cpu (CPU-only, no GPU access)
- CasaOS: running, HTTP 200 on curl to localhost
- Port 80: listening
 
### Theories
1. GPU shifts PCI bus enumeration, which renames the network interface (eth0 becomes something else) → network never comes up under the expected name
2. nouveau driver grabs the card before the nvidia module does → system stall or display conflict
3. CasaOS binds to the wrong interface after GPU install (looks for eth0, finds something else)
4. Driver 560 is not fully compatible with the P102-100 (mining/compute card, not a standard GPU)
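
The theories above can be told apart from the console with a few read-only commands. A minimal sketch, assuming `lspci` (pciutils), `lsmod`, and `ip` (iproute2) are present on media; the function name `theory_checks` is only illustrative.

```shell
#!/bin/sh
# Read-only checks, one group per theory. theory_checks is a hypothetical name.
theory_checks() {
    echo "== Theories 1 and 3: interface names and link state =="
    ip -br link 2>&1 || echo "(ip not available)"

    echo "== Theory 2: is nouveau or nvidia loaded? =="
    lsmod 2>/dev/null | grep -E '^(nouveau|nvidia)' || echo "(neither module listed)"

    echo "== Theory 4: kernel driver bound to each NVIDIA device =="
    # 10de is NVIDIA's PCI vendor ID; -k adds the "Kernel driver in use" line
    lspci -nnk -d 10de: 2>&1 || echo "(lspci not available)"
}
```

Running `theory_checks` once without the card and once with it, then diffing the two outputs, should show whether an interface was renamed or which module claimed the card.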
 
### PSU - LIKELY ROOT CAUSE
- Current PSU: cheap 430W (INSUFFICIENT)
- Ordered: quality 600W PSU (arriving soon)
- The 430W cheap unit likely caused Postgres protection fault and instability
- P102-100 needs 150W, system pulls ~390W — no headroom on cheap 430W
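
The headroom arithmetic behind these numbers, as a small sketch. It assumes the ~390W figure is the whole-system draw with the GPU installed (the note does not say how it was measured); `headroom` is a hypothetical helper.

```shell
#!/bin/sh
# headroom PSU_WATTS LOAD_WATTS: print spare capacity in watts and as a % of the PSU rating.
headroom() {
    psu=$1
    load=$2
    spare=$((psu - load))
    pct=$((spare * 100 / psu))   # integer division, truncates the fraction
    echo "${psu}W PSU, ${load}W load: ${spare}W spare (${pct}%)"
}

headroom 430 390   # current PSU
headroom 600 390   # ordered PSU
```

With these inputs the 430W unit has 40W (9%) spare and the 600W unit has 210W (35%).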
 
### Next steps (Topher to run with GPU installed)
1. `ping media` — does hostname resolve?
2. `ss -tlnp | grep :80` — is port 80 listening?
3. `curl -I http://localhost/` — does CasaOS respond locally?
4. `sudo dmesg | grep -i nvidia | tail -20` - GPU kernel messages (dmesg may be root-only)
5. `sudo dmesg | grep -i error | tail -10` - any errors
6. Check if eth0/swac0 changed names: `ip a`
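
To keep the output of the checks above for later comparison, they could be wrapped in one script. This is a sketch only: `run_check` and `media_diag` are hypothetical names, the hostname `media` comes from the notes, and the script is meant to be run as root so that `dmesg` and `ss -p` can read everything.

```shell
#!/bin/sh
# run_check LABEL COMMAND...: run one check, label its output, keep going on failure.
run_check() {
    label=$1
    shift
    echo "== $label =="
    "$@" 2>&1 || echo "(exit status $?)"
    echo
}

# media_diag: the six checks from the list above, in the same order.
media_diag() {
    run_check "hostname resolves"  ping -c 2 -W 2 media
    run_check "port 80 listening"  sh -c "ss -tlnp | grep ':80 '"
    run_check "CasaOS local HTTP"  curl -sS -I --max-time 5 http://localhost/
    run_check "nvidia kernel msgs" sh -c "dmesg | grep -i nvidia | tail -20"
    run_check "kernel errors"      sh -c "dmesg | grep -i error | tail -10"
    run_check "interface names"    ip a
}
```

Adding a last line such as `media_diag | tee "media-diag-$(date +%F).log"` saves a dated copy, so the with-GPU and without-GPU runs can be compared side by side.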
 
### After new PSU arrives
1. Remove GPU before PSU swap (safety)
2. Swap PSU, verify boots clean WITHOUT GPU
3. Reinstall GPU, boot — test CasaOS
4. If Postgres crashes again: `sudo systemctl restart postgresql`
5. Then restart Ollama container with GPU support once stable
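
Step 5 could look roughly like the sketch below. It assumes the NVIDIA Container Toolkit is installed on media so Docker accepts `--gpus`; the container name `ollama-gpu` and the helper names are placeholders, and on CasaOS the same result may instead come from installing a GPU variant of the app.

```shell
#!/bin/sh
# gpu_ready: succeed only if nvidia-smi exists and lists at least one GPU.
gpu_ready() {
    command -v nvidia-smi >/dev/null 2>&1 || return 1
    nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

# start_ollama_gpu: start Ollama with GPU access, but only once the host sees the card.
# Assumes the NVIDIA Container Toolkit; "ollama-gpu" is a placeholder name.
start_ollama_gpu() {
    if ! gpu_ready; then
        echo "GPU not visible to nvidia-smi; leaving the CPU container in place" >&2
        return 1
    fi
    docker run -d --gpus=all \
        -v ollama:/root/.ollama \
        -p 11434:11434 \
        --name ollama-gpu \
        ollama/ollama
}
```

If `big-bear-ollama-cpu` already publishes port 11434, it would need to be stopped first or the new container given a different host port.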
 
### Memory search
- Currently DISABLED — Ollama CPU embeddings too slow (21s/chunk), waiting for GPU
- Qdrant: running on media at 100.91.1.57:6333
- Ollama: CPU container, needs GPU restart after card is confirmed working
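
Before re-enabling memory search, the 21s/chunk figure can be re-measured with a single timed request. A rough sketch: it assumes Ollama listens on its default port 11434 on media and that the embedding model is `nomic-embed-text`; both are guesses, not taken from these notes. The helper names are hypothetical.

```shell
#!/bin/sh
# Placeholders: adjust to the real setup.
OLLAMA_URL=${OLLAMA_URL:-http://100.91.1.57:11434}   # assumed: Ollama default port on media
EMBED_MODEL=${EMBED_MODEL:-nomic-embed-text}          # assumed model name
QDRANT_URL=${QDRANT_URL:-http://100.91.1.57:6333}    # Qdrant address from the notes

# embed_payload MODEL TEXT: build the JSON body (TEXT must not contain quotes or backslashes).
embed_payload() {
    printf '{"model": "%s", "input": "%s"}' "$1" "$2"
}

# embed_seconds TEXT: print the total request time in seconds for one embedding call.
embed_seconds() {
    curl -sS -o /dev/null -w '%{time_total}\n' --max-time 120 \
        "$OLLAMA_URL/api/embed" -d "$(embed_payload "$EMBED_MODEL" "$1")"
}

# qdrant_up: succeed if Qdrant answers on its REST port.
qdrant_up() {
    curl -sf --max-time 5 "$QDRANT_URL/collections" >/dev/null
}
```

Example: `qdrant_up && embed_seconds "test chunk"` prints one timing, which can be compared against the 21s CPU baseline once the GPU container is running.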
 
### Agents
- psb-thinking (me): technical research, planning, system admin
- psb-gemma: brewery operations, day-to-day
- psb-business: business/reports/Toast POS
 
## 23:26 UTC - Post-reboot check (GPU install)
- Media rebooted ~22:56 UTC (~1h15m after warning at 21:42)
- System up 30 min, load 0.03 — stable
- **P102-100 NOT detected** — only Quadro K600 at 01:00.0
- nvidia-smi: "couldn't communicate with NVIDIA driver" (Quadro K600 has no driver: the 560 branch does not support Kepler cards)
- No /dev/nvidia* devices found
- Possible causes: P102-100 was not physically installed, it failed at POST, or there is a PCIe lane issue
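
The detection check above can be repeated after each reseat or slot change with a short sketch like this one. It counts NVIDIA devices whose PCI class is VGA (`0300`) or 3D controller (`0302`), so a card without display outputs is covered either way; the function names are hypothetical.

```shell
#!/bin/sh
# count_nvidia_gpus: read `lspci -nn` output on stdin and count NVIDIA (vendor 10de)
# devices of class VGA [0300] or 3D controller [0302]. Audio functions are ignored.
count_nvidia_gpus() {
    grep -cE '\[03(00|02)\]:.*\[10de:' || true
}

# gpu_presence: summarise what the host currently sees.
gpu_presence() {
    echo "NVIDIA display/3D devices on PCI: $(lspci -nn 2>/dev/null | count_nvidia_gpus)"
    ls /dev/nvidia* 2>/dev/null || echo "no /dev/nvidia* device nodes"
}
```

A count of 1 with both cards supposedly installed would match the entry above: only the Quadro K600 is enumerated, which points at the physical install, slot, or power rather than at the driver.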
 
