# ComBadge

**Status:** Research
**Created:** 2026-03-30
**Updated:** 2026-04-23
**Tags:** `esp32-s3`, `wearable`, `voice`, `pico-claw`, `waveshare`

## Summary

A Star Trek-style **communicator badge** — wearable, voice-first AI device you tap and talk to. Not a tricorder (that's a separate project). ComBadge is the wearable companion: tap, speak, get things done. Voice comms over Wi-Fi to a local LLM.

**Goal:** Wear it daily, talk to it, get answers back. Watch form factor is the target.

---

## Hardware Requirements

### Hard Gates (Must Have)

| Requirement | Why |
|-------------|-----|
| Microphone | Voice input |
| Speaker or audio output | Voice output |
| Battery (LiPo with charging) | Wearable power |
| IMU (accelerometer) | Tap-to-talk activation |
| Wi-Fi | Connection to host/OpenClaw |
| ESP32-S3 or equivalent | WiFi + enough RAM for PicoClaw |

### Nice-to-Have (Don't Gate On)

| Feature | Notes |
|---------|-------|
| Screen | Useful for status (glanceable), not required for voice I/O |
| Camera | Not a priority for ComBadge |
| BLE | WiFi handles connectivity |
| Built-in IMU | Can add external via I2C if needed |

**Screen policy:** Screen is a nice-to-have for status display (glanceable). Do NOT rule out headless boards that meet all hard gates.
LEDs or audio feedback are valid substitutes for the screen.

---

## Lead Hardware: Waveshare ESP32-S3-Touch-AMOLED-2.06

**Status:** TOP CONTENDER — Daily driver form factor
**Commitment:** Not ready to commit yet — battery needs more research
**Product page:** https://www.waveshare.com/esp32-s3-touch-amoled-2.06.htm

| Spec | Detail |
|------|--------|
| **SoC** | ESP32-S3R8, dual-core 240MHz |
| **Memory** | 8MB PSRAM + 32MB Flash |
| **Display** | 2.06" AMOLED, 410×502, capacitive touch (FT3168) |
| **Audio** | Dual digital mics + onboard speaker |
| **IMU** | QMI8658 6-axis |
| **RTC** | PCF85063A |
| **Storage** | TF card slot |
| **Wireless** | WiFi 802.11 b/g/n + Bluetooth 5 |
| **PMIC** | AXP2101 |
| **Battery** | MX1.25 header (external LiPo — need to source) |
| **Interfaces** | I2C, UART, USB-C, GPIO pads |
| **Size** | Watch form factor with strap mount |
| **Price** | ~$40-50 |

**Why it's the top contender:**

- Watch-style = daily wear, not special occasion
- Display is actually **glanceable** (wrist position vs chest)
- Dual mics improve voice capture quality over single-mic alternatives
- 32MB Flash = room for firmware, local keywords, future expansion
- AXP2101 PMIC = proper battery management and charging
- Capacitive touch = tap-to-talk + swipe UI options
- PicoClaw compatible — full agent loop on-device possible
- Designed specifically for voice AI interaction

**Cons:**

- External battery (no built-in LiPo — need to source MX1.25 connector pack)
- Strap form factor = bulkier than flat badge
- Higher price than M5StickS3 (~$40-50 vs $20-25)

**Battery note:** Needs a LiPo with MX1.25 connector. 500mAh+ is reasonable for all-day wear. Budget extra for a quality LiPo pack.

---

### Alternative: Band Module (Slim Add-On)

**Concept:** A slim pod that slides into a watch band (Whoop/Polar style) — the band is just a carrier, the electronics are the pod.
Worn 24/7 alongside Apple Watch.

**Goal:** Voice AI in a band form factor, not a watch.

| Component | Option |
|-----------|--------|
| **SoC** | ESP32-S3 mini (XIAO or custom) |
| **Mic** | MEMS digital mic (I2S) |
| **Speaker** | Mini speaker pointing toward wrist/forearm |
| **Feedback** | Haptic motor + RGB LEDs (no screen) |
| **IMU** | For tap-to-talk wake |
| **Battery** | Slim LiPo (target: 40-80mAh) |
| **PMIC** | AXP2101 or similar for aggressive power management |

**Why this form factor:**

- Worn 24/7 like a fitness band
- Doesn't compete with Apple Watch
- Apple Watch handles fitness/notifications; this handles voice AI
- Slim profile — same as Whoop/Polar Loop

**Power management (extreme):**

- IMU in interrupt mode — device in deep sleep until tap detected
- Mic bias off until wake
- ESP32 deep sleep: ~10-20µA average
- Target: 40-80mAh for full day voice usage (20-30 interactions)
- This is tight — needs careful power budgeting

**Audio challenge:** Speaker pointed at wrist won't be loud enough for open-air use. Sound needs to travel up the arm to your ear. May need to evaluate speaker size vs. form factor.

**LEDs:** Simple RGB for status (listening, connected, error).
WS2812 or similar.

**Haptics:** Mini ERM or linear resonant actuator for tap confirmation and alerts.

**Status:** Concept phase — sizing study complete

### Band Module Sizing

**Target envelope:** 35 × 25 × 10mm

Real-world reference:

| Device | Pod Dimensions | Weight |
|--------|---------------|--------|
| Whoop 5.0 | 34.7 × 24 × 10.6mm | 26.5g |
| Whoop 4.0 | 35.97 × 25 × 10.1mm | 11.3g |
| Polar Loop | 42 × 27 × 9mm | 29g total |

Target battery: **150mAh** (slightly thicker than Whoop — doable for all-day voice AI)

### Band Module Component Stack

| Component | Part | Dimensions | Notes |
|-----------|------|-----------|-------|
| **SoC** | ESP32-S3-PICO-1-N8R8 | 7×7×1.2mm | Dual-core 240MHz, 8MB Flash + 8MB PSRAM |
| **PMIC** | AXP2101 | 2×2mm | 4× DC/DC, 7× LDO, full power domain gating |
| **Audio Codec** | ES8311 | 3×3mm | Class D amp + mic bias control |
| **Mic** | Knowles SPH0645LM4H-1 | 3.5×2.65×0.98mm | Digital MEMS I2S, 64dB SNR |
| **Speaker** | CUI CMW-1508-2-108 | Ø15mm × 3.8mm | 8Ω, 1W, side-firing toward wrist |
| **IMU** | QMI8658A | 3×3×0.9mm | 6-axis, interrupt-wake capable |
| **LEDs** | WS2812C-2020 × 3 | 2×2×0.5mm each | RGB status: listening/connected/error |
| **Haptics** | DRV2605L + ERM (prototype) | Ø10 × 3mm | ERM for dev; spec LRA for final board |
| | *Final:* DRV2605L + LRA (C10-100) | | DRV2605L drives both ERM and LRA modes |
| **Battery** | 150mAh prismatic LiPo | ~30 × 20 × 5mm | MX1.25 connector, 3.8V nominal |
| **PCB** | 2-layer FR4 | 32 × 22 × 0.8mm | Flex segments for battery compartment |

**Thickness budget:**

| Layer | Thickness |
|-------|----------|
| Battery | 5.0mm |
| PCB + components (back) | 1.5mm |
| Speaker (protrudes) | 3.5mm |
| Front cover | 0.5mm |
| **Total** | **~10.5mm** |

**Tight spots:** Speaker protrusion is the main challenge — ~10.5mm at thickest point. Battery is the dimensional limiter.
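
The thickness budget above can be re-checked whenever a part changes. The following is a minimal sketch, assuming the layer values from the table, the 10mm height of the target envelope, and the Whoop 5.0 thickness from the sizing table; the names `LAYERS_MM`, `stack_height`, and `overshoot` are illustrative only and not part of any firmware.

```python
# Layer thicknesses in mm, copied from the thickness budget table.
LAYERS_MM = {
    "battery": 5.0,
    "pcb_and_components": 1.5,
    "speaker_protrusion": 3.5,
    "front_cover": 0.5,
}

TARGET_MM = 10.0   # height of the 35 x 25 x 10mm target envelope
WHOOP_5_MM = 10.6  # Whoop 5.0 pod thickness from the sizing table


def stack_height(layers):
    """Sum all layer thicknesses, rounded to avoid float noise."""
    return round(sum(layers.values()), 2)


def overshoot(layers, limit_mm):
    """Positive result: the stack is thicker than the limit."""
    return round(stack_height(layers) - limit_mm, 2)


if __name__ == "__main__":
    print(f"stack height:  {stack_height(LAYERS_MM)} mm")
    print(f"vs target:     {overshoot(LAYERS_MM, TARGET_MM):+.1f} mm")
    print(f"vs Whoop 5.0:  {overshoot(LAYERS_MM, WHOOP_5_MM):+.1f} mm")
```

With the table values, the stack lands 0.5mm over the 10mm target but 0.1mm under the Whoop 5.0 reference, which is why the speaker protrusion and battery are the layers to watch.
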
Speaker audio path (wrist→ear) needs prototype testing before full commit.

### Band Module Power Budget

Target: 150mAh for full day (~20-30 voice interactions)

| State | Current | Duration | mAh per day |
|-------|---------|----------|--------------|
| Deep sleep (IMU wake) | ~10µA | 23.9 hrs | 0.24mAh |
| Voice capture (mic on) | ~25mA | 2s × 25 = 50s | 0.35mAh |
| WiFi streaming | ~70mA | 3s × 25 = 75s | 1.46mAh |
| TTS playback | ~40mA | 3s × 25 = 75s | 0.83mAh |
| LED/haptic pulse | ~20mA | 0.5s × 25 = 12.5s | 0.07mAh |
| **Idle daily drag** | ~15µA | 24 hrs | 0.36mAh |
| **Total per day** | | | **~3.3mAh** |

**Realistic for heavy use:** 30-50mAh/day. 150mAh gives comfortable headroom. Power management: mic bias OFF except capture, WiFi OFF except streaming, ESP32 deep sleep between interactions, AXP2101 handles all domain gating.

---

## Prototype/Dev Alternative: M5StickS3

| Spec | Detail |
|------|--------|
| **SoC** | ESP32-S3-PICO-1-N8R8, dual-core 240MHz |
| **Memory** | 8MB Flash + 8MB PSRAM |
| **Display** | 1.14" LCD, 135×240 (ST7789P3) |
| **Audio** | ES8311 codec, MEMS mic (65dB SNR), 8Ω@1W speaker |
| **Wireless** | Wi-Fi 2.4GHz |
| **IMU** | 6-axis IMU |
| **Battery** | 250mAh LiPo (built-in) |
| **Size** | 48×24×15mm, 20g |
| **Mounting** | Magnetic back design |
| **Price** | ~$20-25 |
| **Product** | [m5stack.com](https://shop.m5stack.com/products/m5sticks3-esp32s3-mini-iot-dev-kit) |

**Why it's the alternative:**

- All hard gates met (mic, speaker, battery, IMU, WiFi)
- Badge-sized form factor (fits in a shirt pocket clip)
- Built-in 250mAh battery = no external power supply needed
- Screen for status confirmations
- Magnetic back = easy wearable mounting
- **PicoClaw/PycoClaw compatible**
- Cheap enough to iterate ($20-25)

**Use case:** Prototype/dev badge before committing to watch form factor. Good for proof-of-concept voice pipeline testing.

**Enclosure note:** Battery can be repositioned to reduce thickness.
3D printer + CNC available for prototype iteration.

### Dev Prototype Board: M5Stack Atom VoiceS3R

**Purpose:** Primary ESP-Claw dev platform for Mode B evaluation
**Product page:** https://shop.m5stack.com/products/atom-echos3r-smart-speaker-dev-kit
**Status:** Ordered — ETA ~2 weeks

| Spec | Detail |
|------|--------|
| **SoC** | ESP32-S3-PICO-1-N8R8, dual-core 240MHz |
| **Memory** | 8MB Flash + 8MB PSRAM |
| **Wireless** | WiFi 802.11 b/g/n + BLE 5 |
| **Audio** | ES8311 codec + NS4150B amp (1W speaker) + MEMS mic (65dB SNR) |
| **Size** | 24×24×16.8mm |
| **Price** | $14.50 |

**Why this board:**

- Same chip + audio solution as our target band module
- All-in-one: mic + speaker + amp already on board
- ESP-Claw validated on M5Stack S3 hardware
- Ready to flash and go — no additional modules needed for audio eval
- Cheap enough to iterate

**Next step:** Flash ESP-Claw via Web Flasher, connect to WiFi, start voice interaction eval

### Secondary Dev Board: Waveshare ESP32-S3-Tiny-N8R8-Kit

**Purpose:** Secondary/embedding path — castellated holes for direct PCB integration
**Product page:** https://www.waveshare.com/esp32-s3-tiny.htm
**Status:** On order

| Spec | Detail |
|------|--------|
| **SoC** | ESP32-S3-PICO-1-N8R8, dual-core 240MHz |
| **Memory** | 8MB Flash + 8MB PSRAM |
| **Wireless** | WiFi 802.11 b/g/n + BLE 5 |
| **GPIO** | 34× multi-function |
| **USB** | Via adapter board |
| **Size** | Compact; castellated holes |
| **Price** | ~$10-15 |

**Use case:** Castellated edges can be reflow soldered directly onto the final band module PCB

---

## Watch Form Factor: Waveshare ESP32-S3 AMOLED Series (Details)

**Status:** Lead hardware — watch-style daily driver
**Product page:** https://www.waveshare.com/esp32-s3-touch-amoled-2.06.htm

### All Sizes Available

| Size | Model | Speaker | Mic | IMU | Notes |
|------|-------|---------|-----|-----|-------|
| 2.41" | ESP32-S3-Touch-AMOLED-2.41 | ❌ | ✅ | ✅ | Larger display |
| **2.06"** | **ESP32-S3-Touch-AMOLED-2.06** | ✅ | ✅ dual | ✅ | **Lead — full-featured** |
| 1.91" | ESP32-S3-Touch-AMOLED-1.91 | ❌ | ✅ | ❌ | Wide format |
| 1.75" | ESP32-S3-Touch-AMOLED-1.75 | ✅ | ✅ | ✅ | Round-ish display |
| 1.43" | ESP32-S3-Touch-AMOLED-1.43 | ❌ | ✅ | ✅ | Small, varies |
| 1.32" | ESP32-S3-Touch-AMOLED-1.32 | ✅ | ✅ | ✅ | Smallest with speaker |

**2.06" is the recommended model** — has all features (dual mic, speaker, IMU, RTC, TF) in a proper watch form factor.

### Fit Assessment

| Hard Gate | Status |
|-----------|--------|
| Mic | ✅ Dual mics |
| Speaker | ✅ Onboard |
| Battery | ✅ External (MX1.25 header) — need to source LiPo |
| IMU | ✅ QMI8658 |
| WiFi | ✅ 802.11 b/g/n |

---

## Architecture

### Activation: IMU Tap Detection

```
User taps watch
  → IMU detects acceleration spike
  → Mic turns on, watch starts listening
  → User speaks → audio streams to host
  → Response → speaker + screen confirms
```

**Why IMU tap over wake word:**

- Mic bias off until tap fires = much lower power
- IMU in interrupt mode + ESP32 deep sleep ≈ microamps average
- Tap-to-talk is more badge-authentic (Star Trek style)
- No false triggers from ambient conversation

**Wake word is optional** — if you want always-listening, add a TinyML model. For power savings, IMU tap is the default.

### Two Modes

#### Mode A: Watch as Thin Client

Watch captures audio, streams to a nearby host (Tricorder M10, home server, OpenClaw instance). Host handles STT → LLM → TTS. Watch outputs audio + status.

**Pros:** Simple on-watch logic, fast response, no LLM complexity on ESP32
**Cons:** Network-dependent

#### Mode B: Watch as Full Agent (PicoClaw)

Watch runs PicoClaw, connects directly to Ollama. Full autonomous agent loop on-watch.

**Pros:** Works standalone, no host dependency
**Cons:** ESP32-S3 constrained; LLM must fit in 8MB PSRAM (quantized small models only)

**Decision:** **Mode B (ESP-Claw) as primary plan.** Start there, fall back to Mode A if needed.
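
For the tap-to-talk activation described under Activation, the threshold and debounce tuning could be prototyped off-device before any firmware work. The following is a minimal host-side sketch, not ESP-Claw, PicoClaw, or QMI8658 driver code: the `TapDetector` class, the 2.5 g threshold, and the 250 ms debounce window are hypothetical starting values that would need tuning against recorded IMU data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TapDetector:
    """Threshold-plus-debounce tap detector (illustrative, host-side only)."""

    threshold_g: float = 2.5    # hypothetical spike threshold, in g
    debounce_ms: int = 250      # hypothetical lockout after an accepted tap
    last_tap_ms: Optional[int] = None

    def update(self, t_ms: int, accel_g: float) -> bool:
        """Return True if this sample should count as a new tap."""
        if accel_g < self.threshold_g:
            return False
        # Ignore spikes that arrive inside the debounce window.
        if self.last_tap_ms is not None and t_ms - self.last_tap_ms < self.debounce_ms:
            return False
        self.last_tap_ms = t_ms
        return True


def detect_taps(samples: List[Tuple[int, float]]) -> List[int]:
    """Run a fresh detector over (timestamp_ms, magnitude_g) samples."""
    detector = TapDetector()
    return [t for t, g in samples if detector.update(t, g)]


if __name__ == "__main__":
    # Made-up trace: one tap with ringing at 10-20 ms, a second tap at 500 ms.
    trace = [(0, 1.0), (10, 3.1), (20, 2.8), (400, 1.0), (500, 3.0)]
    print(detect_taps(trace))
```

On hardware, the same two parameters would more likely be configured in the IMU's own tap or motion interrupt so the ESP32 can stay in deep sleep; this sketch only helps pick starting values.
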
Design lead time gives us time to evaluate ESP-Claw on dev hardware before committing to the custom band module.

### ESP-Claw (NEW — 2026-04-23)

**Espressif's official** agent framework for ESP32-S3. Released 2026-04-23.

**Why it matters:** Validates that full local agent loop is possible on 8MB PSRAM. Inspired by OpenClaw. Directly integrates with OpenClaw via MCP.

**Requirements:** 8MB Flash + 8MB PSRAM (N8R8 — our exact spec)
**Chip support:** ESP32-S3 only (P4 coming soon)
**Agent loop:** Full on-device sensing → reasoning → action → memory
**LLM backends:** OpenAI, Qwen (local Ollama), ChatGPT, custom
**Messaging:** Telegram, QQ Bot, Feishu, WeChat ClawBot
**MCP:** Acts as both MCP server (exposes hardware) and MCP client (calls external agents)
**Memory:** On-chip structured long-term memory — preferences and routines extracted from conversations
**Offline:** Lua scripts execute deterministically even offline
**Event-driven:** Local event bus drives sensor triggers, millisecond-latency response
**Flash:** Web Flasher available or build from source

**Relevance to ComBadge:**

- MCP server mode: ESP-Claw on band module exposes hardware to OpenClaw host
- MCP client mode: ESP-Claw calls OpenClaw for heavy reasoning while handling local control
- Qwen backend: can connect to local Ollama instance (no cloud required)
- Already hardware-validated on M5Stack StickS3 (same ESP32-S3 module we considered)

**vs PicoClaw:** ESP-Claw is Espressif's production-grade version. PicoClaw is community/M5Stack. Feature set is similar but ESP-Claw has official support, MCP native, and on-chip memory architecture.

**Status:** Released today — evaluate as primary Mode B path

### Voice Pipeline (on Host)

| Component | Option |
|-----------|--------|
| **STT** | Whisper (local Ollama) |
| **LLM** | Ollama (local, e.g. Qwen 0.5B, Gemma 4 on capable hosts) |
| **TTS** | Piper or Coqui (local) |

Watch streams audio → host processes → host returns text/audio.

---

## Local-on-Watch Capabilities

The ESP32-S3 can handle these tasks without a host:

| Task | How |
|------|-----|
| Tap detection | IMU interrupt |
| Timer (start/stop) | ESP32 local code |
| Simple keyword commands | TinyML spotter (local) |
| WiFi connection | Native ESP32 WiFi |
| WebSocket to OpenClaw | Native ESP32 |
| Sensor reads (I2C) | Direct from connected sensors |

### Simple Command Examples (Local-capable)

- "Set a 5 minute timer" → ESP32 handles timer, no host
- "Lights on/off" → intent classifier on-watch or host-assisted
- "Read sensor temperature" → watch requests host → host reads → returns via TTS

---

## Open Questions

- [x] **Band module sizing** — ✅ Done — Whoop 5.0: 34.7×24×10.6mm. Target: 35×25×10mm
- [x] **Band module battery** — ✅ Done — 150mAh target, 30-50mAh/day modeled
- [ ] **Band module speaker** — ⚠️ Needs prototype test — arm-to-ear audio path unvalidated
- [ ] **Order ESP32-S3-Tiny-N8R8-Kit** — secondary dev / embedding path
- [ ] Confirm BLE on M5StickS3 (if using for dev)
- [ ] Test IMU tap detection and tuning (threshold, debounce)
- [ ] Audio codec quality for voice calls
- [ ] Power budget — battery life under voice load (Waveshare path)
- [ ] Source MX1.25 LiPo pack for Waveshare — **ON HOLD**
- [ ] Charging solution — USB-C passthrough or dock?
- [ ] Strap mounting solution for Waveshare
- [ ] Design band module enclosure (3D printed pod)
- [ ] Band module speaker prototype — validate wrist-to-ear loudness

---

## Progress

- [x] Hardware requirements defined (hard gates + nice-to-have)
- [x] Lead hardware selected (Waveshare ESP32-S3-Touch-AMOLED-2.06)
- [x] M5StickS3 evaluated as prototype/dev alternative
- [x] Band module concept added (Whoop/Polar style slim add-on)
- [x] IMU tap detection adopted (replaces wake word as default)
- [x] Architecture outlined (Mode A thin client, Mode B full agent)
- [x] Local-on-watch
capabilities defined
- [x] Band module sizing — Whoop 5.0: 34.7×24×10.6mm, Polar Loop: 42×27×9mm
- [x] Band module component stack modeled (ESP32-S3-PICO, AXP2101, ES8311, QMI8658, 150mAh)
- [x] Band module power budget modeled — 30-50mAh/day realistic for heavy use
- [ ] Validate band module audio path (speaker loudness at wrist) — needs prototype
- [ ] **Order M5Stack Atom VoiceS3R ($14.50) — ORDERED, ETA ~2 weeks**
- [ ] Flash ESP-Claw, evaluate on VoiceS3R once received
- [ ] Order Waveshare ESP32-S3-Tiny-N8R8-Kit (secondary / embedding path)
- [ ] Order Waveshare AMOLED 2.06" (watch path) — **ON HOLD**
- [ ] Source MX1.25 LiPo battery pack — **ON HOLD**

---

## Related Projects

- [[tricorder.md]] — shares voice pipeline design, K10 uses same ESP32-S3
- [[DEVICES.md]] — M5StickS3 and XIAO ESP32S3 Sense full specs
- [[HARDWARE-WISHLIST.md]] — ComBadge section for future reference

---

## Decision Log

### 2026-04-23 — Dev Board: Atom VoiceS3R Primary, Waveshare Tiny Secondary

- M5Stack Atom VoiceS3R ordered ($14.50, ETA ~2 weeks) as primary ESP-Claw dev platform
- Same chip + audio as our target (ESP32-S3-PICO-1-N8R8 + ES8311 + NS4150B)
- All-in-one: mic + speaker + amp already on board, flash and go
- Waveshare ESP32-S3-Tiny-N8R8-Kit becomes secondary (castellated holes for embedding in final PCB)
- On hand: AtomS3R (display + IMU), WeAct ES8311 module (in transit), SPH0645 breakouts (ordered), QMI8658A (ordered)

### 2026-04-23 — ESP-Claw Released (Espressif Official)

- Espressif released ESP-Claw today — full local AI agent framework for ESP32-S3
- Requires 8MB Flash + 8MB PSRAM — matches our N8R8 spec exactly
- Inspired by OpenClaw; MCP server/client native integration
- LLM backends include Qwen (local Ollama) — no cloud required
- Validates Mode B is viable: full agent loop on 8MB PSRAM is confirmed working
- StickS3 and CoreS3 already hardware-validated by M5Stack/Espressif
- Consider ESP-Claw as primary Mode B path over PicoClaw (official support, richer feature set)
- OpenClaw
integration via MCP: ESP-Claw as MCP server exposes band module hardware; MCP client calls OpenClaw for reasoning

### 2026-04-23 — Haptics: ERM for Dev, LRA for Final

- LRA (C10-100) is hard to source and expensive for prototyping
- ERM (coin motor) is what -topher has on hand from class
- DRV2605L is dual-mode: drives both ERM and LRA — same driver works for both paths
- Decision: ERM for prototype dev, spec LRA (C10-100 or equivalent) for final band module
- DRV2605L unchanged in the stack — it's the right driver regardless of actuator type

### 2026-04-23 — Band Module Concept Added

- New concept: slim pod that slides into a watch band (Whoop/Polar style)
- Keeps Apple Watch for fitness/notifications; this handles voice AI only
- Form factor: mic + speaker + haptics + RGB LEDs (no screen), slim LiPo (40-80mAh)
- Extreme power management: IMU interrupt wake, deep sleep, mic bias control
- Bone conduction ruled out — scope is mic + speaker + haptics + LEDs
- Key unknowns: band pod dimensions, wrist speaker loudness, 40-80mAh power budget

### 2026-04-23 — Waveshare AMOLED 2.06" Promoted to Lead Hardware

- Waveshare ESP32-S3-Touch-AMOLED-2.06" is now **top contender**
- M5StickS3 moved to prototype/dev alternative
- Watch form factor wins over badge: daily wear, glanceable display, dual mics
- The 2.06" model has all features: dual mic, speaker, IMU, RTC, TF slot, 32MB Flash
- External battery (MX1.25) is the main gap — needs a LiPo pack sourced
- Screen utility: wrist position (watch) > chest position (badge)

### 2026-04-20 — Watch Form Factor Added

- Waveshare ESP32-S3 AMOLED series added as strong contender (watch path)
- 2.06" model recommended (full feature set, watch straps)
- Form factor split: badge for dev/prototype, watch for daily wear
- Screen utility: wrist = glanceable, chest = mostly useless
- Dual mics on Waveshare better for voice capture than M5StickS3 single mic
- Added hard gates vs nice-to-have framework
- Screen is nice-to-have, not a gate — do NOT rule out headless
boards
- IMU tap detection adopted as default activation (replaces wake word)
- XIAO Sense added as alternative candidate (headless but meets hard gates)
- Local-on-badge capabilities defined

### 2026-04-19 — XIAO ESP32S3 Sense Evaluation

- Seeed XIAO ESP32S3 Sense evaluated against M5StickS3
- Sense has mic, WiFi, BLE, battery management — all hard gates
- No speaker/display — would need external components
- M5StickS3 remains lead due to all-in-one packaging

### 2026-04-15 — Architecture Decision

Start with badge as thin client streaming to local Ollama. Reduces on-device complexity while voice pipeline is proven. Full PicoClaw agent mode is the goal but not the starting point.

### 2026-03-30 — Hardware Candidate

M5StickS3 chosen as lead candidate over AtomS3 because it has built-in audio (mic + speaker). AtomS3 lacks audio, making it a component rather than standalone solution.