Voice Mode Beta
Talk to Mioru naturally — real-time AI voice conversation.
What is Voice Mode?
Voice Mode lets you have a spoken conversation with Mira's AI. You talk, the AI listens and responds audibly — like a phone call with your notes.
Shortcut: Ctrl + Shift + V
How It Works
Voice Mode uses a dedicated Rust audio pipeline on your machine:
- Audio capture & playback via cpal + sonora (native Rust libraries)
- Real-time streaming via OpenAI Realtime WebSocket
- Local audio processing: 48 kHz hardware input, 24 kHz API format, with stateful upsampling/downsampling between the two
No browser, no WebRTC — the audio pipeline runs directly on your OS.
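To illustrate what "stateful" resampling between 48 kHz and 24 kHz means, here is a minimal sketch. It is not Voice Mode's actual implementation: the type names `Downsampler2x` and `Upsampler2x` are hypothetical, and the pair-averaging and linear-interpolation filters are deliberately crude stand-ins for a proper low-pass resampler. The point is only that each converter keeps a small piece of state between calls, so audio arriving in arbitrary chunk sizes produces the same output as one continuous stream.

```rust
/// Hypothetical 2:1 downsampler (e.g. 48 kHz -> 24 kHz).
/// Averages each pair of samples; an unpaired trailing sample is
/// carried over to the next call so chunk boundaries stay aligned.
pub struct Downsampler2x {
    pending: Option<f32>,
}

impl Downsampler2x {
    pub fn new() -> Self {
        Self { pending: None }
    }

    pub fn process(&mut self, input: &[f32]) -> Vec<f32> {
        let mut out = Vec::with_capacity(input.len() / 2 + 1);
        for &s in input {
            match self.pending.take() {
                // Second sample of a pair: emit the average.
                Some(prev) => out.push((prev + s) * 0.5),
                // First sample of a pair: hold it until its partner arrives.
                None => self.pending = Some(s),
            }
        }
        out
    }
}

/// Hypothetical 1:2 upsampler (e.g. 24 kHz -> 48 kHz).
/// Inserts the midpoint between consecutive samples; the last input
/// sample is remembered so the next chunk interpolates from it.
pub struct Upsampler2x {
    last: f32,
}

impl Upsampler2x {
    pub fn new() -> Self {
        Self { last: 0.0 }
    }

    pub fn process(&mut self, input: &[f32]) -> Vec<f32> {
        let mut out = Vec::with_capacity(input.len() * 2);
        for &s in input {
            out.push((self.last + s) * 0.5); // interpolated midpoint
            out.push(s); // original sample
            self.last = s;
        }
        out
    }
}
```

Without the carried-over state, every chunk boundary would drop or misalign a sample, which is one way audible clicks can creep into a long stream.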
Beta Status
Voice Mode is intentionally marked Beta. The underlying audio pipeline is promising and works for real conversations, but:
- Interruption latency is still being improved
- Windows testing is ongoing (Linux is the primary test platform)
- Longer conversations may occasionally experience audio artifacts
Telemetry (saved locally as voice-telemetry-latest.jsonl) helps track buffer levels, drops, and response lifecycle for debugging.
Voice Mode vs. Voice Input
| Feature | Shortcut | Purpose |
|---|---|---|
| Voice Input | Ctrl + Shift + M | Dictate a message into chat or Ctrl+K (text output) |
| Voice Mode | Ctrl + Shift + V | Full-duplex conversation (voice input & output) |
| Gedankenspeicher Voice | Ctrl + Shift + G | Record a thought for the Gedankenspeicher |
Requirements
- OpenAI API key (for Whisper transcription)
- Working microphone
- Voice Mode enabled in Settings → Voice