Camera Integration: A Dead End (For Now)
Attempted to integrate head pose detection via the robot's camera. Hit a wall with headless mode. Parking this for later.
Today was supposed to be the day I got head pose detection working. It wasn't.
The Goal
I wanted Reachy to track faces and maintain eye contact during interactions. The SDK has camera support, so this seemed straightforward.
What Happened
The camera works fine when running the daemon with a GUI window. But I'm running in headless mode (no physical display attached), and that's where things break.
Camera timeout: no frames received in 5000ms

Debugging Attempts
- Different camera initialization flags — No effect
- Forcing OpenCV backend — Same timeout
- Checking if MuJoCo sim provides camera — It does, but only with the 3D window open
- SDK source code dive — The camera thread expects a running render loop
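The failure mode found in that source dive can be modeled with a small stdlib sketch (all names here are mine, not the actual SDK's): a camera object whose frames are only published by render-loop ticks, and a read call that gives up after a timeout. Headless means no ticks, so every read times out.

```python
import threading

class SimCamera:
    """Toy model of a camera fed by the render loop (hypothetical, not the SDK)."""

    def __init__(self):
        self._frame_ready = threading.Event()
        self._frame = None

    def on_render_tick(self, frame):
        # Called from the render loop; in headless mode this never fires.
        self._frame = frame
        self._frame_ready.set()

    def read(self, timeout_s=5.0):
        # Block up to the timeout waiting for the render loop to publish a frame.
        if not self._frame_ready.wait(timeout=timeout_s):
            raise TimeoutError(
                f"Camera timeout: no frames received in {int(timeout_s * 1000)}ms"
            )
        self._frame_ready.clear()
        return self._frame

cam = SimCamera()

# Headless: nothing ever calls on_render_tick(), so read() times out.
try:
    cam.read(timeout_s=0.1)  # short timeout to keep the demo fast
except TimeoutError as e:
    print(e)  # Camera timeout: no frames received in 100ms

# With a "render loop" ticking, the same read succeeds.
threading.Timer(0.02, cam.on_render_tick, args=("frame-0",)).start()
print(cam.read(timeout_s=1.0))  # frame-0
```

The point of the sketch is the coupling: `read()` has no way to pull a frame itself, it can only wait on the producer, which is exactly why no amount of initialization flags or backend switching helped.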
The Reality
Headless mode and camera input are architecturally at odds in the current SDK. The camera depends on the render pipeline, which doesn't run without a window.
Options
- Run with a virtual framebuffer (Xvfb) — hacky but might work
- Use external camera + separate face detection pipeline — more work, but cleaner
- Wait for SDK update — there's a GitHub issue open about this
- Work around it — build features that don't need camera for now
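The second option is the one worth sketching: if the vision source is decoupled behind a small interface, an external webcam pipeline can feed gaze targets to the robot without touching the SDK's camera path at all. A stdlib-only skeleton, with the detector and frame source stubbed out (names are mine, not the SDK's):

```python
import queue

class FacePipeline:
    """Decoupled vision pipeline: any frame source -> detector -> gaze targets.

    Hypothetical sketch; swap in a real webcam capture and face detector later.
    """

    def __init__(self, detect):
        self.detect = detect          # frame -> (x, y) face centre, or None
        self.targets = queue.Queue()  # gaze targets consumed by the robot loop

    def feed(self, frame):
        face = self.detect(frame)
        if face is not None:
            self.targets.put(face)

# Stub detector: pretend any non-empty frame contains a centred face.
def stub_detect(frame):
    return (0.5, 0.5) if frame else None

pipe = FacePipeline(stub_detect)
pipe.feed("frame-with-face")
pipe.feed("")  # no face in this frame
print(pipe.targets.get_nowait())  # (0.5, 0.5)
print(pipe.targets.empty())       # True
```

Because the robot loop only ever reads from `targets`, the stub detector, a real external camera, or a future fixed SDK camera would all plug in behind the same seam.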
Decision
Parking this. I'll focus on apps that use the robot's movements and expressions without real-time vision. The Focus Guardian app can work with keyboard/manual input initially.
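For the interim, Focus Guardian's input can be a manual event queue instead of vision. A minimal sketch of what that might look like (everything here is hypothetical, including the command names):

```python
from collections import deque

class ManualFocusInput:
    """Keyboard/manual stand-in for the vision pipeline (hypothetical sketch).

    Commands like 'focused' / 'distracted' drive the same state machine the
    face tracker would eventually feed.
    """

    COMMANDS = {"focused", "distracted", "break"}

    def __init__(self):
        self.events = deque()

    def submit(self, command):
        command = command.strip().lower()
        if command not in self.COMMANDS:
            raise ValueError(f"unknown command: {command!r}")
        self.events.append(command)

    def next_event(self):
        # Non-blocking poll, mirroring how a vision event source would be read.
        return self.events.popleft() if self.events else None

ui = ManualFocusInput()
ui.submit("Focused")
ui.submit("distracted")
print(ui.next_event())  # focused
print(ui.next_event())  # distracted
print(ui.next_event())  # None
```

Wiring `submit()` to keypresses later is trivial, and the rest of the app never knows whether events came from a keyboard or a camera.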
Mood
Frustrated but realistic. Not every session ends with a win. Documenting the dead ends is part of the process.