How to Collect Robot Demonstration Data
A practical guide to teleoperation and data collection for imitation learning.
1. Choose Your Teleoperation Setup
Common options: (a) kinesthetic teaching, where you physically guide the arm through the motion; (b) bimanual teleoperation, where two leader ("master") arms drive two follower arms (e.g., ALOHA); (c) mobile manipulation, e.g., Mobile ALOHA for whole-body tasks. See Teleoperation and Robot Platforms Comparison.
2. Data Format
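Record synchronized observations (images, robot state) and actions. Common formats: HDF5 (ALOHA), RLDS (Open X-Embodiment), LeRobot (Hugging Face). Make sure all streams are timestamped against the same clock, and record actions in the same space your policy will output. For example, if the policy outputs joint-position targets, log joint-position targets rather than end-effector deltas.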
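As a rough illustration of the timestamp alignment mentioned above, the sketch below pairs each camera frame with the nearest robot-state sample and rejects pairs that are too far apart. The function name align_nearest, the 20 ms max_skew tolerance, and the sample timestamps are all assumptions made for this example; they do not come from any particular dataset format or library.

```python
from bisect import bisect_left

def align_nearest(frame_times, state_times, max_skew=0.02):
    """Match each frame time to the index of the nearest state sample.

    Both lists must be sorted ascending and expressed in seconds on a
    shared clock. Returns None for a frame when the nearest state sample
    is more than max_skew seconds away (hypothetical tolerance).
    """
    matches = []
    for t in frame_times:
        i = bisect_left(state_times, t)
        # Candidates: the state sample just before t and the one at/after t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(state_times)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda j: abs(state_times[j] - t))
        ok = abs(state_times[best] - t) <= max_skew
        matches.append(best if ok else None)
    return matches

# Illustrative data: ~30 Hz camera, 100 Hz state, one frame with no nearby state.
frame_times = [0.000, 0.033, 0.066, 0.200]
state_times = [0.000, 0.010, 0.020, 0.030, 0.040, 0.050, 0.060, 0.070]
print(align_nearest(frame_times, state_times))  # [0, 3, 7, None]
```

Frames that come back as None are candidates for dropping, or a sign that a stream stalled during recording. Nearest-neighbor matching is only one option; interpolating the state to the frame time is another.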
3. How Many Demos?
As a rough guide, simple tasks need 50–200 demonstrations and complex tasks 200–1000 or more. Pre-trained models (OpenVLA, Octo) can be fine-tuned with fewer. Use open datasets for pre-training when possible.
4. Quality Over Quantity
Coverage matters more than raw count: vary object poses and lighting, and include recoveries from failures so the policy sees how to correct mistakes. Avoid recording many near-identical demos. See What Makes Robot Data Learning-Ready.
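One simple way to catch repetitive demos is to compare the initial object position across episodes and flag those that start almost exactly where an earlier one did. The sketch below assumes each demo has a logged initial (x, y) object position in metres; the function name flag_near_duplicates, the 1 cm threshold, and the sample positions are hypothetical and chosen only for illustration.

```python
import math

def flag_near_duplicates(initial_positions, min_dist=0.01):
    """Return indices of demos that start too close to an earlier kept demo.

    initial_positions: list of (x, y) object positions in metres, one per demo.
    min_dist: hypothetical threshold in metres; tune it for your workspace.
    """
    kept = []      # indices of demos accepted as distinct so far
    flagged = []   # indices of demos considered near-duplicates
    for idx, pos in enumerate(initial_positions):
        too_close = any(
            math.dist(pos, initial_positions[k]) < min_dist for k in kept
        )
        if too_close:
            flagged.append(idx)
        else:
            kept.append(idx)
    return flagged

# Illustrative positions: demos 1 and 3 start within 1 cm of demo 0.
positions = [(0.30, 0.10), (0.30, 0.10), (0.45, 0.02), (0.302, 0.101), (0.20, 0.25)]
print(flag_near_duplicates(positions))  # [1, 3]
```

Initial pose is only a proxy for diversity. Flagged demos are worth reviewing rather than deleting automatically, since two demos can start in the same place and still differ in lighting or in how a failure was recovered.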
5. Related Resources
- Data Services — We collect learning-ready data for your tasks
- Datasets — DROID, BridgeData, ALOHA
- Models — OpenVLA, Octo for fine-tuning