
How to Collect Robot Demonstration Data

A practical guide to teleoperation and data collection for imitation learning.

1. Choose Your Teleoperation Setup

Common options:

  • Kinesthetic teaching: physically guide the arm through the task.
  • Bimanual teleoperation: two leader arms that drive two follower arms (e.g., ALOHA).
  • Mobile manipulation: a mobile base plus arms (e.g., Mobile ALOHA) for whole-body tasks.

See Teleoperation and Robot Platforms Comparison.

2. Data Format

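Record synchronized observations (camera images, robot state) and actions at every timestep. Common formats: HDF5 (ALOHA), RLDS (Open X-Embodiment), LeRobot (Hugging Face). Make sure the timestamps of all streams align, and record actions in the same space your policy will output (for example, joint positions versus end-effector deltas).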
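As a rough illustration of what "timestamps align" can mean in practice, the sketch below pairs each action with the nearest camera frame and drops steps where the two are too far apart. The function names, the sample rates, and the 10 ms tolerance are assumptions chosen for the example, not part of any of the formats above.

```python
from bisect import bisect_left

def nearest_index(sorted_ts, t):
    """Return the index of the timestamp in sorted_ts closest to t."""
    i = bisect_left(sorted_ts, t)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    # Pick whichever neighbour is closer; ties go to the earlier sample.
    return i if sorted_ts[i] - t < t - sorted_ts[i - 1] else i - 1

def align_streams(action_ts, image_ts, max_skew_s=0.01):
    """Pair each action timestamp with the nearest image timestamp.

    Returns (action_index, image_index) pairs and skips any step whose
    nearest image is more than max_skew_s seconds away.
    """
    pairs = []
    for a_idx, t in enumerate(action_ts):
        i_idx = nearest_index(image_ts, t)
        if abs(image_ts[i_idx] - t) <= max_skew_s:
            pairs.append((a_idx, i_idx))
    return pairs

# Hypothetical timestamps in seconds: a 50 Hz control loop and a ~30 Hz camera.
action_ts = [0.00, 0.02, 0.04, 0.06, 0.08]
image_ts = [0.001, 0.034, 0.067]

pairs = align_streams(action_ts, image_ts, max_skew_s=0.01)
print(pairs)  # [(0, 0), (2, 1), (3, 2)]
```

Dropping poorly matched steps is one option; interpolating the robot state to the camera timestamps is another, and which is better depends on how fast the scene changes relative to the frame rate.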

3. How Many Demos?

As a rough guide, simple tasks need 50–200 demonstrations and complex tasks need 200–1000 or more. Pre-trained models (OpenVLA, Octo) can often be fine-tuned with fewer. Use open datasets for pre-training when possible.

4. Quality Over Quantity

Vary object poses and lighting, and include recoveries from failures. Avoid collecting many near-identical demos, which add little beyond the first. See What Makes Robot Data Learning-Ready.
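One simple way to catch near-identical demos is to compare the initial object position of each episode and flag pairs that start almost in the same place. The sketch below assumes you log a 2D object position (in metres) at the start of each demo; the function name and the 2 cm threshold are illustrative assumptions, and a real check would likely also consider orientation, lighting, and the trajectory itself.

```python
import math

def near_duplicate_pairs(initial_poses, min_dist_m=0.02):
    """Return index pairs of demos whose initial object positions are
    closer than min_dist_m metres (a crude proxy for identical demos)."""
    dupes = []
    for i in range(len(initial_poses)):
        for j in range(i + 1, len(initial_poses)):
            if math.dist(initial_poses[i], initial_poses[j]) < min_dist_m:
                dupes.append((i, j))
    return dupes

# Hypothetical initial (x, y) object positions for four demos, in metres.
initial_poses = [(0.30, 0.10), (0.305, 0.10), (0.45, -0.05), (0.30, 0.25)]

dupes = near_duplicate_pairs(initial_poses)
print(dupes)  # [(0, 1)]
```

Here demos 0 and 1 start only 5 mm apart, so one of them could be re-collected with the object placed elsewhere.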

5. Related Resources

  • Data Services — We collect learning-ready data for your tasks
  • Datasets — DROID, BridgeData, ALOHA
  • Models — OpenVLA, Octo for fine-tuning