Open-Source Robotics Learning Datasets
A curated catalog of open-source datasets for robot manipulation, imitation learning, and reinforcement learning — with links to official sources.
Datasets for Robot Learning
Each dataset has a dedicated page with description, scale, access links, and citations.
DROID
76K trajectories, 350 hours, 86 tasks. In-the-wild manipulation from 50 collectors across 564 scenes. TensorFlow Datasets, Hugging Face.
BridgeData V2 (2023)
60K trajectories, 24 environments, 13 manipulation skills. Low-cost WidowX robot. Natural language labels, multi-task learning.
Open X-Embodiment (Google DeepMind)
1M+ episodes, 22 robot types, 500+ skills. Unified RLDS format. Used to train the RT-X models. Pooled from 33 academic labs.
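To make "unified RLDS format" concrete, here is a minimal sketch of how an RLDS-style episode is laid out: one record per episode holding an ordered list of steps. The step keys (`observation`, `action`, `reward`, `is_first`, `is_last`, `is_terminal`) follow the RLDS convention. The 7-dimensional state and action, the sparse reward, and the helper names `make_episode`, `is_well_formed`, and `episode_return` are assumptions for illustration and are not taken from any specific dataset. Real datasets are read through TensorFlow Datasets rather than built as plain dictionaries.

```python
from typing import Any, Dict, List


def make_episode(num_steps: int) -> Dict[str, Any]:
    """Build a synthetic episode with RLDS-style step fields (placeholder values)."""
    steps: List[Dict[str, Any]] = []
    for t in range(num_steps):
        last = t == num_steps - 1
        steps.append({
            "observation": {"state": [0.0] * 7},  # hypothetical 7-dim robot state
            "action": [0.0] * 7,                  # hypothetical 7-dim command
            "reward": 1.0 if last else 0.0,       # sparse success reward (assumed)
            "is_first": t == 0,
            "is_last": last,
            "is_terminal": last,
        })
    return {"steps": steps}


def is_well_formed(episode: Dict[str, Any]) -> bool:
    """Check that exactly the first step is flagged is_first and the last is_last."""
    steps = episode.get("steps", [])
    if not steps:
        return False
    first_flags = [s["is_first"] for s in steps]
    last_flags = [s["is_last"] for s in steps]
    return (first_flags[0] and sum(first_flags) == 1
            and last_flags[-1] and sum(last_flags) == 1)


def episode_return(episode: Dict[str, Any]) -> float:
    """Sum the per-step rewards of one episode."""
    return sum(s["reward"] for s in episode["steps"])
```

Because every dataset shares this episode-and-steps layout, the same checks and loaders can in principle be reused across robot types, with only the contents of `observation` and `action` differing per dataset.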
ALOHA (Stanford / NVIDIA)
Bimanual teleoperation. ALOHA-Cosmos-Policy, baseline datasets. HDF5, Hugging Face. Open hardware.
LIBERO (Benchmark)
130 tasks, 6.5K demos (50 per task). Lifelong learning benchmark. Spatial, object, goal suites. robosuite simulation.
RoboNet (Stanford / Berkeley)
15M frames, 7 robot platforms. Multi-robot transfer. Platforms include Sawyer, Franka, Baxter, Fetch, and WidowX.
RoboMimic & MimicGen (ARISE Initiative)
Framework + datasets. MimicGen: 50K demos from 200 human demos. Simulation + real. MIT license.
LeRobot (Hugging Face)
Standardized format + hub. DROID-100, ALOHA, SO-100. PyTorch, streaming. "ImageNet of robotics."