RoboMimic
A benchmark and dataset framework for robot imitation learning with standardized tasks and evaluation utilities.
COMMERCIAL-USE FACETS
The license tag is only a first-pass signal. truelabel's commercial-use field records a conservative, buyer-grade interpretation that combines license, consent posture, contributor terms, and downstream model-use restrictions, so teams know which datasets are commercial-ready before they invest in training.
DIRECT ANSWER
Commercial-use status is a buyer-grade verdict, not a restatement of the license. A dataset's code can ship under Apache-2.0 while its contributor or environmental footage still carries non-commercial restrictions inherited from the capture process. truelabel’s status is one of: allowed, restricted, unclear, or research-only.
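One way to picture a conservative verdict is as the most restrictive value across the individual facets. The sketch below is illustrative only and is not truelabel's actual logic: the `Facets` fields, the `commercial_use_status` function, and in particular the severity ordering (allowed < unclear < restricted < research-only) are assumptions made for this example.

```python
from dataclasses import dataclass, fields
from enum import IntEnum


class Status(IntEnum):
    # Assumed ordering from least to most restrictive (hypothetical).
    ALLOWED = 0
    UNCLEAR = 1
    RESTRICTED = 2
    RESEARCH_ONLY = 3


@dataclass(frozen=True)
class Facets:
    # Hypothetical per-facet readings for one dataset.
    license: Status
    consent: Status
    contributor_terms: Status
    model_use: Status


def commercial_use_status(facets: Facets) -> Status:
    """Return the most restrictive status found across all facets."""
    return max(getattr(facets, f.name) for f in fields(facets))


# Permissive code license, but contributor terms restrict commercial use.
example = Facets(
    license=Status.ALLOWED,
    consent=Status.ALLOWED,
    contributor_terms=Status.RESTRICTED,
    model_use=Status.ALLOWED,
)
print(commercial_use_status(example).name)  # RESTRICTED
```

Under these assumptions, a permissive license never overrides a stricter facet, which matches the idea that the license tag alone is only a first-pass signal.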
SOURCE APPEARS PERMISSIVE; VERIFY DATA TERMS · 5 DATASETS
Source appears permissive — verify data terms
A benchmark and dataset framework for robot imitation learning with standardized tasks and evaluation utilities.
A simulated manipulation benchmark for multi-task and meta-reinforcement learning.
A simulation benchmark and toolkit for manipulation skills and embodied AI policy evaluation.
An interactive simulated environment for embodied AI agents in household-like scenes.
A simulation framework and benchmark suite for robot manipulation tasks.
A large cross-institution collection of robot demonstrations spanning many embodiments and manipulation tasks.
A real-world robot manipulation dataset focused on diverse teleoperated demonstrations outside narrow lab-only settings.
A robot manipulation dataset from Berkeley focused on real-world behavior cloning and task generalization.
A robotics transformer data release associated with language-conditioned robot manipulation research.
A low-cost bimanual teleoperation platform and dataset family used for imitation learning in dexterous manipulation.
A multi-robot dataset for visual foresight and manipulation policy research.
COMMERCIAL USE RESTRICTED · 6 DATASETS
Commercial use restricted by license or consent
A large-scale egocentric video dataset focused on first-person human activity understanding.
An egocentric video dataset of kitchen activities used for action recognition and human-object interaction research.
An indoor RGB-D reconstruction dataset used for 3D scene understanding.
A human action video dataset focused on object interactions and temporal reasoning.
A large autonomous driving dataset with camera, LiDAR, and labeled traffic scenes.
A large video action recognition dataset used widely for video model pretraining.
KEEP DIGGING
A dataset record is only useful when it connects to the rest of the buyer workflow. The next review step is usually not another summary; it is a fit check, rights triage, source comparison, or custom bounty spec that names the missing proof.
For physical AI teams, the hard question is whether the public source can support a specific model objective under real deployment constraints. That requires adjacent dataset records, tools, comparisons, and sourcing paths, plus external references that a reviewer can open and challenge.
Use the links below to keep the review grounded. Start broad when discovery is incomplete, move into profile and comparison pages when the candidate source is known, and switch to custom collection when the blocker is rights, consent, geography, robot embodiment, or target environment coverage.
TRUELABEL ROUTING
If the catalog can't surface a commercial-use-allowed dataset for your task, commission custom data with explicit commercial-training terms, signed contributor consent, and per-batch QA gates.