ROBOT FACETS

Datasets by robot

Pick the embodiment your team is training on. Each robot facet aggregates the public datasets that ship trajectories, video, and metadata for that platform, with truelabel's commercial-use and consent-risk notes layered on.

DIRECT ANSWER

Datasets are facet-tagged by platform when the original capture used that robot or a close mechanical analogue. Robot-specific facets matter for sim-to-real transfer, controller compatibility, and gripper-equivalence checks before policy training.

16 datasets

Simulation

Synthetic robot environments used for scalable benchmark and policy development.

RT-1
RoboMimic
Meta-World

11 datasets

Franka

Research manipulator often used for tabletop manipulation and imitation-learning datasets.

Open X-Embodiment
DROID
BridgeData V2

3 datasets

ALOHA

Low-cost bimanual teleoperation platform frequently used for imitation-learning demonstrations.

Open X-Embodiment
ALOHA
LeRobot datasets

3 datasets

UR5

Industrial collaborative robot arm used in manipulation and automation research.

Open X-Embodiment
BridgeData V2
RoboNet

2 datasets

Humanoid

Human-shaped robot platforms for whole-body control, teleoperation, and deployment data.

AgiBot World
RoboCasa

2 datasets

Mobile manipulator

Robot platforms that combine a mobile base with one or more manipulation arms.

AgiBot World
RoboCasa

1 dataset

Sawyer

Rethink Robotics Sawyer manipulator used in several teleoperation and manipulation benchmark datasets.

RoboTurk

1 dataset

xArm

UFACTORY xArm manipulator family used in real-world collection, imitation-learning, and VLA evaluation setups.

RH20T

CROSS-CATALOG

Pair with another facet

Combine this facet with a second filter (modality, task, robot, format, license, or commercial-use) on the main dataset catalog to narrow the buyer decision faster.
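As a rough illustration of pairing two facets, the sketch below intersects two robot facets in Python. It is not truelabel's catalog API: the `ROBOT_FACETS` mapping and the `pair_facets` helper are hypothetical names, and the mapping holds only the sample dataset names shown on this page, not each facet's full dataset list.

```python
# Hypothetical in-memory index: facet name -> dataset names.
# Only the sample entries shown on this page are included (an assumption,
# not the full catalog behind each facet count).
ROBOT_FACETS = {
    "Franka": {"Open X-Embodiment", "DROID", "BridgeData V2"},
    "ALOHA": {"Open X-Embodiment", "ALOHA", "LeRobot datasets"},
    "UR5": {"Open X-Embodiment", "BridgeData V2", "RoboNet"},
    "Humanoid": {"AgiBot World", "RoboCasa"},
    "Mobile manipulator": {"AgiBot World", "RoboCasa"},
}


def pair_facets(index, first, second):
    """Return the datasets tagged with both facets, sorted by name."""
    return sorted(index[first] & index[second])


print(pair_facets(ROBOT_FACETS, "Franka", "UR5"))
# ['BridgeData V2', 'Open X-Embodiment']
```

The same set intersection would apply to a second filter of another kind (modality, task, format, license, or commercial-use), given a comparable mapping for that facet.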

Other facet hubs

Modality, Task, Robot, Format, License, Commercial use

A dataset record is only useful when it connects into the rest of the buyer workflow. The next review step is usually not another summary; it is a fit check, rights triage, source comparison, or custom bounty spec that names the missing proof.

For physical AI teams, the hard question is whether the public source can support a specific model objective under real deployment constraints. That requires adjacent dataset records, tools, comparisons, and sourcing paths, plus external references that a reviewer can open and challenge.

Use the links below to keep the review grounded. Start broad when discovery is incomplete, move into profile and comparison pages when the candidate source is known, and switch to custom collection when the blocker is rights, consent, geography, robot embodiment, or target environment coverage.

Curated profiles

Physical AI dataset catalog

Use the catalog to compare source-backed dataset profiles by modality, task, rights signal, consent risk, and deployment fit.

Broad discovery

Hugging Face robotics index

Scan the broader robotics dataset surface before narrowing into promoted profiles, comparisons, and custom collection specs.

Freshness layer

Dataset changelog

Track source updates, licensing notes, and buyer-readiness changes that should trigger a renewed review.

Buyer workflow

Dataset fit checker

Score whether a public source is enough for the model, rights path, modalities, and target environment.

Rights triage

License risk checker

Separate source license language from contributor consent, redistribution, private-space risk, and model-use assumptions.

Custom data path

Data spec generator

Turn a public-source gap into a scoped capture request with sample QA, metadata, and delivery requirements.

Supplier research

Vendor alternatives hub

Compare data providers when the answer is not another public dataset but a better sourcing or capture route.

Market map

Data annotation companies

Use the company index to separate annotation vendors, data engines, marketplaces, and specialist capture teams.

External reference

Scale AI physical AI data engine

Market context for why physical AI systems need custom, enriched, real-world data beyond generic labeling workflows.

External reference

LeRobot documentation

Robotics dataset and tooling context for Hugging Face based collection, sharing, conversion, and training workflows.

External reference

Open X-Embodiment

A cross-embodiment robotics dataset reference for comparing trajectory scale, robot diversity, and VLA training assumptions.

External reference

DROID dataset

A large in-the-wild robot manipulation dataset reference for real-world trajectory capture and deployment transfer risk.

TRUELABEL ROUTING

Need data for a robot we don't cover?

If your embodiment isn't in the catalog, commission a custom collection on your exact platform with sample QA and rights review.

Request platform-specific data
