Records
10
HF AUTHOR CLUSTER
Ten robotics-tagged Hugging Face records from InternRobotics, totaling 325,043 cumulative downloads. Some records cite published arXiv research.
DIRECT ANSWER
Author clusters consolidate every record from one publisher into a single buyer-review surface. InternRobotics ships 10 robotics datasets on Hugging Face. Most common license tag: not specified (6 of 10 records). Tier breakdown: 9 indexable as Tier A, 0 as Tier B, 1 demoted (its URL redirects here).
DATASETS
10 of 10 datasets
200,677 downloads · cc-by-nc-sa-4.0
[ICLR 2026] OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling 🎉NEWS [2026.3.21] 🔥 OmniWorld-Game with Metric Scale is now released! Check out our latest model Pi3X (an enhanced
47,044 downloads · cc-by-sa-4.0
InternData-N1 InternData-N1 is a large-scale, unified vision-language navigation dataset consolidating multiple benchmarks into a standardized format. With over 240,000 trajectories across 3,000+ scen
36,451 downloads · not specified
InternData-A1 InternData-A1 is a hybrid synthetic-real manipulation dataset containing over 630k trajectories and 7,433 hours across 4 embodiments, 18 skills, 70 tasks, and 227 scenes, covering rigid,
16,673 downloads · not specified
InternScenes InternScenes is a large-scale interactive indoor scene dataset with realistic layouts. This dataset comprises approximately 40,000 diverse scenes and 1.96M 3D objects that cover 15 common
8,119 downloads · apache-2.0
Sim1_Dataset LeRobot-format manipulation dataset for cloth folding / deformable object tasks. Format This repository follows a LeRobot-style layout per subset: data/chunk-xxx/episode_XXXXXX.parquet: f
6,154 downloads · not specified
InternData-M1 InternData-M1 is a comprehensive embodied robotics dataset containing 244K simulation demonstrations with rich frame-based information including 2D/3D boxes, trajectories, grasp points,
5,682 downloads · not specified
RoboInter-Data: Intermediate Representation Annotations for Robot Manipulation Rich, dense, per-frame intermediate representation annotations for robot manipulation, built on top of DROID and RH20T. D
1,842 downloads · not specified
MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning 🔑Key Features MesaTask-10K, a large-scale dataset for task-oriented tabletop scene generation, comprises approximately
1,429 downloads · not specified
RobotInter-VQA: Intermediate Representation Understanding & Generation VQA Dataset for Manipulation A Visual Question Answering dataset for robotic manipulation, developed as part of th
972 downloads · cc-by-nc-sa-4.0
Viewer is explicitly configured to read the parquet split only. Citation If you find this dataset useful, please cite: @article{zhao2026SythnVerse, title={SynthVerse: A Large-Scale Diverse Synthetic D
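The cluster totals quoted on this page can be rechecked from the per-record figures in the list above. The sketch below is a minimal example, assuming the download counts and license tags are copied by hand from this page; the names `RECORDS` and `summarize` are hypothetical and not part of any Hugging Face API.

```python
from collections import Counter

# (downloads, license tag) pairs copied from the ten records listed on this page.
RECORDS = [
    (200_677, "cc-by-nc-sa-4.0"),
    (47_044, "cc-by-sa-4.0"),
    (36_451, "not specified"),
    (16_673, "not specified"),
    (8_119, "apache-2.0"),
    (6_154, "not specified"),
    (5_682, "not specified"),
    (1_842, "not specified"),
    (1_429, "not specified"),
    (972, "cc-by-nc-sa-4.0"),
]


def summarize(records):
    """Return (record count, total downloads, license tag counts)."""
    total = sum(downloads for downloads, _ in records)
    licenses = Counter(tag for _, tag in records)
    return len(records), total, licenses


count, total_downloads, license_counts = summarize(RECORDS)
top_license, top_count = license_counts.most_common(1)[0]
print(f"{count} records, {total_downloads:,} downloads, "
      f"top license: {top_license} ({top_count})")
# prints: 10 records, 325,043 downloads, top license: not specified (6)
```

The sum matches the 325,043 cumulative downloads reported for the cluster, and "not specified" is the most common license tag because six of the ten records carry no stated license.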
RESEARCH PATHS
A dataset record is only useful when it connects into the rest of the buyer workflow. The next review step is usually not another summary; it is a fit check, rights triage, source comparison, or custom bounty spec that names the missing proof.
For physical AI teams, the hard question is whether the public source can support a specific model objective under real deployment constraints. That requires adjacent dataset records, tools, comparisons, and sourcing paths, plus external references that a reviewer can open and challenge.
Use the links below to keep the review grounded. Start broad when discovery is incomplete, move into profile and comparison pages when the candidate source is known, and switch to custom collection when the blocker is rights, consent, geography, robot embodiment, or target environment coverage.
INTERNAL LINKS
Use the catalog to compare source-backed dataset profiles by modality, task, rights signal, consent risk, and deployment fit.
Scan the broader robotics dataset surface before narrowing into promoted profiles, comparisons, and custom collection specs.
Track source updates, licensing notes, and buyer-readiness changes that should trigger a renewed review.
Score whether a public source is enough for the model, rights path, modalities, and target environment.
Separate source license language from contributor consent, redistribution, private-space risk, and model-use assumptions.
Turn a public-source gap into a scoped capture request with sample QA, metadata, and delivery requirements.
Compare data providers when the answer is not another public dataset but a better sourcing or capture route.
Use the company index to separate annotation vendors, data engines, marketplaces, and specialist capture teams.
EXTERNAL REFERENCES
Market context for why physical AI systems need custom, enriched, real-world data beyond generic labeling workflows.
Robotics dataset and tooling context for Hugging Face based collection, sharing, conversion, and training workflows.
A cross-embodiment robotics dataset reference for comparing trajectory scale, robot diversity, and VLA training assumptions.
A large in-the-wild robot manipulation dataset reference for real-world trajectory capture and deployment transfer risk.
TRUELABEL ROUTING
If the Hub records don't carry the license, consent, or deployment fit your team needs, commission a custom collection on the same modality with explicit commercial terms.
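As a rough illustration of that routing decision, the sketch below sorts this cluster's license tags into three review buckets. The bucket names and the `triage` function are hypothetical, and the mapping is an assumption based only on the general terms of each tag (an `-nc` tag restricts commercial use, a missing tag states no terms at all); it does not replace a rights review.

```python
from collections import Counter

# License tags and record counts as they appear in this cluster's dataset list.
LICENSE_COUNTS = {
    "cc-by-nc-sa-4.0": 2,
    "cc-by-sa-4.0": 1,
    "apache-2.0": 1,
    "not specified": 6,
}


def triage(license_tag):
    """Map a license tag to a coarse review bucket (assumed, illustrative mapping)."""
    tag = license_tag.lower()
    if tag == "not specified":
        return "unspecified"        # no stated terms: needs publisher contact or custom collection
    if "-nc" in tag:
        return "non_commercial"     # commercial use needs separately negotiated terms
    return "commercial_with_conditions"  # e.g. attribution, share-alike, notice requirements


def bucket_counts(license_counts):
    """Aggregate record counts per review bucket."""
    buckets = Counter()
    for tag, n in license_counts.items():
        buckets[triage(tag)] += n
    return buckets


buckets = bucket_counts(LICENSE_COUNTS)
for name, n in buckets.most_common():
    print(f"{name}: {n}")
```

Under these assumptions, eight of the ten records fall into the unspecified or non-commercial buckets, which is the situation where a custom collection with explicit commercial terms becomes the practical route.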