HF AUTHOR CLUSTER

allenai robotics datasets

3 robotics-tagged Hugging Face records from allenai, totaling 14,632 cumulative downloads. All 3 records are marked "Paper available" and cite published arXiv research.

DIRECT ANSWER

Author clusters consolidate every record from one publisher into a single buyer-review surface. allenai ships 3 robotics datasets on Hugging Face, all licensed cc-by-4.0. Of those, 3 get a full standalone page, 0 get a shorter profile, and 0 are folded into this cluster.

Robotics-tagged records: 3

Hub signal (cumulative downloads): 14,632

First-pass rights (top license): cc-by-4.0

3 of 3 datasets

MolmoAct-Dataset

Published Sep 2025 · cc-by-4.0 · allenai

This dataset was created using LeRobot. It contains the MolmoAct Dataset in LeRobot format. All contents in this dataset were collected in-house by Ai2.

6,845 downloads · 27 likes · 1M<n<10M · Paper available

Image · Timeseries · Parquet

MolmoAct-Pretraining-Mixture

Published Sep 2025 · cc-by-4.0 · allenai

Data mixture used for MolmoAct pretraining. Contains a subset of OXE (Open X-Embodiment) formulated as Action Reasoning Data, along with auxiliary robot data and a link to Multimodal Web data.

5,929 downloads · 10 likes · 10M<n<100M · Paper available

Image · Text · Parquet

MolmoAct-Midtraining-Mixture

Published Aug 2025 · cc-by-4.0 · allenai

Data mixture used for MolmoAct midtraining. Contains the MolmoAct Dataset formulated as Action Reasoning Data.

1,858 downloads · 4 likes · 1M<n<10M · Paper available

Image · Text · Parquet
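
The cluster-level figures can be cross-checked against the three records listed above. The following is a minimal sketch, assuming the cumulative total is a plain sum of per-record downloads and that "top license" means the most frequent license across records. The `summarize` helper and the dictionary layout are hypothetical and are not part of any Hugging Face or Truelabel API; the numbers are copied by hand from this page.

```python
from collections import Counter

# Per-record figures copied from the listing on this page (not fetched live).
records = [
    {"name": "MolmoAct-Dataset", "downloads": 6845, "likes": 27, "license": "cc-by-4.0"},
    {"name": "MolmoAct-Pretraining-Mixture", "downloads": 5929, "likes": 10, "license": "cc-by-4.0"},
    {"name": "MolmoAct-Midtraining-Mixture", "downloads": 1858, "likes": 4, "license": "cc-by-4.0"},
]


def summarize(rows):
    """Return record count, summed downloads, and the most common license.

    Hypothetical helper: assumes 'cumulative downloads' is a simple sum
    and 'top license' is the most frequent license value.
    """
    license_counts = Counter(row["license"] for row in rows)
    top_license, _ = license_counts.most_common(1)[0]
    return {
        "records": len(rows),
        "cumulative_downloads": sum(row["downloads"] for row in rows),
        "top_license": top_license,
    }


summary = summarize(records)
print(summary)
```

Under those assumptions the per-record downloads sum to 14,632, which matches the cluster total stated at the top of the page. Download counts on the Hub change over time, so a live check would need fresh figures.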

A dataset record is only useful when it connects into the rest of the buyer workflow. The next review step is usually not another summary; it is a fit check, rights triage, source comparison, or custom bounty spec that names the missing proof.

For physical AI teams, the hard question is whether the public source can support a specific model objective under real deployment constraints. That requires adjacent dataset records, tools, comparisons, and sourcing paths, plus external references that a reviewer can open and challenge.

Use the links below to keep the review grounded. Start broad when discovery is incomplete, move into profile and comparison pages when the candidate source is known, and switch to custom collection when the blocker is rights, consent, geography, robot embodiment, or target environment coverage.

TRUELABEL ROUTING

Need data like allenai ships, but with cleaner rights?

If the Hub records don't carry the license, consent, or deployment fit your team needs, commission a custom collection on the same modality with explicit commercial terms.

