truelabel

FREE TOOL

Robotics dataset license checker

A conservative procurement triage tool for public physical-world datasets. Score license terms, model-output rights, consent, private-space exposure, provenance, and review artifacts before model training.

DIRECT ANSWER

License clarity and contributor consent are separate questions. A dataset can publish terms and still lack enough consent or private-space evidence for a commercial physical AI use case.

Risk presets

  • Use and terms
  • Consent and privacy

Risk signal: CRITICAL (100/100)

Quarantine from model access until a legal review resolves license, consent, and sensitive-context blockers.

Why this score changed (a scoring sketch follows this list)

  • commercial use language is not explicit
  • model weights or downstream outputs are not clearly allowed
  • contributor or site consent evidence is missing
  • redistribution, transformation, or dataset delivery may be limited
  • derivative datasets or enriched labels may not be allowed
  • data includes identifiable people or private environments
  • PII screening is not documented
  • there is no takedown or contributor removal process
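
The sketch below shows how additive flag scoring like this could work. The flag names and weights are illustrative assumptions, not truelabel's published rubric; only the eight signals themselves come from the list above.

    # Illustrative sketch only: flag names and weights are assumptions,
    # not truelabel's published rubric. Each matched signal adds to a
    # risk score that is capped at 100.
    RISK_WEIGHTS = {
        "commercial_use_not_explicit": 20,
        "model_outputs_not_covered": 20,
        "consent_evidence_missing": 15,
        "redistribution_limited": 10,
        "derivatives_unclear": 10,
        "identifiable_people_or_private_spaces": 15,
        "pii_screening_undocumented": 10,
        "no_takedown_process": 10,
    }

    def risk_score(flags: set[str]) -> int:
        """Sum the weights of matched flags, capped at 100."""
        return min(sum(RISK_WEIGHTS[f] for f in flags if f in RISK_WEIGHTS), 100)

    # All eight signals matched: 110 raw, capped to the 100/100 CRITICAL example.
    print(risk_score(set(RISK_WEIGHTS)))  # 100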

Required review packet (a structure sketch follows the list)

  • Source URL, version hash or release date, license text, and terms snapshot.
  • Contributor, operator, or site consent artifacts mapped to samples.
  • Statement covering model training, fine-tuning, evaluation, embeddings, and derivative outputs.
  • Redistribution, transformation, retention, and takedown language.
  • PII and private-space screening notes with reviewer, date, and unresolved exceptions.
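
One hypothetical shape for that packet, with field names assumed from the checklist above rather than taken from a published truelabel schema:

    # Assumed field names; not a published truelabel schema.
    from dataclasses import dataclass, field

    @dataclass
    class ReviewPacket:
        source_url: str
        version_hash_or_release_date: str
        license_text_snapshot: str
        consent_artifacts: dict[str, str] = field(default_factory=dict)  # sample id -> artifact ref
        model_use_statement: str = ""    # training, fine-tuning, eval, embeddings, derivatives
        redistribution_terms: str = ""   # redistribution, transformation, retention, takedown
        pii_screening_notes: str = ""    # reviewer, date, unresolved exceptions

        def is_complete(self) -> bool:
            """Reviewable only when every artifact is present."""
            return all([
                self.source_url, self.version_hash_or_release_date,
                self.license_text_snapshot, self.consent_artifacts,
                self.model_use_statement, self.redistribution_terms,
                self.pii_screening_notes,
            ])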

Counsel triggers (a routing sketch follows the list)

  • unknown license or terms require review
  • commercial use route is not explicit
  • trained weights, embeddings, eval reports, or derivative outputs are not clearly covered
  • consent artifacts are missing or incomplete
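
A minimal routing sketch, assuming any single matched trigger is enough to require counsel; the trigger identifiers are illustrative:

    # Assumption: one matched trigger is sufficient to route to counsel.
    COUNSEL_TRIGGERS = {
        "license_unknown",
        "commercial_route_not_explicit",
        "model_outputs_not_covered",
        "consent_artifacts_incomplete",
    }

    def needs_counsel(flags: set[str]) -> bool:
        """True if any counsel trigger appears among the matched flags."""
        return bool(flags & COUNSEL_TRIGGERS)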

Questions to answer before model access

  • Can model weights trained or evaluated on this data be sold, deployed, or shared?
  • Can the dataset be transformed, enriched, redistributed, or delivered to a customer?
  • Do people, homes, workplaces, vehicles, biometrics, or private locations appear?
  • Can the buyer remove samples if a contributor revokes consent or a source changes terms?
  • Is there a clean separation between research-only use and commercial model development?

METHODOLOGY

What the license checker separates

This checker is procurement triage, not legal advice. It forces separate answers for license family, commercial-use language, trained-model and output rights, redistribution, derivative data, consent, private-space exposure, PII screening, provenance, and takedown process.
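
One way to picture that forced separation is as independent fields that each demand their own answer, with unknowns treated as risk rather than passes. The field names in this sketch are assumptions, not the checker's real schema.

    # Assumed dimension names; unknowns count as risk, not as passes.
    from enum import Enum

    class Answer(Enum):
        YES = "yes"
        NO = "no"
        UNKNOWN = "unknown"

    DIMENSIONS = [
        "license_family", "commercial_use_language", "model_and_output_rights",
        "redistribution", "derivative_data", "consent",
        "private_space_exposure", "pii_screening", "provenance",
        "takedown_process",
    ]

    def triage(answers: dict[str, Answer]) -> dict[str, Answer]:
        """Force an explicit answer per dimension; missing ones stay UNKNOWN."""
        return {d: answers.get(d, Answer.UNKNOWN) for d in DIMENSIONS}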

That separation matters for robotics and embodied AI because a dataset can have a published license while still containing people, homes, workplaces, vehicles, or sensitive operating contexts that need consent and privacy review.

Use the result to decide whether the source can proceed to sample parsing, must stay in research-only evaluation, needs counsel review, or should be quarantined from model access until missing artifacts exist.

INTERPRETATION RULES

How to read the result

Proceed carefully (low risk)

Low risk means the artifacts checked under the selected preset are present. It does not mean the dataset is approved for every commercial product, customer, or geography.

Quarantine route (high or critical)

High scores should block model access until counsel and data owners resolve rights, consent, sensitive-context, provenance, and redistribution questions.
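
A sketch of how score bands could map to routes. The boundaries are assumptions for illustration; only the ordering of routes comes from the rules on this page.

    # Band boundaries are illustrative assumptions, not truelabel's cutoffs.
    def route(score: int) -> str:
        if score >= 70:
            return "quarantine"        # block model access until blockers resolve
        if score >= 40:
            return "counsel review"    # open rights, consent, or provenance questions
        if score >= 15:
            return "research-only"     # evaluate, but no commercial model development
        return "proceed carefully"     # artifacts present; scoped, not blanket approval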

Evidence trail (required packet)

The output should become an artifact checklist: source snapshot, license text, consent map, model-use language, PII notes, and takedown route.

TOOL FOLLOW-UP

Every tool output should route to evidence

A calculator or checker is useful only when it changes the buyer's next step. The output should send the user toward dataset research, rights review, format requirements, budget planning, or a bounty spec with concrete acceptance criteria.

The internal links below make that workflow explicit. They keep tool pages from becoming isolated utilities and give crawlers as well as users a path into deeper catalog, template, briefing, and provider research.

External references are included because tool outputs need calibration against the wider robotics data ecosystem. Buyers should be able to compare truelabel's workflow assumptions with public robotics datasets, developer tooling, and market signals.

Use the tool result as a draft memo, not a final answer. A buyer still needs a source link, a sample packet, a rights note, and a concrete acceptance rule before the output becomes a procurement decision. The links below are the evidence trail for that memo.

INTERNAL LINKS

Continue the buyer workflow

EXTERNAL REFERENCES

Source context to verify