Egocentric Video Datasets
An egocentric video dataset contains video captured from the viewpoint of a person, wearable device, or robot. Robotics and physical AI teams use these datasets to study hands, objects, tools, spaces, task progress, and interaction context. Consent, provenance, licensing, and task fit need to be evaluated separately for each source.
Comparison
| Public example | Useful for | Caveat to verify |
|---|---|---|
| Ego4D | Broad egocentric daily-life research | Access terms and task fit |
| Ego-Exo4D | Paired first- and third-person skilled activities | Annotation and viewpoint fit |
| EPIC-KITCHENS | Egocentric kitchen activity recognition | Non-commercial licensing caveats |
What egocentric datasets typically capture
Egocentric video datasets usually capture a first-person view of task execution. Public references such as Ego4D, Ego-Exo4D, and EPIC-KITCHENS help teams benchmark tasks and terminology, but each source must be evaluated for access terms, scope, modality, and limitations before use [1] [2] [3].
Modalities and tasks to specify
A buyer brief should separate modality from task. Modality describes what is captured; task describes what the model should learn or evaluate.
| Dimension | Examples | Why it matters |
|---|---|---|
| Modality | RGB, audio, gaze, IMU, depth, pose, narration | Defines capture hardware and annotation needs |
| Task | Action recognition, anticipation, hand-object interaction, SLAM | Defines labels, clips, and acceptance checks |
| Domain | Kitchen, workshop, warehouse, home, retail | Defines object set and environment context |
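The separation in the table above can be sketched as a small data structure. This is a minimal illustration, not a standard schema: the `BuyerBrief` class, its field names, and the lowercase vocabulary strings are all hypothetical, and the allowed values are simply the examples from the table.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Hypothetical vocabularies taken from the table's examples; extend as needed.
MODALITIES = frozenset({"rgb", "audio", "gaze", "imu", "depth", "pose", "narration"})
TASKS = frozenset({"action_recognition", "anticipation", "hand_object_interaction", "slam"})
DOMAINS = frozenset({"kitchen", "workshop", "warehouse", "home", "retail"})


@dataclass(frozen=True)
class BuyerBrief:
    """Illustrative brief that records modality, task, and domain separately."""
    modalities: FrozenSet[str]
    tasks: FrozenSet[str]
    domain: str

    def problems(self) -> List[str]:
        """Return readable issues; an empty list means the brief is well-formed."""
        issues = []
        if not self.modalities:
            issues.append("no modality specified")
        if not self.tasks:
            issues.append("no task specified")
        # Values outside the known vocabulary often mean a task was filed
        # as a modality, or the other way around.
        for m in sorted(self.modalities - MODALITIES):
            issues.append(f"unknown modality: {m}")
        for t in sorted(self.tasks - TASKS):
            issues.append(f"unknown task: {t}")
        if self.domain not in DOMAINS:
            issues.append(f"unknown domain: {self.domain}")
        return issues


brief = BuyerBrief(
    modalities=frozenset({"rgb", "imu"}),
    tasks=frozenset({"hand_object_interaction"}),
    domain="kitchen",
)
print(brief.problems())
```

Keeping the three dimensions as separate fields makes a mixed-up brief easy to catch. For instance, listing `"slam"` under modalities would be reported as an unknown modality in this sketch.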
Public datasets are references, not commercial supply by default
Public datasets are useful for benchmarking and for establishing task language. They should not be treated as commercial training supply unless the source terms, the consent basis, and the downstream license all allow that use. For example, the public EPIC-KITCHENS-100 annotation license contains visible non-commercial language [4].
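The three conditions above can be expressed as a simple gate. The sketch below is an assumption-laden illustration: `RightsReview` and its fields are hypothetical names, the boolean values stand for findings a human reviewer would record after reading the actual terms, and the dataset in the example is a placeholder rather than a statement about any real dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RightsReview:
    """Hypothetical record of a reviewer's findings for one dataset."""
    dataset: str
    source_terms_allow_commercial: bool
    consent_basis_covers_use: bool
    downstream_license_allows_use: bool

    def blockers(self) -> List[str]:
        """List every condition that failed, in a fixed order."""
        checks = [
            (self.source_terms_allow_commercial, "source terms"),
            (self.consent_basis_covers_use, "consent basis"),
            (self.downstream_license_allows_use, "downstream license"),
        ]
        return [name for ok, name in checks if not ok]

    def cleared_for_commercial_training(self) -> bool:
        """All three conditions must hold; any single failure blocks use."""
        return not self.blockers()


# Placeholder values for illustration only, not a finding about a real dataset.
review = RightsReview(
    dataset="example-public-dataset",
    source_terms_allow_commercial=False,
    consent_basis_covers_use=True,
    downstream_license_allows_use=True,
)
print(review.cleared_for_commercial_training(), review.blockers())
```

The gate defaults to blocking: a dataset is treated as reference material until every condition has been affirmatively checked.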
External references and source context
[1] Egocentric video remains useful but incomplete for robot data buyers. Ego4D is an official public reference for egocentric video dataset scope, access, and dataset documentation. ego4d-data.org
[2] Ego-Exo4D project site. Ego-Exo4D is the official project source for paired first-person and third-person skilled-activity capture. ego-exo4d-data.org
[3] EPIC-KITCHENS project site. EPIC-KITCHENS is an official project reference for egocentric kitchen-activity data. epic-kitchens.github.io
[4] EPIC-KITCHENS-100 annotations license. The EPIC-KITCHENS-100 annotation license is a visible source for non-commercial licensing caveats. GitHub
[5] Ego4D: Around the World in 3,000 Hours of Egocentric Video. The Ego4D paper is the source-backed reference for first-person daily-life activity video and benchmark design. arXiv
[6] Ego-Exo4D annotations documentation. Ego-Exo4D annotation documentation supports dataset-structure and skilled-activity-label discussion. docs.ego-exo4d-data.org
[7] Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives. The Ego-Exo4D paper describes skilled human activity from first- and third-person perspectives. arXiv
[8] Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100. The EPIC-KITCHENS-100 paper supports public kitchen-activity benchmark facts and caveats. arXiv
FAQ
What is an egocentric video dataset?
It is a dataset of videos recorded from the viewpoint of the actor or agent, often using wearable, head-mounted, handheld, or robot-mounted cameras.
What are egocentric video datasets used for?
They support research and planning for action recognition, anticipation, hand-object interaction, task understanding, and embodied AI evaluation.
What modalities are common in egocentric video data?
Common modalities include RGB video, audio, gaze, IMU, depth, pose, point cloud data, narration, and task annotations.
Can public egocentric datasets be used commercially?
Do not assume so. Review the dataset source, license, consent basis, and downstream model-use terms before using any public dataset in a commercial program.
Looking for egocentric video datasets?
Specify modality, task, environment, rights, and delivery format. Truelabel matches you with vetted capture partners and helps scope consent artifacts and commercial licensing requirements before delivery.