
Physical AI Data Glossary

Force-Torque Sensing

Force-torque (F/T) sensing measures six-dimensional interaction vectors—three linear forces (Fx, Fy, Fz) and three rotational torques (Tx, Ty, Tz)—at a robot's joints or end-effector. Dedicated wrist-mounted strain-gauge sensors and integrated joint-torque arrays both capture contact dynamics invisible to cameras, enabling policy learning for insertion, assembly, and compliant manipulation tasks where force feedback is the primary control signal.

Updated 2025-06-08
By truelabel
Reviewed by truelabel

Quick facts

Term
Force-Torque Sensing
Domain
Robotics and physical AI
Last reviewed
2025-06-08

What Force-Torque Sensing Measures in Robot Manipulation

Force-torque sensors output a six-element vector at each timestep: three translational forces along Cartesian axes (push/pull in x, y, z) and three rotational moments about those axes (twist around x, y, z). A wrist-mounted ATI Mini45 sensor samples at 7 kHz and resolves forces down to 0.01 N; joint-torque arrays in the Franka FR3 report per-joint estimates at 1 kHz[1]. Both modalities convert mechanical strain into electrical signals via strain gauges or current sensing, then transform raw readings into a task-frame wrench.
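The task-frame transformation mentioned above follows the standard wrench change-of-frame formula. A minimal numpy sketch, assuming a rotation `R` and offset `p` obtained from forward kinematics, and using one common convention (rotate both components, then add the lever-arm torque p × f):

```python
import numpy as np

def transform_wrench(wrench_sensor, R, p):
    """Transform a 6-D wrench [Fx, Fy, Fz, Tx, Ty, Tz] from the sensor
    frame to a task frame, given the rotation R (3x3) and translation
    p (3,) relating the two frames (illustrative convention)."""
    f = R @ wrench_sensor[:3]                    # rotate force component
    t = R @ wrench_sensor[3:] + np.cross(p, f)   # rotate torque, add lever-arm term
    return np.concatenate([f, t])

# Identity rotation, 0.1 m offset along z: a pure 10 N x-force
# picks up a 1 Nm torque about y in the new frame.
w = transform_wrench(np.array([10.0, 0, 0, 0, 0, 0]), np.eye(3), np.array([0, 0, 0.1]))
```

The lever-arm term is exactly why uncalibrated mounting transforms corrupt downstream labels: a few centimeters of frame error injects spurious torque proportional to the measured force.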

Contact-rich tasks—peg insertion, snap-fit assembly, surface polishing—require force feedback because vision cannot directly observe interaction mechanics. A camera sees a gripper approach a socket but cannot measure whether the peg is binding on an edge or seated correctly. F/T data disambiguates these states, providing the supervision signal for imitation policies and the reward signal for reinforcement learning. The DROID dataset includes wrist F/T streams for 76,000 manipulation trajectories; RH20T logs joint torques across 20 tasks[2].

Sensor placement determines what interactions you capture. Wrist sensors measure end-effector contact; joint-torque sensors measure actuator load, which includes gravity, inertia, and friction in addition to external forces. Buyers must verify that dataset documentation specifies sensor type, mounting frame, calibration procedure, and whether gravity compensation was applied—metadata gaps make F/T streams unusable for transfer learning.

Dedicated Wrist Sensors vs Integrated Joint-Torque Arrays

Dedicated wrist sensors mount between the robot flange and tool, measuring the complete wrench at the end-effector. ATI Industrial Automation and OnRobot manufacture strain-gauge transducers with force ranges from ±12 N (Mini40) to ±660 N (Omega191), sampling at 1–10 kHz. These devices report in a sensor-local frame; transformation to world or tool coordinates requires forward kinematics and careful calibration[3].

Integrated joint-torque sensing embeds current or strain measurement in each actuator. The Franka FR3 and Universal Robots UR series report per-joint torques at 1 kHz without external hardware. Joint torques reflect the sum of external contact, gravitational load, Coriolis forces, and friction; separating the external component requires a dynamic model and real-time estimation. The LeRobot framework includes torque preprocessing pipelines that subtract model-predicted internal forces, yielding an external-torque estimate suitable for policy input[4].

Dataset buyers should prefer wrist sensors for tasks where end-effector contact dominates (insertion, grasping) and joint torques for whole-arm interaction (collaborative assembly, human handover). The Open X-Embodiment collection mixes both: 23 datasets include wrist F/T, 14 include joint torques, and 8 include neither[5]. Missing F/T streams limit a dataset's utility for contact-rich transfer; truelabel's intake form requires explicit sensor metadata to surface this gap during procurement.

F/T Data in Imitation and Reinforcement Learning Pipelines

Imitation learning policies consume F/T streams as additional observation channels. RT-1 concatenates wrist force with RGB and proprioception, feeding a 12-channel tensor into a transformer backbone. RT-2 extends this to language-conditioned control, where force thresholds gate task-phase transitions (e.g., a sustained contact force above a set threshold signals that the approach phase has ended and manipulation can begin).

Logging F/T Streams in RLDS, HDF5, and MCAP Formats

The RLDS specification stores F/T data in `observation['force_torque']` arrays with shape `[6]` or `[num_joints]` depending on sensor type. Each episode's `steps` dataset includes a timestamped wrench vector; metadata fields document sensor model, mounting frame, and calibration date[6]. The DROID dataset uses RLDS with per-step wrist F/T at 30 Hz, synchronized to 480×640 RGB at 15 Hz via shared timestamps.

HDF5 archives organize F/T as `/episode_N/force_torque` datasets with attributes for units (Newtons, Newton-meters), frame ID, and sensor serial number. The RoboNet dataset logs joint torques in `/observations/joint_torques` groups, storing 7-DOF arrays for Sawyer arms and 6-DOF for UR5 robots[7]. Buyers must parse HDF5 attributes to distinguish raw sensor output from gravity-compensated estimates.
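A buyer-side check of this layout might look like the following h5py sketch. The group path and attribute names are assumptions modeled on the conventions just described, and the file is built in memory purely for illustration:

```python
import h5py
import numpy as np

# Build an illustrative episode in RAM (backing_store=False avoids disk I/O).
with h5py.File("episode.h5", "w", driver="core", backing_store=False) as f:
    ft = f.create_dataset("episode_0/force_torque", data=np.zeros((100, 6)))
    ft.attrs["units"] = "N, Nm"
    ft.attrs["frame_id"] = "wrist"
    ft.attrs["gravity_compensated"] = True

    # Buyer-side validation: reject data whose compensation status
    # is undocumented rather than guessing raw vs compensated.
    d = f["episode_0/force_torque"]
    assert "gravity_compensated" in d.attrs, "missing compensation metadata"
    first_wrench = d[0]   # one timestep, shape (6,)
```

Parsing attributes up front, before any training run, is the cheapest way to catch the raw-vs-compensated ambiguity the paragraph above warns about.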

MCAP containers serialize F/T as `geometry_msgs/WrenchStamped` messages in ROS ecosystems. The MCAP format supports schema evolution, allowing datasets to add force channels without breaking existing parsers. LeRobot's dataset loader reads MCAP, HDF5, and RLDS interchangeably, normalizing F/T arrays to a common `[B, T, 6]` tensor shape for policy training[8].
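The common-shape step can be sketched as a pad-and-stack over variable-length episodes. This hypothetical `to_btc` helper is a simplified stand-in (zero padding, no masking) for a loader's normalization, not LeRobot's actual API:

```python
import numpy as np

def to_btc(episodes):
    """Pad per-episode F/T arrays (each shaped [T_i, 6]) to a common
    [B, T_max, 6] tensor, zero-padding the shorter episodes."""
    t_max = max(e.shape[0] for e in episodes)
    out = np.zeros((len(episodes), t_max, 6))
    for i, e in enumerate(episodes):
        out[i, :e.shape[0]] = e
    return out

# Two episodes of different lengths batch into one tensor.
batch = to_btc([np.ones((5, 6)), np.ones((8, 6))])
```

A production loader would also carry a validity mask alongside the padded tensor so the policy never trains on padding.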

Calibration, Drift, and Gravity Compensation in F/T Datasets

Uncalibrated F/T sensors report biased readings due to thermal drift, mounting stress, and gravity. A wrist sensor at rest under a 2 kg gripper reads a constant −19.6 N in the z-axis; subtracting this bias requires a no-contact calibration routine before each session. The DROID collection protocol performs 10-second zero-force calibrations at episode start, storing bias vectors in episode metadata[9].
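The zero-force routine reduces to averaging a no-contact window and subtracting it from subsequent readings; a minimal sketch:

```python
import numpy as np

def zero_force_bias(readings):
    """Estimate static bias from a no-contact window (e.g. the
    10-second rest at episode start) by averaging over time."""
    return readings.mean(axis=0)

# A 2 kg gripper at rest biases Fz by roughly -19.6 N (m * g).
rest = np.tile([0.0, 0.0, -19.6, 0.0, 0.0, 0.0], (100, 1))
bias = zero_force_bias(rest)
compensated = rest - bias   # near-zero once the bias is removed
```

Storing `bias` in episode metadata, as the DROID protocol does, lets buyers re-apply or undo the correction later.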

Joint-torque sensors require dynamic compensation: measured torque equals external torque plus gravitational torque plus velocity-dependent terms. The Franka FR3 SDK provides `getExternalTorque()` methods that subtract a real-time rigid-body model, but model errors accumulate during fast motion. Buyers should verify whether dataset torques are raw or compensated; Open X-Embodiment documentation flags 6 datasets with uncompensated joint torques, limiting their transfer utility[5].
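The compensation arithmetic can be illustrated on a single joint with a one-link gravity model; the mass and length here are illustrative, not FR3 parameters:

```python
import numpy as np

def gravity_torque(q, m=2.0, l=0.3, g=9.81):
    """Gravity torque on one link of mass m with center of mass at
    distance l, joint angle q measured from horizontal (1-DOF model)."""
    return m * g * l * np.cos(q)

def external_torque(tau_measured, q):
    """Residual after subtracting the model: an estimate of contact torque."""
    return tau_measured - gravity_torque(q)

# At q = 0 the link alone loads the joint with ~5.886 Nm, so a
# measured 7.0 Nm implies ~1.114 Nm of external contact torque.
tau_ext = external_torque(7.0, 0.0)
```

A full arm replaces `gravity_torque` with the complete rigid-body dynamics (gravity, Coriolis, friction), which is where the model errors mentioned above enter during fast motion.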

Thermal drift shifts a sensor's zero point by 0.1–0.5 N over a 30-minute session. The RH20T dataset logs ambient temperature and recalibrates every 50 episodes, reducing drift-induced label noise. Truelabel's provenance schema requires calibration timestamps and thermal logs for F/T datasets, surfacing quality gaps that vision-only metadata misses[10].

Contact-Rich Task Categories That Require F/T Supervision

Insertion tasks—USB plugs, battery installation, connector mating—generate force spikes when parts bind on chamfers or edges. A policy trained on vision alone cannot distinguish a successful insertion from a jammed attempt; F/T feedback provides the discriminative signal. The CALVIN benchmark includes 4 insertion tasks with wrist F/T; success rates drop 40 percent when F/T channels are ablated[11].

Assembly tasks require monitoring multi-contact states: a snap-fit joint seats with a 15 N spike, a threaded fastener shows rising torque until seated. The Furniture Bench dataset logs 6-axis wrist forces during chair assembly; policies trained with F/T achieve 68 percent success vs 31 percent for vision-only baselines. Scale AI's Universal Robots partnership targets assembly data collection at 50,000 episodes by mid-2025[12].

Surface treatment—polishing, sanding, wiping—requires constant contact pressure. A compliant controller uses F/T feedback to maintain 5–10 N normal force while following a curved surface. The CloudFactory industrial robotics solution collects teleoperation data for surface tasks, logging wrist forces at 100 Hz alongside RGB-D streams. Truelabel's marketplace lists 8 surface-treatment datasets with synchronized F/T, totaling 12,000 trajectories[13].

Sampling Rates, Synchronization, and Temporal Alignment

F/T sensors sample at 1–10 kHz; cameras capture at 15–60 Hz; proprioceptive encoders report at 100–1000 Hz. Temporal misalignment between modalities introduces label noise: a 50 ms camera lag means a contact event appears in the F/T data 5–50 samples (at 100 Hz–1 kHz) before the corresponding visual change. The RLDS format requires per-step timestamps in microseconds, enabling post-hoc synchronization[6].

The DROID dataset downsamples wrist F/T from 1 kHz to 30 Hz via moving-average filtering, then interpolates to match 15 Hz camera frames using cubic splines. This introduces ≤33 ms alignment error, acceptable for slow manipulation but problematic for impact tasks. Open X-Embodiment datasets vary in sync precision: 12 use hardware-triggered capture (≤1 ms jitter), 11 use software timestamps (10–50 ms jitter)[5].
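The filter-then-interpolate step can be sketched with numpy. Linear interpolation stands in for DROID's cubic splines, and the averaging window length is illustrative:

```python
import numpy as np

def resample_ft(ft, t_ft, t_target, window=33):
    """Moving-average filter a high-rate F/T stream, then interpolate
    each of its 6 channels onto the target (e.g. camera) timestamps."""
    kernel = np.ones(window) / window
    smoothed = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, ft)
    return np.stack(
        [np.interp(t_target, t_ft, smoothed[:, i]) for i in range(ft.shape[1])],
        axis=1)

t_ft = np.arange(0, 1, 0.001)                      # 1 kHz sensor clock
ft = np.tile(np.sin(2 * np.pi * t_ft)[:, None], (1, 6))
t_cam = np.arange(0, 1, 1 / 15)                    # 15 Hz camera clock
aligned = resample_ft(ft, t_ft, t_cam)
```

Note that the moving average smears sharp impact transients, which is exactly why the text flags lossy downsampling as problematic for impact tasks.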

LeRobot's data loader resamples all modalities to a common 10 Hz policy frequency, applying zero-order hold to F/T and linear interpolation to joint positions. Buyers training high-frequency policies (≥50 Hz) should verify that source datasets preserve native F/T sampling; truelabel's metadata schema flags datasets with lossy downsampling[13].

Sim-to-Real Transfer Challenges for Force-Torque Policies

Physics simulators model contact via penalty methods or constraint solvers, producing F/T estimates that diverge from real-world sensor noise and compliance. Domain randomization varies simulated contact stiffness and damping, but cannot replicate sensor quantization, thermal drift, or cable-induced crosstalk. Policies trained purely in simulation show 50–70 percent success-rate drops on real F/T-based tasks[14].

The RLBench simulation suite generates synthetic F/T by querying PyBullet contact points, then adds Gaussian noise (σ = 0.5 N). Real ATI sensors exhibit non-Gaussian noise with 0.01–0.1 N quantization and occasional 10 N spikes from electromagnetic interference. The CALVIN dataset pairs simulated and real F/T for 4 tasks, enabling sim-to-real adaptation via domain-adversarial training[11].
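A rough noise-augmentation model along these lines layers Gaussian noise, ADC quantization, and rare EMI-style spikes onto simulated wrenches; all parameter values here are illustrative:

```python
import numpy as np

def sensor_noise(ft, rng, sigma=0.5, quant=0.05, spike_prob=1e-4, spike=10.0):
    """Augment simulated F/T with noise sources a penalty-method
    simulator does not model: Gaussian noise, quantization, and
    occasional large interference spikes."""
    noisy = ft + rng.normal(0.0, sigma, ft.shape)
    noisy = np.round(noisy / quant) * quant            # ADC quantization
    spikes = rng.random(ft.shape) < spike_prob         # rare EMI events
    noisy[spikes] += spike * rng.choice([-1, 1], spikes.sum())
    return noisy

rng = np.random.default_rng(0)
noisy = sensor_noise(np.zeros((1000, 6)), rng)
```

Randomizing over such sensor models during sim pretraining is one hedge against the non-Gaussian behavior of real hardware described above.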

Scale AI's physical-AI data engine collects real-world F/T at volume (target: 100,000 contact-rich episodes by 2025) to reduce sim dependence. CloudFactory's industrial robotics pipeline combines sim pretraining with 5,000-episode real-world fine-tuning, achieving 85 percent transfer success on insertion tasks. Truelabel's marketplace prioritizes real-robot F/T datasets, listing 14 collections with ≥1,000 trajectories each[13].

Annotation and Labeling Workflows for F/T-Rich Datasets

Human annotators label contact events—touch onset, slip detection, grasp success—by inspecting synchronized F/T plots and video. Scale AI's annotation platform renders force magnitude as a color-coded timeline below video frames; annotators mark phase boundaries (approach, contact, manipulation, release) with millisecond precision[3].

Automatic segmentation pipelines detect contact via force-threshold crossings (e.g., ||F|| > 2 N) and torque spikes (e.g., ΔT > 0.5 Nm in 100 ms). The DROID preprocessing script segments 76,000 trajectories into 340,000 contact phases using a 1.5 N threshold, then filters false positives via video review. Labelbox's platform supports custom F/T visualizations via Python plugins, enabling domain-specific threshold tuning.
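The threshold rule reduces to edge detection on a boolean contact mask; a minimal sketch of the ||F|| > 2 N criterion:

```python
import numpy as np

def contact_phases(ft, threshold=2.0):
    """Return (start, end) index pairs for spans where the force norm
    exceeds the threshold, mirroring the ||F|| > 2 N rule."""
    in_contact = np.linalg.norm(ft[:, :3], axis=1) > threshold
    edges = np.diff(in_contact.astype(int))
    starts = np.where(edges == 1)[0] + 1
    ends = np.where(edges == -1)[0] + 1
    if in_contact[0]:                       # episode starts in contact
        starts = np.r_[0, starts]
    if in_contact[-1]:                      # episode ends in contact
        ends = np.r_[ends, len(in_contact)]
    return list(zip(starts, ends))

ft = np.zeros((10, 6))
ft[3:6, 2] = 5.0                 # one contact burst in Fz
phases = contact_phases(ft)      # one phase spanning steps 3-6
```

Real pipelines add hysteresis or debouncing on top of this, since raw threshold crossings chatter around the boundary; the video-review filtering DROID applies serves the same purpose.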

Appen's data annotation service trains labelers on F/T interpretation using 20-hour curricula covering sensor physics, noise patterns, and task-specific failure modes. CloudFactory's accelerated annotation combines ML pre-labeling (contact detection via 1D CNNs on F/T streams) with human review, achieving 3× throughput vs manual labeling. Truelabel's marketplace requires F/T datasets to include phase labels and contact annotations, ensuring downstream usability[13].

Commercial F/T Sensor Specifications and Dataset Compatibility

ATI Industrial Automation manufactures the Mini40 (±12 N, ±120 Nmm, 0.01 N resolution) and Gamma (±32 N, ±2.5 Nm, 0.02 N resolution) for research arms. OnRobot's HEX-E sensor (±200 N, ±20 Nm) targets collaborative robots; its USB interface streams at 500 Hz. The Franka FR3 integrates joint-torque sensing in all 7 axes, reporting at 1 kHz via the Franka Control Interface[1].

Dataset compatibility depends on sensor range and resolution. A policy trained on Mini40 data (±12 N) will saturate when deployed on a UR10 with HEX-E (±200 N) unless forces are normalized. The Open X-Embodiment datasets span 6 sensor models with force ranges from ±10 N to ±300 N; RT-1 normalizes all F/T inputs to [−1, 1] via per-dataset statistics[5].
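The range normalization can be sketched as clip-and-scale per channel. Dividing by the rated range is a simplification of RT-1's per-dataset statistics, but it shows where saturation bites:

```python
import numpy as np

def normalize_ft(ft, f_range):
    """Scale readings by the sensor's rated range so policies see
    [-1, 1] regardless of hardware; values beyond the range saturate."""
    return np.clip(ft / f_range, -1.0, 1.0)

# On a +/-12 N Mini40, a 24 N reading is indistinguishable from 12 N.
mini40 = normalize_ft(np.array([6.0, -12.0, 24.0]), 12.0)
```

The clipped third value illustrates the saturation problem: forces a wide-range sensor would resolve are lost when the training sensor's range is narrower than the deployment sensor's.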

Buyers should verify sensor specifications in dataset metadata: force/torque range, resolution, sampling rate, mounting frame, and calibration matrix. The RLDS schema includes `sensor_name` and `sensor_config` fields; DROID populates these with ATI Mini45 serial numbers and calibration dates. Truelabel's provenance schema enforces sensor metadata completeness, rejecting submissions with missing calibration records[10].

F/T Data Volume and Coverage in Existing Robot Datasets

The Open X-Embodiment collection aggregates 1 million trajectories from 22 robot embodiments; 23 datasets include wrist F/T (totaling 180,000 trajectories), and 14 include joint torques (240,000 trajectories)[5]. The DROID dataset contributes 76,000 trajectories with 6-axis wrist forces sampled at 30 Hz, covering 564 object categories and 12 task families[9].

RH20T logs joint torques for 20 dexterous manipulation tasks using a Shadow Hand, totaling 20,000 episodes. BridgeData V2 includes wrist F/T for 60,000 kitchen manipulation trajectories collected on a WidowX arm. RoboNet aggregates 7-DOF joint torques from Sawyer and Baxter arms across 15 labs, totaling 140,000 trajectories[7].

Coverage gaps remain: insertion tasks represent 8 percent of F/T-labeled data, assembly 12 percent, and surface treatment 3 percent. Scale AI's physical-AI roadmap targets 50,000 insertion episodes and 30,000 assembly episodes by Q4 2025. Truelabel's marketplace lists 14 F/T datasets (12,000–76,000 trajectories each) with explicit task-category tags, enabling targeted procurement[13].

Procurement Considerations: Licensing, Calibration Records, and Metadata

F/T datasets require stricter metadata than vision-only collections. Buyers must verify sensor calibration matrices, mounting-frame transforms, gravity-compensation methods, and thermal-drift logs. The DROID dataset publishes per-episode calibration vectors and ambient temperature; Open X-Embodiment datasets vary in completeness—6 lack calibration metadata entirely[5].

Licensing terms must address sensor-specific restrictions. ATI sensors ship with calibration certificates tied to serial numbers; redistributing calibration matrices may violate terms of sale. Creative Commons BY 4.0 permits redistribution but does not override hardware-vendor agreements. RoboNet's dataset license explicitly disclaims sensor calibration data, requiring buyers to recalibrate on their own hardware[15].

Truelabel's provenance schema requires F/T datasets to include sensor model, serial number, calibration date, mounting frame, and compensation method. Submissions missing any field are flagged for seller clarification. Truelabel's intake form auto-validates sensor metadata against a registry of 40 commercial F/T sensors, rejecting unknown models[13].


External references and source context

  1. Franka FR3 specifications: 7-axis joint-torque sensing at 1 kHz (franka.de)
  2. RH20T project site: 20,000 episodes with joint-torque logs across 20 tasks (rh20t.github.io)
  3. Scale AI, "Expanding Our Data Engine for Physical AI": targets 100,000 contact-rich episodes by 2025 (scale.com)
  4. LeRobot technical report: preprocessing and normalization for F/T streams (arXiv)
  5. "Open X-Embodiment: Robotic Learning Datasets and RT-X Models": 23 datasets with wrist F/T, 14 with joint torques, 8 with neither (arXiv)
  6. RLDS GitHub repository: metadata fields for sensor model, frame, calibration date (GitHub)
  7. RoboNet GitHub repository: HDF5 structure for joint-torque storage (GitHub)
  8. LeRobot documentation: torque preprocessing pipelines for external-force estimation (Hugging Face)
  9. "DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset": 10-second zero-force calibrations at episode start (arXiv)
  10. Truelabel data provenance glossary: requires calibration timestamps and thermal logs (truelabel.ai)
  11. CALVIN GitHub repository: sim-to-real F/T pairs for domain adaptation (GitHub)
  12. Scale AI + Universal Robots partnership: 50,000 assembly episodes target by mid-2025 (scale.com)
  13. Truelabel physical AI data marketplace: intake form requires explicit sensor metadata (truelabel.ai)
  14. "Crossing the Reality Gap: A Survey on Sim-to-Real Transferability of Robot Controllers in Reinforcement Learning": 50–70% success-rate drop for F/T policies trained in simulation (arXiv)
  15. RoboNet dataset license: disclaims sensor calibration data, requires buyer recalibration (GitHub)


FAQ

Why do robot manipulation datasets need force-torque data when cameras capture contact visually?

Cameras observe geometry and motion but cannot measure interaction forces directly. A peg visually seated in a socket may be jammed on an edge (high force) or correctly inserted (low force)—states indistinguishable in RGB frames. Force-torque sensors provide the ground-truth supervision signal for contact-rich policies, enabling learning of compliant behaviors that vision alone cannot resolve. The DROID dataset shows 40 percent higher insertion success rates when F/T channels are included vs vision-only training.

What is the difference between wrist-mounted force sensors and joint-torque sensors?

Wrist-mounted sensors measure the complete 6-DOF wrench at the end-effector via strain gauges, reporting external contact forces directly. Joint-torque sensors measure actuator current or strain at each joint, capturing the sum of external forces, gravity, inertia, and friction. Wrist sensors require external hardware (e.g., ATI Mini45) but provide clean contact signals; joint torques are built into robots like the Franka FR3 but require dynamic compensation to isolate external forces. Dataset buyers should prefer wrist sensors for end-effector tasks (grasping, insertion) and joint torques for whole-arm interaction (collaborative assembly).

How do I verify that a robot dataset's force-torque data is properly calibrated?

Check dataset metadata for sensor model, serial number, calibration date, mounting frame, and gravity-compensation method. Properly calibrated datasets include per-episode zero-force bias vectors and thermal logs. The RLDS format stores calibration metadata in episode-level attributes; DROID publishes 10-second calibration routines at episode start. Truelabel's provenance schema enforces these fields, rejecting submissions with missing calibration records. Uncalibrated F/T data introduces 0.5–2 N bias errors that degrade policy transfer.

Can force-torque policies trained in simulation transfer to real robots?

Sim-to-real transfer for F/T policies is harder than for vision because simulators cannot replicate sensor noise, quantization, thermal drift, or cable crosstalk. Policies trained purely in PyBullet or MuJoCo show 50–70 percent success-rate drops on real contact tasks. Effective transfer requires real-world fine-tuning datasets: Scale AI targets 100,000 real F/T episodes by 2025, and the CALVIN dataset pairs simulated and real F/T for domain adaptation. Buyers should prioritize real-robot F/T datasets over synthetic data for contact-rich applications.

What sampling rate and synchronization precision do I need for force-torque training data?

F/T sensors sample at 1–10 kHz; downsampling to 30–100 Hz via moving-average filtering preserves contact events for most manipulation tasks. Synchronization with cameras (15–60 Hz) requires microsecond timestamps to avoid label noise—a 50 ms lag means contact appears in the F/T data 5–50 samples before visual confirmation. The RLDS format enforces per-step timestamps; DROID achieves ≤33 ms alignment via cubic-spline interpolation. High-frequency policies (≥50 Hz) need native-rate F/T; truelabel's metadata flags datasets with lossy downsampling.

Which commercial robots and datasets provide the best force-torque coverage for manipulation tasks?

The Franka FR3 integrates 7-axis joint-torque sensing at 1 kHz; the DROID dataset (76,000 trajectories) uses wrist-mounted ATI Mini45 sensors. Open X-Embodiment aggregates 23 F/T datasets (180,000 wrist-force trajectories, 240,000 joint-torque trajectories) across 22 embodiments. RH20T logs joint torques for 20,000 dexterous tasks; BridgeData V2 includes 60,000 kitchen trajectories with wrist F/T. Truelabel's marketplace lists 14 F/T datasets with 12,000–76,000 trajectories each, tagged by task category (insertion, assembly, surface treatment).

Find datasets covering force-torque sensing

Truelabel surfaces vetted datasets and capture partners working with force-torque sensing. Tell us the modality, scale, and rights you need, and we will route you to the closest match.

List F/T datasets on truelabel