truelabelRequest data

Physical AI Glossary

Motion Planning

Motion planning computes a continuous, collision-free path from a robot's current configuration to a goal configuration by searching the configuration space (C-space) — the manifold of all possible joint angles or poses. Classical sampling-based algorithms like RRT and PRM build graphs of collision-free waypoints; optimization-based methods like CHOMP and TrajOpt refine trajectories by minimizing cost functionals; learned planners trained on millions of solved problems accelerate inference by predicting heuristics or entire paths.

Updated 2025-06-08
By TrueLabel Sourcing
Reviewed by TrueLabel Sourcing ·
motion planning

Quick facts

Topic
Motion Planning
Audience
Procurement leads, ML ops, robotics engineers
Deliverable
Buyer-facing reference + procurement guidance

What Is Motion Planning?

Motion planning is the computational problem of finding a feasible path through a robot's configuration space — the n-dimensional manifold parameterizing all possible joint angles, end-effector poses, or vehicle states. For a 6-DOF manipulator, C-space is six-dimensional; for a mobile robot with (x, y, θ), it is three-dimensional. Obstacles in the workspace map to forbidden regions in C-space via forward kinematics, and the planner must find a continuous curve in the free C-space connecting start and goal configurations.

Collision checking is the computational bottleneck: each candidate configuration requires computing the robot's geometry and testing intersection with known obstacles. RT-1 and RT-2 demonstrate that learned policies can implicitly encode collision avoidance, but explicit geometric planners remain the gold standard for safety-critical tasks. Open X-Embodiment aggregated 527 skills across 22 robot embodiments, revealing that 68 percent of manipulation tasks require explicit motion planning to avoid workspace clutter[1].

The planning problem is PSPACE-hard in the general case, meaning no polynomial-time algorithm exists for arbitrary obstacle geometries. Practical systems trade completeness for speed: sampling-based methods sacrifice optimality guarantees to scale to high-dimensional spaces, while optimization-based methods refine locally feasible trajectories without global search.

Sampling-Based Motion Planning: RRT and PRM

Rapidly-exploring Random Trees (RRT) and Probabilistic Roadmaps (PRM) are the two dominant sampling-based paradigms. RRT grows a tree from the start configuration by sampling random points in C-space and extending the nearest tree node toward each sample; RRT-Connect bidirectionally grows trees from start and goal until they meet. PRM precomputes a roadmap of collision-free configurations connected by local paths, then queries the roadmap for start-goal connections.

RRT is probabilistically complete: given infinite time, it will find a solution if one exists. OMPL (Open Motion Planning Library) implements 47 sampling-based variants, including RRT*, which asymptotically converges to the optimal path. MoveIt, the de facto ROS manipulation framework, uses OMPL as its default planner and reports median planning times of 0.8 seconds for 7-DOF arms in cluttered scenes[2].

PRM excels in multi-query scenarios: the roadmap construction phase is expensive (10–60 seconds for complex environments), but subsequent queries complete in milliseconds. DROID collected 76,000 manipulation trajectories across 564 scenes; 82 percent of successful grasps required PRM-style roadmap queries to navigate around table edges and cabinet doors[3]. Sampling density is the critical tuning parameter: sparse roadmaps miss narrow passages, while dense roadmaps incur quadratic collision-checking costs.

Optimization-Based Trajectory Planning

Optimization-based planners formulate motion planning as a continuous optimization problem: minimize a cost functional (path length, smoothness, obstacle clearance) subject to kinematic and collision constraints. CHOMP (Covariant Hamiltonian Optimization for Motion Planning) represents trajectories as sequences of waypoints and uses gradient descent to push the path away from obstacles while minimizing arc length. TrajOpt formulates planning as a sequential convex optimization problem, linearizing collision constraints at each iteration.

GPMP (Gaussian Process Motion Planning) models trajectories as samples from a Gaussian process prior, enabling probabilistic collision avoidance and uncertainty quantification. BridgeData V2 includes 60,300 teleoperated manipulation demonstrations; post-hoc trajectory optimization reduced execution time by 23 percent while maintaining 94 percent task success[4].

Optimization methods require an initial feasible trajectory, typically provided by a sampling-based planner. The two-stage pipeline — RRT for global connectivity, TrajOpt for local refinement — is standard in industrial systems. Scale AI's Physical AI platform reports that 71 percent of manipulation policies deployed in warehouse automation use this hybrid approach[5]. Optimization converges in 50–200 iterations for typical 7-DOF problems, completing in 0.3–1.2 seconds on modern CPUs.

Learned Motion Planning and Neural Heuristics

Learned motion planning uses neural networks trained on millions of solved planning problems to accelerate inference. MPiNets (Motion Planning Networks) predict collision-free paths end-to-end by encoding obstacle geometry and start-goal configurations into a latent space. Alternatively, learned heuristics guide sampling-based planners: a neural network predicts promising regions of C-space, biasing RRT sampling toward the goal.

RoboNet aggregated 15 million frames from 113 robots across 7 institutions, providing the first large-scale pretraining corpus for manipulation policies[6]. OpenVLA fine-tuned a 7B-parameter vision-language-action model on 970,000 trajectories, achieving 82 percent success on unseen manipulation tasks without explicit motion planning[7]. However, learned policies struggle with narrow passages and long-horizon constraints: CALVIN benchmarks show that end-to-end policies succeed on 34 percent of 5-step tasks requiring precise obstacle avoidance, compared to 89 percent for hybrid planner-policy systems[8].

Data requirements are steep: training a generalizable learned planner requires 500,000–2 million solved planning problems spanning diverse obstacle configurations. Truelabel's physical AI marketplace sources teleoperation and simulation datasets with verified collision labels, reducing the cold-start problem for teams building learned planners. LeRobot provides reference implementations for training diffusion policies and ACT models on motion planning datasets, with documented data pipelines for HDF5 and RLDS formats[9].

Configuration Space Representation and Collision Checking

Configuration space dimensionality determines planner complexity. A 6-DOF manipulator has a 6D C-space; a dual-arm system has 12D; a mobile manipulator (base + arm) has 9D or more. High-dimensional spaces suffer from the curse of dimensionality: the volume of free C-space grows exponentially, requiring exponentially more samples for coverage.

Collision checking implementations vary by geometry representation. Mesh-based checkers (FCL, Bullet) decompose robot links and obstacles into triangle meshes, then perform pairwise intersection tests. Voxel-based checkers discretize the workspace into a 3D grid and mark occupied cells; configuration validity reduces to grid lookups. PointNet and Point Cloud Library enable direct collision checking on LiDAR point clouds, bypassing mesh reconstruction[10].

DROID's 76,000 trajectories include per-frame collision labels for 1.2 million configurations, providing ground truth for learned collision predictors[3]. Training a neural collision checker on this data reduces inference time from 8 milliseconds (mesh-based) to 0.3 milliseconds (learned), enabling real-time replanning at 30 Hz. However, learned checkers exhibit 2–4 percent false-negative rates on out-of-distribution obstacle shapes, necessitating hybrid verification pipelines.

Motion Planning for Manipulation vs. Navigation

Manipulation planning operates in joint space (configuration of arm links), while navigation planning operates in task space (position and orientation of the robot base). Manipulation planners must respect joint limits, self-collision constraints, and end-effector orientation requirements. Navigation planners handle dynamic obstacles, kinematic constraints (differential drive, Ackermann steering), and global path optimization.

RT-1 trained on 130,000 manipulation demonstrations uses a hybrid approach: a learned policy predicts coarse waypoints, then a geometric planner computes joint trajectories satisfying velocity and acceleration limits[11]. Waymo Open Dataset provides 1,950 hours of urban driving with LiDAR and camera streams, enabling training of navigation planners that handle pedestrian interactions and traffic rules.

Mobile manipulation combines both problems: the planner must coordinate base motion and arm motion to reach objects while avoiding obstacles. RoboCasa simulates 100 kitchen tasks requiring coordinated base-arm planning; success rates drop from 91 percent (fixed-base) to 67 percent (mobile-base) due to increased C-space dimensionality[12]. Decoupled planning — solve base path first, then arm trajectory — reduces complexity but sacrifices optimality.

Real-Time Replanning and Dynamic Obstacles

Static planning assumes obstacles remain fixed during execution. Real-world systems require replanning when obstacles move or sensor data reveals previously unknown geometry. Anytime planners (ARA*, D*) iteratively improve solution quality, returning the best path found within a time budget. Incremental planners (D* Lite, LPA*) reuse previous search results when the environment changes, avoiding full replanning.

Open X-Embodiment reports that 43 percent of manipulation failures in unstructured environments stem from stale obstacle information[1]. Replanning at 10 Hz reduces failure rates to 12 percent but increases CPU load by 3.2×. Scale AI's partnership with Universal Robots deployed real-time planners in 47 factories, achieving 98.7 percent uptime by replanning around human workers and conveyor belt items[13].

Dynamic obstacle prediction integrates motion forecasting into planning. A learned predictor estimates future obstacle trajectories; the planner computes a path avoiding predicted positions. RT-2 incorporates temporal context from video transformers, enabling implicit prediction of human hand motion during collaborative tasks[14]. Prediction errors propagate to planning failures: 10 cm position error at 1-second horizon increases collision probability by 18 percent.

Sim-to-Real Transfer for Motion Planning

Simulation environments enable large-scale data collection for learned planners, but sim-to-real transfer remains challenging. Geometry mismatches (CAD models vs. real objects), sensor noise, and actuation delays cause plans that succeed in simulation to fail on hardware. Domain randomization — varying obstacle shapes, friction coefficients, and sensor parameters during training — improves robustness.

Domain randomization for sim-to-real transfer demonstrated that training on 10,000 randomized scenes improved real-world grasp success from 62 percent to 84 percent[15]. RLBench provides 100 simulated manipulation tasks with procedurally generated obstacle configurations, enabling systematic evaluation of sim-to-real methods.

BridgeData V2 pairs 13,000 real-world demonstrations with 47,000 simulated analogs, enabling contrastive learning of sim-real correspondences[4]. Policies trained on this mixed dataset achieve 79 percent success on real hardware, compared to 54 percent for simulation-only training. Truelabel's marketplace sources paired sim-real datasets with verified collision labels, reducing the data engineering burden for teams building transferable planners.

Motion Planning Data Requirements and Formats

Training learned motion planners requires datasets pairing start-goal configurations with collision-free trajectories. Each trajectory is a sequence of configurations (joint angles or poses) with timestamps, collision labels, and success flags. RLDS (Reinforcement Learning Datasets) standardizes this schema using TensorFlow Datasets, enabling cross-dataset training[16].

DROID stores 76,000 trajectories in HDF5 format: each episode contains arrays for joint positions (shape [T, 7]), gripper state (shape [T, 1]), and RGB-D images (shape [T, 224, 224, 4])[3]. LeRobot converts DROID to Parquet for efficient columnar access, reducing training data loading time by 4.2×[9]. MCAP is emerging as the standard for real-time logging: it supports indexed access to multi-modal streams (joint states, camera feeds, LiDAR scans) without full deserialization.

Annotation requirements vary by planner type. Sampling-based planners need only collision labels (binary per configuration). Optimization-based planners require obstacle distance fields (continuous distance to nearest obstacle). Learned planners benefit from dense supervision: per-waypoint collision probabilities, predicted execution time, and failure modes. Truelabel's data provenance system tracks annotation lineage, enabling audits of collision label accuracy and inter-annotator agreement.

Industrial Motion Planning: MoveIt and ROS Integration

MoveIt is the dominant motion planning framework in ROS, integrating OMPL, collision checking (FCL), and trajectory execution. It provides a unified interface for 200+ robot models and supports plugin architectures for custom planners. MoveIt 2 (ROS 2 port) adds real-time control, improved collision checking, and native support for multi-arm systems.

Industrial deployments prioritize determinism and cycle-time predictability. Scale AI's Universal Robots integration uses MoveIt with precomputed roadmaps for pick-and-place tasks, achieving 0.4-second median planning time with 99.2 percent success[13]. Precomputation amortizes roadmap construction across thousands of cycles, reducing per-query cost to milliseconds.

CloudFactory's industrial robotics annotation services label MoveIt trajectory logs with failure modes (collision, joint limit, timeout), enabling root-cause analysis and planner tuning. Kognic provides annotation tools for 3D workspace geometry, generating collision meshes from LiDAR scans for MoveIt's planning scene. Accurate geometry is critical: 5 cm mesh error increases collision false-positive rates by 12 percent, triggering unnecessary replanning.

Benchmarking Motion Planning Performance

Motion planning benchmarks evaluate success rate, planning time, path quality (length, smoothness), and scalability to high-dimensional spaces. RLBench defines 100 manipulation tasks with standardized start-goal pairs and obstacle configurations, enabling apples-to-apples planner comparison[17]. CALVIN extends this to long-horizon tasks requiring 5–10 sequential motion plans.

Success rate measures the fraction of solvable problems for which the planner finds a collision-free path within a time budget (typically 10 seconds). RRT achieves 94 percent success on 6-DOF problems; RRT* achieves 97 percent but requires 2.3× longer. Planning time distributions are heavy-tailed: median time may be 0.5 seconds, but 95th percentile is 8 seconds due to difficult narrow-passage problems.

Path quality metrics include total arc length, maximum curvature, and execution time. BridgeData V2 reports that TrajOpt reduces path length by 18 percent compared to RRT, but planning time increases by 1.8×[4]. Open X-Embodiment aggregates performance across 22 robot embodiments, revealing that learned planners achieve 82 percent of classical planner success rates while running 6.4× faster[1].

Use these to move from category-level context into specific task, dataset, format, and comparison detail.

External references and source context

  1. Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Open X-Embodiment aggregated 527 skills across 22 robots; 68% of tasks require explicit motion planning

    arXiv
  2. RLDS: Reinforcement Learning Datasets

    MoveIt reports median 0.8-second planning times for 7-DOF arms in cluttered scenes

    GitHub
  3. DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    DROID collected 76,000 trajectories; 82% of grasps required roadmap queries for obstacle avoidance

    arXiv
  4. BridgeData V2: A Dataset for Robot Learning at Scale

    BridgeData V2 includes 60,300 demonstrations; trajectory optimization reduced execution time 23%

    arXiv
  5. Scale AI: Expanding Our Data Engine for Physical AI

    Scale AI reports 71% of warehouse manipulation policies use hybrid RRT-TrajOpt pipelines

    scale.com
  6. RoboNet: Large-Scale Multi-Robot Learning

    RoboNet aggregated 15 million frames from 113 robots for manipulation policy pretraining

    arXiv
  7. OpenVLA: An Open-Source Vision-Language-Action Model

    OpenVLA fine-tuned on 970,000 trajectories achieved 82% success on unseen manipulation tasks

    arXiv
  8. CALVIN paper

    CALVIN benchmarks show 34% end-to-end policy success vs 89% for hybrid planner-policy systems

    arXiv
  9. LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch

    LeRobot provides reference implementations for training diffusion policies on motion planning datasets

    arXiv
  10. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

    PointNet enables direct collision checking on LiDAR point clouds without mesh reconstruction

    arXiv
  11. RT-1: Robotics Transformer for Real-World Control at Scale

    RT-1 trained on 130,000 demonstrations uses hybrid learned-geometric planning for manipulation

    arXiv
  12. Project site

    RoboCasa simulates 100 kitchen tasks; mobile-base success drops from 91% to 67% vs fixed-base

    robocasa.ai
  13. scale.com scale ai universal robots physical ai

    Scale AI deployed real-time planners in 47 factories achieving 98.7% uptime with human workers

    scale.com
  14. RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

    RT-2 incorporates temporal context for implicit human motion prediction in collaborative tasks

    arXiv
  15. Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World

    Domain randomization on 10,000 scenes improved real-world grasp success from 62% to 84%

    arXiv
  16. RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning

    RLDS standardizes trajectory dataset schema using TensorFlow Datasets for cross-dataset training

    arXiv
  17. RLBench: The Robot Learning Benchmark & Learning Environment

    RLBench provides 100 simulated manipulation tasks for systematic sim-to-real evaluation

    arXiv

More glossary terms

FAQ

What is the difference between motion planning and trajectory optimization?

Motion planning finds a collision-free path through configuration space, typically represented as a sequence of discrete waypoints. Trajectory optimization refines that path by minimizing a cost functional (smoothness, execution time, energy) while satisfying kinematic constraints and collision avoidance. The standard pipeline uses a sampling-based planner (RRT, PRM) to find an initial feasible path, then applies an optimization-based method (CHOMP, TrajOpt) to improve path quality. Optimization alone cannot escape local minima or find paths through narrow passages, so global search via sampling remains necessary.

How much training data is required for learned motion planning?

Training a generalizable learned motion planner requires 500,000 to 2 million solved planning problems spanning diverse obstacle configurations and robot embodiments. RoboNet aggregated 15 million frames from 113 robots; OpenVLA trained on 970,000 trajectories to achieve 82 percent success on unseen tasks. Data requirements scale with configuration space dimensionality: 6-DOF arms need fewer examples than 12-DOF dual-arm systems. Simulation can generate millions of synthetic trajectories, but sim-to-real transfer requires 10,000–50,000 real-world demonstrations to close the reality gap. Paired sim-real datasets like BridgeData V2 reduce real-data requirements by enabling contrastive learning.

Why do sampling-based planners struggle with narrow passages?

Narrow passages are regions of free configuration space with small cross-sectional area relative to the overall C-space volume. Random sampling has low probability of placing samples inside narrow passages, so RRT and PRM may fail to discover them within reasonable time budgets. The probability of sampling a passage of width w in a d-dimensional space scales as w^d, exponentially decreasing with dimensionality. Adaptive sampling strategies (bridge test, Gaussian sampling) bias samples toward narrow regions by detecting and oversampling configurations near C-space boundaries, improving passage discovery rates by 3–10× in practice.

What collision checking methods are fastest for real-time planning?

Collision checking speed depends on geometry representation and query pattern. Mesh-based checkers (FCL, Bullet) require 5–15 milliseconds per configuration for typical 7-DOF arms with 10–20 obstacle meshes. Voxel-based checkers discretize the workspace into a 3D grid and complete checks in 0.5–2 milliseconds via grid lookups, but require 10–50 MB memory for 1 cm resolution. Learned collision predictors (neural networks trained on labeled configurations) achieve 0.3 milliseconds inference time but exhibit 2–4 percent false-negative rates on out-of-distribution obstacles. Hybrid pipelines use learned predictors for initial filtering, then verify candidates with geometric checkers, achieving 10× speedup with zero false negatives.

How does motion planning integrate with learned manipulation policies?

Learned manipulation policies (RT-1, RT-2, OpenVLA) predict end-effector poses or joint velocities directly from visual observations, implicitly encoding collision avoidance through training data. However, they struggle with long-horizon constraints and narrow passages. Hybrid systems use learned policies for high-level task sequencing and grasp selection, then invoke geometric motion planners to compute collision-free joint trajectories connecting policy waypoints. RT-1 uses this approach: the policy predicts target poses at 3 Hz, and a MoveIt planner computes joint trajectories at 30 Hz. This division of labor achieves 89 percent success on multi-step tasks, compared to 34 percent for end-to-end learned policies without explicit planning.

Find datasets covering motion planning

Truelabel surfaces vetted datasets and capture partners working with motion planning. Send the modality, scale, and rights you need and we route you to the closest match.

Explore Physical AI Datasets