Tutorial 7: Camera-Based Obstacle Detection¶

🟡 Difficulty: Intermediate ⏱️ Estimated Time: 30-45 minutes

Overview¶

curobo_ros can integrate depth camera data directly into the collision checker. As new depth frames arrive, the BLOX world model is updated in real time, and any subsequent generate_trajectory call plans around the detected obstacles.

The recommended strategy is depth_camera, which subscribes to a depth image topic and feeds frames into cuRobo’s BLOX voxel world. A point_cloud strategy also exists but is experimental.

Architecture¶

Depth Camera (RealSense, Azure Kinect, etc.)
       │  /depth_to_rgb/image_raw  (raw — includes robot body)
       ▼
robot_segmentation node             (recommended — see Section 6)
  ├── joint states → collision spheres (CudaRobotModel)
  └── removes robot pixels → /masked_depth_image
       │
       ▼
DepthMapCameraStrategy              (configured in cameras.yaml)
  ├── intrinsics: camera_info topic or hardcoded
  ├── extrinsics: config [x,y,z,qw,qx,qy,qz] or TF
  └── each frame → add to cuRobo BLOX world model
       │
       ▼
World Model (BLOX voxels)
       │
       ▼
generate_trajectory → collision-free planning

The world model update happens inside the depth image callback — there is no manual trigger required. Each new depth frame automatically updates the voxel occupancy.

Prerequisites¶

Completed Tutorial 1: Your First Trajectory
A depth camera publishing to a ROS2 topic (sensor_msgs/Image, encoding 16UC1 or 32FC1)
Camera info topic available (sensor_msgs/CameraInfo) if you are not hardcoding intrinsics
TF transform available for base_link → camera_frame if you are not hardcoding extrinsics

Section 1: Camera Configuration File¶

Cameras are configured in a YAML file and passed to the node at launch via cameras_config_file.

1.1 File Structure¶

cameras:
  - name: <unique_camera_name>
    type: depth_camera          # recommended; 'point_cloud' is experimental
    topic: <depth_image_topic>
    frame_id: <camera_tf_frame>
    camera_info: <camera_info_topic>   # used if intrinsics are empty
    intrinsics: []              # empty = read from camera_info topic
    extrinsics: []              # empty = read from TF (base_link → frame_id)

1.2 Example: Single RealSense / Azure Kinect Camera¶

cameras:
  - name: depth_camera_front
    type: depth_camera
    topic: /depth_to_rgb/image_raw
    frame_id: rgb_camera_link
    camera_info: /depth_to_rgb/camera_info
    intrinsics: []
    extrinsics: [0.161, -0.428, 1.585, -0.3409296, 0.6840516, -0.6215661, 0.1717437]

1.3 Intrinsics Formats¶

Option A: read from camera_info topic (recommended)

intrinsics: []
camera_info: /depth_to_rgb/camera_info

The node waits up to 5 seconds at startup for a CameraInfo message on that topic.

Option B: 9-element K matrix

The K matrix is flattened row-major: [fx, 0, cx, 0, fy, cy, 0, 0, 1]

intrinsics: [615.3, 0.0, 320.5, 0.0, 615.8, 240.5, 0.0, 0.0, 1.0]

1.4 Extrinsics Formats¶

Extrinsics define the camera pose in the robot’s base frame.

Option A: read from TF at runtime (recommended for dynamic setups)

extrinsics: []
frame_id: rgb_camera_link   # TF lookup: base_link → rgb_camera_link

Option B: 7-element list [x, y, z, qw, qx, qy, qz]

extrinsics: [0.161, -0.428, 1.585, -0.3409296, 0.6840516, -0.6215661, 0.1717437]

Note: When static extrinsics are provided in the config, TF is not queried. Static extrinsics are preferred for fixed-mount cameras.

Section 2: Launch with Camera¶

Pass the YAML file to the launch command:

ros2 launch curobo_ros gen_traj.launch.py \
  cameras_config_file:=/path/to/cameras.yaml

A default example file is provided at config/cameras.yaml inside the curobo_ros package.

ros2 launch curobo_ros gen_traj.launch.py \
  cameras_config_file:=$(ros2 pkg prefix curobo_ros)/share/curobo_ros/config/cameras.yaml

At startup, the node logs each camera that is successfully initialized:

[INFO] Loading camera configuration from: /path/to/cameras.yaml
[INFO] Added camera strategy 'depth_camera_front' of type 'depth_camera'
[INFO] Camera intrinsics from topic: fx=615.30, fy=615.80, cx=320.50, cy=240.50
[INFO] Using static extrinsics from config file
[INFO] DepthMap camera initialized with depth topic: /depth_to_rgb/image_raw

Section 3: How the World Model Updates¶

Once the camera is initialized, it subscribes to the depth topic. For each incoming frame:

The depth image is converted to a float tensor (mm → m if 16UC1)
The camera pose is applied (static from config, or queried from TF)
The frame is added to the cuRobo BLOX world model:
- world_model.add_camera_frame(observation, "world")
- world_model.process_camera_frames("world")
- world_model.update_blox_hashes()

This happens automatically on every depth frame. There is no manual trigger service needed.

When generate_trajectory is called, the world model already reflects the latest camera observations.

Section 4: Verifying the Pipeline¶

Check the camera is initialized¶

ros2 node list | grep unified_planner
ros2 service call /unified_planner/is_available std_srvs/srv/Trigger

Check depth topic is publishing¶

ros2 topic hz /depth_to_rgb/image_raw
ros2 topic info /depth_to_rgb/image_raw

Expected: sensor_msgs/msg/Image at your camera’s configured rate.

Inspect the voxel grid¶

ros2 service call /unified_planner/get_voxel_grid curobo_msgs/srv/GetVoxelGrid

The number of occupied voxels (data array) should increase when obstacles are placed in front of the camera.

Check collision distance¶

ros2 service call /unified_planner/get_collision_distance curobo_msgs/srv/GetCollisionDistance

Positive values = distance to nearest obstacle. Negative values = collision (penetration).

Visualize voxels in RViz¶

You can vizualise collision on rviz with curobo_rviz panel.

Then in RViz: Add → MarkerArray → /visualization_marker_voxel

Section 5: Multiple Cameras¶

Multiple cameras can be declared in the same YAML file:

cameras:
  - name: camera_front
    type: depth_camera
    topic: /front_camera/depth/image_raw
    frame_id: front_camera_link
    camera_info: /front_camera/depth/camera_info
    intrinsics: []
    extrinsics: []

  - name: camera_top
    type: depth_camera
    topic: /top_camera/depth/image_raw
    frame_id: top_camera_link
    camera_info: /top_camera/depth/camera_info
    intrinsics: []
    extrinsics: [0.0, 0.0, 1.5, 1.0, 0.0, 0.0, 0.0]

Each camera runs independently. If one camera fails to initialize, the others continue. The world model accumulates observations from all ready cameras.

Section 6: Robot Segmentation¶

Without filtering, the camera sees the robot’s own body as obstacles in the depth image. The planner detects a collision at the current configuration and refuses to plan. The robot_segmentation node removes the robot from the depth image before it reaches the collision checker.

6.1 How It Works¶

At 100 Hz, the node:

Reads the current joint positions from the robot
Computes the robot’s collision spheres for that configuration (GPU via CudaRobotModel)
Converts the raw depth image into a 3D point cloud (using camera intrinsics)
Removes every point that falls within distance_threshold meters of any collision sphere
Converts the filtered point cloud back to a depth image
Publishes the cleaned image on /masked_depth_image

/depth_to_rgb/image_raw  ──►  robot_segmentation  ──►  /masked_depth_image
                                      ▲
                               joint states → collision spheres

The cameras.yaml must then point to /masked_depth_image instead of the raw camera topic so that the planner receives pre-filtered frames.

6.2 Launch¶

The node is a separate executable that runs alongside the planner:

# Terminal 1: planner + camera
ros2 launch curobo_ros gen_traj.launch.py \
  cameras_config_file:=/path/to/cameras.yaml

# Terminal 2: robot segmentation
ros2 run curobo_ros robot_segmentation

6.3 Parameters¶

Parameter	Default	Description
`robot_config_file`	`m1013.yml` (package default)	Path to robot YAML — must match the planner’s config
`depth_image_topic`	`/depth_to_rgb/image_raw`	Raw depth image input
`camera_info_topic`	`/depth_to_rgb/camera_info`	Intrinsics topic for 3D projection
`joint_states_topic`	`/dsr01/joint_states`	Robot joint positions
`robot_base_frame`	`base_link`	Frame for collision sphere projection

Override any parameter with --ros-args:

ros2 run curobo_ros robot_segmentation --ros-args \
  -p robot_config_file:=/path/to/my_robot.yml \
  -p depth_image_topic:=/camera/depth/image_rect_raw \
  -p camera_info_topic:=/camera/depth/camera_info \
  -p joint_states_topic:=/my_robot/joint_states \
  -p robot_base_frame:=base_link

distance_threshold (default: 0.05 m) controls how much margin is added around each collision sphere. It is a constructor argument and cannot be changed at runtime. Increase it if the robot is occasionally self-detected due to calibration error.

6.4 Topics¶

Subscriptions:

Topic	Type	Description
`<depth_image_topic>`	`sensor_msgs/Image`	Raw depth image (`16UC1` mm or `32FC1` m)
`<camera_info_topic>`	`sensor_msgs/CameraInfo`	Intrinsics for depth→3D projection

Publications:

Topic	Type	Description
`/masked_depth_image`	`sensor_msgs/Image`	Filtered depth image — robot pixels zeroed (`16UC1`)
`/collision_spheres`	`visualization_msgs/MarkerArray`	Robot collision spheres at current config (RViz debug)
`/robot_pointcloud_debug`	`sensor_msgs/PointCloud2`	Points that were removed (RViz debug, only published if subscribed)

6.5 Integration with cameras.yaml¶

Point the depth_camera entry at the segmentation output:

cameras:
  - name: depth_camera_front
    type: depth_camera
    topic: /masked_depth_image          # ← output of robot_segmentation
    frame_id: rgb_camera_link
    camera_info: /depth_to_rgb/camera_info
    intrinsics: []
    extrinsics: [0.161, -0.428, 1.585, -0.3409296, 0.6840516, -0.6215661, 0.1717437]

The camera_info field still points to the original camera info topic — the segmentation node does not alter intrinsics, only pixel values.

6.6 Verify the Segmentation¶

# Check the filtered image is publishing
ros2 topic hz /masked_depth_image

# Visualize in RViz:
#   Add → Image → /masked_depth_image   (should show black pixels where the robot is)
#   Add → MarkerArray → /collision_spheres  (red spheres around robot links)
#   Add → PointCloud2 → /robot_pointcloud_debug  (points removed by segmentation)

If robot_pointcloud_debug shows points clustered around the robot’s links and the masked image has black patches in those areas, segmentation is working correctly.

Section 7: Tuning Parameters¶

Voxel resolution and collision safety margin can be adjusted while the node is running:

# Finer resolution (slower, more accurate)
ros2 param set /unified_planner voxel_size 0.02
ros2 service call /unified_planner/update_motion_gen_config std_srvs/srv/Trigger

# Coarser resolution (faster, less memory)
ros2 param set /unified_planner voxel_size 0.08
ros2 service call /unified_planner/update_motion_gen_config std_srvs/srv/Trigger

# Increase collision safety margin
ros2 param set /unified_planner collision_activation_distance 0.04
ros2 service call /unified_planner/update_motion_gen_config std_srvs/srv/Trigger

`voxel_size`	Resolution	Speed	Memory
0.02 m	Fine (2 cm)	Slow	High
0.05 m	Default (5 cm)	Moderate	Moderate
0.08 m	Coarse (8 cm)	Fast	Low

Troubleshooting¶

Camera not initialized at startup¶

Symptoms: No log line Added camera strategy '...'

Verify cameras_config_file path is correct and the file is readable
Check the YAML is valid (no indentation errors)
Ensure the cameras: key is present with at least one entry

Intrinsics error at startup¶

Symptoms: Failed to receive camera info from /your/topic

The node waits 5 seconds for a CameraInfo message. If the camera is not publishing yet:

Start the camera driver before the planner, or
Hardcode intrinsics in the YAML (Option B or C in Section 1.3)

Obstacles not detected in trajectory planning¶

Confirm depth frames are arriving: ros2 topic hz <depth_topic>
Inspect the voxel grid — it should show occupied voxels near the obstacle
Check frame_id is correct — TF lookup failure falls back to identity pose (obstacles at wrong location)
Reduce voxel_size for finer detection
Increase collision_activation_distance so the planner avoids obstacles with more margin

TF lookup failure¶

Symptoms: Could not transform base_link to <frame_id>

Either:

Provide static extrinsics in the YAML config to bypass TF, or
Ensure the camera TF frame is published (check ros2 run tf2_ros tf2_echo base_link <frame_id>)

Planner rejects all trajectories (collision at start)¶

Symptoms: generate_trajectory always returns success: false, even for simple goals

The most likely cause is that the robot’s own body is visible in the depth image and is detected as an obstacle. The planner sees a collision at the current state and cannot plan.

Solution: Run robot_segmentation and point cameras.yaml to /masked_depth_image (see Section 6).

Verify: Check the voxel grid with and without robot_segmentation running:

ros2 service call /unified_planner/get_voxel_grid curobo_msgs/srv/GetVoxelGrid

With segmentation, the number of occupied voxels near the robot’s body should drop significantly.

robot_segmentation not publishing¶

Symptoms: /masked_depth_image has no publishers or very low rate

Check that both the depth image and camera info topics are publishing:

ros2 topic hz /depth_to_rgb/image_raw
ros2 topic hz /depth_to_rgb/camera_info

Check that joint states are arriving (without them the timer callback does nothing):
```
ros2 topic hz /dsr01/joint_states
```

If the joint states topic is different, override it:

ros2 run curobo_ros robot_segmentation --ros-args \
  -p joint_states_topic:=/my_robot/joint_states