import numpy as np


def unproject_depth(depth, ixt, depth_min=None, depth_max=None, color=None,
                    ext=None, conf=None, ret_pcd=False, clip_box=None):
    """Unproject a depth map into a 3D point cloud using camera intrinsic parameters and optional transformations.
This function converts a 2D depth map into a set of 3D points by performing the following steps:
1. **Coordinate Grid Generation:**
Generates a grid of pixel coordinates corresponding to the depth image dimensions.
2. **Flattening and Masking:**
Flattens the x and y coordinate grids along with the depth values.
Applies filtering based on:
- A minimum depth threshold (`depth_min`),
- An optional maximum depth threshold (`depth_max`),
- And, if provided, a confidence map (`conf`) that retains only pixels with a confidence value equal to 2.
3. **3D Point Calculation:**
Constructs homogeneous 2D pixel coordinates by stacking x, y, and a constant value of 1.
Multiplies these coordinates by the corresponding depth values to obtain scaled 3D coordinates in pixel space.
Transforms these coordinates into camera space by applying the inverse of the camera intrinsic matrix (`ixt`).
4. **Optional Extrinsic Transformation:**
If an extrinsic matrix (`ext`) is provided, the function converts the 3D points from camera coordinates to another coordinate system (e.g., world coordinates).
This is done by converting the points to homogeneous coordinates, then applying the inverse of the extrinsic matrix.
5. **Clipping (Optional):**
If a clipping box (`clip_box`) is specified, further filters the points based on the provided 3D bounds.
The `clip_box` should be a list or tuple of 6 values:
`[x_max, y_max, z_max, x_min, y_min, z_min]`
Each value can be `None` if no clipping is required for that boundary.
6. **Color Association:**
If a color image (`color`) is provided, the function maps corresponding color values to the valid 3D points.
If the color image has a data type of unsigned 8-bit integers, it is converted to float values in the range [0, 1].
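Steps 1-3 above amount to a standard pinhole back-projection. A minimal standalone sketch of just those steps, using a made-up 2x2 depth map and intrinsic matrix (not values from this codebase):

```python
import numpy as np

# Made-up 2x2 depth map (0.0 marks an invalid pixel) and pinhole intrinsics.
depth = np.array([[1.0, 2.0],
                  [1.5, 0.0]])
ixt = np.array([[100.0, 0.0, 0.5],
                [0.0, 100.0, 0.5],
                [0.0, 0.0, 1.0]])

# Step 1: pixel-coordinate grid matching the depth image.
h, w = depth.shape
xx, yy = np.meshgrid(np.arange(w), np.arange(h))

# Step 2: flatten and mask out invalid (non-positive) depths.
xx, yy, d = xx.reshape(-1), yy.reshape(-1), depth.reshape(-1)
mask = d > 0.0
xx, yy, d = xx[mask], yy[mask], d[mask]

# Step 3: homogeneous pixel coordinates scaled by depth, back-projected
# through the inverse intrinsics: X_cam = K^-1 @ (u*d, v*d, d).
pts = np.stack([xx, yy, np.ones_like(xx)], axis=1) * d[:, None]
pts = pts @ np.linalg.inv(ixt).T
print(pts.shape)  # (3, 3): three valid pixels, each an (x, y, z) point
```

Note that the z coordinate of each output point equals the input depth, since the last row of the intrinsic matrix is (0, 0, 1).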
- If `ret_pcd` is set to **False**, the function returns:
- A tuple `(points, colors)` if `color` is provided, where:
- `points` is a NumPy array of shape (N, 3) containing 3D coordinates.
- `colors` is a NumPy array of shape (N, 3) with the corresponding color values.
- Otherwise, only the NumPy array `points` is returned.
- If `ret_pcd` is **True**, the function returns an Open3D `PointCloud` object, populated with points and, if available, colors.
Args:
depth (np.ndarray):
A 2D array (height x width) representing the depth map. Each element is a depth value in the same unit as the camera parameters.
ixt (np.ndarray):
A 3x3 camera intrinsic matrix used to convert pixel coordinates into normalized camera coordinates.
depth_min (float, optional):
The minimum valid depth value. Any depth value below this threshold is discarded.
depth_max (float, optional):
The maximum valid depth value. Any depth value above this threshold is discarded.
If set to `None`, no upper threshold is applied.
color (np.ndarray, optional):
An optional color image associated with the depth map. Expected shape is (height, width, 3).
If provided, the function returns the corresponding colors for each valid 3D point.
ext (np.ndarray, optional):
An optional 4x4 extrinsic matrix used to transform points from camera coordinates to another coordinate system (e.g., world coordinates).
conf (np.ndarray, optional):
An optional 2D array with the same shape as `depth` that represents a confidence map for the depth values.
Only pixels with a confidence value equal to 2 are retained.
ret_pcd (bool, optional):
If `True`, the function returns an Open3D `PointCloud` object containing the points (and colors, if available).
If `False`, the function returns NumPy arrays. Defaults to `False`.
clip_box (list or tuple, optional):
A list or tuple containing 6 values `[x_max, y_max, z_max, x_min, y_min, z_min]` that define a 3D clipping box.
Points falling outside these bounds are discarded. Each bound can be set to `None` to indicate no clipping for that side.
Returns:
If `color` is provided:
- When `ret_pcd` is **False**:
Tuple[np.ndarray, np.ndarray]:
- `points`: A NumPy array of shape (N, 3) containing the unprojected 3D coordinates.
- `colors`: A NumPy array of shape (N, 3) containing the corresponding color values.
- When `ret_pcd` is **True**:
open3d.geometry.PointCloud:
An Open3D PointCloud object with points and associated colors.
If `color` is not provided:
- When `ret_pcd` is **False**:
A NumPy array of shape (N, 3) containing the unprojected 3D coordinates.
- When `ret_pcd` is **True**:
open3d.geometry.PointCloud:
An Open3D PointCloud object containing the 3D points.
Raises:
AssertionError:
If a `clip_box` is provided but does not contain exactly 6 elements.
Examples:
>>> import numpy as np
>>> # Create a synthetic depth map of size 480x640 with values between 0.5 and 2.0 meters.
>>> depth = np.random.uniform(0.5, 2.0, (480, 640))
>>> # Define a sample camera intrinsic matrix.
>>> ixt = np.array([[600, 0, 320],
...                 [0, 600, 240],
...                 [0, 0, 1]])
>>> # Unproject the depth map into 3D points.
>>> points = unproject_depth(depth, ixt, depth_min=0.5, depth_max=2.0)
>>> points.shape
(N, 3)  # N is the number of points remaining after depth filtering.
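The `clip_box` convention (first three entries are upper bounds on x, y, z; last three are lower bounds; `None` disables a side) can be sketched in isolation. The points and box below are made up for illustration:

```python
import numpy as np

# Made-up points and a clip box [x_max, y_max, z_max, x_min, y_min, z_min].
pts = np.array([[0.0, 0.0, 1.0],
                [3.0, 0.0, 1.0],
                [0.0, 0.0, 5.0]])
clip_box = [2.0, None, 4.0, -1.0, None, 0.0]

keep = np.ones(pts.shape[0], dtype=bool)
for i, val in enumerate(clip_box):
    if val is None:
        continue  # None disables clipping on this side
    if i < 3:
        keep &= pts[:, i] <= val      # upper bound on axis i
    else:
        keep &= pts[:, i - 3] >= val  # lower bound on axis i - 3
print(keep)  # [ True False False]
```

Only the first point survives: the second violates `x_max = 2.0` and the third violates `z_max = 4.0`.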
Notes:
- The function expects the depth values and intrinsic matrix to be in consistent units.
- When an extrinsic matrix is provided, the transformation is applied after computing the 3D camera coordinates.
- The clipping box should be specified in the coordinate space of the output points (camera space, or world space when `ext` is provided).
"""
height, width = depth.shape
x, y = np.arange(width), np.arange(height)
xx, yy = np.meshgrid(x, y)
xx, yy, depth = xx.reshape(-1), yy.reshape(-1), depth.reshape(-1)
mask = np.ones_like(xx, dtype=np.bool_)
# Apply depth thresholds.
if depth_min is not None:
    mask &= depth >= depth_min
if depth_max is not None:
    mask &= depth <= depth_max
# Apply confidence filtering if provided.
if conf is not None:
    mask &= conf.reshape(-1) == 2
xx, yy, depth = xx[mask], yy[mask], depth[mask]
# Construct homogeneous pixel coordinates and scale by depth.
pcd = np.stack([xx, yy, np.ones_like(xx)], axis=1) * depth[:, None]
# Transform pixel coordinates into 3D camera coordinates.
pcd = np.dot(pcd, np.linalg.inv(ixt).T)
# Apply extrinsic transformation if provided.
if ext is not None:
    pcd = np.concatenate([pcd, np.ones((pcd.shape[0], 1))], axis=1)
    pcd = np.dot(pcd, np.linalg.inv(ext).T)
# Apply clipping if a clip_box is provided.
new_mask = np.ones_like(pcd[:, 0]).astype(np.bool_)
if clip_box is not None:
    assert len(clip_box) == 6, "clip_box must be a list or tuple of 6 elements: [x_max, y_max, z_max, x_min, y_min, z_min]"
    for i, val in enumerate(clip_box):
        if val is None:
            continue
        if i < 3:
            # The first three values are upper bounds on x, y, z.
            new_mask &= pcd[:, i] <= val
        else:
            # The last three values are lower bounds on x, y, z.
            new_mask &= pcd[:, i - 3] >= val
# Process color information if provided.
if color is not None:
    if color.dtype == np.uint8:
        color = color.astype(np.float32) / 255.
    if ret_pcd:
        import open3d as o3d  # Local import to avoid a hard dependency when not returning a point cloud.
        pcd_obj = o3d.geometry.PointCloud()
        pcd_obj.points = o3d.utility.Vector3dVector(pcd[:, :3][new_mask])
        pcd_obj.colors = o3d.utility.Vector3dVector(color.reshape(-1, 3)[mask][new_mask])
        return pcd_obj
    return pcd[:, :3][new_mask], color.reshape(-1, 3)[mask][new_mask]
if ret_pcd:
    import open3d as o3d  # Local import to avoid a hard dependency when not returning a point cloud.
    pcd_obj = o3d.geometry.PointCloud()
    pcd_obj.points = o3d.utility.Vector3dVector(pcd[:, :3][new_mask])
    return pcd_obj
return pcd[:, :3][new_mask]
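For reference, the optional extrinsic step can be checked in isolation. Since the function applies `inv(ext)`, `ext` is assumed to map world coordinates to camera coordinates; the pure-translation extrinsic below is made up for illustration:

```python
import numpy as np

# Made-up world-to-camera extrinsic: identity rotation, translation (0, 0, 2).
ext = np.eye(4)
ext[:3, 3] = [0.0, 0.0, 2.0]

# A single camera-space point, 2 units in front of the camera.
pcd = np.array([[0.0, 0.0, 2.0]])

# Homogenize and apply inv(ext), as the function does, to recover the
# world-space coordinates of the point.
pcd_h = np.concatenate([pcd, np.ones((pcd.shape[0], 1))], axis=1)
world = pcd_h @ np.linalg.inv(ext).T
print(world[:, :3])  # [[0. 0. 0.]]
```

Under this convention a point directly in front of a camera placed 2 units behind the world origin maps back to the origin, which matches the world-to-camera reading of `ext`.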