Generating a Bird's-Eye View Representation from Point Cloud Data

Point Cloud Data

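Point cloud data is a collection of points captured by a LiDAR sensor. It should be represented as a numpy array with N rows and 4 columns. Each row corresponds to one point, whose position in space is given by 3 values (x, y, z). The fourth value is an additional attribute, usually the reflectance (intensity).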
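As a small illustration of that layout, the sketch below builds a hypothetical three-point cloud by hand (the coordinate and reflectance values are made up) and splits it into coordinates and reflectance. It also round-trips the array through a flat float32 byte stream, which is the layout the loading code later in this post assumes.

```python
import numpy as np

# Hypothetical 3-point cloud; columns are x, y, z, reflectance.
points = np.array([
    [5.0, 1.0, -1.5, 0.30],
    [12.0, -3.0, 0.2, 0.85],
    [0.5, 0.0, -1.7, 0.10],
], dtype=np.float32)

xyz = points[:, :3]          # (N, 3) spatial coordinates
reflectance = points[:, 3]   # (N,) reflectance / intensity values

# A flat float32 stream can be reshaped back into N rows of 4 values.
raw = points.tobytes()
restored = np.frombuffer(raw, dtype=np.float32).reshape(-1, 4)

print(points.shape)  # (3, 4)
```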

Bird's-Eye View Representation

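A bird's-eye view is a graphical representation of the point cloud as seen from above. In autonomous driving, this representation rests on one premise: all vehicles travel on the ground, so the data can be flattened directly onto the x-y plane. Converting point cloud data to a bird's-eye view takes the following steps: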

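  • Define the region of interest (ROI), of size L * W * H.
  • Choose a resolution, rasterize the ROI into a grid, map each point's position to a pixel position in the bird's-eye view, and compute the occupancy map.
  • Split the bird's-eye view into separate channels by height, so that the height information in the point cloud is preserved.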


Bird's-Eye View Generation Code

import numpy as np


# ==============================================================================
#                                                                   SCALE_TO_255
# ==============================================================================
def scale_to_255(a, min, max, dtype=np.uint8):
    """ Scales an array of values from specified min, max range to 0-255
        Optionally specify the data type of the output (default is uint8)
    """
    return (((a - min) / float(max - min)) * 255).astype(dtype)


# ==============================================================================
#                                                         POINT_CLOUD_2_BIRDSEYE
# ==============================================================================
def point_cloud_2_birdseye(points,
                           res=0.1,
                           side_range=(-10., 10.),  # left-most to right-most
                           fwd_range = (-10., 10.), # back-most to forward-most
                           height_range=(-2., 2.),  # bottom-most to upper-most
                           ):
    """ Creates a 2D bird's-eye view representation of the point cloud data.

    Args:
        points:     (numpy array)
                    N rows of points data
                    Each point should be specified by at least 3 elements x,y,z
        res:        (float)
                    Desired resolution in metres to use. Each output pixel will
                    represent a square region res x res in size.
        side_range: (tuple of two floats)
                    (-left, right) in metres
                    left and right limits of rectangle to look at.
        fwd_range:  (tuple of two floats)
                    (-behind, front) in metres
                    back and front limits of rectangle to look at.
        height_range: (tuple of two floats)
                    (min, max) heights (in metres) relative to the origin.
                    All height values will be clipped to this min and max value,
                    such that anything below min will be truncated to min, and
                    the same for values above max.
    Returns:
        2D numpy array representing an image of the birds eye view.
    """
    # EXTRACT THE POINTS FOR EACH AXIS
    x_points = points[:, 0]
    y_points = points[:, 1]
    z_points = points[:, 2]

    # FILTER - To return only indices of points within the desired rectangle
    # Two filters: front-to-back and side-to-side ranges
    # (height is not filtered here; it is clipped to height_range further down)
    # Note left side is positive y axis in LIDAR coordinates
    f_filt = np.logical_and((x_points > fwd_range[0]), (x_points < fwd_range[1]))
    s_filt = np.logical_and((y_points > -side_range[1]), (y_points < -side_range[0]))
    filter = np.logical_and(f_filt, s_filt)
    indices = np.argwhere(filter).flatten()

    # KEEPERS
    x_points = x_points[indices]
    y_points = y_points[indices]
    z_points = z_points[indices]

    # CONVERT TO PIXEL POSITION VALUES - Based on resolution
    x_img = (-y_points / res).astype(np.int32)  # x axis is -y in LIDAR
    y_img = (-x_points / res).astype(np.int32)  # y axis is -x in LIDAR

    # SHIFT PIXELS TO HAVE MINIMUM BE (0,0)
    # floor & ceil used to prevent anything being rounded to below 0 after shift
    x_img -= int(np.floor(side_range[0] / res))
    y_img += int(np.ceil(fwd_range[1] / res))

    # CLIP HEIGHT VALUES - to between min and max heights
    pixel_values = np.clip(a=z_points,
                           a_min=height_range[0],
                           a_max=height_range[1])

    # RESCALE THE HEIGHT VALUES - to be between the range 0-255
    pixel_values = scale_to_255(pixel_values,
                                min=height_range[0],
                                max=height_range[1])

    # INITIALIZE EMPTY ARRAY - of the dimensions we want
    x_max = 1 + int((side_range[1] - side_range[0]) / res)
    y_max = 1 + int((fwd_range[1] - fwd_range[0]) / res)
    im = np.zeros([y_max, x_max], dtype=np.uint8)

    # FILL PIXEL VALUES IN IMAGE ARRAY
    im[y_img, x_img] = pixel_values

    return im

# Load the raw scan: a flat float32 stream reshaped to N rows of (x, y, z, reflectance)
pointcloud = np.fromfile("000000.bin", dtype=np.float32).reshape(-1, 4)
bev = point_cloud_2_birdseye(pointcloud)

The generated bird's-eye view:
[Figure: 313.png]

KITTI Dataset Labels

KITTI Detection Dataset Labels

Example of a dataset label:
[Figure: screenshot of a label file]

Meaning of the label fields:

| Values | Name       | Description |
|--------|------------|-------------|
| 1      | type       | Describes the type of object: 'Car', 'Van', 'Truck', 'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram', 'Misc' or 'DontCare' |
| 1      | truncated  | Float from 0 (non-truncated) to 1 (truncated), where truncated refers to the object leaving image boundaries |
| 1      | occluded   | Integer (0, 1, 2, 3) indicating occlusion state: 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown |
| 1      | alpha      | Observation angle of object, ranging [-pi..pi] |
| 4      | bbox       | 2D bounding box of object in the image (0-based index): contains left, top, right, bottom pixel coordinates |
| 3      | dimensions | 3D object dimensions: height, width, length (in meters) |
| 3      | location   | 3D object location x, y, z in camera coordinates (in meters) |
| 1      | rotation_y | Rotation ry around Y-axis in camera coordinates [-pi..pi] |
| 1      | score      | Only for results: Float, indicating confidence in detection, needed for p/r curves, higher is better. |

Value 1 (string): the object type

'Car', 'Van', 'Truck', 'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram', 'Misc' or 'DontCare'

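Note that the 'DontCare' label marks regions that were not annotated, for example because the object is too far from the LiDAR. During evaluation (mainly when computing precision), detections in such regions could otherwise be counted as false positives even though they correspond to real objects that simply were not labeled. To prevent this, the evaluation script automatically ignores predictions that fall in 'DontCare' regions.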

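Value 2: whether the object is truncated

A float from 0 (not truncated) to 1 (truncated), where truncated refers to the object leaving the image boundaries.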

Value 3: whether the object is occluded

An integer (0, 1, 2 or 3) indicating the degree of occlusion:

0: fully visible 1: partly occluded 2: largely occluded 3: unknown

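Value 4: alpha, the observation angle of the object, in the range -pi to pi

Defined in the camera coordinate system: taking the camera origin as the center and the line from the camera origin to the object center as the radius, rotate the object around the camera y-axis until it lies on the camera z-axis. Alpha is the angle between the object's heading and the camera x-axis in that position.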

Values 5 to 8: the 2D bounding box of the object

xmin, ymin, xmax, ymax

Values 9 to 11: the 3D dimensions of the object

height, width, length (in meters)

Values 12 to 14: the 3D location of the object

x, y, z (in camera coordinates, in meters)

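Value 15: the 3D orientation of the object, rotation_y

The global orientation angle of the object in camera coordinates (the angle between the object's heading and the camera x-axis), in the range -pi to pi.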
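The two angles are related through the object's position. A commonly used relation, which is not stated in the label description itself and should be treated as an assumption here, is alpha = rotation_y - atan2(x, z), where x and z are the location values in camera coordinates. The hypothetical helper below applies it and wraps the result back into [-pi, pi); the input values are illustrative.

```python
import math

def alpha_from_rotation_y(rotation_y, x, z):
    """Observation angle from global yaw and object position (camera coordinates).

    Hypothetical helper; assumes alpha = rotation_y - atan2(x, z).
    """
    alpha = rotation_y - math.atan2(x, z)
    # Wrap the result into [-pi, pi).
    return (alpha + math.pi) % (2 * math.pi) - math.pi

# Illustrative values: an object almost straight ahead of the camera.
print(round(alpha_from_rotation_y(-1.59, -0.69, 25.01), 2))  # -1.56
```

For an object directly in front of the camera (x = 0) the two angles coincide. They diverge as the object moves sideways in the image.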

Value 16: the detection confidence (score), present only in result files
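To tie the field layout together, here is a minimal parsing sketch for one whitespace-separated label line. The `KittiObject` class and `parse_kitti_label_line` function are hypothetical names introduced for this example, and the sample line is illustrative only. The field order follows the table above, with the score treated as optional.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class KittiObject:
    type: str
    truncated: float
    occluded: int
    alpha: float
    bbox: Tuple[float, ...]        # left, top, right, bottom (pixels)
    dimensions: Tuple[float, ...]  # height, width, length (meters)
    location: Tuple[float, ...]    # x, y, z in camera coordinates (meters)
    rotation_y: float
    score: Optional[float] = None  # only present in result files

def parse_kitti_label_line(line):
    """Parse one label line with 15 fields (ground truth) or 16 (results)."""
    fields = line.split()
    if len(fields) not in (15, 16):
        raise ValueError(f"expected 15 or 16 fields, got {len(fields)}")
    v = [float(t) for t in fields[1:]]
    return KittiObject(
        type=fields[0],
        truncated=v[0],
        occluded=int(v[1]),
        alpha=v[2],
        bbox=tuple(v[3:7]),
        dimensions=tuple(v[7:10]),
        location=tuple(v[10:13]),
        rotation_y=v[13],
        score=v[14] if len(v) > 14 else None,
    )

# Illustrative label line (values are an example, not taken from this post).
sample = "Car 0.00 0 -1.56 564.62 174.59 616.43 224.74 1.61 1.66 3.20 -0.69 1.69 25.01 -1.59"
obj = parse_kitti_label_line(sample)
print(obj.type, obj.dimensions, obj.score)
```

When preparing training targets, objects whose type is 'DontCare' would typically be filtered out after parsing, for example with `[o for o in objects if o.type != "DontCare"]`.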