Our multi-modal STCrowd dataset currently supports detection, tracking, and prediction tasks. For each task, we define evaluation metrics and provide benchmarks.
In 3D object detection on point clouds, the goal is to infer a label and a 3D bounding box for each object. The input to all evaluation metrics is therefore the set of 3D bounding box parameters predicted for the objects in a scene: for each object in a scan, a method outputs a label together with the 3D bounding box enclosing that object's points. We evaluate three settings for this task, each using a different 3D center distance (in meters) as the threshold for matching predictions to ground truth.
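To make the matching criterion concrete, the following is a minimal sketch of center-distance matching, not the benchmark's actual evaluation code. It assumes a greedy one-to-one assignment in order of descending confidence, which is a common choice but is not stated in the text. The function name `match_by_center_distance` and the threshold values in the usage example are hypothetical placeholders, not the three thresholds used in our settings.

```python
import math


def match_by_center_distance(pred_centers, pred_scores, gt_centers, threshold):
    """Greedily match predicted boxes to ground-truth boxes by 3D center distance.

    Assumption (not from the paper): predictions are processed in order of
    descending confidence, and each ground-truth box is matched at most once.
    Returns a list of (prediction_index, ground_truth_index) pairs.
    """
    # Visit the most confident predictions first.
    order = sorted(range(len(pred_centers)), key=lambda i: -pred_scores[i])
    used_gt = set()
    matches = []
    for i in order:
        best_j, best_d = None, threshold
        for j, gt in enumerate(gt_centers):
            if j in used_gt:
                continue
            # Euclidean distance between the two 3D box centers, in meters.
            d = math.dist(pred_centers[i], gt)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used_gt.add(best_j)
            matches.append((i, best_j))
    return matches


# Toy example with placeholder thresholds (in meters).
pred_centers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 0.0)]
pred_scores = [0.9, 0.8, 0.7]
gt_centers = [(0.1, 0.0, 0.0), (1.4, 0.0, 0.0)]

strict = match_by_center_distance(pred_centers, pred_scores, gt_centers, 0.25)
loose = match_by_center_distance(pred_centers, pred_scores, gt_centers, 0.5)
```

Under the stricter placeholder threshold only the first prediction is matched, while the looser one also accepts the second prediction. This shows why results are reported separately per threshold: matched predictions count as true positives, and unmatched ones as false positives, so the same detector can score differently in each setting.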