For human-centric large-scale scenes, fine-grained modeling of 3D human global pose and shape is important for scene understanding and can benefit many real-world applications. In this paper, we present LiveHPS, a novel single-LiDAR-based approach for scene-level Human Pose and Shape estimation that is not limited by lighting conditions and requires no wearable devices. In particular, we design a distillation mechanism to mitigate the distribution-varying effect of LiDAR point clouds, and we exploit the temporal-spatial geometric and dynamic information in consecutive frames to handle occlusion and noise. With its efficient configuration and high-quality output, LiveHPS is well suited to real-world applications. Moreover, we present a large-scale human motion dataset, named FreeMotion, which is collected in various scenarios with diverse human poses, shapes and translations. It consists of multi-modal and multi-view data acquired from calibrated and synchronized LiDARs, cameras, and IMUs. Extensive experiments on our new dataset and other public datasets demonstrate the state-of-the-art performance and robustness of our approach.
The pipeline of LiveHPS. Taking sequential LiDAR point clouds as input, LiveHPS uses three modules to obtain human SMPL parameters: a point-based body tracker that distills pose-prior information, a consecutive pose optimizer that refines the pose using joint-wise features, and a multi-head SMPL solver that regresses the parameters of the human model.
The capture systems of FreeMotion. In (a), we use a dense-camera capture system with LiDARs for accurate pose and shape capture. In (b), we place LiDARs and cameras at three viewpoints to capture human motions.
FreeMotion_Indoor
|——LiDAR_info
|  |——FM_Indoor_train.pkl
|  |  |——pc_x(Point cloud data in view x)
|  |  |——T_x(Ground truth of translation in view x)
|  |  |——shape(Ground truth of shape)
|  |  |——gt(Ground truth of SMPL local pose)
|  |  |——motion_id
|  |——FM_Indoor_test.pkl
|  |——...
|——Camera_info
|  |——camera18_train.pkl
|  |  |——shape(Ground truth of shape)
|  |  |——body_pose(Ground truth of body pose)
|  |  |——transl_cam(Ground truth of transl)
|  |  |——K(Camera calibration matrix)
|  |  |——root_pose_cam(Ground truth of global rotation)
|  |  |——motion_id
|  |  |——images
|  |  |——bbox
|  |  |——kp2d
|  |——camera18_test.pkl
|  |——...
|——images.tar.gz(Image data)
|——livehps.t7(Pretrained Model)
Use np.load(file_path, allow_pickle=True) to load each .pkl file.
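As a minimal sketch of that loading step, the snippet below wraps the np.load call and lists the fields found in one split. It assumes the top-level object in each .pkl file is a dict keyed by the field names in the tree above, which the tree suggests but does not guarantee. The helper name summarize_split, the stand-in file, and its array sizes are hypothetical and only serve to make the example runnable without the dataset. Since allow_pickle=True unpickles arbitrary Python objects, it should only be used on files from a trusted source.

```python
import os
import pickle
import tempfile

import numpy as np


def summarize_split(file_path):
    """Load one split and map each field to its array shape (or type name).

    Assumption: the file holds a dict keyed by field name.
    """
    data = np.load(file_path, allow_pickle=True)
    summary = {}
    for key, value in data.items():
        # Arrays report their shape; other objects report their type name.
        summary[key] = getattr(value, "shape", type(value).__name__)
    return data, summary


# Stand-in file with made-up sizes, used only so the sketch runs on its own.
# Replace demo_path with the path to a real split such as FM_Indoor_train.pkl.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo_split.pkl")
with open(demo_path, "wb") as f:
    pickle.dump({"shape": np.zeros((4, 10)), "motion_id": [0, 0, 1, 1]}, f)

data, summary = summarize_split(demo_path)
for key, info in summary.items():
    print(key, info)
```

Printing the summary first is a cheap way to confirm the actual key names and array dimensions before writing a data loader against them.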
@inproceedings{ren2024livehps,
title={LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment},
author={Ren, Yiming and Han, Xiao and Zhao, Chengfeng and Wang, Jingya and Xu, Lan and Yu, Jingyi and Ma, Yuexin},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={1281--1291},
year={2024}
}