A Multiview Depth-Based Motion Capture Benchmark Dataset for Human Motion Denoising and Enhancement Research
Human motion enhancement is a rapidly expanding area of research in which depth-based motion capture (D-Mocap) data are refined to produce a more accurate counterpart for demanding, high-precision real-world applications. The initial D-Mocap skeletal sequences are typically generated by commercially available SDKs or open-source tools, which perform best in an ideal front-facing camera setup. This creates a challenging initialization for human motion enhancement when the camera is not in that ideal front-facing position. Currently, no multiview D-Mocap dataset exists with corresponding time-synchronized and skeleton-matched optical motion capture (Mocap) reference data for view-invariant motion enhancement. We develop a multiview D-Mocap dataset extended from the popular and comprehensive Berkeley MHAD dataset. In addition, we analyze the performance of D-Mocap data generated by a series of open-source tools, highlighting the difficulty of producing robust results in a rear-facing camera setup, where the average joint position error is 21.4% higher than with front-facing data. Finally, we evaluate several recent human motion enhancement algorithms under front-facing versus rear-facing camera setups.