DPOSE: Online Keypoint-CAM Guided Inference for Driver Pose Estimation With GMM-Based Balanced Sampling
Human pose estimation (HPE) is an essential component of Driver Monitoring Systems (DMS) for real-time recognition of driving behavior. To this end, HPE is typically integrated with other tasks, such as detection and head pose regression, into a single lightweight model that can be easily deployed on edge devices. However, oversimplified lightweight HPE models tend to overfit to common samples and fail on rare ones, particularly when the training dataset has an imbalanced distribution. In this paper, we propose an optimization scheme for a DMS-specific HPE task. Our method uses a pose-wise hard mining strategy to balance the pose distribution. Additionally, we introduce an online keypoint-independent Grad-CAM loss, which constrains the gradient-based activation map of each keypoint prediction to its corresponding semantic region. We evaluate our approach on a benchmark dataset for DMS tasks and achieve outstanding results. Our code will be made publicly available.
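The abstract does not spell out how the pose distribution is balanced. As a minimal sketch of one plausible reading of "GMM-based balanced sampling", the snippet below assumes a Gaussian mixture has already been fitted to a one-dimensional pose descriptor and draws training samples with probability inversely proportional to their mixture density, so that rare poses are seen more often. The descriptor, the mixture parameters, and the function names (`mixture_density`, `balanced_sampling_weights`) are all hypothetical and are not taken from the paper.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_density(x, components):
    """Density of a 1-D Gaussian mixture; components = [(weight, mean, std), ...]."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)


def balanced_sampling_weights(pose_values, components, eps=1e-6):
    """Sampling probabilities inversely proportional to mixture density.

    Assumption: low density under the mixture marks a rare pose that
    should be drawn more often. eps guards against division by zero.
    """
    inverse = [1.0 / (mixture_density(x, components) + eps) for x in pose_values]
    total = sum(inverse)
    return [v / total for v in inverse]


# Hypothetical, already-fitted mixture over a pose descriptor (e.g. an angle):
# a dominant mode of common poses and a small mode of rare poses.
components = [(0.9, 0.0, 5.0), (0.1, 40.0, 5.0)]
poses = [0.0, 1.0, -2.0, 40.0]  # the last sample is a rare pose

weights = balanced_sampling_weights(poses, components)

# Draw a batch of sample indices using the balanced probabilities.
rng = random.Random(0)
batch = rng.choices(range(len(poses)), weights=weights, k=8)
```

In practice the mixture would be fitted to higher-dimensional pose features, and the resulting weights could be passed to a weighted sampler in the training data loader; both choices are assumptions here.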