Self-Supervised Human-Object Interaction of Complex Scenes With Context-Aware Mixing: Towards In-Store Consumer Behavior Analysis

Takashi Kikuchi, Shun Takeuchi; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2024, pp. 744-751

Abstract


Recognizing human-object interactions (HOIs) in physical retail stores, such as picking up a product, can provide valuable information about non-purchasers and is an important part of understanding customer behavior. However, scenes in physical retail stores are often complex, with numerous similar objects on the shelves, which makes it challenging to recognize the object being interacted with. To address the challenge of complex background scenes, we propose a method that uses image mixing and self-supervised techniques to train the model to distinguish the object being interacted with from background objects. The proposed method uses context-aware image mixing to generate, from the input image, images without the object's influence. We then introduce a self-supervised method that uses the generated images to learn the difference between the actual object and the background objects. We evaluated the network's performance on public and private retail datasets. We confirmed that, when applied to physical retail scenes, the proposed method outperformed recent HOI detection methods, including the recent state-of-the-art method. To the best of our knowledge, this is the first study to apply a self-supervised technique to control the target of interaction for an HOI detection model, demonstrating promising potential for in-store consumer behavior analysis.

Related Material


[bibtex]
@InProceedings{Kikuchi_2024_WACV,
    author    = {Kikuchi, Takashi and Takeuchi, Shun},
    title     = {Self-Supervised Human-Object Interaction of Complex Scenes With Context-Aware Mixing: Towards In-Store Consumer Behavior Analysis},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {January},
    year      = {2024},
    pages     = {744-751}
}