Where Are They Looking in the 3D Space?

Nora Horanyi, Linfang Zheng, Eunji Chong, Aleš Leonardis, Hyung Jin Chang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023, pp. 2678-2687

Abstract


We propose a novel depth-aware joint attention target estimation framework that estimates the attention target in 3D space. Our goal is to mimic the human ability to understand where each person in their vicinity is looking. In this work, we tackle the previously unexplored problem of utilising a depth prior along with a 3D joint field-of-view (FOV) probability map to estimate the joint attention target of people in the scene. We leverage the insight that, besides the 2D image content, strong gaze-related constraints exist in the depth order of the scene and in different subject-specific attributes. Extensive experiments show that our method performs favourably against existing joint attention target estimation methods on the VideoCoAtt benchmark dataset. Although the proposed framework is designed for joint attention target estimation, we show that it also outperforms single attention target estimation methods on both the GazeFollow image and the VideoAttentionTarget video benchmark datasets.

Related Material


[bibtex]
@InProceedings{Horanyi_2023_CVPR,
    author    = {Horanyi, Nora and Zheng, Linfang and Chong, Eunji and Leonardis, Ale\v{s} and Chang, Hyung Jin},
    title     = {Where Are They Looking in the 3D Space?},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2023},
    pages     = {2678-2687}
}