GaitW: Enhancing Gait Recognition in the Wild using Dynamic Information

Daksh Thapar, Jayesh Chaudhari, Sunny Manchanda, Aditya Nigam, Chetan Arora; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 268-285

Abstract


The success of modern deep neural networks (DNNs) for gait recognition on in-the-lab datasets such as CASIA-B and OU-MVLP has encouraged the community to aim for more challenging, in-the-wild datasets such as GREW and Gait3D. The new datasets contain large variations in silhouettes due to changes in camera pose, clothing, and accessories, as well as occlusion, thus posing major challenges to existing techniques and training strategies for gait recognition. We posit that to achieve high accuracy on in-the-wild datasets, explicitly leveraging dynamic information in gait samples during training is imperative. We propose a novel transformer-based architecture for gait recognition that specifically leverages such dynamic information. The novel contributions include: (1) We propose interleaved spatial and temporal encoders to attend to the positioning of various body parts within a frame, and the movement of a body part across the sample, respectively. (2) We propose a novel dynamic-information-inspired curriculum, where we first determine the hardness of a sample based on the disparity between the representations of its frame-wise silhouettes (FWSs) and its Gait Energy Image (GEI). The model is trained on easier samples first, followed by progressively harder samples. (3) We propose mask-annealing for silhouettes using GEIs, which attends to silhouette contours and allows a model to learn a robust silhouette shape representation. We report accuracies (in %) of 96.9, 92.9, 81.2, and 67.7 on the benchmark CASIA-B, OU-MVLP, GREW, and Gait3D datasets, respectively, using our technique, against the current state-of-the-art (SOTA) accuracies of 96.9, 92.4, 77.4, and 67.0 by MSGR [43] (TMM23), HSTGait (ICCV23), SkeletonGait (AAAI24), and QAGait (AAAI24), respectively. In a significant departure from the current trend, and as evident from the above numbers, the proposed technique sets a simultaneous SOTA on the most prominent in-the-lab as well as in-the-wild datasets. The complete source code and trained models of our method will be made publicly available.
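For concreteness, below is a minimal sketch of the quantities the abstract builds on: the GEI as the per-pixel temporal average of aligned binary silhouettes, a hardness score taken as the embedding-space disparity between the GEI and the mean FWS representation, and an annealed mask derived from the GEI. The encoder `embed`, the Euclidean distance, the easy-to-hard sort, and the linear annealing schedule (with its `floor` parameter) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def gait_energy_image(silhouettes: np.ndarray) -> np.ndarray:
    """GEI: per-pixel temporal average of aligned binary silhouettes.

    silhouettes: (T, H, W) array with values in {0, 1}.
    Returns an (H, W) float array in [0, 1].
    """
    return silhouettes.astype(np.float32).mean(axis=0)

def sample_hardness(silhouettes, embed):
    """Hypothetical hardness score (assumption): disparity between the
    GEI embedding and the mean embedding of the frame-wise silhouettes.

    embed: a callable mapping an (H, W) image to a 1-D feature vector,
           a stand-in for the paper's learned encoder.
    """
    gei = gait_energy_image(silhouettes)
    fws_mean = np.mean([embed(frame) for frame in silhouettes], axis=0)
    return float(np.linalg.norm(embed(gei) - fws_mean))

def curriculum_order(samples, embed):
    """Order training samples from easy (low disparity) to hard."""
    return sorted(samples, key=lambda s: sample_hardness(s, embed))

def annealed_silhouette_mask(gei, progress, floor=0.05):
    """Hypothetical mask-annealing schedule (our assumption, not the
    paper's exact recipe): early in training (progress ~ 0) keep only
    pixels the subject occupies in nearly every frame; as progress
    approaches 1, lower the threshold so the less temporally consistent
    contour pixels of the GEI are gradually revealed.
    """
    threshold = 1.0 - (1.0 - floor) * progress  # linear anneal: 1.0 -> floor
    return (gei >= threshold).astype(np.float32)
```

Under these assumptions, contour pixels are exactly those with intermediate GEI values (the subject occupies them in only some frames), which is why both the hardness score and the annealed mask can be driven by the GEI alone.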

Related Material


@InProceedings{Thapar_2024_ACCV,
    author    = {Thapar, Daksh and Chaudhari, Jayesh and Manchanda, Sunny and Nigam, Aditya and Arora, Chetan},
    title     = {GaitW: Enhancing Gait Recognition in the Wild using Dynamic Information},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {268-285}
}