CGAPoseNet+GCAN: A Geometric Clifford Algebra Network for Geometry-Aware Camera Pose Regression

Alberto Pepe, Joan Lasenby, Sven Buchholz; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 6593-6603

Abstract


We introduce CGAPoseNet+GCAN, which enhances CGAPoseNet, an architecture for camera pose regression, with a Geometric Clifford Algebra Network (GCAN). With the addition of the GCAN we obtain a geometry-aware pipeline for camera pose regression from RGB images only. CGAPoseNet employs Clifford Geometric Algebra to unify quaternions and translation vectors into a single mathematical object, the motor, which can be used to uniquely describe camera poses. CGAPoseNet solves the issue of balancing rotation and translation components in the loss function, and can obtain results comparable to other approaches without the need for expensive tuning of the loss function or additional information about the scene, such as 3D point clouds, which might not always be available. CGAPoseNet, however, like several approaches in the literature, only learns to predict motor coefficients, and it is unaware of the mathematical space in which its predictions sit and of their geometrical meaning. By leveraging recent advances in Geometric Deep Learning, we modify CGAPoseNet with a GCAN: proposals of possible motor coefficients associated with a camera frame are obtained from the InceptionV3 backbone, and the GCAN downsamples them to a single motor through a sequence of layers that work in G_{4,0}. The network is hence geometry-aware, has multivector-valued inputs, weights and biases, and preserves the grade of the objects that it receives as input. Compared to CGAPoseNet, CGAPoseNet+GCAN has almost 4 million fewer trainable parameters and reduces the average rotation error by 41% and the average translation error by 8.8%. Similarly, it reduces rotation and translation errors by 32.6% and 19.9%, respectively, compared to the best performing PoseNet strategy. CGAPoseNet+GCAN reaches state-of-the-art results on 13 commonly employed datasets. To the best of our knowledge, it is the first experiment applying GCANs to the problem of camera pose regression.
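
As an illustration of what a "motor" in G_{4,0} looks like numerically, the following is a minimal sketch and not the authors' implementation. It builds the geometric product of G_{4,0} from scratch and composes two rotors into a single even-grade element with 8 coefficients (1 scalar, 6 bivector, 1 pseudoscalar). All function names, the choice of planes and the angles are hypothetical. In particular, using a rotor in a plane containing e4 as a stand-in for a translation is an assumption made here for illustration; the exact mapping from a translation vector to a rotor is defined in the paper and is not reproduced.

```python
import math

N = 4  # basis vectors e1..e4, each squaring to +1 in G(4,0)

def blade_sign(a, b):
    """Sign picked up when reordering the product of basis blades a, b (bitmasks)."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps & 1 else 1.0

def gp(x, y):
    """Geometric product of two multivectors stored as 16 coefficients."""
    out = [0.0] * (1 << N)
    for a, xa in enumerate(x):
        if xa == 0.0:
            continue
        for b, yb in enumerate(y):
            if yb != 0.0:
                # Euclidean signature: shared basis vectors contract to +1
                out[a ^ b] += blade_sign(a, b) * xa * yb
    return out

def reverse(x):
    """Reversion: a grade-g blade is multiplied by (-1)^(g(g-1)/2)."""
    out = []
    for idx, value in enumerate(x):
        g = bin(idx).count("1")
        out.append(-value if (g * (g - 1) // 2) & 1 else value)
    return out

def rotor(i, j, angle):
    """Rotor cos(angle/2) - sin(angle/2) e_i e_j, with 1-based i < j."""
    r = [0.0] * (1 << N)
    r[0] = math.cos(angle / 2)
    r[(1 << (i - 1)) | (1 << (j - 1))] = -math.sin(angle / 2)
    return r

# Indices of the even-grade blades (grades 0, 2 and 4)
EVEN = [i for i in range(1 << N) if bin(i).count("1") % 2 == 0]

R = rotor(1, 2, 0.7)   # rotation in the e1e2 plane (illustrative angle)
T = rotor(3, 4, 0.2)   # rotor in a plane containing e4 (assumed translation stand-in)
motor = gp(R, T)       # a single even-grade object combining both

coeffs = [motor[i] for i in EVEN]   # the 8 coefficients a network would regress
norm = gp(motor, reverse(motor))    # equals the scalar 1 for a unit motor
```

Because the product of even-grade elements is again even-grade, composing such objects never leaves the 8-dimensional motor space, which is one way to read the abstract's statement that the GCAN layers preserve the grade of their inputs.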

Related Material


[bibtex]
@InProceedings{Pepe_2024_WACV,
    author    = {Pepe, Alberto and Lasenby, Joan and Buchholz, Sven},
    title     = {CGAPoseNet+GCAN: A Geometric Clifford Algebra Network for Geometry-Aware Camera Pose Regression},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {6593-6603}
}