A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera

Yan Ru Pei, Sasskia Brüers, Sébastien Crouzet, Douglas Mclelland, Olivier Coenen; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 5780-5788


Event-based data are commonly encountered in edge computing environments where efficiency and low latency are critical. To interface with such data and leverage their rich temporal features, we propose a causal spatiotemporal convolutional network. This solution targets efficient implementation on edge-appropriate hardware with limited resources in three ways: (1) it deliberately uses a simple architecture and set of operations (convolutions, ReLU activations); (2) it can be configured to perform online inference efficiently via buffering of layer outputs; (3) it can achieve more than 90% activation sparsity through regularization during training, enabling very significant efficiency gains on event-based processors. In addition, we propose a general affine augmentation strategy acting directly on the events, which alleviates the problem of dataset scarcity for event-based systems. We apply our model to the AIS 2024 event-based eye tracking challenge, reaching a p10 accuracy of 0.9916 on the Kaggle private test set.
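The buffering idea behind online inference can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single causal temporal convolution with one output channel, and the names `causal_conv_offline` and `StreamingCausalConv` are hypothetical. The layer keeps the last K input frames in a FIFO buffer, so each new frame costs one kernel application and reproduces the result of running the causal convolution over the whole sequence.

```python
import numpy as np


def causal_conv_offline(x, w):
    """Causal temporal convolution over a full sequence.

    x: (T, C) input features, w: (K, C) kernel.
    Output y[t] depends only on x[t-K+1 .. t] (no future frames).
    """
    K = w.shape[0]
    # Left-pad with K-1 zero frames so the first outputs see only the past.
    padded = np.concatenate([np.zeros((K - 1, x.shape[1])), x], axis=0)
    return np.array([np.sum(padded[t:t + K] * w) for t in range(x.shape[0])])


class StreamingCausalConv:
    """Hypothetical streaming version: one output per incoming frame."""

    def __init__(self, w):
        self.w = np.asarray(w, dtype=float)
        # FIFO buffer holding the last K frames, oldest first, zero-initialized.
        self.buf = np.zeros_like(self.w)

    def step(self, frame):
        # Drop the oldest frame and append the newest at the end.
        self.buf = np.roll(self.buf, -1, axis=0)
        self.buf[-1] = frame
        return float(np.sum(self.buf * self.w))


rng = np.random.default_rng(0)
x = rng.standard_normal((20, 4))   # 20 time steps, 4 channels
w = rng.standard_normal((3, 4))    # temporal kernel of size 3

layer = StreamingCausalConv(w)
online = np.array([layer.step(frame) for frame in x])
offline = causal_conv_offline(x, w)
```

Under these assumptions `online` and `offline` agree to floating-point precision, which is the property that lets a causal network be trained on full sequences and deployed frame by frame.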

Related Material

@InProceedings{Pei_2024_CVPR,
    author    = {Pei, Yan Ru and Br\"uers, Sasskia and Crouzet, S\'ebastien and Mclelland, Douglas and Coenen, Olivier},
    title     = {A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {5780-5788}
}