Lightweight Delivery Detection on Doorbell Cameras

Pirazh Khorramshahi, Zhe Wu, Tianchen Wang, Luke DeLuccia, Hongcheng Wang; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 6962-6971

Abstract


Despite recent advances in video-based action recognition and robust spatio-temporal modeling, most proposed approaches depend on abundant computational resources to run large, computation-intensive convolutional or transformer-based neural networks and obtain satisfactory results. This limits the deployment of such models on edge devices with limited power and computing resources. In this work, we investigate an important smart home application, video-based delivery detection, and present a simple and lightweight pipeline for this task that can run on resource-constrained doorbell cameras. Our proposed pipeline relies on motion cues to generate a set of coarse activity proposals, which are then classified by a mobile-friendly 3D CNN. For training, we design a novel semi-supervised attention module that helps the network learn robust spatio-temporal features, and we adopt an evidence-based optimization objective that allows the uncertainty of the network's predictions to be quantified. Experimental results on our curated delivery dataset show the effectiveness of our pipeline compared to alternatives and highlight the benefits of our training-phase novelties, which yield considerable performance gains at no additional inference-time cost.
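The two ideas named in the abstract, motion-based activity proposals and evidence-based uncertainty, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the frame-differencing motion score, the threshold, gap, and minimum-length parameters, and the Dirichlet-style uncertainty formula (alpha = evidence + 1, uncertainty = K / sum(alpha), a formulation common in evidential deep learning) are all assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import List, Sequence, Tuple


def frame_motion_scores(frames: Sequence[Sequence[float]]) -> List[float]:
    """Mean absolute pixel difference between consecutive frames.

    Each frame is assumed to be a flattened grayscale image (hypothetical input format).
    """
    scores = []
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        scores.append(diff / len(cur))
    return scores


def motion_proposals(
    scores: Sequence[float],
    threshold: float = 0.1,  # assumed value, not from the paper
    min_len: int = 2,        # assumed minimum proposal length
    max_gap: int = 1,        # assumed number of quiet steps tolerated inside a proposal
) -> List[Tuple[int, int]]:
    """Group above-threshold motion scores into coarse [start, end) proposals."""
    proposals = []
    start = None
    gap = 0
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap > max_gap:
                end = i - gap + 1  # one past the last active index
                if end - start >= min_len:
                    proposals.append((start, end))
                start, gap = None, 0
    if start is not None:
        end = len(scores) - gap
        if end - start >= min_len:
            proposals.append((start, end))
    return proposals


def evidential_uncertainty(evidence: Sequence[float]) -> Tuple[List[float], float]:
    """Expected class probabilities and uncertainty from non-negative evidence.

    Assumes a Dirichlet parameterization: alpha_k = evidence_k + 1,
    probability_k = alpha_k / S, uncertainty = K / S, with S = sum(alpha).
    """
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    probs = [a / strength for a in alpha]
    return probs, num_classes / strength


if __name__ == "__main__":
    scores = [0.0, 0.0, 0.5, 0.6, 0.0, 0.7, 0.0, 0.0, 0.0, 0.9]
    print(motion_proposals(scores))
    print(evidential_uncertainty([8.0, 0.0]))
```

In a pipeline of this shape, only the proposed segments would be passed to the classifier, so a quiet scene costs little compute, and a proposal whose prediction carries high uncertainty could be treated as unreliable rather than reported as a delivery.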

Related Material


@InProceedings{Khorramshahi_2024_WACV,
    author    = {Khorramshahi, Pirazh and Wu, Zhe and Wang, Tianchen and DeLuccia, Luke and Wang, Hongcheng},
    title     = {Lightweight Delivery Detection on Doorbell Cameras},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {6962-6971}
}