Neural Exposure Fusion for High-Dynamic Range Object Detection

Emmanuel Onzon, Maximilian Bömer, Fahim Mannan, Felix Heide; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 17564-17573

Abstract


Computer vision in unconstrained outdoor scenarios must tackle challenging high dynamic range (HDR) scenes and rapidly changing illumination conditions. Existing methods address this problem with multi-capture HDR sensors and a hardware image signal processor (ISP) that produces a single fused image as input to a downstream neural network. The output of the HDR sensor is a set of low dynamic range (LDR) exposures, and the fusion in the ISP is performed in image space, typically optimized for human perception on a display. By preferring tonemapped content with smooth transition regions over detail (and noise) in the resulting image, this fusion typically does not preserve all information from the LDR exposures that may be essential for downstream computer vision tasks. In this work, we depart from conventional HDR image fusion and propose a learned, task-driven fusion in the feature domain. Instead of using a single companded image, we introduce a novel local cross-attention fusion mechanism that exploits semantic features from all exposures, learned in an end-to-end fashion with supervision from downstream detection losses. The proposed method outperforms all tested conventional HDR exposure fusion and auto-exposure methods in challenging automotive HDR scenarios.
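
The page does not include code, but as a rough illustration of the idea described in the abstract, the sketch below shows one way a local (per-pixel) cross-attention fusion over per-exposure feature maps could look in PyTorch. The module name, the 1x1-convolution projections, the single attention head, and the use of the mean feature as the query are all assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of local cross-attention exposure fusion (not the
# authors' code). Assumes N per-exposure feature maps from a shared
# backbone; attention is computed independently at each spatial location
# ("local"), with the softmax taken over the exposure axis.
import torch
import torch.nn as nn


class LocalCrossAttentionFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolutions project features to queries/keys/values per pixel.
        self.to_q = nn.Conv2d(channels, channels, kernel_size=1)
        self.to_k = nn.Conv2d(channels, channels, kernel_size=1)
        self.to_v = nn.Conv2d(channels, channels, kernel_size=1)
        self.scale = channels ** -0.5

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        # feats: list of N per-exposure feature maps, each (B, C, H, W).
        stack = torch.stack(feats, dim=1)                 # (B, N, C, H, W)
        B, N, C, H, W = stack.shape
        flat = stack.view(B * N, C, H, W)
        k = self.to_k(flat).view(B, N, C, H, W)
        v = self.to_v(flat).view(B, N, C, H, W)
        # Query from the mean feature at each pixel (an assumption).
        q = self.to_q(stack.mean(dim=1))                  # (B, C, H, W)
        # Attention logits over exposures, per spatial location.
        logits = (q.unsqueeze(1) * k).sum(dim=2) * self.scale  # (B, N, H, W)
        weights = logits.softmax(dim=1)       # softmax across exposures
        # Weighted sum of value features across exposures.
        return (weights.unsqueeze(2) * v).sum(dim=1)      # (B, C, H, W)


# Usage: fuse backbone features from three LDR exposures.
fusion = LocalCrossAttentionFusion(channels=64)
exposures = [torch.randn(2, 64, 32, 32) for _ in range(3)]
fused = fusion(exposures)  # (2, 64, 32, 32), fed to the detection head
```

In the paper's end-to-end setting, a module of this kind would be trained jointly with the detector, so the fusion weights are driven by the downstream detection losses rather than by a perceptual tonemapping objective.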

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Onzon_2024_CVPR,
    author    = {Onzon, Emmanuel and B\"omer, Maximilian and Mannan, Fahim and Heide, Felix},
    title     = {Neural Exposure Fusion for High-Dynamic Range Object Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {17564-17573}
}