SliceNets -- A Scalable Approach for Object Detection in 3D CT Scans
One of the most promising approaches for automated detection of guns and other prohibited items in aviation baggage screening is the use of 3D computed tomography (CT) scans. However, automated detection, especially with deep neural networks, faces two key challenges: the high dimensionality of individual 3D scans and the lack of labeled training data. We address these challenges using a novel image-based detection and segmentation technique that we call the slice-and-fuse framework. Our approach slices the input 3D volumes along the three cardinal directions, generates 2D predictions on each slice using 2D CNNs, and then fuses the 2D predictions to obtain a 3D prediction. We develop two distinct detectors based on this slice-and-fuse strategy: the Retinal-SliceNet, which uses a unified, single network with end-to-end training, and the U-SliceNet, which uses a two-stage paradigm, first generating proposals with a voxel labeling network and then refining the proposals with a 3D classification network. The networks are trained using a data augmentation approach that creates a very large training dataset by inserting weapons into 3D CT scans of threat-free bags. We demonstrate that the two SliceNets outperform state-of-the-art 3D object detection methods on a large-scale 3D baggage CT dataset for baggage classification, 3D object detection, and 3D semantic segmentation.
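The slice-and-fuse idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `predict_slice` is a hypothetical stand-in for a 2D CNN (here a plain intensity threshold), and fusing by averaging the three per-axis predictions is an assumption, since the abstract does not specify the fusion rule.

```python
import numpy as np


def predict_slice(slice_2d):
    """Hypothetical stand-in for a 2D CNN: per-pixel score in [0, 1]."""
    # Assumption: a simple intensity threshold replaces the real network.
    return (slice_2d > 0.5).astype(np.float32)


def slice_and_fuse(volume, predictor=predict_slice):
    """Slice a 3D volume along each axis, predict per slice, then fuse."""
    fused = np.zeros(volume.shape, dtype=np.float32)
    for axis in range(3):
        # Move the slicing axis to the front so iteration yields 2D slices.
        slices = np.moveaxis(volume, axis, 0)
        preds = np.stack([predictor(s) for s in slices], axis=0)
        # Restore the original axis order before accumulating.
        fused += np.moveaxis(preds, 0, axis)
    # Assumption: fusion by simple averaging of the three directions.
    return fused / 3.0


# Toy volume with one bright block standing in for an object of interest.
volume = np.zeros((4, 5, 6), dtype=np.float32)
volume[1:3, 2:4, 3:5] = 1.0
scores = slice_and_fuse(volume)
```

Each 2D predictor only ever sees one slice at a time, so the per-call input size stays two-dimensional even though the output `scores` is a full 3D score volume with the same shape as the input.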