Enhancing Human Face Recognition with an Interpretable Neural Network

Timothy Zee, Geeta Gali, Ifeoma Nwogu; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 0-0


The purpose of this work is to determine whether the ability to interpret a convolutional neural network (CNN) architecture can enhance human performance in face recognition. We are interested in distinguishing between the faces of two similar-looking actresses of Indian origin, who have only a few discriminating features. This recognition task proved challenging for humans who were not previously familiar with the actresses (novices), as they performed only slightly better than chance. When asked to perform the same task, humans who were more familiar with the actresses (experts) performed significantly better. We attempted the same task with a Siamese CNN, which performed as well as the experts. We therefore became interested in applying any new knowledge obtained from the CNN to improve the distinguishing abilities of other novices. This was accomplished by generating activation maps from the CNN. The maps showed which parts of the input face images created the highest activations in the last convolutional layer of the network. Using "fooling" techniques, we also investigated which spatial locations on the face were most responsible for confusing one actress for the other. Empirically, the cheekbones and foreheads were determined to be the strongest differentiating features between the actresses. By providing this information verbally to a new set of novices, we successfully raised the human recognition rates by 11%. In this work, we therefore successfully increased human understanding of facial recognition via post-hoc interpretability of a CNN.
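The abstract does not say how the activation maps were computed, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes the last convolutional layer's output is available as a channels x height x width nested list, averages over channels, clips negative values, upsamples to the input image size by nearest neighbour, and rescales to [0, 1]. The function name `activation_map` and its parameters are hypothetical.

```python
def activation_map(features, out_h, out_w):
    """Collapse conv feature maps (C x H x W nested lists) into one heatmap.

    Assumption: channel averaging stands in for whatever aggregation
    the paper actually used, which the abstract does not specify.
    """
    channels = len(features)
    h, w = len(features[0]), len(features[0][0])

    # Mean activation over channels at each spatial cell, clipped at zero.
    heat = [
        [max(sum(features[c][i][j] for c in range(channels)) / channels, 0.0)
         for j in range(w)]
        for i in range(h)
    ]

    # Nearest-neighbour upsampling to the input image resolution.
    up = [
        [heat[i * h // out_h][j * w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]

    # Rescale to [0, 1] so the map can be overlaid on the face image.
    lo = min(min(row) for row in up)
    hi = max(max(row) for row in up)
    span = hi - lo
    if span == 0:
        return [[0.0] * out_w for _ in range(out_h)]
    return [[(v - lo) / span for v in row] for row in up]
```

Cells with values near 1 mark the image regions that drive the strongest responses, which is the kind of evidence the abstract describes for the cheekbones and forehead.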

