Less Is More: Pursuing the Visual Turing Test With the Kuleshov Effect
The Turing test centers on the idea that if a computer can trick a human into believing that it is human, then the machine is deemed "intelligent", or indistinguishable from people. Designing a visual Turing test involves recognizing objects and their relationships in images and creating a method to derive new concepts from the visual information. Until now, the proposed visual tests have relied heavily on natural language processing to conduct a questionnaire or a storytelling task. We deviate from the mainstream and propose to reframe the visual Turing test through the Kuleshov effect, avoiding written or spoken language. The idea rests on devising a method that creates the concept of montage synthetically. As in the early days of cinema, we would like to convey messages through the interpretation of image shots, which a machine deciphers and whose readings are then compared with those scored by humans. The first implementation of this new test uses images from a psychology study in which the circumplex model is applied to rate each image. We consider five deep learning methodologies and eight optimizers, and, through semiotics, we derive an emotional state in the computer. The results are promising: they confirm that this version of the visual Turing test is challenging and opens a new research avenue.
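To make the circumplex rating concrete, the following is a minimal sketch of how a single image rating could be placed on the valence-arousal plane. It is an illustration only, not the procedure used in this work: the function names, the quadrant labels, and the assumption that ratings are centered on zero (for example, rescaled to [-1, 1]) are all hypothetical.

```python
import math


def circumplex_angle(valence: float, arousal: float) -> float:
    """Angle in degrees on the valence-arousal plane, in [0, 360).

    Assumption: valence is the horizontal axis and arousal the vertical
    axis, both centered on zero.
    """
    return math.degrees(math.atan2(arousal, valence)) % 360.0


def circumplex_quadrant(valence: float, arousal: float) -> str:
    """Coarse quadrant label for a rating (labels are illustrative)."""
    if valence >= 0 and arousal >= 0:
        return "excited"      # pleasant, activated
    if valence < 0 and arousal >= 0:
        return "distressed"   # unpleasant, activated
    if valence < 0 and arousal < 0:
        return "depressed"    # unpleasant, deactivated
    return "relaxed"          # pleasant, deactivated


if __name__ == "__main__":
    # Hypothetical rating: mildly unpleasant, strongly activated.
    print(circumplex_quadrant(-0.3, 0.8), circumplex_angle(-0.3, 0.8))
```

Under these assumptions, a machine-predicted rating and a human rating of the same image can be compared either by quadrant agreement or by angular distance on the plane.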