A Multimodal Dataset of Viewer Responses to Japanese Manzai Comedy

Kawamura, Kazuki; Kengo, Nakai; Rekimoto, Jun

Kazuki Kawamura, Nakai Kengo, Jun Rekimoto; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2025, pp. 42-50

Abstract

We present the first large-scale multimodal dataset of viewer responses to Japanese manzai comedy. We recorded synchronized facial video and audio from 241 native speakers who watched up to ten professional performances in randomized order at home (94.6% watched at least eight; analyses use n=228), yielding approximately 192 hours of data. Our findings are simple. (1) Viewers cluster into two stable styles of appreciation: about three quarters rate most acts consistently high, whereas about one quarter are more selective and variable; each person's style is stable across videos. (2) Ratings tend to rise, not fall, as a session progresses, a positive "momentum" effect rather than fatigue. (3) Across 77 annotated humor instances spanning nine categories, no single category clearly dominated; delivery and context appear to matter more than labels. This dataset provides a culturally grounded benchmark for affective computing and supports personalization and cross-cultural evaluation of emotion-aware systems.

[bibtex]

@InProceedings{Kawamura_2025_ICCV,
    author    = {Kawamura, Kazuki and Kengo, Nakai and Rekimoto, Jun},
    title     = {A Multimodal Dataset of Viewer Responses to Japanese Manzai Comedy},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2025},
    pages     = {42-50}
}