Overall description of the video:
1. Users rate the quality of captions generated by the TC-Gen pipeline. Users rated the tone-controlled captions generated by the TC-Gen pipeline (given their tone content extracted via TX) along five dimensions: Tone Alignment (Personality and Writing Style), Tone Relevance, Factual Consistency, Usefulness, and Human-likeness, as shown in the video. These ratings are used in the TC-Gen Caption Quality Assessment.

2. Users assess changes in tone intensity. Users assessed whether changes in tone attribute intensities can be perceived in the captions generated by TC-Gen while factual accuracy is maintained. These ratings are used to evaluate the tone controllability of the TC-Gen pipeline.

3. Users rate the quality of captions generated by RoadTones-VL. Users evaluated the tone-controlled captions generated by the RoadTones-VL model, using the ground-truth tone specifications from the RoadTones-51K dataset. Each user provided ratings along two dimensions: Tone Alignment (covering both Personality and Writing Style) and Factual Consistency. For every caption, the RoadTones-Eval framework also produced automatic metric scores for Personality Alignment (Sp), Writing Style Alignment (Sw), and Factual Consistency (FC). To validate these metrics, we computed the Spearman correlation between user ratings and RoadTones-Eval metrics, separately for each user, across all captions that user rated. We then aggregated the per-user correlations: the Personality and Writing Style correlations were averaged to obtain the Tone Alignment Score (TAS) correlation, while the Factual Consistency (FC) correlation was reported separately. Higher Spearman correlations for TAS and FC indicate stronger agreement between human judgments and RoadTones-Eval, which supports the reliability of the proposed evaluation metrics.
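
The per-user correlation and aggregation described in item 3 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the study: the function names (average_ranks, spearman, aggregate), the input layout, and the toy numbers are all hypothetical. It also assumes that each user supplied separate human ratings for Personality and Writing Style (if a single Tone Alignment rating was collected, the same rating list could be passed for both), and that per-user correlations are aggregated with a simple mean across users, which the description does not state explicitly.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = tied_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return math.nan  # constant ratings: rho is undefined
    return cov / math.sqrt(var_x * var_y)


def aggregate(per_user):
    """per_user maps user id -> {dimension: (human_ratings, metric_scores)}.

    Assumption: per-user correlations are combined with a plain mean.
    """
    tas, fc = [], []
    for dims in per_user.values():
        rho_p = spearman(*dims["personality"])
        rho_w = spearman(*dims["writing_style"])
        rho_fc = spearman(*dims["fc"])
        # Users with an undefined correlation are skipped for that quantity.
        if not (math.isnan(rho_p) or math.isnan(rho_w)):
            tas.append((rho_p + rho_w) / 2)  # TAS correlation for this user
        if not math.isnan(rho_fc):
            fc.append(rho_fc)
    return {"TAS": mean(tas), "FC": mean(fc)}


# Made-up ratings for one user and four captions (not data from the study).
toy_study = {
    "user_a": {
        "personality": ([4, 2, 5, 3], [0.8, 0.3, 0.9, 0.5]),    # human vs. Sp
        "writing_style": ([3, 3, 5, 2], [0.6, 0.4, 0.9, 0.5]),  # human vs. Sw
        "fc": ([5, 4, 2, 5], [0.9, 0.7, 0.2, 0.8]),             # human vs. FC
    },
}
summary = aggregate(toy_study)
```

Ties are common with Likert-style ratings, so the sketch uses average ranks rather than the shortcut formula based on squared rank differences, which is exact only when there are no ties.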


Attachment filename: RoadTones_USER_STUDY.mp4 (32360000 Bytes)

The video can be played in any standard media player, e.g., VLC.