@InProceedings{Sharma_2026_WACV,
    author    = {Sharma, Akshit and Patil, Prashant W},
    title     = {MemeTAG: Keyword-Driven Meme Classification through Tag Embedding Reconstruction},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {March},
    year      = {2026},
    pages     = {7679-7688}
}
MemeTAG: Keyword-Driven Meme Classification through Tag Embedding Reconstruction
Abstract
The proliferation of harmful internet memes poses a significant societal threat, yet their automated classification remains a formidable algorithmic challenge due to the nuanced, multimodal nature of their content. To address this, we introduce MemeTAG, a novel dual-objective framework that pioneers a keyword-aware approach to meme classification. Our core innovation is a two-part semantic guidance mechanism. First, we leverage a pretrained Vision-Language Model to generate a set of descriptive keywords that capture the high-level semantics. Second, we introduce the Aggregated Tag Inference Network (ATIN), an attention-based module that distills these keywords into a single, rich semantic embedding. This embedding serves as a target for a novel auxiliary reconstruction loss, which compels the model to learn deeply aligned visual and textual features. This approach, combined with an efficient three-stage training strategy, establishes a new state of the art on the HarMeme, Hateful Memes Challenge (HMC), and PrideMM datasets, outperforming existing methods.
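The keyword aggregation and dual objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the attention form, the reconstruction loss, or how the two objectives are weighted. The sketch assumes scaled dot-product attention against a single query vector, a mean-squared-error reconstruction term, and a hypothetical weighting factor `aux_weight`. All function names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tag_embeddings, query):
    """Collapse several keyword embeddings into one vector.

    Assumption: scaled dot-product attention with a single query vector.
    Returns the pooled embedding and the attention weights.
    """
    dim = len(query)
    scores = [
        sum(q * t for q, t in zip(query, tag)) / math.sqrt(dim)
        for tag in tag_embeddings
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * tag[i] for w, tag in zip(weights, tag_embeddings))
        for i in range(dim)
    ]
    return pooled, weights


def reconstruction_loss(predicted, target):
    """Assumption: mean squared error between the model's predicted
    embedding and the aggregated tag embedding used as the target."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def total_loss(classification_loss, predicted, target, aux_weight=0.5):
    """Dual objective: classification term plus weighted auxiliary term.
    aux_weight is a hypothetical hyperparameter, not taken from the paper."""
    return classification_loss + aux_weight * reconstruction_loss(predicted, target)


# Two toy keyword embeddings pooled with a zero query (uniform attention).
tags = [[1.0, 0.0], [0.0, 1.0]]
target, weights = attention_pool(tags, query=[0.0, 0.0])
print(target, weights)
```

In this toy setup the pooled vector plays the role of the reconstruction target, and `total_loss` adds the auxiliary term to whatever classification loss the main head produces.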