Fusion Learning Using Semantics and Graph Convolutional Network for Visual Food Recognition

Zhao, Heng; Yap, Kim-Hui; Kot, Alex Chichung

Heng Zhao, Kim-Hui Yap, Alex Chichung Kot; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021, pp. 1711-1720

Abstract

Food-related applications and services are essential for people's health and well-being. With the rapid development of social networks and mobile devices, food images captured by users can offer rich knowledge about the food and provide necessary dietary assistance for people who require special care. Existing food recognition frameworks in computer vision rely heavily on many-shot training of a deep network on large-scale food datasets. However, for many food categories it is difficult to collect enough images for training. Traditional few-shot learning cannot properly address this problem because of the complex characteristics and large variations of food images, and most few-shot frameworks cannot classify many-shot and few-shot categories at the same time. In this paper, we propose a new fusion learning framework for food recognition. It unifies many-shot and few-shot recognition under a single framework by leveraging extracted image representations and context-sensitive semantic embeddings. Further, because food categories are often correlated through commonalities such as shared ingredients and cooking methods, the framework uses a Graph Convolutional Network (GCN) to capture the inter-class relations between the image representations and semantic embeddings of different food categories. The resulting fusion classifier is more robust and discriminative. Comprehensive experiments on two popular food benchmarks show that the proposed framework achieves state-of-the-art fusion performance.
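The abstract does not specify the network details, so the following is only a minimal sketch of the general idea: each food category is a graph node whose feature concatenates an image representation and a semantic embedding, and a two-layer GCN propagates information between related categories to produce one classifier vector per category. The cosine-similarity graph, the `threshold` value, the layer sizes, the random inputs, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def normalize_adjacency(adj):
    """Add self-loops and apply symmetric normalization D^-1/2 (A + I) D^-1/2."""
    adj = adj + np.eye(adj.shape[0])
    deg = adj.sum(axis=1)              # >= 1 for every node because of the self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_hat, h, w, activate=True):
    """One graph convolution: aggregate neighbor features, then project."""
    out = a_hat @ h @ w
    return np.maximum(out, 0.0) if activate else out  # ReLU on hidden layers

def fusion_classifier(img_protos, sem_embeds, w1, w2, threshold=0.5):
    """Return one classifier vector per category (hypothetical design)."""
    # Node feature = image representation concatenated with semantic embedding.
    x = np.concatenate([img_protos, sem_embeds], axis=1)
    # Assumption: connect categories whose node features are cosine-similar.
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    adj = (unit @ unit.T > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)         # self-loops are added during normalization
    a_hat = normalize_adjacency(adj)
    hidden = gcn_layer(a_hat, x, w1)
    return gcn_layer(a_hat, hidden, w2, activate=False)

# Toy data standing in for real features (purely synthetic).
rng = np.random.default_rng(0)
num_classes, img_dim, sem_dim, hidden_dim = 5, 8, 6, 16
img_protos = rng.normal(size=(num_classes, img_dim))
sem_embeds = rng.normal(size=(num_classes, sem_dim))
w1 = rng.normal(size=(img_dim + sem_dim, hidden_dim)) * 0.1
w2 = rng.normal(size=(hidden_dim, img_dim)) * 0.1

classifiers = fusion_classifier(img_protos, sem_embeds, w1, w2)

# Score a query image feature against every category classifier.
query = rng.normal(size=(img_dim,))
scores = classifiers @ query
predicted_class = int(np.argmax(scores))
```

In this sketch many-shot and few-shot categories are treated identically as graph nodes, which is one plausible way a single classifier could cover both; in practice the weights `w1` and `w2` would be learned rather than drawn at random.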

Related Material

@InProceedings{Zhao_2021_WACV,
    author    = {Zhao, Heng and Yap, Kim-Hui and Kot, Alex Chichung},
    title     = {Fusion Learning Using Semantics and Graph Convolutional Network for Visual Food Recognition},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2021},
    pages     = {1711-1720}
}