Fine-Grained Visual Attribute Extraction From Fashion Wear

Viral Parekh, Karimulla Shaik, Soma Biswas, Muthusamy Chelliah; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021, pp. 3973-3977


Automatically extracting visual attributes for e-commerce data has widespread applications in cataloging, catalogue qualification and enrichment, visual search, etc. Here, we address the task of visual attribute extraction for a highly challenging real-world fashion dataset from the catalogue of Flipkart (an Indian e-commerce platform), collected from seller-uploaded product images. This data not only contains widely varying categories (e.g., shirt, sari, shoes), but also has both coarse-grained (e.g., occasion, top type, sari type) and fine-grained (e.g., neck type, print type) attributes. The training examples available for different attributes are highly imbalanced, making this task even more challenging. To this end, we propose an end-to-end framework which integrates multi-task learning with a transformer as an attention module, in addition to handling the data imbalance. The proposed architecture supports multiple attributes across various product categories in a scalable manner. Extensive experiments on the in-house dataset show the effectiveness of the proposed framework, which improves performance on the fine-grained attributes by 13% over the baseline across the attributes.
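The abstract does not say how the data imbalance is handled or how the per-attribute losses are combined. As a rough illustration only, the sketch below assumes one common choice: inverse-frequency class weights applied to a cross-entropy loss that is summed over the attributes labelled for a given product. The function names, attribute keys, and weighting scheme are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by N / (K * count), so rarer classes weigh more.

    Assumption: this weighting scheme is illustrative, not the paper's.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(logits_by_attr, target_by_attr, weights_by_attr):
    """Sum class-weighted cross-entropy over the labelled attributes.

    An attribute whose target is None (e.g. a neck type for a shoe)
    is skipped, so one model can serve many product categories.
    """
    loss = 0.0
    for attr, logits in logits_by_attr.items():
        target = target_by_attr.get(attr)
        if target is None:
            continue  # attribute not applicable or not annotated
        prob = softmax(logits)[target]
        loss += -weights_by_attr[attr][target] * math.log(prob)
    return loss
```

With labels `[0, 0, 0, 1]` for a hypothetical attribute, the rare class 1 receives three times the weight of class 0, so errors on under-represented attribute values contribute more to the total loss.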

Related Material

@InProceedings{Parekh_2021_CVPR,
    author    = {Parekh, Viral and Shaik, Karimulla and Biswas, Soma and Chelliah, Muthusamy},
    title     = {Fine-Grained Visual Attribute Extraction From Fashion Wear},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2021},
    pages     = {3973-3977}
}