SVGformer: Representation Learning for Continuous Vector Graphics Using Transformers

Defu Cao, Zhaowen Wang, Jose Echevarria, Yan Liu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 10093-10102

Abstract


Advances in representation learning have led to great success in understanding and generating data in various domains. However, in modeling vector graphics data, purely data-driven approaches often yield unsatisfactory results in downstream tasks, as existing deep learning methods often require the quantization of SVG parameters and cannot exploit the geometric properties explicitly. In this paper, we propose a transformer-based representation learning model (SVGformer) that directly operates on continuous input values and manipulates the geometric information of SVG to encode outline details and long-distance dependencies. SVGformer can be used for various downstream tasks: reconstruction, classification, interpolation, retrieval, etc. We have conducted extensive experiments on vector font and icon datasets to show that our model can capture high-quality representation information and significantly outperform the previous state of the art on downstream tasks.
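
To make the quantization issue concrete, the following is a minimal sketch, not the SVGformer method itself. It bins the coordinates of one path segment into a fixed number of levels, maps them back, and measures the round-trip error that a continuous-input model would avoid. The 0 to 24 coordinate range, the 256-bin default, the sample control points, and all function names are assumptions made for illustration; none of them come from the paper.

```python
# Illustrative only: uniform quantization of SVG coordinates and the
# round-trip error it introduces. Ranges, bin counts, and names are
# hypothetical and not taken from the paper.

def quantize(value, lo=0.0, hi=24.0, bins=256):
    """Map a continuous coordinate to the nearest integer bin index."""
    step = (hi - lo) / (bins - 1)
    idx = round((value - lo) / step)
    return max(0, min(bins - 1, idx))  # clamp to the valid index range


def dequantize(idx, lo=0.0, hi=24.0, bins=256):
    """Map a bin index back to the coordinate at that grid position."""
    step = (hi - lo) / (bins - 1)
    return lo + idx * step


def roundtrip(points, bins=256):
    """Quantize then dequantize every (x, y) point."""
    return [
        (
            dequantize(quantize(x, bins=bins), bins=bins),
            dequantize(quantize(y, bins=bins), bins=bins),
        )
        for x, y in points
    ]


def max_abs_error(points, bins=256):
    """Largest per-coordinate deviation caused by quantization."""
    recon = roundtrip(points, bins)
    return max(
        max(abs(x - rx), abs(y - ry))
        for (x, y), (rx, ry) in zip(points, recon)
    )


# Hypothetical control points of a single cubic Bezier segment.
curve = [(1.30, 2.75), (4.12, 9.87), (10.05, 11.33), (20.01, 3.49)]

if __name__ == "__main__":
    for bins in (16, 256):
        print(bins, "bins -> max error", max_abs_error(curve, bins))
```

The error is bounded by half the bin width, so it shrinks as the number of bins grows but does not vanish for coordinates that lie off the grid. A model that consumes the continuous values directly never introduces it.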

Related Material


[bibtex]
@InProceedings{Cao_2023_CVPR,
    author    = {Cao, Defu and Wang, Zhaowen and Echevarria, Jose and Liu, Yan},
    title     = {SVGformer: Representation Learning for Continuous Vector Graphics Using Transformers},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {10093-10102}
}