CCLSL: Combination of Contrastive Learning and Supervised Learning for Handwritten Mathematical Expression Recognition
Handwritten mathematical expressions differ considerably from ordinary linear handwritten text because of their two-dimensional structure and their many special symbols and characters. Hence, Handwritten Mathematical Expression Recognition (HMER) is considerably more challenging than ordinary handwriting recognition. At present, mainstream offline recognition systems are generally built on deep learning methods, but these methods struggle with HMER because of the lack of training data. In this paper, we propose an encoder-decoder method that combines contrastive learning and supervised learning (CCLSL), whose encoder is trained to learn semantic-invariant features shared by printed and handwritten characters. CCLSL improves the robustness of the model to variation in handwriting styles. Extensive experiments on the CROHME benchmark show that, without data augmentation, our model achieves an expression accuracy of 58.07% on CROHME2014, 55.88% on CROHME2016 and 59.63% on CROHME2019, substantially outperforming all previous state-of-the-art methods. Furthermore, our ensemble model improves the accuracy by a further 2.5% to 3.4%, achieving state-of-the-art performance on the public CROHME datasets for the first time.
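To illustrate how a contrastive term can pull printed and handwritten features together, the following is a minimal sketch, not the loss used in this paper. It assumes an InfoNCE-style objective in which row i of `printed` and row i of `handwritten` are non-zero feature vectors of the same expression, and in which the contrastive term is added to the supervised loss with a scalar weight. The names `info_nce_loss` and `combined_loss`, the temperature of 0.1 and the weighting scheme are all hypothetical choices made for illustration.

```python
import math


def _normalize(vector):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce_loss(printed, handwritten, temperature=0.1):
    """InfoNCE-style loss over paired feature vectors (illustrative only).

    Assumption: printed[i] and handwritten[i] encode the same expression
    (positive pair); every other handwritten[j] acts as a negative.
    """
    p = [_normalize(v) for v in printed]
    h = [_normalize(v) for v in handwritten]
    total = 0.0
    for i, anchor in enumerate(p):
        # Cosine similarity of the anchor to every candidate, scaled by temperature.
        logits = [
            sum(a * b for a, b in zip(anchor, candidate)) / temperature
            for candidate in h
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability assigned to the matching pair.
        total += log_denominator - logits[i]
    return total / len(p)


def combined_loss(supervised_loss, contrastive_loss, weight=1.0):
    """Weighted sum of the two objectives; the weight is a hypothetical hyperparameter."""
    return supervised_loss + weight * contrastive_loss
```

With features that agree across the two renderings, the loss is small; if the handwritten rows are rotated so that every pair is mismatched, the loss grows, which is the signal that would drive an encoder toward style-invariant features.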