YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID

Iñaki Erregue, Kamal Nasrollahi, Sergio Escalera; Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops, 2025, pp. 824-833

Abstract


We introduce YOLO11-JDE, a fast and accurate multi-object tracking (MOT) solution that combines real-time object detection with self-supervised Re-Identification (Re-ID). By incorporating a dedicated Re-ID branch into YOLO11s, our model performs Joint Detection and Embedding (JDE), generating appearance features for each detection. The Re-ID branch is trained in a fully self-supervised setting while simultaneously training for detection, eliminating the need for costly identity-labeled datasets. A triplet loss with hard positive and semi-hard negative mining strategies is used to learn discriminative embeddings. Data association is enhanced with a custom tracking implementation that integrates motion, appearance, and location cues. YOLO11-JDE achieves competitive results on the MOT17 and MOT20 benchmarks, surpassing existing JDE methods in terms of FPS while using up to ten times fewer parameters. This makes our method a highly attractive solution for real-world applications.
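
As an illustration of the triplet loss with hard positive and semi-hard negative mining mentioned in the abstract, the following is a minimal pure-Python sketch of one common batch formulation. It is not the authors' implementation. The function name `triplet_loss_mined`, the Euclidean distance, the margin value of 0.3, and the fallback to the closest negative when no semi-hard negative exists are all assumptions made for this example. The `labels` argument stands for hypothetical pseudo-identities (for instance, two views of the same object sharing a label); the abstract does not specify how positives are obtained.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss_mined(embeddings, labels, margin=0.3):
    """Mean triplet loss over all anchors that have a positive and a negative.

    Hard positive: the farthest sample sharing the anchor's label.
    Semi-hard negative: the closest differently labeled sample that is farther
    than the hard positive but still inside the margin.
    Assumption: if no semi-hard negative exists, use the closest negative.
    """
    n = len(embeddings)
    dist = [[euclidean(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]

    losses = []
    for a in range(n):
        pos = [dist[a][j] for j in range(n)
               if j != a and labels[j] == labels[a]]
        neg = [dist[a][j] for j in range(n) if labels[j] != labels[a]]
        if not pos or not neg:
            continue  # this anchor cannot form a triplet

        d_ap = max(pos)  # hard positive distance
        semi_hard = [d for d in neg if d_ap < d < d_ap + margin]
        d_an = min(semi_hard) if semi_hard else min(neg)

        losses.append(max(0.0, d_ap - d_an + margin))

    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    # Toy batch: four 2-D embeddings, two pseudo-identities.
    batch = [[0.0, 0.0], [0.1, 0.0], [0.3, 0.0], [5.0, 0.0]]
    ids = [0, 0, 1, 1]
    print(triplet_loss_mined(batch, ids))
```

In this formulation, well-separated identities contribute zero loss, so only anchors whose negatives fall within the margin of their hardest positive drive the embedding updates.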

Related Material


[pdf]
[bibtex]
@InProceedings{Erregue_2025_WACV,
    author    = {Erregue, I\~naki and Nasrollahi, Kamal and Escalera, Sergio},
    title     = {YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {February},
    year      = {2025},
    pages     = {824-833}
}