- [pdf] [arXiv]
A Scale-Invariant Trajectory Simplification Method for Efficient Data Collection in Videos
Training data is a critical requirement for machine learning tasks, and labeled training data can be expensive to acquire, often requiring manual or semi-automated data collection pipelines. For tracking applications, the data collection involves drawing bounding boxes around the classes of interest on each frame, and associate detections of the same "instance" over frames. In a semi-automated data collection pipeline, this can be achieved by running a baseline detection and tracking algorithm, and relying on manual correction to add/remove/change bounding boxes on each frame, as well as resolving errors in the associations over frames (track switches). In this paper, we propose a data correction pipeline to generate ground-truth data more efficiently in this semi-automated scenario. Our method simplifies the trajectories from the tracking systems and let the annotator verify and correct the objects in the sampled keyframes. Once the objects in the keyframes are corrected, the bounding boxes in the other frames are obtained by interpolation. Our method achieves substantial reduction in the number of frames requiring manual correction. In the MOT dataset, it reduces the number of frames by 30x while maintaining a HOTA score of 89.61%. Moreover, it reduces the number of frames by a factor of 10x while achieving a HOTA score of 79.24% in the SoccerNet dataset, and 85.79% in the DanceTrack dataset. The project code and data are publicly released at https://github.com/foreverYoungGitHub/trajectory-simplify-benchmark.