We show an animated video of the retrieved moments in a sequence from the LIPD testing dataset.
The sequence is ["eLIPD"]["921"]["seq2"], which is roughly 1400 frames long.
The figure in similarity_peaks.png shows the similarity peaks corresponding to Figure 7 of the main paper.
The peaks are found with SciPy's find_peaks, using a minimum height (similarity) of 0.6 and a minimum distance of 100 between neighbouring peaks (i.e., find_peaks(similarities, height=0.6, distance=100)).
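As a minimal sketch of this peak-picking step, the snippet below runs scipy.signal.find_peaks with the same height and distance settings on a synthetic similarity curve. The curve, the bump positions, and the helper name retrieve_moments are illustrative assumptions and are not taken from the actual sequence or code.

```python
import numpy as np
from scipy.signal import find_peaks


def retrieve_moments(similarities, height=0.6, distance=100):
    """Return indices and heights of similarity peaks (hypothetical helper)."""
    peaks, props = find_peaks(similarities, height=height, distance=distance)
    return peaks, props["peak_heights"]


# Synthetic per-frame similarity curve (assumption: about 1400 frames).
frames = np.arange(1400)
bumps = [(200, 0.9), (250, 0.7), (700, 0.5), (1100, 0.8)]  # (centre, height)
similarities = sum(
    h * np.exp(-0.5 * ((frames - c) / 10.0) ** 2) for c, h in bumps
)

peaks, heights = retrieve_moments(similarities)
print(peaks)  # indices of the retrieved moments

# Without the distance limit, the weaker neighbour at frame 250 is also kept.
peaks_no_limit, _ = retrieve_moments(similarities, distance=1)
print(peaks_no_limit)
```

In this toy curve the bump at frame 700 is dropped because it stays below 0.6, and the bump at frame 250 is dropped because it lies within 100 samples of the stronger peak at frame 200.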
The animated GIF steps through the retrieved moments: the top two rows show the point cloud and skeleton at the respective frame, and the bottom row shows the IMU signal.
The first IMU signal is the query, and we compute the embedding similarity against the point cloud modality.
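The cross-modal comparison can be sketched as follows, assuming cosine similarity between one IMU query embedding and one point cloud embedding per frame. The similarity measure, the embedding size, and the function name cosine_similarities are assumptions made for illustration; the random arrays stand in for real model outputs.

```python
import numpy as np


def cosine_similarities(query_embedding, candidate_embeddings, eps=1e-8):
    """Cosine similarity of one query against each row of candidates."""
    q = query_embedding / (np.linalg.norm(query_embedding) + eps)
    norms = np.linalg.norm(candidate_embeddings, axis=1, keepdims=True)
    c = candidate_embeddings / (norms + eps)
    return c @ q  # shape: (num_frames,)


rng = np.random.default_rng(0)
imu_query = rng.normal(size=16)              # placeholder IMU embedding
point_cloud_embs = rng.normal(size=(5, 16))  # placeholder per-frame embeddings
point_cloud_embs[3] = 2.0 * imu_query        # frame 3 matches the query direction

sims = cosine_similarities(imu_query, point_cloud_embs)
best_frame = int(np.argmax(sims))
print(best_frame, sims.shape)
```

The resulting per-frame array is the kind of curve that a peak detector such as scipy.signal.find_peaks can then be run on.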
 