This supplemental file contains:
- our appendix
- a simple MNIST demo (produces tokens, then generates from the compacted sequences)
- our token processor (C++ and Python versions), which applies MDBPE to an input dataset
  --> for the exact input format the token processor expects, run the MNIST demo and inspect what it produces

The current version of the code is available on our GitHub:
https://github.com/DaiDaiLoh/MDBPE_TF