
This method reminds me of how deep learning models get compressed for deployment on accelerators. You take advantage of the different redundancies in different data structures and compress each of them with a method suited to it.

Specifically, the dictionary + delta-encoded + Huffman'd index lists method mentioned in TFA is commonly used for compressing weights. Weights tend to be sparse but clustered, so most deltas between nonzero indices are small numbers with the occasional large jump, which gives a skewed distribution that Huffman coding handles well.
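A rough sketch of the idea (the index list and the helper are made up for illustration, not from TFA): delta-encode the positions of nonzero weights, then Huffman-code the deltas. Clustered nonzeros make most deltas equal to 1, so they get very short codes.

```python
import heapq
from collections import Counter

# Hypothetical positions of nonzero weights: clustered, with occasional jumps.
indices = [3, 4, 5, 6, 40, 41, 42, 90, 91, 92, 93, 200]

# Delta-encode: store the gap from the previous nonzero index.
deltas = [indices[0]] + [b - a for a, b in zip(indices, indices[1:])]
# -> mostly 1s, a few large jumps: a skewed distribution Huffman exploits.

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for an optimal Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (total count, tiebreak id, {symbol: depth so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, next_id, merged))
        next_id += 1
    return heap[0][2]

lengths = huffman_code_lengths(deltas)
bits = sum(lengths[d] for d in deltas)
# Compare with storing each raw index at a fixed width (8 bits here).
print(f"Huffman-coded deltas: {bits} bits vs fixed 8-bit indices: {8 * len(indices)} bits")
```

For this toy list the eight deltas of 1 each cost a single bit, while the four rare jumps get longer codes, so the whole index list shrinks well below the fixed-width encoding.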



