-<p>Roaring solves this problem. It divides the data into chunks of 2<sup>16</sup> integers (e.g., [0, 2<sup>16</sup>), [2<sup>16</sup>, 2 × 2<sup>16</sup>), …). Within a chunk, it can use an uncompressed bitmap, a simple list of integers, or a list of runs. Whichever format it uses, you can check for the presence of any one value quickly (e.g., with a binary search). The net result is that Roaring can compute many operations much faster than run-length-encoded formats like WAH, EWAH, Concise… Maybe surprisingly, Roaring also generally offers better compression ratios.</p>
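<p>The chunking idea can be sketched in a few lines of Python. This is a toy illustration, not the Roaring library's API: the class name <code>ToyRoaring</code>, the constant names, and the array-versus-bitmap threshold of 4096 are assumptions made for the example. Run containers are left out for brevity, and values are assumed to be non-negative integers.</p>

```python
from bisect import bisect_left

CHUNK_BITS = 16      # each chunk covers 2**16 consecutive integers
ARRAY_LIMIT = 4096   # assumed cutoff: denser chunks are stored as bitmaps


class ToyRoaring:
    """Toy chunked set of non-negative integers (hypothetical, for illustration)."""

    def __init__(self, values):
        # Group the low 16 bits of each value under its high bits (the chunk key).
        groups = {}
        for v in sorted(set(values)):
            groups.setdefault(v >> CHUNK_BITS, []).append(v & 0xFFFF)

        self.keys = sorted(groups)
        self.containers = []
        for key in self.keys:
            lows = groups[key]
            if len(lows) > ARRAY_LIMIT:
                # Dense chunk: uncompressed bitmap of 2**16 bits (8192 bytes).
                bitmap = bytearray(1 << 13)
                for low in lows:
                    bitmap[low >> 3] |= 1 << (low & 7)
                self.containers.append(("bitmap", bitmap))
            else:
                # Sparse chunk: sorted list of 16-bit values.
                self.containers.append(("array", lows))

    def __contains__(self, value):
        key, low = value >> CHUNK_BITS, value & 0xFFFF
        # Binary search for the chunk that could hold the value.
        i = bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            return False
        kind, data = self.containers[i]
        if kind == "bitmap":
            # Single bit test.
            return bool(data[low >> 3] & (1 << (low & 7)))
        # Binary search within the sorted list.
        j = bisect_left(data, low)
        return j < len(data) and data[j] == low


s = ToyRoaring([1, 5, 70000] + list(range(200000, 210000)))
print(5 in s, 2 in s, 205000 in s)
```

<p>Whatever container a chunk ends up with, a membership test touches only that chunk. A run-length-encoded format, by contrast, may need to scan from the start of the data to locate a value.</p>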