Sunday, April 17, 2011

Segment Trees and Bitmaps

Bitmaps are used all the time to keep track of free lists since they are a very compact way of keeping track of true/false information. They are used in the ext2/ext3 file system to keep track of free inode and data blocks, in the bitmapped_allocator to keep track of free memory blocks, and in many many other places.

However, in almost all places, a linear search technique is used to get the first set (or unset) bit. This works very well for small sets of data, but say you have 40 million items that you want to keep track of, you'll land up linear searching 40x106/8/1024/1024 = 4.7 MB of data, which is more than the cache size of most CPUs these days. What this effectively means is that you will land up filling up the CPU cache with the bitmaps, leaving very ittle space for your application's real data

This is obviously bad news and we should do something to fix it.

Welcome Trees!!

We take some inspiration from Segment Trees and use a hierarchial structure to store range data that will help us divide-and-conquor the free list in a zippy fashion.

Let's assume that each byte on our fictitious machine can hold just 2 bits. A bit 0 means that the object that that bit represents is free, whereas a bit 1 means that the object that that bit represents is taken (or allocated).

The root node gives us information about the complete range (all the 8 bits in the leaves). The nodes at the 1st level give us information about 4 bits each, and each of the leaf nodes give us information about 2 bits each (this is a base case since each byte on our machine can hold just 2 bits).

Now, finding a free node is just a simple tree traversal down a path that has 0 in one of the bit positions. While coming up, we set the 0 bits in each level (that we followed down) to a 1 bit ONLY if all the children of that 0 bit are all 1. We unconditionally set the 0 bit in the leaf node to 1 and go up the tree.

Hence, you can see that we have converted a O(n) operation to a O(lg n) operation by increasing the space requirements to 2x the original. Even though the space requirement is 2x the original, the tree based algorithm is much better not just from a time complexity perspective, but is also much more cache friendly. You should expect to have only the first 2 levels entirely in the cache. The remaining levels will come and go as necessary. This is the exact reason why B-Trees are better as an on-disk indexing and retrieval structure.

On a 32-bit machine, we need just a little over 5 levels of the tree to store 40 million bits (34 million bits can be stored completely in just 5 levels).

Update: You can achieve the same runtime complexity of O(lg n) and NO extra space overhead (compared to the linear scan solution) by using a Fenwick Tree instead of a Segment Tree.

1 comment:

Anonymous said...

thank you very much.for information.