Monday, April 18, 2011

Bajra Roti/Parantha

Ingredients:
Bajra atta (Pearl millet flour): 2 cups
Water: Enough to bind the flour (keep 2 cups handy)

Procedure:
  1. Add little water and try to bind the flour. Keep adding a water as and when required
  2. Once the flour is hydrated, add a little more water to make it smooth
  3. Take a plastic sheet and place it on the chakla (flat circular rolling board)
  4. Ensure that you heat the tava (flat skillet) before you begin. Turn the flame to low when it is heated up
  5. Take about 100-150g of dough and place it on the plastic sheet
  6. Sprinkle some millet flour on top of it so that it doesn't stick to your hand
  7. Using your hand, gently press down the dough ball into a flat circular disc - rotating the dough all the time. It should be easy to rotate the dough since the plastic sheet will slide very easily on the rolling board
  8. Keeping the flame on low, place the flattened dough (top side down) on the tava
  9. Once you see small bubbles popping up on the surface, turn the roti over
  10. Heat till you see brown spots on the under-surface
  11. Turn the flame to high and momentary place the roti (turned over again) on the flame directly. This should make the roti balloon. The ballooning of the roti means that it has been cooked properly
  12. Turn over to other side to cook consistently (as desired - but don't burn it please!!)
  13. Serve with dahi (curd), butter, jaggery and vegetables. Enjoy!!


News these days

Being the beneficiary of a headache yesterday, I had the privilege of being unproductive and just fixing myself in front the of television set and watching the daily news with the rest of the family. The news reader almost mechanically read out these 2 news items back to back (in the same segment):
  1. Police firing in The Ratnagiri district of Maharashtra kill 1 and injures 5, and
  2. Indian Ministers oppose the wastage of food at weddings and ask the organisers to provision accurately
It just seems so weird that these 2 news items come together - any way you look at it!! And I'm not even going into whether the people opposing the wastage are themselves practicing what they preach.

Sunday, April 17, 2011

Segment Trees and Bitmaps

Bitmaps are used all the time to keep track of free lists since they are a very compact way of keeping track of true/false information. They are used in the ext2/ext3 file system to keep track of free inode and data blocks, in the bitmapped_allocator to keep track of free memory blocks, and in many many other places.

However, in almost all places, a linear search technique is used to get the first set (or unset) bit. This works very well for small sets of data, but say you have 40 million items that you want to keep track of, you'll land up linear searching 40x106/8/1024/1024 = 4.7 MB of data, which is more than the cache size of most CPUs these days. What this effectively means is that you will land up filling up the CPU cache with the bitmaps, leaving very ittle space for your application's real data
.

This is obviously bad news and we should do something to fix it.

Welcome Trees!!

We take some inspiration from Segment Trees and use a hierarchial structure to store range data that will help us divide-and-conquor the free list in a zippy fashion.

Let's assume that each byte on our fictitious machine can hold just 2 bits. A bit 0 means that the object that that bit represents is free, whereas a bit 1 means that the object that that bit represents is taken (or allocated).


The root node gives us information about the complete range (all the 8 bits in the leaves). The nodes at the 1st level give us information about 4 bits each, and each of the leaf nodes give us information about 2 bits each (this is a base case since each byte on our machine can hold just 2 bits).

Now, finding a free node is just a simple tree traversal down a path that has 0 in one of the bit positions. While coming up, we set the 0 bits in each level (that we followed down) to a 1 bit ONLY if all the children of that 0 bit are all 1. We unconditionally set the 0 bit in the leaf node to 1 and go up the tree.


Hence, you can see that we have converted a O(n) operation to a O(lg n) operation by increasing the space requirements to 2x the original. Even though the space requirement is 2x the original, the tree based algorithm is much better not just from a time complexity perspective, but is also much more cache friendly. You should expect to have only the first 2 levels entirely in the cache. The remaining levels will come and go as necessary. This is the exact reason why B-Trees are better as an on-disk indexing and retrieval structure.

On a 32-bit machine, we need just a little over 5 levels of the tree to store 40 million bits (34 million bits can be stored completely in just 5 levels).

Update: You can achieve the same runtime complexity of O(lg n) and NO extra space overhead (compared to the linear scan solution) by using a Fenwick Tree instead of a Segment Tree.