Tuesday, November 08, 2016

Generators in C++

This post is about a generic framework (an idea, really) for writing generator functions in C++. Languages such as Python have first-class support for generators via the yield keyword; C++ has no such support. From the perspective of API design, there may be a need to support sequence-generating functions in C++, where each subsequent invocation of the function produces the next element in a sequence.

For example, a simple generator may generate numbers between 0 and n-1. If n = 10, then we should get the sequence 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.

The C++ code below shows how we can do this using the goto keyword. The basic idea is to write the function the same way we would write a normal function that generates every value in the sequence, but instead of printing each value (or passing it to a callback), we return it, i.e. yield from the function. The next time the function is called, we resume execution from the line after the previous return statement. To do this, we need to track the function's internal state, as well as whether this is its first invocation, since the first call probably needs to be handled in a special manner.
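For comparison, this is what Python's first-class support gives us for the 0 to n-1 example: each call to next() resumes right after the previous yield, which is exactly what the goto-based C++ framework emulates.

```python
def count_upto(n):
    """Generate the numbers 0, 1, ..., n-1, one per invocation."""
    i = 0
    while i < n:
        yield i  # suspend here; the next call resumes on the line below
        i += 1

print(list(count_upto(10)))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```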

The code for the simple generator as mentioned above is shown below:

The code below performs an inorder traversal of a Binary Tree, yielding the next node in the sequence instead of printing its value or invoking a callback. The advantage of such an API is that it makes traversing multiple Binary Trees at once simple: if the in-order traversal is implemented recursively, it becomes hard to traverse 2 trees in lock-step so that, for example, their values can be merged in non-decreasing order.
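To make that payoff concrete, here is a sketch in Python (used for brevity, since it has native yield); the Node class and function names are illustrative, not from the original post:

```python
import heapq

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(root):
    """Lazily yield the values of a binary search tree in non-decreasing order."""
    if root is not None:
        yield from inorder(root.left)
        yield root.value
        yield from inorder(root.right)

def merged_inorder(t1, t2):
    """Merge the values of two BSTs in non-decreasing order.
    Because each traversal is a lazy generator, this is a single call."""
    return heapq.merge(inorder(t1), inorder(t2))
```

For example, with t1 = Node(2, Node(1), Node(4)) and t2 = Node(3, None, Node(5)), merged_inorder(t1, t2) yields 1, 2, 3, 4, 5.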

Wednesday, May 20, 2015

Checking if some text "looks like" a given pattern

We as humans are able to perform pretty decent pattern matching when it comes to checking for object similarity. We are able to quickly see patterns in the world around us and generalize based on those patterns. However, can we program machines to see these patterns?

There's an interesting question I came across online, which basically asks the following:

Given a pattern and a string input, find if the string follows the same pattern and return 'True' or 'False'. Examples:
  1. Pattern: "abba", input: "redblueredblue" should return 'True'
  2. Pattern: "aaaa", input: "asdasdasdasd" should return 'True'
  3. Pattern: "aabb", input: "xyzabcxzyabc" should return 'False'

It turns out that the solution to this question is an algorithm that, in the worst case, takes time exponential in the size of the input 'text', because we may need to test every possible substring as a candidate substitution for each pattern letter.

Here is some Python code that solves it using Python regexes (these are more powerful than POSIX regular expressions, since they allow you to use backreferences). The solution relies heavily on the backtracking capabilities of Python's regex engine.
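A minimal sketch of this idea (not necessarily the post's exact code): each new pattern letter opens a capture group, and each repeat becomes a backreference. Note that `(.*)` is used rather than `(.+)`, since example 1 above only matches if a pattern letter is allowed to stand for the empty string; the sketch also does not force distinct pattern letters to map to distinct substrings.

```python
import re

def matches_pattern(pattern, text):
    """Translate the pattern into a backreferencing regex and match it,
    letting the regex engine's backtracking do the exponential search."""
    group_of = {}  # pattern letter -> capture-group number
    parts = []
    for ch in pattern:
        if ch not in group_of:
            group_of[ch] = len(group_of) + 1
            parts.append("(.*)")                 # first occurrence: new group
        else:
            parts.append("\\%d" % group_of[ch])  # repeat: backreference
    return re.fullmatch("".join(parts), text) is not None
```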

Thursday, April 23, 2015

Range (Interval) Union

The Range (or Interval) Union problem is a pretty commonly encountered problem both in real life and in a technical (coding) interview setting.

Problem Statement: Given a set of ranges (intervals) of the form [start, end), compute the Union of these intervals such that the result set of intervals contains no overlapping intervals.

Other variations of the problem ask you to compute things like:
  1. The largest set (cardinality) of intervals that cross each other
  2. The number of intervals crossing a given range (or point, or set of ranges/points)
All these variations can be answered using the general technique described below.

The aim of this post is to provide a very simple solution to a problem that would otherwise require fancy data structures, while retaining a decent, workable run-time complexity. The solution presented below requires O(n log n) time if the input intervals are not sorted, and O(n) time if we assume the input intervals are sorted by their start points.

This is a shuttle post (i.e. the solution was written up in a shuttle, with the duration of the journey being ~40 minutes).

A massive thanks to Vishwas for providing the most simplified solution to this problem (which he came up with after I described the problem to him in the shuttle itself!)

The solution: First sort all intervals by their start point, then make a single clever pass to compute the union. The key observation is that when processing intervals in order of start point, the current union interval extends at least as far as the end point of the first interval in it, so we move ahead, absorbing every interval that begins before the current union interval ends. Each time we absorb an interval, we update the union interval's end point to the right-most end point of any interval included so far. That's pretty much it. Once we encounter an interval that starts after the current union interval ends, we close off the union interval and start a new one that starts and ends where the new interval starts and ends. Note that we only ever update the end point of the union interval while extending it, since we started with the left-most start point.
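The pass described above can be sketched as follows (function and variable names are my own):

```python
def union_of_intervals(intervals):
    """Union of [start, end) intervals: O(n log n) to sort, O(n) to merge."""
    merged = []
    for start, end in sorted(intervals):      # process in order of start point
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the current union interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])       # start a new union interval
    return [tuple(iv) for iv in merged]
```

For example, union_of_intervals([(5, 7), (1, 3), (2, 6), (9, 11)]) returns [(1, 7), (9, 11)].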

Examples used in the code: (Blue boxes are input intervals and the green boxes are the union intervals).

Monday, April 13, 2015

Models of engagement

As a person who solves problems for a living, there are a few options one has as far as current models of engagement (or, more traditionally, employment) are concerned:

  1. Engaged full time
  2. Engaged part time
  3. Freelance (i.e. somehow the work and the worker find each other)
I would argue that [1] and [2] are a special case of [3], wherein [3] just runs in a loop and the overhead of performing the search and matching is avoided, gaining some efficiency by eliding the constant extra cost of running the matching algorithm.

[3] is (in some ways) the purest form of engagement, but some industries (such as the Bollywood music industry) are more conducive to this model than, say, a person working in a factory (churning out either tangible goods or code, etc.); please pardon the word "factory": some of you might be offended, and you should be, since not all code is created equal, and a lot of code is also a work of art. I'm ignoring some of the things that come with being employed full (or part) time, such as health insurance, since I want to focus on the most important aspects of employment, which in my opinion are impact and engagement.

Why is it that some industries have a preference for a certain model and others prefer another model?

Is it in the best interests of both parties to gravitate towards the most flexible model in most situations?

Which situations require that less flexible models thrive and are conducive to a more long term contractual type of engagement setting?

Which situations require that more flexible models thrive and are conducive to a more short term freelance type of engagement setting?

Wednesday, February 18, 2015

Searching faster than Binary Search

We all know that Binary Search is asymptotically the fastest way to search for data within a sorted (ordered) array in the comparison (or simple decision tree) model assuming nothing in particular about the data or the data types.

However, there's an entire class of algorithms and data structures that focus on how efficiently they utilize the host system's cache while processing data. These are called cache-efficient algorithms and data structures [pdf].

There are 2 types of cache-efficient algorithms and data structures:
  1. One class tunes the algorithm for a particular cache size or cache hierarchy. The algorithm is aware of the cache sizes, the number of caching levels, and the relative speeds of each of these caching levels.
  2. Another class of algorithms is oblivious to the underlying cache sizes and layouts, and is provably optimal for any cache size and caching layout (sounds magical, doesn't it!). These are called Cache Oblivious Algorithms. See this link on Cache Oblivious Algorithms for more details, and this link for more details on the model and the assumptions made in the Cache Oblivious model.

  • An example of a cache-efficient algorithm that is also cache-oblivious is Linear Search.
  • An example of a cache-inefficient algorithm that isn't cache-oblivious is Binary Search.
  • An example of a cache-efficient data structure that isn't cache-oblivious is the B-Tree (since B is the tuning parameter for the particular machine on which we are running).
Without getting into the details, the complexity of running Binary Search on an array in the Disk Access Model (DAM) (where we count only the number of disk blocks read, not the number of comparisons made) is O(log(N/B)), since we must load a block from disk for every probe until we reach a small enough sub-array (of size B) within which no further jumps trigger another disk I/O operation to fetch another disk block. The optimal complexity for searching ordered data on disk is O(log_B N), realized by laying the data out recursively in a static search tree.

However, implementing that structure is somewhat complicated, and we should ask ourselves if there is a way to get the best of both worlds. i.e.
  • Runtime efficiency of the cache-oblivious recursive layout, and
  • Implementation simplicity of the standard Binary Search algorithm on an ordered array

Turns out, we can reach a compromise if we use the square-root trick. This is how we'll proceed:

  1. Promote every sqrt(N)'th element to a new summary array. We use this summary array as a first-level lookup structure.
  2. To lookup an element, we perform binary search within this summary array to find the possible extents within the original array where our element of interest could lie.
  3. We then use binary search on that interesting sub-array of our original array to find our element of interest.

For example:

  • Suppose we are searching for the element '7', we will first perform binary search on the top array; i.e. [10,22,33,43] and identify that 7 must lie in the original values sub-array at an index that is before the index of 10. We then restrict our next binary search to the sub-array [5,7,8,10].
  • Suppose we are searching for the element '22', we first identify the sub-array [11,21,22,25,26,33] as potentially containing a solution and perform binary search on that sub-array.
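Here is a sketch of the two-level lookup (names are mine; note that, unlike the figure above, this sketch promotes the first element of each sqrt(N)-sized block into the summary array, but the idea is the same):

```python
from bisect import bisect_left, bisect_right
from math import isqrt

def build_summary(values):
    """Promote every sqrt(N)-th element of the sorted array `values`
    into a first-level summary array."""
    step = max(1, isqrt(len(values)))
    return step, values[::step]          # summary[i] == values[i * step]

def two_level_search(values, step, summary, x):
    """Binary-search the summary to narrow down one sub-array, then
    binary-search only that sub-array. Returns the index of x, or -1."""
    j = bisect_right(summary, x)         # number of block heads <= x
    if j == 0:
        return -1                        # x is below every element
    lo, hi = (j - 1) * step, min(j * step, len(values))
    i = bisect_left(values, x, lo, hi)   # second-level binary search
    return i if i < len(values) and values[i] == x else -1
```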

Even though we are asymptotically performing the same number of element comparisons overall, our cache locality has improved, since the number of block transfers we'll perform for a single lookup is now 2·log(√N/B), which is an additive factor of log B less than the log(N/B) we had for plain binary search on the sorted array.

You can find another similar idea named Cache Conscious Binary Search described here.

All the code needed for reproducing the results shown here can be found below:

Sunday, February 15, 2015

The many small steps to Pincha Mayurasana (पिंच मयूरासन ) (Feathered Peacock Pose)

Pincha Mayurasana (or Feathered Peacock Pose) is a yoga pose that roughly translates to a forearm stand. This can be the next step after mastering (or getting decent at) the Headstand (Shirshasana).

You'll need some tools to be able to start practicing the feathered peacock pose. These are:

  1. Forearm strength
  2. Shoulder strength
  3. Lower Back strength
  4. Lower Back flexibility (to arch into a slight concavity)
  5. General balance (that you'll have developed as part of the headstand practice)
  6. Some Core strength (that you'll have developed as part of the headstand practice)

Here are some posts explaining (in great detail) how to attain the final pose and possibly some of the intermediate steps needed to be practiced on the way to getting there:
  1. Baby Steps to Forearm Stands
  2. Pincha Mayurasana (Two Fit Moms)
  3. Feathered Peacock Pose (Yoga Journal) and the video
  4. Benefits of Pincha Mayurasana
Here are some excellent videos that you should watch to get a visual introduction to the pose:

Let me now discuss some of the extra bits I would like to add or stuff I would like to emphasize from the posts and videos above:

  1. Practicing the Cobra Pose (Bhujangasana) was helpful in opening up my lower back
  2. Practicing the Locust Pose (Salabhasana) (or even doing one leg at a time) was extremely helpful in developing lower back strength
  3. Forearm and shoulder strength can be developed by practicing the Peacock Pose (Mayurasana) and by using wrist strengtheners such as a Powerball
  4. When I started, I used a wall to support me when I kicked up from the dolphin pose
  5. Forearm strength, shoulder strength, and general flexibility can be developed by practising multiple rounds of Surya Namaskaar
  6. Andrei Ram Om's video on the Feathered Peacock pose (above) touches on some subtle but important aspects of the back position (arched) and the differences between various hand positions. I think it's extremely relevant to know these differences when you start, so that you can experience the full flavour of the pose

Om Shanti, and keep practising! Namaste!

Tuesday, February 10, 2015

Smallest multiple with constraints

Problem statement:

Given a number 'n', design an algorithm that will output the smallest integer number 'X' which contains only the digits {0, 1} such that X mod n = 0 and X > 0 (1 ≤ n ≤ 100000)

Problem credits: Ayush on the Stony Brook Puzzle Society Facebook Group

Insight: Suppose n = 6, and consider some candidate answer X = 1101. Can this be correct? We can verify by dividing and checking the remainder: 1101 mod 6 = 3, so 1101 is not a multiple of 6 and cannot be the answer.

Another way to get the same remainder 3 is to take each digit's place value separately:
  • 1000 mod 6 = 4
  • 100 mod 6 = 4
  • 0 mod 6 = 0 (the tens digit of 1101 is 0)
  • 1 mod 6 = 1
Add up all the remainders and take that mod 6: (4 + 4 + 0 + 1) mod 6 = 9 mod 6 = 3.

This basically means that we want to find the subset of powers of 10 whose remainders, added together mod n, leave a remainder of 0. We can solve this using dynamic programming, with a technique similar to the one used to solve the subset sum problem. The solution to our problem (as for subset sum) is pseudo-polynomial; specifically, it runs in time O(n log X).
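One concrete way to realize this remainder bookkeeping (a sketch, not necessarily the post's exact DP) is a breadth-first search over the n possible remainders: appending a digit d to a number with remainder r gives remainder (10r + d) mod n, and trying digit 0 before digit 1 at each step guarantees that the first number to reach remainder 0 is the smallest:

```python
from collections import deque

def smallest_binary_digit_multiple(n):
    """Smallest X > 0 with digits in {0, 1} such that X mod n == 0."""
    start = 1 % n                   # every candidate begins with the digit 1
    parent = {start: (None, "1")}   # remainder -> (previous remainder, digit)
    queue = deque([start])
    while queue:
        r = queue.popleft()
        if r == 0:
            break                   # found a multiple of n
        for d in (0, 1):            # try 0 before 1 to keep X smallest
            nr = (r * 10 + d) % n
            if nr not in parent:
                parent[nr] = (r, str(d))
                queue.append(nr)
    digits = []                     # walk parents back from remainder 0
    r = 0
    while r is not None:
        r, d = parent[r]
        digits.append(d)
    return "".join(reversed(digits))
```

For example, smallest_binary_digit_multiple(6) returns "1110" (1110 = 6 × 185).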

Here is the ideone link, and below is the code (in python) to solve it.

Follow up problem:

Find a number 'X' given an 'n' (similar to the problem above) except that you are allowed to only use digits {0, 3, 5, 8}. Can you always find a solution for any 'n'?