First bad Version

Problem description:

You are a product manager and currently leading a team to develop a new product. Unfortunately, the latest version of your product fails the quality check. Since each version is developed based on the previous version, all the versions after a bad version are also bad.

Suppose you have n versions [1, 2, …, n] and you want to find out the first bad one, which causes all the following ones to be bad.

You are given an API bool isBadVersion(version) which will return whether version is bad. Implement a function to find the first bad version. You should minimize the number of calls to the API.

Solution:

This is solved using binary search since we can see the list of versions as an array of 0’s and 1’s in sorted order having zeroes as good versions and ones as bad versions, so our task is to find the first occurrence of 1, so how can we find the first occurrence of a number on a sorted array?… Binary search!!
So we basically…

1.- Create to pointers, one pointing to the beginning of the list (start = 0) and another one to the end of the list (end = n)
1.- Check if the middle element (middle = start + end / 2) is a correct version
2.- If it is a correct version we now know we don’t need the second half of the list so we go left (end = middle-1), otherwise we go right (start = middle+1)
3.- Repeat steps 1 and 2 until there is no more elements to check (while start <= end)

    def firstBadVersion(self, n):
        start = 1
        end = n
        middle = 0
        
        while(start <= end):
            middle = (start + end) / 2
            if isBadVersion(middle):
                end = middle - 1
            else:
                start = middle + 1
                
        if isBadVersion(middle):
            return middle
        else:
            return middle + 1

Runtime complexity: O(log n) since we cut the array in half at every step, so for an array of size 10 we just need log(10) = 3 operations to find the first bad versionūüôā

Number of distinct binary trees

Problem statement:

Given n nodes, how many binary trees can be formed?

example.

for n = 3 there are 5 different binary trees that can be formed:

   1         3     3      2      1
    \       /     /      / \      \
     3     2     1      1   3      2
    /     /       \                 \
   2     1         2                 3

Solution:

If we have n nodes 1, 2, 3, …, n we know that in some point every node is gonna be the root, and we also know that every node has two sub-trees, the left sub-tree and the right sub-tree, the key here is RECURSION, we can split this problem into different sub problems, let’s see it with an example:

for n = 1 we have ONE possible tree.
for n = 2 we have TWO possible trees:

   1         1 
    \       /  
     2     2

for n = 3 we have the following options:
A) zero nodes at the left sub-tree and two nodes at the right sub-tree
B) one node at left and one node at right
C) zero nodes at the right sub-tree and two nodes at the left sub-tree

          root                     root                      root
         /    \                  /       \                 /     \
numTrees(0)  numTrees(2)   numTrees(1) numTrees(1)  numTrees(2) numTrees(0)
    1      *     2       +      1     *    1      +     2      *    1       =   5 different trees

So it’s clear we are going to solve this problem recursively:

def numTrees(self, n):
    if n <= 1:
        return 1
    res = 0
    for i in range(n):
        res += self.numTrees(i) * self.numTrees(n - i - 1)
    return res

Edit Distance

Problem description:

Given two strings A and B, find the minimum number of operations needed to transform A into B, there are only three valid operations:

-Replace a character
-Delete a character
-Insert a character

Examples:

A = “hello”
B = “hola”
result = 3, the operations were:
1.- replace ‘e’ with ‘o’ ¬† -> “hollo”
2.- replace ‘l’ with ‘a’ ¬† -> “holao”
3.- remove ‘o’ ¬† ¬† ¬† ¬† ¬† ¬† -> “hola”

Solution:

So for this problem we have two strings A and B, let’s work with the following example:

A = "abc"
     i
B = "cba"
     j

Remember that we want to transform A into B, any operation has to be done over A.

We are gonna use indexes i and j to loop through the strings and compare them character by character so that at every step we have the following options:

First index i and index j are both at position 0, so we have ‘a’ and ‘c’, since they are different we have three options here:
1.- remove ‘a’ (i + 1, ¬†j) ¬† ¬† ¬† ¬† ¬† ¬† ¬† -> “bc”
2.- insert ‘c’ into A (i, j + 1) ¬† ¬† ¬† ¬† -> “cabc”
3.- replace ‘a’ with ‘c’ (i + 1, j + 1) -> “cbc”

Why don’t we just try them all?

int minDistance(String word1, String word2) {
    int dp[][] = new int[word1.length() + 1][word2.length() + 1];
    for(int i = 0; i <= word1.length(); ++i)
        for(int j = 0; j <= word2.length(); ++j)
            dp[i][j] = -1;
    return getDistance(word1, word2, word1.length(), word2.length(), dp);
}
int getDistance(String word1, String word2, int i, int j, int [][]dp) {
    if(i == 0) return j;
    if(j == 0) return i;
    if(dp[i][j] >= 0) return dp[i][j];

    if(word1.charAt(i - 1) == word2.charAt(j - 1))
        return dp[i][j] = getDistance(word1, word2, i - 1, j - 1, dp);
    else
        return dp[i][j] = 1 + Math.min(getDistance(word1, word2, i - 1, j - 1, dp),
        Math.min(getDistance(word1, word2, i - 1, j, dp), getDistance(word1, word2, i, j - 1, dp)));
 }

The dp table is very important in this problem, it is used to avoid a sub task to be computed more than once, otherwise our program would run very very slow.

Find missing number

Problem description:

You have an array A of size n – 1 containing numbers from 1 to n so there is one missing number, find it!

First Approach using extra memory:

We could create a boolean array B of size n + 1 and set to true the index of every number found in A, at the end our boolean array will have only one element in false, that’s the missing number.

int findMissingNumber(int A[], int n)
{
	boolean B[] = new boolean[n + 1];
	for(int i = 0; i < A.length; ++i)
		B[A[i]] = true;
	
	int missing = 0;
	for(int i = 1; i < B.length; ++i)
		if(!B[i]) missing = i;
	
	return missing;
}

Second Approach without extra memory:

Another approach that avoids the use of extra memory is to sum up all the elements in the array and subtract that value from the total sum which can be obtained with the following formula (n * (n + 1)) / 2

int findMissingNumber(int A[], int n)
{
	int total = (n * (n + 1)) / 2;
	int sum = 0;
	for(int i = 0; i < A.length; ++i)
		sum += A[i];
	
	return total - sum;
}

This solution only uses constant space but the problem is that if the numbers are too large we could overflow the variable sum since an integer has a max value.

Third approach, XOR:

This approach is based on the XOR operation, remember:

A B | A XOR B
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0

so:
A ^ 0 = A
A ^ A = 0
A ^ B = C
C ^ A = B

The key is to realize that if we XOR the same number twice we are gonna get 0 (equal numbers are removed), but if we XOR a number once then this number is kept, so we can XOR all numbers from 1 to n and then xor all numbers in our array, at the end we are gonna have the missing number since all duplicated numbers were removed, here the code:

int findMissingNumber(int A[], int n)
{
	int xor = 0;
	for(int i = 1; i <= n; ++i)
		xor ^= i;
		
	for(int i = 0; i < A.length; ++i)
		xor ^= A[i];
	
	return xor;
}

This solution doesn’t have the problem of integer overflow like the previous one and needs only linear time and constant spaceūüôā

Generate Parentheses

Problem description:

Given a non negative number n, generate all possible combinations of strings that can be formed by n-pairs of well-formed parentheses, for example:

for n = 2
a solution is: “(())”, “()()”

and for n = 3
a solution set is: “((()))”, “(()())”, “(())()”, “()(())”, “()()()”

Solution:

We could use a simple Depth First Search (DFS) approach to recursively build the set of solutions, see the solution code below and then read the explanation:

    void generateParentheses(int n) {
        dfs("", n, n);
    }
    void dfs(String s, int left, int right) // build the set of solutions using DFS approach
    {
        if(left == 0 && right == 0) // BASE CASE: there is no more parentheses to add? we have a solution!
            System.out.println(s);
        if(left > 0) // while we have left parentheses to add, just add them
            dfs(s + "(", left - 1, right); // we call our function recursively with a left parentheses added
        if(right > left) // We are gonna add a right parentheses if we have more right parentheses than left ones
            dfs(s + ")", left, right - 1);
    }

Let’s work on the simplest example to explain how it works:

for n = 2 we know that we have 2 left parentheses and 2 right parentheses so for each recursive call we are going to take 2 decisions:
1.- Can we put a left parentheses?
2.- Can we put a right parentheses?

How do we know if we are able to put a left or right parentheses at each recursive call?
We can put a left parentheses as long as we have at least one left parentheses available but for the right ones we can add them only when we have more right parentheses available because we know that we have already added the corresponding left parentheses before adding the right one.

Do you have any question/comment/suggestion? Please leave your comment below.

Clone a graph

Problem Description

Given an undirected graph write a function that returns a CLONE of that graph, example of an undirected graph:

1 ¬†———- ¬†2

|            /

|      /

3

Each node has a label and a list of neighbors (represented as an ArrayList).

Solution code


map = new HashMap<UndirectedGraphNode, UndirectedGraphNode>(); //This map is gonna be used to check if a node has been already created,

//we use the original node as a key and the clone node as the value.

UndirectedGraphNode clone(UndirectedGraphNode node)
{
    if(map.containsKey(node)) //if the current node has been already cloned then just return the cloned node
        return map.get(node);

    UndirectedGraphNode newNode = new UndirectedGraphNode(node.label);  // create the clone node
    map.put(node, newNode); // put the clone into the map

    ArrayList<UndirectedGraphNode> nei = new ArrayList<UndirectedGraphNode>(); //create a new arraylist of neighbors for the new node

// loop through the neighbors of the original node and call our function recursively to build the new list of neighbors
    for(UndirectedGraphNode n : node.neighbors)
        nei.add(clone(n));

    newNode.neighbors = nei;
    return newNode;
}

Explanation:

It is basically a DFS traversal (http://en.wikipedia.org/wiki/Depth-first_search), ¬†the clone function is called for every node and it returns the new cloned node with it’s corresponding list of neighbors, the key part here is that we need to keep track of those nodes that we have already cloned to avoid infinite loops, this is done with the HasMap.

String break

Problem description:

Given a String and a dictionary of words, write a program that returns true if the given string can be formed by concatenating one or more of the words in the dictionary, eg.

Given the string S = “algorithmstuff”

and the dictionary dic = {“algorithm”, “stuff”}

Our function must return true since S can be formed by concatenating “algorithm” + “stuff”

but given S = “algorithmstuffisgood”

and dic = {“algorithms”, “good”, “stuff”} must return false since there’s no way to build S from the strings in the dictionary.

Solution

This is a very good problem for a programming interview since it has one “lazy solution” which is very inefficient but it can be converted to a very fast algorithm with just a simple modification, so you could think about the most obvious solution even if it’s slow and once you got a working solution you can think about how to improve it.

This problem is clearly a good candidate for a recursive solution, for each recursive call the idea is to start checking if the first character is in our dictionary, if so, we call the function recursively but now with string S starting with the next character, otherwise we check if the first two characters form a valid word and so on, for instance:

S = “hithere”

dic = {¬†“there”,¬†“hi”}

We start with the string “hithere” at index 0, so we check if the substring “h” is a valid word, since it’s not we continue now with the next character which is “i”, now we check if the substring “hi” is a valid word in the dictionary, it is so we know that “hi” can be formed by concatenating one or more words of the dictionary so we now call the function recursively but starting with the next index which is 2, so now we are gonna have the same function with the substring “there”, we are gonna stop when we reach the end of the string S.

This is basically a DFS algorithm (http://en.wikipedia.org/wiki/Depth-first_search) , here the code:


public boolean stringBreak(String s, Set<String> dict) {
    return dfs(s, 0, dict);
}
boolean dfs(String s, int i, Set<String> dict)
{
    if(dict.contains(s.substring(i)))
        return true;

    for(int j = i; j < s.length(); ++j)
        if(dict.contains(s.substring(i, j + 1)))
            if(dfs(s, j + 1, dict, set)) return true;

    return false;
}

The solution¬†above works! but it’s too slow since there are many overlapping calls (it does the same thing many times), let me show you with an example:

let’s say we have the string S = “helloworld”

and dict = {“he”, “e”, “h”}

Our recursive function is gonna build a tree like this:


                       helloworld

                   /                \

           h-elloworld           he-lloworld

                |

           e-lloworld

As you can see the substring “lloworld” is gonna be computed two times and the result is gonna be the same, the problem here is that we could end up with too many overlapping calls which would be a waste of time. So, how can we avoid overlapping calls?

The solution is to have a record of which substrings we have already computed so that when our function is called again for the same substring we already know the answer, this technique is called MEMOIZATION (http://en.wikipedia.org/wiki/Memoization), Here is the code that shows this optimization.


public boolean stringBreak(String s, Set<String> dict) {
    HashSet<Integer> memo = new HashSet<Integer>(); // We just need to keep track of those indexes that we have already computed and the result is false
    return dfs(s, 0, dict, memo);
 }
 boolean dfs(String s, int i, Set<String> dict, HashSet<Integer> memo)
 {
     if(dict.contains(s.substring(i)))
         return true;
     if(memo.contains(i)) // if we have already computed the result for this substring we just return the answer
         return false;
     //otherwise, compute the answer for this substring
     for(int j = i; j < s.length(); ++j)
         if(dict.contains(s.substring(i, j + 1)))
             if(dfs(s, j + 1, dict, set)) return true;

     //we just store the results for substrings which result is false
     memo.add(i);
     return false;
 }