Arrays are a fundamental data structure in every language. Their ability to provide O(1) random access to elements is something that should not be disregarded. Given their importance, arrays are all but guaranteed to show up in your next technical interview. Here's a quick toolkit of techniques and strategies you can employ.
Before diving deep into array manipulation, let's brush up on some array trivia.
- Arrays are data structures represented by contiguous blocks of memory. This means we can jump straight to any index, unlike pointer-based data structures such as linked lists, which must be walked node by node.
- Arrays have O(1) insertion time when appending to the end (amortized, for dynamic arrays); inserting at an arbitrary position costs O(n), since later elements must shift over.
- Arrays have O(n) time for deletion and for search in the general (unsorted) case.
Some readers will claim that search is O(log n) because of binary search. However, it is important to remember that binary search only applies to sorted arrays.
Bet you didn't see this one coming. The humble for loop should not be neglected for being too elementary. One of the major upsides of arrays is that we can easily iterate through them without creating separate iterator methods.
Something that always surprises me is when someone doesn't know you can declare multiple variables in a for loop, giving you more control over the array.
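The original snippet isn't shown here, so this is a sketch of what a two-pointer, in-place reversal might look like:

```javascript
// Reverse an array in place using two pointer variables
// declared in the same for loop.
function reverseArray(arr) {
  for (let i = 0, j = arr.length - 1; i < j; i++, j--) {
    // Swap the elements at the front and back pointers.
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}
```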
In the code above, we reverse the contents of an array in place by declaring two pointer variables in the same for loop: i starts at the front and j starts at the end. This is much more readable than creating a separate variable initialized to the end of the array and counting it down inside the loop body.
I personally prefer the for loop over for..of loops in code challenges because of the control it gives me. I can increment the index pointer by 2 instead of 1, and I can declare as many variables as I want, although I recommend keeping variable declarations to a minimum.
Recursion is the big bad wolf for new developers. All these fancy terms get thrown around: base case, recursive case, substructure, recurrence relation; and people get confused. These are all good to know, but we really only need base case and recursive case to write a recursive algorithm.
The base case is the termination check for our recursive call: the case where we can immediately return a value without computing anything further. Put differently, it is the recursion level at which we cannot go any deeper.
The recursive case is a smaller subproblem of the original problem. It is still too large to solve immediately, so we make a recursive call on it.
Fun fact: arrays are recursive data structures. An array of size n is made up of a subarray of size n - 1 plus one more element.
Here’s the reverse array function done recursively:
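The original gist isn't included here, so here's a sketch of one way it might look, using the two-pointer idea from before:

```javascript
// Recursively reverse an array in place.
// Base case: the pointers meet or cross, nothing left to swap.
// Recursive case: swap the outer pair, then recurse on the
// smaller subarray between them.
function reverseRecursive(arr, i = 0, j = arr.length - 1) {
  if (i >= j) return arr;
  [arr[i], arr[j]] = [arr[j], arr[i]];
  return reverseRecursive(arr, i + 1, j - 1);
}
```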
The wary reader who claims that this is in fact a worse solution than the for-loop version from earlier is completely correct. Recursion can use a lot of memory when enough recursive calls pile up on the stack; in this case, we push one call onto the stack for every swap.
Memory issues aside, recursion is an important technique to know, due to its importance in many algorithm design techniques such as divide and conquer and greedy algorithms.
Whenever I am approached with a problem that involves arrays, the first thing I always ask myself is: Can I use a map?
Maps are very powerful when solving problems that involve arrays. Storing key-value pairs is an extremely useful resource to have. You can check for duplicates in an array or string, store other data structures, and do so much more.
Maps are popular for solving code challenges because they support O(1) amortized time for insertion, deletion, and search, which usually keeps our time complexity very good.
Below is a solution to the famous two-sum problem using a map.
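The original code isn't shown here; this is a sketch of the standard Map approach, returning the indices of the two numbers that add up to the target:

```javascript
// Two sum: find indices of the two numbers summing to target.
// The Map stores each value we've already seen, keyed to its index.
function twoSum(nums, target) {
  const seen = new Map();
  for (let i = 0; i < nums.length; i++) {
    const complement = target - nums[i];
    // O(1) amortized lookup: have we already seen the complement?
    if (seen.has(complement)) return [seen.get(complement), i];
    seen.set(nums[i], i);
  }
  return [];
}
```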
Our time complexity for this solution is O(n) and has a space complexity of O(n), because we had to make another data structure and insert the elements of our array into it.
Compare this to our brute force solution.
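Again the original snippet is missing, so here's a sketch of the brute force version, which simply tries every pair:

```javascript
// Brute force two sum: check every pair of indices.
function twoSumBruteForce(nums, target) {
  for (let i = 0; i < nums.length; i++) {
    for (let j = i + 1; j < nums.length; j++) {
      if (nums[i] + nums[j] === target) return [i, j];
    }
  }
  return [];
}
```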
Here we have nested two for loops inside one another. Although the solution is technically correct, it's significantly slower than the Map solution, sporting an O(n²) time.
Let’s take a quick break and explain what Brute Force is.
Brute force is the algorithm design paradigm of trying every single possible combination to produce the correct output.
Brute force is not limited to O(n²) time. It is any approach where we manually check every possible candidate. For example, when searching a sorted array, the brute force solution runs in O(n) time.
Sounds pretty horrid, right? However, this does not mean brute force should be avoided. Brute force is useful, if not necessary, for designing elegant and fast algorithms: we build them by studying our brute force solution and identifying where it can be improved. Back to our tools now.
Sets refuse duplicate values, and this simple adjustment makes Set a very powerful data structure whenever you need to check for duplicates in an array.
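The article's original snippet isn't shown, but a minimal sketch of a duplicate check with a Set could be:

```javascript
// Check whether an array contains any duplicate values.
// A Set only keeps unique values, so if the Set ends up smaller
// than the array, at least one value was a duplicate.
function hasDuplicates(arr) {
  return new Set(arr).size !== arr.length;
}
```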
The solution has a time complexity of O(n) and a space complexity of O(n). Compared to the brute force approach of sorting the array and then checking each element against its neighbor, which takes O(n log n) time, the Set is superior in time.
The function is passed an array with elements in the range 1 to N. Some elements may be duplicates. The function returns all numbers missing from the sequence. Because a Set doesn't allow duplicates, we are able to pass all elements of the array into the set without trying to filter the inputs.
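Since the gist isn't included, here's a sketch of how such a function might be written:

```javascript
// Given an array of N numbers drawn from 1..N (possibly with
// duplicates), return every number in 1..N that never appears.
function findMissingNumbers(arr) {
  // Duplicates are collapsed automatically; no pre-filtering needed.
  const present = new Set(arr);
  const missing = [];
  for (let n = 1; n <= arr.length; n++) {
    if (!present.has(n)) missing.push(n);
  }
  return missing;
}
```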
For all the flak I’ve been giving sorting in the previous sections, it’s surprising that I bring up sorting here as a potential technique.
Sorting can play an integral role in the design of algorithms for two reasons.
- For the purpose of array algorithm techniques specifically, sorting is often a good tool for designing a brute force solution. Ordering the array can reveal patterns in the problem we couldn't see before. It should be noted that sorting the array gives the algorithm a minimum time complexity of O(n log n).
- In the general context of algorithm design, sorting offers valuable takeaways that we can apply elsewhere. The partitioning function in Quick Sort and the merging function in Merge Sort can both be modified and reused to fit certain needs.
Now that we’ve covered some fundamental techniques let’s go ahead and look into more elegant array techniques we may be able to use.
The sliding window algorithm is a technique for performing operations on an array within a specific buffer. It is similar to declaring multiple pointers in the same for loop, except that we keep the distance between the pointers constant.
Let's say, for example, we have a problem where we are asked to find the maximum sum of a contiguous subarray of size k in an array A. We can easily find this with a brute force solution.
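The original code isn't shown; a sketch of the brute force version, which re-sums every window from scratch, might be:

```javascript
// Brute force: for every starting index, sum the k elements
// of that window and track the largest sum seen.
function maxSubarrayBrute(A, k) {
  let max = -Infinity;
  for (let i = 0; i + k <= A.length; i++) {
    let sum = 0;
    for (let j = i; j < i + k; j++) sum += A[j];
    if (sum > max) max = sum;
  }
  return max;
}
```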
The algorithm is technically correct, but has a run time of O(n * k), where n is the length of the array. Is it possible to do better?
Using a sliding window, we can create a linear-time solution. The question is nice enough to hand us the size of the buffer we need: k.
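The original snippet is missing, so this is a sketch of what the sliding window version might look like:

```javascript
// Sliding window: maintain the sum of a window of size k as it
// moves across the array, instead of re-summing each window.
function maxSubarraySliding(A, k) {
  // Sum of the first window.
  let windowSum = 0;
  for (let i = 0; i < k; i++) windowSum += A[i];
  let maxSum = windowSum;
  // Slide right: drop the leftmost element, add the new rightmost.
  for (let i = k; i < A.length; i++) {
    windowSum += A[i] - A[i - k];
    if (windowSum > maxSum) maxSum = windowSum;
  }
  return maxSum;
}
```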
You can see that we begin by computing the sum of the first "window" of k elements. Then we "slide" the window to the right, subtracting the leftmost element and adding the new rightmost element at each step. If at any point our window sum is greater than the previous max sum, we assign the window sum as our new max sum.
We reduced our run time to O(n)! Sliding window can be a very powerful strategy; however, it can sometimes be difficult to recognize. Here are some practice problems you can try.
Binary search???? That's most people's reaction when I explain that you can use it to solve non-searching problems. Binary search has the misfortune of being many people's first "cool" algorithm. Its implementation is simple and its usage is narrow, so many people either forget about it or never learn its properties. But that simplicity is what makes it such a powerful technique at times. It can only be used on sorted/semi-sorted arrays.
Binary search builds off of properties we know will always hold in a sorted array: the last element is always larger than the first, and the middle element is the median. So if a sorted array is changed in a consistent pattern, we can identify the pattern and work around it.
It’s important to note that binary search does not need to be recursive. Implementing it recursively is the easiest way, but for the problem in this section, we must implement it iteratively to take advantage of its properties.
A pretty famous binary search problem is finding the minimum element in a rotated sorted array.
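The original implementation isn't shown; here's a sketch of an iterative version, assuming the array contains no duplicates:

```javascript
// Find the minimum element of a sorted array that has been rotated,
// e.g. [4, 5, 6, 7, 0, 1, 2]. Iterative binary search, O(log n).
// Assumes the array has no duplicate values.
function findRotatedMin(nums) {
  let lo = 0, hi = nums.length - 1;
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (nums[mid] > nums[hi]) {
      // The "break" (and the minimum) lies to the right of mid.
      lo = mid + 1;
    } else {
      // nums[mid] could itself be the minimum; keep it in range.
      hi = mid;
    }
  }
  return nums[lo];
}
```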
Implementing a modified iterative binary search, we are able to find the minimum element in a rotated array in O(log n) time, significantly better than the brute force method of checking each element, which takes O(n) time.
Now we’re getting into end game material.
Dynamic programming is an algorithm design paradigm of optimizing recursive algorithms. There are a few key things to note about dynamic programming.
- Dynamic programming only works on recursive problems that have something called optimal substructure. Optimal substructure is a fancy way of saying that a solution is built off the solutions of smaller subproblems.
- We can avoid recalculating previously found solutions.
- We store the calculated solutions to subproblems in an array (a cache) and refer back to it to optimize the run time. Because we create this cache, a dynamic solution trades extra space for speed; with a one-dimensional cache, that space cost is O(n).
The Fibonacci sequence is a perfect example of a problem that builds off of previous solutions, thanks to the recurrence relation f(n) = f(n - 1) + f(n - 2). Unfortunately, the very short and clean recursive code has an ugly time complexity of O(2^n). We can optimize this with dynamic programming.
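The original snippets aren't shown here, so this is a sketch of both versions side by side: the naive recursive one, and a bottom-up dynamic version with a cache array:

```javascript
// Naive recursive Fibonacci: O(2^n) time, because the same
// subproblems are recomputed over and over.
function fibNaive(n) {
  if (n <= 1) return n;
  return fibNaive(n - 1) + fibNaive(n - 2);
}

// Bottom-up dynamic programming: each subproblem is computed
// exactly once and stored in the cache array. O(n) time.
function fib(n) {
  const cache = [0, 1];
  for (let i = 2; i <= n; i++) {
    cache[i] = cache[i - 1] + cache[i - 2];
  }
  return cache[n];
}
```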
We iterate until we reach n, adding values into our cache array. The indices of the array represent the recursion levels we would have visited had we done it recursively. Once we have filled the array with values up to and including n, we return the element at index n: the nth number in the Fibonacci sequence.
Our time complexity has shrunk dramatically, from O(2^n) to O(n)! The recursive solution repeatedly recalculates the same subproblems for any n greater than 1; solving the problem dynamically lets us compute each subproblem only once, at the expense of memory.
These are most of the techniques I run through when figuring out how to solve problems that use arrays. There are a few things I left out, such as greedy algorithms, divide and conquer, and bit operators, since they each deserve a separate post explaining their somewhat fragile implementation and usage.
If you have any other toolkits you’d like to see or I forgot/messed something up, please go ahead and let me know in the comments!