Arrays and Strings


Arrays and Strings


August 6, 2023

tutorial interviewing

Arrays are a foundational data structure, available in pretty much every language. That makes them very common components of interview questions.

I’m assuming you’re already familiar with the fundamentals, so I won’t spend a ton of time explaining how arrays work. Instead, I’ll focus on how they’re used in interview questions, and some techniques that come in handy.

Input and Output

Many interview questions follow this pattern: Can you write a function that takes an array of blah blah blah, and returns blah blah blah?

The blah blah blah varies wildly, but here are some examples:

  • Take an array of int values, and return indexes of the values that sum to target.
  • Take an array of a stock prices over time, and return the best time to buy and sell the stock.
  • Take an array of overlapping start and end times, and return an array with deduplicated durations.

These are just a couple random examples, but the important pattern to notice is that in each question you’re taking an array, doing some processing or calculating, and then returning an answer.

Here are a few techniques that help with this kind of question.

Tracking Best So Far

Very often, you’ll need to iterate over an array and track the “best” option you’ve seen so far. The definition of “best” depends on the question, but here are some examples:

  • Given an array of numbers, can you select the maximum value?
  • Given an array of Strings, can you select the longest one?
  • Given an array of points, can you select the one furthest from the origin?

The code for this approach looks something like this:

public int getMaximumValue(int[] array) {
  int max = Integer.MIN_VALUE;
  for (int i = 0; i < array.length; i++) {
    if(array[i] > max) {
      max = array[i];
    }
  }
  return max;
}

In this example, max starts out as Integer.MIN_VALUE, the smallest number an int can hold: -2^31 or -2,147,483,648. This placeholder value works because we’re probably going to encounter a number that’s bigger than this right away. Your placeholder value depends on how you’re defining the best element in your array, but it should generally start out as whatever the worst option would be.

Importantly, this code will not work if the array is empty, so be careful that you’re asking about corner cases and handling them appropriately.

The above pattern is what I would try to remember, but here’s an example that handles corner cases more explicitly:

public int getMaximumValue(int[] array) {
  if (array == null || array.length == 0) {
    throw new IllegalArgumentException("Provide an array with at least one value.");
  }

  int max = array[0];
  for (int i = 1; i < array.length; i++) {
    if(array[i] > max) {
      max = array[i];
    }
  }
  return max;
}

Nested Loops

Tracking the best so far is a fundamental approach while using arrays, but chances are your question is going to be a little more complicated than finding the min or max in an array.

Rather than tracking a single “best” value, you’re more likely to have elements that relate to each other in some other way. Examples include:

  • Given an array, return whether that array contains any duplicates.
  • Given an array of numbers, return whether any two numbers sum to exactly 100.

You might be able to use specific data structures or tricks to solve this kind of problem with a single for loop, but you can often get started by solving it with a more naive or “brute force” solution by nesting multiple loops inside each other.

Here’s an example that detects duplicates:

public boolean containsDuplicates(int[] array){
  for (int i = 0; i < array.length; i++) {
    for (int j = i + 1; j < array.length; j++) {
      if (array[i] == array[j]) {
        return true;
      }
    }
  }

  return false;
}

Again, there are other ways to solve this specific problem, but the nested loop pattern applies to many problems.

Sorting

A big part of solving an interview question is looking for ways to make your life easier. If you find yourself thinking “this question would be a lot easier if the array was sorted…” then don’t be afraid to sort it!

Here’s the same containsDuplicates() example, this time using a sorted array:

public boolean containsDuplicates(int[] array) {

  Arrays.sort(array);

  for (int i = 0; i < array.length - 1; i++) {
    if (array[i] == array[i + 1]) {
      return true;
    }
  }

  return false;
}

This code no longer relies on a nested for loop, and instead relies on the sorted nature of the array to check for duplicates in a single pass. In other words, the time complexity of the algorithm went from O(n^2) to O(n * log(n)), because Arrays.sort() takes O(n * log(n)) time.

You can’t always take advantage of built-in functions like Arrays.sort(), but it never hurts to ask. The worst case scenario is that your interviewer says no, but they’ll see that you’re aware of these language features.

Think Outside the For Loop

A plain old for loop that iterates over every index and element in an array in order is one of the most common structures in interview problems, but don’t limit yourself to using that exact technique in every question.

For example, some questions are better solved by iterating backwards:

public static String reverseString(String input) {
  StringBuilder output = new StringBuilder();
  for (int i = input.length() - 1; i >= 0; i--) {
    output.append(input.charAt(i));
  }
  return output.toString();
}

Here are a couple other alternative approaches for iterating over an array:

Sliding Window

You can sometimes remove nested for loops using a technique called a sliding window, which means maintaining the end state as you calculate your output. As you consider other elements, you modify the state of your window instead of iterating over every element in the potential output.

For example, let’s say have an array of int values, and you want to find the highest sum you can reach using a certain number of contiguous values. For example, for the array {1, 5, 10, 7, -1} and a length of 3, the largest sum you can reach with 3 contiguous values is 22.

You could do this using a nested for loop:

int getMaxContiguousSum(int[] array, int contiguousValuesCount) {
  int maxContiguousSum = Integer.MIN_VALUE;

  // Loop over every start index
  for(int startIndex = 0; startIndex <= array.length - contiguousValuesCount; startIndex++) {

    // Calculate the total of the contiguous values starting at startIndex
    int totalAtStartIndex = 0;

    // Iterate over every contiguous element starting at startIndex
    int endIndex = startIndex + contiguousValuesCount;

    for(int index = startIndex; index < endIndex; index++){
      totalAtStartIndex += array[index];

      if(totalAtStartIndex > maxContiguousSum) {
        maxContiguousSum = totalAtStartIndex;
      }
    }
  }

  return maxContiguousSum;
}

This code contains a for loop that iterates over every index in the array, and inside of that, it nests another for loop that iterates over the next contiguousValuesCount indexes. In other words, it has an algorithmic complexity of O(n * contiguousValuesCount). Can that be improved?

Consider this example array, with a contiguousValuesCount of 5:

{1, 1, 5, 10, 15, -1, 5, -20, 3, 1}

The above algorithm starts by totaling up the first 5 elements:

{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (1 + 1 + 5 + 10 + 15 = 32)

It then continues by moving to the right by one index, and totaling those elements:

{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (1 + 5 + 10 + 15 + -1 = 30)

That process continues until the end of the array:

{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (5 + 10 + 15 + -1 + 5 = 34)
{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (10 + 15 + -1 + 5 + -20 = 9)
{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (15 + -1 + 5 + -20 + 3 = 2)
{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (-1 + 5 + -20 + 3 + 1 = -12)

Each step of the iteration adds 5 numbers together. But notice how much work is repeated in this process: how many times are 10 and 15 added together?

Instead of adding all 5 numbers every time, you can keep track of a running total. When you move your index to the right, you subtract the leftmost element, and you add the new rightmost element.

Visually it looks like this:

{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (1 + 1 + 5 + 10 + 15 = 32)
{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (32 - 1 + -1 = 30)
{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (30 - 1 + 5 = 34)
{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (34 - 5 + -20 = 9)
{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (9 - 10 + 3 = 2)
{1, 1, 5, 10, 15, -1, 5, -20, 3, 1} (2 - 15 + 1 = -12)

This is where this technique gets its name: you maintain a window into the array that slides to the right as you process the array.

Now, no matter how large the contiguousValuesCount is, each iteration only does 2 calculations: subtracting the leftmost element from the window, and adding the rightmost to the window.

Putting it into code:

int getMaxContiguousSum(int[] array, int contiguousValuesCount){

  // Calculate the starting windowSum
  int windowSum = 0;
  for(int i = 0; i < contiguousValuesCount; i++){
    windowSum += array[i];
  }

  // Assume the max is the first window
  int maxContiguousSum = windowSum;

  // Loop over every start index
  for(int startIndex = 1; startIndex <= array.length - contiguousValuesCount; startIndex++){

    // Subtract the element that fell off the left of the window
    windowSum -= array[startIndex - 1];

    // Add the new rightmost element to the window
    windowSum += array[startIndex + contiguousValuesCount - 1];

    if(windowSum > maxContiguousSum){
      maxContiguousSum = windowSum;
    }
  }

  return maxContiguousSum;
}

Now the code has two for loops, but they aren’t nested! The complexity is now O(contiguousValuesCount + n) or just O(n), which is a big improvement.

The “window” in this example is a single int value windowSum, but this technique becomes especially handy when you pair it with other data structures like Sets and Maps.

For another visualization of the sliding window technique, check out this question on Stack Overflow: What is Sliding Window Algorithm?

Two Pointers

Another alternative to the traditional for loop, which keeps track of a single index that iterates over each element in an array, is to keep track of two indexes.

This is especially handy for problems where you need to swap or shift elements.

The two pointer approach has two variants:

  • Start both pointers at the beginning of the array. Increment the right pointer to search through the array, and increment the left pointer as your output is finalized. (This sounds confusing, so keep reading for an example!)
  • Start one pointer at the beginning of the array, and start the other pointer at the end. Increment the left pointer and decrement the right pointer to search through the array.

Two Pointers: Both at Start

For example, lets say you wanted to remove a value from an array in-place, without using any other data structures, and fill any remaining indexes with -1.

  • Removing 2 from {1, 2, 3} modifies the array to be {1, 3, -1}
  • Removing 2 from {1, 2, 3, 4, 2, 5} modifies the array to be {1, 3, 4, 5, -1, -1}
  • Removing 2 from {1, 2, 2, 2, 3, 4} modifies the array to be {1, 3, 4, -1, -1, -1}

You could use a for loop that iterates over each element. When you find an element to remove, you could use another for loop to shift the rest of the elements left by one.

void removeValue(int[] numbers, int valueToRemove) {
  for (int i = 0; i < numbers.length; i++) {
    if(numbers[i] == valueToRemove){

      // Shift every element starting at i left by one
      for(int j = i; j < numbers.length - 1; j++) {
        numbers[j] = numbers[j+1];
      }

      // Set the last element to -1 to show this index is unused
      numbers[numbers.length - 1] = -1;

      // Keep i at the same index to avoid skipping shifted elements
      i--;
    }
  }
}

The nested for loop gives this function an algorithmic complexity of O(n ^ 2). You can improve this using the two pointer approach.

To understand the two pointer approach, think about how you’d do this in your head, or with a piece of paper and a pencil. How would you remove 2 from this array?

{1, 2, 2, 2, 2, 2, 3, 4}

You probably wouldn’t take the above approach, where the first time you see a 2, you shift the entire array left by one index. You’d notice that you have multiple 2 values in a row, and you’d skip over those until you found a non-2 value. Then you’d move that element down.

Putting that into code, it looks like this:

void removeValue(int[] numbers, int valueToRemove){

  // leftPointer tracks the processed end of the array
  int leftPointer = 0;

  // rightPointer searches for numbers to shift left
  for(int rightPointer = 0; rightPointer < numbers.length; rightPointer++){

    // Found a value to shift left
    if(numbers[rightPointer] != valueToRemove) {
      // Shift it left
      numbers[leftPointer] = numbers[rightPointer];

      // Move the left pointer right
      leftPointer++;
    }
  }

  // Fill the end of the array with -1
  for(int i = leftPointer; i < numbers.length; i++){
    numbers[i] = -1;
  }
}

Now this function only iterates over the array once, for an algorithmic complexity of O(n).

If this seems confusing, try walking through this code with a few example input arrays!

Two Pointers: Start and End

Another two-pointer approach is to start one pointer at the beginning of the array, and the other pointer at the end of the array. The pointers move towards each other to search through the array.

Exactly how and when they move depends on the problem, but here’s an example: let’s say you were given a sorted array of numbers, and you wanted to return two indexes that added up to a target number. (Leetcode link: Two Sum II)

You could use a nested for loop:

int[] twoSum(int[] numbers, int target) {
  // Iterate over every element in the array
  for(int i = 0; i < numbers.length; i++){
    // Iterate over every subsequent element in the array
    for(int j = i + 1; j < numbers.length; j++){
      // If they equal the target, return these indexes
      if(numbers[i] + numbers[j] == target){
        return new int[]{i, j};
      }
    }
  }

  // We didn't find any elements that sum to target
  return new int[]{};
}

But this has an algorithmic complexity of O(n ^ 2).

To understand the two pointer approach, think about this array:

{1, 2, 3, 5, 6, 7, 11}

Which two elements sum to 10?

You could take the above approach, and for each value, you could look for another value that summed to 10. Or you could start by looking at the smallest and largest numbers. If their sum is less than 10, then you know you need a bigger value, so you can throw away the left number and look for a larger option. If their sum is more than 10, then you know you need a smaller value, so you can throw away the right number and look for a smaller option.

It looks like this:

  • {1, 2, 3, 5, 6, 7, 11}
    • 1 + 11 = 12 which is larger than 10, so drop the 11 and consider 7 instead
  • {1, 2, 3, 5, 6, 7, 11}
    • 1 + 7 = 8 which is smaller than 10, so drop the 1 and consider 2 instead
  • {1, 2, 3, 5, 6, 7, 11}
    • 2 + 7 = 9, which again is smaller than 10, so drop the 2 and consider 3 instead
  • {1, 2, 3, 5, 6, 7, 11}
    • 3 + 7 = 10 🎉

And here’s the code:

public int[] twoSum(int[] numbers, int target) {
  int leftIndex = 0;
  int rightIndex = numbers.length - 1;

  // Keep processing until the pointers meet
  while(leftIndex < rightIndex) {
    // Found two elements that sum to target
    if (numbers[leftIndex] + numbers[rightIndex] == target) {
      return new int[] {leftIndex + 1, rightIndex + 1};
    }
    // We need to find a bigger sum, so move the left pointer up
    else if (numbers[leftIndex] + numbers[rightIndex] < target) {
      leftIndex++;
    }
    // We need to find a smaller sum, so move the right pointer left
    else {
      rightIndex--;
    }
  }

  // We didn't find any elements that sum to target
  return new int[] {};
}

These are some specific examples that use two pointers to search through an array, but don’t limit yourself to these patterns. When you encounter a problem that involves searching through an array, ask yourself whether this pattern applies.

Pick a Better Data Structure

Although you’re often given data in an array, that doesn’t always mean you’re stuck with an array. If you can think of a data structure that would make your life easier, you should tell your interviewer that!

For example, here’s the containsDuplicates() function from above, this time using a HashSet data structure:

public boolean containsDuplicates(int[] array) {
  Set<Integer> set = new HashSet();
  for (int value : array) {
    if (set.contains(value)) {
      return true;
    }
    set.add(value);
  }
  return false;
}

This algorithm takes advantage of the fact that HashSet#contains() and HashTag#add() both run in constant time. All of the work of checking for duplicates is hidden by the HashSet implementation.

In other words, the algorithm now has O(n) time complexity. The HashSet also adds an additional O(n) space complexity. Trading space complexity for time complexity is another common talking point in interview questions.

Again, you can’t always bring in other data structures to solve your problems for you, but if you know that a particular data structure would help, you should always tell your interviewer about it.

Off By One Errors

One of the most common pitfalls when working with arrays and for loops is writing code that’s off by one when iterating over an array.

Make sure to check your code for this kind of problem. Explain what your code does as it approaches the end of the array, and double check that your code does what you expect.

Having an off-by-one error in your code isn’t the end of the world, so don’t spend too much time on it. If you aren’t completely sure, you can tell your interviewer that you might have an off-by-one error. Showing that you know to look for this kind of error is often just as good as not having the error at all.

Strings

String is another common data type used in interviews, and it has a lot in common with arrays. In fact, behind the scenes, every String value contains an array of char values!

The String class contains several handy functions that make them more useful than a raw array. Here are a few examples:

  • charAt(index) returns the character at the specified index. This is very similar to an array’s [] bracket accessor.
  • length() returns the length of the array. Remember that this is a function, whereas arrays have a length field.
  • substring(start, end) returns a smaller string inside a larger one. Keep in mind that start is inclusive, but end is exclusive. For example, substring(0, 2) returns a String with the characters at index 0 and 1, but not 2. Keep this in mind when you’re looking for off-by-one errors.
  • contains(substring) returns whether a String contains another.
  • matches(regex) returns whether a String matches a regular expression.
  • split(regex) returns an array of String values, generated by splitting the String on a regular expression. This can be handy if you need to split out the words of a sentence into an array, for example.
  • toUpperCase() and toLowerCase() do what they say on the tin.

Here are a few other tips for working with String values:

  • String values are immutable. That means you can’t modify a String, you can only create new ones.
  • You should almost never use == to compare String values! Always use the equals() function instead.
  • Don’t concatenate String values inside a loop, use a StringBuilder instead. Behind the scenes, Java creates an instance of StringBuilder every time you concatenate two String values, so if you do it in a loop you create a ton of throwaway instances. Instead, create a single instance yourself and use that inside your loop.

Practice Questions


Arrays and Strings Examples

Comments

Happy Coding is a community of folks just like you learning about coding.
Do you have a comment or question? Post it here!

Comments are powered by the Happy Coding forum. This page has a corresponding forum post, and replies to that post show up as comments here. Click the button above to go to the forum to post a comment!