Maximum Subarray Sum

Author

chris

Published

August 31, 2024

It’s been a few months since I posted due to health issues, but I’m breaking the silence with something small today. Most of my posts have been deep dives so this will be a nice change up.

Even if I’m not working on any side projects outside of work, I tend to fulfill my curiosity (ok maybe obsession) via some kind of MOOC or programming challenge. I picked up this habit while filling gaps in my computer science knowledge (undergraduate math and statistics background) and it hasn’t quite stopped.

Apropos to that, I came across the “Best Time to Buy and Sell Stock” problem the other day. The problem setup is pretty simple; given an array of daily stock prices, find the maximum profit possible by buying one day and selling another.

The naive/brute force solution that you might imagine (two pointers) actually fails a time limit because of its \(O(n^2)\) time complexity. When I first saw it, it smelled a lot like DP and the Maximum Subarray Sum problem that I had seen in my graduate algorithms class. After solving it, I realized the DP approach wasn’t entirely obvious to most people who attempted the problem. The official solution even misses presenting it this way, so in this post I want to go through that problem/solution framing and explain how/why it works.

In the ‘maximum subarray sum’ problem , we’re given an array of numbers and want to find the maximum sum of a contiguous subarray. We can’t use this as a drop-in solution to the buy and sell problem though, as it would give us a sum of prices (could call it the most expensive period) instead of profits. However, if you take daily price differences, then you can use the same approach.

Let’s define the prices array as

\[ P = [p_0, p_1, ... p_{n-1}, p_n] \]

and a price difference as

\[ d_i = p_i - p_{i-1} \]

The price differences array is

\[ D = [d_0, d_1, ... d_{n-1}, d_n] \]

where \(d_0 = 0\) since buy/selling the same day is zero profit.

To illustrate how a contiguous sum of price differences amounts to profit, let’s say we want to buy at \(p_3\) and sell at \(p_6\) for a profit of \(p_6 - p_3\). If during that time period you sell and rebuy everyday, this is the same as

\[ (p_6 - p_5) + (p_5 - p_4) + (p_4 - p_3) \]

which are just the sum of price differences

\[ d_6 + d_5 + d_4 \]

so a contiguous sum of price differences is the same as buying one day and selling another day.

The DP solution I devised uses 1-D table \(T\) to keep track of contiguous sums. The base case is

\[ t_0 = 0 \]

and the recurrence relation is

\[ t_i = d_i + max(0, t_{i-1}) \]

The max is there because we don’t want to carry any negative profits. Put in words, the recurrence says that the max profit today is; today’s price difference plus yesterday’s max profit but only if yesterday’s max profit is greater than zero.

Finally, we just take the maximum value from table \(T\) and the algorithm looks something like below.

def max_profit(prices: list[int]) -> int:
    diffs = [0] + [prices[i] - prices[i - 1] for i in range(1, len(prices))]
    T = [None] * len(prices)  # pretend fixed size array
    T[0] = 0
    for i in range(1, len(prices)):
        T[i] = diffs[i] + max(0, T[i - 1])
    return max(T)

Here’s a small example to verify this works as expected.

prices = [23, 27, 26, 22, 27, 26, 24, 29, 28]

Looking at the diffs helps to see what the max profit should be for this prices array (7).

diffs = [0] + [prices[i] - prices[i - 1] for i in range(1, len(prices))]
print(diffs)
[0, 4, -1, -4, 5, -1, -2, 5, -1]
max_p = max_profit(prices)
print(max_p)
7

The code could be optimized by keeping track of the best max at each step but the time complexity would still be \(O(n)\).