Path sum: two ways
June 10, 2022
In this post we will look at solving problem 81 on Project Euler. This problem is about finding a minimal path sum in a matrix. There are three variants of this problem, and this is the easiest one.
Problem statement #
In the 5 by 5 matrix below, the minimal path sum from the top left to the bottom right, by only moving to the right and down, is indicated in bold red and is equal to 2427.
$$ M = \begin{pmatrix} \color{red}{131} & 673 & 234 & 103 & 18\\ \color{red}{201} & \color{red}{96} & \color{red}{342} & 965 & 150\\ 630 & 803 & \color{red}{746} & \color{red}{422} & 111\\ 537 & 699 & 497 & \color{red}{121} & 956\\ 805 & 732 & 524 & \color{red}{37} & \color{red}{331} \end{pmatrix} $$
Find the minimal path sum of the 80 by 80 matrix that is included with the problem.
Approach #
The problem requires us to find the minimal path sum which is an optimization problem. It is also possible to split the problem into multiple sub-problems. Because of these two characteristics, dynamic programming (DP) is a good approach for solving this problem.
Let $M$ be the matrix that is defined in the problem statement, where we have $m$ rows and $n$ columns. The element $M[i, j]$ is the element at the $i^\text{th}$ row and $j^\text{th}$ column. Note that we have zero-based numbering. To use dynamic programming to solve the problem, we have to define a recursive relationship that identifies the optimal policy at each stage. Because of the recursive relationship, the solution procedure starts at the end and moves backward. If we start at the last element $M[4, 4] = 331$, then we have two routes that lead to this element. The element above it, which is $M[3, 4] = 956$, and the element to the left of it, which is $M[4, 3] = 37$. Of those two elements, there also is a left and top element that leads to that element. If we follow this pattern and create a diagram of it, we can see that the following structure is emerging.
flowchart LR 331 --> 956 331 --> 37 956 --> 111 956 --> 121 37 --> 121 37 --> 524 111 --> 150 111 --> 422 121 --> 422 121 --> 497 524 --> 497 524 --> 732
We can use this structure to define the recursive relationship. The idea is to add all the distances together, and use the minimum distance if there are two possibilities. This leads to to the recursive relationship
$$ f(i, j) = M[i, j] + \min\left(f(i + 1, j), f(i, j + 1)\right). $$
Note that the boundary conditions have been left out, to keep it more simple. The last element at $(4, 4)$ is the base case, and in that case, the equation should give $M[4, 4]$. This also doesn’t check if the values are within the bounds of the array, but to fix this, the value int.MaxValue
is returned if we access a value outside of the matrix. This trick allows us to have a more simple algorithm. We could define the recursive relationship mathematically correct, but the extra notation won’t make it easier to understand in my opinion.
We can visualize the recursive relationship by changing the diagram. Note that int.MaxValue
is used when we access an element outside the bounds of the array, which is indicated with $\infty$.
flowchart LR 331 --> 956["956 + min(331, ∞)"] 331 --> 37["37 + min(∞, 331)"] 956 --> 111["111 + min(956 + min(331, ∞), ∞)"] 956 --> 121["121 + min(37 + min(∞, 331), 956 + min(331, ∞))"] 37 --> 121 37 --> 524["524 + min(∞, 37 + min(∞, 331))"]
Of course we need to do this up to $M[0, 0]$, but for brevity, this has been left out of the diagram. I think by now you should get the gist of the approach that we are using to solve the problem.
Solving the example problem #
Before we will solve the 80 by 80 matrix, we will solve the example problem first. This is the 5 by 5 matrix. To do so, we will first create the matrix $M$.
1int[,] M = new int[,]
2{
3 { 131, 673, 234, 103, 18 },
4 { 201, 96, 342, 965, 150 },
5 { 630, 803, 746, 422, 111 },
6 { 537, 699, 497, 121, 956 },
7 { 805, 732, 524, 37, 331 }
8};
We will use the recursive relationship and create a recursive function to find the solution to the problem, however, doing so will lead to extra unnecessary computations. Note that if we start at $(i, j)$, the function calculates $(i + 1, j)$ and $(i, j + 1)$. Because of the recursion, $(i + 1, j)$ will result in calculating $(i + 2, j)$ and $(i + 1, j + 1)$. If we do the same thing for $(i, j + 1)$ we need to calculate $(i + 1, j + 1)$ and $(i, j + 2)$. Note that we are calculating $(i + 1, j + 1)$ two times! To prevent this, we will add memoization. First we define a data structure for the memo.
1using Memo = System.Collections.Generic.Dictionary<(int, int), int>;
With that in place, we define the recursive method int PathSum(int[,] M, Memo memo, int i, int j)
which results in the following implementation:
1int PathSum(int[,] M, Memo memo, int i, int j)
2{
3 // Base case.
4 if (i == M.GetLength(0) - 1 && j == M.GetLength(1) - 1)
5 return M[i, j];
6
7 // Out of bounds.
8 if (i >= M.GetLength(0))
9 return int.MaxValue;
10
11 // Out of bounds.
12 if (j >= M.GetLength(0))
13 return int.MaxValue;
14
15 // Memoization.
16 if (!memo.ContainsKey((i, j)))
17 memo[(i, j)] = M[i, j] + Math.Min(PathSum(M, memo, i + 1, j),
18 PathSum(M, memo, i, j + 1));
19
20 return memo[(i, j)];
21}
Which we can call in the following way to solve the example problem.
Input
1Memo memo = new();
2int sum = PathSum(M, memo, 0, 0);
3Console.WriteLine($"Solution = {sum}");
Output
2427
We find that the minimum path sum of the 5 by 5 matrix is indeed 2427. Hooray, our algorithm is correct.
Finding the answer #
Now that we have validated our implementation. Let’s load in the 80 by 80 matrix. The following code reads in the text file and converts it to a 2D array of integers.
1int[,] FromFile(string filepath)
2{
3 string[] lines = File.ReadAllLines(filepath);
4
5 int m = lines.Length;
6 int n = lines[0].Split(',').Length;
7 int[,] matrix = new int[m, n];
8
9 for (int i = 0; i < m; i++)
10 {
11 int[] elements = lines[i]
12 .Split(',')
13 .Select(v => int.Parse(v))
14 .ToArray();
15
16 for (int j = 0; j < n; j++)
17 {
18 matrix[i, j] = elements[j];
19 }
20 }
21
22 return matrix;
23}
If we run the PathSum(…)
method on the 80 by 80 matrix, it finds the correct solution in just 5 milliseconds! The time complexity of the DP approach with memoization is $\Theta(mn - 1)$, whereas a naive implementation where we evaluate all the routes is at worst $O(2^{m + n - 3})$. For the 80 by 80 matrix, this means that with the DP approach the recursive relationship is evaluated 6399 times (not counting the base cases and when it is out of bounds), whereas a naive implementation is absolutely exploding with a total of 182687704666362864775460604089535377456991567872 steps.