Daily Coding Problem #63
The Word Search Problem
Background:
I am going to try my best to publish my process (and hopefully, results) to each of the Daily Coding Problems here. I make no claims that I know what I am doing, that I have an efficient answer, or that any of this will make sense. Let’s get started.
Instructions:
Given a 2D matrix of characters and a target word, write a function that returns whether the word can be found in the matrix by going left-to-right, or up-to-down.
For example, given the following matrix:
[['F', 'A', 'C', 'I'],
['O', 'B', 'Q', 'P'],
['A', 'N', 'O', 'B'],
['M', 'A', 'S', 'S']]
and the target word ‘FOAM’, you should return true, since it’s the leftmost column. Similarly, given the target word ‘MASS’, you should return true, since it’s the last row.
Process:
This problem was hard for me. I’m not really sure what about it was that challenging, because it is listed as an “easy” problem, but I had a really rough time with it.
My first thought on this problem was to use a recursive approach. I’ve been doing that for most of the other problems, why not do it here?
However a few things pushed me away from that:
The word is always the same. In other words, you’re given that at the beginning of the function call, so no need to iterate on it.
The board is always the same. Unlike other problems, you don’t need to adjust the problem as you go and recursively evaluate — the letters don’t move.
Lastly, I couldn’t quickly wrap my head around what a recursive solution would look like. When in doubt, I fall back to my trusty for-loops.
Step 1: Simplify
I was having a really difficult time mentally grasping how to step through for loops of searching each cell, and what to search for, so I just made the problem easier:
Evaluate whether the word begins at (0,0). If I can build a function to do this, I can loop through the same function for all subsequent cells.
Step 1.1 Direction Vectors
In other problems, I have seen the technique of having a set of “movement” vectors called prior to any loop, and thought that would be a useful tool here.
Note: after implementing it, however, I realized that there are only two directions that really matter (right and down) so this was probably a waste. However, the problem is ambiguous as to whether up and left matter (ie the word reads backwards) so maybe not a bad idea after all.
So I implemented my (x,y) vector grid:
x = [-1, 0, 1, 0]
y = [0, -1, 0, 1]
This way, if I call x[i] and y[i], it will give me, in order, left, down, right, up. You could also add in diagonals if you’re feeling adventurous, but we’ll just leave it like that.
This is a technique that is useful for solving problems like the Knight’s Tour, or other problems where you have multiple options of “moves” and need to cycle through those options in a systematic fashion.
Step 1.2: Evaluate Square
The first test, before we get too in the weeds, is to evaluate if the first letter of the word occupies the square we are looking at. If it isn’t, no sense continuing the loop through something because our word is not there. This looks as follows:
if arr[0][0] != word[0], return False
Basically, if the first letter of the word (word[0]) is not where we thought it was, stop the function. Simple.
Now if we wanted to write this in the more general case, we would put the following:
if arr[row][col] != word[0], return False
where row and col are the coordinates (row and column) for the square we are evaluating.
So far so good.
Step 1.3: We Found the (First) Letter!
So what if it is the right letter?
Then we want to loop through our options of direction vectors. In simple terms, does the next letter appear either left, down, right or up from the first?
We create two new temporary variables, rd and cd, which we will use to track where we are looping through evaluating without modifying the initial row and col points. Ie from 0,0 we want to look at all the options around it, but we don’t want to move the first letter from 0,0.
Therefore:
for i in range(len(x)):
rd = row + x[i]
cd = col + y[i]
We set the range equal to the number of direction vectors we want to iterate through (this lets us add more in the future if we want to without modifying anything else) and then set our temporary evaluation coordinates to this updated location.
Step 1.4: Check for Next Letter
While we are looping through the direction vectors, we need to evaluate whether the square is correct. There are a few conditions we want to evaluate:
- We step outside the grid
- The next letter is not the right one
At first, I did this a weird way, and then I went back and fixed it. I’ll show both methods.
The first is using “break” in a loop:
So if we are outside of the grid, we want to break the loop because there is no way the next letter is there. It looks like this:
if (rd ≥ len(arr) or rd < 0 or cd ≥ len(arr[0]) or cd <0), break
Note: len(arr) gives the number of rows in the array, len(arr[0]) gives the number of columns specifically in the first row, but we assume the number of columns is constant in all rows (it’s a grid). Also note that rd and cd start at 0,0 where the lengths start at 1, hence the ≤ and ≥ symbols.
What this if statement tells us is that if we step outside the grid, stop looking.
The secondary, better way to do this is as follows:
if (rd < len(arr) and rd > 0 and cd < len(arr[0]) and cd > 0), next nested if statement
This gets a little messy with nested if statements, but it sets the function up in a way that if the step we are looping through (rd, cd) is outside the grid, it does not go to the next step. This is messier than the “break” option, but I suppose is better coding practice. Breaks can get difficult to wrap your head around.
The next if statement is as follows:
if arr[rd][cd] != word[1], break (using method 1)
if arr[rd][cd] == word[1], nest next statement (using method 2)
At this point, whether you use method 1 or 2, you have ensured the square you are evaluating (rd, cd) has the correct letter (word[1]).
However, we want to loop through all of the letters in the word using word[k].
So we put a for-loop around our checks for k from 1 to len(word) as follows:
for k in range(len(word)):
if (rd ≥ len(arr) or rd < 0 or cd ≥ len(arr[0]) or cd <0), break
if arr[rd][cd] != word[k], break
The next step is to increment which square you are evaluating, and repeat through the loop, as follows:
rd += x[i]
cd += y[i]
Put these statements within the for-loop.
Finally, as a final check before the end of the loop, if we have reached the end of the word (k = len(word) -1) we are done! Hurray!
Note: the -1 is because k begins at 0 but the length of the word begins at 1.
if k = len(word)-1, return true
At the very end of the function, we add the customary return False in case this did not work.
Step 2: Check All Squares
That was a lot of steps, but if you recall, this was just to evaluate 0,0. Now we have a function that will check for 0,0 — or whatever we put in as row, col.
We just need to double-for-loop for all of the rows and columns.
I put this all into a separate function which calls the first within it to keep things clean.
That’s it!
Clean Code and Extensions
So a few final notes.
I created a third function which was a combination of the first two and cleans up the instructions to make it more readable. Holy nested-for-loops Batman! But it makes sense when you think through it.
I’m not sure it is a better way of programming the solution, but it’s nifty.
As an extension, what if the problem included diagonals?
And much harder, what if the grid was in 3D?
Brendan is a Mechanical Designer at Nymi, and blogs about startups, mental models and why hardware is hard here. He’s a Venture for Canada alumni, coffee aficionado, and cookbook collector. He’s learning python.