A Genetic Algorithm Solution to the Jotto Problem

Have you played Wordle? Yeah, me too. I actually play Xordle every day now. It’s pretty tough. Anyhow, let me pose you a question, and maybe it will spark your imagination. I encourage you to try programming it yourself if you’re interested.

The problem is simple: out of a list of valid 5-letter words, your task is to assemble a list of five of those words such that, between the five you choose, 25 unique letters are represented. For example, [Salad, Lunch] only represents 8 unique letters: [A, C, D, H, L, N, S, U]

Here’s a link to the word list.

So, there are 12,973 words. That’s quite a few. Sadly, one word at a time won’t cut it: we need a non-repeating combination of five of them. The laziest possible count, five independent picks from the list, is 12,973^5. How many is that, anyways?

367,453,247,900,259,316,093.

367 quintillion, as it were.

Wolfram Alpha reports this number as roughly 8.5 times the number of possible configurations of a 3x3 Rubik’s cube. Strictly speaking, the count of genuinely distinct five-word combinations, C(12,973, 5), is “only” about 3 quintillion, but either way, brute force is off the table.
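
If you want to sanity-check those figures yourself, a couple of lines of Python will do it (math.comb has been in the standard library since Python 3.8):

import math

n = 12973                # size of the word list
print(n ** 5)            # five independent picks: 367,453,247,900,259,316,093
print(math.comb(n, 5))   # distinct 5-word combinations: 3,059,750,664,057,449,439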

There are a lot of ways in programming to deal with such a large number of possibilities when you only need to find one. A good place to start is reducing the number of candidates you’re working with.

First, we need a function to determine how many unique letters a word actually has.

def unique_letters(word):
    # set() collapses repeated letters; sorted() hands back a stable list
    return sorted(set(word))

Nice.

So, thinking about the problem logically, we know that no matter what words we use, there will be five words in the solution. Five words times five letters is exactly 25, so no letter can be wasted: each word MUST have 5 unique letters in it. That means we can shrink our problem space by simply removing all the words like “salad” that have a repeated letter!

aux_words = open("words.txt").read().split("\n")

# keep only the words made of 5 distinct letters
words = []
for word in aux_words:
    if len(unique_letters(word)) == 5:
        words.append(word)

I use “aux_” to show that a variable is still being built. Also, you could totally throw this into a function or a context manager if you wanted to make your solver even cleaner. I chose not to, because it works either way and this is the first way that I wrote it.
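
If you do want the tidier version, a sketch might look like this: the same filtering, just wrapped in a with block and a list comprehension (load_words is a name I’m making up for illustration).

def load_words(path="words.txt"):
    # context-manager version of the same filtering step
    with open(path) as f:
        return [w for w in f.read().split("\n") if len(unique_letters(w)) == 5]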

If you’ve never written a genetic algorithm before, there are some concepts you need to know.

The premise is to use the principle of natural selection to gradually arrive at a “fit” solution, given some fitness function or approximation. You generate a bunch of possible solutions upfront; these are typically just randomly selected from a generator.

import random

def generate_random_population(n, words):
    # n candidates, each a list of 5 distinct words picked at random
    return [random.sample(words, 5) for x in range(n)]

That’s the way I chose to do it. Using random.sample ensures that we won’t have duplicate words in our candidates.

The next step is also the first step of the genetic algorithm’s life cycle: evaluation. Every candidate has its fitness calculated so that the later steps of the algorithm have something to act on. I used a very simple fitness function that just counts up the unique letters in each solution.

def fitness(word_list):
    # number of distinct letters across the whole candidate; 25 means we've won
    return len(set("".join(word_list)))

As an aside, in most cases, this is where the bottleneck of your program is going to be. You might want to spend some effort making your fitness function as efficient as possible or consider using an approximation.
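
As a concrete (and entirely optional) example of that kind of speed-up for this particular problem, here is a sketch of a bitmask-based fitness. MASKS and fitness_fast are names I’m inventing for illustration, and it assumes the word list is all lowercase: each word gets a 26-bit mask computed once, and scoring a candidate becomes a few integer ORs plus a bit count.

# Precompute a 26-bit mask per word (assumes lowercase a-z words).
MASKS = {w: sum(1 << (ord(c) - ord("a")) for c in set(w)) for w in words}

def fitness_fast(word_list):
    # OR the masks together, then count how many distinct letters are set
    combined = 0
    for w in word_list:
        combined |= MASKS[w]
    return bin(combined).count("1")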

Historically, genetic algorithms have been effective when searching for single, formulaic solutions to problems that have a lot of possible inputs. One of the coolest known successes is in the case of this space antenna.

This is from the 2006 NASA ST5 mission. In space, there are special conditions that affect the way radio signals interfere with themselves. It’s possible to model this interference, but it’s very difficult for human intuition to work out what antenna geometry would be optimal under it. Scientists used a special design program to run a genetic algorithm against this interference model, and this wonky little antenna is the result. Would you have thought to try this particular shape? In a way, this creativity can set machines apart from humans. They try everything without hesitation and act only on what objectively works, rather than following intuitions or emotions.

The next step is crossover. Now, in the code we will read shortly, I didn’t end up using a crossover approach, but I might as well explain what it is anyways, in case you end up seeing this approach in the wild.

Crossover involves splitting a candidate up into two or more parts and then recombining those parts into a descendant candidate. In simplistic terms, it’s like taking half of your mother’s DNA, and half of your father’s DNA, and then mashing them together, and that’s *your* DNA. Real life doesn’t really work like that, but the approach can be effective occasionally in implementations of the genetic algorithm.
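
Had I used crossover here, a sketch might look like the following. crossover_candidates is a hypothetical helper, not part of the solver; it takes a couple of words from one parent and fills the rest in from the other, leaning on the same words list and random module as the rest of the script.

def crossover_candidates(mom, dad, k=2):
    # take k words from one parent, then fill in from the other, skipping duplicates
    child = mom[:k]
    for w in dad:
        if len(child) == 5:
            break
        if w not in child:
            child.append(w)
    # if the parents overlapped heavily, top up with random words
    while len(child) < 5:
        w = random.choice(words)
        if w not in child:
            child.append(w)
    return child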

Mutation is another “genetic operator” used in GAs. Mutation is exactly what it sounds like: you take candidates and tweak them just a little bit. Maybe that means slicing a little bit off, maybe that means adding a little bit on, maybe it just means changing the order some things are in; anything goes.

def mutate_candidate(c):
    # candidate is just a list of 5 strings
    res = list.copy(c)
    res.remove(random.choice(res))
    res.append(random.choice(words))
    return res

So, I used list.copy here to avoid mutating the original candidate. The structure I’m using for this is essentially just one big loop in a boring, plain script. To that end, the population is stored in one variable, and if mutate_candidate changed the original lists in place, it would throw off the fitness calculations for the next round.
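
If the by-reference point is unfamiliar, here’s the failure mode in miniature (toy variables, nothing from the solver):

original = ["salad", "lunch"]
alias    = original            # same list object, not a copy
alias.append("bread")          # this changes original too
safe     = original.copy()     # an independent copy; edits here leave original alone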

There’s one more function I need to show before we get to the big, juicy part of our program. It’s a simple helper function that just grabs the n most-fit candidates from our population. I will explain why we need it shortly.

def most_fit(pop, n=50):
    return sorted(pop, key=fitness, reverse=True)[:n]

That’s it!
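
One aside on that helper: sorting the whole population every generation is the easy thing to write, but if it ever becomes a drag, heapq.nlargest can make the same selection without a full sort. A possible drop-in (most_fit_heap is just a name for this sketch):

import heapq

def most_fit_heap(pop, n=50):
    # equivalent selection to most_fit, without sorting the entire population
    return heapq.nlargest(n, pop, key=fitness)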

Before we drop head-first into the big function, let’s talk about a few things. I highly recommend that you try to implement this algorithm on your own sometime. It can be on a simpler problem, if that makes it any easier. Start with a sentence, and then evolve random letters until they match the sentence. That sort of thing can be a nice exercise to learn the concept.
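
As a taste of that exercise (entirely separate from the word solver, and deliberately stripped down to a single candidate rather than a whole population), evolving a random string toward a target sentence fits in a couple dozen lines:

import random
import string

TARGET  = "the quick brown fox"
LETTERS = string.ascii_lowercase + " "

def score(guess):
    # count the positions that already match the target
    return sum(a == b for a, b in zip(guess, TARGET))

def nudge(guess):
    # mutate one random position
    i = random.randrange(len(guess))
    return guess[:i] + random.choice(LETTERS) + guess[i + 1:]

guess = "".join(random.choice(LETTERS) for _ in range(len(TARGET)))
while guess != TARGET:
    candidate = nudge(guess)
    if score(candidate) >= score(guess):
        guess = candidate
print(guess)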

I’ve played with this a *lot*, so I know some of the pitfalls and shortcomings of this approach. One of them is the idea of local optima. If we think of all the possible candidates and their fitnesses as a geometric space, it will resemble a mountain range: high peaks where the solutions are most fit, low valleys where they are least fit, and jagged terrain in between. If we’re not careful, the algorithm can get “stuck” on one of the lower peaks. Imagine we were running this program, and a candidate only kept a mutation if the mutated descendant was fitter. This sounds favorable at first glance, but let’s think about it a little more. Suppose we have picked four words, say [First, Second, Third, Fourth] (ignore the fact that they are different lengths). This might leave us in a position where no possible fifth word can increase the fitness of the candidate as a whole. So every generation, no matter what word is mutated, the candidate never gets more fit and it never dies off.

Our use of random.choice in mutate_candidate technically prevents this, but we can take it a step further and design some categories. There’s probably a technical word for this; in my head I call it “tribes.”

Let’s get into some code now.

def evolve(n, words):
    finished   = False
    highest    = 0
    gen        = 0
    population = generate_random_population(n, words)
    while not finished:
        gen += 1
        
        print(f"gen: {gen} highest: {highest} pop_size: {len(population)}")
        
        elites = most_fit(population, 25000)
        babies = generate_random_population(25000, words)
        worsen = [mutate_candidate(c) for c in list.copy(elites)]
        legacy = random.choices(population, k = (n - 75000) + (100 * gen))
        legacy = [mutate_candidate(c) for c in legacy]
        
        population = elites + babies + legacy + worsen

        gen_highest = max(population, key=fitness)
        
        if fitness(gen_highest) > highest:
            highest = fitness(gen_highest)
            print(f"new highest is {highest}: {gen_highest}")
        if highest == 25:
            finished = True
            print(f"solution: {gen_highest}")

Don’t panic! It’s simpler than it looks. Let’s start at the beginning.

So, finished, highest, and gen. These are just some pieces of state that we will keep track of to have good output as the solver is running.

Finished is a boolean that tells our loop when to stop running. We know that 5 x 5 is 25, so you’ll see at the bottom of the function

if highest == 25:
    finished = True
    print(f"solution: {gen_highest}")

We stop, and print out the solution. Very simple.

Highest is our state for keeping track of the best fitness found so far. We don’t *need* to do this, but I like to see visually how quickly it’s finding improvements. If we were working on a more complex problem space, it would be helpful to see whether it’s finding improvements every few minutes or once every 12 hours.

Gen is just our generation counter. We increment it at the beginning of every cycle, and that tells us how many times the natural selection cycle has occurred.

Now, let’s move on to our tribes.

elites = most_fit(population, 25000)
babies = generate_random_population(25000, words)
worsen = [mutate_candidate(c) for c in list.copy(elites)]
legacy = random.choices(population, k = (n - 75000) + (100 * gen))
legacy = [mutate_candidate(c) for c in legacy]

So, let’s dig in.

Elites are the fittest candidates of the bunch; this is what we wrote that helper function for. They will likely all share the same fitness: if your highest fitness so far is 24, odds are *all* of these candidates also have a fitness of 24.

Babies are brand new solutions. I’m including them to ensure additional randomness in the gene pool. These babies are just randomly generated from scratch and will likely not be very fit. However, this means that there is constant “motion” in each generation, because next generation these babies will be mutated into possibly fitter descendants.

Worsen is a copy of the list of elites, mutated. This is to keep the elites from getting stuck, like we learned about earlier.

Finally, we have the legacy. The legacy group is the mixing pot that fills the remaining slots: roughly n minus the 75,000 taken by the other tribes, drawn at random from the whole population and mutated. You can see that we are using (100 * gen) in our call to random.choices, which lets the total size of the population grow a little each generation. I run this simulation with 400,000 as the initial population size; any number above 75,000 should still run as expected.

Those are our tribes. I’ve used these tribes because I have played with them before and I have a half-baked intuition for what will work and what won’t, and I felt confident that these would work.

population = elites + babies + legacy + worsen

Here, we shove all the tribes together as the new population and that completes the cycle.

gen_highest = max(population, key=fitness)

if fitness(gen_highest) > highest:
    highest = fitness(gen_highest)
    print(f"new highest is {highest}: {gen_highest}")
if highest == 25:
    finished = True
    print(f"solution: {gen_highest}")

We find the fittest candidate of the generation, update our highest variable from before, and check to see if we’re finished.
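
For completeness, kicking the whole thing off is nothing fancier than a single call at the bottom of the script; something like this, with the 400,000 initial population size from above:

if __name__ == "__main__":
    evolve(400000, words)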

You can find a link to this script here. The word list is up at the top.

I hope you enjoyed reading this, and maybe you’ve learned something along the way!