Learning to play Asteroids in Golang with NEAT

I am a huge fan of CodeBullet, an educational YouTuber that makes game playing AI bots. That is because computer science education more than just conveying complex ideas, it is also getting people excited using those ideas to solve interesting problems, something CodeBullet does very well.

Two years ago CodeBullet showed a bot learning to play Asteroids. I though I wonder if I can do that… in Golang. The first thing was build a desktop app in Golang, then it was to build an asteroids game. Finally, this post is about how I used the goNEAT implementation of NEAT to build an AI that can play my asteroids game.

The constraint to use Golang is artificial. If I were being practical I would use Python and boot up a demo of PyGame Asteroids and plug into the PyTorch NEAT implementation. But I want to use Golang, so here we are.

You can play the AI at https://grahamjenson.github.io/asteroids/### NeuroEvolution of Augmenting Topologies (NEAT)

NeuroEvolution of Augmenting Topologies

NEAT is an algorithm that combines a few ideas like speciation and evolving neural network graphs into a genetic algorithm. I would prefer not to implement it myself (yet), since making these kinds of algorithms performant is a massive time sink. Good news though, on the official(?) list of NEAT implementations there are three written in Golang; one is not working, one has not been touched in 5 years, and the last is goNEAT that looks both maintained and well implemented.

The biggest drawback to goNEAT is that it is not very usable or well documented. It does have some examples though, like XOR and pole balancing which I can copy and adapt.

There was one minor issue that required me to vendor goNEAT; I had to remove its [_viper_](https://github.com/spf13/viper) dependency to make it WASM compatible.

Asteroids

I wrote the asteroids game with an AI bot in mind. The game logic and rending can be separated so I can train the bot as a Golang binary (not WASM) to take advantage of native performance.

To implement the Asteroids testing I used the experiments setup in goNEAT: err = experiment.Execute( context, start_genome, AsteroidGenerationEvaluator{ OutputPath: out_dir_path, PlayTimeInSeconds: 120, FrameRatePerSecond: 15, } )

AsteroidGenerationEvaluator will simulate a 120 second long asteroids game (or until the AI dies) running at 15 frames a second. So a single game takes at most 1800 calls to update the game.

To work with the training framework in goNEAT, AsteroidGenerationEvaluator implements the method GenerationEvaluate: func (ex AsteroidGenerationEvaluator) GenerationEvaluate( population *genetics.Population, epoch *experiments.Generation, context *neat.NeatContext ) (err error) { // Calculate the fitness of all organisms in the population }

This calculates the fitness of the organisms for a population (generation) by getting each of them to play an Asteroids game to calculate their score.

This looks like: for _, org := range population.Organisms { net := org.Phenotype // Neural Network game := &asteroids.Game{} frames := PlayTimeInSeconds * FrameRatePerSecond for f = 0; f < frames; f++ { // calculate the inputs inputs := FindInputs(game) // send those inputs to the network net.LoadSensors(inputs) net.Activate() // run the network for i, output := range net.ReadOutputs() { // if a key output is pushed switch i { case 0: key = keys.KEY_UP case 1: key = keys.KEY_SPACE case 2: key = keys.KEY_LEFT case 3: key = keys.KEY_RIGHT } if output > 0.5 { // output activated pressedKeys[key] = true } } // Update the GameState game.Update(pressedKeys) if game.GameOver() { break } } // Use game score as fitness function // Fitness is normalized to between 0 and 1 org.Fitness = norm(game.Score) }

Loop over each organism
Initialize a game
Find the bots inputs
Send inputs to the organisms neural network
Gets the networks outputs and map them to “pressed” keys
Updates the game with the outputs
Checks if GameOver
After the game assign the organism’s fitness as the game’s score

The variables that need to be experimented with are the inputs and the fitness function. What the bot sees in the world and how we compare the organisms will be the main factors in finding a “good” bot.

Coordinates of the Asteroids

The inputs are what the bot “sees”. The easiest input to find is the relative x,y coordinates of the asteroids to the ship. These coordinates center the ship at 0,0 and rotate around so an asteroid in front of the ship lands on an axis. Since the inputs are a fixed length, I will just find the 5 closest asteroids and send them as inputs.

An important aspect of training is the randomness in the game. Two organisms should be compared with as little noise as possible, so we are fix the randomness per-generation using rand: game := &asteroids.Game{} rand.Seed(int64(generation)) // Force same game per generation defer rand.Seed(time.Now().UnixNano()) game.Init()

The defer statement will reset the randomness to not mess with the NEAT algorithm.

Training this bot created something like:

After 100 generations

This bot can fire at least. It also might be aiming, but that is probably just noise. After lots of training this bot peaked scoring about 20 points. That is the amount you get if you just hold down fire.

Whiskers

There are a few problems with the previous x,y inputs:

Coordinates don’t map to actions; e.g. if x=10 do I fire or turn left.
An input is useless on its own; the AI must look at two inputs to get any use this requires its own training.
Not a High/Low measurement; because of normalization an x=0.4 means the asteroid is left of the ship and x=0.6 means the asteroid is right

What did CodeBullet use? Whiskers. He drew 8 lines of sight and if one of them hit an asteroid it returned the distance. Like how a sea-lion uses its whiskers to see the world.

CodeBullet’s whiskers

To do this, the ship gets 8 whiskers and using SAT collision detection to find the closest asteroids distance makes a bot like:

100 generations

Live Longer and Prosper

After some more training, the bots seem to be all really lazy. They just camp the same location to aim and shoot at passing asteroids. That is boring.

I think these camping bots exist because we don’t consider the speed a bot gets its points. I want the ship to get a high score AND get it quickly. So I am reducing the time a game runs and changing the fitness function to game.Score + survived_seconds. This trains bots like:

Well at least it is moving around. There is something still wrong with this bot. I think it is because the weighting between score and survived_seconds is wrong.

I am going to do a little experiment and change the fitness function to only be survived_seconds:

Survival Bot

That looks cool, the bot dodged an asteroid. So, I think that survival is far more important than score, so with the fitness of score + (10*seconds):

Now the bots are actively trying not to shoot asteroids! I think that is because when an asteroid is shot it splits in two and makes the game harder. So the bot is now trying to shoot as little as possible.

I tested out a few different weights but nothing worked well. I think weighting the bots like score + (N * seconds) is the wrong approach. It just adds one more dimension of difficulty.

Simulating a bots life

With the ability to simulate my asteroids game I could calculate some numbers to help me decide what to do next. I can think of two basic strategies to try “do nothing” and “spin and shoot” and take a random 1000 games and see their mean/min/max lifespan: Do nothing bot: mean:**5.66** min:**0.5** max:**34.5** Fire and spin bot: mean:**4.96** min:**0.3** max:**26.8

These were surprising. There is a game that kills a bot in less than a second (that is not a fun game), and one that takes over 30 seconds. Also, my intuition that not shooting was a better strategy for survival is true.

What we want is to remove all bots that do nothing, and only try keep active bots. One way to do that is to force a minimum lifespan the fitness: if seconds < 6{
return 0
} else return score``

Now, most bots that do nothing, or just spin and shoot, will have 0 fitness.

Also, further reducing the time a bot plays to 15 seconds should make getting a large score much more difficult. Now bots are trying to maximize their score in a small amount of time while surviving long enough to not just sit still.

Exponential inputs

The inputs to the neural network are linear. A asteroid that is 100px away is 2x as important as one that is 200px away. I think this is intuitively incorrect. For example, a bear that is 100m away is much more than just 2x concerning than one that is 200m away.

I am also going to change the inputs from the whiskers to grow exponentially with this function exp:

This function (from here) will take a number between 0 and 1 and return a number in the same range, e.g. for a=40:

With a=40 an asteroid that is 100px away is now 5x more important than one that is 200px away. A bot now will hopefully now appropriately react to an incoming asteroid.

Game Difficultly

I think the game is too hard. To fix that I have:

made the asteroids split less: they were getting in between the whiskers
make the ships rotation and velocity faster
always set the training frame-rate to 60fps, so whiskers have more chance of hitting an asteroid on a turn

After this a trained bot looks like:

Dodging an asteroid

Randomness and Precision

I have come to the conclusion that I should eliminate all randomness during training. A bot should train on the exact same level for all generations. I think training is hard enough without making it even more difficult on the bot. Training a specific asteroids bot should be much faster than training a general one.

I also adjusted the fitness algorithm to be: return score * exp(secondsPercent)

secondsPercent is the percentage of game time the bot is alive, and exp is the function defined above, e.g. a max game time of 10 seconds and the bot gets a score of 20 in 7 seconds its fitness will be 6.3-ish.

The goal is to reward surviving on an exponential curve. With this we can remove the previous if statement as surviving for 0 or 5 seconds are about the same, and not surviving the game is punished significantly.

170 Generations resulted in a bot that can play pretty well

This strategy resulted in a bot that can play, and get scores into the 70s, sometimes. It also looks human-ish as it is not spinning around quickly or performing weirdly.

Performance

This might be a good time to talk about the performance of using a goNEAT bot with WASM inside the browser. We can check how long it takes the AI to make a decision using the chromes dev tools:

This shows that the entire animation loop, including calling the AI takes only 3.64ms, with only 0.12ms spent asking the bot to make a decision. This is encouraging as it means we can make these bots significantly more complicated before they become a performance issue on the front end.

Partial Ordered Fitness

Here is the problem, I want a bot to get a high and score quickly. These two goals form a partial order, a Pareto front of possible good solutions. It is difficult to build a fitness function that does not over-optimize on score (creating short lived rapid fire bots) or survival (a bot that waits for asteroids to come to it). I need a fitness function to focus on a single value.

So I made the game even easier with slower and less asteroids. Now, even simple bots can win given enough time. Then I can focus the fitness function on a single target: if !won { return 0 } return 1 - secondsPercent

This will fail any bot who can’t win, so only optimizing on how long it takes to win.

After a NEAT trains a bunch of species that can win easy games, we increase the difficulty. This is the boiling a frog method to training a bot.

After a few iterations of this we get a bot that plays human-ish:

Then a bot that plays a little better:

Then after more training, I finally have a bot that is starting to look superhuman:

Perfection

This project never ends. It is impossible to train the perfect bot, there is always one more thing to do, one more level of difficulty, or one more trick to teach. For example, my bot is not an ambi-turner, it is like Zoolander and cant turn left.

So far I have been using the NEAT algorithm and goNEAT’s implementation as a black box. I haven’t really dived into it to see how it is working, or how to optimize the many parameters that go into training. I think the next steps will be to really understand, and probably reimplement, the NEAT algorithm. Or like CodeBullet try other games and other algorithms to go wide and not deep.

You can play some AI bots here

Asteroids#

Coordinates of the Asteroids#

Whiskers#

Live Longer and Prosper#

Simulating a bots life#

Exponential inputs#

Game Difficultly#

Randomness and Precision#

Performance#

Partial Ordered Fitness#

Perfection#