
Neural Evolution

Self-driving cars in Unity, evolved without supervision. A custom NN, a population, a fitness function, and a diverse track set. Watch, in real time, a generation get less terrible at not crashing.


Unity, C#, Custom NN, Evolutionary algorithms

01

The setup

Neural-Evolution preview: cars driving an unseen track. Through evolution alone, the model generalised well enough to steer, use boost, and complete an unseen map at full speed - no supervision, just natural competition.

Each car gets a small neural network and a fitness function that rewards distance covered along the track centerline. There is no supervision and no gradient descent on the network itself - the training loop is a generational evolutionary algorithm: select top performers, copy them, mutate weights and biases by a configurable amount, evaluate the next generation, repeat.
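A minimal sketch of that loop, written in Python rather than the project's C# so it runs on its own. Everything named here is an assumption for illustration: the flat-list genome, the `evolve` and `mutate` helpers, the default values, and the choice to carry the top performers over unchanged. `toy_fitness` is only a stand-in for "distance covered along the centerline".

```python
import random


def mutate(genome, rate, strength, rng):
    """Perturb each gene with probability `rate` by Gaussian noise of std `strength`."""
    return [g + rng.gauss(0.0, strength) if rng.random() < rate else g
            for g in genome]


def evolve(fitness, genome_len, pop_size=50, elite=5, rate=0.2,
           strength=0.3, generations=40, seed=0):
    """Generational loop: rank, keep the top `elite`, refill with mutated copies."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        parents = ranked[:elite]
        # Assumption: elites survive unchanged; the rest are mutated copies.
        children = [mutate(rng.choice(parents), rate, strength, rng)
                    for _ in range(pop_size - elite)]
        population = parents + children
    return max(population, key=fitness), best_per_generation


def toy_fitness(genome):
    """Stand-in for track progress: higher is better, best when every gene is 0.5."""
    return -sum((g - 0.5) ** 2 for g in genome)


best, history = evolve(toy_fitness, genome_len=8)
```

Because the elites are copied over untouched and the toy fitness is deterministic, the best score in `history` can never drop from one generation to the next. With a noisy physics simulation that guarantee would not hold.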

The whole loop is configurable from the inspector: layer sizes, population size, mutation rate and strength, track topology, and evaluation episode length. The point of the project was less to produce a top-line driving model and more to make the geometry of behavior legible - to actually watch a behavior emerge from variation and selection rather than from a loss surface.
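As a rough picture of what such a settings object could look like, here is a Python sketch. The field names, defaults, and the `EvolutionConfig` class itself are hypothetical and not taken from the project; in Unity the equivalent would presumably be serialized fields on a component.

```python
from dataclasses import dataclass, field


@dataclass
class EvolutionConfig:
    # All names and defaults below are illustrative assumptions.
    layer_sizes: list = field(default_factory=lambda: [6, 8, 3])
    population_size: int = 50
    mutation_rate: float = 0.2      # chance that a given parameter is perturbed
    mutation_strength: float = 0.3  # std of the Gaussian perturbation
    track_name: str = "training"
    episode_seconds: float = 30.0

    def parameter_count(self):
        """Weights plus biases of a fully connected network with these layers."""
        pairs = zip(self.layer_sizes, self.layer_sizes[1:])
        return sum(n_in * n_out + n_out for n_in, n_out in pairs)

    def validate(self):
        """Raise ValueError on settings that cannot describe a usable run."""
        if len(self.layer_sizes) < 2 or min(self.layer_sizes) < 1:
            raise ValueError("need at least an input and an output layer")
        if self.population_size < 2:
            raise ValueError("population_size must be at least 2")
        if not 0.0 <= self.mutation_rate <= 1.0:
            raise ValueError("mutation_rate must lie in [0, 1]")
        if self.mutation_strength < 0.0 or self.episode_seconds <= 0.0:
            raise ValueError("strength must be >= 0 and episode length > 0")


config = EvolutionConfig()
config.validate()
search_dimensions = config.parameter_count()
```

`parameter_count` makes the "keep it tiny" trade-off concrete: every extra hidden unit adds dimensions that mutation has to search blindly.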

02

The network

Neural network structure: a layered NN. Inputs are raycast distances and velocity readings; outputs are steering, throttle and boost.

The network is a small fully-connected MLP, intentionally tiny so the search space stays tractable. Inputs are raycast hits in a fan around the car (distance to track edge in several directions) plus current velocity. Outputs are steering, throttle and a boost gate. Mutation perturbs weights and biases with Gaussian noise scaled by a configurable strength.
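A sketch of a network of this shape, in Python for self-containment. The specifics are assumptions, not the project's code: five rays plus speed as inputs, one hidden layer of eight units, `tanh` activations, a boost gate that fires above a threshold of zero, and noise applied to every parameter.

```python
import math
import random


def init_network(layer_sizes, rng):
    """One (weights, biases) pair per layer; weights[j][i] links input i to unit j."""
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Fully connected pass with tanh on every layer, so outputs lie in [-1, 1]."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = [math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
                       for row, b in zip(weights, biases)]
    return activations


def mutate(layers, strength, rng):
    """Return a copy with Gaussian noise (std = strength) on every weight and bias."""
    return [([[w + rng.gauss(0.0, strength) for w in row] for row in weights],
             [b + rng.gauss(0.0, strength) for b in biases])
            for weights, biases in layers]


def decode(outputs, boost_threshold=0.0):
    """Map raw outputs to controls: steer in [-1, 1], throttle in [0, 1], boost on/off."""
    steer, throttle, boost = outputs
    return {"steer": steer,
            "throttle": (throttle + 1.0) / 2.0,
            "boost": boost > boost_threshold}


rng = random.Random(1)
net = init_network([6, 8, 3], rng)
sensors = [0.9, 0.7, 0.4, 0.7, 0.9, 0.5]  # five normalised ray distances plus speed
controls = decode(forward(net, sensors))
child = mutate(net, strength=0.1, rng=rng)
```

`mutate` returns a new network rather than editing in place, so a parent can seed several children without them sharing weights.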

The interesting bit, operationally, is how brittle behavior is to the input layout. Move a raycast a few degrees and you get a completely different end-state behavior, because evolution latches onto whatever cue is most immediately predictive of fitness. That is one of the things I keep liking about evolutionary search: it is brutally honest about what your representation does and does not let the system see.

03

The track

Main training track. Unseen-track evaluation is what tells you whether the model evolved a generalisable policy or memorised one path.

Three tracks ship with the project. Training happens on one; evaluation on the others. The fun finding - the one in the GIF above - is that with sufficient population and diverse-enough tracks during training, the evolved networks generalise. They steer through track shapes they have never seen before. Pure variation and selection produces a control policy that is more robust than I would have predicted from the size of the network and the simplicity of the inputs.
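One way to make the train-versus-unseen comparison concrete is a small evaluation harness like the Python sketch below. It is a toy under stated assumptions: tracks are reduced to lists of curvature values, a policy is a function from curvature to steering, an episode ends at the first steering miss, and all numbers are made up for illustration rather than measured from the project.

```python
from statistics import mean


def run_episode(policy, track, tolerance=0.2):
    """Toy episode: advance segment by segment until steering misses the curvature."""
    progress = 0
    for curvature in track:
        if abs(policy(curvature) - curvature) > tolerance:
            break  # treated as a crash
        progress += 1
    return progress / len(track)


def evaluate(policy, tracks):
    """Fraction of each named track completed by one policy."""
    return {name: run_episode(policy, track) for name, track in tracks.items()}


def generalisation_report(scores, train_names):
    """Average score on training tracks versus tracks never used for selection."""
    train = mean(s for name, s in scores.items() if name in train_names)
    unseen = mean(s for name, s in scores.items() if name not in train_names)
    return {"train": train, "unseen": unseen}


# Hypothetical tracks: one used for training, two held out.
tracks = {
    "train": [0.0, 0.3, -0.3, 0.1],
    "unseen_a": [0.5, -0.5, 0.2],
    "unseen_b": [0.0, 0.9, -0.9],
}

# A policy that reacts to what it senses, and one that only knows training values.
memorised_values = {0.0: 0.0, 0.3: 0.3, -0.3: -0.3, 0.1: 0.1}


def reactive(curvature):
    return 0.9 * curvature


def memorised(curvature):
    return memorised_values.get(curvature, 0.0)


reactive_report = generalisation_report(evaluate(reactive, tracks), {"train"})
memorised_report = generalisation_report(evaluate(memorised, tracks), {"train"})
```

Both toy policies finish the training track, so only the held-out scores separate them. That is the reason to keep evaluation tracks out of the selection loop entirely.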

That sounds like a small thing. In absolute terms, it is. But it is the single observation that started me thinking seriously about what evolutionary signals can teach an architecture, and eventually about why low-bandwidth selection rules feel like they belong in cognition - which is a thread that runs through the rest of my research.

Want to argue with any of this?

aiman@shabib.net
