The approach is clearly valid, but I feel like there are some missing pieces in making it work effectively in a digital context. I played around with this stuff a lot in college, and my take was that the evolvable encoding problem (a form of the representation problem) is fairly major and not really solved. There are also some unsolved issues around evolutionary dynamics and evolutionary game theory, i.e., how to structure the population and the "game" for best results. Not sure how much progress has been made since then on these, but my impression is not much.
The main area where GAs have seen use in the field so far is optimization problems with a lot of parameters related in unknown or hard-to-model ways, or where a closed-form solution is unknown or computationally too expensive (NP-hard). These include shipping and air travel routing (traveling salesman with multiple optimization goals like distance + time + fuel + depreciation), circuit board and IC layout, antenna design for exotic RF modulations, drug discovery, materials science, etc. Problems like these are fairly easy to map to a GA, and current GAs are pretty good at finding local maxima in these functions.
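To make "easy to map to a GA" concrete, here's a toy sketch of the standard loop: encode candidates as genomes, score them with a fitness function, then select, cross over, and mutate. Everything here (the OneMax objective, truncation selection, the rates) is illustrative, not from any particular system.

```python
import random

def evolve(fitness, dim=8, pop_size=50, generations=100, mut_rate=0.1):
    """Minimal GA: truncation selection, one-point crossover, bit-flip mutation."""
    pop = [[random.randint(0, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]              # keep the fitter half
        children = []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, dim)            # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g ^ (random.random() < mut_rate) for g in child]  # bit flips
            children.append(child)
        pop = children
    return max(pop, key=fitness)

# Toy problem ("OneMax"): maximize the number of 1-bits in the genome.
best = evolve(fitness=sum)
```

Swapping in a routing or layout cost for `fitness` (and a suitable encoding) is the whole mapping exercise; that encoding choice is exactly the representation problem mentioned above.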
Still, those are a far cry from "design ex nihilo," which is really the promise that evolutionary computation carries. Those applications are using GAs as a bigger brother to things like Monte Carlo methods and simulated annealing.
One area that I'd look into if I were doing this now would be a hybrid approach where a GA is used to design deep learning architectures and their associated parameters. Seems like it could be very powerful but damn would that ever take a lot of computing power. The fitness function of the GA would consist of N deep learning runs for each candidate. Luckily GAs are parallelizable to an almost unlimited degree.
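A sketch of that hybrid idea: genomes are hyperparameter tuples, and the fitness function would be N actual training runs per candidate. Here a cheap stand-in function replaces the training runs (that's the expensive part in practice), and a thread pool shows where the near-unlimited parallelism comes in. All names and the toy objective are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for "train a network with these hyperparameters and
# return validation accuracy" -- in reality each call is a full training run.
def fitness(genome):
    layers, width, lr = genome
    return -(layers - 4) ** 2 - (width - 128) ** 2 / 1000 - (lr - 0.01) ** 2 * 1e4

def mutate(g):
    layers, width, lr = g
    return (max(1, layers + random.choice((-1, 0, 1))),
            max(8, width + random.choice((-16, 0, 16))),
            max(1e-4, lr * random.choice((0.5, 1.0, 2.0))))

def search(pop_size=20, generations=10):
    pop = [(random.randint(1, 8), random.choice((32, 64, 128, 256)),
            10 ** random.uniform(-4, -1)) for _ in range(pop_size)]
    with ThreadPoolExecutor() as pool:      # candidates evaluated concurrently
        for _ in range(generations):
            scores = list(pool.map(fitness, pop))
            ranked = [g for _, g in sorted(zip(scores, pop), reverse=True)]
            pop = [mutate(random.choice(ranked[:pop_size // 4]))
                   for _ in range(pop_size)]
    return max(pop, key=fitness)
```

Because each candidate's evaluation is independent, the generation's entire fitness pass can be fanned out across as many machines as you have, which is the "almost unlimited" parallelism.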
An AlexNet/ResNet-type moment may be in the cards for GAs, but I wouldn't put any money on it. They're typically only one step better than brute force. This can be good enough (and is certainly easy to implement), but if you can get a gradient for your problem, you should use that. And nowadays, you can typically get a gradient!
Nature can't run SGD through genomes but has a metric ton of time, so evolution via sexual reproduction might be near-ideal for its setting. We typically don't have billions of generations, trillions of instantiations, and complex environments to play with when optimizing functions... It's telling that the fastest-evolving biological system (our brain!) certainly doesn't employ large-scale GA; if anything, it probably approximates gradients via funky distributed rules.
EDIT: The most modern application I can think of was some stuff from OpenAI (https://openai.com/blog/evolution-strategies/). But the point here is one of computational feasibility -- if they could backprop through the same workload, they would.
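The core trick in that OpenAI post is that the perturb-and-average update is itself a black-box gradient estimate: perturb the parameters with Gaussian noise, weight each noise vector by the return it earned, and step in that direction. A minimal sketch (toy quadratic objective, illustrative constants):

```python
import numpy as np

def es_step(theta, f, sigma=0.1, lr=0.05, n=100,
            rng=np.random.default_rng(0)):
    """One evolution-strategies update: a finite-difference-style
    gradient estimate from n Gaussian perturbations of theta."""
    eps = rng.standard_normal((n, theta.size))
    returns = np.array([f(theta + sigma * e) for e in eps])
    # Mean-subtracted returns weight the noise directions (variance reduction).
    grad_est = (eps.T @ (returns - returns.mean())) / (n * sigma)
    return theta + lr * grad_est

# Toy objective: maximize -||theta - 3||^2, optimum at theta = 3.
f = lambda t: -np.sum((t - 3.0) ** 2)
theta = np.zeros(5)
for _ in range(200):
    theta = es_step(theta, f)
```

Each step costs n full evaluations of `f` just to approximate what backprop would hand you in one pass, which is exactly the computational-feasibility point: if you can backprop through the workload, you should.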
If biological evolution wasn't much better than brute force we would not be here. There is no way you could randomly generate a functional genome for anything non-trivial in any reasonable multiple of the age of the universe even if you had a trillion Earths to work with.
But ... GAs are not biological evolution. I think the real issue is that present-day GAs only approximate some aspects of biological evolution, and they're very "chunky" in the same way that primitive neural network models are. They capture generation and selection, but actual biological evolution involves much deeper processes than that. Evolutionary theory is rich and quite fascinating.
Nice to see this sort of thing coming back - it was a problem I worked on for a couple of years during my MSc / PhD.
I think that there is probably a lot of scope for this type of approach. IIRC a key issue is tuning the genotype -> phenotype mapping algorithm so that you cover the space of potential network architectures reasonably efficiently while maintaining an ordered enough fitness landscape for the genetic algorithm to search effectively.
I really should try and find the time to think a bit about this again...
Neural architecture search (NAS) is a thing! But it's almost exclusively based on meta-gradients. Again, I wouldn't put my money on GAs ever outperforming gradient-based methods here.
GAs follow gradients too. It's a different learning approach. All learning follows gradients in one form or another except brute-force search, which is not feasible for state spaces larger than about 2^80 states. Evolution is not brute force.
https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6...