thephyber · 4y ago · 0 points

I have fiddled with genetic programming. I don't think there is a good, useful metric for comparing one code generator against another, so I don't think DeepMind should care.

Most of the code generated by my genetic programming algos doesn't compile. Very occasionally the random conditions line up so that a run jumps past a local maximum and comes up with a useful candidate source code. Sometimes the candidates compile, run, and produce correct results.

The time it takes to run varies vastly with the parameters (population size, how the mutation function works, how the fitness function weights and scores candidates, etc.).
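To give a feel for that sensitivity, here is a minimal sketch. It is a plain genetic algorithm on a toy bit-string problem (OneMax), not real genetic programming over source code, and every name in it (`evolve`, `population_size`, `mutation_rate`, `genome_length`) is made up for illustration.

```python
import random


def evolve(population_size, mutation_rate, genome_length=20,
           max_generations=500, seed=0):
    """Toy genetic algorithm on OneMax (maximize the number of 1 bits).

    Returns the generation at which an all-ones genome first appears,
    or max_generations if none appears in time. Illustrative only.
    """
    if population_size < 2:
        raise ValueError("population_size must be at least 2")
    rng = random.Random(seed)  # seeded so a given setting is repeatable
    fitness = sum              # fitness of a genome = count of 1 bits

    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]

    for generation in range(max_generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == genome_length:
            return generation

        # Truncation selection: the fitter half become parents.
        parents = population[:max(2, population_size // 2)]
        children = []
        while len(children) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_length)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children

    return max_generations


if __name__ == "__main__":
    for size in (10, 50):
        for rate in (0.001, 0.02, 0.2):
            print(size, rate, evolve(size, rate))
```

Even on a problem this small, the number of generations needed can swing widely as the population size and mutation rate change, and a different seed changes it again.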

Personally, I really like that these DeepMind announcements don't get lost in performance comparisons, because inevitably those would get bogged down in complaints like "the other thing wasn't tuned as well as this one was". Let third-party researchers who have access to both do that work, independently.

rabbits77 · 4y ago

If you think tuning GP parameters is challenging, wait until you try tuning hyperparameters for a DL model!

It is just a press release, to be fair to DeepMind, and I guess they can promote themselves however they wish.

My original comment came more from the context of seeing neural network models in practice perform barely any better, if at all, than classic ML models. Those comparisons were revealing, and I suspected this use case might fare similarly against another classic technique.

GP is certainly not the shining star of AI right now, but it is actively researched, and perusing Google Scholar on the subject will show you plenty of interesting, but less heralded, results.

There are probably several meaningful metrics for this problem that can be examined. If nothing else, it is a simple matter of grading the solutions each one produces, like a university assignment. Also, classical techniques are typically less resource intensive than any neural network methods; the energy savings alone, when considered at production scale, would be significant.
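As a rough sketch of the "grade it like an assignment" idea, assuming each generator emits Python source that defines a function and that we have a fixed set of test cases: the names `grade`, `solve`, and `test_cases` are hypothetical, and a real harness would need sandboxing and timeouts before running untrusted code.

```python
def grade(source, test_cases, entry_point="solve"):
    """Score candidate source as the fraction of test cases it passes.

    A candidate that fails to compile, or that does not define
    entry_point, scores 0.0. Illustrative only: exec() on untrusted
    code is unsafe without a sandbox.
    """
    namespace = {}
    try:
        exec(compile(source, "<candidate>", "exec"), namespace)
        func = namespace[entry_point]
    except Exception:
        return 0.0  # did not compile, crashed on load, or wrong name

    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed / len(test_cases)


if __name__ == "__main__":
    test_cases = [((1, 2), 3), ((0, 0), 0), ((2, 2), 4)]
    candidates = {
        "correct": "def solve(a, b): return a + b",
        "partly right": "def solve(a, b): return a * b",
        "does not compile": "def solve(a, b) return",
    }
    for name, source in candidates.items():
        print(name, grade(source, test_cases))
```

Running the same test set over the output of each generator gives one comparable number per system, and the same harness could also record wall-clock time or energy use.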
