
0 points · bytefactory · 8y ago · 0 comments

Ooh, this'll be interesting to see. As with AlphaGo, I expect a lot of people will retrospectively dispute the "experts believed it would take ~10 more years" claim.

With SC2, no AI comes close to beating even a silver-level player, so a 5-year timeline seems really soon. Let's see if DeepMind can beat it!

What's your totally unscientific guess, Gwern?


4 comments · 1 top-level

gwern · 8y ago · 3 in thread

I think it is doable in under 5 years, but this critically depends on the resources invested by DM and other DL orgs. Deep RL is hugely demanding of computational resources to iterate your designs - for example, the first AlphaGo took something like 3 GPU-years to train it once (2 or 3 months parallelized); however, with much more iteration, DM was able to get Master's from-scratch training down to under 1 month. Now an AG researcher can iterate rapidly with small-scale hobbyist or researcher resources, but if they had had to do it all themselves, Ke Jie would still be waiting for a worthy adversary... When I look at all the recent deep RL research ( https://www.reddit.com/r/reinforcementlearning/ ) I definitely feel that we can't be far from an architecture which could solve SC2, but I don't know if anyone is going to invest the team+GPUs to do it within that timeframe. (It might not even be as complex as people think: some well-tuned mix of imitation learning on those 500k+ human games, self-play, residual RNNs for memory/POMDP-solving, and use of recent work on planning over high-level environment modeling*, might well be enough.)

* "Learning model-based planning from scratch" https://arxiv.org/abs/1707.06170 , Pascanu et al 2017; "Imagination-Augmented Agents for Deep Reinforcement Learning" https://arxiv.org/abs/1707.06203 , Weber et al 2017 (blog: https://deepmind.com/blog/agents-imagine-and-plan/ "Agents that imagine and plan"); "Path Integral Networks: End-to-End Differentiable Optimal Control" https://arxiv.org/abs/1706.09597 , Okada et al 2017; "Value Prediction Network" https://arxiv.org/abs/1707.03497 , Oh et al 2017; "Prediction and Control with Temporal Segment Models" https://arxiv.org/abs/1703.04070 , Mishra et al 2017

bytefactory (OP) · 8y ago

> (It might not even be as complex as people think ...

Yeah, I suspect you're right. Eliezer was alluding to this with the AlphaGo victory as well:

> ... Human neural intelligence is not that complicated and current algorithms are touching on keystone, foundational aspects of it. https://www.facebook.com/yudkowsky/posts/10153914357214228?p...

I can't decide if I would be bummed or excited if that turns out to be the case. On the one hand, we'd be that much closer to AGI. On the other, we'd be continuing down the path of brute-forcing intelligence, rather than depending on those elegant, serendipitous breakthroughs that much of human progress has been built on.

visarga · 8y ago

> rather than depending on those elegant, serendipitous breakthroughs that much of human progress has been built on

That's brute forcing as well. One such elegant idea comes every million (billion?) people. Random people would just output random ideas.

cjbprime · 8y ago

Yeah! I mentioned the same sentiment in this 2012 post when it was becoming clear that computers were reaching human strength at Go via brute force: http://blog.printf.net/articles/2012/02/23/computers-are-ver...
