AI Cheats at Old Atari Games by Finding Unknown Bugs in the Code

theverge.com

173 points · mtuncer · 8y ago · 45 comments

dpflan 8y ago

This reminds me of AI research using NES games. The AI eventually became proficient at completing Mario levels, and along the way it discovered novel strategies for survival, obtaining points, and finishing levels.

> Check out this timestamp to watch the machine "cheat": https://youtu.be/xOCurBYI_gY?t=9m55s

> Researcher's site about the project: http://www.cs.cmu.edu/~tom7/mario/

> The Paper: The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel...after that it gets a little tricky.: http://www.cs.cmu.edu/~tom7/mario/mario.pdf

pbhjpbhj 8y ago

Lol, that YouTube video: at the end, the AI pauses the game of Tetris forever so as not to lose.

maxander 8y ago

Another case where “the winning move is not to play.”

Sohcahtoa82 8y ago

I'd call it the first example of AI rage-quitting.

dpflan 8y ago

Agreed, that may be the cleverest thing the AI did.

iforgotpassword 8y ago

This guy. One of my favorite YouTube channels. Releases something like once a year but oh boy, worth the wait. Check it out if you're a nerd and like creative/useless stuff. ;)

atldev 8y ago

Well worth the watch. His quote when the AI pauses the game is gold!

ballenf 8y ago

Speaking of easy, I spent many an hour playing that Qbert version on Atari and spent a decent number of quarters on the arcade version.

The Atari version, even on the hard setting, was almost fatally dumbed down to the point of being mindless. The enemies were just way dumber than in the arcade version. The game really didn't even feel like Qbert.

With just a little practice, one could play on a single life for as long as desired. Similar to Asteroids on Atari.

personjerry 8y ago

> It’s not the most powerful or widely used form of AI at the moment, but it is making something of a comeback. The ability to crack Q*bert could be read as a good omen that evolutionary algorithms are going to be very useful in the future.

Wow that's quite a jump to make

mnx 8y ago

This sounds like me at the end of every school essay. A forced and over-broad conclusion just to get a "proper" ending.

tclancy 8y ago

@#$&%!

andyjohnson0 8y ago

The title seems misleading to me. The AI isn't finding bugs by somehow examining the game's source code; it's trying random gameplay and exploiting any advantages that emerge. That it's finding previously unknown bugs seems to be almost entirely down to trying things that human players wouldn't think to do.

tantalor 8y ago

You confuse bug (unintended behavior) with its cause (bad code).

BatFastard 8y ago

We called them "Unintended features", and they were usually quite popular with users.

montyf 8y ago

Exactly what I was thinking. It may be a bug, but the AI treats it as another legitimate game rule. I wonder if there are any techniques for it to be able to tell the difference... for example, if it can quantitatively demonstrate that the conditions for a rule are very rare/unlikely.

pvg 8y ago

> The AI isn't finding bugs by somehow examining the game's source code

The title doesn't say that.

PeterisP 8y ago

It kind of implies that - "AI Cheats at Old Atari Games by Finding Unknown Bugs" would be an accurate title, but the extended "AI Cheats at Old Atari Games by Finding Unknown Bugs in the Code" suggests that it's actually finding something in the code, as opposed to simply unexpected/emergent behavior.

soared 8y ago

Did you read the article?

> It’s important to note, though, that the agent is not approaching this problem in the same way that a human would. It’s not actively looking for exploits in the game with some Matrix-like computer-vision.

andyjohnson0 8y ago

I did. "looking for exploits in the game with some Matrix-like computer-vision." is a fairly meaningless phrase.

yorwba 8y ago

I haven't read the article yet, and I was not misled by the title. I guess it helps to be familiar with the way reinforcement learning agents are hooked up to a simulation environment.
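
To make the point in this subthread concrete, here is a minimal sketch of an evolutionary search that only ever sees action sequences and the scores they earn, never the game's code. Everything in it is an assumption made up for illustration: the toy game, the action names, the deliberately planted "bug" (JUMP, LEFT, JUMP awards 50 points), and the function names `play` and `evolve`. It is not the setup used in the research the article describes.

```python
import random

RIGHT, LEFT, JUMP = 0, 1, 2  # hypothetical action set for a made-up game


def play(actions):
    """Score one episode of the toy game.

    Intended rule: +1 point for every RIGHT step.
    Planted "bug": the sequence JUMP, LEFT, JUMP awards 50 points.
    """
    actions = list(actions)
    score = 0
    for i, action in enumerate(actions):
        if action == RIGHT:
            score += 1
        if i >= 2 and actions[i - 2:i + 1] == [JUMP, LEFT, JUMP]:
            score += 50
    return score


def evolve(length=12, pop_size=30, generations=40, seed=0):
    """Evolve action sequences using only the score as feedback."""
    rng = random.Random(seed)
    pop = [[rng.randrange(3) for _ in range(length)] for _ in range(pop_size)]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=play, reverse=True)
        history.append(play(pop[0]))
        parents = pop[: pop_size // 5]  # keep the top fifth unchanged
        children = []
        for _ in range(pop_size - len(parents)):
            child = list(rng.choice(parents))
            # Mutate one randomly chosen position.
            child[rng.randrange(length)] = rng.randrange(3)
            children.append(child)
        pop = parents + children
    best = max(pop, key=play)
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("best sequence:", best)
    print("best score:", play(best), "vs. 12 for intended play (all RIGHT)")
```

The search has no notion of which rule is "intended": a sequence that trips the planted bug simply scores higher, so selection keeps it. That is the sense in which the agent exploits a bug without examining any source code.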

nopinsight 8y ago

The case is an example of wireheading [1] and illustrates the difficulty of eliciting behaviors we actually desire from complex systems we do not fully understand.

[1] https://wiki.lesswrong.com/wiki/Wireheading

Another lesson: Evolutionary algorithms are really hard to control. Using neural networks developed through evolutionary algorithms means that we are employing a mostly opaque (though not entirely black) box created by a mechanism we can't mentally keep track of in detail. Hope that they are not deployed to control any critical systems until we get a much better grasp of them.

nopinsight 8y ago

Has anyone been able to comprehensively state all of the essential human values for a general AI to follow? Thankfully, we do not yet have an operational AGI and it is still quite a bit away from reality. (Narrow AIs we are using do not pose much of a problem because they are limited in capabilities.)

raverbashing 8y ago

Well, how do you say what's cheating or not? It works and it increases the evaluation score.

In this case one possible workaround to "cheating" would be to reduce the control precision, add some jittering to control inputs or change the goal function. But I'd say if it's being done solely using the intended controls it's not cheating (as opposed to changing memory or using a debug 'cheat code').

Still, even in real sports some "cheating" is allowed (see Fosbury Flop)
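
One way to read the "jittering to control inputs" idea above is sketched below: before an action sequence reaches the game, each action is swapped for a random one with some small probability, so exploits that depend on frame-perfect input become unreliable. The function name `jitter_actions` and its parameters are hypothetical, chosen for illustration, and this is only one of several ways such noise could be added.

```python
import random


def jitter_actions(actions, num_actions, epsilon, rng):
    """Return a noisy copy of `actions`.

    Each action is replaced, with probability `epsilon`, by an action
    drawn uniformly from range(num_actions). The input is not modified.
    """
    noisy = []
    for action in actions:
        if rng.random() < epsilon:
            noisy.append(rng.randrange(num_actions))
        else:
            noisy.append(action)
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)
    planned = [2, 1, 2, 0, 0, 0]  # hypothetical action codes
    print(jitter_actions(planned, num_actions=3, epsilon=0.25, rng=rng))
```

With `epsilon=0` the agent keeps full control precision; raising it trades some ordinary playing strength for robustness against brittle exploits.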

tboughen 8y ago

If it’s not technically cheating, it could be described as gamesmanship.

From Wikipedia https://en.m.wikipedia.org/wiki/Gamesmanship

“Gamesmanship is the use of dubious (although not technically illegal) methods to win or gain a serious advantage in a game or sport. It has been described as "Pushing the rules to the limit without getting caught, using whatever dubious methods possible to achieve the desired end".”

sincerely 8y ago

Another term for this is "angle shooting".

mannykannot 8y ago

It isn't cheating - as far as the program is concerned, each bug is another rule.

To understand the concept of cheating, and to discuss what is cheating, requires an entirely higher cognitive capability.

NicoJuicy 8y ago

I always found this a good project to demonstrate AI: https://xviniette.github.io/FlappyLearning/ (based on neuroevolution). Speed it up for faster results.

camgunz 8y ago

Can we put AI to work on proving that we live in a simulation? I would never enter/exit my apartment 38 times alternating between forwards, backwards and each side, but an AI would. Maybe then all the walls start flashing and then we'll know!

corobo 8y ago

"AI, are you in a simulation?" "Yes" "no I don't mean.. not the simulation I'm running you in, outside of that"

ianferrel 8y ago

People with Obsessive Compulsive Disorder are just depth-first-searching for an overlooked maximal strategy.

AnIdiotOnTheNet 8y ago

What would it possibly matter? If I told you tomorrow that the entire universe as you know it is running on some extra-dimensional alien computer, how exactly is your life changed? Is it any more or less meaningful? Will your suffering be any less painful, your happiness any less joyful?

Besides, how would you even tell the difference between a bug in the simulation and legitimate physics? I mean, look at electron tunneling.

shakna 8y ago

> Is it any more or less meaningful? Will your suffering be any less painful, your happiness any less joyful?

My happiness won't change, but I would be excited.

If we are indeed in a simulator, then I would be compelled to create or join an effort to attract the attention of a being outside the simulator. Not for worship, but discourse.

To be able to communicate with something that is outside of what we had perceived as reality, yet no less real, would be an amazing opportunity.

camgunz 8y ago

It would literally make the whole of human history a lie, in the same way that Mario never saved the princess and I haven't shot hundreds of ducks.

Semiapies 8y ago

So, it's basically working as a goal-oriented fuzzer.

hatsunearu 8y ago

Fuzzers are like bug/anomaly/new state finding-oriented reinforcement learning programs, so yeah, in a way :P

tabtab 8y ago

So it can become a dirty cheat just like a human. AI is getting more "natural" after all.
