Are there open source vision libraries that are low-latency enough to play this in real time? Assuming the robot could keep up with some regular servos. Would be a very interesting project.
Edit: I'm thinking I could run an Android emulator, and then use MonkeyRunner with Python OpenCV processing my monitor to do the work. Anyone who has any relevant experience please reply, I'd love to hear if this is feasible.
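To make the vision half concrete, here's a toy sketch of what template matching does, in pure numpy so it runs anywhere; all shapes and the "sprite" here are made up, and in practice you'd capture real monitor frames and let OpenCV's cv2.matchTemplate do this search far faster than the naive loop below:

```python
# Naive sum-of-squared-differences template matching: what
# cv2.matchTemplate(..., cv2.TM_SQDIFF) computes, written out by hand.
import numpy as np

def match_template(frame, template):
    """Return (row, col) of the best sum-of-squared-differences match."""
    fh, fw = frame.shape
    th, tw = template.shape
    best_score, best_pos = float("inf"), (0, 0)
    for r in range(fh - th + 1):
        for c in range(fw - tw + 1):
            patch = frame[r:r + th, c:c + tw]
            score = np.sum((patch - template) ** 2)
            if score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

# Tiny synthetic example: find a 2x2 "bird" sprite inside a 6x6 frame.
frame = np.zeros((6, 6))
frame[3:5, 2:4] = 1.0           # plant the sprite at row 3, col 2
template = np.ones((2, 2))
print(match_template(frame, template))  # -> (3, 2)
```

For real-time play the search would run on a small cropped region of the screen, not the whole frame.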
I was wondering if this would work for Flappy Bird. Given the speed of the game, I think it would be quite a challenge.
P.S. If you are interested in the mindstorms robot part, I'll put a shameless plug for a robot I built some time ago (based on some work by David Singleton). Code is on github but here is a pycon video: http://pyvideo.org/video/1195/self-driving-lego-mindstorms-r...
I'm working on training Tapster to play Flappy Bird. I'm going down the route of webcam + OpenCV.
(Only with Flappy Bird.) They say it's addictive, but you're taking this to another level: "Man, I need to build a Lego Mindstorms robot with a webcam and stylus and OpenCV - I've gotten to 194 but I NEED more. I need to take the human element out of this equation...."
It's really poorly formatted and has unused code in it, but it was fun to automate and watch. http://pastebin.com/yTmdWgfC
To use it just install the bookmarklet, click on it and then start the game. Press space to make it die.
I suspect you could cut the training time by an order of magnitude with a different optimization algorithm, or just by varying alpha. For example, have you considered trying something like a line search? http://en.wikipedia.org/wiki/Line_search
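For concreteness, here's a sketch of the backtracking flavor of line search on a toy quadratic; the objective and constants are placeholders, not the actual training loss:

```python
# Backtracking line search: instead of a fixed alpha, shrink the step
# each iteration until the Armijo sufficient-decrease condition holds.
def f(x):
    return x * x          # toy objective standing in for the real loss

def grad(x):
    return 2 * x

def backtracking_step(x, alpha=1.0, beta=0.5, c=1e-4):
    """Halve alpha until the step gives enough decrease, then take it."""
    g = grad(x)
    while f(x - alpha * g) > f(x) - c * alpha * g * g:
        alpha *= beta
    return x - alpha * g

x = 10.0
for _ in range(20):
    x = backtracking_step(x)
print(abs(x) < 1e-6)      # converges quickly to the minimum at 0
```

The point is that each iteration adapts its own step size, so there's no alpha to hand-tune at all.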
Reinforcement learning tries to find the optimal policy of action from a given position in a discrete state space.
The policy is roughly a map from state -> action. But it understands the temporal nature of the world.
It will start knowing nothing, then discover that hitting a pipe is really bad, and therefore that states close to nearly hitting a pipe are almost as bad. Over tons of iterations the expected negative value of hitting a pipe bleeds across the expected value of the state space.
With regression and optimization, it's never clear what you are optimizing against. Obviously hitting a pipe is bad. But what about the states near a pipe? Are they good or bad? There is no natural metric in the problem that tells you the distance from the pipe or what the corrective action should be.
So that's the major difference between reinforcement and supervised learning.
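That "bleeding" is easy to see in a minimal example. Here's a sketch on a 1-D chain of states where the last state is "hit the pipe" (reward -1); since the only action is "advance", this is really just TD value estimation, but it shows the pipe's negative value propagating backwards:

```python
# TD(0) value updates on a 5-state chain; state 4 is the pipe (terminal).
n_states = 5
V = [0.0] * n_states      # estimated value of each state
gamma = 0.9               # discount factor
alpha = 0.5               # learning rate

for episode in range(200):
    s = 0
    while s < n_states - 1:
        s_next = s + 1
        r = -1.0 if s_next == n_states - 1 else 0.0
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
        s = s_next

# States closer to the pipe end up "almost as bad" as the pipe itself.
print([round(v, 2) for v in V])  # -> [-0.73, -0.81, -0.9, -1.0, 0.0]
```

The agent was never told that state 2 is dangerous; that value emerged purely from discounted backups of the crash penalty.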
Can someone comment on the game's physics? I am assuming:
- constant horizontal velocity
- gravity
- flapping implemented via impulse
- collisions are handled with bounding boxes
Maybe someone who knows something about optimal control (ODEs) can say whether this is analytically solvable? Of course there's still the practical stuff (numerical integration, I/O lag) to deal with, but I'm optimistic.

Set flap_height = bottom of next pipe + constant.
If bird height < flap_height, flap.
Done!
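To check that this rule actually produces stable flight, here's a sketch with invented physics constants (the real game's gravity and flap impulse differ):

```python
# Bang-bang controller: flap whenever the bird drops below flap_height.
GRAVITY = -0.4        # made-up per-frame gravity
FLAP_IMPULSE = 3.0    # made-up vertical velocity set by a flap
MARGIN = 8.0          # the "+ constant" above the bottom of the next pipe

def step(y, vy, pipe_bottom):
    """One frame: flap iff below flap_height, then integrate."""
    flap_height = pipe_bottom + MARGIN
    if y < flap_height:
        vy = FLAP_IMPULSE      # flapping resets vertical velocity
    vy += GRAVITY
    y += vy
    return y, vy

# Simulate 200 frames against a fixed pipe whose gap bottom is at 40.
y, vy = 50.0, 0.0
heights = []
for _ in range(200):
    y, vy = step(y, vy, pipe_bottom=40.0)
    heights.append(y)

# The bird settles into a bounded oscillation around flap_height.
print(min(heights[50:]), max(heights[50:]))
```

With these constants the bird bounces in a band of roughly ten units around flap_height, so the margin constant has to be tuned so that band clears both pipe edges.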
Let distance between ground and top of screen=100.
Bird position after a flap at y=y0, time=t0:
y(t) = y0 + 13.8 - 147 * (t - t0 - 0.288)^2
(https://news.ycombinator.com/item?id=7229018 -- note the original comment had "+ 0.288", but if you plot the graph this was obviously a mistake)

I tried it out in my own HTML5/JS Flappy clone (mine's not at all interesting or even done, but I felt I had to give it a try), and the movement seems really accurate.
I have no idea how they got to those numbers, however.
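For what it's worth, a few properties fall straight out of the formula (taking y0 = 0, t0 = 0 for simplicity):

```python
# The trajectory is an inverted parabola with its vertex at t - t0 = 0.288,
# so 13.8 is the apex height above the flap origin, and the arc is
# symmetric: the bird returns to its starting height at t = 2 * 0.288.
def y(t, y0=0.0, t0=0.0):
    return y0 + 13.8 - 147.0 * (t - t0 - 0.288) ** 2

apex = y(0.288)   # 13.8 units above the flap origin
start = y(0.0)    # ~1.61, i.e. the formula doesn't pass exactly through y0
print(apex, start)
```

That nonzero y(0) suggests the fitted curve includes a little of the pre-flap motion, which may be part of why the constants look so arbitrary.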
- Have a neural network with 6 inputs for:
- player x
- player y
- acceleration speed
- acceleration direction
- next pipe mid-point x
- next pipe mid-point y
- Two outputs of 0 or 1 for click or no-click

The fitness score would be how close the player is to the pipe mid-point. Hopefully, this would cause the bird to stay as close as possible and fly between the pipes. The genetic algorithm would select the best neural network that knows when to flap based on the current input state.
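A rough sketch of that setup, with every constant invented, the two click/no-click outputs collapsed into a single threshold for brevity, and the "game" replaced by a stand-in simulator; selection here is simple truncation plus Gaussian mutation:

```python
# Evolve the weights of a tiny fixed-topology net (6 inputs -> 4 hidden
# ReLU units -> flap/no-flap) with a bare-bones genetic algorithm.
import random

N_IN, N_HID = 6, 4
N_WEIGHTS = N_IN * N_HID + N_HID

def decide(weights, inputs):
    """Forward pass; returns True to flap."""
    w1, w2 = weights[:N_IN * N_HID], weights[N_IN * N_HID:]
    hidden = [max(0.0, sum(w1[h * N_IN + i] * inputs[i]
                           for i in range(N_IN))) for h in range(N_HID)]
    return sum(w2[h] * hidden[h] for h in range(N_HID)) > 0.0

def fitness(weights):
    """Negative mean distance to the pipe mid-point over a short toy run."""
    y, vy, mid_y = 50.0, 0.0, 55.0
    total = 0.0
    for t in range(60):
        inputs = [t, y, vy, 1.0 if vy > 0 else -1.0, 30.0, mid_y]
        if decide(weights, inputs):
            vy = 2.5                  # made-up flap impulse
        vy -= 0.4                     # made-up gravity
        y += vy
        total += abs(y - mid_y)
    return -total / 60.0

def evolve(pop_size=30, generations=25):
    rng = random.Random(0)
    pop = [[rng.uniform(-1, 1) for _ in range(N_WEIGHTS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:pop_size // 3]   # keep the best third unchanged
        pop = elite + [[w + rng.gauss(0, 0.2) for w in rng.choice(elite)]
                       for _ in range(pop_size - len(elite))]
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))
```

Swapping in the real game as the fitness function is the expensive part, since every individual in every generation needs a full playthrough.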
I go by a rule of thumb like this:
- If you are able to collect lots of training data (inputs and valid outputs) then use back-propagation. It's faster and you might get better results.
- If you don't know the outputs for inputs or if there are simply too many possible combinations (such as in the case of a game), then use a genetic algorithm. It's effectively a search engine that finds the best solution within the problem space (the solution being the optimal weights for the neural network).
Using Neural Networks and Genetic Algorithms http://primaryobjects.com/CMS/Article105.aspx
Do you know some good literature for machine learning 101? Where to start?
Is that plausible? Or just my imagination?
https://dl.dropboxusercontent.com/u/8554242/available-for-2-...
(It's there in the background of the main menu.)