Abstract: Artificial neural networks are universal function approximators. They can forecast dynamics, but they may need impractically many neurons to do so, especially if the dynamics is chaotic. We use neural networks that incorporate Hamiltonian dynamics to efficiently learn phase space orbits even as nonlinear systems transition from order to chaos. We demonstrate Hamiltonian neural networks on a widely used dynamics benchmark, the Hénon-Heiles potential, and on nonperturbative dynamical billiards. We introspect to elucidate the Hamiltonian neural network forecasting.
http://www.catb.org/~esr/jargon/html/koans.html
In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. “What are you doing?”, asked Minsky.
“I am training a randomly wired neural net to play Tic-Tac-Toe” Sussman replied.
“Why is the net wired randomly?”, asked Minsky.
“I do not want it to have any preconceptions of how to play”, Sussman said. Minsky then shut his eyes. “Why do you close your eyes?”, Sussman asked his teacher.
“So that the room will be empty.”
At that moment, Sussman was enlightened.
A baby is not born with the knowledge of body movement, for example, but through natural exploration of the body and environment, almost all physically capable humans learn to walk.
"We are seeking exceptional candidates to join our growing Autonomous Vehicle (AV) business team!"
https://techcrunch.com/2019/03/13/ford-is-expanding-its-self...
...oh that makes so much more sense! -.-
I'm probably misunderstanding what they accomplished, but it sounds like they've increased the accuracy of a neural network model of a system, notably for edge cases, by training it on a complete model of said system.
Not quite. It's really just that they require the dynamics to be Hamiltonian, which would be highly atypical of the kind of dynamics an otherwise unconstrained neural network would learn. This is reflected in their loss functions: the first learns an arbitrary second-order differential equation, while the second enforces Hamiltonian dynamics.
I don't understand how this was considered novel enough to warrant a PRE paper.
Here is a link to the paper:
https://journals.aps.org/pre/pdf/10.1103/PhysRevE.101.062207
In general the idea of including model or context-based information into neural networks goes along the line of Kahneman's System I and System II of the human mind. System I is the "emotional" brain that is fast and makes decisions quickly while System II is the "rational" brain that is slow and expensive and takes time to compute a response. Researchers have been trying to develop ML models that utilize this dichotomy by building corresponding dual modules but the major challenge remains in efficiently embedding the assumptions of the world dynamics into the models.
[0] https://arxiv.org/abs/1906.01563 [1] https://en.wikipedia.org/wiki/Thinking,_Fast_and_Slow
ML non-expert here. Is this the same as having an extra column of your input data that's the Hamiltonian of the raw input? Or a kind of neuron that can compute a Hamiltonian on an observation? Or something more complicated?
is this like a specialized 'functional region' in a biological brain? (broca's area, cerebellum)
A Hamiltonian neural network (HNN) intakes positions and momenta {q, p}, outputs the scalar function H, takes its gradient to find the position and momentum rates of change, and minimizes the loss

\mathcal{L}_{HNN} = \left\langle \left( \dot{q} - \frac{\partial H}{\partial p} \right)^2 + \left( \dot{p} + \frac{\partial H}{\partial q} \right)^2 \right\rangle,

which enforces Hamilton's equations of motion.
https://journals.aps.org/pre/abstract/10.1103/PhysRevE.101.0...
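A minimal sketch of that training signal, under stand-in assumptions: a known Hamiltonian (a 1-D harmonic oscillator, H = p²/2 + q²/2) plays the role of the network's scalar output, and central finite differences stand in for the automatic differentiation a real HNN would use.

```python
def hamiltonian(q, p):
    # Stand-in for the network's learned scalar H(q, p):
    # a 1-D harmonic oscillator in units where m = k = 1.
    return 0.5 * p**2 + 0.5 * q**2

def grad_H(q, p, eps=1e-6):
    # Central finite differences stand in for autodiff on a real HNN.
    dH_dq = (hamiltonian(q + eps, p) - hamiltonian(q - eps, p)) / (2 * eps)
    dH_dp = (hamiltonian(q, p + eps) - hamiltonian(q, p - eps)) / (2 * eps)
    return dH_dq, dH_dp

def hnn_loss(q, p, q_dot, p_dot):
    # Penalize deviation from Hamilton's equations:
    # q_dot = dH/dp and p_dot = -dH/dq.
    dH_dq, dH_dp = grad_H(q, p)
    return (q_dot - dH_dp)**2 + (p_dot + dH_dq)**2

# On a true orbit of the oscillator (q_dot = p, p_dot = -q) the loss is ~0.
print(hnn_loss(q=1.0, p=0.5, q_dot=0.5, p_dot=-1.0))
```

Because the flow is derived from one scalar H, Hamiltonian structure (and hence energy conservation) is baked in, rather than something the network has to discover.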
So, here it is: https://github.com/thesz/nn/tree/master/series
It's a proof-of-concept implementation of neural network training where the loss function is the potential energy in a Lagrangian, and I even incorporated a "speed of light": the particle's "mass" gets corrected by the Lorentz factor, m = m0/sqrt(1 - v^2/c^2).
Everything is done using ideas from a quite interesting paper about the power of lazy semantics: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32....
PS: "Proof of concept" here means it is grossly inefficient, mainly due to the amount of symbolic computation. Yet it works. In some cases. ;)
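The mass correction mentioned above, in isolation (a toy sketch in natural units with c = 1 by default, not the repo's actual code):

```python
import math

# Relativistic mass correction: m = m0 / sqrt(1 - v^2/c^2).
# The factor 1/sqrt(1 - v^2/c^2) is the Lorentz factor gamma.
def relativistic_mass(m0, v, c=1.0):
    return m0 / math.sqrt(1.0 - (v / c)**2)

print(relativistic_mass(1.0, 0.6))  # gamma = 1/0.8 = 1.25 at v = 0.6c
```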
Sutton is saying 'over a slightly longer time'.
You can wait 20 more years for super-duper-deep-NNs-on-steroids, on hardware a million times as big and powerful, to rediscover all of theoretical physics.
Or you could inject some theoretical physics acquired by humans and make DNNs smarter today.
If so, this would be dramatic, no?
If you could teach a translation service 'grammar' and then also leverage the pattern matching, could this be a 'fundamental' new idea in AI application?
Or is this just something specific?
I don’t see a way to generalize this to the procedural rule-based systems you describe, unless they too are governed by a fairly simple continuous function like the Hamiltonian.
I don’t know if it was “dramatic”, but it made me really happy.