Could we construct a neural net from nodes with more complex behaviour? Probably, but in computing we’ve generally found that it’s best to build up a system from simple building blocks. So what if it takes many ML nodes to simulate a neuron? That’s probably still an efficient way to do it, especially in this early phase where we’re not quite sure which architecture is best. It’s easier to experiment with various neural net architectures when the building blocks are simple.
This is probably what you're remembering: https://www.sciencedirect.com/science/article/pii/S089662732...
Well there's spiking neural networks (SNN)[1], which are modeled more closely to how neurons actually work.
Main obstacle is still, as far as I know, that there's no way to train an SNN as efficiently as a "regular" neural network, which lends itself very nicely to gradient descent and the like[2].
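To make the contrast concrete, here's a toy leaky integrate-and-fire neuron, the simplest spiking model SNNs build on. All the constants (time constant, threshold, input level) are illustrative choices, not from any particular paper:

```python
import numpy as np

# Minimal leaky integrate-and-fire (LIF) neuron. The membrane potential
# leaks toward rest, integrates input, and emits a discrete spike when
# it crosses threshold -- the discontinuity that makes SNNs awkward for
# plain gradient descent.
def simulate_lif(input_current, dt=1.0, tau=20.0, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0):
    v = v_rest
    spikes = []
    for t, i_in in enumerate(input_current):
        # Leak toward rest while integrating the input current.
        v += dt / tau * (-(v - v_rest) + i_in)
        if v >= v_thresh:          # threshold crossing -> spike
            spikes.append(t)
            v = v_reset            # hard reset after the spike
    return spikes

# Constant drive above threshold produces a regular spike train.
spike_times = simulate_lif(np.full(200, 1.5))
```

The spike/reset step is not differentiable, which is exactly why backprop doesn't apply directly and people resort to surrogate gradients or conversion tricks.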
An infinitely-fast computer wouldn't meaningfully change the "expensive training vs fast, static inference" workflow that neural networks have always been developed around (except in the most brute force-y "retrain on the entire world, every single nanosecond" sense).
The brain is supremely efficient at what the brain has evolved to do. It's almost tautological: if it weren't, it wouldn't have evolved that way.
Silicon is an alien substrate, and it's emulating. Even with the best algorithms there has to be a limit on how efficient a computer-based intelligence can be without changing how the chips work.
You could spin it around and say, well computers are better at many things than humans, and there is no way you could get a biological brain to be as good for the same amount of power (e.g. a raspberry pi can do calculations our brain couldn't possibly do).
The primary difference, and likely the reason that brains are unreasonably effective, is the specifics of the architecture and internal representations (in the rigorous, information-theoretic sense) of its computational systems. It's not quite analog but it uses analog means. It's not quite digital but it does process via abstractions.
You can still reasonably call the brain a "computer" if you decide it can shed the laden history of that word and its close association with binary operations using transistors. You can do so because it uses internal structures to process inputs and emit outputs. But like I said above, it requires a generalized interpretation of the word to start to understand where and how the two fields of study may be unified.
Neural networks fundamentally aren't designed to be otherwise. The workflow that has guided their entire development for over a decade is based around expensive training and static inference.
Let’s say a single A100 has a peak power draw of 250W, and you need 100 to train an LLM. So each hour of training consumes 25,000 Wh of energy. 15 MWh / 25,000 W = 600 hours, or 25 days, which is probably pretty close to the true training time.
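The arithmetic above can be sanity-checked in a few lines (using the same assumed figures: 250 W per A100, 100 GPUs, a 15 MWh budget):

```python
# Back-of-the-envelope check of the numbers above. All inputs are the
# assumed figures from the comment, not measured values.
gpus = 100
watts_per_gpu = 250                      # assumed peak draw of one A100
total_watts = gpus * watts_per_gpu       # 25,000 W = 25 kW
budget_wh = 15_000_000                   # 15 MWh expressed in watt-hours

hours = budget_wh / total_watts          # Wh / W = hours
days = hours / 24
print(hours, days)                       # 600.0 25.0
```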
So the numbers are actually pretty close. But a human brain doesn’t start out as a set of random weights like an LLM. The human brain has predefined structure that’s the result of an extremely long evolutionary process.
To me that just means nobody has figured out how to do that effectively. The majority will simply make use of what's been done and proven, so we got a plateau at object recognition, and again at generative AI (with applications in several domains). One problem with continuous adaptation and learning is providing an "entity" and "environment" for it to "live" in while doing the adaptive learning. Some researchers are doing that, either with robots or with simulations. That's much harder to set up than a lot of cloud compute resources. I do agree with you that these aspects are missing and things will be much more interesting when they get addressed.
Please don't claim things the author didn't. What I read was "ergo (artificial) neural networks may be missing a trick"
I mean, sure, but the topology is exactly what makes both work, so we only really care about the topology.
Agreed, a widespread category error.
But computers still do some pretty cool things. Powerful tools.
Their entire article hinges on the complaint "brain seems shallow and neural networks are deep, ergo neural networks are doing it wrong."
Neurologists seem to have a really hard time comprehending that researchers working on neural networks aren't as clueless about computers as neurology is about the brain. They also vastly overestimate how much engineers working on neural networks even care about how biological brains work.
Virtually every attempt at making neural networks mimic biological neurons has been a miserable failure. Neural networks, despite their name, don't work anything like biological neurons and their development is guided by a combination of
A) practical experimentation and refinement, and
B) real, actual understanding about how they work.
The concept of resnets didn't come from biology. It came from observations about the flow of gradients between nodes in the computational graph. The concept of CNNs didn't come from biology, it came from old knowledge of convolutional filters. The current form and function of neural networks is grounded in repeated practical experimentation, not an attempt to mimic the slabs of meat that we place on pedestals. Neural networks are deep because it turns out hierarchical feature detectors work really well, and it doesn't really matter if the brain doesn't do things that way.
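The gradient-flow observation behind resnets is easy to demonstrate. Here's a toy residual block, y = x + f(x), with deliberately tiny weights; everything about it (sizes, the tanh, the scale 0.01) is an illustrative choice:

```python
import numpy as np

# Toy residual block: y = x + f(x). The identity path is the point --
# the Jacobian is I + df/dx, so even when f's weights are near zero
# (as they can be early in training, or deep in a stack), the signal
# and its gradient still pass through the "+ x" term.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(4, 4))   # deliberately tiny weights

def f(x):
    return np.tanh(W @ x)

def residual_block(x):
    return x + f(x)

# Finite-difference Jacobian of the block at a random point: it should
# be close to the identity matrix despite W being near zero.
x = rng.normal(size=4)
eps = 1e-6
jac = np.empty((4, 4))
for j in range(4):
    e = np.zeros(4); e[j] = eps
    jac[:, j] = (residual_block(x + e) - residual_block(x - e)) / (2 * eps)
print(np.round(jac, 3))
```

Without the skip connection the same block's Jacobian would be roughly zero, and stacking many of them would starve earlier layers of gradient.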
And then you have the nitwits searching the brain for transformer networks. Might as well look for mercury delay line memory while you're at it. Quantum entanglement too.
There are insights that can come from studying the brain, that do indeed apply. Some researchers may not glean anything from such studies, and some may. I have no doubt that as neural networks get more and more powerful, we will continue to find more ways they are similar to the brain, and apply things we've learned about the brain to them.
I certainly prefer to see people making comparisons of neural networks to the brain than the old "it's just a glorified autocomplete" and the like.
Relax.
1. https://braininitiative.nih.gov/sites/default/files/document...
I certainly have many critiques of methods used in neuroscience right now (as a working neuroscientist), but to reduce those to the conclusion that the entire project of neuroscience is hopeless is absurd. We understand certain things quite well, actually, and it's not at all obvious what "understanding" at a larger scale would look like. It is very possible that the brain is irreducibly complex, and that the model you would need to construct to describe it would itself be so complex as to be useless in providing insight. Considering that the brain is by far the most complex object in the universe, I think we're doing pretty well.
Furthermore, there is quite a lot of disagreement about the utility of connectomics. Outside of the extremists (Sebastian Seung and his ilk), no one thinks that connectomics is going to be the key that brings earth-shattering insight. It's just another tool. There is already a complete connectome for part of the drosophila brain (privately funded, btw), which is in daily use in many fly labs. It tells you which other neurons a given neuron is connected to. Incredibly useful. Not earth-shattering.
also you might want to measure the neuroscience funding you deem wasteful up against the tens of billions NASA is spending to send humans (and not robots) back to the moon for "the spirit of adventure". cold war's over. robots will do just fine for the moon.
It seems the whole point is to bring in additional details of how brains work that they think may be relevant to artificial NNs.
Lots of graph nodes, with weighted connections, performing distributed computation (mainly hierarchical pattern matching), learning from data by gradually updating weights, using selective attention (and/or recurrence, and/or convolutional filters).
Which of the above is not happening in our brains? Which of the above is not biologically inspired?
In fact this description equally applies to both a brain and GPT4.
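The ingredients listed above fit in a dozen lines. Purely illustrative: a single "neuron" with weighted connections and a nonlinearity, learning the AND function by gradually nudging its weights along the gradient:

```python
import numpy as np

# Weighted connections + nonlinearity + gradual weight updates from
# data: the shared vocabulary of the description above, in miniature.
# A single logistic unit learning AND (a toy, linearly separable task).
rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([0, 0, 0, 1], float)        # AND truth table
w, b, lr = rng.normal(size=2), 0.0, 1.0

sigmoid = lambda z: 1 / (1 + np.exp(-z))
for _ in range(5000):
    p = sigmoid(X @ w + b)               # weighted sum -> nonlinearity
    grad = p - y                         # dLoss/dlogit for cross-entropy
    w -= lr * X.T @ grad / len(X)        # gradually update the weights
    b -= lr * grad.mean()

print(np.round(sigmoid(X @ w + b)))      # learned AND outputs
```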
Would you have preferred I emulate your style, and complain while providing no support for my complaint?
Ok.
> The concept of CNNs didn't come from biology
I just opened a survey paper on CNNs and literally the first sentence of the paper reads:
> “Convolutional Neural Network (CNN) is a well-known deep learning architecture inspired by the natural visual perception mechanism of the living creatures. In 1959, Hubel & Wiesel [1] found that cells in animal visual cortex are responsible for detecting light in receptive fields. Inspired by this discovery…”
Source: https://arxiv.org/pdf/1512.07108.pdf
The C in CNN isn't "Convolution" for no reason. It came from work with convolutional filters (yay Sobel kernels!), which at its height gave us filter banks, Gabor filters, and so on, before neural networks pretty much killed off handcrafted feature development. Every explanation of how CNNs work still falls back to the original convolutional kernel intuition.
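That intuition is a few lines of code. Here's the classic handcrafted starting point: convolving an image with a Sobel kernel to pick out vertical edges. A CNN learns kernels like this instead of having them designed by hand (the 5x8 test image here is made up for illustration):

```python
import numpy as np

# Hand-designed Sobel kernel for vertical edges -- the kind of filter
# that predates CNNs, and that CNNs now learn from data.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], float)

def conv2d_valid(img, k):
    """Plain 'valid' 2D convolution (really cross-correlation), no padding."""
    kh, kw = k.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

# Test image: dark left half, bright right half. The filter responds
# strongly only where the vertical edge sits.
img = np.zeros((5, 8)); img[:, 4:] = 1.0
edges = conv2d_valid(img, sobel_x)
print(edges)
```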
Before that: comparing the brain with hydraulic machines. There has been a tendency to compare the brain with the most complex machine known to us at any particular time.
"Descartes was impressed by the hydraulic figures in the royal gardens, and developed a hydraulic theory of the action of the brain. We have since had telephone theories, electrical field theories, and now theories based on computing machines… . We are more likely to find out how the brain works by studying the brain itself, and the phenomenon of behavior, than by indulging in far-fetched physical analogies." -- Karl Lashley 1951
There's little sense in ignoring the whole basic mode of operation, physics, chemistry and biology of the brain in order to analogise it to another system without any of those properties.
This, at best, provides a set of inspirations for engineers -- it does nothing for science.
Sure there is. People had a feel for it back in "clockworks" times, nowadays we have a much better grasp because of progress of physics and math, particularly CS - mode of operation is an implementation detail. Whatever the mode, once you understand the behavior enough to model it in computational terms, you can implement it in anything you like - gears and levers, pistons, water flowing between buckets, electrons in silicon, photons going through lenses, photons diffusing through metamaterials, sound waves diffusing through metamaterials - and yes, also via a person locked in a room full of books telling them what to draw in response to a drawing they receive, and also via a billion kids following a game to the letter, via corporate bureaucracy, via board game rules, etc.
Substrate. Does. Not. Matter.
The only thing limiting your choice here is practical one. Humanity is getting a good mileage out of electrons in silicon, so that's the way to go for now. Gears would work too, they're just too annoying to handle at scale.
Of course, today we don't have a full understanding of the biological substrate - we can't model it fully in terms of computation, because it's a piece of spontaneously evolved nanotech and we've barely begun being able to observe things at those scales. We have a lot of studying in front of us - but this is about learning how the gooey stuff ticks, what it computes and how. It's not about some new dimension of computation.
The deepest fundamental structures in the brain[0] are quantum fields, which are also the deepest fundamental structures in everything else.
There is no known quantum field of "soul" or "intelligence".
The right abstraction is higher, and could still be a whole lot of things; but as maths can be implemented in logic, which can be implemented in electronics or clockwork or hydraulics, it doesn't matter what analogy is used — and my mild disagreement here is that such inspiration has been useful and gotten us this far.
[0] that we know of
[1] - https://en.wikipedia.org/wiki/Convolutional_neural_network
It's not surprising that we found out later the brain also uses such a fundamental element of signal theory.
[1] https://www.amazon.com/Biophysics-Computation-Information-Co...
But in reality, we’re equipped exactly to exist, and we still wonder why in a backwards way, even with education (guilty!)
AI is the task of playing God like toddlers at recess, and LLMs the tower of babel. I still wanna play, it’s fun
Second, there is no need to compare brains to neural networks because brains are neural networks. Neurons form the vertices and axons the edges connecting them. What you are perhaps thinking of are artificial neural networks - most of which are very dissimilar to brains. But even then you are wrong. Artificial Izhikevich and Hodgkin-Huxley neural networks attempt to closely mimic the behavior of real neurons.
While deep, hierarchical artificial neural networks have been more successful than biologically plausible ones, that may be because the technology isn't ready yet. After all, the perceptron was invented in the 1950s but didn't become prominent until the 2010s (or so). Perhaps we need new memories that better map to (real) neural network topologies, or perhaps 3D chips that can pack transistors the way brains pack neurons.
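For reference, the Izhikevich model mentioned above fits in a handful of lines. This is a straightforward Euler simulation of the published equations with the standard "regular spiking" parameters; the input current I=10 and the 1-second duration are arbitrary choices:

```python
# Izhikevich neuron model (Izhikevich, 2003), "regular spiking" regime:
#   v' = 0.04 v^2 + 5 v + 140 - u + I
#   u' = a (b v - u)
#   spike when v >= 30 mV: v <- c, u <- u + d
a, b, c, d = 0.02, 0.2, -65.0, 8.0      # standard RS parameters
v, u = -65.0, b * -65.0                 # start at the resting state
I, dt = 10.0, 0.25                      # assumed drive, Euler step (ms)

spikes = 0
for step in range(int(1000 / dt)):      # simulate ~1000 ms
    v += dt * (0.04 * v * v + 5 * v + 140 - u + I)
    u += dt * a * (b * v - u)
    if v >= 30.0:                       # spike: reset v, bump u
        v, u = c, u + d
        spikes += 1
print(spikes)
```

With I=10 the system has no stable resting point, so the neuron fires tonically - the regular-spiking behaviour the model is named for.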
Changes in mechanical pressure, electric fields, the attachment of other molecules, or photon absorption can control the conductivity.
Organic semiconductors designed to fit like lego bricks to naturally build the desired structure are IMHO the way to go to produce 3d circuits, rather than layered silicone lithography.
I've seen this particular mistake a lot recently. New and exciting auto-corrupt from the latest version of iOS?
Given that our brains rewire themselves live, which ANNs can only do by being excessively connected and updating weights to/from zero, silicone (I'm thinking mainly the oil form) may be a better inspiration than lego.
As a developmental neuroscientist, I found the article insightful and thought provoking. Further, it is quite consistent with major hypotheses in psychology, how the hippocampus works (a subcortical structure) and combines information into memories: See fuzzy trace theory [1], for example.
Your dismissive tone is unappreciated, ill-informed, and crass.
Value of this comment aside, it kind of makes me chuckle how casually it (and other comments in this thread) just drops the word "artificial" from neural networks here, specifically when comparing with neurology. The irony is funny. Like, somehow we've forgotten why we call them that in the first place, exactly when talking about the thing that inspired the approach.
There are things the brain does that we have not yet been able to reproduce with a neural network, or have only reproduced with seemingly excessive training resources and network size. Therefore there is some salient feature of neurology that has been overlooked. I don't think it is necessary to mimic biology down to the exact function of real neurons, but there must in fact be something we are neglecting to mimic.
"Book smart, not street smart" (to use a catchphrase) would apply perfectly to GPT models: brain the size of a rodent's, with 50,000 year's experience of reading Reddit, Wikipedia, and StackOverflow, but no "real life" experiences of its own.
It is more useful to use AI to develop more ecologically valid measurement methods for biology.
Even theoretically, no they can't. They can theoretically model any continuous function.
Plus, even for continuous functions, the theorem only proves that, for any function, there exists some NN that approximates it to arbitrary precision. It is not known whether there is some base NN + finite training set that could be used to arrive at that target NN using some algorithm in a finite number of steps.
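The flavor of that existence result is easy to show by construction rather than training: a one-hidden-layer ReLU network that interpolates f(x) = x² on [0, 1]. The knot count and target function are arbitrary illustrative choices; the theorem promises such a network exists, not that any algorithm finds it:

```python
import numpy as np

# Universal approximation by construction: express the piecewise-linear
# interpolant of f(x) = x**2 at n knots as a single hidden layer of
# ReLU "hinges":  g(x) = f(0) + sum_i c_i * relu(x - t_i),
# where c_i is the change in slope at knot t_i.
def relu(z):
    return np.maximum(z, 0.0)

n = 32
knots = np.linspace(0.0, 1.0, n + 1)
f = knots ** 2
slopes = np.diff(f) / np.diff(knots)               # slope per segment
coeffs = np.concatenate([[slopes[0]], np.diff(slopes)])  # slope changes

def g(x):
    # One hidden ReLU layer (n units), one linear output unit.
    return f[0] + np.sum(coeffs * relu(x[:, None] - knots[:-1]), axis=1)

xs = np.linspace(0.0, 1.0, 1000)
err = np.max(np.abs(g(xs) - xs ** 2))
print(err)   # shrinks as n grows (roughly like 1/n**2 here)
```

Doubling n roughly quarters the error, but nothing in this construction tells you how to reach those weights by gradient descent from a random start - which is exactly the gap the comment above points at.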
Of course you then need to compensate with residuals, initialisation, normalisation, and all that, but it’s a small price to pay for scaling much much better with compute.
This is why today, if you need a low-latency NN, which means a shallow one, often your best bet is to train a deep one first and then distill or prune it down into a shallow one. Because the deep one is so much easier, while training a shallow one from scratch without relying on depth may be an open research question and effectively impossible.
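A minimal sketch of that distillation step, with a stand-in teacher. Everything here is illustrative: the "teacher" is just a fixed function playing the role of a trained deep net, and the shallow student is fit to the teacher's outputs rather than to raw labels:

```python
import numpy as np

# Distillation sketch: fit a shallow "student" to match the outputs of
# a "teacher" (here a fixed function standing in for a pretrained deep
# model). Sizes, learning rate, and iteration count are arbitrary.
rng = np.random.default_rng(0)

def teacher(x):                          # stand-in for a trained deep net
    return np.sin(3 * x)

# Shallow student: one hidden tanh layer, trained by full-batch
# gradient descent on the teacher's outputs.
W1 = rng.normal(size=(1, 16)); b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1)); b2 = np.zeros(1)
x = np.linspace(-1, 1, 256)[:, None]
t = teacher(x)                           # soft targets from the teacher
lr = 0.05
for _ in range(3000):
    h = np.tanh(x @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - t                       # match the teacher, not labels
    gW2 = h.T @ err / len(x); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)     # backprop through tanh
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = float(np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - t) ** 2))
print(mse)
```

Real distillation matches logits or softened class probabilities rather than a scalar function, but the structure is the same: the student's training signal comes from the bigger model, not from the dataset directly.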
It doesn't. You can speak perfectly fine with children. And in fact some teenagers think they know everything.
As I understand it the thalamus is basically a giant switchboard though. I see no reason to believe that it never connects the output of one cortical area to the input of another, thus doubling the effective depth of the neural network. (I haven’t read this paper though, as it was behind a paywall.)
Your comment would be very valuable to me if it included pointers to better sources. I have sufficient background to see gaps in Jeff's book, and would be interested in exploring these, perhaps through the references you seem to be aware of.