The best evidence for this is a picture(1) from page 6 of the paper. Look at the second row. The buildings generated by 'mind reading' subjects 2 and 4 look strikingly similar to each other, but not very similar to the ground truth! By manually combing through the training dataset, I found a picture of a building that does look like that, and by scaling it down and cropping it exactly in the middle, it overlays rather closely(2) on the output that was ostensibly generated for an unrelated image.
If so, at most they found that looking at similar subjects lights up similar regions of the brain, and putting Stable Diffusion on top of that serves no purpose. At worst, it's entirely cherry-picked coincidences.
I think the confusion is that this model is generating “teddy bear” internally, not a photo of a teddy bear. I.e. the diffusion part was added for flair, not to generate the details of the images that exist inside your mind. They could just as easily have run print(“teddy bear”), but they’re sending it to diffusion instead of printing it to console.
The fact that it can correctly discern between a dozen different outputs is pretty remarkable. And that’s all that this is showing. But that’s enough.
It’s not really a “gotcha” to say that it’s showing an image from the training set. They could have replaced diffusion with showing a static image of a teddy bear.
It sounds like this is many readers’ first time confronting the fact that scientists need to do these kinds of projects to get funding. As long as they’re not being intentionally deceptive, it seems fine. There’s a line between this and that ridiculous “rat brain flies plane” myth, and this seems above it.
Disclaimer: I should probably read the paper in detail before posting this, but the criticism of “the building looks like a training image” is mostly what I’m responding to. There are only so many topics one can think about, and having a machine draw a dog when I’m thinking about my dog Pip is some next-level sci-fi “we live in the future” stuff. Even if it doesn’t look like Pip, does it really matter?
Besides, it’s only a matter of time until they correlate which parts of the brain are more prone to activating for specific details of the image you’re thinking about. Getting pose and color right would go a long way. So this is a resolution problem; we need more accurate brain-sampling techniques, e.g. Neuralink. Then I’m sure diffusion will get a lot more of those details correct.
Even if we do a massive goalpost move and grant that the system is only identifying the label "dog" from a brain scan of a person looking at a dog, we would need to see actual statistics on its labelling accuracy before judging it that way. If the images in the paper are cherry-picked(1), the system could easily be extracting anywhere from a handful of bits to no bits at all, and the entire thing could very well turn out to be replicable from random noise.
(1) Note that the paper even states "We generated five images for each test image and selected the generated images with highest PSMs [perceptual similarity metrics].", so it directly admits that the presented images are cherry-picked at least once.
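To put a number on how much that best-of-five selection can flatter the results, here's a minimal sketch (Python; the similarity scores are random placeholders, not real PSM values) showing that even a model that knows nothing gets a sizeable boost from keeping only its best generation:

    import numpy as np

    rng = np.random.default_rng(42)
    n_test, n_candidates = 982, 5   # 982 test images, five generations each

    # Null model: each generated image's similarity to the ground truth
    # is pure noise, drawn from a standard normal distribution.
    scores = rng.standard_normal((n_test, n_candidates))

    print(scores[:, 0].mean())        # one sample per image:   ~0.0
    print(scores.max(axis=1).mean())  # best of five per image: ~1.16

The best-of-five mean sits well above the single-sample mean purely from selection, which is why accuracy statistics over all generations matter more than the showcased images.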
If you train a model where the input is an integer between 1 and 10, and the output is a specific image from a set of ten, the model will be able to get zero loss on the task. That is what's happening here.
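A minimal sketch of that memorization failure mode (the image data here is random, purely for illustration):

    import numpy as np

    rng = np.random.default_rng(0)

    # Ten fixed "training images", keyed by an integer ID from 1 to 10.
    train_images = {i: rng.random((64, 64)) for i in range(1, 11)}

    def model(x):
        # A pure lookup table: zero training loss, nothing learned.
        return train_images[x]

    train_loss = sum(((model(i) - img) ** 2).mean()
                     for i, img in train_images.items())
    print(train_loss)   # exactly 0.0 on the training set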
It means there may be signal in the noise, even if it's overfitting. Which makes sense.
A sufficiently granular map of the human brain ought to be readable, if you know what the input and output signals are.
Theoretically, the results would scale to more training images... we just need to fMRI all of LAION-5B. Easy peasy.
I don't think you got that 2,770 correct. It might be 9,250 images, minus 982 (that one you got right). Then again, the paper is so badly written that I find it difficult to decipher what they did. From section 3.1:
> Briefly, NSD provides data acquired from a 7-Tesla fMRI scanner over 30–40 sessions during which each subject viewed three repetitions of 10,000 images. We analyzed data for four of the eight subjects who completed all imaging sessions (subj01, subj02, subj05, and subj07).
> We used 27,750 trials from NSD for each subject (2,250 trials out of the total 30,000 trials were not publicly released by NSD). For a subset of those trials (N=2,770 trials), 982 images were viewed by all four subjects. Those trials were used as the test dataset, while the remaining trials (N=24,980) were used as the training dataset.
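For what it's worth, the per-subject trial counts in the quoted passage do add up; a quick sanity check using only the figures stated above:

    # Per-subject trial accounting from section 3.1, as quoted above.
    total_trials = 30000
    not_released = 2250
    available    = total_trials - not_released   # 27,750 trials used
    test_trials  = 2770                          # cover the 982 shared images
    train_trials = available - test_trials
    print(train_trials)       # 24980, matching the paper's N=24,980
    print(test_trials / 982)  # ~2.82 average presentations per shared image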
https://www.biorxiv.org/content/10.1101/2022.11.18.517004v2....
> The only training required in our method is to construct linear models that map fMRI signals to each LDM component, and no training or fine-tuning of deep-learning models is needed.
> ...
> To construct models from fMRI to the components of LDM, we used L2-regularized linear regression, and all models were built on a per subject basis. Weights were estimated from training data, and regularization parameters were explored during the training using 5-fold cross-validation.

Either way, if I'm understanding right, it's very impressive. If the only input to the model (after training) is an fMRI reading, and from that it can reconstruct an image, at the very least that shows it can strongly correlate brain patterns back to the original image.
It'd be even cooler (and scarier?) if it works for novel images. I wonder what the output would look like for an image the model had never seen before. Would a person looking at a clock produce a roughly clock-like image, or would it be noise?
All the usual skepticism to these models applies, of course. They are very good at hallucinating, and we are very good at applying our own meaning to their hallucinations.
Edit: found it! https://youtu.be/nsjDnYxJ0bo
Anyway, the images depicted in this work of fiction, shot in 1990 about "the future" of 2000, had a very interesting look to them: kind of distorted and dreamy, like the images in the paper.
Are the images in the paper just a case of overfitting? ¯\_(ツ)_/¯ But it still makes me giddy remembering the Wim Wenders film.
The human mind is considered the only place where we have true privacy. All these efforts are taking that away.
At this rate all notions of privacy will soon be dead.
I'm surprised it didn't seem to go anywhere.
Edit: found it https://youtu.be/RuUSc53Xpeg
> What exactly is “silent speech”? Does the user have to move his or her face or mouth to use the system?
> Silent speech is different from either thinking of words or saying words out loud. Remember when you first learned to read? At first, you spoke the words you read out loud, but then you learned to voice them internally and silently. In order to then proceed to faster reading rates, you had to unlearn the “silent speaking” of the words you read. Silent speaking is a conscious effort to say a word, characterized by subtle movements of internal speech organs without actually voicing it.
> Can this device read my mind? What about privacy?
> No, this device cannot read your mind. The novelty of this system is that it reads signals from your facial and vocal cord muscles when you intentionally and silently voice words. The system does not have any direct and physical access to brain activity, and therefore cannot read a user's thoughts. It is crucial that the control over input resides absolutely with the user in all situations, and that such an interface not have access to a user's thoughts. The device only reads words that are deliberately silently spoken as inputs.
https://www.media.mit.edu/projects/alterego/frequently-asked...
Would be useful if I lost the ability to write or speak, for whatever reason.
We are a long way away from worrying about this.
Cheap cameras everywhere on the other hand...
No, the delusional, shortsighted, and revenue-driven SV startup culture doesn't give a shit about such 'technophobic trivialities'.
Graduate School of Frontier Biosciences, Osaka University, Japan
Of course, but what's to be done about it? Should we outlaw research like this?
It’s not hard to imagine some really terrible ways this can be used.
A bit of preemptive legislation might be wise as AI is advancing so rapidly.
Of course, these "advances" will be praised greatly in MSM as providing great benefits for mutes, "harmonious society", and whatever else happens to be the virtue-signaling fad of the moment.
I could imagine improvements since then, especially with advances in image networks.
1. https://www.cell.com/current-biology/fulltext/S0960-9822(11)...
2. https://www.youtube.com/watch?v=nsjDnYxJ0bo
Edit: Just realized the paper above is also from Shinji Nishimoto
What's astonishing here is the quality of the reconstruction. But I have not seen this research referenced much. Does someone know how/why the reconstruction from a monkey brain looks so perfect while we don't have anything close from a human brain?
Edit: better images here https://www.newscientist.com/article/2133343-photos-of-human...
Oh don't worry, this will get wrapped up in some pseudoscience bullshit and misleading statistics and marketed to law enforcement. But not to worry, at first it'll only be used on real bad criminals. If you have nothing to hide you have nothing to fear!
I suspect others are taking the wide view but I wanted to point out this direct answer to your question.
P.S. The use of "torture" is intended in contrast to the neutrality of clinical language on this topic, not as a hint at my judgement on the matter.
Here's one deontological perspective you could take:
It is always wrong to cause unnecessary suffering to others.
Now, the subjective terms to be considered here are "unnecessary" and "suffering."
It used to be a common belief that animals lacked the capacity to suffer as humans do. They could feel pain (nociception, sure), but whether it caused complex psychological suffering (torment) used to be contentious.
Today, this is certainly not contentious for our closest relatives. All primates possess a theory of mind, long memory, emotional states, complex social behavior like lasting bonds and altruism, and other traits necessary to suffer in significant ways.
So the focus (as long as we are considering primates; obviously nobody cares about model species like Drosophila, as they have a greatly diminished capacity to suffer (edit: although I should mention that ranking things by capacity to suffer leads to pretty awful territory too. E.g., if a human is severely mentally disabled, is it more permissible to experiment on them? I think most of us would say no, which raises the question of why it's okay to do so for other species)) shifts to whether causing them immense physical pain and torture is necessary.
And this is where I think things get pretty murky, and I will leave the rest up to you! I wish more people were curious about moral philosophy and creating their own consistent ethical framework for the world... I think it's especially important in science and engineering.
Of course, you could use a different model like utilitarianism, but utilitarianism still requires some level of deontological principles, or you end up with a pretty extremist moral philosophy (the same goes for pure Kantian deontology with no room for utilitarianism, IMO).
edit 2: Come to think of it, Jains would certainly have an issue with experimentation even on Drosophila, so I take back the claim that "nobody" cares. IIRC, they even wear a mouth covering to prevent swallowing and killing any insects that might accidentally fly into their mouths, and carry a specialized stick to gently move spiders and other insects out of the way. I know most people would scoff at that, but I think their deep respect for all forms of life is beautiful.
Even if this is just another bullshit article, I'm just making a point related to it. People need to be worried about this. For the first time in history, lots of people are creeped out by AI, but they aren't taking action or demanding change. We need regulation and grass-roots efforts to stop AI. Even if the only way humanity could abort AI as a concept, or delay it for a significant amount of time, were to return to the Iron Age (and it certainly isn't the only way), it would be unambiguously worth it, in every way and from every angle.
AI requires massive compute. What we are doing now was impossible just 20 years ago, if not 30. You can't manufacture that kind of compute in your garage; global regulation would take care of it, no problem. At the very least it would buy us an enormous amount of time to figure something else out. People always say that some holdout country would defy global regulations. They wouldn't defy NATO, let alone a super-global coalition. And the idea of such a coalition, or NATO, enforcing compute regulations is not far-fetched at all, because the emergence of AGI, or even advanced non-AGI, goes against the interests of literally every human being. There is no group of humans that ultimately benefits from it. The problem is simply waking people up to this plain fact.
> we need regulation
No, we don't. Regulation doesn't stop technological progress; it puts it in the hands of an elite few. And besides, there are 130+ regulatory jurisdictions. For example, the US government doesn't fund human cloning research, but that doesn't mean China won't fund it. Or perhaps you'd also like a one-world government that can jail anyone doing wrongthink on their GPU?
Personally, I hope we get AGI (in the most Kurzweilian sense) as soon as possible. It will lead to a Cambrian explosion of advancements across all fields of science. This is our best chance of cracking the secrets of the universe and answering fundamental questions, like whether FTL interstellar travel is truly impossible, or whether aging is really irreversible.
Imagine an intelligence unencumbered by the "technical debt" we've accrued over centuries of building our scientific model of the world. AGI could simulate infinitely many novel paths through the "tech tree" of human history, replaying our scientific discoveries and trying different assumptions. What if we had 12 fingers and mathematics started from a base-12 system? What if we could see in infrared? We would have followed entirely different scientific paths; AGI will be able to find what we missed.
Someone once said, "Do not see things as they are; see them as they might be." This quote is really about discovery and the human tendency to only see things in terms of what already exists. The implication is that humans have a big blind spot for seeing what's next; that's why we need a motivational quote to help us see things as they might be rather than simply as they are. It is true that there has never been a global regulation or ban like the one we are talking about, except maybe for ozone-depleting emissions. But by that same metric, AGI can never exist, because it has never existed before. It's a silly response and a complete waste of time. Even if such regulations had already existed for other things in the past, you would still be here saying it was impossible, just for some other reason. The key is to make up your mind last, not first.
Regulation can stop anything as long as it doesn't break the laws of physics. And, if you had read my comment, I explain why China wouldn't pursue AGI. Even if China did pursue AGI, they probably wouldn't be able to crack it; none of the major breakthroughs have come out of China.
"I hope we get AGI as soon as possible. [It will lead to many incredible things.]" You have no idea what AGI will lead to. You just cherry-pick all the cool stuff that would be possible while totally ignoring all the other implications. There would be an immediate and total power vacuum caused by the advancements. These advancements would be so huge that they would change the geopolitical equation beyond recognition. The concept of a country would probably become economically and geopolitically untenable. There would have to be a transition to an entirely new order in which the dominant meta-organisms aren't countries but some bizarre AGI conglomerate that looks like an expressionist painting compared to what we have now. The transition to this new world, whatever it looks like, would involve war, probably the biggest war that has ever happened. This is intrinsic and unavoidable; it cannot be disproved or denied. The fundamental economic and geopolitical equation that underlies the current equilibrium would change suddenly and violently.
The current world order will disappear; you will probably lose everything you own and everyone you love, along with your country. A global war will break out in which there is a high chance that all established rules of engagement are ignored. Weapons or methods that render the environment unlivable for humans will more than likely be used, because the dominant organisms and meta-organisms won't need humans in any practical sense. And after the dust settles and a new equilibrium is reached, the existence of humans will end very quickly (if it hadn't already), because we will offer nothing of value anymore, and if our existence presents the slightest inconvenience to the machines, they will allow us to die. And that is just the scenario where they are apathetic toward us. I have not even begun to discuss the repulsive, grotesque nature of our suffering if we are ever the subject of AGI malice. Those possibilities are always brushed aside as fear-mongering, so I don't even bring them up. But they should play into our decision to move forward or not.
At the very best, we will somehow manage to attach ourselves as parasites to the new machine meta-organisms and experience an existence with no agency or purpose other than to ogle at the machines. But that won't happen, because the machines will immediately embark on doing things that humans could never, ever understand.
"What if we had 12 fingers [...]." What if indeed. Perhaps I was too hasty... no cost is too high in pursuing the deeper mysteries of the universe.
Hi everybody! We’re Joe and Ahmed, and we’re super thrilled to be launching Human Diffusion today! We’ve built an exciting new image-generation system that supports economies in developing nations.
Our product leverages the latent creativity of humanity by directly fitting employees with fMRI rigs and presenting them with text inquiries through our API (JavaScript SDK available, Python soon!). Unlike competing alternatives we preserve human jobs in an era of AI supremacy.
I’d like to address rumors that our facilities amount to enslaving brains to machines. This is a gross misunderstanding of the benefits we offer to our staff: they are family. Our 18-hour shifts are finely calibrated based on feedback collected through our API, and any suggestion of exploitation is flatly untrue.
Send us an email (satire@humandiffusion.com) to get early access.
Please continue. /s Governments, three-letter agencies, and the like would be absolutely excited to see this. The future that no one asked for.