It’s not like we can throw away all the inductive biases and MSA machinery, someone upstream still had to build and run those models to create the training corpus.
My rough understanding of the field is that a "rough" generative model makes a bunch of decent guesses, and more formal "verifiers" ensure they abide by the laws of physics and geometry. The AI reduces the unfathomably large search space so the expensive simulation doesn't waste so much work on dead ends. If the guessing network improves, then the whole process speeds up.
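As a toy sketch of that "cheap guesser, expensive verifier" loop (all names here are illustrative stand-ins, not from any real folding codebase):

```python
# Illustrative only: a fast generator proposes candidates, and an expensive
# verifier is run only on that reduced set rather than the whole search space.
import random

def cheap_generator(n_guesses, rng):
    """Stand-in for a fast generative model: proposes candidate solutions."""
    return [rng.uniform(-10.0, 10.0) for _ in range(n_guesses)]

def expensive_verifier(x):
    """Stand-in for a physics/geometry check: accepts candidates near the
    true answer (here, arbitrarily, 3.0)."""
    return abs(x - 3.0) < 0.5

def search(n_guesses=1000, seed=0):
    rng = random.Random(seed)
    candidates = cheap_generator(n_guesses, rng)
    # Only the surviving candidates pay the verifier's cost.
    return [x for x in candidates if expensive_verifier(x)]

survivors = search()
```

A better generator concentrates its guesses near verifiable answers, so fewer verifier calls are wasted; that's the speedup the comment describes.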
- I'm recalling the increasingly complex transfer functions in recurrent networks,
- The deep pre-processing chains before skip connections.
- The complex normalization objectives before ReLU.
- The convoluted multi-objective GAN networks before diffusion.
- The complex multi-pass models before fully convolutional networks.
So basically, I'm very excited by this. Not because this itself is an optimal architecture, but precisely because it isn't!
Using MSAs might be a local optimum. ESM showed good performance on some protein problems without MSAs. MSAs offer a nice inductive bias and better average performance. However, the cost is doing poorly on proteins where MSAs are not accurate. These include B and T cell receptors, which are clinically very relevant.
Isomorphic Labs, Oxford, MRC, and others have started the OpenBind Consortium (https://openbind.uk) to generate large-scale structure and affinity data. I believe that once more data is available, MSAs will be less relevant as model inputs. They are "too linear".
Only if you are willing to call a billion years of evolutionary selection a "simple ruleset"
It seems like the Folding @Home project is still around!
In other words, it's a different approach that trades versatility for speed, but that trade-off is significant enough to make it viable to generate folds for practically any protein you're interested in. It moves folding from something that's almost computationally infeasible for most projects to something you can just do for any protein as part of a normal workflow.
2. The biggest difference between folding@home and alphafold is that folding@home tries to generate the full folding trajectory while alphafold is just protein structure prediction; only looking to match the folded crystal structure. Folding@home can do things like look into how a mutation may make a protein take longer to fold or be more or less stable in its folded state. Alphafold doesn’t try to do that.
https://www.distributed.net/RC5
https://en.wikipedia.org/wiki/RSA_Secret-Key_Challenge
I wonder what kind of performance I would get on an M1 computer today... haha
EDIT: people are still participating in rc5-72...?? https://stats.distributed.net/projects.php?project_id=8
[1] https://foldingathome.org/2024/05/02/alphafold-opens-new-opp...
Maybe these are just projects they use to test and polish their AI chips? Not sure.
Frankly, it's a great idea. If you are a small pharma company, being able to do quick local inference removes lots of barriers and gatekeeping. You can even afford to do some Bayesian optimization or RL with lab feedback on some generated sequences.
In comparison, running AlphaFold requires significant resources. And IMHO, their usage of multiple alignments is a bit hacky, makes performance worse on proteins without close homologs, and requires tons of preprocessing.
A few years back, ESM from Meta already demonstrated that alignment-free approaches are possible and perform well. AlphaFold has no secret sauce, it's just a seq2seq problem, and many different approaches work well, including attention-free SSMs.
and now I'm even more curious why they thought "light aqua" vs "deep teal" would be a good choice
The different colours are for the predicted and 'real' (ground truth) models. The fact that it is hard to distinguish is partly the - as you point out - weird colour choice, but also because they are so close together. An inaccurate prediction would have parts that stand out more as they would not align well in 3D space.
https://genomely.substack.com/p/simplefold-and-the-future-of...
But as with anything in research, it will take months and years to see what the actual implications are. Predictions of future directions can only go so far!
People often like to say that we just need one or two more algorithmic breakthroughs for AGI. But in reality it's the dataset and environment-based learning. Almost any model would do if you collected the data. It's not in the model; it's outside where we need to work.
Doing too many things at once makes methods hard to adopt and conclusions harder to draw. So we try to find simple methods that show measurable gains, so we can adapt them to future approaches.
It's a cycle between complexity and simplicity. When a new simple and scalable approach beats the previous state of the art, that just means we've discovered a new local-maximum hill to climb.
However, it seems like anyone can download the parameters for AlphaFold V2: https://github.com/google-deepmind/alphafold?tab=readme-ov-f...
Then why do we need customized LLM models, two of which seemed to require the resources of 2 of the wealthiest companies on earth (this and google's alphafold) to do it?
It's indeed a large model. But if you know the history of the field, it's a massive improvement. It has progressed from an almost "NP" problem only barely approachable with distributed cluster compute, to something that can run on a single server with some pricey hardware. The smallest model here is only 100M parameters and the largest is 3B parameters; that's very approachable to run locally with the right hardware, and easily within range for a small biotech lab (compared to the cost of other biotech equipment).
It's also (I'd argue) one of the only truly economically and socially valuable AI technologies we've found over the past few years. Every simulated protein fold saves a biotech company weeks of work by highly skilled biotech engineers and very expensive chemicals (in a way that truly supplements rather than replaces the work). Any progress in the field is a huge win for society.
This doesn't seem like particularly wasteful overinvestment.
Granted, I'm more excited about the research coming out of arc
Predicting the end result directly from the protein sequence is prone to missing any new phenomenon and would just regurgitate/interpolate the training datasets.
I would much prefer an approach based on first principles.
In theory folding is easy: it's just running a simulation of your protein surrounded by some water molecules for the same number of nanoseconds nature does.
The problem is that this usually takes a long time, because evolving the system requires computing its energy as a function of the positions of the atoms, which is a complex problem involving quantum mechanics. It's mostly due to the behavior of the electrons, but because they are much lighter, they operate on a faster timescale. You typically don't care about them, only about the effect they have on your atoms.
In the past, you would use various Lennard-Jones potentials for pairs of atoms when the atoms are unbonded, and other potentials when they are bonded, and it would get very complex very quickly. But now there are deep-learning-based approaches that compute the energy of the system with a neural network (see (Gromacs) Neural Network Potentials: https://rowansci.com/publications/introduction-to-nnps ). You train these networks to learn the local interactions between atoms from trajectories generated by ab-initio theories. This gives you a faster simulator which approximates the more complex physics. In a sense, it just tabulates, via a neural network, the effect the electrons would have in a specific atom arrangement according to the theory you have chosen.
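A minimal sketch of the classical energy computation described above, using the Lennard-Jones pair potential; a neural-network potential would replace the hand-written `pair_energy` with a learned function (everything here is illustrative, not any real MD package's API):

```python
import math
from itertools import combinations

def pair_energy(r, epsilon=1.0, sigma=1.0):
    """Classical Lennard-Jones potential for one unbonded atom pair.
    An NNP would swap this closed form for a trained network."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def total_energy(positions):
    """Sum pair energies over every atom pair (naive O(n^2) loop)."""
    e = 0.0
    for p, q in combinations(positions, 2):
        e += pair_energy(math.dist(p, q))
    return e

# Two atoms at the LJ minimum separation r = 2^(1/6) * sigma
# sit at the bottom of the potential well, energy -epsilon.
atoms = [(0.0, 0.0, 0.0), (2 ** (1 / 6), 0.0, 0.0)]
```

The simulator then uses (gradients of) this energy as forces to step the atoms forward in time.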
If at any point you have doubts, you can always run the slower simulator in a small local neighborhood to check that the effective-field neural-network approximation holds.
Only once you have a simulator that is able to fold can you generate a dataset of pairs "protein sequence" to "end of trajectory", to learn the shortcut like Alpha/Simple/Fold do. And when in doubt you can go back to the slower, more precise method.
If you had enough data and could perfectly train a model with sufficient representational power, you could theoretically infer the correct physics just from the correspondence between initial and final arrangements. But if you don't have enough data, it will just learn some shortcut, and you accept that it will sometimes be wrong.
No, the environment is important. Also, some proteins fold while being sequenced.
Folding can also take minutes in some cases, which is the real problem.
> which is a complex problem involving Quantum Mechanics
Most MD simulations use classical approximations, and I don't see why folding is any different.
Speeding up the folding is not the real problem; knowing what happens is. One way to speed up the process is just to minimize the free energy of the configuration (or some other quantity you derive from the neural-network potential). (That's what the game Foldit was about: minimizing the Rosetta energy function.) Another way would be to use a generative method like a diffusion model to generate a plausible full trajectory (but you need some training dataset to bootstrap the process). Or work with key configuration frames: the simulation can take a long time, but it passes through specific arrangements (the transitions between energy plateaus), and you learn these key points.
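The "just minimize the energy" shortcut can be sketched as plain gradient descent on a toy one-dimensional energy landscape; a real pipeline would descend the (neural-network) potential over all atomic coordinates instead, and the quadratic `energy` here is purely illustrative:

```python
# Toy "fold by minimizing energy": gradient descent on E(x) = (x - 2)^2,
# whose minimum stands in for the folded (lowest-energy) configuration.
def energy(x):
    return (x - 2.0) ** 2

def grad(x, h=1e-6):
    # Central-difference numerical gradient; an NNP framework would
    # supply analytic forces instead.
    return (energy(x + h) - energy(x - h)) / (2.0 * h)

def minimize(x0, lr=0.1, steps=200):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

x_min = minimize(10.0)  # converges toward the energy minimum at x = 2
```

The catch, as the comment notes, is that reaching a low-energy state tells you little about the trajectory or kinetics of getting there.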
The simulator can also be much faster because it doesn't have to consider every pair of atoms: the naive O(n^2) behavior drops to O(n), with n the number of atoms (and the bigger constant of running the neural network hidden inside the O notation).
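The reason a distance cutoff gives roughly O(n) scaling is that each atom only interacts with the handful of neighbors inside the cutoff radius, so the interaction count grows linearly with n. A small illustrative count (real codes use cell or neighbor lists to *find* those pairs without the O(n^2) scan shown here):

```python
import math

def count_pairs_within_cutoff(positions, cutoff):
    """Count interacting pairs under a distance cutoff (brute-force scan,
    for illustration only)."""
    count = 0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) <= cutoff:
                count += 1
    return count

# 100 atoms on a line with spacing 1.0; with cutoff 1.5 each atom only
# "sees" its immediate neighbors, so pairs scale ~linearly with n
# instead of the n*(n-1)/2 = 4950 total pairs.
chain = [(float(i), 0.0, 0.0) for i in range(100)]
```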
The simulations are classical, but fundamentally they rely on the shape of the electron clouds. The electron density can deform (that's what bonding is), providing additional degrees of freedom and allowing the atom configuration to slide more easily against itself and avoid getting stuck in local optima. Fortunately, all this mess is nicely encapsulated inside the neural network potential, and we can work without worrying about the electrons, their shape being implicitly defined by the current positions of the atoms (the faster timescale makes abstracting their behaviour via the implicit function theorem sound).
I am not trying to defend Apple or Siri by any means. I think the product absolutely should (and will) improve. I am just curious to explore why there is such negativity being directed specifically at Apple's AI assistant.
1. It seems to be actively getting worse. On a daily basis, I see it responding to queries nonsensically, like when I say “play (song) by (artist)” (I have Apple Music) and it opens my Sirius app and puts on a random thing that isn’t even that artist. Other trivial commands are frequently just met with apologies or searching the web.
2. Over a year ago Apple conducted a flashy announcement full of promises about how Siri would not only do the things that it’s been marketed as being able to do for the last decade, but also things that no one has seen an assistant do. Many people believe that announcement was based on fantasy thinking and those people are looking more and more correct every day that Apple ships no actual improvements to Siri.
3. Apple also shipped a visual overhaul of how Siri looks, which gives the impression that work has been done, leading people to be even more disappointed when Siri continues to be a pile of trash.
4. The only competitor that makes sense to compare is Google, since no one else has access to do useful things on your device with your data. At least Google has a clear path to an LLM-based assistant, since they’ve built an LLM. It seems believable that Android users will have access to a Gemini-based assistant, whereas it appears to most of us that Apple’s internal dysfunction has rendered them unable to ship something of that caliber.
And now that we have ChatGPT with voice mode, Gemini Live, etc., which have incredible speech recognition and reasoning by comparison, it's harder to still argue that "every voice assistant is bad".
If I could buy a phone without an assistant I would see that as a desirable feature.
Meanwhile, people expect perfection from Siri. At this point a new version of Siri will never live up to people’s expectations. Had they released something on-par with ChatGPT, people would hate it and probably file a class action lawsuit against Apple over it.
The entire company isn’t going to work on Siri. In a large company there are a lot of priorities, and some things that happen on the side as well. For all we know this was one person’s weekend project to help learn something new that will later be applied to the priorities.
I’ve made plenty of hobby projects related to work that weren’t important or priorities, but what I learned along the way proved extremely valuable to key deliverables down the road.
> We largely adopt the data pipeline implemented in Boltz-1 (https://github.com/jwohlwend/boltz; Wohlwend et al., 2024), which is an open-source replication of AlphaFold3
I believe the story here is largely that they simplified the architecture and scaled it to 3B parameters while maintaining leading results.