They have allegedly broken this barrier. There is a reason that science is conducted in the open, in a reproducible and traceable manner: those systems might not function properly at scale, or might never have run at an exaflop in double-precision compute.
Frontier is certainly the first publicly verified system to achieve exascale on the internationally accepted standard measurement (the HPL/LINPACK benchmark used by the TOP500 list).
https://www.tomshardware.com/news/chinese-exascale-supercomp...
Among other scientific purposes, of course. But the quiet part is that a lot of this comes down to simulating nukes (much like how the space program was really a nuke-delivery project).
These computers remain useful for other physics simulations, of course: atom-to-atom interactions, protein folding, weather modeling. So they also serve the scientific community.
So supercomputer speed went up 1000x from 2008 to 2022, but home computer speeds definitely did not go up that much; maybe around 10x. Does this mean there is more potential for home computers in the future?
Of course the supercomputers are massively parallel, but it's not like they got a building 100x larger, or did they?
[1] https://en.wikipedia.org/wiki/Roadrunner_(supercomputer)
A lot of it is simply scale: this machine has ~8 million cores, compared to ~12k full cores plus ~100k "processing units" on Roadrunner.
Secondly, we have fundamentally changed how we do computation by learning to utilise GPUs (and GPU-like architectures) better. This alone gives a far greater than 10x boost between 2008 and 2022.
A GeForce 9800 GTX from March 2008 had 432.1 GFLOPS of FP32 compute. An RTX 3060 from February 2021 sits at a similar price point and delivers 12.74 TFLOPS FP32. An RTX 3090 Ti reaches 40 TFLOPS FP32, roughly 100x the 9800 GTX.
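For concreteness, here is a quick sanity check of those ratios, using only the peak figures quoted above:

```python
# Peak FP32 throughput figures quoted above, in GFLOPS.
gtx_9800 = 432.1        # GeForce 9800 GTX, March 2008
rtx_3060 = 12_740.0     # GeForce RTX 3060, February 2021
rtx_3090_ti = 40_000.0  # GeForce RTX 3090 Ti

print(f"RTX 3060 / 9800 GTX:    {rtx_3060 / gtx_9800:.1f}x")     # → 29.5x
print(f"RTX 3090 Ti / 9800 GTX: {rtx_3090_ti / gtx_9800:.1f}x")  # → 92.6x
```

So "100x" is a round-up; the peak-throughput ratio is closer to 93x.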
With modern SAT solvers, or with historic (single-threaded) SAT solvers?
They're not so specialized in the scope of supercomputers.
Matrix multiplications underpin pretty much any "simulation of reality": a nuclear explosion, weather modeling, finite element analysis (aka simulated car crashes), protein folding, chemical atom-to-atom simulations, and more.
> how much faster would it be at SAT solving?
Dumb WalkSAT is embarrassingly parallel and would probably scale to GPUs very easily. I admit that I'm not in the SAT field, but there seems to be plenty of research into getting SAT onto GPUs.
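To show what I mean, here is a minimal sequential sketch of WalkSAT (toy code of my own, not a production solver). The point is that each random-restart run like this is fully independent, which is what would map onto separate GPU threads:

```python
import random

def walksat(clauses, n_vars, max_flips=10_000, p=0.5, seed=0):
    """Minimal WalkSAT. Clauses are lists of signed ints (DIMACS style:
    [1, -2] means x1 OR NOT x2). Returns a satisfying assignment
    (dict var -> bool) or None. Each restart/seed is an independent
    run, so thousands of them can execute in parallel."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    sat = lambda lit: assign[abs(lit)] == (lit > 0)
    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(sat(l) for l in c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)
        if rng.random() < p:
            # Random-walk move: flip an arbitrary variable of the clause.
            var = abs(rng.choice(clause))
        else:
            # Greedy move: flip the variable that leaves fewest clauses unsatisfied.
            def breaks(v):
                assign[v] = not assign[v]
                b = sum(not any(sat(l) for l in c) for c in clauses)
                assign[v] = not assign[v]
                return b
            var = min((abs(l) for l in clause), key=breaks)
        assign[var] = not assign[var]
    return None

# Tiny example: (x1 OR x2) AND (NOT x1 OR x2); every model has x2 = True.
model = walksat([[1, 2], [-1, 2]], n_vars=2)
print(model and model[2])  # → True
```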
I've personally been playing with BDDs (or really, MDDs for my particular application). Traversal of BDDs / MDDs is embarrassingly parallel, assuming your BDD / MDD is wide enough (which, for any "complicated" function, it should be).
BDD / MDD based traversal of SAT / constraint satisfaction is likely going to be a growing area of research, since the problems are closely related. I've also seen some BDD/MDD "estimate" methodologies, akin to how arc-consistency / path-consistency estimates guide a SAT/constraint solver.
In effect, if you build one BDD that underestimates the function and a second BDD that overestimates it, you can use the pair to get a lower bound and an upper bound. And those structures can scale from kilobytes to gigabytes, depending on how much or how little "estimation" you want.
Does such a thing exist yet? I don't think so. The papers I've read on this subject are from 2018, and they weren't from parallel programmers who know much about high-speed GPU programming. But I definitely believe there's synergy here if researchers combine the two fields: use the GPU to evaluate the BDD (embarrassingly parallel) at every step, guiding a sequential search over the SAT / constraint-satisfaction problem.
Using BDDs (instead of arc-consistency) as your search heuristic is still a relatively unexplored area, but it is "very obviously" a way to utilize a GPU if you know how they work.
-----------
Speaking of embarrassingly parallel: I'm pretty sure arc-consistency is also an embarrassingly parallel problem, but arc-consistency is too inflexible, leading to data structures that are "too small" (they can't efficiently utilize the GBs or TBs of RAM we have today).
Extending arc-consistency to path-consistency or k-consistency increases the size of the data structure (and therefore the parallelism and work involved), but not very smoothly.
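As a concrete baseline, here is a toy AC-3 arc-consistency sketch (my own illustration, not from any solver library). The sequential version below uses a work queue, but notice that every arc revision within one sweep only reads domains and prunes its own variable, which is the embarrassingly-parallel part:

```python
from collections import deque

def revise(domains, constraints, xi, xj):
    """Remove values of xi that have no supporting value in xj.
    In a parallel setting, every arc (xi, xj) can be revised
    independently in one sweep, with sweeps repeated to a fixpoint."""
    allowed = constraints[(xi, xj)]
    pruned = {a for a in domains[xi] if not any(allowed(a, b) for b in domains[xj])}
    domains[xi] -= pruned
    return bool(pruned)

def ac3(domains, constraints):
    """Classic sequential AC-3: re-enqueue neighboring arcs after a prune."""
    queue = deque(constraints)
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, constraints, xi, xj):
            queue.extend((xk, xi) for (xk, xm) in constraints
                         if xm == xi and xk != xj)
    return domains

# Toy CSP: x < y, both with domain {1, 2, 3}. AC-3 prunes x=3 and y=1.
doms = {"x": {1, 2, 3}, "y": {1, 2, 3}}
cons = {("x", "y"): lambda a, b: a < b,
        ("y", "x"): lambda a, b: b < a}
print(ac3(doms, cons))  # → {'x': {1, 2}, 'y': {2, 3}}
```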
Instead, the estimated BDD/MDD methodology seems to scale to different memory sizes much more gracefully: relaxed and restricted BDDs for the lower-bound and upper-bound estimates.
I was going to say that it's always great to see GPU machines performing well. Looking forward to seeing how far off theoretical peak the benchmark hit.