A brief funny story: my dad built the 5th computer in the UK, the ICCE [0], and one of the tests they wanted to run was statistical analysis over random number fields.
They approached the General Post Office (GPO), which had an RNG called "ERNIE" [1] that ran the postal investment bond lottery. ERNIE was a true RNG, based on noise from neon tubes (behaving much like avalanche diodes); it was basically a noise detector and A-to-D converter dressed up to look like a computer. They asked for a million truly random numbers to run some tests over. Interestingly, ERNIE was made by somebody who worked on Colossus at Bletchley.
The GPO refused to share a feed of numbers: they were concerned the team would discover some predictable pattern in the number field, and either destroy the Post Office bond scheme by revealing it, or use it to make millions.
[0] https://en.wikipedia.org/wiki/Imperial_College_Computing_Eng...
Going off on a tangent: an oft-neglected issue is that, even when the random source (like avalanche diodes) is actually sufficiently random, any apparatus that captures that randomness for use inherently biases the observations.
Even if everything else is perfect (it usually isn't), in signal-processing terms any observation window (e.g. a finite measurement interval) is an aperture that ends up convolved with the signal source being observed.
It sometimes helps to remove the resulting skew with a "whitening" post-processing pass.
Using real-life randomness is still a good thing to do, of course; it's just that there are always real-world issues with anything and everything.
From your ERNIE link:
> The designers were Tommy Flowers and Harry Fensom and it derives from Colossus, one of the world's first digital computers ...
https://en.wikipedia.org/wiki/Tommy_Flowers
https://en.wikipedia.org/wiki/Harry_Fensom
(I travelled to Canada from Australia in the 1980s and interviewed William Tutte about codebreaking and the war. I missed out on an opportunity to talk to Tommy Flowers the following year, and didn't return to the UK until after his death.)
Guess which WW2 activity used a very large number of relays, was built by the GPO (who used relays heavily in telephony), and was discontinued by the GPO at war's end as the encoding-and-decoding work scaled back, thereby flooding the UK electronics market with relays...
Tommy Thomas (head of the ERCC in Edinburgh, where my dad wound up and I therefore lived) worked at Manchester on the Mark 1. He never talked about it and, subsequently, I never asked. We both wound up in Australia at the CSIRO, where Radiophysics had been started by people from the radar world, and that bled into their interest in computing. It's a small world. He was the head of the IT division and I was a lowly researcher, so our paths didn't cross much. I wish I'd talked to him more about this stuff and computer history.
Also, development didn't stop with PCG. Modern PRNG work explores chaotic PRNGs (without a fixed period), which are often faster than non-chaotic ones. Notable examples are the Romu family [0] of PRNGs, sfc [1], and tylov's sfc derivative [2].
Another thing worth mentioning is that we have come full circle: the good old middle-square method, already used by von Neumann, has been found to work very well if you scale it up and add a Weyl sequence. [3]
Edit: And how could I forget, there has also been a lot of effort in using SIMD, e.g. by SHISHUA. [4]
Another thing to consider is how to efficiently map the generated numbers onto a given distribution. I'm not aware of any recent improvements there, other than some approximations that have probably been reinvented a bunch of times.
[0] https://www.romu-random.org/
[1] https://numpy.org/devdocs/reference/random/bit_generators/sf...
[2] https://github.com/tylov/STC/blob/master/docs/crandom_api.md
[3] https://arxiv.org/abs/1704.00358
[4] https://espadrine.github.io/blog/posts/shishua-the-fastest-p...
Edit: I had a few names mixed up
A new programmer reading this article would come away with the impression that, if they need random numbers, they should use xorshift or PCG, when in reality they should be calling getentropy(), or, if a syscall is too expensive, using a CSPRNG (e.g. ChaCha or BLAKE3) seeded with getentropy(). We now have RNGs that are both secure and really, really fast -- multiple GB/s fast -- so there are very few circumstances where a PRNG is truly necessary.
If one needs fast PRNGs, say for simulation, Monte Carlo work, etc., then CSPRNGs are a terrible idea: they're literally orders of magnitude slower than fast PRNGs. Almost nothing needs CSPRNGs (only things needing crypto-level security, which is a tiny fraction of PRNG uses).
In short, use the right tool for the job.
In fact, there is! The idea that your kernel has some finite amount of entropy that can be "used up" is a persistent myth. See https://www.2uo.de/myths-about-urandom/
Okay, it's technically not infinite, but 128 bits of "true entropy" is enough to seed a CSPRNG that will generate as many random numbers as you will ever need.
> Almost nothing needs CSPRNGs
This is how we end up with, e.g., hashmap implementations that use non-cryptographic hash functions, and then an attacker DoS's your server with 10,000 keys that all end up in the same bucket. Or "random" automated tests that always use the same PRNG with the same seed, meaning certain code paths will never be taken.
We used to live in a world where developers frequently had to choose between safety and performance. Except for a few niche applications, that is no longer the case: we can have both. I certainly agree that there are situations where maximum performance is required, and security is pointless -- but 99% of the time, developers should not even be asking that question. They should just be using the default RNG provided by their language, which should be a CSPRNG, not a PRNG.
A new programmer shouldn't be meddling in cryptography, so they probably need neither cryptographically secure pseudo-random numbers nor true random numbers. True random numbers are tricky.
The more popular library, rand, uses ChaCha and getentropy as you described.
I guess that's the best we had for students before the widespread adoption of computers?
https://en.wikipedia.org/wiki/A_Million_Random_Digits_with_1...
You have to understand what properties you actually care about and choose a (P)RNG that has those properties.
Perhaps this is because true randomness doesn't always appear random enough. I would argue that this property is what makes it real: sometimes true randomness is six dice all showing a six.
For many use cases in statistics / sampling / monte carlo simulations, you often need millions/billions of well-distributed random numbers with very low generation overhead.
Even things like game AIs care about performance with regards to the RNGs they use.
Say I have a function X f(X u); on which I want to iterate occasionally to get, in turn, f(u0), f(f(u0)), ...
The "smart" way to do this is:

    X u = f(u0); // initialize once
    ...
    X smart_f() {
        X w = u;
        u = f(w);
        return w; // this line is not stalled by the previous one, so a
                  // superscalar processor may be able to hide the latency
                  // of computing f(w)
    }
I doubt any compiler will be able to figure this out, and surely not if f has side effects.

Also, I learned to use a slide-rule in the sixties; I never touched one again until I inherited an antique a few years ago. Nobody was using slide-rules in the seventies, surely.