The author writes "Meanwhile, at least one influential researcher (whose work I respect) had harsh words publicly for her result", and then quotes some of these words:
Note that (smartly enough) the PCG author avoids
carefully to compare with xorshift128+ or xorshift1024*.
However, the author fails to note that said "influential researcher", Sebastiano Vigna, is the author of xorshift128+ and related PRNGs. In the linked test [2] by John D. Cook (who uses PractRand, a test suite in the same vein as the (obsolete) DIEHARD), xorshift128+ and xoroshiro128+ fail within 3 seconds, while PCG ran for 16 hours, producing 2 TB of pseudo-random numbers without any suspicious p-value detected.
On the other hand, Vigna claims that the xoroshiro family does "pass" PractRand.
I submitted an answer on StackOverflow a while ago [1] recommending xoroshiro and PCG, so I'd be concerned if PCG turns out to be flawed. It's actually quite hard to get academics in the field to give an authoritative recommendation (I've tried) - their response is typically along the lines of "It's complicated"...
[1] https://stackoverflow.com/questions/4720822/best-pseudo-rand...
[2] https://www.johndcook.com/blog/2017/08/14/testing-rngs-with-...
Edit: remove italics due to asterisk in PRNG name, & add link to John D. Cook's test.
O'Neill has instructions on how to test with PractRand and with TestU01 on her blog (http://www.pcg-random.org/blog/). I had a go with TestU01 on Vigna's generators, and when you test the low 32 bits reversed (for 64-bit PRNGs, you have to test the high 32, the low 32, both forwards and reversed), I found that all Vigna's generators fail.
Given the PractRand results it makes sense, I guess, but I had read that Vigna's generators were supposed to pass TestU01.
Does anyone else want to have a go at testing so I can know whether I screwed up somehow?
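For anyone wanting to reproduce this, the reversed streams can be extracted with a standard 32-bit bit reversal. A minimal sketch (my own helper functions, not code from O'Neill's or Vigna's repositories):

```cpp
#include <cstdint>

// Classic O(log n) bit reversal: swap adjacent bits, then bit pairs,
// then nibbles, bytes, and finally the two half-words.
uint32_t reverse32(uint32_t x) {
    x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);
    x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);
    x = ((x & 0x0F0F0F0Fu) << 4) | ((x >> 4) & 0x0F0F0F0Fu);
    x = ((x & 0x00FF00FFu) << 8) | ((x >> 8) & 0x00FF00FFu);
    return (x << 16) | (x >> 16);
}

// The four 32-bit streams to feed TestU01 from one 64-bit output:
// high half, low half, and each of those bit-reversed.
uint32_t high32(uint64_t v)     { return (uint32_t)(v >> 32); }
uint32_t low32(uint64_t v)      { return (uint32_t)v; }
uint32_t high32_rev(uint64_t v) { return reverse32(high32(v)); }
uint32_t low32_rev(uint64_t v)  { return reverse32(low32(v)); }
```

Each of these four streams can then be wrapped in a TestU01 `unif01_Gen` and run through the usual batteries.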
The code does explain exactly what the issue is, i.e. that the last bit isn't random:
This generator passes the PractRand test suite
up to (and included) 16TB, with the exception of binary rank tests,
which fail due to the lowest bit being an LFSR; all other bits pass all
tests. We suggest to use a sign test to extract a random Boolean value.
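The sign test the quote recommends amounts to reading the top bit, which is not part of the weak low-bit LFSR; a one-line sketch:

```cpp
#include <cstdint>

// Extract a Boolean from a 64-bit output via the sign bit (bit 63),
// avoiding the LFSR-structured lowest bit mentioned in the quote.
bool random_bool(uint64_t x) {
    return (int64_t)x < 0;   // true iff the high bit is set
}
```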
But I'm tempted to agree this isn't a desirable property for a generic RNG. How many users of JavaScript know about this property? (It's the default RNG for most browser engines.) Or does it not matter because they return 53-bit floats?
Most of the analysis is about the LCG or the final output. The suggested mixer is just
output = rotate64(uint64_t(state ^ (state >> 64)), state >> 122);
That's simple, and the insight in this paper is that something that simple helps a lot. I would have thought that you'd want a mixer where changing one bit of the input changes, on average, half the bits of the output. The mixer above won't do that. DES as a mixer would probably be better, but it's slower. The new result here is that something this simple passes many statistical tests. This isn't crypto-grade; both that mixer and an LCG are reversible with enough work.
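For readers who want to see the whole generator rather than just the mixer line, here is a sketch of the structure: a 128-bit LCG step followed by that xor-fold-and-rotate output. The multiplier and increment below are illustrative stand-ins, not necessarily the canonical PCG constants, and `__int128` is a GCC/Clang extension:

```cpp
#include <cstdint>

// Sketch: 128-bit LCG state, XSL-RR-style 64-bit output.
// Constants are illustrative, not the official PCG parameters.
// unsigned __int128 is a GCC/Clang extension.
struct Lcg128XslRr {
    unsigned __int128 state;

    explicit Lcg128XslRr(unsigned __int128 seed) : state(seed) {}

    static uint64_t rotr64(uint64_t x, unsigned r) {
        return (x >> (r & 63)) | (x << ((64 - r) & 63));
    }

    uint64_t next() {
        // LCG step (mod 2^128): state = state * mult + inc.
        const unsigned __int128 mult =
            ((unsigned __int128)6364136223846793005ULL << 64) | 1442695040888963407ULL;
        state = state * mult + 0x14057B7EF767814FULL;
        // XSL-RR: xor-fold the two halves, rotate by the top 6 bits.
        uint64_t folded = (uint64_t)(state >> 64) ^ (uint64_t)state;
        return rotr64(folded, (unsigned)(state >> 122));
    }
};
```

Note how cheap the output step is: one xor, one shift, one rotate. That cheapness relative to something like a block-cipher round is exactly the simplicity point above.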
Relevant quotes from the paper:
"But if you began reading the section with the belief that “linear congruential generators are bad” (a fairly widely-held belief amongst people who know a little about random number generation), you may have been surprised by how well they performed. We’ve seen that they are fast, fairly space efficient, and at larger sizes even make it through statistical tests that take down other purportedly better generators. And that’s without an improving step."
and
"Despite their flaws, LCGs have endured as one of the most widely used random-number generation schemes, with good reason. They are fast, easy to implement, and fairly space efficient. As we saw in Section 3.3, despite poor performance at small bit sizes, they continue to improve as we add bits to their state, and at larger bit sizes, they pass stringent statistical tests (provided that we discard the low-order bits), actually outperforming many more-complex generators. And in a surprise upset, they can even rival the Mersenne Twister at its principal claims to fame, long period and equidistribution."
"Nevertheless, there is much room for improvement. From the empirical evidence we saw in Section 3.3 (and the much more thorough treatment of L’Ecuyer & Simard [28], who observe that LCGs are only free of birthday-test issues if n < 16 p^(1/3), where n is the number of numbers used and p is the period), we can surmise that we may observe statistical flaws in a 128-bit LCG after reading fewer than 2^47 numbers (which is more than BigCrush consumes but nevertheless isn’t that many—an algorithm could plausibly use one number per nanosecond and 2^47 nanoseconds is less than two days)."
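The closing parenthetical is easy to sanity-check: 2^47 nanoseconds is about 140,737 seconds, or roughly 1.6 days. A quick helper (mine, just for the arithmetic):

```cpp
#include <cstdint>

// 2^exp nanoseconds expressed in days: (2^exp / 1e9) seconds / 86400.
double ns_pow2_to_days(unsigned exp) {
    double seconds = (double)(1ULL << exp) / 1e9;
    return seconds / 86400.0;
}
```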
But I'll confess to not really understanding what all the fuss is about insecure generators.
That's the paper, basically.
Based on things she's said on her site and in comments on John D. Cook's blog, it's all about algorithmic complexity attacks on randomized algorithms.
In other words, if you're doing quicksort on external input with a random pivot, and someone knows the PRNG state, they can make a pathological input that'll trigger quadratic behavior.
I don't know how likely this is to happen, but I know there were similar attacks on hash tables a few years ago.
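The core enabler is that a non-cryptographic PRNG seeded with a known value is completely reproducible, so an attacker who learns the seed (or reconstructs the state) can replay every pivot choice in advance. A minimal illustration of the victim/attacker symmetry, using a hypothetical `std::mt19937`-seeded quicksort rather than PCG:

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Whatever pivot indices a "victim" quicksort would draw from a seeded
// mt19937, an "attacker" with the same seed reproduces exactly, and can
// then arrange the input so each pivot is a worst-case choice.
std::vector<uint32_t> pivot_draws(uint32_t seed, int n) {
    std::mt19937 gen(seed);
    std::vector<uint32_t> draws;
    draws.reserve(n);
    for (int i = 0; i < n; ++i) draws.push_back(gen());
    return draws;
}
```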
As a tenured professor I want to say two things about this piece:
1. I think academic publishing will be forced to change. I'm not sure what it's going to look like in the end, but traditional journals are starting to seem really quaint and outdated now.
2. As far as I can tell from what she's written on the PCG page, the submission to TOMS is a poor example, because no one I know expects to be done with one submission. That is, no one I know submits a paper to one journal, even one reputable journal, and is done. They submit and it gets rejected and revise it and resubmit it, maybe three or even four times. After the fourth or fifth time, you might give up, but not necessarily even then.
I have mixed feelings about the PCG paper as an example, because in some ways it's great: an example of how something very influential has superseded traditional academic publishing. In other ways, though, it's horrible, because it's misleading about the typical academic publishing experience. Yes, academic publishing is full of random nonsense, and corruption, but yes, you can also get past it (usually) with just a little persistence. In still other ways, it's a good example of what we might see increasingly, which is a researcher having a lower threshold for the typical bullshit out there.
I think there are two topics here. One is whether academic research and work is becoming less relevant to practice. The other is whether the formalisms of academic-style publishing are becoming less relevant in a modern world with more and more venues for publishing, rating, and discovering work.
On the former, I believe that academic work is as relevant as ever. There are some areas (like systems) where I'm doubtful about relevance from the point of view of a practitioner, but other areas (like hardware and ML) where work remains extremely relevant. I haven't noticed a trend there over the last decade, except in some areas of systems where industrial practice tends to happen on cluster sizes that are often not approachable for academia.
On the latter, academic publication does indeed seem to be getting less relevant. There are other (often better) ways to discover work. There are other ways to tell whether a piece of work is relevant, or credible. There are other, definitely better, ways to publish and distribute work. In some sense I think this is a pity: as an academic-turned-practitioner I like academic-style publications. Still, I think they are going to either change substantially or die.
This article raises another very good point: sometimes the formalism of academic publication makes the work harder to understand, less approachable, or less valuable. That's clear harm, and it seems like this professor was right to avoid that.
- PCG is not crypto, everybody should understand that. It's for simulation and rendering.
- PCG mainly replaces the Mersenne Twister, which is in C++11. The Twister has a LOT more state and is a LOT slower for less randomness.
- In rendering and simulation speed really matters, and PCG excels there.
- Xorshift is another algorithm in the same class. I would really like to see an objective comparison. In my cursory engineering look PCG seemed better.
- Fast PRNG is almost a new field again: It's not crypto, but immensely useful. How did the Twister get into C++11 while it is so much worse than PCG or Xorshift? Nobody cared!
- Maybe PCG should have been a paper at SigGraph.
- For the style of the paper, I think one contribution is rethinking PRNG outside crypto. That deserves and requires a lot of exposition.
I think that the whole point of the prediction difficulty stuff is that a library (e.g., C++11's) with general purpose PRNGs can't know how they'll be used. Maybe some idiot writes code for a gambling machine in C++ and uses whatever PRNG is to hand. There was a story in the news the other week about people going around casinos predicting slot machines, so maybe this has already happened! PCG is trying to make your simulation and rendering code fast while offering at least some defense against egregious misuse.
Basically PCG is trying to be a good all rounder. As you say, it's meant as a replacement for the Mersenne Twister.
And yes, PCG is harder to exploit in this way than the Twister, but you still really should not bet money on it!
That running into a wall with the paper doesn't bother her, because she's publishing openly anyway, is even better.
If you want to make your research more accessible, there are ways to do that without assuming that your reader is coming in from a dead start on the field.
An easy way to make research accessible is to write a monograph!
That said, even if multi-culti math means that top-line researchers are going to be spending time with song-and-dance introductions, she should still have put a grad student onto the task of making the short paper that experts will actually read.
If the whole thing is a matter of style and not of obfuscation, this would have given a grad student an easy, cool first publication.
[1] https://ee.stanford.edu/event/seminar/ee380-computer-systems...
Because the reviewers took over 10 months to respond with a rejection mainly citing the length of the paper. And more importantly, "By that point, everyone who might have wanted to read it had almost certainly found it here and done so, so I saw little merit in drastically shortening the paper."[1]
She has updated the blog post which discusses all the nuanced details of the whole affair last month (2017-07-25)[2].
[1] http://www.pcg-random.org/paper.html [2] http://www.pcg-random.org/posts/history-of-the-pcg-paper.htm...
[EDIT] The actual paper is here: http://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf
On the other hand, maybe spending more than a line explaining what the birthday paradox is should be cut out and put in a backgrounder paper or appendix so that the paper can focus on the actual novel ideas.
That was my annoyance with the paper as well. Add to that explanations that amount to, "What even is determinism?" or "What's a seed?" and I'm unsurprised it's nearly 60 pages.
It worked as far as I can tell. But I don't trust the statistical tests. Who is to say there isn't a very obvious pattern in the numbers that I didn't test for or notice? How do you prove a random number generator is good?
You can't; that's the nature of randomness. You can prove they're bad, though.
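"Proving bad" usually means exhibiting a concrete pattern. A classic one, related to the low-bit weakness discussed elsewhere in this thread: in an LCG mod 2^64 with odd multiplier and odd increment, the lowest output bit simply alternates 0,1,0,1,... A small sketch with illustrative constants:

```cpp
#include <cstdint>

// An LCG mod 2^64 that outputs its full state. With an odd multiplier,
// a*s has the same low bit as s, and adding an odd increment flips it,
// so the lowest bit alternates every step -- a trivially detectable flaw.
uint64_t lcg_next(uint64_t& state) {
    state = state * 6364136223846793005ULL + 1442695040888963407ULL;
    return state;
}
```

This is exactly why practitioners discard the low-order bits of such generators or, as PCG does, run the state through a mixing output function first.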
Compress its output.
1. The paper itself[1] is extremely readable by the standards of most cryptography research. On one hand, this is great because I was able to follow the whole thing in essentially one pass. On the other hand, the paper is very long for its result (58 pages!), and it could easily do without passages like this one:
Yet because the algorithms that we are concerned with are deterministic, their behavior is governed by their inputs, thus they will produce the same stream of “random” numbers from the same initial conditions—we might therefore say that they are only random to an observer unaware of those initial conditions or unaware of how the algorithm has iterated its state since that point. This deterministic behavior is valuable in a number of fields, as it makes experiments reproducible. As a result, the parameters that set the initial state of the generator are usually known as the seed. If we want reproducible results we should pick an arbitrary seed and remember it to reproduce the same random sequence later, whereas if we want results that cannot be easily reproduced, we should select the seed in some inscrutable (and, ideally, nondeterministic) way, and keep it secret. Knowing the seed, we can predict the output, but for many generators even without the seed it is possible to infer the current state of the generator from its output. This property is trivially true for any generator where its output is its entire internal state—a strategy used by a number of simple random number generators. For some other generators, such as the Mersenne Twister [35], we have to go to a little more trouble and invert its tempering function (which is a bijection; see Section 5), but nevertheless after only 624 outputs, we will have captured its entire internal state.
That's a lot of setup for what is frankly a very basic idea. A cryptographer being verbose in their writing might briefly remind the reader of these properties with the first sentence, but they'd still likely do that with much more brevity than this. I understand wanting to make your research accessible, but for people who understand the field this detracts from getting to the "meat." A denser paper might be harder to get through, but a 10-30 page result is preferable to a nearly 60-page one that assumes I know nearly nothing about the field. If I don't know these details very well, how can I properly assess the author's results?
2. The author's tone in her writing is something I take issue with. For example, passages like this one...
Suppose that, excited by the idea of permutation functions, you decide to always improve the random number generators you use with a multiplicative step. You turn to L’Ecuyer’s excellent paper [25], and without reading it closely (who has time to read papers these days!), you grab the last 32-bit constant he lists, 204209821. You are then surprised to discover that your “improvement” makes things worse! The problem is that you were using XorShift 32/32, a generator that already includes multiplication by 747796405 as an improving step. Unfortunately, 204209821 is the multiplicative inverse of 747796405 (mod 2^32), so you have just turned it back into the far-worse–performing XorShift generator! Oops.
...go a bit beyond levity. If you're trying to establish rigorous definitions and use cases to distinguish between generators, functions and permutations, this isn't the way to do it. This isn't appropriate because it doesn't go far enough to formalize the point. It makes it intuitive, sure, and that's a great educational tool! But it's a poor scenario to use as the basis for a problem statement - research is not motivated by the failure of an engineer to properly read and understand existing primitives, it's motivated by novel results that exhibit superior qualities over existing primitives.
3. The biggest grievance I have with this paper is the way in which it analyzes its primitives for cryptographic security. For example, this passage under 6.2.2 Security Considerations:
In addition, most of the PCG variations presented in the next section have an output function that returns only half as many bits as there are in the generator state. But the mere use of a 2^(b/2)-to-1 function does not guarantee that an adversary cannot reconstruct generator state from the output. For example, Frieze et al. [12] showed that if we simply drop the low-order bits, it is possible for an adversary to discover what they are. Our output functions are much more complex than mere bit dropping, however, with each adding at least some element of additional challenge. In addition, one of the generators, PCG-XSL-RR (described in Section 6.3.3), is explicitly designed to make any attempt at state reconstruction especially difficult, using xor folding to minimize the amount of information about internal state that leaks out. It should be used when a fast general-purpose generator is needed but enhanced security would also be desirable. It is also the default generator for 64-bit output.
That's not a rigorous analysis of a primitive's security. It is an informal explanation of why the primitive may be secure, but it is so high level that there is no proof based on a significant hardness assumption. Compare this with Dan Boneh's recent paper, "Constrained Keys for Invertible Pseudorandom Functions"[2]. Appendices A and B after the list of references occupy nearly 20 pages of theorems used to analyze and prove the security of primitives explored in the paper under various assumptions.
Novel research exploring functions with (pseudo)random properties is inherently mathematical; it's absolutely insufficient to use a bunch of statistical tests, then informally assess the security of a primitive based on the abbreviated references to one or two papers.
_________
1. She purports to introduce a novel result that bridges "medium-grade" performance characteristics and security characteristics in one primitive. In fact, if you look at the PCG Random website (pcg-random.org), she very clearly compares and emphasizes both performance and security characteristics with functions like xorshift and ChaCha.
2. We see cryptography papers submitted to all manner of theoretical CS conferences and journals, for example Symposium on the Theory of Computing, which are not uniformly crypto-focused.
3. She acknowledges herself that she found it hard to categorize her paper (it could be relevant for simulation, it could be relevant for stream ciphers, etc.) in a blog post about how she chose the venue: http://www.pcg-random.org/posts/history-of-the-pcg-paper.htm...
As a meta point, I read the whole thing, and I actually think it would be a nice publishable result if it were, say, 10-20 pages. But 60 is wild! It took me longer to get through this "accessible" paper than it did for me to get through any of Boneh's papers on constrained and puncturable pseudorandom functions!
It's definitely interesting, and sure, why not explore "medium-grade security" that makes explicit tradeoffs with performance and security. But the presentation seems like it was written by someone writing for a non-academic audience, and the content of 6.2.2 "Security Considerations" is really light on provable security.
Off-topic
> And it is not even entirely clear what “really random” would mean. It is not clear that we live in a randomized universe…
At the quantum level it really is clear that we live in a really random universe. What's the meaning of really random? The outcome of a quantum process.
On-topic. Yeah, you have to know your audience. As OP mentions, just because the paper wasn't published doesn't prevent anyone from thinking about it and even building on it. On the other hand, these scientific publications have styles and target audiences, and maybe she got rejected not for lack of relevance or rigor, but because the paper didn't match the venue's non-scientific criteria.
https://en.wikipedia.org/wiki/Bell%27s_theorem
> Bell's theorem states that any physical theory that incorporates local realism cannot reproduce all the predictions of quantum mechanical theory. Because numerous experiments agree with the predictions of quantum mechanical theory, and show differences between correlations that could not be explained by local hidden variables, the experimental results have been taken by many as refuting the concept of local realism as an explanation of the physical phenomena under test. For a hidden variable theory, if Bell's conditions are correct, the results that agree with quantum mechanical theory appear to indicate superluminal effects, in contradiction to the principle of locality.