Maybe not by itself, but it does allow for the ecosystem to be audited, in a way that ultimately benefits the end-user. It really is an important part of a healthy supply chain.
This is a nice pat-yourself-on-the-back achievement for people who prefer security theatre and checking boxes to doing something actually useful, and it wasted thousands of man-hours of the poor victims who had to implement it.
What reproducible builds aim to prevent is Debian, or individual developers and system administrators with access rights to binary uploads and signing keys, being forced by attackers to sign and upload binary packages, be these governments (with or without court orders) or criminal organizations.
As of now, say if I were an administrator of Debian's CI infrastructure, technically there would be nothing preventing me from running an "extra" job on the CI infrastructure building a package for openssh with a knock-knock backdoor, properly signing it and uploading it to the repository. For someone to spot the attack and differentiate it, they'd have to notice that there is a package in the repository that has no corresponding build logs or has issues otherwise.
But with reproducible builds, anyone can set up infrastructure to rebuild Debian packages from source automatically and if there is a mismatch with what is on Debian's repository, raise alarm bells.
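To make the rebuild-and-compare idea concrete, here is a minimal sketch in Python. It assumes you already have the bytes of the package from the official repository and the bytes of your own rebuild from source; the names `sha256_hex` and `check_rebuild` are made up for illustration and are not part of any Debian tooling.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a blob as a hex string."""
    return hashlib.sha256(data).hexdigest()


def check_rebuild(official: bytes, rebuilt: bytes) -> bool:
    """True if the independent rebuild is bit-for-bit identical to the official artifact."""
    return sha256_hex(official) == sha256_hex(rebuilt)


if __name__ == "__main__":
    # Placeholder bytes standing in for the archive's package and a local rebuild.
    official = b"pretend these are the bytes of the package from the archive"
    rebuilt = b"pretend these are the bytes of the package from the archive"
    if check_rebuild(official, rebuilt):
        print("match: the published binary corresponds to the source")
    else:
        print("MISMATCH: raise the alarm")
```

This only works as a check because the build is reproducible: without that, a mismatch tells you nothing.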
Indeed, this could mitigate an attacker replacing the binary with something that's not produced from the code, but it does not mitigate the tool chain or code itself containing the exploit, creating a malicious binary.
Curious, what distros were affected by npm supply chain attacks?
Not being able to see whether the source code shipped is the same as what was used to create the binary is scary.
Reproducible builds do not solve all issues, as you rightly observed, but they can be a stepping stone (or even a precondition) for further measures.
The end-user experience is that now you can host your Debian binaries in caches and CDNs without worrying about supply chain hackers.
You can verify that file hashes match the ones on Debian's website and sleep much better at night.
If you don't trust Debian's website then you can rebuild yourself and check if Debian has been compromised.
It took a while though until this was understood. In 2007, when I pointed out on debian-devel that this is needed, I was still told what a huge waste of time this would be. And indeed it took a huge amount of work by many people to get there, but it is well worth it.
"Well worth it" is not correct. And it just ups the contribution barrier to Debian higher. I've already heard a lot of people complaining that contributing to Debian is hard, and while in the past I defended it with "they need all the checks and balances to make sure packages play nicely with each other", this is just a step that makes it harder for no reason and little benefit.
https://reproducible-builds.org/
Could you perhaps respond to the argumentation here?
Anyone having to maintain a code base or a distributed fleet of devices will gain from this decision, immensely, as their operational periods come and go.
Reproducible builds are about longevity as much as they are about security.
Please don’t make bold claims about ‘no reason and little benefit’ while demonstrating ignorance of this hard fact: reproducible builds should have been the norm, in computing, from the get-go.
Have many organizations produce the binaries independently and post the artifacts.
Once n of m parties agree on the artifact hash, take that as the trusted build.
If every party reaches a different hash then we cannot build consensus.
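The n-of-m rule above can be sketched roughly like this. It's a toy, assuming each builder simply reports one hex digest and that n is more than half of m (so at most one hash can reach the threshold); `trusted_hash` and the builder names are hypothetical.

```python
from collections import Counter
from typing import Optional


def trusted_hash(reported: dict[str, str], n: int) -> Optional[str]:
    """Return the artifact hash that at least n builders agree on, else None.

    `reported` maps a builder name to the hash that builder published.
    Assumes n > len(reported) / 2, so two hashes cannot both reach n.
    """
    if not reported:
        return None
    digest, votes = Counter(reported.values()).most_common(1)[0]
    return digest if votes >= n else None


if __name__ == "__main__":
    reports = {"builder-a": "abc123", "builder-b": "abc123", "builder-c": "fff000"}
    print(trusted_hash(reports, n=2))  # two of three agree
    print(trusted_hash({"a": "1", "b": "2", "c": "3"}, n=2))  # no consensus
```

If the build isn't reproducible, every party lands in the second case and there is nothing to vote on.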
(It was caught before being promoted into a stable Debian release, yes, but this sort of relied on a happy accident, too close for comfort)
Those people do not care about quality in open source at all. For long-lived software this is very important.
Of course, the maintainers of all those JavaScript and Kubernetes packages, which will be irrelevant again in a few years, might complain, but let them complain.
I'm reading this as a suggestion that the reproducible builds effort was an ineffective deterrent.
However, note that your observation could also be explained by the opposite: the reproducible builds effort was an effective deterrent, so nobody bothered with attempts.
> And it just ups the contribution barrier to Debian higher
Until yesterday, the package just got flagged in the tracker, and you could either ignore it, or fix it yourself, or the kind people behind the reproducible builds effort supplied a patch themselves.
Now, you can no longer ignore it. But fixes are often trivial: use a (stable) timestamp provided by the build, seed RNGs with some constant (instead of e.g. the time), etc. These are best practices anyway.
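For anyone wondering what those trivial fixes look like, here is a minimal sketch in Python. It assumes the build environment exports `SOURCE_DATE_EPOCH` (the convention documented on reproducible-builds.org); the helper names `build_timestamp` and `build_rng` are made up for the example.

```python
import os
import random
import time


def build_timestamp() -> int:
    """Prefer the stable timestamp provided by the build over the wall clock."""
    value = os.environ.get("SOURCE_DATE_EPOCH")
    # Falling back to the current time keeps ad-hoc builds working,
    # but those builds will not be reproducible.
    return int(value) if value is not None else int(time.time())


def build_rng(seed: int = 0) -> random.Random:
    """Return an RNG seeded with a constant instead of the time."""
    return random.Random(seed)


if __name__ == "__main__":
    stamp = time.strftime("%Y-%m-%d", time.gmtime(build_timestamp()))
    print(f"built on {stamp}")
    print(build_rng().randint(0, 999))  # same value on every run
```

Both changes are usually a few lines in a build script, which is why the patches tend to be small.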
BTW, most Debian packages have reproducible builds. Those that don't (I'd say 5%) are shown in orange in the graph there: https://wiki.debian.org/ReproducibleBuilds
It's not like the Linux world where you have distinct projects like the kernel, GNU, OpenSSL, and then it's the distribution's job to assemble everything.
In the BSD projects, the scope is developing and distributing an entire base system, i.e., the kernel but also the libc, the shell/all posix utilities, and a few third parties like OpenSSH (which are usually "softforked").
It's quite visible in the sources, it's a lot more than just a kernel: https://github.com/NetBSD/src
Additional packages you could get from pkgin/pkgsrc (NetBSD), pkg/ports (FreeBSD) or pkg_add (OpenBSD) are clearly distinct from the base system, installed in a dedicated subtree (/usr/pkg in NetBSD, /usr/local/ in OpenBSD/FreeBSD), and provided on a best-effort basis.
The reproducible build target was almost certainly only for the base system, which is a few percent of what Debian tries to achieve, and over which NetBSD has tighter control (developer + distributor instead of downstream assembler + distributor).
A reproducible base system is useful, but given how quickly you typically need to install packages from pkgsrc, it's not quite enough.
Maybe that's trying harder on design rather than trying to remedy the consequences later.
Debian has come a long way, but when Debian says reproducible they mean they grab third-party binaries to build theirs. When we say reproducible we mean 100% bootstrapped from source code all the way through the entire software supply chain.
We think that distinction matters.
Also, stagex and others probably profited QUITE A LOT from the Debian efforts, because they started going upstream and talking to developers.
Arch Linux likewise profited, a decade before that, from Debian maintainers and from Debian people asking upstream to improve...
Unfortunately, the term “reproducible” can be interpreted in many ways because there is no strict and complete definition. People and projects bend it to their liking.
Your approach is correct.
What is a win is that two independent parties can run the same build, and get the same binaries.
This is important because it removes trust from builders: anyone can verify their output.
It just so happens that unimportant things like build versions impede that.
Given how much quick & dirty sed patching and how many exec commands I've seen in the few nix packages/modules I've read, I would not exactly bet my life on it being completely idempotent & reproducible.
It's not reproducible bit for bit, since it fetches the current version of everything, but it's still easy enough to reproduce, stable enough and complete enough, while classic distros need a fresh install every major release, or else you face issues and keep a system in an unknown state for a long time until it explodes.
They're still a pragmatic choice for many usecases.
You don't have permission to access this resource. Apache Server at lists.debian.org Port 443
:/
It does work with my privacy/scraping setup (residential proxy, spoofed fingerprints, Qubes and so on), great job Debian.
The build timestamps in the PE header and export table are a problem as well.
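For the PE header part, here is a rough sketch in Python of where that timestamp lives, based on the documented PE/COFF layout (`e_lfanew` at offset 0x3C, then the `PE\0\0` signature, then the COFF header with `TimeDateStamp` as its third field). The function names are made up, the export table's own timestamp is not handled, and patching a real binary this way may invalidate checksums or signatures, so treat it as illustration only.

```python
import struct


def pe_timestamp_offset(image: bytes) -> int:
    """Return the file offset of the COFF TimeDateStamp field in a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Signature (4 bytes) + Machine (2) + NumberOfSections (2) precede the field.
    return pe_offset + 8


def read_pe_timestamp(image: bytes) -> int:
    """Read the 32-bit little-endian build timestamp from the COFF header."""
    (stamp,) = struct.unpack_from("<I", image, pe_timestamp_offset(image))
    return stamp


def with_pe_timestamp(image: bytes, stamp: int) -> bytes:
    """Return a copy of the image with the COFF timestamp set to a fixed value."""
    patched = bytearray(image)
    struct.pack_into("<I", patched, pe_timestamp_offset(image), stamp)
    return bytes(patched)
```

Two builds of the same source that differ only in this field will hash differently, which is exactly the kind of noise that breaks bit-for-bit comparison.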
(Orange = FTBR = "failed to build reproducibly")
I'm not good at reading numbers from charts, but I'd guess it's a few percent (4-5ish?).
> Forbidden
> <p>You are not allowed to access this!</p>
(yes, with HTML tags on display) :)
EDIT: I also found an "I Challenge Thee" page in history. Did I just get blocked by antibot measures? Why???
Reproducible builds are an essential method in industrial computing. Debian isn’t at the forefront of this; it is merely adopting industry-wide techniques also applied to other operating systems in use in long-term and safety-related applications.
Certainly, a lot of the hard work of the Yocto and Debian developers is already in your hands.
What is interesting is that this is now being applied in a more forward-focused policy by the Debian developers, that it will now be the norm rather than an option…
giant leap for mankind.
They're a guarantee that if there's a backdoor, it's reproducible 100% of the time.
This is a godsend for white hats fighting the good fight.
And, as a side note, it's strongarming vs the bad guys: "Would be too bad if we could reproduce your shiny exploit 100% of the time wouldn't it!?".
Note that we should go further (but it's a bit orthogonal to reproducible builds): builds of the final binary/package should happen by first entirely discarding all files not necessary for the final build (like all test cases and all test assets). The build should literally happen in an environment that gets rid of those (after, of course, having tested in another environment that all test cases succeed). If I'm not mistaken, getting rid of test assets would have stopped Jia Tan's XZ backdoor attempt dead in its tracks (for example), because IIRC part of the backdoor was binary data hidden in some asset only used by test cases.
P.S.: as a bonus they also make it possible to detect bit flips (I'm not saying there aren't other ways to detect bit flips: what I'm saying is that if you have deterministic builds anyway and something doesn't reproduce correctly due to a flipped bit, it's going to be noticed).
I think Magnus Ihse Bursie said it best while working on reproducible builds of OpenJDK: "If you were to ask me, the fact that compilers and build tools ever started to produce non-deterministic output has been a bug from day one." [2]
[1] https://www.linux.com/news/preventing-supply-chain-attacks-l...
[2] https://github.com/openjdk/jdk/pull/9152#issue-1270543997
reproduced: 97.02% good: 17586 bad: 511 fail: 30 unknown: 0
This, statistics for other architectures, and the reasons for unreproducibility can be found at https://reproduce.debian.net.
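The percentage is easy to sanity-check from the counts. A quick sketch in Python, assuming the denominator is simply good + bad + fail + unknown (that assumption reproduces the quoted figure); `reproducible_share` is a name made up for the example.

```python
def reproducible_share(good: int, bad: int, fail: int, unknown: int = 0) -> float:
    """Percentage of packages that reproduced, out of all packages attempted."""
    total = good + bad + fail + unknown
    return 100.0 * good / total


if __name__ == "__main__":
    # Counts taken from the figures quoted above.
    print(f"{reproducible_share(17586, 511, 30):.2f}%")  # prints 97.02%
```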
Most of those that failed to reproduce differ in NT_GNU_BUILD_ID. The others differ in some other bits, mostly timestamps or hashes, I assume.
It feels like AI and traditional software are converging in complexity.