[0] https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-...
> it gives us a reliable way to verify the binaries we ship are faithful to their sources
That's the thing many don't understand: it's not about proving that the result is 100% trustable. It's about proving it's 100% faithful to the source. Which means that should monkey business be detected (like a sneaky backdoor), it can be recreated deterministically 100% of the time.
In other words for the bad guys: nowhere to run, nowhere to hide.
Reproducibility makes bugs more shallow. If hydra builds a bit-for-bit identical iso to what you build locally, that means a developer can make a change to the iso inputs, test it, and know that testing will also apply to the final ci-built one.
If a user reports a bug in the iso, and you want to test if a change fixes it locally, you can start from an identical source-code commit as the iso was built from, make some minimal changes, and debug from there, all without worrying that you're accidentally introducing unintended differences.
It minimizes "but it works on my machine" type issues.
Still, that POSIX operating system bit is also being worked on.
I'd think one could hand-document all 357 bytes of machine code and have them be intelligible.
[0] https://github.com/oriansj/bootstrap-seeds/blob/master/POSIX...
wtf that is mind-boggling. Thanks for the link.
Nix has more packages and advocacy (even if the vast majority of people exposed to nix/guix will never actually use it), but Guix is a lot more interesting to me with the expressive power of scheme on offer.
That said, there are some sharp edges[0] that seem a bit harder to figure out (is this just as inscrutable/difficult as nix?).
Does anyone have some good links with people hacking/working with guix? Maybe some blogs to follow?
I care more about the server use-case and I'm a bit worried about the choice of shepherd over something more widely used like systemd and some of the other libre choices which make Guix what it is. Guix is fine doing what it does, but it seems rather hard to run a Guix-managed system with just a few non-libre parts, which is a non-starter.
Also, as mentioned elsewhere in this thread, the lack-of-package-signing-releases is kind of a problem for me. Being source and binary compatible is awesome, but I just don't have time to follow the source of every single dependency... At some point you have to trust people (and honestly organizations) -- the infrastructure for that is signatures.
Would love to hear of people using Guix in a production environment, but until then it seems like stuff like Fedora CoreOS/Flatcar Linux + going for reproducible containers & binaries is what makes the most sense for the average devops practitioner.
CoreOS/Flatcar are already "cutting edge" as far as I'm concerned for most ops teams (and arguably YAGNI), and it seems like Nix/Guix are just even farther afield.
[EDIT] Nevermind, Guix has a fix for the signature problem, called authorizations![1]
[0]: https://unix.stackexchange.com/questions/698811/in-guix-how-...
[1]: https://guix.gnu.org/manual/devel/en/html_node/Specifying-Ch...
We've just had a conference about Guix in HPC: https://youtu.be/dT5S72x18R8
This is a recording of a stream for the second day with talks about large scale deployments of Guix System in HPC.
I mean if 2 copies of a piece of software were compiled from the same source, what stops them from being identical each and every time?
I know there are so many moving parts, but I still can't understand how discrepancies can manifest themselves.
https://reproducible-builds.org/docs/
The main overall issue is that developers don't test to ensure they reproduce. Once it's part of the release tests it tends to stay reproducible.
It would be great if 100% of builds were reproducible, but I don't expect developers to test for reproducibility unless it's a defined goal.
As generalized reproducible build tooling (guix, nix, etc.) becomes more mainstream, I imagine we'll see more reproducible builds as adoption grows and reproducibility is no longer something developers have to "check for", but simply rely upon from their tooling.
How do you get similar behaviour while having a reproducible build?
Can you, for example, have the final binary contain a reproducible part, and another section of the elf file for deliberately non-reproducible info?
Well no: that's really the thing reproducible packages are showing: there's only one correct binary.
And it's the one that's 100% reproducible.
I'd even say that that's the whole point: there's only one correct binary.
I'll die on the hill that if different binaries are "all correct", then none are: for me they're all useless if they're not reproducible.
And it looks like the people working on making entire .iso images fully bit-for-bit reproducible are willing to die on that hill too.
These comparisons don't have to go the same way for everything to be correct.
Imagine the program uses the current date or time as a value. When compiled at different moments, the bits change.
Same applies to anything where the build environment or timing influences the output binary
I’ve successfully built tools to compare Java JARs that required getting around two of those and other test tools that required the third. I’m sure there are more.
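A minimal Python sketch of the timestamp case above, together with the common fix from the reproducible-builds.org spec: honor `SOURCE_DATE_EPOCH` instead of the wall clock. The `build_banner` helper is made up for illustration, not from any real build system.

```python
import os
import time

def build_banner() -> bytes:
    """Hypothetical build step that embeds a build date in its output --
    a classic source of nondeterminism between two build runs."""
    # reproducible-builds.org convention: prefer SOURCE_DATE_EPOCH over
    # the current clock, so a pinned environment yields pinned output.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", time.time()))
    return time.strftime("built on %Y-%m-%d", time.gmtime(epoch)).encode()

os.environ["SOURCE_DATE_EPOCH"] = "0"   # pin the clock
assert build_banner() == build_banner()  # identical across runs now
print(build_banner().decode())           # -> built on 1970-01-01
```

With the variable unset, two runs on different days would produce different bytes even from identical sources.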
I have only ~2 hours of experience with NixOS. I wanted to try Hyprland and thought it would be easier on NixOS, since Hyprland needs a bit of setup and it's presumably easier to reuse someone else's config on NixOS than on another distro. Finding a config was hard too: I found about three in some random GitHub gists (I thought there would be more), and none of them worked. At that point I gave up.
Nixos has the advantage that everything is built in its own sandbox with only its explicitly declared (and hashed) dependencies available, unlike in mainstream distros where it's the full system environment, so in many cases you already get the same binary every time. But this doesn't immediately lead to reproducibility because the build process might be nondeterministic for various packages.
Debian has been building in a clean sandbox with only required, tracked dependencies for decades.
It also builds the large majority of packages reproducibly, including the binaries and the whole installation packages (not just the sources, like NixOS).
Usually packages are built in an environment which has only a minimal base system plus the package's explicitly declared dependencies. They don't have random unnecessary packages installed.
Upvote from me FWIW.
The sense you're thinking of is that you can easily rebuild a binary package and it will use the same dependency versions, build options, etc. There should be no chance of a compiler error that didn't happen the first time (the old "but it worked on my laptop" syndrome).
The sense used here is that every build output is byte-for-byte binary identical. It doesn't depend on the machine name, the time it was compiled or anything like that (or, in a parallel build, the order in which files finish compiling). That is much harder.
And that's just for Nixpkgs, the packages themselves that also work outside NixOS. NixOS has reproducibility of the entire system complete with configuration.
> I thought one of the main reason for nixos's existence is reproducibilty
NixOS uses "reproducible" to mean "with the same Nix code, you get the same program behaviour". This is more/less what people hope Dockerfiles provide.
This is the level of reproducibility you want when you say "it works on my machine" or "it worked last time I tried it".
Whereas "reproducible build" aims for bit-for-bit equality for artifacts build on different machines. -- With this, there's a layer of security in that you can verify that code has been built from a particular set of sources.
> Finding a config was hard too
What search query were you using? Searching "nixos configuration" on GitHub turns up plenty of repositories: https://github.com/search?q=nixos%20configuration&type=repos...
Or searching for hyprland specifically, there seem to be many using that https://github.com/search?q=wayland.windowManager.hyprland&t...
Note that ”Nix code” also includes the hashes of all non-Nix sources. One way to think of it is that Nix has reliable build cache invalidation.
> This is more/less what people hope Dockerfiles provide.
Indeed, but importantly they do not provide input-reproducibility (while Nix does) because, at least, there are no hashes for remote data.
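To illustrate that input pinning, here is a toy version in Python. The `fetch_checked` helper is a made-up stand-in for Nix's fixed-output fetchers (such as fetchurl), which take an expected hash up front:

```python
import hashlib

def fetch_checked(data: bytes, expected_sha256: str) -> bytes:
    """Toy fixed-output fetch: the caller pins the expected hash, so a
    silently changed upstream tarball fails the build instead of
    altering it unnoticed -- the check Dockerfiles lack."""
    got = hashlib.sha256(data).hexdigest()
    if got != expected_sha256:
        raise ValueError(f"hash mismatch: expected {expected_sha256}, got {got}")
    return data

tarball = b"pretend upstream source tarball"
pinned = hashlib.sha256(tarball).hexdigest()

fetch_checked(tarball, pinned)                  # passes
try:
    fetch_checked(b"tampered tarball", pinned)  # raises ValueError
except ValueError as e:
    print("rejected:", e)
```

A `RUN curl ... | tar x` line in a Dockerfile, by contrast, will happily accept whatever the remote serves today.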
I just wanted to take a quick look at Hyprland and imagined I could just use an existing config; I never thought it would need hours of research. Later I installed an Arch VM and managed to set up Hyprland with some basic components in less than an hour, following the first guide I found.
Looks like I misunderstood what Nix was made for. I just want a system I can more or less set up with a simple config file.
I saw this os, didn't have time to try it yet, but I thought this is how nix works. https://blendos.co/
For example, you just define GNOME like this; the Nix configs I found looked similar, they just didn't work.
> gnome:
>   enabled: true
>   style: light
>   gtk-theme: 'adw-gtk3'
>   icon-theme: 'Adwaita'
>   titlebar:
>     button-placement: 'right'
>     double-click-action: 'toggle-maximize'
>     middle-click-action: 'minimize'
>     right-click-action: 'menu'
> at that point I gave up.
NixOS is not for the weak or time constrained, currently. Hopefully it will be one day. Still if you push through, you reap the benefits.
I started with this one, the minimal version, then moved on to something more like the standard version, and now I'm moving on to something based on his much more complicated and flexible build in a different repo. I had been flailing, then this repo made it click.
That sounds odd, did you use github code search?
Find relevant home manager options:
https://mipmip.github.io/home-manager-option-search/?query=h...
Then search those on github:
https://github.com/search?utf8=%E2%9C%93&q=lang%3Anix+hyprla...
Note some option searches imply more casual or advanced users.
Guix, Archlinux, Debian do the binary reproducibility better than Nix / NixOS / Nixpkgs.
Sources:
- https://r13y.com/ ( Nix* )
- https://tests.reproducible-builds.org/debian/reproducible.ht... ( Debian )
- https://tests.reproducible-builds.org/archlinux/archlinux.ht... ( Archlinux )
- https://data.guix.gnu.org/repository/1/branch/master/latest-... (Guix, might be a bit slow to load, here is some cached copy https://archive.is/lTuPk )
* Input reproducibility, meaning "perfect cache invalidation for inputs". Nix and Guix do this perfectly by design (which sometimes leads to too many rebuilds). This is not on the radar for Debian and Arch Linux, which handle the rebuild problem ("which packages should I rebuild if a particular source file is updated?") on an ad-hoc basis by triggering manual rebuilds.
* Output reproducibility, meaning "the build process is deterministic and will always produce the same binary". This is the topic of the OP. Nix builds packages in a sandbox, which helps but is not a silver bullet. Nix is in the same boat as Debian and Arch Linux here; indeed, distros frequently upstream patches to increase reproducibility and benefit all the other distros. In this context, https://reproducible.nixos.org is the analogue of the other links you posted, and I agree Nix reports aren't as detailed (which does not mean binary reproducibility is worse on Nix).
Your comment can be misinterpreted as saying "Nix does not do binary reproducibility very well, just input reproducibility", which is false. That's the whole point of the milestone being celebrated here!
It's only "false" as nobody has actually tried to rebuild the entire package repository of nixpkgs, which to my knowledge is an open problem nobody has really worked on.
The current result is "only" ~800 packages and the set has regular regressions.
It's about bit-for-bit reproducibility not just of the binaries but also of how they get packed into an iso (i.e. r13y.com is outdated; the missing <1% was also, as far as I remember, an _upstream_ Python regression, since reproducibility of the binaries themselves, ignoring the packaging into an iso, was already there a few years ago).
Now when it comes to packages beyond the core iso, things become complicated to compare due to the subtle but in this regard significant differences in how they handle packages, e.g. a bunch of packages you would find on Arch in the AUR you find as normal packages in Nix, and most of the -bin upstream packages are simply not needed with Nix.
In general Nix makes it easier to create reproducible builds, but (independent of Nix) this doesn't mean it's always possible, and it often needs patching, which is often but not always done. Combine this with the default package repository of Nix being much larger (>80k) than e.g. Arch's (<15k non-AUR), and comparing percentages isn't very useful.
Though one very common misconception is that the hash in the Nix store path is based on the build output; it's instead based on all sources (whether binary or not) used for building in an isolated environment.
This means it doesn't have quite the security benefit some people might think, but in turn it's necessary, as it lets Nix use software which is not reproducibly buildable while still producing reasonably reproducible deployments (as in: not necessarily all bits the same, but all functionality, compiler configs, dependency versions, users, configuration etc. being the same).
Isn't that exactly what your first source and OP are about? They check that the binaries are the same when built from the same sources on different machines. The point is exactly that the binaries don't change with every build.
> How are these tested?
> Each build is run twice, at different times, on different hardware running different kernels.
Huh, didn't know that Arch Linux tests reproducibility. It's apparently 85.6% reproducible: https://reproducible.archlinux.org
I wonder how much work would be needed for NixOS, considering it has more than 80k packages in the official repository.
With input-addressing you look things up in the store based on the input hash. You can determine the input hash yourself, but you have to trust the store to provide a response that corresponds to the sources. With Reproducible Builds you can have a third party confirm that output matches that input.
With content-addressing, you look things up in the store based on the output hash. You no longer need to trust the store here: you can check yourself the response matches the hash. However, you now have to trust who-ever told you that output hash corresponds to the input you're interested in. With Reproducible Builds you can now have a third party confirm that output hash matches that input.
I have not worked with content-addressed nix in depth yet, but my understanding is that this stores the mapping between the inputs and their output hashes in 'realizations' which are also placed in the store. Reproducible Builds will still be useful to validate this mapping is not tampered with.
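The two addressing schemes above can be modeled in a few lines of Python. This is a toy model only: the paths and names are illustrative, not the real Nix store layout.

```python
import hashlib

def short_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()[:12]

derivation = b"gcc-13 + musl + hello.c"   # the build inputs
output     = b"\x7fELF hello binary"      # what the build produced

# Input-addressing: the path is derived from the inputs. You can compute
# it yourself, but must trust the store that the bytes behind that path
# really resulted from building those inputs.
input_path = f"/store/{short_hash(derivation)}-hello"

# Content-addressing: the path is derived from the output, so anyone can
# re-hash the bytes and check them against the path. What now needs
# attestation is the inputs -> output mapping (the "realisation").
content_path = f"/store/{short_hash(output)}-hello"
realisation  = {input_path: content_path}  # the mapping a third party signs

assert realisation[input_path] == content_path
```

Reproducible builds let independent builders regenerate and cross-check exactly that mapping, in either scheme.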
While I understand that these two goals, reproducible builds and unique installs, are orthogonal to each other and can both be had at the same time, the duality of the situation still makes me laugh.
Alternatively, randomizing the offsets when starting the program is another way to keep reproducibility and even increase security; the offsets would change at every run.
Until signing is standardized, it is hard to imagine using nix in any production use case that protects anything of value.
You don't need to trust it wasn't packaged maliciously, nix does reproducible builds so you can just look at the derivation and build it yourself if you don't feel like relying on the binary cache.
As for whether the underlying contents are malicious, that's between you and the developer. If other distributions have led you to believe otherwise, then I think they have misled you.
The only exception I can think of is Tails, and they don't exactly have the breadth that Nix does.
And yet most of the packages from most major linux distributions are signed. If you are going to spend hours maintaining a package, it takes only an extra half a second to tap a yubikey to prevent someone from impersonating you.
Package maintainers from, say, Arch and Debian go through a vetting process, multiple people sign their keys, and it is a responsibility. Yes, it is volunteer work, but there are also volunteer firefighters. Some volunteer jobs are important to keep others safe, and they should be done with care.
If Arch, Debian, Fedora, Ubuntu can all sign packages, then this excuse does not really hold for Nix.
"You don't need to trust it wasn't packaged maliciously, nix does reproducible builds so you can just look at the derivation and build it yourself if you don't feel like relying on the binary cache."
Reproducible builds and package definition signing solve totally different problems. Assume you trust that a given developer has been maintaining a package non-maliciously; then you see they made a new update, so you and other people trust it and build it. You get the same result, so you trust the binaries too. However, you still end up with malware. How? Simple: the developer's GitHub account was compromised via a SIM swap on their email account while they were on vacation, and someone pushed a fake commit as that person.
Or maybe a malicious Github employee is bribed to serve manipulated git history only to the reproducible build servers but to no one else, so it goes undetected for years.
Supply chain attacks like this are becoming very common, and there is a lot of motivation to do it to Linux distributions which power systems worth billions of dollars regularly.
It is also so easy to close these risks. Just tap the damn yubikey or nitrokey when it blinks. It is wildly irresponsible not to mandate this, and we should question the motivations of anyone not willing to do something so basic to protect users.
The NixOS build system signs all build outputs with its key and the signature is verified upon download. If you’re paranoid, Nix at least makes it easy to just build everything from source.
As for Debian, maintainers must still all be part of the web of trust with an application process and a registered key in the web of trust to contribute packages. https://wiki.debian.org/DebianMaintainer#step_5_:_Keyring_up...
Arch also still has each maintainer sign the packages they maintain, and there is a similar approval process to earn enough trust in your keys to become a maintainer, so there is a decentralized web of trust. Then you add reproducible builds on top of that to almost complete the story.
Signed reviews would remove the remaining SPOFs in debian and arch but sadly no one does this. Nix however does not even allow -optional- key pinning or verify signatures of any authors that choose to sign expressions or their commits, so NixOS is way behind Arch and Debian in this respect.
> [...] actually rebuilding the ISO still introduced differences. This was due to some remaining problems in the hydra cache and the way the ISO was created.
Can anyone shed some light on the fix for "how the ISO was created"? I attempted making a reproducible ISO a while back but could not make the file system create extents in a deterministic fashion.
It sounds like what you're looking for is the commands that the build invoked, but I'm not sure which step you're looking for. For example, the xorriso invocations are at https://github.com/NixOS/nixpkgs/blob/master/nixos/lib/make-...
$ python3
Python 3.10.7 (main, Jan 1 1970, 00:00:01) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Perhaps a relic from when software had to be manually updated? It's mostly annoying, as gcov will actively prevent you from using gcda files from a different but equivalent binary than the one that generated the gcno.
Reproducing the build already goes a long way in making such attacks increasingly unlikely, though.
Imagebuilder claims reproducibility, but as far as I know it mostly installed rpm packages as binaries, not from source, so it's not really proper reproducibility unless all the input packages are also reproducible.
If the descriptions of building packages from source, building distro images, and reproducibility in the linked thread didn't make sense to you, you're probably not really the target audience anyway.
If you're interested in an Ansible alternative that uses Jsonnet and state tracking to somewhat mimic Nix, check out Etcha: https://etcha.dev
I think precision is important.
"Nix" refers to the package manager (and the language the package manager uses).
Whereas it's "NixOS" that's the OS which makes use of Nix to manage the system configuration.
Nix is immutable: every change produces an entirely new build, and only after the build succeeds are the packages "symlinked" into the current system.
Fedora Silverblue is based on ostree [1], which works similarly to git, but on your root tree. It requires you to reboot the whole system for changes to take effect; since Nix is just symlinked packages, you don't need to reboot.
More detailed explanation here [2].
[1]: https://github.com/ostreedev/ostree
[2]: https://dataswamp.org/~solene/2023-07-12-intro-to-immutable-...