* Running a bare `nix build` in your CI isn't really enough: no hosted logs, no proper prioritization, and you may end up double-building things.
* Running your own instance of Hydra is a gigantic pain; it's a big ball of Perl with compiled components that link right into Nix internals, an architectural fiasco.
* SaaS solutions are limited and lack maturity (Hercules CI is GitHub-only, nixbuild.net is based in Europe and last I checked was still missing some features I needed).
* Tvix is cool but not ready for primetime, and the authors oppose flakes, which is a deal-breaker for me.
Something barebones that's capable of running these builds and could be wrapped in a sane REST API and simple web frontend would be very appealing.
The hurdles to interop I see are:
- Nixpkgs is not content-addressed (yet). I made a conscious decision to only support content-addressed derivations in zb to simplify the build model and provide easier-to-understand guarantees to users. As a result, the store paths are different (/zb/store instead of /nix/store). Which leads to...
- Nix store objects have no notion of cross-store references. I am not sure how many assumptions are made on this in the codebases, but it seems gnarly in general. (e.g. how would GC work, how do you download the closure of a cross-store object, etc.)
- In order to obtain Nixpkgs derivations, you need to run a Nix evaluator, which means you still need Nix installed. I'm not sure of a way around this, and it seems like it would be a hassle for users.
I have experienced the same friction in build infra for Nix. My hope is that by reusing the binary cache layer and introducing a JSON-RPC-based public API for running builds (already checked in, but needs to be documented and cleaned up), the infrastructure ecosystem will be easier to build out.
[1] https://github.com/bazelbuild/remote-apis?tab=readme-ov-file...
I even had to avoid flakes in a system I developed that is used by ~200 developers, since it involved a non-NixOS OS and user secrets (tokens etc.). With flakes I had to keep track of the secrets, which was a pain point: they obviously must not be pushed into the git repo, but Nix flakes don't handle files omitted from git well (files that git doesn't track are also ignored by nix commands). In the end, the workarounds were too messy and I had to drop flakes entirely.
The problem for me is that I see no benefit in using this over the Nix language (which I kinda like a lot right now).
And it has Windows support, of course. It can also be used to build your own distribution (e.g. here is one for a bunch of Rust utilities: https://github.com/wolfv/rust-forge)
Is this Ericsson... the corporation? Windows support for nix is something I don't hear much about, but if there is progress being made (even slowly) I'd love to know more.
You can read a post on that here: https://lastlog.de/blog/libnix_roadmap.html
I'm not sure if Lua is the right choice, though. A declarative language seems like a better fit for reproducibility. The goal of supporting non-deterministic builds also seems to go against this. But I'm interested to know how this would work in practice. Good luck!
If I understand the architecture correctly, the imperative calls in the config file don't actually run the build process. They run a Builder Pattern that sets up the state machine necessary for the builds to happen. So it's a bit like LINQ in C# (but older).
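A minimal Python sketch of that reading, assuming a hypothetical `Graph` class with a `derivation` method (neither is zb's actual API): the imperative calls only record nodes, and nothing runs until a separate build step walks the recorded graph.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Hypothetical build graph: records what to do, runs nothing."""
    nodes: dict = field(default_factory=dict)

    def derivation(self, name, builder, deps=()):
        # Calling this only registers a node; no build command executes here.
        self.nodes[name] = (builder, tuple(deps))
        return name

    def build(self, name, log):
        # The second stage: walk the recorded graph, dependencies first.
        builder, deps = self.nodes[name]
        for dep in deps:
            self.build(dep, log)
        if name not in log:
            log.append(name)
            builder()


ran = []
g = Graph()
lib = g.derivation("lib", lambda: ran.append("compile lib"))
app = g.derivation("app", lambda: ran.append("link app"), deps=[lib])
assert ran == []  # describing the build ran nothing

g.build(app, log=[])
assert ran == ["compile lib", "link app"]
```

The split between describing and running is what makes the config look imperative while the build itself stays a data structure.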
I have no idea how that plays out when single-step debugging build problems, though. That depends on how it's implemented, and a lot of abstractions (especially frameworks) seem to forget that breakpoints are things other people want to use as well.
It's a good point about debugging build problems. This is an issue I've experienced in Nix and Bazel as well. I'm not convinced that I have a great solution yet, but at least for my own debugging while using the system, I've included a `zb derivation env` command which spits out a .env file that matches the environment the builder runs under. I'd like to extend that to pop open a shell.
I think this is actually a great escape hatch. Supporting non-deterministic builds means more folks will be able to migrate their existing build to zb. Postel's law and all that.
One of the insane things with Nix is that the suggested workflow is to manage _everything_ with it. This means that it wants to replace every package manager in existence, so you see Python, Emacs and other dependency trees entirely replicated in Nix. As well as every possible configuration format. It's craziness... Now I don't need to depend on just the upstream package, I also have to wait for these changes to propagate to Nix packages. And sometimes I just want to do things manually as a quick fix, instead of spending hours figuring out why the Nix implementation doesn't work.
So, yeah, having an escape hatch that allows easier integration with other ecosystems or doing things manually in some cases, would be nice to have.
I like Nix and NixOS a lot, it's really cool, but it has some really odd management issues and the language IMO is horrendous. I used NixOS for around a year, and when I was changing my Nixpkgs version I got that same generic nonsense error that doesn't have any semantic meaning, and I was just over it. I'm not too fond of commenting out random parts of code to figure out where something minor and obscure failed. Sometimes it tells you the module it had a problem with, or will point out an out-of-place comma, and other times it's just like "idk bruh ¯\_(ツ)_/¯ "failed at builtin 'seq'" is the best I can do".
The paradigm is a million dollar idea though. I have no doubt it's a large portion of the future, both for programming and generic systems. I just wish it wasn't a pain to write and it had some sensible error handling.
Still struggle with the tracebacks though. It's painful when things go wrong.
What I want in a build tool is universality. Sometimes a whole directory tree is the dependency of a target. Sometimes it's a URL, and the build tool should correctly download and cache that URL. Sometimes the prerequisite is training an ML model.
http://git.annexia.org/?p=goals.git;a=summary
http://oirase.annexia.org/2020-02-rjones-goals-tech-talk.mp4
In particular, the `release.txt` task is trivial by adding a dummy rule to generate and include dependencies; see https://www.gnu.org/software/make/manual/html_node/Remaking-... (be sure to add empty rules to handle the case of deleted dynamic dependencies). You can use hashes instead of file modification times by adding a different kind of dummy rule. The only downside is that you have to think about the performance a little.
I imagine it's possible for a project to have some kind of dynamic dependencies that GNU make can't handle, but I dare say that any such dependency tree is hard to understand for humans too, and thus should be avoided regardless. By contrast, in many other build tools it is impossible to handle some of the things that are trivial in `make`.
(if you're not using GNU make, you are the problem; do not blame `make`)
The main part that's difficult for humans is if there's a non-public class at top level rather than nested (I forget all the Java-specific terminology for the various kinds of class nesting).
> I guess you aren't keen on Java then?
Can you explain more? I don't follow.
Does it support sandboxing?
I want to introduce Windows sandboxing, too, but I'm not as confident about how to do that: https://github.com/256lights/zb/issues/31
I think making a new build system without sandboxing (or at least a plan for it) would be pretty stupid.
Fortunately he is planning it.
Quick question: if the build graph can be dynamic (I think they call it monadic in the paper), then does it become impossible to reason about the build statically? I think this is why Bazel has a static graph and why it scales so well.
IME import-from-derivation and similar in Nix is usually used for importing build configurations from remote repositories. Bazel has a repository rule system that is similar: https://bazel.build/extending/repo
So to answer your question: yes from the strictest possible definition, but in practice, I believe the tradeoffs are acceptable.
And actually, if you take the view that build systems are a form of staged programming, then all build systems are monadic, because the first stage is building the graph at all, and the second stage is evaluating it. Make, for example, has to parse the Makefiles, and during this phase it constructs the graph... dynamically! Based on the input source code! It is only during the second phase, when rules are evaluated, that the graph is static and all edges must be known. See some notes from Neil Mitchell about that.[1]
The other key is in a system like Buck or Bazel, there are actually two graphs that are clearly defined. There is the target graph where you have abstract dependencies between things (a cxx_binary depends on a cxx_library), and there is the action graph (the command gcc must run before the ld command can run).
You cannot have dynamic nodes in the target graph. Target graph construction MUST be deterministic and "complete" in the sense that it captures all nodes. This is really important because a dynamic target graph breaks features like target determination: given a list of changed files, which targets need to be rebuilt? You cannot know the complete list of targets when the target graph is dynamic and evaluation can produce new nodes. That's what everyone means when they say it's "scalable": you can detect, given only a list of changed files from version control, what the difference between the two build graphs is. And then you can go build exactly those targets and skip everything else. So, if you make a small change to a monumentally sized codebase, you don't have to rebuild everything. Just a very small, locally impacted part of the whole pie.
In other words, "small changes to the code should have small changes in the resulting build." That's incremental programming in a nutshell.
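Target determination over a static graph can be sketched in a few lines of Python. The target names, file names, and the `TARGETS` table are made up for illustration; real Buck/Bazel tooling works from the parsed BUILD files, but the shape of the computation is the same: find the targets that own a changed file, then walk reverse dependency edges.

```python
from collections import defaultdict, deque

# Hypothetical static target graph: target -> (source files, dependencies).
TARGETS = {
    "//lib:core": ({"lib/core.c"}, set()),
    "//lib:net": ({"lib/net.c"}, {"//lib:core"}),
    "//app:server": ({"app/main.c"}, {"//lib:net"}),
    "//docs:site": ({"docs/index.md"}, set()),
}


def affected_targets(changed_files):
    """Return every target that must be rebuilt for the given changed files."""
    # Invert the dependency edges so we can walk from a target to its users.
    rdeps = defaultdict(set)
    for target, (_, deps) in TARGETS.items():
        for dep in deps:
            rdeps[dep].add(target)

    # Seed with targets that directly own a changed file.
    queue = deque(t for t, (srcs, _) in TARGETS.items() if srcs & changed_files)
    affected = set(queue)
    while queue:
        for user in rdeps[queue.popleft()]:
            if user not in affected:
                affected.add(user)
                queue.append(user)
    return affected


print(sorted(affected_targets({"lib/net.c"})))
# prints ['//app:server', '//lib:net']
```

This only works because `TARGETS` is complete before any build step runs; if evaluating a target could add new entries, the reverse walk could miss them.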
OK, so there's no target graph dynamism. But you can have dynamic actions in the action graph, where the edges to those dynamic actions are well defined. For example, compiling an OCaml module first requires you to build a .m file, then read it, then run some set of commands in an order dictated by the .m file. The output is an .a file. So you always know the in/out edges for these actions, but you just don't know what order you need to run compiler commands in. That dynamic action can be captured without breaking the other stuff. There are some more notes from Neil about this.[2]
Under this interpretation, Nix also defines a static target graph in the sense that every store path/derivation is a node represented as a term in the pure, lazy lambda calculus (with records). When you evaluate a Nix expression, it produces a fully closed term, and terms that were already evaluated previously (packaged and built) are shared and reused. The sharing is how "target determination" is achieved; you actually evaluate everything, and anything that is shared is "free."
And under this same interpretation, the pure subset of Zb programs should, by definition, also construct a static target graph. It's not enough to just sandbox I/O; you also have to handle some other things. For example, if you construct hash tables with undefined iteration order, you might screw the pooch somewhere down the line. Or you could just make up things out of thin air, I guess. But if you restrict yourself to the pure subset of Zb programs, you should in theory be fine (and that pure subset is arguably the actual valuable, useful subset, so it's maybe fine.)
[1] https://ndmitchell.com/downloads/paper-implementing_applicat...
[2] https://ndmitchell.com/downloads/slides-somewhat_dynamic_bui...
In https://github.com/NixOS/rfcs/blob/master/rfcs/0092-plan-dyn... there is only an action graph but it is dynamic. Dynamic craft would depend on an entire directory, and thus need to be rebuilt a lot. But when individual files are projected out, there is a new opportunity for early cut-off.
Also, I'm curious to know if you've considered using Starlark or the build file syntax used in multiple other recent build systems (Bazel, Buck, Please, Pants).
I'm mostly contrasting from Nix, which has difficulty with poisoning cache when faced with non-deterministic build steps when using input-addressing (the default mode). If zb encounters a build target with multiple cached outputs for the same inputs, it rebuilds and then relies on content-addressing to obtain build outputs for subsequent steps if possible. (I have an open issue for marking a target as intentionally non-deterministic and always triggering this re-run behavior: https://github.com/256lights/zb/issues/33)
I'll admit I haven't done my research into how Bazel handles non-determinism, especially nowadays, so I can't remark there. I know from my Google days that even writing genrules you had to be careful about introducing non-determinism, but I forget how that failure mode plays out. If you have a good link (or don't mind giving a quick summary), I'd love to read up.
I have considered Starlark, and still might end up using it. The critical feature I wanted to bolt in from Nix was having strings carrying dependency information (see https://github.com/NixOS/nix/blob/2f678331d59451dd6f1d9512cb... for a description of the feature). In my prototyping, this was pretty simple to bolt on to Lua, but I'm not sure how disruptive that would be to Starlark. Nix configurations tend to be a bit more complex than Bazel ones, so having a more full-featured language felt more appropriate. Still exploring the design space!
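To make the "strings carrying dependency information" idea concrete, here is a rough Python sketch. The `DepString` class and the store paths are hypothetical; this is loosely modeled on the idea, not on how Nix or zb implement it.

```python
class DepString(str):
    """A string that remembers which build targets it mentions (hypothetical)."""

    def __new__(cls, text, deps=frozenset()):
        obj = super().__new__(cls, text)
        obj.deps = frozenset(deps)
        return obj

    def __add__(self, other):
        # Concatenation keeps the union of both sides' dependencies.
        other_deps = getattr(other, "deps", frozenset())
        return DepString(str.__add__(self, other), self.deps | other_deps)

    def __radd__(self, other):
        # Handles a plain string on the left: "text" + DepString(...).
        return DepString(str.__add__(other, self), self.deps)


gcc = DepString("/store/abc-gcc/bin/gcc", deps={"gcc"})
src = DepString("/store/def-hello.c", deps={"hello.c"})

# Building a command line out of the strings collects the dependencies
# without the user listing them anywhere.
command = gcc + " -o hello " + src
assert command == "/store/abc-gcc/bin/gcc -o hello /store/def-hello.c"
assert command.deps == {"gcc", "hello.c"}
```

Only `+` is covered in this sketch; f-strings, `join`, and slicing would drop the context, which hints at why the feature is easier to add at the language level than as a library.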
The way Nix-like systems achieve hermetic sandboxing isn't so much a technical feat, in my mind. That's part of it -- sure, you need to get rid of /dev devices, and every build always has to look like it happens at /tmp/build within a mount namespace, and you need to set SOURCE_DATE_EPOCH and blah blah, stuff like that.
But it's also a social one, because with Nix you are expected to wrap arbitrary build systems and package mechanisms and "go where they are." That means you have to bludgeon every random, hostile, badly written thing into working inside the sandbox you designed, carve out exceptions, and write patches for things that don't -- and get them working in a deterministic way. For example, you have to change the default search paths for nearly every single tool to look inside calculated Nix store paths. That's not a technical feat, it's mostly just a huge amount of hard work to write all the abstractions, like buildRustPackage or mkDerivation. You need to patch every build system like CMake or SCons in order to alleviate some of their assumptions, and so on and so forth.
Bazel- and Buck-like systems do not avoid this pain, but they pay for it in a different way. They don't "go where they are", they expect everyone to "come to them." Culturally, Bazel users do not accept "just run Make under a sandbox" nearly as much. The idea is to rewrite the build system from scratch as BUILD file rules, and those BUILD files should perform the build "natively" in a way that is designed to work hermetically. So you don't run ./configure, you actually pick an exact set of configuration options and build with that 100% of the time. Therefore, the impurities in the build are removed "by design", which makes the strict requirements on a sandbox somewhat more lenient. You still need the sandbox, but by definition your builds are much more robust anyway. So you are trading the pain of wrapping every system for the pain of integrating every system manually. They're not the same thing but have a lot of overlap.
So the answer is: yes, you can write impure genrules, but the vast majority of impurity is totally encapsulated in a way that forces it to be pure, just like Nix, so it's mostly just a small nit rather than truly fundamental. The real question is a matter of when you want to pay the piper.
For those who don’t know, its build descriptors are just Scala classes with functions. A function calling another function denotes a dependency, and that’s pretty much it. The build tool will automatically take care of parallelizing build steps and caching them.
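The "a function calling another function denotes a dependency" model can be approximated in Python. This is only an analogy with a made-up `task` decorator, not the tool's real API, and it shows the caching half only (no parallelism).

```python
import functools

calls = []  # records which steps actually executed, in order


def task(fn):
    """Hypothetical decorator: run a zero-argument build step at most once."""
    @functools.lru_cache(maxsize=None)
    def wrapper():
        calls.append(fn.__name__)
        return fn()
    return wrapper


@task
def sources():
    return ["a.c", "b.c"]


@task
def compile_all():
    # Calling sources() is what makes it a dependency of this step.
    return [s.replace(".c", ".o") for s in sources()]


@task
def link():
    return "app <- " + " ".join(compile_all())


@task
def check():
    return f"checked {len(compile_all())} objects"


assert link() == "app <- a.o b.o"
assert check() == "checked 2 objects"
# compile_all and sources ran once, even though two steps depend on them.
assert calls == ["link", "compile_all", "sources", "check"]
```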
How do you think it relates to Nix and alia on a technical level?
Naming is hard.
Also I hope we can keep the store layer compatible. It would be good to replace ATerm with JSON, for example. We should coordinate that!
The string tweak is transparent to users, broadly speaking. IME with Nix this thing works the way people expect (i.e. if you use a dependency variable in your build target, it adds a dependency).
- Does this sandbox builds the way flakes do?
- What is MinGW used for on Windows? Does this rely on the MinGW userland, or is it just because it would be painful to write a full bootstrap for a windows compiler while also developing Zb?
Also, it's great to see the live-bootstrap in there. I love the purity of how Guix's packages are built, and I like the idea that Zb will be that way from the start.
MinGW is used to build Lua using cgo. I'd like to remove that part (see https://github.com/256lights/zb/issues/28). I haven't started the userspace for Windows yet (https://github.com/256lights/zb/issues/6), but I suspect that it will be more "download the Visual C++ compiler binary from this URL" than the Linux source bootstrap.
Yeah, I'm happy with live-bootstrap, too! I tried emulating Guix's bootstrap, but it depended a little too much on Scheme for me to use as-is. live-bootstrap has mostly worked out-of-the-box, which was a great validation test for this approach.
The last commit using that approach was https://github.com/256lights/zb/tree/558c6f52b7ef915428c9af9... if you want to try it out. And actually, I haven't touched the Lua frontend much since I swapped out the backend: the .drv files it writes are the same.
The motivation behind replacing the backend was content-addressability and Windows support, which have been slow to be adopted in Nix core.
This way, with libcosmopolitan, you could just check in a copy of your build tool in a project to be self-sufficient. Think of it like gradlew (the Gradle bash/bat wrapper), but completely self-contained and air-gapped.
Please do not give into the temptation to just write a version manager and stitch together some hodgepodge and throw the hard problem over the fence to the "community", a set of balkanized repositories to make everything work. It is really really really hard to overstate how much value Nixpkgs gets from going the monorepo route and how much the project has been able to improve, adapt, and overcome things thanks to it. It feels like Nixpkgs regularly pulls off major code-wide changes on an average Tuesday that other projects would balk at.
(It's actually a benefit early on to just keep everything in one repo too, because you can just... clean up all the code in one spot if you do something like make a major breaking change. Huge huge benefit!)
Finally: as a die hard Nix user, I also have been using Buck2 as a kind of thing-that-is-hermetic-cloud-based-and-supports-Windows tool, and it competes in the same space as Zb; a monorepo containing all BUILD files is incredibly important for things to work reliably and it's what I'm exploring right now and seeing if that can be viable. I'm even exploring the possibility of starting from stage0-posix as well. Good luck! There's still work to be done in this space and Nix isn't the final answer, even if I love it.
I'm personally convinced monorepo is strictly superior (provided you have the right tooling to support it).
Topological. The topological scheduler pre-computes a linear order of tasks, which when followed, ensures the build result is correct regardless of the initial store. Given a task description and the output key, you can compute the linear order by first finding the (acyclic) graph of the key’s reachable dependencies, and then computing a topological sort. However this rules out dynamic dependencies.
Restarting. To handle dynamic dependencies we can use the following approach: build tasks in an arbitrary initial order, discovering their dependencies on the fly; whenever a task calls fetch on an out-of-date key dep, abort the task, and switch to building the dependency dep; eventually the previously aborted task is restarted and makes further progress thanks to dep now being up to date. This approach requires a way to abort tasks that have failed due to out-of-date dependencies. It is also not minimal in the sense that a task may start, do some meaningful work, and then abort.
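A small Python sketch of the restarting idea, under simplifying assumptions: tasks are plain functions that receive a `fetch` callback, the store starts empty, the graph is acyclic, and the initial order is just "start at the target". The names `OutOfDate`, `build`, and `TASKS` are made up for the example.

```python
class OutOfDate(Exception):
    """Raised by fetch when a task needs a key that is not built yet."""

    def __init__(self, key):
        super().__init__(key)
        self.key = key


def build(tasks, target):
    store, attempts = {}, []

    def fetch(key):
        if key not in store:
            raise OutOfDate(key)  # abort the calling task
        return store[key]

    queue = [target]
    while queue:
        key = queue[-1]
        attempts.append(key)
        try:
            store[key] = tasks[key](fetch)
            queue.pop()
        except OutOfDate as missing:
            queue.append(missing.key)  # build the dependency, then retry
    return store[target], attempts


TASKS = {
    "config": lambda fetch: "release",
    "release.o": lambda fetch: "optimized object",
    "debug.o": lambda fetch: "debug object",
    # Dynamic dependency: which object is needed depends on config's value.
    "app": lambda fetch: "app linked from " + fetch(fetch("config") + ".o"),
}

result, attempts = build(TASKS, "app")
assert result == "app linked from optimized object"
# "app" was started three times: the non-minimality the excerpt describes.
assert attempts == ["app", "config", "app", "release.o", "app"]
```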
Suspending. An alternative approach, utilised by the busy build system and Shake, is to simply build dependencies when they are requested, suspending the currently running task. By combining that with tracking the keys that have already been built, one can obtain a minimal build system with dynamic dependencies. This approach requires that a task may be started and then suspended until another task is complete. Suspending can be done with cheap green threads and blocking (the original approach of Shake) or using continuation-passing style (what Shake currently does).
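The suspending approach can be sketched in Python with ordinary recursion standing in for green threads or continuations: the caller of `fetch` is simply paused on the call stack until the dependency is built. `build` and `TASKS` are hypothetical names, and the graph is assumed to be acyclic.

```python
def build(tasks, target):
    done, order = {}, []

    def fetch(key):
        # Build the dependency on demand; the caller is suspended on this
        # call until it returns. Already-built keys are never rebuilt.
        if key not in done:
            done[key] = tasks[key](fetch)
            order.append(key)
        return done[key]

    return fetch(target), order


TASKS = {
    "config": lambda fetch: "release",
    "release.o": lambda fetch: "optimized object",
    "debug.o": lambda fetch: "debug object",
    # Dynamic dependency: which object is needed depends on config's value.
    "app": lambda fetch: "app linked from " + fetch(fetch("config") + ".o"),
}

result, order = build(TASKS, "app")
assert result == "app linked from optimized object"
# Every task ran exactly once, and debug.o was never built at all.
assert order == ["config", "release.o", "app"]
```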
If the build language is Lua, doesn't it support top-level variables? It probably just takes a few folks manipulating top-level variables before the build steps and build logic are no longer hermetic, but instead plagued by side effects.
I think you need to build inside very effective sandboxes to stop build side effects and then you need your sandboxes to be very fast.
Anyway, nice to see attempts at more innovation in the build space.
I imagine a kind of merging between build systems, deployment systems, and running systems. Somehow a manageable sea of distributed processes running on a distributed operating system. I suspect Alan Kay thought that Smalltalk might evolve in that direction, but there are many things to solve, including billing, security, and somehow making the sea of objects comprehensible. It has the hope of everything being data driven, i.e. structured, schema'd, versioned, JSON-like data rather than the horrendous mess that is Unix configuration files and system information.
There was an interesting talk on Developer Voice, perhaps related to a merger of OCaml and Erlang, that moved a little in that direction.
TIL I can also use semicolons in Lua tables, not just commas:
  return derivation {
    name = "hello.txt";
    ["in"] = path "hello.txt";
    builder = "/bin/sh";
    system = "x86_64-linux";
    args = {"-c", "while read line; do echo \"$line\"; done < $in > $out"};
  }
I like using Lua as a DSL, now I like it even more! I've been using Lua as an HTML templating language that looks like this:
  DIV {
    id="id";
    class="class";
    H1 "heading";
    P [[
    Lorem ipsum dolor sit amet, consectetur adipiscing elit,
    sed do eiusmod tempor ]] / EM "incididunt" / [[ ut labore et
    dolore magna aliqua.
    ]];
    PRE ^ CODE [[ this is <code> tag inside <pre> ]];
  }