However, they package IceCat instead of Firefox, and that's a much tougher one. Note IceCat is not very well maintained.
Nonetheless, there are a few third party repos from users with non-GNU-sanctioned software. I hope it becomes a bit like Emacs, where GNU Elpa coexists in harmony with MELPA.
Its maintainer is working on upgrading to the latest ESR now. If anyone is interested in helping maintain IceCat, please e-mail maintainers@gnu.org.
EDIT: Also what's unsexy about GNU? I'm really curious.
(edit: I understand the why of it, and even agree on principle, but it still prevents me from running linux-libre on most of my systems)
GNU utilities are not only unsexy, they are bloated, messy, and prone to failure; the GNU implementations of standard UNIX tools (coreutils: grep, cat, tail, etc.) are not done with simplicity in mind.
But hey, after all, GNU is Not Unix. Those of us who really appreciate the UNIX philosophy still have OpenBSD, which is the only light in a world of chaos, in my opinion.
And this is coming from a Gen X open source / Linux guy. What must it look like to the current generation?!
"Gnu's Not Unix": A recursive acronym used as a pun about an operating system from the 1970s, existing solely as a reflection of an aging neckbearded hippie hacker's personal philosophy about software, that is pronounced "GUH-NEW".
(I'm a Schemer and I'd love to have a Lisp machine user environment using Scheme.)
Do you want to convince people that something like guix is better than docker? Then take something that is currently distributed using docker and actually show how the guix approach is simpler.
For example, I have a random app I recently worked on where the Dockerfile was something like
FROM python:2.7
WORKDIR /app
ADD requirements.txt /app
RUN pip install -r requirements.txt
ADD . /app
RUN groupadd -r notifier && useradd --no-log-init -r -g notifier notifier
USER notifier
EXPOSE 8080/tcp
CMD ./notify.py
How do I actually take a random application like that and build a guix package of it?
Another project I work on is built on top of zeromq, and it would be great to use something like guix to define all the libsodium+zeromq+czmq+zyre dependencies and be able to spit out an 'ultimate container image' of all of that, but all this post shows me how to do is install an existing guile package.
The simplest way would be to package the app for guix. Then you could just run '$ guix environment <name-of-package>' and you would be dropped into an environment with all your dependencies, and whatever else the application requires, in your path and ready for hacking. Get your sources and editor and start working.
If you need a vm or similar though I'd translate your example above into a system config where:
- packages include python-2.7 and whatever is in requirements.txt (this may mean you have to package a few things, but again this is usually super easy)
- users and groups are added to the config, as they always are, no extra step necessary.
- exposing ports and networking is available as options for the qemu script that guix produces to launch the vm.
- CMD ./notify.py: create a "simple" service that can be autostarted by the system on boot.
- filesystem access is also handled by arguments to the qemu script.
As always though there are several paths to Rome, and these are just two of them.
Zeromq and libsodium are already packaged on guix, and czmq and zyre look like they would be simple to package. Guix is really quite simple to work with, which I think is the reason so many of the users and devs are running it as our daily driver, even though it is strictly beta (0.14, I think, is the last release).
And pointless, come on - what does that even mean? Does it mean you don't value them? I was quite happy to read about a neat new thing I can use my favorite tool for.
Yes, I know all that. It's neat. I would like to learn more about it.
> The simplest way would be to package the app for guix
I was asking how to package the app for guix, and your response is the simplest way would be to package the app for guix...
> If you need a vm or similar though I'd translate your example above into a system config where:
> - packages include python-2.7 and whatever is in requirements.txt (this may mean you have to package a few things, but again this is usually super easy)
> - users and groups are added to the config, as they always are, no extra step necessary.
> - exposing ports and networking is available as options for qemu script guix produces to launch the vm.
> - CMD ./notify.py: create a "simple" service that can be autostarted by the system on boot.
> - filesystem access is also handled by arguments to the qemu script.
Yes, I'm sure it is super easy. How do I do it?
Do you know how to use the dockerfile I posted above? You run
docker build -t myapp .
docker run myapp
that's super easy. 9 lines and 2 commands. You can now add docker expert to your resume.
> Zeromq and libsodium are already packaged on guix, czmq and zyre looks like they would be simple to package,
Well, I was working on a fork of things, so I would have needed to install my forks.
> guix is really quite simple to work with
I'm sure it is!
> And pointless, come on - what does that even mean? Does it mean you don't value them? I was quite happy to read about a neat new thing I can use my favorite tool for.
You are correct, I don't really value posts saying how cool and easy something is and how much better it is than other solutions, when they don't actually present a complete solution someone can actually use.
I get that it is not other people's job to teach me how to use something like guix, but do people not understand why things like Docker won?
While it's neat that I can do introspection on my package graph, I don't immediately see any benefit for me when I startup my containers.
I would love to see a full guix/nix script of what GP asked to see a comparison, I like to see hands-on stuff not theoretical.
No, we show that Guix is a tool that gives you a way to work with software environments at a higher level; but at the same time you don't have to give up on application bundles like Docker. You can simply generate Docker images or other forms of applications bundles from that higher-level representation.
You are welcome to take a look at this paper that I co-authored where we explain why we use Guix for a reproducible bioinformatics pipeline, and the rigorous, declarative functional package management approach instead of the imperative approach of Docker files:
https://www.biorxiv.org/content/early/2018/04/21/298653
We're also providing Docker images, but we generate them from a higher-level declarative specification that ensures a high degree of bit-reproducibility.
You define a package for your own project that depends on libsodium/zeromq/etc from GuixSD. Then you export your own package with 'guix pack'. For an example of what a package definition looks like, take a look in /gnu/packages in the GuixSD repository, for instance libsodium [1] or Vim [2].
I did something similar recently to build an Nginx "application bundle" [3]. It uses Nix (previously Guix, but Nix worked better for me in the end) to build a squashfs image. You can then run the binary on that filesystem with systemd-nspawn, or as a regular service by setting RootImage=. Some advantages over the Docker approach are that you can easily customise the build (e.g. changing the ./configure flags for Nginx without having to manually perform all other build steps), and bit by bit reproducibility (if you build the same commit six months from now, on a different machine, you will still get the same image out).
[1]: https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages...
[2]: https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages...
[3]: https://github.com/ruuda/miniserver#readme
FROM nixos/nix
RUN nix-channel --update
RUN nix-env -i python2.7-{twisted,treq,txgithub}
WORKDIR /app
ADD . /app
EXPOSE 8080/tcp
CMD python notify.py
The next level would be using the nixpkgs Docker builder directly: https://nixos.org/nixpkgs/manual/#sec-pkgs-dockerTools
Anyway, in your Dockerfile I see that your application uses Python and you do some package management and service management stuff that is mixed together. In Guix, these things are separated. So the first step would be to define a package for your software, and then you would deploy that package. For a real-world example of a Python application, here is what the AWS CLI package looks like:
(define-public awscli
  (package
    (name "awscli")
    (version "1.14.41")
    (source
     (origin
       (method url-fetch)
       (uri (pypi-uri name version))
       (sha256
        (base32
         "0sispclx263lybbk19zp1n9yhg8xxx4jddypzgi24vpjaqnsbwlc"))))
    (build-system python-build-system)
    (propagated-inputs
     `(("python-colorama" ,python-colorama)
       ("python-botocore" ,python-botocore)
       ("python-s3transfer" ,python-s3transfer)
       ("python-docutils" ,python-docutils)
       ("python-pyyaml" ,python-pyyaml)
       ("python-rsa" ,python-rsa)))
    (arguments
     '(#:tests? #f))
    (home-page "https://aws.amazon.com/cli/")
    (synopsis "Command line client for AWS")
    (description "AWS CLI provides a unified command line interface to the
Amazon Web Services (AWS) API.")
    (license license:asl2.0)))
The package recipe contains all the metadata, build instructions, and dependencies. Now that you have a package, it can be built with Guix and then deployed in a variety of ways. Judging from the Dockerfile, your software is some daemon that listens on port 8080, so:
* You can install the software directly using 'guix package -i your-package-name' and run the notify.py program. Good for trying things out.
* If you are deploying to the Guix system distribution, you could write a service definition so that you can manage the daemon via the init system. The service would take care of creating the notifier user and group, starting the service on boot, etc.
* You could use 'guix pack --format=docker' to export an image suitable for running with 'docker load'
* You could use a different 'guix pack' format (and maybe make it relocatable) for running on some other non-Guix system
I should also add that I don't think the work is fully done yet on handling the entirety of Docker use-cases. It's a work in progress. I can think of a number of things that I want to add to Guix to make this workflow better that I haven't had a chance to hack on yet.
If the versions are specified in the 'python-botocore' type definitions, how do you install more than one version of a library?
Does guix only track the latest version of dependencies or can you request any version of something?
Then it would be great to see t0nt0n or someone else who knows guix do the port so we can fully compare these two approaches.
requests==2.18.4
the actual packages generally aren't important.
The cases where that would become interesting are those that require some C library dependencies first, like libpq-dev. In those cases something like guix/nix would be nice because it could be used to pull in the specific external dependencies as well.
Either you extract it from scratch every time you run an app, taking a long time penalty...
... or you extract once to cache, and assume that nothing changes the cache. This is pretty bad from both operational and security perspective:
- backups have to walk through tens of thousands of files, thus becoming much slower
- a damaged disk or a malicious actor can change one file in the cache, making damage which is very hard to detect.
There are plenty of mountable container formats -- ISO, squashfs, even zip files -- which all provide much faster initial access, and much better security/reliability guarantees, especially with things like dm-verity.
Where to find these extensions? Are they portable between Linux and BSD?
The 1998 dict project included a utility called "dictzip" for random access to the contents of gzip compressed files.
Dumb question: Is it possible to create a utility or even a hack that performs "random access" into tar archives?
Example use case: the user only wants to untar a small number of selected files from a large tarball such as a source tree. The user has tried both the "-T filelist" option and using memory file systems instead of hard disk drives.
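One way to get "random access" into a tarball is to scan it once, remember where each member's data starts, and seek there later. The following is a minimal sketch using Python's standard `tarfile` module. It assumes an uncompressed tar containing ordinary (non-sparse) files; `build_index` and `read_member` are names made up for this illustration, not part of any existing tool.

```python
import tarfile


def build_index(tar_path):
    """Scan an uncompressed tar once and record where each regular file's data lives."""
    index = {}
    with tarfile.open(tar_path, "r:") as tar:  # "r:" refuses compressed input
        for member in tar:
            if member.isreg():
                # offset_data is the byte offset of the file's contents in the archive
                index[member.name] = (member.offset_data, member.size)
    return index


def read_member(tar_path, index, name):
    """Fetch one file's bytes by seeking straight to its recorded offset."""
    offset, size = index[name]
    with open(tar_path, "rb") as f:
        f.seek(offset)
        return f.read(size)
```

The index still costs one full pass over the archive, so this only pays off if it is saved and reused. For a gzip-compressed tarball the byte offsets are useless without something like dictzip-style chunked compression underneath.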
Funny how tar was originally developed for tape drives!
- Does not slow down your backup by adding thousands of files
- No need to wait for initial file extraction
- You can quickly and easily verify integrity of the whole archive
And if you are using fuse, it does not require any special privileges either!
If your operational and security model really frowns on trusting your extraction cache, then perhaps a different workflow is more appropriate - download the container, verify the container, extract, bake the OS plus extracted apps into an image, sign the image, verify the image upon each boot and mount apps read-only. Then you don't need to re-verify anything upon each launch, instead trusting that your image creation process is routinely updating and re-verifying the software in your current images.
A simple example: my /usr/include is 33037 files, 356M uncompressed. On SSD with cold cache, it takes 6.7 sec to read each file individually, or 0.7 sec to checksum a single 356M archive, a 10x difference.
The difference in the backup time is even more dramatic -- the backup program has to call stat() either 33K times, or just once, a 3,330,000% improvement! The other filesystem tools (What takes all the space? What has changed in the last X hours? Please sync this directory elsewhere.) will have similarly high speed improvements.
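To make the comparison concrete, here is a rough sketch of the two verification strategies: hashing every file in an extracted tree versus hashing one archive or image file. The function names are invented for this example, and it makes no attempt to reproduce the timings quoted above.

```python
import hashlib
import os


def digest_tree(root):
    """Hash every regular file under root; needs one open() per file."""
    h = hashlib.sha256()
    files_opened = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fixed traversal order so the digest is stable
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            h.update(os.path.relpath(path, root).encode())
            with open(path, "rb") as f:
                h.update(f.read())
            files_opened += 1
    return h.hexdigest(), files_opened


def digest_image(path, block_size=1 << 20):
    """Hash one archive or image file with a single sequential read."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()
```

`digest_tree` returns the number of files it had to open alongside the digest, which is the per-file overhead being complained about; `digest_image` always opens exactly one file.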
So if I had a choice, I would love my dev environment to come in mountable form. Similarly, I don't understand why container runtimes (like docker) don't use loop mounts more -- it seems like many advantages and very few disadvantages.
As for signature verification -- I don't care about 3rd party signature and revocation, I just want to ensure that I am running the same code every time. There are many ways one can damage extraction cache, especially if it is owned by the same user as application (like the topicstarter post described) -- sysadmin errors (`sudo find / -name app-old -delete`), application errors (create cache file in bin dir), disk errors (silent corruption), transfer errors (one file did not get transferred to a new computer). Loop mounting makes disk errors easier to detect, and eliminates other classes of error entirely.
It's brought production to a halt on more than one occasion if I try to "restore" from a backup by extracting the files and moving into production without manually fixing them first.
When using plain Guix you won't need to use any archive format at all; packages simply end up each in their own unique directory and can be used just like that. You can easily spawn a container environment where only the relevant directories under `/gnu/store` are mounted.
It's on my list to add more target formats for `guix pack`, but generally I'd recommend using Guix directly to reap all benefits. `guix pack` is only really useful for cases where you cannot use Guix on the target system.
http://lists.gnu.org/archive/html/guix-patches/2018-05/msg00...
Another problem is that there is no way to just get the latest entry in a multi-layered image without scanning every layer sequentially (this can be made faster with a top-level index but I don't think anyone has implemented this yet -- I am working on it for umoci but nobody else will probably use it even if I implement it). This means you have to extract all of the archives.
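A simplified sketch of that lookup, to show why it is expensive: each layer has to be read in full just to learn whether it mentions the path. It assumes the layers are plain tar files listed oldest first, and it handles only the per-file `.wh.<name>` whiteout marker used by OCI layers (not opaque directories or deleted parent directories). `lookup` is a hypothetical helper, not umoci's API.

```python
import posixpath
import tarfile


def lookup(layers, path):
    """Return the newest contents of `path`, or None if it is absent or deleted.

    `layers` is a list of tar file paths, oldest first.
    """
    directory, name = posixpath.split(path)
    whiteout = posixpath.join(directory, ".wh." + name)
    for layer in reversed(layers):  # newest layer wins
        with tarfile.open(layer) as tar:
            names = set(tar.getnames())  # forces a scan of the whole layer
            if whiteout in names:
                return None  # deleted in this layer
            if path in names:
                return tar.extractfile(path).read()
    return None
```

A top-level index would replace the `getnames()` scan with a single dictionary lookup that says which layer to open.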
Yet another problem is that if you have a layer which just includes a metadata change (like the mode of a file), then you have to include a full copy of the file into the archive (same goes for a single bit change in the file contents -- even if the file is 10GB in size). This balloons up the archive size needlessly due to restrictions in the tar format (no way of representing a metadata entry in a standard-complying way), and increases the effect of the previous problem I mentioned.
And all of the above ignores the fact that tar archives are not actually standardised (you have at least 3 "extension" formats -- GNU, PAX, and libarchive), and different implementations produce vastly different archive outputs and structures (causing problems with making them content-addressable). To be fair, this is a fairly solved problem at this point (though sparse archives are sort of unsolved) but it requires storing the metadata of the archive structure in addition to the archive.
Despite all of this Docker and OCI (and AppC) all use tar archives, so this isn't really a revolutionary blog post (it's sort of what everyone does, but nobody is really happy about it). In the OCI we are working on switching to a format that solves the above problems by having a history for each file (so the layering is implemented in the archiving layer rather than on top) and having an index where we store all of the files in the content-addressable storage layer. I believe we also will implement content-based-chunking for deduplication to allow us to handle minor changes in files without blowing up image sizes. These are things you cannot do in tar archives and are fundamentally limited.
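For readers unfamiliar with content-based chunking, here is a toy sketch of the idea. It is not the OCI design (which the comment above describes only in outline) but a generic gear-style rolling hash: chunk boundaries are chosen from the bytes themselves, so a small edit tends to disturb only nearby chunks. The table seed, the size limits, and the function names are all arbitrary choices for illustration.

```python
import hashlib
import random

_rng = random.Random(0)  # fixed seed so every run uses the same table
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def chunks(data, avg_bits=12, min_size=256, max_size=65536):
    """Split data where a rolling hash of the most recent bytes hits a fixed pattern."""
    cut_mask = (1 << avg_bits) - 1
    out, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Older bytes are shifted out of the low bits, so the low avg_bits bits
        # depend only on the last avg_bits bytes.
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        if (length >= min_size and (h & cut_mask) == 0) or length >= max_size:
            out.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        out.append(data[start:])
    return out


def chunk_ids(data):
    """Content addresses for each chunk; identical chunks share an id."""
    return [hashlib.sha256(c).hexdigest() for c in chunks(data)]
```

Storing chunks under their ids means two versions of a large file that differ in a few bytes will usually share most of their chunk ids, which is the deduplication effect that a tar layer cannot provide.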
I appreciate that tar is a very good tool (and we shouldn't reinvent good tools), but not wanting to improve the state-of-the-art over literal tape archives seems a bit too nostalgic to me. Especially when there are clear problems with the current format, with obvious ways of improving them.
* The container (file) for the filesystem must necessarily be larger than the metadata+data for the filesystem because filesystems really don't like almost-full disks. And unless I'm mistaken sparse files are not usable for loopback devices (so you can't hack your way out of it).
* Most filesystems don't have a snapshot-style history so you would have to pick a specific filesystem from that list (otherwise you'd be forced to make CoW duplicates of the filesystem to create snapshots -- which is interestingly how Docker does layered storage with devicemapper) which has slightly similar problems to layered tar archives.
* The kernel's filesystem parsers are not really considered to be safe against an adversary, from what I've been told by filesystem engineers. So mounting random loopback files with filesystems on them might end badly.
* There is no way of looking at the archive using a userspace tool (without mounting), unless you re-implement the kernel parser for the filesystem. To be fair, this is true for any format, but filesystems are far more complicated and harder-to-parse than most other formats.
* Having a single blob as your entire image history and so on will mean that you can no longer have content-addressable storage for your images without adding something like content-defined chunking on top (which is then another layer of storage on top of your underlying storage).
* Using a Linux filesystem would mean you couldn't use the filesystem on different operating systems very easily. Even if it was compatible on whatever other filesystem you are using, userspace has no way of being sure there isn't a bug in either side's parser -- and what happens if one side changes the on-disk format. If the protocol is in userspace then it can be handled there.
* Most filesystems don't let you remap users, so if you wanted to run a container in a user namespace you would need to either rewrite the filesystem structure or mount the filesystem and copy it to another filesystem. To be fair, tar archives require you to do the mapping on extraction which is a similar problem, but far less complicated.
* Everyone would be opinionated about what filesystem to use, which means that you'd have to deal with every filesystem people throw at you, making it harder to be interoperable and adding choices where they aren't necessary. It should be up to the user what filesystem they use for storage, not the image distributor.
Now, this hasn't stopped people from trying to use this. Singularity's internal format is a loopback file with a filesystem inside, and they have privileged suid binaries that mount it. And it does have genuine performance benefits, and if you don't want things like content-addressability then it can work for some use cases.
One could create a utility to make tarballs with a TOC and the ability to index while still remaining compatible with tar and gzip. Pigz is one step in the direction.
To extract a particular entity: `tar -vxf tarball.tar path_in_tarball_to_entity`
edit: good points on it not being efficient for large archives, just demonstrating it is possible.
It isn't that it is possible, it is that it is horribly inefficient.
Zips, on the other hand, unify storage and compression such that one has random access to a particular file, hence most modern file formats are zips with XML or JSON inside.
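As a small illustration with Python's standard `zipfile` module: the central directory at the end of a zip records where each member starts, so a single file can be read without decompressing the others. The helper names here are made up for the example.

```python
import zipfile


def list_entries(zip_path):
    """Return (name, header offset, compressed size) for each member."""
    with zipfile.ZipFile(zip_path) as zf:
        return [(i.filename, i.header_offset, i.compress_size) for i in zf.infolist()]


def read_one(zip_path, member):
    """Read a single member; zipfile seeks to it using the central directory."""
    with zipfile.ZipFile(zip_path) as zf:
        return zf.read(member)
```

Because each member is compressed on its own, the price is a worse compression ratio than a `.tar.gz`, where compression runs across file boundaries.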
- environment variables like locales. If your software expects to run with English sorting rules and UTF-8 character decoding, it shouldn't run with ASCII-value sorting and reject input bytes over 127.
- Entrypoints. If your application expects all commands to run within a wrapper, you can't enforce that from a tarball.
You can make conventions for both of these like "if /etc/default/locales exists, parse it for environment variables" and "if /entrypoint is executable, prepend it to all command lines", but then you have a convention on top of tarballs. (Which, to be fair, might be easier than OCI—I have no particular love for the OCI format—but the problem is harder than just "here are a bunch of files.")
And entrypoints/wrappers are definitely possible from a tarball. Just wrap the executables in bin/, replacing them with shell script (or whatever) wrappers pointing to the real executables. That's what Nix/Guix do for languages like Python which require dependencies to be provided by environment variables (as they don't have a way to "close over" the locations of their dependencies).
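A minimal sketch of that wrapping step, assuming a POSIX `/bin/sh` on the target. It moves the real executable aside and leaves a small shell script in its place that exports the required environment variables and then `exec`s the original. The `.<name>-real` naming and the `wrap_executable` helper are illustrative choices, not a copy of what Nix or Guix actually ship.

```python
import os
import shlex
import stat


def wrap_executable(path, env):
    """Rename `path` to `.<name>-real` and leave a shell wrapper in its place."""
    directory, name = os.path.split(path)
    real = os.path.join(directory, "." + name + "-real")
    os.rename(path, real)
    lines = ["#!/bin/sh"]
    for key, value in sorted(env.items()):
        lines.append("export {}={}".format(key, shlex.quote(value)))
    # exec replaces the shell, so signals and exit codes go to the real program
    lines.append('exec {} "$@"'.format(shlex.quote(real)))
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    os.chmod(path, os.stat(real).st_mode | stat.S_IXUSR)
    return real
```

For example, `wrap_executable("bin/notify", {"LC_ALL": "en_US.UTF-8"})` would cover the locale case mentioned earlier: anyone running `bin/notify` gets the wrapper, and the tarball needs no extra convention on top.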
And around and around we go
1. Binary packages are simply compressed archives (tarballs) of the relevant branch in the /Programs tree.
2. branches do not have to actually live inside the /Programs tree. There are tools available to move the branches in and out of /Programs.
All this because Gobolinux leverages symbolic links as much as possible.
I suspect there are ways to introduce hashes to Gobo, if one were so inclined. But so far nobody has.
I wonder if this could be implemented with the WAL/journal system. Make each layer immutably append to the previous layers to make restarting at any layer trivial. I'm not sure if there's such a way to hook into the journal directly like that though.
sqlar is, after all, only a table definition. If you don't need FUSE access, or are willing to write your own, SQLite3 can go a long way toward providing arbitrary neat functionality.
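To show how little there is to it, here is a sketch that creates the `sqlar` table and stores and retrieves a file through Python's built-in `sqlite3`. The schema follows the published SQLite Archive layout (content is zlib-compressed only when that makes it smaller); `put` and `get` are hypothetical helper names, and error handling is kept to the bare minimum.

```python
import sqlite3
import zlib

SQLAR_SCHEMA = """
CREATE TABLE IF NOT EXISTS sqlar(
  name TEXT PRIMARY KEY,  -- name of the file
  mode INT,               -- access permissions
  mtime INT,              -- last modification time
  sz INT,                 -- original file size
  data BLOB               -- compressed content
);
"""


def put(db, name, content, mode=0o100644, mtime=0):
    """Insert or replace one file in the archive."""
    compressed = zlib.compress(content)
    # sqlar convention: keep the raw bytes when compression does not help
    blob = compressed if len(compressed) < len(content) else content
    db.execute(
        "REPLACE INTO sqlar(name, mode, mtime, sz, data) VALUES (?, ?, ?, ?, ?)",
        (name, mode, mtime, len(content), blob),
    )


def get(db, name):
    """Return the original bytes of one file."""
    row = db.execute("SELECT sz, data FROM sqlar WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    sz, blob = row
    # A blob whose length equals sz was stored uncompressed
    return blob if len(blob) == sz else zlib.decompress(blob)
```

Since it is an ordinary table, extra features such as a per-file hash column or a query for "everything changed since X" are just more SQL on top.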
Relocation currently requires a little C wrapper, which uses Linux namespaces, as the blog post indicates.
If you want something more advanced, such as a bundle that includes an init and services, it's best to use `guix system`, which builds VM images among others.
[0] https://lists.gnu.org/archive/html/bug-guile/2013-03/msg0000...