The fundamental problem of programming language package management

(blog.ezyang.com)

134 points · route66 · 11y ago · 72 comments

52 comments · 14 top-level

chrisfarms11y ago· 9 in thread

Nix, NixOS, Nix ... a thousand times Nix.

I can't believe the article doesn't mention it.

I've been using NixOS as my OS for development and desktop use, and we're in the middle of transitioning to it for production deployments too.

Nix (the package manager, not the distribution) solves so many of the discussed problems. And NixOS (the Linux distribution) ties it all together so cleanly.

I keep my own fork of the Nixpkgs repository (which includes everything required to build the entire OS and every package). This is like having your own personal Linux distribution, but with the simplest possible way of merging changes from upstream or contributing back to it.

I use it like I'd use virtualenv. I use it like I'd use chef. I use it like I'd use apt. I use it like I'd use Docker.

http://www.nixos.org

davexunit11y ago

Yes! The Haskell community has been using Nix to great effect and I would like to see other programmers catch on as well. Here is a great talk about using Nix targeted at Python programmers: http://pyvideo.org/video/3036/rethinking-packaging-developme...

In addition to Nix, there is also a newer project: GNU Guix. Guix is built on top of Nix but replaces the custom package configuration language with Scheme, among other differences. https://gnu.org/software/guix/

When package management is solved at the system level, our deployment situation becomes a whole lot better. I used to do a lot of Ruby programming. Wrestling with RVM and bundler was a real pain, especially since bundler was incapable of helping me with the non-Ruby software that I needed as well like libmysqlclient, imagemagick, etc. Using Nix/Guix, you can throw out hacky RVM (that overrides bash built-ins like cd!) and simply use a profile that has the right Ruby version.

Bye pip, bundler, composer, CPAN, puppet, ansible, vagrant, ..., and hello Nix/Guix!

vertex-four11y ago

> In addition to Nix, there is also a newer project: GNU Guix. Guix is built on top of Nix but replaces the custom package configuration language with Scheme, among other differences. https://gnu.org/software/guix/

Personally, I'm rather more keen on Nix; the language is pretty much designed for writing JSON-style configuration except as a formal programming language, which is what the vast majority of Nix code is (both package definitions and system configurations).

Additionally, with Nix, you can be close to certain that if you build something twice, you'll get the same result, because it can't access impure resources.

Finally, because Guix is a GNU project, the official repositories are going to go nowhere near non-free software. Nixpkgs contains non-free software, although disabled from installation by default. You might be a little less likely to have people help you get non-free software working on the GNU Guix mailing lists, if you happen to use any.

bad_user11y ago

You could of course use a platform that doesn't depend on OS-dependent binaries (like the JVM) and a package manager that likes ad-hoc and easily created repositories and that has lots of plugins available (Maven, or derivatives like Gradle, SBT or Leiningen).

I worked with a lot of platforms, such as PHP, Perl, Ruby, Python, Node.js and .NET. I felt the pain of pip, easy_install, setuptools, virtualenv, bundler, gems, cpan, pear, rvm, rbenv, npm, bower, apt-get or whatever else I used at some point or another. And I swear, in spite of all the criticism that Java or Maven get and in spite of all warts, in terms of packaging and deployment for me it's been by far the sanest. I mean, it's not without warts, heaven forbid you end up with classpath issues due to transitive dependencies, but at the very least it is tolerable.

1_player11y ago

I would be very interested in reading a detailed post about your experiences with Nix/NixOS.

I've been hearing a lot about this project, but I always thought it was just an academic experiment. I'm in the process of packaging and maintaining a Python+Javascript+Redis+PostgreSQL application and Nix certainly is something I should learn more about.

platz11y ago

But is it hard getting packages into the Nix ecosystem? If there's too much friction or if it's too hard to resolve all the dependencies yourself before pushing your pkg, I fear it may suffer from lagging behind the latest available versions of most packages.

vertex-four11y ago

Everyone has to resolve their immediate dependencies anyway to push it to pip or whatever else, or nobody can install it. The only additional dependency to push it to Nix would be the language interpreter.

Nix doesn't require you to specify the entire dependency tree; each dependency specifies its own dependencies, and those are resolved during the build process.

(For the record, at least Haskell, Python, and node.js packages are pulled into the nixpkgs tree from their respective package repositories regularly, albeit with many native dependencies missing; there's a separate file that you can edit and send a pull request for packages which have native dependencies.)
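
To make the point about per-package declarations concrete, here is a minimal Python sketch. It is not how Nix is implemented; the package names and the `DIRECT_DEPS` table are invented for illustration. Each package lists only its immediate dependencies, and the full set is found by following those declarations:

```python
# Hypothetical index: each package declares only its *direct* dependencies.
DIRECT_DEPS = {
    "myapp": ["http-client", "interpreter"],
    "http-client": ["tls-lib", "interpreter"],
    "tls-lib": ["interpreter"],
    "interpreter": [],
}


def closure(package, index):
    """Return every package needed by `package`, including itself."""
    seen = set()
    stack = [package]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        # Follow the dependency's own declarations; the caller never lists them.
        stack.extend(index[name])
    return seen
```

The author of `myapp` only writes down two names; `tls-lib` is picked up because `http-client` declares it.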

innguest11y ago

Yes, thank you! But people still want to "solve" the problem the hard way, I guess.

Also, "Java" is mentioned twice in the article but I can't find any mention of OCaml functors. I thought they solved most package problems even before Nix was around?

tel11y ago

There's a bit of missing context: the author is most likely writing this as part of his research on implementing Backpack, a system which would bring OCaml functor-like things to Haskell.

mercurial11y ago

Functors are part of the OCaml module system, but they don't really have any relation to package management with version numbers and dependencies.

akavel11y ago· 7 in thread

This topic reminded me of some very interesting thoughts from Joe Armstrong, that I remember seeing posted somewhere (HN?) some time ago -- "Why do we need modules at all?":

    [...]
    The basic idea is
    
        - do away with modules
        - all functions have unique distinct names
        - all functions have (lots of) meta data
        - all functions go into a global (searchable) Key-value database
        - we need letrec
        - contribution to open source can be as simple as
          contributing a single function
        - there are no "open source projects" - only "the open source
          Key-Value database of all functions"
        - Content is peer reviewed
    
    These are discussed in no particular order below:
    [...]

Full thread: http://thread.gmane.org/gmane.comp.lang.erlang.general/53472
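
A toy Python sketch of what such a store might look like, purely to make the bullet points tangible. The class and method names (`FunctionStore`, `publish`, `search`) are invented here and do not come from the Erlang thread:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Entry:
    func: Callable
    metadata: dict = field(default_factory=dict)


class FunctionStore:
    """One global namespace: every function has a unique name plus metadata."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, func, **metadata):
        # "All functions have unique distinct names": reject collisions.
        if name in self._entries:
            raise KeyError(f"name already taken: {name}")
        self._entries[name] = Entry(func, metadata)

    def lookup(self, name):
        return self._entries[name].func

    def search(self, **criteria):
        # "Searchable": match on exact metadata values only, for brevity.
        return sorted(
            name
            for name, entry in self._entries.items()
            if all(entry.metadata.get(k) == v for k, v in criteria.items())
        )
```

Even in this tiny form, the usefulness of `search` depends entirely on how much metadata contributors are willing to supply.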

munificent11y ago

> all functions have (lots of) meta data

This is the crux of the problem. Before too long, the amount of metadata dwarfs the thing it describes and it's easier to rewrite the function than it is to find or describe it.

peterwwillis11y ago

I already deal with a large library of functions like that, and it's totally mind-numbing. I had to create tools just to try to group the functions in different ways so I could figure out if there was anything like what I wanted to use, what it was called, and how to use it.

Modules and a sub-module hierarchy offer a better, simpler way of organizing things.

TeMPOraL11y ago

I think developers of QuickUtil independently thought of the same and are pursuing this idea.

http://quickutil.org/

bcoates11y ago

I don't see the difference between this and how module systems in dynamic languages work, the key-value database is accessed with require(), include(), import() or whatever. I suppose you'd need to write a shim to invoke the package manager when an unexpected module is requested but that wouldn't be hard.
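
A rough Python sketch of such a shim, assuming a last-resort import hook is an acceptable place for it. The class name `MissingModuleRecorder` is made up, and the step where a package manager would actually be invoked is left as a comment rather than guessed at:

```python
import sys


class MissingModuleRecorder:
    """Last-resort finder: sees only the imports that nothing else could satisfy."""

    def __init__(self):
        self.missing = []

    def find_spec(self, fullname, path, target=None):
        # A real shim would ask a package manager to fetch `fullname` here
        # (hypothetical step) and then retry. This sketch only records the name.
        self.missing.append(fullname)
        return None  # fall through to the normal ImportError


recorder = MissingModuleRecorder()
# Appended last, so the standard finders get the first chance at every import.
sys.meta_path.append(recorder)
```

After this runs, a failed `import some_unknown_module` still raises `ImportError`, but `recorder.missing` contains the requested name.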

zak_mc_kracken11y ago

> - all functions have unique distinct names

Seriously? Is that his solution to package management?

TeMPOraL11y ago

Well, if you're using packages and modules, the full name of your function is still packagename/modulename::functionname; it might as well be packagename.modulename.functionname.

Keep in mind that Joe Armstrong is talking about Erlang here, which is a functional language - most of the functions in libraries are sort-of kind-of independent from each other; they especially don't share state.

xsace11y ago

I can't see how that can be better than a cohesive module API.

calpaterson11y ago· 5 in thread

The downside he mentions to "pinned versions" actually applies to everything on this page. If you don't pay attention to security updates, you will be vulnerable whether you forgot about your pinned versions or you forgot about your stable distribution.

"Stable" distributions have an additional downside he doesn't mention: when you upgrade every package all at once it's a LOT more effort than if you had upgraded them slowly over time. Dealing with multiple library changes at once is an order of magnitude more difficult than dealing with them one-at-a-time.

And also, to some extent, if all the libraries you are using have a long term stable API, then it doesn't actually matter which one you pick - anything is painless.

mcherm11y ago

> Dealing with multiple library changes at once is an order of magnitude more difficult than dealing with them one-at-a-time.

Curious... I have exactly the opposite experience. I find that a certain amount of time is required to carefully regression-test my application code after upgrading a library. Doing this 23 times for my 23 different dependencies that need to be upgraded can be quite costly. If I, instead, upgrade all of the libraries at once and perform my extensive regression testing just once, I save a great deal of effort.

That's if everything goes smoothly. If something does NOT go smoothly and I encounter an error, then I need to determine which upgrade caused the problem. Most of the time (85% perhaps?), that turns out to be easy and obvious just by looking at the error that presents itself. In the remaining cases, I simply roll back half of the package upgrades and start binary-searching to identify the culprit (or culprits in the case of a conflict between libraries).
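
The binary search described above can be sketched in a few lines of Python. This assumes exactly one bad upgrade and a caller-supplied `tests_pass` function (hypothetical) that applies only the given upgrades and runs the regression suite; the conflict-between-libraries case is not handled:

```python
def find_culprit(upgrades, tests_pass):
    """Binary-search for the single upgrade that breaks the test suite.

    Assumes exactly one culprit, and that tests_pass(subset) is True
    whenever the culprit is not part of `subset`.
    """
    candidates = list(upgrades)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if tests_pass(half):
            # The first half is clean, so the culprit is in the remainder.
            candidates = candidates[len(half):]
        else:
            candidates = half
    return candidates[0]
```

With 23 upgrades this needs at most five test runs, instead of 23 separate upgrade-and-test cycles.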

bad_user11y ago

Personally I agree with you, except that sometimes you find out that a library you depend on hasn't been upgraded and cannot be used in combination with your other libraries, due to some conflict somewhere. At which point one needs to drop the library from the project and that can prove to be costly, so it is better to identify libraries that aren't well maintained earlier rather than later. This isn't a problem for well established popular libraries, but is a problem if you're using newer, more cutting-edge stuff.

calpaterson11y ago

Do you have an automated testing suite? That might explain the difference in experience. I use a lot of automated tests.

mrottenkolber11y ago

> Dealing with multiple library changes at once is an order of magnitude more difficult than dealing with them one-at-a-time.

From my experience exactly the opposite is true. Compare upgrading Slackware to keeping an Arch Linux system running. With Slackware, I have to sit down for an hour, do the upgrade, read the notices that come along with it, maybe see if it will break any of my custom packages. This happens once or twice a year (security upgrades are completely painless as they don't break things). With Arch Linux I need to do that every day. If I don't have time to do it for a month, the system is basically broken beyond recognition...

calpaterson11y ago

What I'm advocating is not analogous to the Arch Linux model, partly because I don't think we should stop supporting old versions and I don't think upgrades should be essentially required the moment something is released.

I'm fine with having the odd out of date version of something, I'm just saying: be incremental about keeping your stuff up to date.

avsm11y ago· 4 in thread

> The Git of package management doesn't exist yet

We've taken a pretty good shot at this in the OCaml ecosystem via the OPAM package manager (https://opam.ocaml.org).

* OPAM composes its package universe from a collection of remotes, which can be fetched either via HTTP(S), Git, Hg or Darcs. The resulting package sets are combined locally into one view, but can be separated easily. For instance, getting a view into the latest XenAPI development trees just requires "opam remote add xapi-dev git://github.com/xapi-project/opam-repo-dev".

* The same feature applies to pinning packages ("opam pin add cohttp git://github.com/avsm/ocaml-cohttp#v0.6"). This supports local trees and remote Git/Hg/Darcs remotes (including branches).

* OCaml, like Haskell, is statically typed, and so recompiles all the upstream dependencies of a package once it's updated. This lets me work on core OCaml libraries that are widely used, and just do an "opam update -u" to recompile all dependencies to check for any upstream breakage. We did not go for the very pure NixOS model due to the amount of time it takes to compile distinct packages everywhere. This is a design choice to balance composability vs responsiveness, and Nix or 0install are fine choices if you want truly isolated namespaces.

* By far the most important feature in OPAM is the package solver core, which resolves version constraints into a sensible user-facing solution. Rather than reinvent the (rather NP-hard) solver from scratch, OPAM provides a built-in simple version and also a CUDF-compatible interface to plug into external tools like aspcud, which are used by other huge repositories such as Debian to handle their constraints.

This use of CUDF leads to some cool knobs and utilities, such as the OPAM weather service to test for coinstallability conflicts: http://ows.irill.org/ and the solver preferences that provide apt-like preferences: https://opam.ocaml.org/doc/Specifying_Solver_Preferences.htm...

* Testing in a decentralized system is really, really easy by using Git as a workflow engine. We use Travis to test all incoming pull requests to OPAM, much like Homebrew does, and can also grab a snapshot of a bunch of remotes and do bulk builds, whose logs are then pushed into a GitHub repo for further analysis: https://github.com/ocaml/opam-bulk-logs (we install external dependencies for bulk builds by using Docker for Linux, and Xen for *BSD: https://github.com/avsm/docker-opam).
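
To give a feel for what the solver bullet above is describing, here is a deliberately naive Python sketch of constraint resolution by backtracking. It is not OPAM's algorithm and has nothing to do with CUDF; the `INDEX` data and the `solve` function are invented for illustration, and versions are plain integers:

```python
# Hypothetical index: package -> version -> {dependency: allowed versions}
INDEX = {
    "app": {1: {"a": {1, 2}, "b": {1, 2}}},
    "a": {1: {"c": {1}}, 2: {"c": {2}}},
    "b": {1: {"c": {1}}, 2: {"c": {2}}},
    "c": {1: {}, 2: {}},
}


def solve(index, todo, chosen=None):
    """Return {package: version} meeting every requirement, or None.

    `todo` is a list of (package, allowed_versions) requirements.
    """
    chosen = dict(chosen or {})
    if not todo:
        return chosen
    (name, allowed), rest = todo[0], todo[1:]
    if name in chosen:
        # Already picked: the earlier choice must satisfy this requirement too.
        return solve(index, rest, chosen) if chosen[name] in allowed else None
    # Try the newest permitted version first, backtracking on failure.
    for version in sorted(allowed & set(index[name]), reverse=True):
        deps = list(index[name][version].items())
        result = solve(index, deps + rest, {**chosen, name: version})
        if result is not None:
            return result
    return None
```

Exhaustive search like this blows up quickly on real repositories, which is one reason to delegate the problem to dedicated external solvers.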

All in all, I'm very pleased with how OPAM is coming along. We use it extensively for the Mirage OS unikernel that's written in OCaml (after all, it makes sense for a library operating system to demand top-notch package management).

If anyone's curious and wants to give OPAM a spin, we'd love feedback on the 1.2beta that's due out in a couple of weeks: http://opam.ocaml.org/blog/opam-1-2-0-beta4/

Tyr4211y ago

I am a Haskell programmer who got to use OPAM when installing Coq (HoTT version) recently. It was surprisingly nice.

Also, you can pick which version of the compiler to run, and have it manage switching everything.

It seemed like it was years ahead of cabal, but that might just be because I only used it a little, I don't know. But there are some things to learn from OPAM.

Do you have a blog post like this, or something I could post to the Haskell subreddit?

avsm11y ago

We're planning a post after the ICFP rush next week dies down, but there's a rather specific one on the new pinning workflow in OPAM 1.2 here:

http://opam.ocaml.org/blog/opam-1-2-pin/

(How to pin a development is central to the day-to-day development workflow of OCaml/OPAM users and quite annoying to change after-the-fact, so we're eager for feedback on this iteration before we bake it into the 1.2.0 release).

The OPAM blog is only about 2 weeks old, so there are quite a few more posts coming up as our developers discover there's quite a lot to write about :)

stewbrew11y ago

Does OPAM work on Windows? I've read it doesn't which has kept me from further investigations.

avsm11y ago

OPAM itself compiles on Windows, but most of the package repository doesn't. That's next on our list after the 1.2.0 release comes out (along with cross compilation for targets like iOS, Android and Java, due to the availability of compiler backends for all of these systems now).

davidgerard11y ago· 3 in thread

Every program with plugins of any sort will eventually include a sketchy rewrite of apt-get. Not just languages - WordPress, MediaWiki ...

If you're very lucky, the packaging in question will not conflict horribly with apt or yum. So you probably won't be lucky.

bad_user11y ago

As if apt-get would solve the problems that we have. Good luck in installing multiple versions of the same package with apt-get.

elwin11y ago

Inability to install multiple versions of the same package can be a feature. It encourages the creation of stable, backwards-compatible packages.

davidgerard11y ago

Yes, there's that too.

shadowmint11y ago· 3 in thread

yeah yeah, I read the previous post (http://www.standalone-sysadmin.com/blog/2014/03/just-what-we...) too.

Maybe this time we can talk about how to meaningfully solve these problems instead of just fighting pointlessly about whether old tools are so great that they should be used for everything.

Decentralized package management huh?

How would that work?

A way of specifying an ABI for a package instead of a version number? A way to bundle all your dependencies into a local package to depend on and push changes from that dependency tree automatically to builds off of it, but only manually update the dependency list?

I'm all for it. Someone go build one.

talex511y ago

"Decentralized package management huh? How would that work?"

http://0install.net/ does this (sad to see it wasn't mentioned in the article). Basically:

1. Use URIs rather than short names to identify packages.

2. Scope dependencies so different applications can see different versions of the same library where necessary.

Here's an OSNews article from 2007 about such things:

http://www.osnews.com/story/16956/Decentralised-Installation...

mercurial11y ago

> A way of specifying an ABI for a package instead of a version number?

Technically impossible for many languages (have fun figuring out what it would look like in Perl...). And even when it's possible, it's not a guarantee: you can have a semantic change without an ABI change. Cargo, Rust's newfangled package manager, supposes semantic versioning, and I think it's a sane attitude.
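
As a rough illustration of what "supposing semantic versioning" buys a package manager, here is a Python sketch of the usual "compatible release" check: a candidate is acceptable if it is at least the required version and the leftmost non-zero component is unchanged. The function names are made up, pre-release tags and build metadata are ignored, and this should not be read as the exact rule of any particular tool:

```python
def parse(version):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def is_compatible(required, candidate):
    """True if `candidate` can stand in for `required` under semver rules."""
    req, cand = parse(required), parse(candidate)
    if cand < req:
        return False
    # The leftmost non-zero component must not change.
    for i, part in enumerate(req):
        if part != 0:
            return cand[: i + 1] == req[: i + 1]
    return cand == req  # 0.0.0 is only compatible with itself
```

The check says nothing about behaviour, of course: it only encodes the promise that the author made when choosing the version number.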

tel11y ago

Stronger and stronger typing makes this ABI guarantee stronger and stronger. Dependent typing could package theorems about the properties of your interface and then ensure that all matches satisfy those properties. You'd likely depend upon their theorems to prove things about your own program, knowing that nothing can break.

pkolaczk11y ago· 3 in thread

Just by reading the title I expected this would be about fixing the dependency diamond problem. I.e. when library A needs library C 1.0 and B needs library C 1.1 incompatible with C 1.0 and then libraries A and B meet in the same project :(
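
Detecting the diamond is the easy part. A small Python sketch, assuming requirements arrive as `(requirer, library, version)` triples and that any two different versions of a library are incompatible, as in the example above. `find_conflicts` is a name invented for this illustration:

```python
from collections import defaultdict


def find_conflicts(requirements):
    """Return {library: {version: [requirers]}} for libraries wanted in >1 version."""
    wanted = defaultdict(dict)
    for requirer, library, version in requirements:
        wanted[library].setdefault(version, []).append(requirer)
    return {lib: versions for lib, versions in wanted.items() if len(versions) > 1}
```

Reporting the conflict is straightforward; deciding what to do once A and B disagree is where the real difficulty lies.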

cgh11y ago

In Java-land, OSGi was invented to solve this very problem. Every module has its own classloader so module A can load C 1.0 and module B can load C 1.1. Modules are registered and other modules can look them up in the registry and call them so A can look up and then call B without conflicts.

ExpiredLink11y ago

OSGi is good in theory but too much ado for most real-world projects.

jberryman11y ago

The author is actually working on implementing a solution to that in ghc/cabal as well!

Chris_Newton11y ago· 2 in thread

The situation with packages and dependency hell today is horrendous, particularly if you work in a highly dynamic environment like web development.

I want to illustrate this with a detailed example of something I did just the other day, when I set up the structure for a new single page web application. Bear with me, this is leading up to the point at the end of this post.

To build the front-end, I wanted to use these four tools:

- jQuery (a JavaScript library)

- Knockout (another JavaScript library)

- SASS (a preprocessor to generate CSS)

- Jasmine (a JavaScript library/test framework)

Notice that each of these directly affects how I write my code. You can install any of them quite happily on its own, with no dependencies on any other tool or library. They are all actively maintained, but if what you’ve got works and does what you need then generally there is no need to update them to newer versions all the time either. In short, they are excellent tools: they do a useful job so I don’t have to reinvent the wheel, and they are stable and dependable.

In contrast, I’m pretty cynical about a lot of the bloated tools and frameworks and dependencies in today’s web development industry, but after watching a video[1] by Steven Sanderson (the creator of Knockout) where he set up all kinds of goodies for a large single page application in just a few minutes, I wondered if I was getting left behind and thought I’d force myself to do things the trendy way.

About five hours later, I had installed or reinstalled:

- 2 programming languages (Node and Ruby)

- 3 package managers (npm with Node, gem with Ruby, and Bower)

- 1 scaffolding tool (Yeoman) and various “generator” packages

- 2 tools that exist only to run other software (Gulp to run the development tasks, Karma to run the test suite) and numerous additional packages for each of these so they know how to interact with everything else

- 3 different copies of the same library (RequireJS) within my single project’s source tree, one installed via npm and two more via Bower, just to use something resembling modular design in JavaScript.

And this lot in turn made some undeclared assumptions about other things that would be installed on my system, such as an entire Microsoft Visual C++ compiler set-up. (Did I mention I’m running on Windows?)

I discovered a number of complete failures along the way. Perhaps the worst was what caused me to completely uninstall my existing copy of Node and npm — which I’d only installed about three months earlier — because the scaffolding tool whose only purpose is to automate the hassle of installing lots of packages and templates completely failed to install numerous packages and templates using my previous version of Node and npm, and npm itself whose only purpose is to install and update software couldn’t update Node and npm themselves on a Windows system.

Then I uninstalled and reinstalled Node/npm again, because it turns out that using 64-bit software on a 64-bit Windows system is silly, and using 32-bit Node/npm is much more widely compatible when its packages start borrowing your Visual C++ compiler to rebuild some dependencies for you. Once you’ve found the correct environment variable to set so it knows which version of VC++ you’ve actually got, that is.

I have absolutely no idea how this constitutes progress. It’s clear that many of these modern tools are only effective/efficient/useful at all on Linux platforms. It’s not clear that they would save significant time even then, compared to just downloading the latest release of the tools I actually wanted (there were only four of those, remember, or five if you count one instance of RequireJS).

And here’s the big irony of the whole situation. The only useful things these tools actually did, when all was said and done, were:

- Install a given package within the local directory tree for my project, with certain version constraints.

- Recursively install any dependent packages the same way.

That’s it. There is no more.

The only things we need to solve the current mess are standardised, cross-platform ways to:

- find authoritative package repositories and determine which packages they offer

- determine which platforms/operating systems are supported by each package

- determine the available version(s) of each package on each platform, which versions are compatible for client code, and what the breaking changes are between any given pair of versions

- indicate the package/version dependencies for a given package on each platform it supports

- install and update packages, either locally in a particular “virtual world” or (optionally!) globally to provide a default for the whole host system.

This requires each platform/operating system to support the concept of the virtual world, each platform/operating system to have a single package management tool for installing/updating/uninstalling, and each package’s project and each package repository to provide information about versions, compatibility and dependencies in a standard format.

As far as I can see, exactly none of this is harder than problems we are already solving numerous different ways. The only difference is that in my ideal world, the people who make the operating systems consider lightweight virtualisation to be a standard feature and provide a corresponding universal package manager as a standard part of the OS user interface, and everyone talks to each other and consolidates/standardises instead of always pushing to be first to reinvent another spoke in one of the wheels.

We built the Internet, the greatest communication and education tool in the history of the human race. Surely we can solve package management.

[1] http://blog.stevensanderson.com/2014/06/11/architecting-larg...

peterwwillis11y ago

Sure we can. I imagine first we'll need the ISO to create a project to begin the standardization of software management. Then there'll probably be a few years of research to identify all the kinds of software, platforms they run on, interoperability issues, levels of interdependencies, release methodologies, configuration & deployment models, maintenance cycles, and expected use cases. Then the ISO can create an overly-complex standard that nobody wants to implement. Finally somebody will decide it's easier to just create smaller package managers for each kind of software and intended use case and write layers of glue to make them work together.

So now that we know what to do, the big question is: who's going to spend the next 5-10 years of their life on that project?

Chris_Newton11y ago

> So now that we know what to do, the big question is: who's going to spend the next 5-10 years of their life on that project?

But this is my point: We are already solving all of those problems, and doing almost all of the work I suggested.

All of the main package managers recognise versions and dependencies in some form. Of course the model might not be perfect, but within the scope of each set of packages, it is demonstrably useful, because many of us are using it every day.

All of the people contributing packages to centralised package repositories for use with npm and gem and pip and friends are already using version control and they are already adding files to their projects to specify the dependencies for the package manager used to install their project — or in many cases, for multiple package managers, so the project can be installed multiple different ways, which is effectively just duplicated effort for no real benefit.

All major operating systems already come with some form of package management, though to me this is the biggest weak point at the moment. There are varying degrees of openness to third parties, and there is essentially no common ground across platforms except where a few related *nix distributions can use the same package format.

All major operating systems also support virtualisation to varying degrees, though again there is plenty of scope for improvement. I’ve suggested before that it would be in the interests of those building operating systems to make this kind of isolation routine for other reasons as well. However, even if full virtual machine level isolation is too heavyweight for convenient use today, usually it suffices to install the contents of packages locally within a given location in the file system and to set up any environment accordingly, and again numerous package managers already do these things in their own ways.

There is no need for multi-year ISO standardisation processes, and there is no need to have everything in the universe work the same way. We’re talking about tools that walk a simple graph structure, download some files, and put them somewhere on a disk, a process I could have done manually for the project I described before in about 10 minutes. A simple, consolidated version of the best tools we have today would already be sufficient to solve many real world problems, and it would provide a much better foundation for solving any harder problems later, and it would be in the interests of just about everyone to move to such a consolidated, standardised model.
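
The "walk a simple graph structure" step really is small. A Python sketch using the standard library's `graphlib` (Python 3.9+), with a made-up three-package graph; downloading and unpacking are deliberately left out:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency graph: package -> packages it depends on.
GRAPH = {
    "app": {"widgets", "core"},
    "widgets": {"core"},
    "core": set(),
}


def install_order(graph):
    """Return the packages ordered so that dependencies come before dependents."""
    return list(TopologicalSorter(graph).static_order())
```

For this graph the order is `core`, `widgets`, `app`; a circular dependency makes `static_order()` raise `graphlib.CycleError` instead of looping forever.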

peterwwillis11y ago· 1 in thread

Completely misses the real fundamental problem: people make assumptions about their target. Almost all of this discussion is centered around Linux systems. What about Windows? What about Solaris? HPUX? AIX? VMS? Tru64? Plan9? BeOS? Android?

Package management itself is not a solved problem, so you can't very well expect programming languages to be any different. The existing systems work quite well and make total sense: your package manager is tailored to your specific use case. Centralized/decentralized is a red herring. First figure out how to package every single thing for every single system and use case in the world, and then come back to me about organizational systems.

munificent11y ago

Agree completely. The fundamental problem of package management is the dependencies of the package manager itself.

Every programming language has its own package manager because it's written in that language. No language maintainer is going to say something like, "Hey, want to use Ruby? Just install Perl first so you can install some Ruby packages!"

Likewise, every OS-level package manager assumes an OS. I'm sure apt-get, yum and Nix are great. I'm also sure their greatness isn't very helpful to Windows users.

There's also the dependency between those two. An OS-level package manager can't easily be written in a high-level language, because one of its core jobs is to install high level languages. A language-level package manager doesn't want to re-invent the OS stack.

Bootstrapping is hard. Package managers sit very very low on the software stack where any dependencies are very difficult to manage and where consolidation is nigh impossible.

xsace11y ago· 1 in thread

Dependency management (ie class/script/resource loading) is too coupled with the programming language execution environment.

It's not something you can make generic like a file/folder based version control tool. It's like asking for the Git of unit testing/continuous integration or whatever, not going to happen.

talex511y ago

You can still build your language package manager on top of an existing one. For example, the ebox installer (for the E language) uses 0install to download metadata and package archives, cache things, solve version constraints, etc, but it takes care of actually wiring the language-level modules together:

http://0install.net/ebox.html

It needs to do this because each application is sandboxed. For most uses a generic packager is fine though. After all, most languages also have RPM and Deb packages, etc.

danielweber11y ago

When I sit around and think "what's the biggest improvement I could make personally to the computing world?" there's always this voice in my head saying "kidnap whoever is building the next package management system and lock them in a deep dark box."

There seems to be this fundamental disconnect between the people making languages and how people actually use their languages. I don't have time to follow your Twitter feed, because I'm working on a lot of different things. I know it's important to you, the Language Developer, and so you think it should be important to me, the Language User. But I have dozens of things to keep track of, and all of them imagine that they're the most important thing in my world.

It's like the old office culture mocked in "Office Space" where the guy has 7 different bosses, each imagining their own kingdom is the most important.

scottlocklin11y ago

Even reducing the problem to one language you run into problems. All serious R developers have run into issues with CRAN (which is mostly centralized) causing problems with different version installs using install.packages(). There are mechanisms for dealing with it, but the best solution is usually to maintain your own distribution of packages and a build script: further centralization. And R has a relatively good/simple package management system compared to something like pip, luarocks or Maven. Two which I never had problems with ... maybe because I didn't use them enough: leiningen and go get.

ownedthx11y ago

Apache Ivy is built on Java and could be a starting point for the uber package manager. It is extremely extensible and can resolve dependencies against Maven-style repos, file systems, and really any other storage mechanism. It can resolve against multiple repos simultaneously.

The real problem is that it's so powerful and hard to ramp up on... The docs aren't sufficient for its overall complexity. That all aside, if the will were there, it could be the git of package managers.

saosebastiao11y ago

From a dumb-user perspective, I feel like most of the effort of package managers is spent on resolving these fundamentally incompatible philosophical dilemmas instead of common-denominator solvable problems like:

* Quality and Trust mechanisms. If there are 14 different postgres clients, which do I choose?

* Package Metadata management. Where can I send bug reports? Who is the maintainer? How can I contact someone? Is there an IRC channel?

* Documentation and Function/Class Metadata. Why should I go to the GitHub README for one package, and to a random domain for another package?

* Linking compile and runtime error messages to documentation or bug reports. Why is Google still the best way to track down the cause of an obscure error message?

* Source data linking and code reviews. I should be able to type in a module/namespace qualified function name and view the source without having to scour a git repository. I should also be able to comment directly on that source in a way that is publicly visible or privately visible.

vertex-four11y ago

Additionally, with Nix, you can be close to certain that if you build something twice, you'll get the same result, because it can't access impure resources.
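One way to picture why pure builds are repeatable is to derive the output location from nothing but the declared inputs. The sketch below is an illustration of that idea only, not Nix's actual hashing scheme; the `store_path` function, the `/store` prefix, and the package versions and source hashes are all made up for the example.

```python
import hashlib
import json


def store_path(name, version, source_hash, dep_paths, build_script):
    """Derive an output path purely from the declared build inputs.

    Hypothetical helper: nothing outside these arguments (clock, network,
    globally installed libraries) can influence the result.
    """
    recipe = {
        "name": name,
        "version": version,
        "source": source_hash,
        "deps": sorted(dep_paths),
        "builder": build_script,
    }
    encoded = json.dumps(recipe, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(encoded).hexdigest()
    return f"/store/{digest[:32]}-{name}-{version}"


# Same inputs, evaluated twice: same path.
zlib = store_path("zlib", "1.2.8", "sha256:aaaa", [], "make install")
curl_a = store_path("curl", "7.38.0", "sha256:bbbb", [zlib], "make install")
curl_b = store_path("curl", "7.38.0", "sha256:bbbb", [zlib], "make install")

# Changing a dependency's source changes every path that depends on it.
zlib_patched = store_path("zlib", "1.2.8", "sha256:cccc", [], "make install")
curl_c = store_path("curl", "7.38.0", "sha256:bbbb", [zlib_patched], "make install")
```

Because the path is a function of the inputs, two different builds of "the same" package can sit side by side without overwriting each other.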

1_player11y ago

I would be very interested in reading a detailed post about your experiences with Nix/NixOS.

vertex-four11y ago

Nix doesn't require you to specify the entire dependency tree; each dependency specifies its own dependencies, and those are resolved during the build process.
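As a rough sketch of what "each dependency specifies its own dependencies" means in practice, the snippet below walks a package set in which every entry lists only its direct dependencies and computes the full build order from that. This is generic depth-first resolution written in Python, not Nix code; the `PACKAGES` table and its dependency edges are invented for illustration.

```python
# Hypothetical package set: each entry lists only its *direct* dependencies.
PACKAGES = {
    "myapp": ["ruby", "imagemagick"],
    "ruby": ["openssl", "zlib"],
    "imagemagick": ["libpng"],
    "libpng": ["zlib"],
    "openssl": [],
    "zlib": [],
}


def build_order(root, packages):
    """Return root and its transitive dependencies, dependencies first."""
    order = []
    state = {}  # name -> "visiting" or "done"

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"dependency cycle through {name}")
        state[name] = "visiting"
        for dep in packages[name]:
            visit(dep)
        state[name] = "done"
        order.append(name)

    visit(root)
    return order


print(build_order("myapp", PACKAGES))
```

The author of `myapp` only names `ruby` and `imagemagick`; `zlib`, `openssl` and `libpng` are pulled in because the packages that need them say so.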

innguest11y ago

Yes, thank you! But people still want to "solve" the problem the hard way, I guess.

Also, "Java" is mentioned twice in the article but I can't find mention of OCaml functors. I thought they solved most package problems even before Nix was around?

tel11y ago

There's a bit of missing context: the author is most likely writing this as part of his research on implementing Backpack, a system which would bring OCaml functor-like things to Haskell.

mercurial11y ago

Functors are part of the OCaml module system, but they don't really have any relation to package management with version numbers and dependencies.

akavel11y ago· 7 in thread

This topic reminded me of some very interesting thoughts from Joe Armstrong, that I remember seeing posted somewhere (HN?) some time ago -- "Why do we need modules at all?":

    [...]
    The basic idea is
    
        - do away with modules
        - all functions have unique distinct names
        - all functions have (lots of) meta data
        - all functions go into a global (searchable) Key-value database
        - we need letrec
        - contribution to open source can be as simple as
          contributing a single function
        - there are no "open source projects" - only "the open source
          Key-Value database of all functions"
        - Content is peer reviewed
    
    These are discussed in no particular order below:
    [...]

Full thread: http://thread.gmane.org/gmane.comp.lang.erlang.general/53472

munificent11y ago

> all functions have (lots of) meta data

This is the crux of the problem. Before too long, the amount of metadata dwarfs the thing it describes and it's easier to rewrite the function than it is to find or describe it.

peterwwillis11y ago

Modules and sub-module hierarchies offer a simpler and better way to organize code.

TeMPOraL11y ago

I think developers of QuickUtil independently thought of the same and are pursuing this idea.

http://quickutil.org/

zak_mc_kracken11y ago

> - all functions have unique distinct names

Seriously? Is that his solution to package management?

TeMPOraL11y ago

Well, if you're using packages and modules, the full name of your function is still packagename/modulename::functionname; it might as well be packagename.modulename.functionname.
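To make the comparison concrete, here is a small sketch of the "global key-value database of functions" idea where the key is simply the fully qualified name. Everything in it (`REGISTRY`, `publish`, `search`, the example names and metadata) is hypothetical and only meant to show that a flat namespace with unique names and a module hierarchy carry the same information.

```python
REGISTRY = {}  # fully qualified name -> {"func": callable, "meta": dict}


def publish(qualified_name, **metadata):
    """Decorator: store a function and its metadata under a globally unique name."""
    def decorator(func):
        if qualified_name in REGISTRY:
            raise KeyError(f"{qualified_name} is already taken")
        REGISTRY[qualified_name] = {"func": func, "meta": metadata}
        return func
    return decorator


@publish("strutils.case.capitalize_words", author="alice", tags=["string"])
def capitalize_words(text):
    return " ".join(word.capitalize() for word in text.split())


@publish("listutils.chunks.split_every", author="bob", tags=["list"])
def split_every(items, size):
    return [items[i:i + size] for i in range(0, len(items), size)]


def search(tag):
    """Look functions up by metadata rather than by module path."""
    return sorted(
        name for name, entry in REGISTRY.items()
        if tag in entry["meta"].get("tags", [])
    )
```

The dotted prefix does the job a package/module path would do; what changes is that lookup goes through metadata search instead of imports.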

xsace11y ago

I can't see how that can be better than a module cohesive API.

calpaterson11y ago· 5 in thread

And also, to some extent, if all the libraries you are using have a long term stable API, then it doesn't actually matter which one you pick - anything is painless.

mcherm11y ago

> Dealing with multiple library changes at once is an order of magnitude more difficult than dealing with them one-at-a-time.

calpaterson11y ago

Do you have an automated testing suite? That might explain the difference in experience. I use a lot of automated tests

mrottenkolber11y ago

> Dealing with multiple library changes at once is an order of magnitude more difficult than dealing with them one-at-a-time.

calpaterson11y ago

I'm fine with having the odd out of date version of something, I'm just saying: be incremental about keeping your stuff up to date.

avsm11y ago· 4 in thread

> The Git of package management doesn't exist yet

We've taken a pretty good shot at this in the OCaml ecosystem via the OPAM package manager (https://opam.ocaml.org).

* The same feature applies to pinning packages ("opam pin add cohttp git://github.com/avsm/ocaml-cohttp#v0.6"). This supports local trees and Git/Hg/Darcs remotes (including branches).

If anyone's curious and wants to give OPAM a spin, we'd love feedback on the 1.2beta that's due out in a couple of weeks: http://opam.ocaml.org/blog/opam-1-2-0-beta4/

Tyr4211y ago

I am a Haskell programmer who got to use OPAM when installing Coq (HoTT version) recently. It was surprisingly nice.

Also, you can pick which version of the compiler to run, and have it manage switching everything.

It seemed like it was years ahead of cabal, but that might just be because I only used it a little, I don't know. But there are some things to learn from OPAM.

Do you have a blog post like this, or something I could post to the Haskell subreddit?

avsm11y ago

We're planning a post after the ICFP rush next week dies down, but there's a rather specific one on the new pinning workflow in OPAM 1.2 here:

http://opam.ocaml.org/blog/opam-1-2-pin/

The OPAM blog is only about 2 weeks old, so there'll be quite a few more posts coming up as our developers discover there's quite a lot to write about :)

stewbrew11y ago

Does OPAM work on Windows? I've read it doesn't, which has kept me from investigating further.

davidgerard11y ago· 3 in thread

Every program with plugins of any sort will eventually include a sketchy rewrite of apt-get. Not just languages - WordPress, MediaWiki ...

If you're very lucky, the packaging in question will not conflict horribly with apt or yum. So you probably won't be lucky.

bad_user11y ago

As if apt-get would solve the problems that we have. Good luck in installing multiple versions of the same package with apt-get.

elwin11y ago

Inability to install multiple versions of the same package can be a feature. It encourages the creation of stable, backwards-compatible packages.

davidgerard11y ago

Yes, there's that too.

shadowmint11y ago· 3 in thread

yeah yeah, I read the previous post (http://www.standalone-sysadmin.com/blog/2014/03/just-what-we...) too.

Maybe this time we can talk about how to meaningfully solve these problems instead of just fighting pointlessly about whether old tools are so great that they should be used for everything.

Decentralized package management huh?

How would that work?

I'm all for it. Someone go build one.

talex511y ago

"Decentralized package management huh? How would that work?"

http://0install.net/ does this (sad to see it wasn't mentioned in the article). Basically:

1. Use URIs rather than short names to identify packages.

2. Scope dependencies so different applications can see different versions of the same library where necessary.
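A minimal sketch of those two points, assuming nothing about 0install's real feed format or solver: packages are keyed by URI, and each application's dependencies are solved in its own scope, so two apps can end up with different versions of the same library. The `example.org` URIs, the version numbers and the `select` helper are all hypothetical.

```python
LIBJSON = "http://example.org/feeds/libjson.xml"
APP_A = "http://example.org/feeds/app-a.xml"
APP_B = "http://example.org/feeds/app-b.xml"

# Versions published for each URI; no central registry hands out short names.
AVAILABLE = {
    LIBJSON: [(1, 4), (2, 0), (2, 1)],
}

# Each app states its own acceptable range: (lowest allowed, first disallowed).
APPS = {
    APP_A: {LIBJSON: ((1, 0), (2, 0))},
    APP_B: {LIBJSON: ((2, 0), (3, 0))},
}


def select(app_uri):
    """Pick the newest acceptable version of each dependency for one app only."""
    chosen = {}
    for dep_uri, (low, high) in APPS[app_uri].items():
        candidates = [v for v in AVAILABLE[dep_uri] if low <= v < high]
        if not candidates:
            raise LookupError(f"no version of {dep_uri} satisfies {app_uri}")
        chosen[dep_uri] = max(candidates)
    return chosen


print(select(APP_A))  # app A gets a 1.x release
print(select(APP_B))  # app B gets a 2.x release of the same library
```

Since nothing is installed into a single shared location, neither selection can break the other.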

Here's an OSNews article from 2007 about such things:

http://www.osnews.com/story/16956/Decentralised-Installation...

mercurial11y ago

> A way of specifying an ABI for a package instead of a version number?

ExpiredLink11y ago

OSGi is good in theory but too much ado for most real-world projects.

jberryman11y ago

The author is actually working on implementing a solution to that in ghc/cabal as well!

Chris_Newton11y ago· 2 in thread

The situation with packages and dependency hell today is horrendous, particularly if you work in a highly dynamic environment like web development.

To build the front-end, I wanted to use these four tools:

- jQuery (a JavaScript library)

- Knockout (another JavaScript library)

- SASS (a preprocessor to generate CSS)

- Jasmine (a JavaScript library/test framework)

About five hours later, I had installed or reinstalled:

- 2 programming languages (Node and Ruby)

- 3 package managers (npm with Node, gem with Ruby, and Bower)

- 1 scaffolding tool (Yeoman) and various “generator” packages

And here’s the big irony of the whole situation. The only useful things these tools actually did, when all was said and done, were:

- Install a given package within the local directory tree for my project, with certain version constraints.

- Recursively install any dependent packages the same way.

That’s it. There is no more.

The only things we need to solve the current mess are standardised, cross-platform ways to:

- find authoritative package repositories and determine which packages they offer

- determine which platforms/operating systems are supported by each package

- determine the available version(s) of each package on each platform, which versions are compatible for client code, and what the breaking changes are between any given pair of versions

- indicate the package/version dependencies for a given package on each platform it supports

- install and update packages, either locally in a particular “virtual world” or (optionally!) globally to provide a default for the whole host system.
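As a sketch of what such standardised metadata might look like, the data model below records platforms, versions, per-platform dependencies and breaking changes for each release, and answers two of the questions in the list. The `Release` and `Package` classes, their field names and the `examplelib` data are assumptions made up for illustration; no existing package index uses this exact shape.

```python
from dataclasses import dataclass, field


@dataclass
class Release:
    version: tuple                  # e.g. (3, 1, 0)
    platforms: set                  # platforms this release supports
    dependencies: dict = field(default_factory=dict)  # platform -> {package: minimum version}
    breaking: bool = False          # True if client code for the previous release may break


@dataclass
class Package:
    name: str
    releases: list

    def available(self, platform):
        """Versions of this package offered on the given platform."""
        return sorted(r.version for r in self.releases if platform in r.platforms)

    def compatible_upgrades(self, current, platform):
        """Newer versions reachable from `current` without crossing a breaking release."""
        result = []
        for release in sorted(self.releases, key=lambda r: r.version):
            if release.version <= current:
                continue
            if release.breaking:
                break
            if platform in release.platforms:
                result.append(release.version)
        return result


examplelib = Package("examplelib", [
    Release((3, 0, 0), {"linux", "windows"}),
    Release((3, 1, 0), {"linux", "windows"}, {"linux": {"libfoo": (1, 0, 0)}}),
    Release((3, 2, 0), {"linux"}),
    Release((4, 0, 0), {"linux", "windows"}, breaking=True),
])

print(examplelib.available("windows"))
print(examplelib.compatible_upgrades((3, 0, 0), "windows"))
```

With this information published in one agreed format, the installer itself only has to do the two things described earlier: fetch a version that satisfies the constraints, then repeat for its dependencies.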

We built the Internet, the greatest communication and education tool in the history of the human race. Surely we can solve package management.

[1] http://blog.stevensanderson.com/2014/06/11/architecting-larg...

peterwwillis11y ago

So now that we know what to do, the big question is: who's going to spend the next 5-10 years of their life on that project?

Chris_Newton11y ago

> So now that we know what to do, the big question is: who's going to spend the next 5-10 years of their life on that project?

But this is my point: We are already solving all of those problems, and doing almost all of the work I suggested.
