I create a venv, pip install, and keep my direct deps in requirements.txt.
That's it. I've never understood all the drama around Python dependency management.
Recently, I started using pyproject.toml as well which makes the whole thing more compact.
I make lots of Python packages too. Either I go with setup.py, or sometimes I like to use flit for no particular reason.
I haven't ever felt the need for something like uv. I'm good with pip.
To really pin everything you'd need to use something like asdf, on top of poetry or a manual virtualenv.
Otherwise you get your colleagues complaining that pip install failed with mysterious errors.
Even in huge monorepos you can just use something like a Makefile to produce a local venv using .PHONY targets, and add it to clean too.
This is how I actually test old versions of Python: with versioned build targets, Cython vs ...
You can set up almost any IDE to activate them automatically too.
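A minimal sketch of that Makefile approach, with assumed paths and an assumed requirements filename:

```make
# Build a local venv from requirements.txt; remove it with `make clean`.
venv: requirements.txt
	python3 -m venv venv
	./venv/bin/pip install -r requirements.txt

.PHONY: clean
clean:
	rm -rf venv
```

Running `make venv` only rebuilds when requirements.txt changes, which is most of what you want from "automated build env setup".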
The way to get your coworkers to quit complaining is to automate the build-env setup, not to fight dependency hell, which is a battle you will never win.
It really is one of the most expensive types of coupling.
No matter what tooling you have, that kind of test is really the only way to be sure anyway.
If something doesn't work, I can play around with dependency versions and/or do the appropriate research to figure out what's required for a given Python version, then give the necessary hints in my `pyproject.toml` (https://packaging.python.org/en/latest/specifications/pyproj...) as environment markers on my dependency strings (https://peps.python.org/pep-0508/#environment-markers).
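For illustration, dependency strings with PEP 508 environment markers might look like this in `pyproject.toml` (the package names here are common examples, not taken from the comment above):

```toml
[project]
dependencies = [
    # only needed on older interpreters
    "tomli >= 1.1.0; python_version < '3.11'",
    # only needed on Windows
    "pywin32; sys_platform == 'win32'",
]
```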
"Mysterious errors" in this area are usually only mysterious to end users.
(Of course, I don't distribute most of my projects, so I just dump them all in the global install and don't worry about it)
People end up committing either one or the other, not both, but:
- You need the source code, else your project is hard to update ("why did they pick these versions exactly?" - the answer is the source code).
- You need the compiled pinned versions in the lock file, else if dependencies are complicated or fast-moving or a project goes unmaintained, installing it becomes a huge mindless boring timesink (hello machine learning, all three counts).
Whenever I see people complaining about Python dependencies, most of the time it seems that somebody lacked this concept, didn't know how to do it with Python, or was put off by too many choices. That, plus ML projects moving quickly and often having heavy "system" dependencies (CUDA).
In the source code - e.g. requirements.in (in the case of pip-tools or uv's clone of that: uv pip compile + uv pip sync), one lists the names of the projects one's application depends on, with a few version constraints explained with comments (`someproject <= 5.3 # right now spamalyzer doesn't seem to work with 5.4`).
In the compiled output - i.e. the lock files (pip-tools or uv pip sync/compile use requirements.txt for this) one makes sure every version is pinned to one specific version, to form a set of versions that work together. A tool (like uv pip compile) will generate the lock files from the source code, picking versions that are declared (in PyPI metadata) should work together.
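Concretely, the "source" side might be a `requirements.in` like this (the `someproject` line is from the comment above; other names are placeholders):

```text
# requirements.in: direct dependencies only
someproject <= 5.3  # right now spamalyzer doesn't seem to work with 5.4
requests
```

Then `uv pip compile requirements.in -o requirements.txt` produces the fully pinned lock file, and `uv pip sync requirements.txt` makes the environment match it exactly.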
My advice: pip-tools (pip-compile + pip-sync) does this very nicely - even better, uv's clone of pip-tools (uv pip compile + uv pip sync), which runs faster. Goes nicely with:
- pyproject.toml (project config / metadata)
- plain old setuptools (works fine, doesn't change: great)
- requirements.in: the source for pip-tools (that's all pip-tools does: great! uv has a faster clone)
- pyenv to install python versions for you (that's all it does: great! again uv has a faster clone)
- virtualenv to make separate sandboxed sets of installed python libraries (that's all it does: great! again uv has a faster clone)
- maybe a few tiny bash scripts, maybe a Makefile or similar just as a way to list out some canned commands
- actually write down the commands you run in your README
PS: the point of `uv pip sync` over `uv pip install -r requirements.txt` is that the former will uninstall packages that aren't explicitly listed in requirements.txt.
uv also has a poetry-like do-everything 'managed' everything-is-glued-together framework (OK you can see my bias). Personally I don't understand the benefits of that over its nice re-implementations of existing unix-y tools, except I guess for popularizing python lockfiles - but can't we just market the idea "lock your versions"? The idea is the good part!
Never had a problem making reproducible builds doing so.
TBH, I've seen tutorials and even some companies simply do `pip freeze > requirements.txt` :shrug:, which is a mess.
(EDIT: Sorry, HN doesn't like code, see the start of https://github.com/skorokithakis/calumny/blob/master/calumny... for an example)
The script will run with uv and automatically create a venv and install all dependencies in it. It's fantastic.
The other alternative, if you want to be extra sure, is to create a pex. It'll even bundle the Python interpreter in the executable (or download it if it's not available on the target machine), and will run anywhere with no other dependency (maybe libc? I forget).
#!/usr/bin/env -S uv run
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "google-api-python-client",
#     "google-auth-httplib2",
#     "google-auth-oauthlib",
#     "selenium",
#     "webdriver_manager",
#     "pydantic",
# ]
# ///
import argparse
import datetime
I built a tool to help me do that: https://observablehq.com/@simonw/wrap-text-at-specified-widt...

Edit: Huh, two spaces works too:
#!/usr/bin/env -S uv run
# /// script
# requires-python = ">=3.12"

[project.scripts]
mytool = "path.to.script:main"
and publish to PyPI. Then anyone can directly run your script via `uvx --from mylib mytool`.
As an example, for Langroid (an open-source agent-oriented LLM lib), I have a separate repo of example scripts, `langroid-examples`, where I've set up some specific scripts to be runnable this way: https://github.com/langroid/langroid-examples?tab=readme-ov-...
E.g. to chat with an LLM
uvx --from langroid-examples chat --model ollama/qwen2.5-coder:32b
or chat with LLM + web-search + RAG:

uvx --from langroid-examples chatsearch --model groq/llama-3.3-70b-versatile

Yes, that format is specified by PEP 723, "Inline script metadata" (https://peps.python.org/pep-0723/). The cause was originally championed by Paul Moore, one of the core developers of pip, who authored the competing PEP 722 (see also https://discuss.python.org/t/_/29905) and had advocated for the general idea long before that.
It's also supported by Pipx, via the `run` subcommand. There's at least one pull request to put it directly into Pip, but Moore doesn't seem to think it belongs there.
Just kidding! It's amazing. It gets along with existing installations very well.
Which, fair. Python is and will always be a bazaar.
>> PDM (Edit 14/12/2024) When I shared this article online, I was asked why I did not mention PDM. The honest reason was because I had not heard of it.
Ha! On brand.
It was originally not a bazaar but "batteries included" where the thing you wanted to do had an obvious best way of doing it. An extremely difficult property to maintain over the decades.
pip, pipenv, poetry, conda, setuptools, hatch, micropipenv, PDM, pip-tools, egg, uv, ActiveState platform, homebrew, your operating system's package manager, and many others...
Relevant xkcd: https://xkcd.com/1987/
pipenv, micropipenv and pip-tools are utilities for creating records of dependencies, but don't actually "manage" those dependencies in the above sense.
Your list also includes an installer (Pip), a build backend (Setuptools - although it has older deprecated use as something vaguely resembling a workflow tool similar to modern dependency managers), a long-deprecated file format (egg) which PyPI hasn't even accepted for a year and a half (https://packaging.python.org/en/latest/discussions/package-f...), two alternative sources for Python itself (ActiveState and homebrew - and I doubt anyone has a good reason to use ActiveState any more), and two package management solutions that are orthogonal to the Python ecosystem (Conda - which was created to support Python, but its environments aren't particularly Python-centric - and Linux system package managers).
Any system can be made to look complex by conflating its parts with other vaguely related but ultimately irrelevant objects.
New workflow tools like Poetry, PDM, Hatch, uv etc. tend to do a lot of wheel reinvention, in large part because the foundational tools are flawed. In principle, you can do everything with single-purpose tools. The real essentials look like:
* Pip to install packages
* venv to create environments
* `build` as a build frontend to create your own distributions
* a build backend (generally specified by the package, and set up automatically by the frontend) to create sdists and wheels for distribution
* `twine` to put sdists and wheels on PyPI
The problems are:
* Determining what to install is hard and people want another tool to do that, and track/update/lock/garbage-collect dependencies
* Keeping track of what venvs you made, and which contains what, is apparently hard for some users; they want a tool to help make them, and use the right one when you run the code, and have an opinion about where to keep them
* Pip has a lot of idiosyncrasies; its scope is both too wide in some places and too narrow in others, it's clunky to use (the UI has never had any real design behind it, and the "bootstrap Pip into each venv" model causes more problems), and it's way too eager to build sdists that won't end up getting installed (which apparently is hard to fix because of the internal structure of the code)
* Setuptools, the default build backend, has a legacy of trying to be an entire workflow management tool, except targeting very old ideas of what that should entail; now it's an absurdly large pile of backwards-compatibility wrappers in order to keep supporting old-fashioned ways of doing things. And yet, it actually does very little in a modern project that uses it: directly calling `setup.py` is deprecated, and most of what you would pass to the `setup` call can be described in `pyproject.toml` instead; yet when you just properly use it as a build backend, it has to obtain a separate package (`wheel`) to actually build a wheel
* Project metadata is atrocious, proximately a standardization issue, but ultimately because legacy `setup.py`-exclusive approaches are still supported
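As a sketch of the single-purpose route with the default backend (names and versions here are illustrative), the package just declares its backend in `pyproject.toml`:

```toml
[build-system]
requires = ["setuptools >= 61"]
build-backend = "setuptools.build_meta"

[project]
name = "mypackage"
version = "0.1.0"
```

Then `python -m build` produces the sdist and wheel, and `twine upload dist/*` publishes them.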
Just having a virtual environment and requirements.txt alone would solve 90% of this article.
Also, with Python 3.12 you literally can't install Python packages at the system level: you get a giant full-page warning saying "use a venv, you idiot".
I expected something along these lines and was still disappointed by TFA
But nowadays people seem to put the cart before the horse, and try to teach about programming language ecosystems before they've properly taught about programming. People new to programming need to worry about programming first. If there are any concepts they need to learn before syntax and debugging, it's how to use a command line (because it'll be harder to drive tools otherwise; IDEs introduce greater complexity) and how to use version control (so they can make mistakes fearlessly).
Educators, my plea: if you teach required basic skills to programmers before you actually teach programming, then those skills are infinitely more important than modern "dependency management". And for heaven's sake, you can absolutely think of a few months' worth of satisfying lesson plans that don't require wrapping one's head around full-scale data-science APIs, or heaven forbid machine-learning libraries.
If you need any more evidence of the proper priorities, just look at Stack Overflow. It gets flooded with zero-effort questions dumping some arcane error message from the bowels of Tensorflow, forwarded from some Numpy 2d arrays used as matrices having the wrong shape - and it'll get posted by someone who has no concept of debugging, no idea of any of the underlying ML theory, and very possibly no idea what matrix multiplication is or why it's useful. What good is it to teach "dependency management" to a student who's miles away from understanding the actual dependencies being managed?
For that matter, sometimes they'll take a screenshot of the terminal instead of copying and pasting an error message (never mind proper formatting). Sometimes they even use a cell phone to take a picture of the computer monitor. You're just not going to teach "dependency management" successfully to someone who isn't properly comfortable with using a computer.
In fifteen years of using Python, the only people I see getting burned are, conveniently, the folks writing blogs on the subject. No one I've worked with or hired seems to be running into these issues. It's not to say that people don't run into issues, but the problems seem exaggerated every time this subject comes up.
-r test.txt
And, to double down, if you read the pip documentation (the second sin of software development?), you can use things other than pip freeze, like `python -m pip list --not-required`.
That option flag is pretty nice because it excludes packages that are only installed as dependencies of others, leaving just the primary packages you need. If you do that, you don't have to worry about dependency management as much.

pip install -r requirements.txt

pip freeze > requirements.lock.txt

pip install -r requirements.lock.txt
The old ways of doing things have existed for much longer than the new ways, and become well established. Everyone just accepts the idea of copying Pip into every new virtual environment, even though it's a) totally unnecessary (even before the `--python` option was introduced two years ago, you could sort of get by with options like `--target` and `--python-version` and `--prefix` and `--platform` and `--abi`) and b) utterly insane (slow, wasteful, makes it more confusing when your PATH gets messed up, leads to misconceptions...). And that's before considering the things people try to do that aren't officially blessed use cases - like putting `sudo` and `--break-system-packages` on the same command line without a second thought, or putting code in `setup.py` that actually tries to copy the Python files to specific locations for the user, or trying to run Pip explicitly via its non-existent "API" by calling undocumented stuff instead of just specifying dependencies (including "extras" lists) properly. (The Pip documentation explicitly recommends invoking Pip via `subprocess` instead; but you're still probably doing something wrong if this isn't part of your own custom alternative to Poetry etc., and it won't help you if Pip isn't available in the environment - which it doesn't have to be, except to support that kind of insane use case).
Another part is that people just don't want to learn. Yes, you get a 'Giant full page warning saying “use a venv you idiot”'. Yes, the distro gets to customize that warning and tell you exactly what to do. Users will still give over a million hits to the corresponding Stack Overflow question (https://stackoverflow.com/questions/75608323), which will collect dozens of terrible answers, many of them suggesting complete circumvention of the package-management lock. It was over a year before anything significant was done about the top answer there (disclosure: I contributed quite a bit after that point; I have a personal policy of not writing new answers for Stack Overflow, but having a proper answer at the top of this question was far too important for me to ignore), which only happened because the matter was brought to the attention of the Python Discourse forum community (https://discuss.python.org/t/_/56900).
The proliferation of requirements.txt files is a massive reason for why Python dependency management sucks.
The author would be well served by using the first person and not including us in his uncertainty.
So if you find some script on the web that has an `import foo` at the top, you cannot just `pip install foo`. Instead, you'll have to do some research into which package was originally used. Maybe it's named `pyfoo` or `foolib`.
Compare that to, for example, Java, which does not have that problem thanks to reverse domain name notation. That is a much better system.
The lack of good namespacing practice is a problem. Part of the reason for it, in my estimation, is that developers have cargo-culted around a mistaken understanding of `__init__.py`.
I think that would have been the single biggest improvement to the Python onboarding experience ever.
There were many problems with the proposal. The corresponding discussion (https://discuss.python.org/t/_/963) is worth looking through, despite the length.
Installers like Pip could help by offering to install `--in-new-environment` for the first install, and Brett Cannon (a core developer) has done some work on a universal (i.e. not just Windows) "launcher" (https://github.com/brettcannon/python-launcher) which can automatically detect and use the project's venv if you create it in the right place (i.e., what you'd have to do with __pypackages__ anyway).
One can emulate it with tools like poetry and uv, but that incurs a performance penalty: every script has to go through `poetry run` or `uv run`, which often adds a few hundred milliseconds and is unsuitable for performant CLIs.
I know other languages have various solutions for this (to basically have package namespaces local to a branch of the dependency tree), but I don't know how much better that experience is.
You could try running

python -m pip check

to check dependencies. Or,

python -m pip inspect

to get a JSON report of your current virtual environment. Or, update stuff:

python -m pip install --upgrade <package>

Or, skip worrying about dependencies:

python -m pip install --no-deps <package>

Or, do a dry-run installation with a full report on what would be installed:

pip install --ignore-installed --dry-run --quiet --report - <package>

And there's a lot more than that. Pip is pretty powerful, and I'm surprised everyone dislikes it so much.

Hope this helps. Cheers!
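Much of what `pip check` and `pip inspect` report comes from installed-distribution metadata, which the standard library also exposes directly; this is a rough, stdlib-only sketch (not pip's actual implementation) that collects each installed distribution's version and declared requirements:

```python
from importlib.metadata import distributions

def installed_summary():
    """Collect name, version, and declared dependency strings of every
    installed distribution (roughly the raw data behind `pip inspect`)."""
    summary = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        if name is None:  # skip broken/partial installs
            continue
        summary[name] = {
            "version": dist.version,
            "requires": list(dist.requires or []),
        }
    return summary
```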
For example, `pip install --ignore-installed --dry-run --quiet --report` will build sdists (and run arbitrary code from `setup.py` or other places specified by a build backend) - just so that it can confirm that the downloaded sdist would produce a wheel with the right name and version. Even `pip download` will do the same. I'm not kidding. There are multiple outstanding issues on the tracker that are all ultimately about this problem, which has persisted through multiple versions of the UI and all the evolving packaging standards, going back almost the entire history of Pip.
See for example https://github.com/pypa/pip/issues/1884 ; I have a long list of related reports written down somewhere.
A security researcher was once infamously bitten by this (https://moyix.blogspot.com/2022/09/someones-been-messing-wit...).
Invoking SAT with clause count: 9661561
Invoking SAT with clause count: 5164645
There is a trade-off here, and the equilibrium is probably deliberate. The sort of person who tries to get dependencies right up front is a professional programmer, and although a lot of them use Python, the language is designed for a much broader audience. Java is an example of much better dependency management by default, and in the main it is only professionals with a view to the long term who use Java. Setting up a Java project needs a tutorial and, ideally, specialist software.
Compare that to a Python beginner: they install Python and use the text editor that already exists on their machine, and it can be a general-purpose text editor rather than something specifically developed to write Java code. There might be one `pip install` along the way, but no requirement to understand a folder layout or a project concept, or even to create a file for dependency management. There is even a reasonable chance that Python is pre-installed on their machine.
Indeed, the novice shouldn't have to make all kinds of intricate choices about the build system to get something running. The language designers should have provided a good set of default choices here. The problem with python is that the default choices aren't actually the good ones.
The post, despite its length, doesn't really spend time on this aspect, and I think it's one of the weaker areas.
Suppose my library depends on both A and B, and both of A and B depend on C, and in the current version of my code, there's some set of versions that matches all declared dependencies. But A releases a new version which has new features I want to use, and also depends on a newer version of C than B supports. I'm blocked from upgrading A with the normal tooling, unless I either ditch or modify B.
This can be a problem in other ecosystems too, but better tooling supports working around it. In the Java world (and I may be out of date here), the maven-shade plugin would allow you to effectively use multiple versions of C, e.g. by letting B use its older version but re-packaging everything under a non-colliding name. Or perhaps I depend on library B, and B uses C, but I don't use the parts of B that interact with C. I could build a fat-jar for my project and effectively drop the parts of B that I don't need.
I think this is also, most of the time, possible in principle in python (though perhaps the relevant static analysis of what is actually used is more difficult), but the ecosystem doesn't really support or encourage it.
No, the main problems with this are at the language level. Languages like Java can do it because the import is resolved statically. Python's `import` statement is incredibly dynamic, offering a variety of rarely-used hooks (start at https://docs.python.org/3/library/sys.html#sys.meta_path and follow the links, if you dare) that ultimately (by default) put imported modules into a single global dictionary, keyed by name. If you want to work around that, you can, but you'll definitely have to roll up your sleeves a bit.
Even if you just want to choose between two different versions at runtime, the base `import` syntax doesn't have a way to specify a version. If the package version isn't explicitly namespaced (which causes more pain for everyone who doesn't care about the exact version) then you need to make special arrangements so that the one you want will be found first (normally, by manipulating `sys.path`).
But if you want to use both during the same run, you'll encounter far more problems. The global dict strategy is very deliberate. Of course it offers a performance benefit (caching) but it's also required for correctness in many normal cases. It makes sure that different parts of the code importing the same module see the same module object, and therefore the same module-level globals.
That is: in the Python world, it's common that the design of C involves mutating its global state. If you "let B use its older version" then it would have to be an entirely separate object with its own state, and mutating that would result in changes not seen by A. Whether or not that's desirable would depend on your individual use case. A lot of the time, you would want the change to be shared. But then, even if you could propagate the changes automatically, you'd have to deal with the fact that the two versions of C don't necessarily represent their state identically.
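A minimal demonstration of both points (search order decides which module wins, and the global `sys.modules` cache keeps the first winner), using a hypothetical module name `mylib`:

```python
import importlib
import os
import sys
import tempfile

# Two directories each provide a module named `mylib`; whichever
# directory appears first on sys.path wins the import.
old_dir = tempfile.mkdtemp()
new_dir = tempfile.mkdtemp()
with open(os.path.join(old_dir, "mylib.py"), "w") as f:
    f.write("__version__ = '1.0'\n")
with open(os.path.join(new_dir, "mylib.py"), "w") as f:
    f.write("__version__ = '2.0'\n")

sys.path.insert(0, old_dir)
import mylib
assert mylib.__version__ == "1.0"

# Putting the other directory first is not enough on its own...
sys.path.insert(0, new_dir)
assert importlib.import_module("mylib").__version__ == "1.0"  # still cached

# ...the global cache has to be invalidated explicitly.
del sys.modules["mylib"]
assert importlib.import_module("mylib").__version__ == "2.0"
```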
> by letting B use its older version but re-package everything under a non-colliding name. Or perhaps I depend on library B and B uses C but I don't use the parts of B that interact with C. I could build a fat-jar for my project, and effectively drop the parts of B that I don't need.
In principle, the B author can make the dependency on C optional (an "extra") for those users who do need that part of B. Or B could even optionally depend on B-prime, which encapsulates the C-using parts.
This is actually not very difficult (and the above-described dynamic nature of Python is very helpful here), but in practice it isn't done very much (perhaps Python developers are wary of creating another left-pad situation). But I do wish e.g. Numpy were more like that.
But the "fat-jar" analog is also possible in Python, as long as it isn't a problem that you aren't sharing state between C-old and C-new. It's called "vendoring"; Pip does a lot of it; and the tooling that Pip uses to enable it is also made available (https://pypi.org/project/vendoring/) by one of the main Pip devs (Pradyun Gedam).
> put imported modules into a single global dictionary, keyed by name.
> If the package version isn't explicitly namespaced (which causes more pain for everyone who doesn't care about the exact version) then you need to make special arrangements so that the one you want will be found first (normally, by manipulating `sys.path`).
> That is: in the Python world, it's common that the design of C involves mutating its global state. If you "let B use its older version" then it would have to be an entirely separate object with its own state, and mutating that would result in changes not seen by A. Whether or not that's desirable would depend on your individual use case
I haven't worked on a java project that needed this in some years so I may be out of date, but all the same things you describe are also (mostly) true of and relevant to the java ecosystem (or at least were), and these are the same considerations that come to the choices around shading in java land.
- the class name, method names etc are statically known at compile time, but version is not indicated by import or use. The compiler finds (or doesn't) the requisite definitions from name, based on what's available on the classpath at compile time. At runtime, we load whichever version of that class is on the classpath. If between compilation and running, some part of your build process or environment changed to make a different version of the class available, and names/signatures align, nothing at runtime knows of the change. Order still matters, and just as with python, java (outside of custom classloader shenanigans) also works hard to only load each class once, associated to a qualified name.
- shading works around these constraints by renaming. E.g. you might have two SDKs which wrap API calls to different vendors, which depend on different major versions of a serialization library. No functionality depends on passing any of internal classes of the serialization library between calls to these two vendors, so you're safe to shade one SDK's use of that library (package) to a new, unambiguous name. The key point is both the conflicted library C and the SDK that uses it B get rewritten. Note, this would break at runtime if either library had code that e.g. constructed a string which was then used as a classname, but this would already be pretty abusive.
- similarly in python, if in your project you use two libraries which each separately use e.g. pydantic at 1.x and 2.x, and your own code isolates these from each other (i.e. classes from B don't get passed through methods in A, etc), then you could pretty safely rename one version (e.g. `pydantic` 1.x in library gets renamed as `pydanticlegacy`) -- but the common tooling doesn't generally support this. Just as in the python case, if the library code does something weird (e.g. `eval`ing a string that uses the original name), stuff will break at runtime, very analogously to the java situation.
In both cases, the language on its own doesn't support a concept of import of a specified version, and at runtime each unique name is associated with code which is resolved by a search through an ordered collection of paths, and the first version found wins. What differs is the level of tooling support for consistent package/module renaming. If anything, I think the actual requirements here on the python side should be lower; because java shading must work on class files (so you don't have to recompile upstream stuff), it needs to use ASM.
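For what it's worth, the Python-side rename can be sketched in a few lines with `importlib.util`: load the same source under a non-colliding name, and the two copies coexist in `sys.modules` with separate state (the module name `conflicted` here is purely illustrative):

```python
import importlib.util
import os
import sys
import tempfile

# A stand-in for the conflicted library C: a module with mutable state.
src_dir = tempfile.mkdtemp()
path = os.path.join(src_dir, "conflicted.py")
with open(path, "w") as f:
    f.write("state = []\n")

def load_as(name, filename):
    """Load the module at `filename` under the module name `name`."""
    spec = importlib.util.spec_from_file_location(name, filename)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module

a = load_as("conflicted", path)
b = load_as("conflictedlegacy", path)  # the "shaded" copy
a.state.append("only in a")
assert b.state == []  # separate copies, separate state
```

As noted above, this breaks down exactly when code computes the original name at runtime, the same failure mode as Java shading.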
"Articles like that over-dramatize things a lot. As a self-taught Python developer, I mostly learned from looking at code. When I started, I didn't know anything about packaging. I just installed stuff globally with pip and moved on. I didn't really build projects, just scripts. Once in a while I used a venv when I was following a guide using them. To give a bit of perspective, for a while I didn't really know how to make a class, only a function.

Fast forward a few months, I learned a bit more about projects and also started contributing to and/or forking open source projects (again, never took classes, I just explore). I used poetry a little, it works nicely. Now, I use uv for anything new, and it works beautifully. It's really simple. uv init, uv add any deps, and uv run to run the project or just activate the venv. And I never run into dependency issues. It's like really simple. And being able to manage python versions really simply is a really nice bonus. My CI/CD doesn't even use a special docker container or anything. Literally installs uv (using curl | sh), uv sync, uv run. Finished. And very fast too.

So yeah, Python dependencies aren't automatically vendored. And yes, Python tooling has historically been bad. But now we have amazing tooling and it's really really easy to learn. uv is wonderful, poetry is also great, and anyone complaining it's too hard is either over-dramatizing it or doesn't have 5 minutes to read a readme. So yeah, people should stop over-dramatizing something that really isn't dramatic."
Poetry was a wonderful breath of fresh air. uv is the same, but 100x faster.
(Astral sponsors me on GitHub, but I'd hold the same belief even if they didn't.)
This doesn’t always work, of course (especially for large projects), but for smaller ones, it absolutely does.
Which itself can run Python now. Only in the Microsoft Cloud, not only to rake in that sweet subscription money, but probably also to avoid these headaches.
We use conda/mamba to install some more complex dependencies that are piles of shared libraries and a messy web of sub-dependencies.
Are you referring to something else?
100% disagree, yes it may be used for that, but it's a fully fledged language, people use it for data science or for backend business logic.
Django users are using it natively, and CherryPy etc. Python full-stack, maybe, but they'll have DB drivers and JSON parsers, probably from C.
FWIW, I've rolled, crashed, and burned on at least one attempt to use Python tooling, due to forgotten-about installs/environments that required uninstalling/deleting everything and starting anew.
It would be nice if there was a consensus and an agreed-upon solution and process which would work as a new user might expect.
https://open.substack.com/pub/martynassubonis/p/python-proje...
However, these days I would probably just go with uv and hope that Astral doesn't pull any VC shenanigans:
https://open.substack.com/pub/martynassubonis/p/python-proje...
It's not the Rust/Go toolchain, but we work with what we've got here...