> This destroyed 3 production server after a single deploy!
I do think that the developers have a duty to do some testing of their software before putting out releases/updates. However, users also have a duty to perform sufficient testing before they push new versions to their production environments.
In my opinion, it's kinda like losing data because you didn't make and/or test your backups. It's a really crappy way to have to learn a lesson but at least they've finally learned it -- and if they haven't, well, then maybe they will the next time it happens.
And it's not specific to npm, I would do the same with gem, pip, cpan, etc. Not to mention curl http://ex.io/install.sh | sudo bash.
Call me old school, but personally, I would avoid installing anything from language-specific package managers. I would instead either build an rpm/deb package for each dependency as its own package or, if that's too complex, bundle the dependencies and the application in one package which deploys the bundle under /opt/.
That way I only have one source to check in order to see what is installed on my systems. Also, rpm and dpkg tend to be far better at managing what is installed by each package, and far better at uninstalling everything during cleanups.
Also, mixing a language specific package manager and a distribution package manager can have unforeseen side effects as the two can step on one another (for example, I ran into issues recently with a pip install python-consul overlapping with a yum install salt-minion as both of them download python-requests as a dependency).
This is in fact way easier than options for python, ruby (and probably many others) which tend to install versioned dependencies to some shared directory and then add them to the path at runtime. So you're very right, it's trivial to not need npm at all, ever, in production.
Should I have done this on a staging server? Sure, but that does not change the fact that I would have had to rebuild the whole server there too. It is not expected that updating npm will kill the complete system it is on... It would be expected to have some deploy failure of some sort.
As previously noted, `npm update -g npm` pulls in version 5.7.0. Version 5.6 is still the latest, but for some obscure reason, if you have this update anywhere in your deploy script, you are screwed.
I don't disagree with you at all on that. The reality is, however, that sometimes "shit happens".
I'm more of a sysadmin than a developer and I learned many, many years ago that even the smallest little updates can "go wrong" and take the rest of the system with it. After getting burned a few times, even a baby will learn to stop touching a hot stove.
> Should I have done this on a staging server? Sure, but that does not change the fact that I would have had to rebuild the whole server there too.
Yes, but in that case your production servers would still be humming along just fine, no?
Yeah, but that is why you test your deployments BEFORE deploying them.
There would be hell to pay had any developer at my company run any such command on a production server. The notion of even running a command at the terminal on a production server is scary.
Things like this should be done on build servers, which are in general throwaway. Your build server should produce an artifact that can then be deployed to your staging servers and, if all is well, THEN to production servers. npm is a build tool and should not be installed or run on production servers -- for many more reasons than just stupid stuff like this.
It's a lesson and a reminder to everyone out there: be careful.
When the dev environment is broken, alpha is skipped, staging is unusable, and you then test on production, you sure like to live in hell.
That's the responsible method. The fact that you'd have to rebuild your staging server is exactly why you should have tested it there.
Sure, NPM shouldn't have broken this, but any number of things can cause issues during deployment and it's your job to check for them before pushing it out.
chef-client -z -r 'my-cookbook::npm_web_server'
Obviously the behavior of NPM absolutely sucks and is a total mess here, but "I had to rebuild the server", in 2018, is not nearly the material complaint it was a decade ago.
The tradeoff of using rapidly-evolving tools with minimal oversight from the people creating them is that sometimes stuff blows up and not even always for good reasons. It is incumbent upon you, as the recipient of this enormous, jaw-dropping raft of free stuff that occasionally explodes, to write code and operate your systems defensively. Part of which is implementing those systems to be repeatable and quickly reinstantiated.
If you do not like this tradeoff, you have other options as well.
"It is not expected that" is the definition of unexpected behavior, which is the very reason why we use staging servers. So your message is essentially "I didn't use a server meant to check for unexpected behavior, because I didn't expect that behavior to happen". Well, yeah, that's the point.
Also, I'm really not sure what your smiley is trying to convey here, and of all the possibilities I can't see one that's positive and contributive to the conversation. Really un-needed, please refrain from doing that.
You have a responsibility to test before releasing to production, yes. But the amount of fucked up your program has to be for `sudo ___ --help` to wreck the operating system, the unexpectedness of that result... IMO attention should be focused here on the irresponsibility of the npm team, not their users.
That means npm is causing side effects even before reading what the user wants, or is blatantly ignoring the user's request
Perhaps the problem is that the stability of the world's fastest growing development platform hangs on the implementation of best practices by a two person developer team.
NPM needs to step up its game or we need to make something like yarn the standard.
During the bubble, it was the same with PHP, and some of us were part of it.
Youngsters must start to code somewhere.
Correction will be a little harder on Debian derivatives, and the whole incident can be completely prevented on Solaris with file-mac-profile.
Everything is super dangerous as root, one should avoid using root at all costs until there is no other way.
NPM's awfulness notwithstanding, it's trivial to write a shell script to do what you say and add a symlink to ~/bin. But everyone on StackOverflow will tell each other "just run it with sudo", and they do, and then quickly move on with their lives (presumably to be followed with "and break things"). Instead of doing the right thing, raising their hackles about how poorly NPM is designed, and holding its community leaders accountable.
1. https://github.com/Microsoft/TypeScript/blob/b29e0c9e3ab2471...
It's extra-frustrating writing those instructions, because not only are they platform-specific, but they are different depending on what the user has already done to their system. If some other tool told them to create ~/.bash_profile or ~/.bash_login, the more shell-agnostic option of modifying ~/.profile will no longer work.
Figuring out where to change the PATH is also confusing, and you might come across solutions that seem to work, but cause weird errors later down the road. For example, the tool being unavailable when invoked remotely, because you only changed the PATH for interactive shells.
It's understandable that people use sudo when they don't see it causing any obvious problems. Installing user-local packages should be one simple command, and it's a failure of operating systems and package managers that it's not. As it stands, correct usage is much harder than incorrect usage, and this is the result.
I'm sure you are now going to tell me there is an easy way to fix that too, and I'd be happy if there was, but for me I just want to use npm to install a program or two.
It's absolutely banana-pants crazy to run `npm install` as a root user in any circumstance.
But using npm with root user? I can't think of a single usecase.
Furthermore, if you do have an application that requires root-level access, then the parts that do should be isolated from the parts that don't. You don't get a blank check to run as root just because you need to bind to a low port.
Looks like the line responsible checks if the npm binary is run as sudo and then uses the UID and GID of the invoking user when chowning the directory. [ https://github.com/npm/npm/blob/latest/lib/utils/correct-mkd... ]
I feel like screaming, who thought this was a good idea? If I invoke something as sudo, why does anyone think it should try to detect that and do anything about it? I want to run as the user sudo has set, not my own user, OBVIOUSLY.
Don't try to be smart about sudo, you will break stuff.
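For anyone wondering how a tool can even tell it was started through sudo: sudo exports `SUDO_UID` and `SUDO_GID` with the invoking user's numeric IDs. The sketch below is not npm's code. `inferOwner` is a made-up helper that illustrates the pattern being criticized here, and it takes the environment as a parameter so it can be tried without root.

```javascript
// Hypothetical helper, NOT npm's implementation: work out which uid/gid a
// tool would chown new files to if it tried to "undo" sudo.
function inferOwner(env, currentUid, currentGid) {
  const sudoUid = Number.parseInt(env.SUDO_UID, 10);
  const sudoGid = Number.parseInt(env.SUDO_GID, 10);

  // Only treat the process as "under sudo" when it is root AND sudo left
  // both variables behind.
  const underSudo =
    currentUid === 0 && Number.isInteger(sudoUid) && Number.isInteger(sudoGid);

  return underSudo
    ? { uid: sudoUid, gid: sudoGid, underSudo: true }
    : { uid: currentUid, gid: currentGid, underSudo: false };
}

console.log(inferOwner({ SUDO_UID: "1000", SUDO_GID: "1000" }, 0, 0));
console.log(inferOwner({}, 1000, 1000));
```

The detection itself is harmless. The trouble starts when the result is fed into a chown of directories the tool did not create.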
But in fairness, I can't count the number of times that I've needed to fix things after people treated `sudo npm` as Simon Says[1].
I'm sure they struggled a lot with that issue before coming to this solution. Was it the right solution? Absolutely not. But that's not the point I'm trying to make.
It's all too easy to tunnel vision on a particular solution. I've done it plenty of times, and I'm thankful to those who have helped me to see other alternatives in time.
Ideally npm should simply set up a dedicated directory in /opt or /usr/local/ (i.e., /usr/local/node/bin or /opt/node/bin) in which it dumps all the global stuff. That way you can easily set permissions for a user and/or contain any damage to that folder. If npm blows up that way it doesn't murder the entire system, and you'll still be able to SSH in. (That is, unless you use an SSH agent based on node.js, in which case: "why?")
Once npm has implemented such a location it should refuse to run with sudo and demand that the user set up the correct permissions within the node folder (maybe set up a group "npm-manage" during install?)
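A guard like that is only a few lines. This is a rough sketch under my own assumptions: `assertNotRoot` is an invented name, the error text is made up, and `process.getuid` does not exist on Windows, so the check is simply skipped there.

```javascript
// Sketch of a start-up guard that refuses to run as root.
// The uid can be passed in explicitly so the behavior is easy to try out.
function assertNotRoot(
  uid = typeof process.getuid === "function" ? process.getuid() : -1
) {
  if (uid === 0) {
    throw new Error(
      "Refusing to run as root. Fix the ownership of the install directory " +
        "and run again as a regular user."
    );
  }
}

// Try both cases without needing to actually be root.
for (const uid of [1000, 0]) {
  try {
    assertNotRoot(uid);
    console.log(`uid ${uid}: ok`);
  } catch (err) {
    console.log(`uid ${uid}: ${err.message}`);
  }
}
```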
I think npm could implement a similar strategy and educate the users how packages should really be installed.
There are 3 use cases I can think of for chown(2):
- Implementing the chown command (or other tools whose purpose is explicitly and only to manage permissions)
- Implementing a file copy/archive command that preserves permissions
- For package managers that set up a daemon user for a package, and want to set up a writable area of the fs for use by that user
In other words, the ownership of files is something that should be totally up to the user, and not something implicitly done by a tool on their behalf.
I can't think of a single other place where trying to automatically manage file ownership is warranted. Files I touch should be owned by me, files root touches should be owned by root, and the correct way to make sure new files are not owned by root is to not be root. Doing literally anything else with chown is being overly clever and is a guaranteed landmine.
Since I 'vote' with my code - this migration page has been helpful today - and I hope it will help others: https://yarnpkg.com/lang/en/docs/migrating-from-npm/
It took me ~5 mins to migrate all of my code from npm to yarn. But I don't have complex CI tasks either.
I use ncu to check updates every couple of days, sometimes more frequently. To further distance myself from npm, can anyone comment on the pros/cons of github repo paths instead of package names in package.json?
[0] https://www.theregister.co.uk/2016/03/23/npm_left_pad_chaos
[1] https://github.com/npm/registry/issues/255
*edit: formatting
[0] https://disjoint.ca/til/2017/11/10/managing-package-dependen...
You can use "yarn outdated" for that.
Installing software where your options are
1. running as a regular user, and the install script can put whatever it wants within your user's directories
or
2. running as root, and the install script can do literally anything to anywhere on your system
is not fit for purpose, when the risks from both malice and incompetence are both reaching new heights almost daily.
These are systems we use for real work, but even smartphones and their toy app stores do better now. How do we still not have controls so applications can always be installed/uninstalled in a controlled way, can only access files and other system resources that are relevant to their own operation, and so on?
You mean something, that won't allow two packages to own the same file? Something, like, rpm or apt?
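To spell out what "won't allow two packages to own the same file" means in practice, here is a toy model. It is not how rpm or dpkg store their databases; `FileRegistry`, the package names, and the paths are all invented for illustration.

```javascript
// Toy package database: every installed path has exactly one owning package.
class FileRegistry {
  constructor() {
    this.owners = new Map(); // path -> package name
  }

  install(pkg, files) {
    // Refuse the whole install if any path already belongs to someone else.
    const clash = files.find(
      (file) => this.owners.has(file) && this.owners.get(file) !== pkg
    );
    if (clash !== undefined) {
      throw new Error(
        `${pkg} wants ${clash}, which is owned by ${this.owners.get(clash)}`
      );
    }
    for (const file of files) this.owners.set(file, pkg);
  }

  remove(pkg) {
    // Uninstall is exact: only paths recorded for this package are removed.
    const removed = [];
    for (const [file, owner] of this.owners) {
      if (owner === pkg) {
        this.owners.delete(file);
        removed.push(file);
      }
    }
    return removed;
  }
}

const db = new FileRegistry();
db.install("pkg-a", ["/opt/demo/lib/requests.py", "/opt/demo/bin/a"]);
try {
  db.install("pkg-b", ["/opt/demo/lib/requests.py"]);
} catch (err) {
  console.log(err.message);
}
console.log(db.remove("pkg-a"));
```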
The Linux solution I suppose is Nix/Guix or Flatpak/Snap or Docker a la RancherOS. Perhaps more restrictive SELinux profiles could work as well.
We already have the necessary tools to do it, eg. firejail. We only have to make every binary run in firejail by default (and write firejail profiles for more binaries).
I agree. For instance, on recent macOS versions you cannot modify most system directories, even as root, unless System Integrity Protection (SIP) is disabled [1]. SIP can only be disabled by the user by booting into the recovery OS. Just making these directories read-only prevents accidents and malice.
AFAIK in Fedora Atomic Host/Server some system directories are also read-only [2]. Moreover, Fedora Atomic uses OSTree as a content-addressed object store, similarly to git, where the current filesystem is just a 'checkout'. So, you can do transactional rollbacks, upgrades, etc.
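For readers who haven't met the content-addressed idea before, a toy version fits in a screenful. This is only a model of the concept (objects keyed by their hash, snapshots as lists of hashes, rollback as checking out an older snapshot). It says nothing about OSTree's real on-disk format, and every name in it is invented.

```javascript
const crypto = require("crypto");

// Toy content-addressed store: file contents are stored once under their
// SHA-256 digest, and a "commit" is just a map from path to digest.
class ObjectStore {
  constructor() {
    this.objects = new Map(); // digest -> content
    this.commits = []; // each entry: { path: digest }
  }

  put(content) {
    const id = crypto.createHash("sha256").update(content).digest("hex");
    this.objects.set(id, content);
    return id;
  }

  commit(tree) {
    const snapshot = {};
    for (const [file, content] of Object.entries(tree)) {
      snapshot[file] = this.put(content);
    }
    this.commits.push(snapshot);
    return this.commits.length - 1; // index of the new commit
  }

  checkout(index) {
    const tree = {};
    for (const [file, id] of Object.entries(this.commits[index])) {
      tree[file] = this.objects.get(id);
    }
    return tree;
  }
}

const store = new ObjectStore();
const before = store.commit({ "/etc/app.conf": "v1", "/usr/bin/tool": "binary" });
store.commit({ "/etc/app.conf": "v2 (broken)", "/usr/bin/tool": "binary" });

// Rolling back is just checking out the earlier snapshot again.
console.log(store.checkout(before));
```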
[1] https://support.apple.com/en-us/HT204899
[2] https://rpm-ostree.readthedocs.io/en/latest/manual/administr...
The idea that correctMkdir() exists at all seems to me to be so wrong-headed.
This comment from the source says a lot:
// annoying humans and their expectations!
Good UX is an important, oft-overlooked consideration, but there is definitely such a thing as taking it too far. If your humans are expecting this level of hand-holding, it's because you've trained them to expect it by pandering to them up until now. This is the kind of problem that should be handled with good, detailed error messages when users don't get the result they expect, not "fixed" with over-reaching magic.
I'm not sure I'd trust anything put out by the npm team in general from here on in if they genuinely thought creating the correct-mkdir.js file in the first place was a reasonable idea. Is it? Genuinely open to a counter-argument.
It all started with Ubuntu pointing $HOME at the user's dir when running sudo. This was done out of convenience to have gedit and other Xorg apps work when run with sudo...
Then there is also the terrible fact that ~/.local/bin didn't exist as a "standard" at the time. Which means your only sure-fire non-complicated way to install local bins guaranteed to work for the user was to put them in /usr/local/bin which meant running sudo.
But if you create a package cache dir in $HOME during sudo on Ubuntu, that's with root permissions! Then you get errors when you run npm without root and it tries to manipulate the cache. How do we fix this? By changing cache dir permissions, of course. https://github.com/npm/npm/commit/ebd0b32510f48f5773b9dd2e36...
Through a series of refactorings, this became correctMkdirp, which is a non-descriptive name for mkdir with (among other things) changing of permissions. And with a name like that it eventually was used in the wrong context and did the wrong thing.
I call this death-by-10-small-missteps. But I would pin the biggest problem on a missing omnipresent `~/.local/bin` standard (at the time). It doesn't cost much if anything, and it would single-handedly obliterate the need for users to muck around in their paths (bad usability) or run sudo to install command line tools for personal use (clearly not the best idea).
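As a small illustration of the `~/.local/bin` point, a tool could at least check whether that directory is on the user's PATH and say so, instead of nudging people towards sudo. A minimal sketch, assuming a POSIX-style layout; `userBinStatus` is a made-up helper name.

```javascript
const os = require("os");
const path = require("path");

// Report where a per-user bin directory would live and whether PATH
// already contains it. Both inputs can be overridden for experimentation.
function userBinStatus(home = os.homedir(), pathVar = process.env.PATH || "") {
  const binDir = path.resolve(home, ".local", "bin");
  const onPath = pathVar
    .split(path.delimiter)
    .filter(Boolean)
    .some((entry) => path.resolve(entry) === binDir);
  return { binDir, onPath };
}

const status = userBinStatus();
console.log(
  status.onPath
    ? `${status.binDir} is on PATH`
    : `${status.binDir} is not on PATH; add it in your shell profile`
);
```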
<pseudocode>
if dir !exists:
    mkdir && chown
else:
    if dir has correct ownership:
        traverse
    else:
        // our code chowns correctly on create
        // so user must have done something
        // independently; better NOT mess with it
        throw "helpful descriptive message"
</pseudocode>
mkdirp creates OR traverses recursively based on whether each directory already exists or not. This is why correctMkdirp() is an insane idea: the "correct"-ing chown step should never be internal to mkdirp because it should never occur on traversal (i.e. when a pre-existing directory is encountered).
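The pseudocode above translates fairly directly into Node. What follows is a sketch and not a proposed patch: `mkdirpOwned` is an invented name, it assumes a POSIX system, and it checks ownership only on the target itself (my own narrowing, since ancestors such as / legitimately belong to root).

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// chown ONLY the directories this call creates; never touch existing ones.
function mkdirpOwned(target, uid, gid) {
  const abs = path.resolve(target);

  if (fs.existsSync(abs)) {
    const st = fs.statSync(abs);
    if (st.uid !== uid || st.gid !== gid) {
      // Someone else set this up; report it instead of "fixing" it.
      throw new Error(
        `${abs} exists with owner ${st.uid}:${st.gid}, ` +
          `expected ${uid}:${gid}; leaving it alone`
      );
    }
    return []; // nothing created, nothing chowned
  }

  // Collect the missing path components, outermost first.
  const missing = [];
  for (let dir = abs; !fs.existsSync(dir); dir = path.dirname(dir)) {
    missing.unshift(dir);
  }

  for (const dir of missing) {
    fs.mkdirSync(dir);
    fs.chownSync(dir, uid, gid);
  }
  return missing;
}

// Try it in a throwaway directory, using the current user's own ids.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "mkdirp-demo-"));
const created = mkdirpOwned(
  path.join(root, "a", "b"),
  process.getuid(),
  process.getgid()
);
console.log(created);
```

Calling it a second time with the same arguments returns an empty list, and calling it with a different uid throws rather than re-owning anything.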
The correctMkdir change seems more recent, but not really related to that specific comment.
That comment should not have survived 4 years. Again, inner grumpy old man showing through.
Edit: to be clear, such comments are treated as reflective of the people and organization behind them.
With the full comment, it seems they're instead bemoaning having to adhere to a user's config. Not sure which is worse...
[0] https://github.com/npm/npm/commit/94227e15eeced836b3d7b3d2b5...
Has this issue provoked so much outrage that GitHub can't handle the constant stream of angry emojis on the issue comment thread?
EDIT: I opened the original link in incognito mode and the page seemed to load fine.
I had an inkling that NPM was cancer, but not like this.
Yarn, by contrast, has everything you would expect of a Facebook-engineered library: https://github.com/yarnpkg/yarn/tree/master/__tests__/util
Will be closely evaluating a switch to Yarn for our live apps. This is simply sad.
So it collects your personal information, even when not using it, and uses it for profit?
https://github.com/npm/npm/pull/19889
This kind of thing disintegrates my confidence on npm as a project.
After passing the test, the PR was made and merged, and the PR test failed because the branch was already merged and Travis CI has races around that.
[1]: https://travis-ci.org/npm/npm/builds/344892198?utm_source=gi...
Their official blog makes no mention of 5.7.0 being a pre-release [0], and it is semantically versioned as a stable release. The thread also later details that running `npm upgrade -g npm` instead of `npm install -g npm` will get 5.7.0 instead of 5.6.0 [1].
Is it standard practice for an `upgrade` command to pull a pre-release or beta? When I upgrade Firefox I don't get put onto the Nightly branch...
[1] https://github.com/npm/npm/issues/19883#issuecomment-3677268...
It is not.
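One plausible reading of the behavior reported in this thread is the difference between "the version a tag points at" and "the highest version that was ever published". The sketch below uses invented metadata loosely shaped like a registry document (`versions` is simplified to an array, and the numbers just mirror the thread); it is not npm's resolution code.

```javascript
// Hypothetical registry metadata for illustration only.
const packument = {
  "dist-tags": { latest: "5.6.0", next: "5.7.0" },
  versions: ["5.5.1", "5.6.0", "5.7.0"],
};

// Compare plain MAJOR.MINOR.PATCH strings numerically.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// "Give me the newest thing you have": picks up staged releases too.
function highestVersion(doc) {
  return [...doc.versions].sort(compareVersions).pop();
}

// "Give me what the maintainers marked as current."
function taggedVersion(doc, tag = "latest") {
  return doc["dist-tags"][tag];
}

console.log(highestVersion(packument)); // 5.7.0
console.log(taggedVersion(packument)); // 5.6.0
```

A staggered release only protects users if every upgrade path resolves through the tag.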
Thankfully, it only affected users running `npm@next`, which is part of our staggered release system
#STOPUSINGPRERELEASEWITHSUDO
Really now? #ANGRYORANGEWEBSITE #PEOPLEGOTMAD
:)
All tags:
#ANGRYORANGEWEBSITE #PEOPLEGOTMAD #STOPUSINGPRERELEASESWITHSUDO #CLIHOTFIX #WEGOTUBB #LITERALLYKILLEDGITHUB
Author:
FEBRUARY 22, 2018 (9:53 AM) @MAYBEKATZ
https://web.archive.org/web/20180222201315/http://blog.npmjs...
I think no documentation should ever include sudo in their commands. You should put a note "depending on your environment, some of those commands might require root privileges" or something.
- We are relying on a 2-person team for our applications.
- The maintainer doesn't seem to care much about this horrific bug: https://imgur.com/a/v4Ndb
Yeah, but if you had switched to yarn beforehand, then you would not be facing this issue.
Bugs are bound to happen; it's part of software development. However, the team size of the npm CLI and the way they reacted to this incident are what concern me more.
Indeed. Another odd thing that it's been doing lately is when I run some NPM scripts on one of our machines, it starts shouting about some sort of update not working (why was it updating anything at all just because I ran `npm run something`?) and gives me instructions on how to fix it from the Linux shell (on a Windows box). The depth of failure implied by that message is disturbing on several levels.
"Contain async insanity so that the dark pony lord doesn't eat souls"
Just... What. I feel like when you need to reach for tools to "contain insanity", you might want to back up and ask someone who has written to a filesystem before... The linked blog about "preventing the release of Zalgo" and the linked https://blog.ometer.com/2011/07/24/callbacks-synchronous-and... seem completely erroneous. The entire point of callbacks is to _surrender_ control to a function - here is a piece of code to run when you are ready - now, sometime, or never, or maybe many times, as you see fit. Waiting until the next process tick seems so completely unnecessary... This strikes me heavily as "a solution in desperate search of a problem" - although I have that feeling with a _lot_ of NodeJS code I read...
The author of the blog linked on the dezalgo project seems to, at the end of the post, imply the purpose is for performance? By deferring work until a later date?
"The basic point here is that “async” is not some magic mustard you smear all over your API to make it fast. Asynchronous APIs do not go faster. They go slower. However, they prevent other parts of the program from having to wait for them, so overall program performance can be improved."
Other parts of the program _other than the work we've asked it to do_? What if we're only "correctly making" one directory? So we intentionally make our code slower... So that "other code" can run? He continues:
"This makes the API a bit trickier to use, because the caller has to know to detect the error state. If it’s very rare, then there’s a chance that they might get surprised in production the first time it fails. This is a communication problem, like most API design concerns, but if performance is critical, it may be worth the hit to avoid artificial deferrals in the common cases."
So it's slower -and- more complicated, and we're gonna hide it behind a meme. Gotcha.
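For reference, the pattern being argued about fits in a few lines. This is a simplified illustration and not dezalgo's actual source; `alwaysAsync` and `maybeSync` are made-up names.

```javascript
// Wrap a callback so that it never fires in the same tick in which the
// wrapper was created, no matter how the called API behaves.
function alwaysAsync(cb) {
  let sameTick = true;
  process.nextTick(() => {
    sameTick = false;
  });
  return function (...args) {
    if (sameTick) {
      // Called synchronously: push the real callback to the next tick.
      process.nextTick(() => cb.apply(this, args));
    } else {
      // Already asynchronous: call straight through, no extra delay.
      cb.apply(this, args);
    }
  };
}

// Stand-in for an API that answers synchronously from a cache but
// asynchronously otherwise.
function maybeSync(cached, cb) {
  if (cached) cb("from cache");
  else setImmediate(() => cb("from disk"));
}

let called = false;
maybeSync(true, alwaysAsync(() => {
  called = true;
}));
const calledSynchronously = called;
console.log(calledSynchronously); // false: the callback was deferred
```

Whether that guarantee is worth the extra tick is exactly what the two sides here disagree on.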
The other issue is let's say you have some code like...
    var f = 1
    doSomeOperation(function done(){
      console.log(f)
    })
    f = 5
If doSomeOperation calls done() sometimes synchronously and sometimes asynchronously, it will sometimes log 1 and sometimes log 5. If doSomeOperation always works one way, it's more consistent. It's not a perf thing, it's just consistency.
Why wouldn't you naively assume sudo npm was safe if you wanted a global package (I know the behavior of npm...)
This is blaming the victims. If the user can blow their foot off, it's not the user's fault.
"Thankfully, it only affected users running npm@next, which is part of our staggered release system, which we use to prevent issues like this from going out into the wider world before we can catch them. Users on latest would have never seen this!"
If you are updating to the latest pre-release of something within mere hours of it dropping, and you are updating production systems (presumably ones that have some business value) with no previous testing, then the consequences of that aren't on the devs; they are 100% on you. And you don't deserve to call yourself an IT (or Ops or DevOps or what-have-you) professional; that is amateurish behavior in the extreme.
I wrote about this in longer form here: https://github.com/crystal-lang/crystal/pull/3328#issuecomme....
http://appleinsider.com/articles/09/10/12/snow_leopard_guest...
(Yeah, it's that much-vaunted Snow Leopard.)
I do remember scrambling to recover my backups. Back then, I didn't make full-disk backups, so I had to assemble my user folder from various places. Everything else that transpired that night and the day after remains a haze.
With something that has as many people using it, it's just... I dunno, it's disheartening.
Edit: oh well, this was a @next release only. Not as bad. Still scary.
That is a recipe for disaster.
If you are reading this: You are doing great work, I wish you the energy and strength to ignore the trolls.