Too many of us are so used to git clone'ing a repo and building the software with make or its descendants that we overlook the security considerations.
The issue here is that the "git clone ..." allows for arbitrary code execution so the flow of 1) clone 2) analyze 3) make breaks at step 1.
Looks like it's back to tarballs for me.
We've been totally conditioned to just wget some archive, unpack it and build it, and even if git clone takes it one step further in day-to-day practice there is no difference between the two.
It does not, actually. It's the submodule steps that create the vulnerability, and those have to be done manually. There are no standard or automated ways of pulling submodules; every project that uses them has its own scheme and provides its own build instructions. Frankly, they're pretty obscure, and in most communities they've been replaced by tricks like npm's dependency management instead. It's a goof, and it's fixed, but even for the most naive users the exposure is fairly low.
It's true that it happens early in the process, but it's not true that a simple git clone command is a vector.
I’m pretty sure that it would be trivial to hide malicious code in a tarball that you wouldn’t find (especially if you’re not expecting it).
npm from a security standpoint feels a little like a house of cards.
If you run code without trusting the author, you're likely going to have a bad time.
This is, of course, unexpected. And while you should perhaps raise an eyebrow if somebody you don't know asks you to recursively clone a repository that you're not interested in - this is indeed a problem and you should upgrade your Git client.
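For anyone wanting to check quickly whether their client is at least v2.17.1 (the patched release mentioned elsewhere in this thread), here is a rough sketch. `version_ge` is a made-up helper name, and it assumes your `sort` supports `-V`. Distributions may backport fixes to older version numbers, so a lower number does not necessarily mean you are unpatched.

```shell
#!/bin/sh
# Sketch: compare a Git version string against a minimum release.
# version_ge is a hypothetical helper name, not a Git command.
version_ge() {
    # Succeeds when $1 >= $2 under version ordering (relies on `sort -V`).
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage (assumes `git --version` prints "git version X.Y.Z..."):
#   installed="$(git --version | awk '{print $3}')"
#   version_ge "$installed" 2.17.1 || echo "upgrade your Git client"
```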
Given that running the code therein (if the owner of the repo is malicious) will hurt you, this doesn't do too much apart from having it happen earlier on :)
I guess one scenario where it could be a problem is if you were planning to clone untrusted code and read it all carefully before running it.
I've never seen this. It smacks of terrible practice that would not last long in a daily CI system. Once you have a good version, locking on to that version is one thing. Just randomly downloading new code versions for a daily build is impractical.
I'm not saying it's trivial, but more that people execute code from GH and similar all the time without reading or evaluating it, and this won't do any worse than that.
While I try not to run code I don't trust, I have a much more liberal point of view when it comes to cloning it. I assume others do too.
I'd be interested in hearing how you establish trust in the software you run. Assuming you're using git for cloning software code, do you include libraries that are dependencies of the code you're running in your trust calculations?
It can also be used for other cases where you'd like to amend what git does by default when updating the tree. See this recent thread[1] where some users want to have mtime behavior on files that's different from git's defaults, and one way to do that is via a post-checkout hook.
1. https://public-inbox.org/git/20180413170129.15310-1-mgorny@g...
Put
git submodule sync && git submodule update --init --recursive
into `.git/hooks/post-checkout` and never again wonder why your code doesn't work after switching branches (because you forgot to update submodules). Also put it into `post-rewrite` so that the same works for `git rebase`.
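A minimal sketch of setting that up, assuming you are fine with overwriting any existing hooks of the same name. `install_submodule_hooks` is a hypothetical helper, not something Git ships; it takes the hooks directory as its only argument.

```shell
#!/bin/sh
# Hypothetical helper: write the submodule-refresh hook into a hooks directory.
# Overwrites existing post-checkout and post-rewrite hooks without asking.
install_submodule_hooks() {
    hooks_dir="$1"
    mkdir -p "$hooks_dir"
    for hook in post-checkout post-rewrite; do
        # Each hook is a two-line script: a shebang plus the command above.
        printf '%s\n' '#!/bin/sh' \
            'git submodule sync && git submodule update --init --recursive' \
            > "$hooks_dir/$hook"
        # Git only runs hooks that are executable.
        chmod +x "$hooks_dir/$hook"
    done
}
```

From the top of a working tree it would be called as `install_submodule_hooks .git/hooks`. Keep in mind this makes submodule updates automatic, so it only makes sense for repositories whose submodules you already trust.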
- you have to tell git to use submodules for this to trigger (so `clone --recurse-submodules` or a manual `git submodule update --init`)
- credit for discovery goes to Etienne Stalmans, who reported it to GitHub's bug bounty program
- most major hosters should prevent malicious repositories from being pushed up. This is actually where most of the work went. The fix itself was pretty trivial, but detection during push required a lot of refactoring. And involved many projects: I wrote the patches for Git itself, but others worked on libgit2, JGit, and VSTS.
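Since nothing triggers until submodules are initialized, one option is a plain clone followed by a look at `.gitmodules` before running any submodule command. The sketch below assumes, going by the public advisory for this bug rather than anything spelled out in this thread, that the problem involves submodule names containing `..`. `suspicious_submodule_names` is a hypothetical helper and only a naive text match, not a substitute for upgrading.

```shell
#!/bin/sh
# Hypothetical check: list submodule section headers whose name contains "..".
# Assumption: path traversal in the submodule name is what we are looking for.
suspicious_submodule_names() {
    # $1 is the path to a .gitmodules file.
    # Exit status is 0 when at least one suspicious header is found.
    grep -E '^[[:space:]]*\[submodule[[:space:]]+".*\.\..*"\]' "$1"
}

# Usage, after a clone without --recurse-submodules:
#   if suspicious_submodule_names path/to/clone/.gitmodules; then
#       echo "do not run git submodule update here"
#   fi
```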
unpatched hosting site ->
in house (patched) v2.17.1 --bare mirror ->
unpatched client
The transfer.fsckObjects setting needs to be explicitly turned on for the in-house mirror so that it doesn't collude in passing the bad objects along from the unpatched hosting site. The protection in v2.17.1 only gets enabled by default if you're checking out a repository yourself, not if you're merely fetching and re-serving git objects[1].
Turning on receive.fsckObjects as the official v2.17.1 release notes suggest is not sufficient to protect against this attack. It needs to be transfer.fsckObjects, which also turns on fetch.fsckObjects, which is what's needed here.
1. https://public-inbox.org/git/20180529211950.26896-1-avarab@g...
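For the mirror scenario above, turning the setting on might look roughly like this. The repository path is a made-up placeholder; substitute wherever your bare mirror lives.

```shell
# Run on the in-house mirror; /srv/git/mirror.git is a hypothetical path.
git -C /srv/git/mirror.git config transfer.fsckObjects true

# Print the stored value to confirm it took effect.
git -C /srv/git/mirror.git config --get transfer.fsckObjects
```

Setting it per repository keeps the change scoped to the mirror. `git config --global` or `--system` would apply it to every repository for that user or machine instead.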
I should have clarified above, too: there were folks from GitHub, Microsoft, and Google working on the various fixes.
Not sure why the erroneous releases haven't been removed? Seems a bit confusing.
[0] https://github.com/git-for-windows/git/releases/tag/v2.17.1....
[1] https://github.com/git-for-windows/git/releases/tag/v2.17.1....
Chocolatey has the original 2.17.1, the 2.17.1-2 update is not yet approved.
Cygwin still only has 2.17.0-1. I wonder whether they were even part of the embargo.
I usually just track the repository's atom feed [0] and download urgent updates directly from there (git-scm.com eventually links to the releases published on GitHub anyway).
[0] https://github.com/git-for-windows/git/releases.atom
Edit: The Git 2.17.1.2 Chocolatey package has now been approved https://chocolatey.org/packages/git/2.17.1.2
None of the use-cases I read are convincing enough to allow `git clone` to do anything but what its short man description says.
I'm not even thinking about security, just basic separation of concerns. If `git clone` leaves a script-hooked repo in an unusable state for building, I want to know up front so I can complain to the maintainer and get that problem fixed.
Of course, then the goal just becomes attacking that whitelist, and all the complexity that comes with that. Security is hard.
I'm not sure those tools could 'find' and build a vuln, but there could be ways to analyze an algorithm and detect that it can do dangerous things it's not supposed to do. A little like how static analysis works.
I'm sure those tools are already built by the NSA at least, so they just have to peek into github repos, point out what code is vulnerable, give it to some developer to make an exploit. Done.
That way the NSA would clearly win the cyber arms race, and those pairs of eyes Torvalds was being quoted for would surely be obsoleted.
In addition to our recently implemented monthly non-critical security release process (we already had a critical release process before), we are making a number of changes in how we secure GitLab.com, which includes expanding our HackerOne program this year to be a public bounty program. As always, we appreciate the contributions of security researchers.
Storing config data outside the repo would not be a foolproof solution, but it would probably make things a little safer. (Having the <repo_root>/.git folder has always felt a little bit "in-band" to me, and I don't like it.)
Guess when it's not your direct product this is OK.