The official reason is that the internals of Git weren't conducive to the kinds of invasive changes they needed/wanted. But I think the truth is closer to being that it was going to be too hard/slow to get those invasive changes past the Git mailing list.
Here's an example of a FB eng reaching out to the mailing list: http://git.661346.n2.nabble.com/Git-performance-results-on-a...
Which comes down to the same thing: mercurial was built to be at least somewhat pluggable, so facebook could build their extensions independently and work to get a subset of them integrated into mainline. Git is designed so it can be built on top of, but not really under or within.
The mailing list piece is from 2012, and describes how git is very slow on a synthetic repo with millions of files and commits. Today, my current place of work has a monorepo that’s approaching the size described in that mailing list, but git seems to be holding up just fine. If you check out a branch that’s far enough away from master it takes a minute, but add, rebase, commit, status and blame are all negligibly impacted speed-wise. The only issue we run into is rejected non-conflicting pushes to master during peak hours, when maybe several dozen engineers are trying to merge and push to master simultaneously.
Does anybody have any insight into what’s changed in git internally since 2012 to support bigger repos?
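Not a full answer, but some of what git grew since then is opt-in per repo: the serialized commit-graph, the untracked cache for status, index v4, and so on. A rough sketch of turning a few of these on (config names are from git's own documentation; the minimum git version varies per flag, and the throwaway demo repo is just for illustration):

```shell
# Demo repo in a temp dir -- purely illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q .

git config core.commitGraph true     # read the serialized commit graph
git config core.untrackedCache true  # cache untracked-file scans for `git status`
git config feature.manyFiles true    # umbrella flag: index v4, untracked cache, etc.

git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "seed"
git commit-graph write --reachable   # precompute commit metadata used by log/blame
```

That last command writes `.git/objects/info/commit-graph`, which is a big part of why traversal-heavy operations stopped scaling with repo history the way they did in 2012.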
> Have you considered upgrading all of engineering to SSDs? 200+GB SSDs are under $400USD nowadays.
Funny, but I'm starting to wonder if there's an affect-based complement to Conway's Law.
Interestingly, Google also still contributes to mercurial[0], though I don't think it has officially said why.
But yes, my understanding is that mercurial makes it much easier to replace parts of the flow without hacking the codebase to bits. Microsoft's solution for git required them to fork the code base, which I still don't think has been fully merged with upstream yet? And as others said, the mercurial community was more open to enterprise contributions than git's.
It was very promising, but then it suddenly stopped updating, and (more or less intentionally, it seemed) links stopped working. The site is still up 7 years later, though...
I suppose FB has better tools. But I won’t touch this until the ecosystem is mature enough (and also because git and hg are perfectly sufficient for the monorepo I oversee).
This was already starting to become true even before FB switched to Eden, and before it switched to Mercurial (grepping, for example, was already served by a custom grep service back in the Git days).
However, perhaps OS-level support would be preferable. Imagine a type of symbolic link that is not just followed, but executed when you access it. That would be really powerful and would allow this kind of optimization, and you wouldn't even need to install or run anything.
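There's already a crude approximation of this in POSIX: a named pipe, where a process "generates" the file's contents only when something opens it for reading. A sketch (the path and payload are made up):

```shell
# Rough approximation of an "executed on access" file: a FIFO whose
# contents are produced by a process only when a reader opens it.
mkfifo /tmp/lazy_blob

# The writer blocks until someone opens the pipe for reading,
# then produces the data on demand.
(echo "generated on demand" > /tmp/lazy_blob) &

cat /tmp/lazy_blob   # opening for read triggers the generation
rm /tmp/lazy_blob
```

A FUSE filesystem generalizes this, running arbitrary code on every open/read with no per-file setup, which as I understand it is closer to how EdenFS actually works.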
https://docs.bazel.build/versions/master/remote-caching.html
The SCM, the CI, the Merge Request system, the build system: they're all tied together into a single thing. Google has open sourced parts of it, but nothing exists as a whole unit the way Google does it.
To describe it as a filesystem matches my thinking [0]: "It already resembles a network file system, so it should provide an interface nearly as easy to use."
If it really takes off as an Open Source project, we might be able to "mount" repositories eventually.
For personal projects, it’s absolutely the bee's knees. I can have multiple checkouts of the same repo and ask fossil about every single repo I have; it has a really pleasant CLI, a web interface (which I personally hardly use anymore), manages tickets, ... sort of a github-in-a-box, but more pleasant.
"EdenSCM is not a distributed source control system. In order to support massive repositories, not all repository data is downloaded to the client system when checking out a repository"