Wrangling 2000 Git Repos at Reddit

(old.reddit.com)

55 points · jdorfman · 2y ago · 62 comments

45 comments · 14 top-level

IshKebab · 2y ago · 5 in thread

That's crazy. Monorepo definitely makes more sense.

Though I always wonder - how do Google, Microsoft, Facebook etc. deal with developing code near the root of their dependency tree? Utility libraries for example. Technically you're going to have every change you make there building all the code and running all the tests, which is obviously unworkable. What do they do?

spankalee · 2y ago

First, you prune tests to only those affected by the change. There's basically nothing (other than the build system itself) at the root of all code because of the huge number of languages and platforms in the repo.
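The pruning step can be sketched as a walk over the reverse dependency graph. This is a minimal illustration, not any company's actual build system: the target names and the `deps` mapping are made up, and a real build tool would derive the graph from its build files.

```python
from collections import defaultdict, deque


def affected_tests(deps, changed, tests):
    """Return the tests that transitively depend on any changed target.

    deps maps each target to the targets it depends on directly.
    """
    # Invert the graph: for each target, record who depends on it.
    rdeps = defaultdict(set)
    for target, direct in deps.items():
        for dep in direct:
            rdeps[dep].add(target)

    # Breadth-first walk from the changed targets along reverse edges.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for dependant in rdeps[node]:
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return sorted(t for t in tests if t in seen)


# Hypothetical targets, for illustration only.
DEPS = {
    "strings_test": ["strings"],
    "web": ["strings", "http"],
    "web_test": ["web"],
    "http_test": ["http"],
    "billing_test": ["billing"],
}
TESTS = {"strings_test", "web_test", "http_test", "billing_test"}

print(affected_tests(DEPS, ["http"], TESTS))  # ['http_test', 'web_test']
```

A change to `billing` in this toy graph selects only `billing_test`, which is the point: the cost of a change scales with how many things depend on it, not with the size of the repo.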

But you still have library code that's depended on so much that you can't easily run all the tests for each change. So you run a train system so that you can run tests for a whole bunch of changes (those on the same train) at once. Developers have to schedule changes to get on the next train. Other code analyzes the failures from the train to try to apportion blame correctly.
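The comment does not say how blame is apportioned. One simple possibility, shown here purely as an assumption, is to bisect over the ordered list of changes on the train. The sketch assumes failures are deterministic, that an empty train passes, and that once a bad change is applied every longer prefix also fails; `passes` is a hypothetical callback that runs the suite against a prefix of the train.

```python
def find_culprit(changes, passes):
    """Return the first change whose inclusion makes the train fail, or None.

    passes(prefix) runs the tests with the given prefix of changes applied
    and returns True on success.
    """
    if passes(changes):
        return None
    # Invariant: passes(changes[:lo]) is True, passes(changes[:hi]) is False.
    lo, hi = 0, len(changes)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(changes[:mid]):
            lo = mid
        else:
            hi = mid
    return changes[hi - 1]


# Hypothetical train of five changes where "c4" breaks the build.
train = ["c1", "c2", "c3", "c4", "c5"]
print(find_culprit(train, lambda prefix: "c4" not in prefix))  # c4
```

This needs roughly log2(n) extra test runs per failing train instead of one run per change. Flaky tests and multiple interacting culprits break the assumptions, which is presumably part of why real systems need more elaborate analysis.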

Then, because getting on trains can be cumbersome and tests you care about can fail because of other changes, you build random test sampling and smoke test subsets so that you can get some immediate results to attach to a review, etc.

It roughly works, as well as anything can at that scale.

IshKebab · 2y ago

Are you speaking from experience or is this just a guess?

Merge trains/queues don't really help with the problem I'm describing - they just prevent race conditions between commits. You still need to run all the tests for each PR/MR first before joining the queue/train.

aylmao · 2y ago

1. Types, tests, all the build-time analysis you can rely on.

2. Gradual migrations. You run two versions for a while and make it easy for certain codebases and/or users to be switched between the new and the old as necessary.

3. Sometimes you just don't touch that stuff, slowly build new things with new technology and hope eventually the old thing will fall out of use enough that you'll be able to delete it.
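A gradual migration of the kind described in point 2 is often driven by a deterministic rollout switch. The sketch below is one way to do it, under assumptions of my own: the function name, the percentage knob and the per-user override table are all hypothetical.

```python
import hashlib


def use_new_version(user_id, rollout_percent, overrides=None):
    """Decide whether a user is served by the new implementation.

    overrides lets specific users be pinned to the old (False) or new (True)
    version regardless of the rollout percentage.
    """
    overrides = overrides or {}
    if user_id in overrides:
        return overrides[user_id]
    # Hash the ID so the same user always lands in the same bucket (0-99).
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent


# Pin one hypothetical user back to the old version while rolling out to 25%.
pinned = {"user-with-a-bug-report": False}
for uid in ["alice", "bob", "user-with-a-bug-report"]:
    print(uid, use_new_version(uid, 25, pinned))
```

Because the bucket comes from a hash rather than a random draw, raising the percentage only ever moves users from old to new, and switching someone back is a one-line override.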

kleton · 2y ago

Very conservatively, with automation for large scale changes. Google and Facebook do run all the tests, but Facebook has way fewer tests.

IshKebab · 2y ago

Do they not have "internal" releases for these root components that are not depended on by everything, so you aren't having to wait 2 days and use a gazillion hours of compute for every single change though?

nolist_policy · 2y ago · 5 in thread

I don't get the hating here. With the right tooling it doesn't matter if it's 10, 100 or 2000 repos. And it buys you some nice things like per-repo permission settings.

AlotOfReading · 2y ago

What's your preferred solution to coordinating PRs across multiple repo boundaries? This is an enormous pain to do manually and the only tooling solution to it I've seen is Gerrit, which I find hard to describe as "right".

Jach · 2y ago

I believe the "right" way to do it is to have every repo only able to talk to another repo's code via versioned APIs. So e.g. a client repo can ship an update that will talk to some service repo's code at either version N or N+1, the server repo can later on leisurely ship its update that actually makes version N+1 available (and has to support however many prior versions by policy). Or vice-versa, the server side can have N and N+1 already in service while the clients roll out their updates. You have to do something like this anyway if you have a mobile app.

Of course it has consequences and makes some things harder, you really have to go in on API-first even for things like UI widgets if different ones live in different repos, but it at least alleviates the common problem of trying to get multiple repo commits merged at the same time and have everything keep agreeing. I still prefer the monorepo approach.
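The N / N+1 arrangement above amounts to version negotiation: each side advertises the API versions it supports and they settle on the highest one in common. A minimal sketch, assuming integer version numbers and a hypothetical `negotiate` helper:

```python
def negotiate(client_versions, server_versions):
    """Pick the highest API version that both sides support."""
    common = set(client_versions) & set(server_versions)
    if not common:
        raise ValueError("no API version in common")
    return max(common)


# The client ships first and can speak version 1 or 2.
print(negotiate({1, 2}, {1}))     # 1: server has not rolled out v2 yet
print(negotiate({1, 2}, {1, 2}))  # 2: server caught up, no client change needed
```

Neither repo has to merge in lockstep with the other. The cost is the policy burden the comment mentions: the server has to keep old versions alive until usage of them has drained.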

MenhirMike · 2y ago

It's one of the reasons I prefer monorepos in practice. Especially if you use something like GitHub Enterprise, the cross-repo experience is terrible. In theory, many repos is better than a monorepo. In actual practice, most tooling only works well for monorepos. (And don't get me started on CI/CD tooling like GitHub Actions)

Or you do what the .NET team at Microsoft did, create a separate monorepo that rolls up individual repos: https://github.com/dotnet/dotnet

sethammons · 2y ago

Don't? You make backwards compatible changes and provide an API. To sunset a thing takes coordination and measuring usage.

If you are having to keep two or more modules/libs/packages/repos in sync, those should be unified.

aylmao · 2y ago

I mean, with the correct amount of investment, research and work, one can build a livable house out of Lego. I'm sure that gets you some nice things too.

It doesn't change the fact other people will think: "why didn't you just build it out of normal bricks to save yourself all that trouble?".

hackmiester · 2y ago · 5 in thread

Non-legacy Reddit link: https://reddit.com/r/RedditEng/comments/1bdtrjq/wrangling_20...

GenerocUsername · 2y ago

The hacker news crowd is overwhelmingly in support of old.reddit. I mean look at the UI of this place. Obviously this crowd appreciates substance more than Web 3.0 ad trackers and GIF chat

hackmiester · 2y ago

I don't come here because of the UI, I come here because of the content. If it were my site, the UI would be a little more suited to the current year.

If everyone loved it, there'd be no Hacker News mobile apps.

emestifs · 2y ago

Probably better that op posted the unshitified link

woadwarrior01 · 2y ago

There's a nice browser extension for that. I've been using it for a while now.

https://github.com/tom-james-watson/old-reddit-redirect

hackmiester · 2y ago

I don't agree, but I'm sure that was clear from my initial comment.

conjecTech · 2y ago · 3 in thread

I worked at Reddit in the not so distant past. The entire recommendation system lived in 3 repos. I'm pretty sure there are just 2000 repos because the onboarding tutorials have you create one, and that number is probably around the number of engineers that have worked there. I'd guess 100-200 have some production component.

alumic · 2y ago

If this is the case, then why in the world wasn't that mentioned in TFA? I'm not questioning your comment but it strikes me as odd that there would be no attempts to clean any of that up.

SahAssar · 2y ago

When I got the responsibility for our AWS accounts and GitHub repos there were over 1000. 99% are not used, and closing them down is a PITA since AWS only allows you to close 10% of your accounts per month and finding out who actually uses a GitHub repo can be hard (if it is not committed to regularly). My predecessor didn't care as long as they didn't cost too much, which is a reasonable stance, but I want to know what is running.

I'm working my way through them but it takes quite a bit of time and checking in with people on what is actually used.
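A first pass at that kind of audit can be sketched as a filter on last-commit dates. The repo names and dates below are invented, and the data is assumed to have been exported already (the sketch deliberately does not call any hosting API). As noted above, a quiet repo may still be in use, so the output is only a list of candidates to ask people about.

```python
from datetime import datetime, timedelta, timezone


def stale_repos(last_commit, now, max_age_days=365):
    """Return repo names whose last commit is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(name for name, ts in last_commit.items() if ts < cutoff)


# Hypothetical export: repo name -> timestamp of the most recent commit.
NOW = datetime(2024, 3, 15, tzinfo=timezone.utc)
LAST_COMMIT = {
    "recsys-core": datetime(2024, 3, 1, tzinfo=timezone.utc),
    "onboarding-jane": datetime(2021, 6, 2, tzinfo=timezone.utc),
    "old-billing": datetime(2022, 11, 20, tzinfo=timezone.utc),
}

print(stale_repos(LAST_COMMIT, NOW))  # ['old-billing', 'onboarding-jane']
```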

Besides that, there is some prestige in saying that you handle 2000 repos instead of it being "we have 20 prod repos and 1980 personal playground repos with one commit".

squigz · 2y ago

Why would it be? That doesn't look good at all - why not clean them up periodically, or simply after onboarding? - and this makes them seem more capable than if they were managing 1/10 the number of repos.

(This is assuming GP is correct. I have my doubts)

sethammons · 2y ago · 3 in thread

To those who are swinging towards monorepos, I don't think that is a good solution. The reason is that developers simply cannot be trusted to "do the right thing" on data and module boundaries. Someone comes in new to the project and does something they don't know they shouldn't. It is the honor system backed by weak linters and tooling.

In our monorepo, everyone passes around django orm objects and boundaries are practically non-existent. N+1 queries abound. Tests are full of patching and mocking and are _slow_. Our build takes over an hour to run tests. Someone on team A can and absolutely will mess up what someone on team B is doing. We are now having to spend quarter upon quarter as we define and enforce domain boundaries within the python code base. It is all bolted on checks. Tests are getting worse and people are actively trying to figure out ways around the testing system because it sucks.

Compare to my last gig. We had several hundred production repos. Each repo starts from a template with its own build pipeline. All production repos are gated so that any PR must pass tests before it can merge. Any merge has to pass tests before it could be deployed. As the base build processes matured, teams could, at their leisure, pull their services up to the latest and greatest. We even migrated from Jenkins to Buildkite; yeah, it took N pulls into N repos. Not a big deal. Most projects' tests and builds could get code out to production in under 10 minutes, including all those checks. Due to the network boundary, you couldn't accidentally get around someone's abstraction. And if one team blew up their build doing something dumb? No problem, it only affects that one team.

The argument is "gah, managing all those services!" Keep data behind APIs. Keep APIs backwards compatible. Keep dependencies acyclic. This is _possible_ with monorepos, but you have to do extra work compared to networked services -- yes, when any particular team/service can deploy in minutes due to low build system complexity you are winning. Can you get that wrong and make strange cyclic dependencies and introduce performance issues due to network hops? Yeah, of course. However, we were processing, literally, 10s of billions of api requests on this system and teams could work untethered from one another. The new gig does eerily similar software, but is several orders of magnitude slower in their ability to process data and their ability to move new features.

Yes, yes, you could have networked services and a monorepo and you can leverage tooling like Pants to minimize the testing to only account for changed files. It is just fighting what I have found to be a better model. Keep things separate. Keep things fast to change.
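The "keep dependencies acyclic" rule is one that can be checked mechanically in either model. A minimal sketch, assuming the service (or module) dependency map has already been collected into a dict; the service names are hypothetical.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None if acyclic.

    deps maps each service to the services it calls directly.
    """
    IN_PROGRESS, DONE = 1, 2
    state = {}
    stack = []

    def visit(node):
        state[node] = IN_PROGRESS
        stack.append(node)
        for dep in deps.get(node, ()):
            if state.get(dep) == IN_PROGRESS:
                # dep is already on the current path: that closes a cycle.
                return stack[stack.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[node] = DONE
        return None

    for node in list(deps):
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print(find_cycle({"api": ["orders"], "orders": ["billing"], "billing": []}))
# None
print(find_cycle({"orders": ["billing"], "billing": ["users"], "users": ["orders"]}))
# ['orders', 'billing', 'users', 'orders']
```

Run in CI, a check like this fails the build when someone introduces a back-edge, so the rule is enforced by tooling rather than by convention.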

aylmao · 2y ago

There are three systems an org can (and should) have in place to prevent this:

- Dev-time analysis. That is, types, linters and tests that will throw errors and block merges before they happen.

- A good code-review environment. Not one that's so tedious that everyone just seeks the stamp to move on, but also not one that's so lax there's no code review at all. One in which code review is really an opportunity to improve the code before it's merged, people are committed to unblocking others, and everyone assumes their responsibility in having the right people look at changes.

- A better team structure. One where ownership of the code is shared, and code quality goals are present.

Usually if computer systems, peers and the org at large all agree in creating a good dev environment, dev environments tend to not be bad.

In my experience not having a monorepo doesn't fix the issue, it just hides it away. Repos don't have the same code-quality, and overall everyone tries to stay in their little island and not deal with problems and disagreements outside of it. You lose the opportunity to talk about these issues. There isn't that learning and teaching dynamic in the wider org, and the whole company is worse off as a consequence.

Plus, of course, there's a lot of extra-overhead in developing with APIs between each component.

lijok · 2y ago

You're describing a failure in leadership, not a failure in the monorepo model. If your dev leads are not bringing these issues up to management, or management are not backing them on solving these issues, you're doomed whether you're doing monorepos or not.

Monorepos have their drawbacks (and I personally don't like them either), but being unable to trust devs is not one of them.

If you can't make monorepos work, what makes you think you can get "microrepos" to work?

alumic · 2y ago

While I have yet to work with a large Django monorepo, it's the architecture I'm leaning towards for a product I'm building. Are you guys using packages like nplusone [0] to help with some of this ORM complexity? As I mentioned, I am headed in this direction so I'm curious to hear your insight. I'd also love to pick your brain on any other pitfalls of this approach.

[0] https://github.com/jmcarp/nplusone
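For readers unfamiliar with the term, the N+1 pattern itself can be shown without Django. The sketch below uses the standard-library `sqlite3` module and a small counting wrapper of my own (`run`); the schema is invented, and this does not show how nplusone detects the pattern, only what the pattern is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO post VALUES (1, 1, 'a'), (2, 2, 'b'), (3, 3, 'c');
""")

queries = []  # every statement issued through run() is recorded here


def run(sql, params=()):
    """Execute one statement, record it, and return all rows."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    # One query for the posts, then one more per post for its author.
    rows = run("SELECT author_id, title FROM post ORDER BY id")
    return [
        (title, run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0])
        for author_id, title in rows
    ]


def titles_joined():
    # The same result fetched in a single round trip.
    return run(
        "SELECT post.title, author.name FROM post "
        "JOIN author ON author.id = post.author_id ORDER BY post.id"
    )


queries.clear()
slow = titles_n_plus_one()
print(len(queries), "queries for the N+1 version")

queries.clear()
fast = titles_joined()
print(len(queries), "query for the joined version")
print(slow == fast)
```

With an ORM the per-row query is hidden behind an attribute access, which is why the pattern spreads easily when ORM objects are passed across module boundaries.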

airstrike · 2y ago · 2 in thread

You know, big sweeping refactors deservedly get a bad rep, but as with everything else in life, there are always exceptions.

At some point, I don't know, maybe when you cross the 100 repos mark, you've gotta ask yourself "maybe we could try a different approach?"

It's not like reddit has been known for its wonderful stability over the years

I'm sure the scale here is completely unlike anything I've ever worked on, but how hard can it be to write a sane implementation of a message board?

I'd be curious how much of this problem is caused by the junk that is "new reddit". I've been there since 2007... The day old.reddit.com dies is the day I abandon it for good.

lp0_on_fire · 2y ago

My guess is that old Reddit will disappear in short order once the IPO is done. I have no inside knowledge, just a feeling. Once there are external shareholders it will be hard to justify maintaining both versions of the site.

Hope I’m wrong.

MenhirMike · 2y ago

I am honestly surprised that it is still around. I expected them to kill it off once they hopped onto the "We want to be Instagram" train with the new design and give it maybe a month or three. Who knows, maybe they just can't find the correct repo for it?

A small part of me actually hopes they kill off old reddit, so I have a reason to completely stop using it :)

mebazaa · 2y ago · 2 in thread

Yes, the Reddit dev team might have spawned a 2000+ repo mess, but they also host it under the snooguts.net domain name, which is objectively adorable, so all is forgiven.

GenerocUsername · 2y ago

I for one am sick of cute mascots being used as excuses and distractions from the actual substance of conversations. It's a very Reddit thing too. Funny how that works.

alumic · 2y ago

Along these same lines, it is common at my organization for people to gleefully say "welcome to ${ORG}!" or "It's the ${ORG} way!" in response to some failure that comes down to managerial knee-jerk or some fundamentally broken process.

It always seemed so perverse to me to promote this sort of tacit approval--as you say--passing it off as "cute" instead of... improving things.

tayo42 · 2y ago · 2 in thread

I worked with a monorepo and multiple teams dedicated to the dev experience. I had my complaints, but in hindsight I was spoiled.

I know they did a lot with git to make it manageable. Hopefully whatever they did makes it to the open source world eventually so we can all avoid these crazy thousand-repo worlds.

munchbunny · 2y ago

Having worked with both, I think it's really much more about having someone (a team) dedicated to dev experience, maintaining good onboarding tooling, CI/CD, build infrastructure, security configurations, SOP's around package repositories, container image registries, static code analysis tooling, and the works. If you have 2000 repos but finding what you need is easy, then the number itself is not an issue. If you have 1 repo and running unit tests for a small component before merging takes 8 hours, again the number of repos is probably not the issue.

hinkley · 2y ago

Some of the big monorepos drive me nuts too. Finding things in opentelemetryjs is a test of my patience. Every single time.

ZephyrBlu · 2y ago · 2 in thread

2000 repos what the fuck. More repos than engineers sounds terrible. Having worked in a large monolithic repo, I much prefer that. Everything (Shipping, testing, debugging, etc) is much easier that way.

hinkley · 2y ago

I think we had 200 across three teams and that was starting to become miserable. Which is to say I was miserable and others were slowly coming around.

I started eyeballing two of them to combine because they were dominating stack trace frames, and after a number of rounds of feature toggle cleanup and refactoring to use other modules, had dwindled down to about 60% of a reasonable module size.

Not a lot of this sort of work gets done.

klodolph · 2y ago

Eh.

I’ve worked at companies with large monorepos and I’ve worked on teams with more repos than engineers. At large scale, it takes good tooling to make it work well, and that’s true both for monorepos and for multirepos.

I do think that the monorepo tooling is better, but I think the difference isn’t so large. You can have a completely miserable experience in a monorepo or multirepo setup, or a good experience.

ivanjermakov · 2y ago · 1 in thread

Out of those 2k repos, how many of them are actually used in production?

Alifatisk · 2y ago

I think it’s hard to tell, especially for a large organization.

MilStdJunkie · 2y ago · 1 in thread

Holy Jesus Buddha Muhammad on a Harley. 2000 repos for a messageboard?!

I don't think someone knows what "repository" means.

At least they're bringing in Sourcegraph. That tool's helped me make sense of some chaos. Not 2000 repos' worth of chaos, but still, some chaos.

hinkley · 2y ago

I bet the commit history on half of those is completely opaque too.

The nice thing about 2024 is that it’s not 1999. But every once in a while I run into people making 25 year old mistakes.

heads · 2y ago

We used to have an ecosystem like this. In our case it reflected an entrenched set of divisions between warring teams. In some ways it may have then enhanced those positions and we still bear a few of the scars today.

A lot of the old guard have left the company though and our main product moved from four repos to just one. The threat from the legal team to have enforced OWNERS files — essentially replicating the divisive politics of the old repos but in the monorepo — thankfully withered on the vine. We still audit what goes into each release but it’s no longer part of any active permissions thing. We trust our developers but verify, for legal reasons, that nothing went wrong.

You either want one engineering team to act in unison behind your company’s mission, or you want to live a divisive narrative that you are actually multiple teams “working” together with none of the advantages of living under one roof and all the disadvantages of hard repository boundaries crisscrossing your intellectual property.

So many factors threaten to curdle your team dynamic: multiple offices, multiple floors, work from home hermits, bad management, etc. It’s simply org entropy and it takes much effort to keep the weeds out of the garden. Multiple repositories is one less bullet you can keep out of your feet while fighting all the other battles that threaten to turn your team from 1990s Sun Microsystems into 2010 Sun Microsystems.

miduil · 2y ago

I can't believe how it is working in such a big structure with just GitHub alone. GitLab with groups/subgroups and also integrated Sourcegraph seems much more practical at this scale.

ydnaclementine · 2y ago

Sounds made up to explain why the R&D costs in their IPO docs were 450 million or whatever
