https://blog.acolyer.org/2019/04/18/keeping-master-green-at-...
His analysis indicates that, as part of its build pipeline, Uber breaks the monorepo up into "targets", creates something like a Merkle tree for each target (which is basically what git uses to represent commits), and uses that information to detect potential conflicts (multiple commits that would change the same target).
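A minimal sketch of that idea (all names here are hypothetical, not Uber's actual code): hash each target Merkle-style from its source files and its dependencies' hashes, then flag two pending commits as potentially conflicting when their changed-target sets overlap.

```python
import hashlib

def target_hash(files, dep_hashes):
    """Merkle-style hash of a build target: derived from its source files
    and the hashes of its dependencies, so a change anywhere below bubbles up."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode())
        h.update(hashlib.sha256(files[path].encode()).digest())
    for dep in sorted(dep_hashes):
        h.update(dep.encode())
    return h.hexdigest()

def changed_targets(before, after):
    """Targets whose hash differs between two snapshots of the tree."""
    return {t for t in after if before.get(t) != after[t]}

def may_conflict(targets_a, targets_b):
    """Two pending commits can conflict only if they touch a common target."""
    return bool(targets_a & targets_b)
```

With that, two commits touching disjoint targets can be tested in one speculative batch, much as independent repos would be in a multirepo setup.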
What it sounds like to me is that they end up simulating multirepo in their build system, so that tests can run on a batch of most-likely-independent commits. For multirepo users this is explicit, in the sense that it comes for free :-)
which is super interesting to me, as it seems to indicate that an optimizing CI/CD system has to deal with all the same issues whether it's mono- or multi-repo: the problems solved by your repo layout just reappear as a different set of problems your build system has to solve.
Only if you spend the time to build tools to detect commits in your dependencies, as well as your dependent repositories, and figure out how to update and check them out on the appropriate builds.
So, no, it doesn't come for free.
You are totally correct that to achieve the same performance, correctness, and overall "green"ness of master in a multirepo system, you would have to either define or detect dependencies and dependent repos, build the entire affected chain, and test the result. That part is much easier in a monorepo.
What I was referring to with "this" is Uber's method of detecting potential conflicts. In multirepo land it would be a "conflict" if two people commit to the same repo. In multirepo, therefore, detecting potential conflicts is trivial.
If Bob commits to repo A and Sally commits to repo B, their commits can't result in a merge conflict. Well, unless the repos are circularly dependent - which would be bad :-) don't do that. Of course, monorepo makes that situation impossible so there's an advantage for monorepo.
It seems like whether you have mono- or multi- the problems solved by one choice will leave other problems the build system has to solve that it wouldn't have to solve if the other option were chosen.
Different work would be required in multirepo but it would be work to solve the problems that monorepo solves just by virtue of it being a monorepo.
In fact, at Uber we have seen that behaviour with one of our popular apps when we did not have a monorepo. The construct of probabilistic speculation explained in the paper applies even in this scenario to guarantee a green master.
Or do you mean that multirepo could also benefit from the construct of probabilistic speculation, by ordering commits across multiple repos such that you maximize the number of repos changed before you build and minimize the number of commits applied to any single repo?
Or both :-)
We began working with the idea of consensus-based CI/CD. If you pushed a change, you published that to the network. This gave other systems the opportunity to run their full suite of tests against a deployment of your code. Some number of confirmations from dependent systems was required to consider your code "stable". This progressed nearly sequentially, assembling something like a blockchain.
Ultimately the client was unable to pull this off for the same reason they were unable to decouple the systems: lack of software engineering capability.
This is either brilliant or just something built for a promotion packet
>When an engineer attempts to land their commit, it gets enqueued on the Submit Queue. This system takes one commit at a time, rebases it against master, builds the code and runs the unit tests. If nothing breaks, it then gets merged into master. With Submit Queue in place, our master success rate jumped to 99%.
(I'm one of the authors as well as the tech-lead of the system.)
It still depends on well written tests, lest your confidence be dashed when a human starts pushing buttons and pulling levers.
Also, don't break up tightly coupled code/modules into separate repos for the sake of microservices. Hard working developers will have to do two or more builds, PRs, possibly update semvers, etc... Find the right seams. If two repos tend to always change in lockstep, think about merging.
They designed this out of a real need; it's not just a fancy project.
Merge Requests now combine the source and target branches before building, as an optimization: https://docs.gitlab.com/ee/ci/merge_request_pipelines/#combi...
Next step is to add queueing (https://gitlab.com/gitlab-org/gitlab-ee/issues/9186), then we're going to optimistically (and in parallel) run the subsequent pipelines in the queue: https://gitlab.com/gitlab-org/gitlab-ee/issues/11222. At this point it may make sense to look at dependency analysis and more intelligent ordering, though we're seeing nice improvements based on tests so far, and there's something to be said for simplicity if it works.
One useful metric is the ratio between test time and the number of commits per day. If your tests run in a minute, you can test submissions one at a time and still have a thousand successful commits each day. If your tests take an hour, you can have at most 24 changes per day under a one-at-a-time scheme.
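The arithmetic above in back-of-envelope form:

```python
def max_serial_merges_per_day(test_minutes):
    # One-at-a-time testing: each merge occupies the pipeline for the full
    # test run, so throughput is bounded by minutes-per-day / test length.
    return (24 * 60) // test_minutes

print(max_serial_merges_per_day(1))   # minute-long tests: 1440 merges/day
print(max_serial_merges_per_day(60))  # hour-long tests: 24 merges/day
```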
I worked on Kubernetes, where test runs can take more than an hour-- spinning up VMs to test things is expensive! The submit queue tests both the top of the queue and a batch of a few (up to 5) changes that can be merged without a git merge conflict. If either one passes, the changes are merged. Batch tests aren't cancelled if the top of the queue passes, so sometimes you'll merge both the top of the queue AND the batch, since they're compatible.
Here are some recent batches: https://prow.k8s.io/?repo=kubernetes%2Fkubernetes&type=batch
And the code to pick batches: https://github.com/kubernetes/test-infra/blob/0d66b18ea7e8d3...
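A greedy sketch of the batch-picking idea (my own toy version, not the linked test-infra logic; `mergeable_together` is a stand-in for a git merge-conflict check):

```python
def pick_batch(queue, mergeable_together, max_size=5):
    """Walk the queue in order, adding a change to the batch if it merges
    cleanly with everything already picked, up to the batch size limit."""
    batch = []
    for change in queue:
        if len(batch) == max_size:
            break
        if all(mergeable_together(change, picked) for picked in batch):
            batch.append(change)
    return batch

# Usage: pr2 and pr3 conflict, so pr3 is skipped and pr4 joins the batch.
conflicts = {("pr2", "pr3")}
mergeable = lambda a, b: (a, b) not in conflicts and (b, a) not in conflicts
print(pick_batch(["pr1", "pr2", "pr3", "pr4"], mergeable))  # ['pr1', 'pr2', 'pr4']
```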
Merges to the main repo peak at about 45 per day, largely depending on the volume of changes. The important thing is that the queue size remains small: http://velodrome.k8s.io/dashboard/db/monitoring?orgId=1&pane...
> Optimistic execution of changes is another technique being used by production systems (e.g., Zuul [12]). Similar to optimistic concurrency control mechanisms in transactional systems, this approach assumes that every pending change in the system can succeed. Therefore, a pending change starts performing its build steps assuming that all the pending changes that were submitted before it will succeed. If a change fails, then the builds that speculated on the success of the failed change needs to be aborted, and start again with new optimistic speculation. Similar to the previous solutions, this approach does not scale and results in high turnaround time since failure of a change can abort many optimistically executing builds. Moreover, abort rate increases as the probability of conflicting changes increase (Figure 1).
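The mechanics the quote describes can be sketched in a few lines (a simplification of what systems like Zuul do): build i assumes everything ahead of it passes, and a failure aborts every build behind it.

```python
def optimistic_builds(pending):
    """Build i runs against master plus all earlier pending changes,
    assuming each of them will succeed."""
    return [tuple(pending[: i + 1]) for i in range(len(pending))]

def builds_aborted_by_failure(pending, failed):
    """When one change fails, every build that speculated on its success
    (all later ones) must abort and re-run without it."""
    i = pending.index(failed)
    return pending[i + 1 :]
```

So with four changes queued, a failure in the second one throws away the work done for the third and fourth, which is why the abort rate grows with the conflict probability.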
Most of the complexity and suffering of a submit queue evolves from the interactions between your VCS and CI systems. Keeping things simple is great! Kubernetes' CI system is Prow, which runs the tests as pods in a Kubernetes cluster. Dogfooding like this is great, since the team you're providing CI for can also help fix bugs that arise.
It sounds like Uber's thing has a lot more smarts regarding deciding what gets tested. For the scale I work at (<200k lines of code), that isn't necessary.
At Amazon, for example, they have a multirepo setup. A single repo represents one package, which has a major version. Amazon's build system builds packages and pulls dependencies from the artifact repository when needed. The build system is responsible for "what" to build; "how" to build is left to the package setup (e.g. maven/ant).
I am currently trying to find a similar setup. I have looked at Nix, Bazel, Buck, and Pants. Nix seems to offer something close. I am still trying to figure out how to vendor npm packages and which artifact store is appropriate, and also whether it is possible to have the Nix builder pull artifacts from a remote store.
Any pointers from the HN community are appreciated.
Here is what I would like to achieve:
1. Vendor all dependencies (npm packages, pip packages, etc.) with ease.
2. Be able to pull artifacts from a remote store (e.g. Artifactory).
3. Be able to override packages locally for my build purposes. For example, if I am working on a package A which depends on B, I should be able to build A from source and, if needed, build B, which A can later use for its own build.
4. Support multiple languages (TypeScript, JavaScript, Java, C, Rust, and Go).
5. Have each package in its own repository.
And didn't you find that this created massive headaches trying to build many disparate and inconsistent dependencies across repos? In my opinion, the benefits touted for monorepos are exactly illustrated by the pain points of working with Amazon's multirepo setup.
"Refactoring an API that's used across tens of active internal projects will probably a good chunk of a day."
This was my experience.
I’m just curious, but in fairness both of these schemes have obvious issues that will become headaches or positive design depending on your outlook. Clearly you can engineer effectively in either scheme.
Bazel has target caching including remote caching which can be shared across multiple engineers/execution environments. The tricky part would be ensuring your builds are hermetic and reproducible (which is also easier to achieve in monorepo setup).
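For reference, wiring Bazel up to a shared cache is just a couple of flags (a hedged sketch; the cache endpoint and paths here are hypothetical):

```
# .bazelrc sketch — endpoint and path are placeholders for your setup.
build --remote_cache=grpc://build-cache.internal:9092
# Optionally keep a local on-disk cache as well:
build --disk_cache=~/.cache/bazel-disk
```

The caveat from the comment above still applies: a shared cache only helps if builds are hermetic, since non-hermetic actions can poison cache entries for everyone.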
Obviously there are many others who do not use monorepos (Amazon comes to mind), but it's reasonable to claim that monorepos are widely used, and fundamental where they are used.
https://medium.com/netflix-techblog/towards-true-continuous-...
They do have some benefits, but they also come with an immense cost.
Bors builds one change at a time. On the other hand, Submit Queue speculatively builds several changes at a time based on the outcomes of other pending changes in the system. Apart from that, Submit Queue uses a conflict analyzer to find independent changes in order to commit changes in parallel as well as trim the speculation graph.
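A toy version of the speculation idea (my own sketch, not the paper's logistic-regression model): each speculative build assumes a concrete pass/fail outcome for every change ahead of it, and a fixed build budget is spent on the outcome prefixes most likely to actually occur.

```python
from itertools import product

def best_speculations(success_probs, budget):
    """Enumerate speculative builds: the build for change i under outcome
    vector `outcome` is only useful if exactly that prefix of passes and
    failures happens, so rank builds by the probability of their prefix."""
    builds = []
    for i in range(len(success_probs)):
        for outcome in product([True, False], repeat=i):
            p = 1.0
            for j, passed in enumerate(outcome):
                p *= success_probs[j] if passed else 1 - success_probs[j]
            builds.append((p, i, outcome))
    builds.sort(key=lambda b: b[0], reverse=True)
    return builds[:budget]
```

With high per-change success probabilities, the budget naturally goes to the "everything ahead passes" chain, which is exactly the optimistic path; the conflict analyzer then trims branches that can't interact at all.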
We have also evaluated the performance of Single-Queue (the idea behind Bors) on our workloads. In fact, as described in the paper, the turnaround time of this technique at scale was so high (~132x slower) that we omitted its results. Submit Queue, on the other hand, operates in the 1-3x region compared to an optimal solution.
I recommend reading the paper for further details: https://dl.acm.org/citation.cfm?id=3303970
Bors builds multiple changes at once (it creates a merge commit of all available changes and then runs the tests on all of them), and merges if all of them are good.
Possibly you are thinking of the older bors, as opposed to modern bors-ng?
It relies on understanding the inputs and outputs for all CI build steps to work out how changes to particular files might conflict.
Also, it has a much more sophisticated understanding of how likely a change is to be the source of failure, which it updates in response to repeated test runs. It can then prioritise the changes which are most likely to succeed.
disclaimer: I am one of Datree.io founders. We provide a visibility and governance solution to R&D organizations on top of GitHub.
Here are some rules and enforcements around security and compliance that most of our customers use for multi-repo GitHub orgs:
1. Prevent users from adding outside collaborators to GitHub repos.
2. Enforce branch protection on all current and future repos - prevent master branch deletion and force pushes.
3. Enforce a pull request flow on the default branch of all repos (including future ones) - prevent direct commits to master without a pull request and checks.
4. Enforce Jira ticket integration - mention the ticket number in the pull request name / commit message.
5. Enforce proper Git user configuration.
6. Detect and prevent merging of secrets.
Flaptastic will make your CI/CD pipelines reliable by identifying which tests fail due to flaps (aka flakes) and giving you a "Disable" button to instantly skip any such test; the skip takes effect immediately across all feature branches, pull requests, and deploy pipelines.
An on-premise version is in the works to allow you to run it onsite for the enterprise.
Whenever our team has a significant number of flakey tests (more than 1-2) we usually schedule a bug squash session to fix them and amortize the cost over the whole team.
> monolithic source-code repositories
A monorepo is a monolithic repository
> This paper introduces a change management system called SubmitQueue that is responsible for continuous integration of changes into the mainline at scale while always keeping the mainline green. Based on all possible outcomes of pending changes, SubmitQueue constructs, and continuously updates a speculation graph that uses a probabilistic model, powered by logistic regression. The speculation graph allows SubmitQueue to select builds that are most likely to succeed, and speculatively execute them in parallel. Our system also uses a scalable conflict analyzer that constructs a conflict graph among pending changes. The conflict graph is then used to (1) trim the speculation space to further improve the likelihood of using remaining speculations, and (2) determine independent changes that can commit in parallel
Two code-conflict-free changes may pass a pre-merge build+test cycle independently but may logically break one another if both changes are merged into master. Using a submit/merge queue guarantees that each change has passed tests with the exact ordering of commits it would be merged onto. The example described here is a better explanation: https://github.com/bors-ng/bors-ng#but-dont-githubs-protecte...
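A toy illustration of such a logical conflict (using `exec` as a stand-in for a real test suite): change A renames a function and updates its callers; change B, written against old master, adds a new call to the old name. Each change is green on its own, but the merged result breaks with no textual conflict.

```python
def run_tests(source):
    """Stand-in test suite: exec the module and call its main()."""
    ns = {}
    exec(source, ns)
    ns["main"]()

master   = "def greet(): return 'hi'\ndef main(): greet()"
change_a = "def say_hi(): return 'hi'\ndef main(): say_hi()"  # rename + fix callers

def apply_change_b(src):
    # B adds a new caller of greet(), written against the old master.
    return src + "\ndef extra(): return greet()\ndef main(): extra()"

run_tests(change_a)                # A alone: green
run_tests(apply_change_b(master))  # B alone, on the master it was written for: green
try:
    run_tests(apply_change_b(change_a))  # both merged: NameError at runtime
except NameError:
    print("semantic conflict: only caught by testing B on master + A")
```

A merge queue catches this because B is tested rebased onto master-plus-A before it lands, not against the stale master it was written on.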
The fancy bits in this implementation from the paper are interesting but the model itself is not that unusual.
I guess I just have a hard time imagining how many busy developers really commit important work all at once on large projects...
Next step is to serialize all proposed changes, so they are rebased one on top of the other before running tests. This eliminates breakage due to merging, but does not scale:
> The simplest solution to keep the mainline green is to enqueue every change that gets submitted to the system. A change at the head of the queue gets committed into the mainline if its build steps succeed. > > This approach does not scale as the number of changes grows. For instance, with a thousand changes per day, where each change takes 30 minutes to pass all build steps, the turnaround time of the last enqueued change will be over 20 days.
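Checking the quoted figure:

```python
def last_change_wait_days(changes_per_day, minutes_per_build):
    # Fully serial queue: the last enqueued change of the day waits behind
    # every earlier build, one 30-minute build at a time.
    return changes_per_day * minutes_per_build / (60 * 24)

print(last_change_wait_days(1000, 30))  # ~20.8 days
```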
This paper is about scaling a variant of such queue.
But sure enough, we definitely weren't the first to go down this path. Facebook was using (or developing the tech for) server-side rebasing in 2015.[1] Gitlab provides native server-side rebase functionality, likely inspired by various parties already having developed tools to do the same.
These aren't new ideas. But handling them at the scale where you land hundreds or even thousands of commits a day to a repo and require the ability to deploy at will, that's where engineering comes into play.
0: https://smarketshq.com/marge-bot-for-gitlab-keeps-master-alw...
1: https://softwareengineering.stackexchange.com/questions/2787...