I've always heard people talk about how it doesn't scale, but I use rebase like 99% of the time, and have worked on projects with hundreds of ICs. This is the first time I've seen someone explain it in a way where I get it. NO I'M NOT FORCE PUSHING TO MAIN YOU SILLY NILLY! Turns out I'm "squash rebasing." I guess I didn't know I need to specify that.
I do it slightly differently though. I use git commit --amend to build up a single commit as I go.
git checkout -b blah-feature
# do some work
git commit -m "Main description of my feature"
# do more work
# no -m, and then I just add bullet points for each subsequent change in vim (example below)
git commit --amend
Then, once I'm ready to make a PR I do the following:
# pulls from remote without merging
git fetch origin main
# adds my single commit to the end of the current main
# (origin/main, because the fetch above does not move my local main)
git rebase origin/main
# push up feature branch for code review
git push
# get yelled at about --set-upstream, and copy/paste that command :-)
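For anyone who wants to try the flow above without touching a real project, here is a rough sketch in a throwaway directory. The bare repo standing in for the remote, the second clone, and all file names are made up for illustration. It rebases onto origin/main rather than local main, since `git fetch` only updates the remote-tracking ref.

```shell
set -eu
work=$(mktemp -d)

# A bare repository stands in for the shared remote (hypothetical setup).
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/main

git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Dev"
git remote add origin "$work/origin.git"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"
git push -q -u origin main

# Feature branch: one commit, grown with --amend instead of new commits.
git checkout -q -b blah-feature
echo one > feature.txt
git add feature.txt
git commit -q -m "Main description of my feature"
echo two >> feature.txt
git add feature.txt
git commit -q --amend -m "Main description of my feature" -m "- do some sub task"

# Meanwhile a colleague (second clone) lands something on main.
git clone -q "$work/origin.git" "$work/other"
cd "$work/other"
git config user.email "other@example.com"
git config user.name "Other"
echo other > other.txt
git add other.txt
git commit -q -m "Unrelated change on main"
git push -q origin main

# Back on the feature branch: fetch, replay the single commit, push.
cd "$work/dev"
git fetch -q origin main
git rebase -q origin/main
git push -q --set-upstream origin blah-feature

ahead=$(git rev-list --count origin/main..HEAD)
behind=$(git rev-list --count HEAD..origin/main)
echo "ahead of origin/main: $ahead, behind: $behind"
```

The branch ends up exactly one commit ahead of the freshly fetched main, which is the whole point of the amend-as-you-go habit.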
My commit messages typically look like:
Add some new feature
- do some sub task
- do another sub task
- ...
The way I look at it, let's say I have 10 commits. If I rebase onto main, commits 3, 4, 7, and 10 have "conflicts" with the code that is on main now compared to when I started writing my feature branch and making commits. But now I have an opportunity to update the code in each of those commits as if I were writing it based on what is currently on main. If done like that, incrementally, it usually doesn't cascade into conflicts on each commit.
The problem, IMO, comes when at commit 3, the dev says "Oh, I did this like this in commit 10, so let me just put that solution here in commit 3 to resolve this merge conflict". Now, instead of 4 conflicts, you have conflicts on all of the commits between 3 and 10 (because the dev in effect moved the fix from 10 up 7 commits from where it originally was committed). Instead, each conflict resolution should aim to maintain the code as close to the committed code as possible while integrating the code from main. That way also, the feature branch commits still reflect the iterative process that having multiple commits is designed to show.
I don't embrace a FULL squash rebase, but I do embrace cleaning up your branch commits with an interactive cleanup rebase (not on main, just going through the commits for the branch and squashing any minor fixes that belong with the previous commit, etc.) THEN, once you have a clean feature branch, rebase main. The feature branch may now have, instead of the 10 commits above, eliminated 6 commits that were just things like minor test fixes, typos, etc, and now only has 4 commits total. Instead of 4 conflicting commits, it may now only have 1 or 2, making the rebase simpler as well. And the branch still maintains the traceable history of the development of that feature (assuming good commit messages were used, which is not something the original poster values either).
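A minimal sketch of that "cleanup rebase first" step, assuming the minor fixes were recorded with `git commit --fixup`. The repo, branch name `fb`, and file names are invented, and `GIT_SEQUENCE_EDITOR=true` is only there so the interactive rebase accepts the generated todo list without opening an editor.

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Dev"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b fb
echo parser > parser.txt
git add parser.txt
git commit -q -m "Add parser"
parser_commit=$(git rev-parse HEAD)

echo tests > tests.txt
git add tests.txt
git commit -q -m "Add parser tests"

# A small fix that really belongs to "Add parser"; --fixup marks it as such.
echo "parser fixed" > parser.txt
git add parser.txt
git commit -q --fixup "$parser_commit"

before=$(git rev-list --count main..fb)

# Clean up the branch against its own starting point, not against main's tip.
base=$(git merge-base main fb)
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"

after=$(git rev-list --count main..fb)
echo "commits on fb before cleanup: $before, after: $after"
```

Only after this would you run the rebase onto main, with fewer commits that can conflict.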
git checkout master; git pull
git checkout -b fb
git commit
git commit
# new changes arrive on master branch
git checkout master; git pull;
git checkout fb; git merge master
git commit
# went to the git brownbag and heard about rebase for the first time,
# missing that part up front about not intermingling merges with rebases
# and not having a good mental model of git
git rebase master
# WTF conflict everywhere! rebase sucks
While this is a sensible approach, it doesn't work in scenarios where you essentially have to "test in production" to actually test things (Jenkins, looking at you). I've experienced some of that in the real world, and combined with the inability to force-push, it seems like the squash rebasing is the most sensible thing to do to keep main clean within these constraints.
> get yelled at about --set-upstream, and copy/paste that command :-)
But you might like this https://git-scm.com/docs/git-config#Documentation/git-config...
alias gpush='git push --set-upstream origin $( git branch --show-current )'
> # adds my single commit to the end of the current main
The biggest benefit to this IMO is that you can resolve conflicts in YOUR branch, get it all cleaned up, and then when you merge there are no conflicts. This allows you to test any changes made during conflict resolution in your feature branch still.
> Add everything I’m working on (new and edited files).
Bad idea. Extraneous cruft that isn't caught by .gitignore will leak into the repo. Always run git diff and git status first to see what you are about to add.
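To make the "look before you add" point concrete, here is a small sketch in a scratch repo. The file names (`notes.tmp`, `build.log`) are hypothetical; `git add --dry-run` lists what would be staged without staging anything.

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.email "dev@example.com"
git config user.name "Dev"
echo "*.log" > .gitignore
echo code > app.txt
git add .gitignore app.txt
git commit -q -m "Initial commit"

echo change >> app.txt
echo debug > build.log      # covered by .gitignore
echo scratch > notes.tmp    # cruft that .gitignore does not catch

# Inspect first: what changed, and what is untracked?
git status --short
git diff --stat

# Preview what "add everything" would actually stage.
would_add=$(git add --dry-run .)
echo "$would_add"
```

The preview includes `notes.tmp` alongside the intended edit, which is exactly the kind of leak a blind "add everything" lets through.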
That's not a reasonable argument. The problem is pushing confidential info into a repository. It doesn't matter what the branch you push it to is called.
If something leaks and I need to add to .gitignore, I'll amend.
There are definitely cases for staging more complicated edits, but not in my daily work.
"But my feature branch gets squashed anyway"
1. I believe that in itself is a mistake, especially as feature branches get large and, though you say you're all for small feature branches, your commit history says you do something different.
2. You don't clean up that giant merge commit message with all those "WIP" "fixed stupid mistake" comments in there.
Use rebase to keep your workspace clean and main/master comprehensible to your colleagues.
> Push my code to the remote; likely spinning up lots of machines in CI to check my work.
That's a good callout; something that I have noticed I frequently miss when iterating on my code is the cost of the CI/CD systems already set up. It's something to consider especially when iterating using a system like Sapling for Stacked PRs; each individual PR push may update a chain which causes a lot of wasted CI/CD time.
They have a ton of other handy toys, even if the syntax is ... Very Git :-( https://docs.gitlab.com/ee/user/project/push_options.html#pu... I suggested a feature-request to add `git push -o gitlab.help`: https://gitlab.com/gitlab-org/gitlab/-/issues/359267 as well as showing the CLI equivalent for the web workflow:
CI variables: https://gitlab.com/gitlab-org/gitlab/-/issues/359268
> This isn’t grade school, you don’t have to show your work As you become an efficient engineer, the path you took to get to the final state of a pull request becomes far less important — and is academically interesting at best. You shouldn’t have to show your work like you did in school. Land the feature or bug and move on to the next one. The code speaks for itself (alongside some judiciously placed comments).
> Having granular annotations of all your work is unnecessary, ...
This premise underlies the particular workflow that the posted article assumes, and all the described command+option incantations are directed to it.
But there is another, very different git workflow used by a project we've all heard of, and that is the Linux kernel core code. The trunk.io workflow is unsuitable for Linux due to different requirements. Some of these are:
1. Commitment to support for indefinite future.
2. Large, complex feature PRs.
3. Human review of PRs, by maintainers fully empowered to reject.
4. A low-level programming language, in which subtle bugs are easily introduced.
Also different luxuries:
1. Willingness to put off merge of a "hot" new feature indefinitely.
So, kernel PRs are structured as a linear series of numbered patches meeting the requirement that each step along the way compile cleanly. This is primarily to ease the task of the maintainer responsible for the subsystem involved, who will have to deal with the fallout of bugs introduced by the PR. Example:
Credential: I have written code for the Linux kernel core, and it was merged, and it was buggy.
I don't think you will need to reclone if you did nothing wrong. You can always use `git reset --hard origin/...`, and if that does not work then you definitely did something wrong.
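A sketch of that recovery in a throwaway setup, assuming the branch in question is `main` (the branch name, the bare "remote", and the file are placeholders for illustration). Note that `--hard` also throws away uncommitted edits to tracked files, so it is worth checking `git status` first.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/main

git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Dev"
git remote add origin "$work/origin.git"
echo good > file.txt
git add file.txt
git commit -q -m "Known-good state"
git push -q -u origin main

# Make a mess locally: one bad commit plus uncommitted edits on top.
echo broken > file.txt
git commit -q -am "Oops"
oops=$(git rev-parse HEAD)
echo "more breakage" >> file.txt

# Instead of recloning: refresh the remote-tracking ref and snap back to it.
git fetch -q origin
git reset -q --hard origin/main

echo "HEAD is now: $(git log -1 --format=%s)"
# The discarded commit object still exists and can be found via the reflog.
git cat-file -e "$oops" && echo "old commit still recoverable: $oops"
```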
> But in Linus’s own words, Git is “the information manager from hell.”
That git is not the same git we have today, and it was handed over to another person quite early in the development. Although I agree that git ux is sometimes confusing.
> Limit your Git actions ... for peak Git efficiency
I pretty much disagree. If I can give my two cents: read the manual. Git has some really handy tooling that can help with non-Git issues (e.g. `git bisect`). Limiting your knowledge brings no benefit.
I like some points of the text, but overall I don't like the premise. It greatly exaggerates a problem to prove a point.
Sometimes they are described as such ("merged master into feature") and can be avoided if you review the PR per commit. But more often I want to review the PR as a whole, and then the tool fails to show a good diff of what's actually developed in the PR. This to me is a much larger problem than the log pollution, which can be solved by squashing.
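As a taste of what `git bisect` does, here is a toy run in a scratch repo. The eight commits and the "bug" (the appearance of a file called `status.txt`) are invented for the example; `git bisect run` treats exit code 0 from the check command as good and 1 as bad.

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Dev"

# Eight commits; the fifth one introduces the (hypothetical) bug.
for i in 1 2 3 4 5 6 7 8; do
  echo "change $i" >> log.txt
  git add log.txt
  if [ "$i" -eq 5 ]; then
    echo BUG > status.txt
    git add status.txt
  fi
  git commit -q -m "commit $i"
done

good=$(git rev-parse HEAD~7)   # "commit 1", known to be fine

# Mark the current tip as bad and the first commit as good, then let Git search.
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c 'test ! -f status.txt' >/dev/null

first_bad=$(git rev-parse refs/bisect/bad)
subject=$(git log -1 --format=%s "$first_bad")
git bisect reset >/dev/null 2>&1

echo "first bad commit: $subject"
```

On a real project the check command would be a test script or build step rather than a file-existence test.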
Other than that (missing the bigger reason for rebase and focusing on a lesser argument in my opinion) I quite agree with the article.
This has been my experience as well.
Great article, thanks! I've been using essentially this same subset of commands for many years, and it's worked extremely well for me: does everything I need/my team needs, and avoids complication. I'm glad to have this as a reference I can point people to when they ask for git advice.
The irony, IMO, is that Linus prefers C to C++ exactly because C++ gives developers too much rope.
That said, every time I try to really teach someone rebase, particularly a new dev, I understand why people shy away from it. I did for a very long time. So I totally understand and get why the above style workflow may terrify folks (or just seem unnecessary). It is easy to mess up and there are a lot of little gotchas if someone isn't careful. And worst of all, it can result in lost work (although even that is "usually" recoverable, but not always). I do think there are some benefits to it, and I think it is something that can be integrated into a dev's workflow a bit at a time. And it really doesn't take significantly more time, in my experience.
I'm not gonna argue here for adopting that. Except for "no commit messages", I'd be pretty ok with a workflow as outlined by this post. I do think folks should understand how rebase works, what commits will be moved/changed when they run a rebase (this is vital), and how to recover when a rebase goes bad (no, not reclone, not generally even delete branch and check it out again).
Couple random thoughts I try to communicate to folks who decide to utilize rebase more in their workflow:
- rebase often (if main updates often)
- if worried the rebase may be messy, create a temporary branch prior to starting the rebase at the feature branch HEAD - allows for an easy way back (and prevents lost code)
- don't rebase shared branches - this is a tool to use PRIOR to "sharing" (i.e. pushing) code
- squash/clean up unneeded commits before rebasing on another branch (this may bring it all the way down to a single commit, but for larger features, I think there is value in seeing the main decision points along the way)
- fix conflicts with the code at the specific commit you are on only, don't fix it with the eventual end result X commits down the line - this will generally avoid the dreaded "fix the same conflict over and over for each commit" problem some people encounter with rebasing
- remember rebase creates new history - it doesn't rewrite history (however, old commits will eventually be garbage collected)
- pro tip: understand how `rebase --onto` works, sometimes you shouldn't, or at least don't want to, take all of the commits
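Two of the tips above (the temporary backup branch and `rebase --onto`) can be sketched together like this. The branch names `fb` and `fb-backup` and the "experiment" commit are hypothetical; the idea is to replay only the commits after the unwanted one onto main.

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Dev"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b fb
echo exp > experiment.txt
git add experiment.txt
git commit -q -m "Experiment (not wanted)"
skip=$(git rev-parse HEAD)
echo b > b.txt
git add b.txt
git commit -q -m "Feature part 1"
echo c > c.txt
git add c.txt
git commit -q -m "Feature part 2"

# main moves on while the feature is in progress.
git checkout -q main
echo m > main.txt
git add main.txt
git commit -q -m "Main moves on"
git checkout -q fb

# Safety net: a temporary branch at the pre-rebase HEAD.
git branch fb-backup

# Replay only the commits after $skip (parts 1 and 2) on top of main.
git rebase -q --onto main "$skip" fb

count=$(git rev-list --count main..fb)
echo "commits on fb after rebase --onto: $count"
```

If the result looks wrong, `git reset --hard fb-backup` puts the branch back exactly where it was, since the original commits are still reachable from the backup branch.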