Much like the thesis of this article, the goal is to have a set of well organized commits so that when it comes time to do a PR review, you have 1-3 logical units of change and it also improves the ability to do reverts.
But you can very easily do this just by rebasing instead of resetting and re-committing.
In particular, `git commit --fixup <ref>` and `git commit --squash <ref>` are extremely useful for this during the "WIP" stage as well as when handling PR comments, and these are things that I only learned about in the last 6 months. I would recommend doing a google search on "fixup commits" to learn more about them. I enjoyed this article: https://www.mikulskibartosz.name/git-fixup-explained/
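For anyone who hasn't seen fixup commits in action, here is a minimal sketch in a throwaway repo. The file names and commit messages are made up for illustration, and `GIT_SEQUENCE_EDITOR=:` is only there so the generated todo list is accepted without opening an editor.

```shell
set -eu
# Throwaway repository in a temp dir (all names below are hypothetical).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "base" > README
git add README && git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

echo "nav" > nav.css
git add nav.css && git commit -q -m "Add navigation styles"
target=$(git rev-parse HEAD)

echo "footer" > footer.css
git add footer.css && git commit -q -m "Add footer styles"

# A later correction that logically belongs to the navigation commit.
echo "nav fixed" > nav.css
git add nav.css
git commit -q --fixup "$target"   # creates "fixup! Add navigation styles"

# Fold the fixup into its target; ":" accepts the todo list unchanged.
GIT_SEQUENCE_EDITOR=: git rebase -q -i --autosquash "$base"

git log --oneline
```

In day-to-day use you would drop the `GIT_SEQUENCE_EDITOR=:` part and look over the todo list before letting the rebase run.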
Yes, rebasing is scary if you're new to git, but it does everything in this article in a way that's much cleaner and once you learn how to use it effectively you'll feel like you have super powers. It's worth taking the time to learn.
If your branch consists of 10 commits of trying various things, then the sum of changes in all the commits can easily become significantly larger than the final branch diff. In that case resetting is orders of magnitude faster than fiddling with rebase.
I would probably branch off my branch for each logical change I wanted merged and use `rebase -i` to wholesale drop commits that I didn't want on the branch anymore.
This is also an example of where stacked diffs are far superior to PRs but that's a separate issue.
Use fixup commits.
Edit: to be clear, I know it's possible by using git reset... but I'd much rather have something like git add -i that interactively shifts changes from the commit itself and moves them into working directory-only changes.
2. mark the commit you want to split as an `edit`
3. remove the file(s) you want to edit from the index so that it's part of your working tree
4. use `git add -p` to stage the hunks you want in the first commit (assuming you want to split commits in a single file) or just commit the files you want in the first commit first, second commit second.
git rebase --interactive master
In Vim: ciw e ESC :x
git reset HEAD~
// Make your commits
git rebase --continue
When I'm working on a more complex patch series, it's not uncommon for me to have a bunch of already cleaned up, or semi-cleaned up commits on top of the remote branch I'm working against, and then a bunch of random garbage "WIP" commits. Something that often happens is that I `git reset` not to the upstream branch but to my most recent clean commit, take the resulting changes apart into individual commits (often, some of them will be squash/fixup as I discovered a bug or missing piece of earlier work) and then `git rebase` with `rebase.autosquash` enabled in my global Git config.
Sometimes you can even create empty commits (--allow-empty) up-front (if you already know about all the things that have to be changed) and fill them via fixup/rebase --autosquash
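A rough sketch of that placeholder idea, assuming a reasonably recent Git. The commit subjects and file names are invented for the example, `rebase.autoSquash` is set locally here rather than globally, and `--keep-empty` is passed explicitly in case an older Git would otherwise drop the initially empty commits.

```shell
set -eu
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git config rebase.autoSquash true   # local stand-in for the global setting

echo "base" > README
git add README && git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

# Plan the series up front as empty placeholder commits.
git commit -q --allow-empty -m "Add parser"
parser=$(git rev-parse HEAD)
git commit -q --allow-empty -m "Add CLI"
cli=$(git rev-parse HEAD)

# Do the work in whatever order it happens, targeting the placeholders.
echo "cli" > cli.txt
git add cli.txt && git commit -q --fixup "$cli"
echo "parser" > parser.txt
git add parser.txt && git commit -q --fixup "$parser"

# Fold each fixup into its placeholder; ":" accepts the todo list as generated.
GIT_SEQUENCE_EDITOR=: git rebase -q -i --keep-empty "$base"

git log --oneline
```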
The issue is one of review scaling. I wrote a blog post about this a while ago[0], but the gist of it is that those clean logical units are often too small for meaningful high-level reviews of more complex work.
With complex features or refactorings, you're often in a situation where those clean logical units allow reviewers to do a good low-level review (do a check for logic corner cases, style issues, etc.) but they don't allow a high-level review of how all the pieces of the feature work together.
IMHO the most open-source process friendly solution to the issue is to review patch series, where you can review the series as a whole for the big picture, but also dig into individual commits for the details. Building such a patch series requires an approach as described in the article.
(In closed source environments, you may get a good enough approximation of the result with a separate, disciplined software design process.)
[0] http://nhaehnle.blogspot.com/2020/06/they-want-to-be-small-t...
I don't care one lick what someone's branch history looks like. If they want to commit every day, or every hour, or after every keystroke - I don't care. All I know is that once the PR is merged, it's all going to be squashed into a logical unit so the `main` commit history will look just fine.
I could avoid creating those commits in the first place, but asking me to only commit my local changes when I have something intelligent to say about them, is a damn near impossibility for me. I basically use git to "save my work" before I try an approach to something. I reset back if it doesn't work out. Sometimes I reset back again, if the approach that didn't work out turned out to be the least worst option. I create commits to experiment with something, so that I can quickly compare back and forth between two approaches. etc. etc. etc.
I would instead say, that if you're only making commits when you have something logical to say, you're probably not using git to its fullest potential. You should really go nuts with it IMO, and only bother to sound intelligent in your commit messages once you're ready to do so, and (most importantly) you should still make commits before that happens. Git's decentralized for a reason, take advantage!
Nothing forces you to keep the messy commit messages either - keep the short commit message and make sure it's good, then delete the combined individual commit messages from the long message field, done.
If someone else is doing the merge and you’re unsure whether they will squash, you can preemptively squash your whole branch so there’s only one commit in the PR.
Spending effort on managing git is mental effort you don't spend on solving your actual problems. By far the best experience I've ever had with git was: everyone works straight on the dev branch, just rebase, fix your stuff, test often, and if you're doing some multi-day work then sure, branch and think it over then merge.
That's it. That's all you need. I've had a million more problems with every attempt at making this process "clean", or "smart". Dumb was by far more efficient, more enjoyable, helped us find and fix bugs faster, and had the shortest time to market ever.
Don't make work upfront, that is rarely important. When it is important, find the commit and split it up (as necessary) at that singular point (instead of across all PRs). You have now cut down how much time it takes to make PRs while retaining the same end result.
But this part set off alarm bells:
> Once you’ve finished making your changes, it’s time to prepare your work for some “git clean up.” To do this, we’ll run the following command:
> git reset origin/main
Be very careful here! If you've run `git fetch origin` since you've started your work, you may be resetting to a commit that's newer than what you based your work on, and thus you'll wind up creating a commit which effectively reverts anything that's happened on `main` since then.
The more technically correct command would be:
> git reset $(git merge-base origin/main HEAD)
Since that resets to the commit you started your branch from.
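To make the difference concrete, here is a small sketch in a throwaway repo. A local `main` branch stands in for `origin/main` (an assumption to keep the example self-contained), and the file names and messages are made up.

```shell
set -eu
# Throwaway repository; local "main" plays the role of origin/main.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "one" > a.txt
git add a.txt && git commit -q -m "A"
git branch -M main

# Feature branch with a couple of WIP commits.
git checkout -q -b feature
echo "wip1" > f.txt
git add f.txt && git commit -q -m "WIP 1"
echo "wip2" > f.txt
git add f.txt && git commit -q -m "WIP 2"

# Meanwhile main moves on (what a later fetch would bring in).
git checkout -q main
echo "two" > b.txt
git add b.txt && git commit -q -m "B"
git checkout -q feature

# Reset to where the branch started, not to the tip of main.
fork_point=$(git merge-base main HEAD)
git reset -q "$fork_point"

git status --short   # f.txt shows up as untracked, nothing from B is involved
git add f.txt && git commit -q -m "Add feature"
```

Resetting to the tip of `main` instead would leave `b.txt` missing from the working tree, so a blanket `git add -A` followed by a commit would silently delete it.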
it's more complicated but vastly more powerful than reset
Squash-rebasing on top of main makes git replay all of those dumb commits, just to squash them, and I have to fix merge conflicts individually, for every commit I don't even care about, before it's done.
I'd much rather soft-reset to the merge-base, make one clean commit, then rebase that one commit, than any of the other approaches.
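Roughly what that looks like, sketched in a throwaway repo. A local `main` is assumed as a stand-in for the upstream branch, and the file names and messages are hypothetical.

```shell
set -eu
# Throwaway repository; local "main" stands in for the upstream branch.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "one" > a.txt
git add a.txt && git commit -q -m "A"
git branch -M main

git checkout -q -b feature
echo "wip1" > f.txt
git add f.txt && git commit -q -m "WIP 1"
echo "wip2" > f.txt
git add f.txt && git commit -q -m "WIP 2"

# main advances while the feature is in progress.
git checkout -q main
echo "two" > b.txt
git add b.txt && git commit -q -m "B"
git checkout -q feature

# Collapse the WIP commits: --soft keeps the combined changes staged.
git reset -q --soft "$(git merge-base main HEAD)"
git commit -q -m "Add feature"

# Only one commit is replayed, so conflicts (if any) are resolved once.
git rebase -q main

git log --oneline
```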
The typical response people have is that I shouldn't create so many garbage WIP commits, but... that's just not how my brain works. See https://news.ycombinator.com/item?id=30000320
Then the article describes a very elaborate way of… not addressing the problem?
Reset-Cleanup is a decent idea and helps keep a tidy repo which is a good goal. But the revert reason seems contrived.
The problem is going to be later code (from other pull requests) being dependent on the code in my pull request.
If my pull request is reverted then any later change can break. If my PR consists of 5 well separated commits or 3 messy ones doesn’t matter in that scenario.
If someone against all odds wants to revert one individual commit from a merged branch but not revert the whole PR by reverting the merge commit then not only can the feature in the PR stop working (if the feature/bug fix in the PR didn’t need that commit to work then why was it there in the first place!?), any later code can break just as it can when PR is reverted as a whole.
This is why I recommend squashing for almost all cases. For really complex features with dozens of commits you can always do a rebase + FF (but in that case all commits should build + pass tests, which is an unrealistic goal in all code bases where a build + test takes hours).
As far as I could tell, this was the recommended workflow from the earliest users of git. Look at the Linux kernel's public history.
When git gained widespread adoption, it was sold as "Subversion, but with cheap branching." People don't realize that this power requires discipline, otherwise you end up with complicated history that makes change management a nightmare.
In code review systems like Gerrit, you have to approve every single commit, and you can also enforce rules like "fast-forward merges only." It sure makes Github's pull-request model feel like an anti pattern.
> Treat yourself as a writer and approach each commit as a chapter in a book. Writers don't publish first drafts. Michael Crichton said, "Great books aren't written-- they're rewritten."
This is a more succinct and poignant way of describing my workflow, and why "just put a PR up with all your WIP commits" is so difficult for me. I don't want people to see how long I spent getting this one unit test to pass when the answer was staring me in the face the whole time. I don't want people to see just how badly I wrote the first draft of my code, when my approach was basically "make it work e2e first, no matter how bad of a hack it is, then actually figure out how the code should be laid out later."
Folks say "your PR can just be squashed when merging anyway", but it misses the point: My private commit history is very, very, intensely personal to me, and I don't want anybody seeing it, ever.
I use GitHub and always squash commits before merging a PR. This keeps the commit log clean and also has the side effect that you have the PR number in the merge commit.
Having said that I also suggest keeping PRs small. If you are going to reformat a code base, make that a separate commit. Updating a library, separate commit. Adding a library you need for a new feature: create a PR for the library update, base your implementation off that branch and rebase against main when the first PR is merged.
From reading a lot of feedback on this post one thing that stands out is, the best way to use git just depends on the context. But it doesn't hurt to have commands like this in your toolbox and know how to use the tool well. Plus we all have our private, icky antipatterns that we know we should improve, right?
All depends on if you want the most efficient approach, or if you want serendipity to strike.
When developing my priority is to easily get back to a last known working state. This allows me to try out risky changes, and throw them all away if it doesn't work out.
When pushing my changes for PR, those working states I saved before may not be the best logically. I usually re-write my commits to break it into more logical chunks, that are easy to revert and easier for other teammates to digest.
Previously I avoided creating messy commits simply because it was "tedious" to reorganize commits. And making overly atomic commits and typing out git commands more frequently didn't appeal to me either.
Once I got used to the above tools, life got so much easier. Lazygit makes it super easy to amend, reword, and even re-position commits in a TUI environment. Real life-changer for me. I can stage hunks super easily too, though that's even easier in neovim.
The only issue I face is my C-j/C-k keys are already bound to tmux, but are needed by lazygit to reposition commits.
[0]: https://github.com/jesseduffield/lazygit [1]: https://github.com/tpope/vim-fugitive
I'm not sure this gives the right impression. Yes rename/squash/interactive rebase if necessary to tidy, however I still believe you should strive to create clear, separate commits as you go.
If you have to regularly make major changes to history before review, it could be a sign that your process/approach is disorganised.
Of course my process is disorganized! I'm an extremely disorganized person. I like to try out wild tangents in my coding, experimenting with some crazy idea or another and saving my place before and after I do so. I even have a separate .txt file of stream-of-consciousness writing about my coding, to keep track of the tremendous amount of complexity I have to deal with on a regular basis.
My reflog reads like a personal diary of failings, dead-ends, "FINALLY COMPILES!" commits, etc etc. I use `git branch someNewBranch` really often, keeping a namespace of local branches under the name `deadend/*` to mark tombstones of approaches I tried in code but didn't work out, and I `reset --hard` back to a previous commit after I save the branch. I like it this way, it maps better to my totally disorganized brain.
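For the curious, the tombstone routine is only a couple of commands. Here is a sketch in a throwaway repo; the branch name `deadend/risky-approach` and the file contents are invented for the example.

```shell
set -eu
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "v1" > app.txt
git add app.txt && git commit -q -m "Known-good state"
good=$(git rev-parse HEAD)

# Try something wild and save it, even though it did not work out.
echo "experiment" > app.txt
git commit -q -am "Try risky approach"
dead=$(git rev-parse HEAD)

# Leave a tombstone, then go back to the last good commit.
git branch deadend/risky-approach
git reset -q --hard "$good"

git branch --list 'deadend/*'
```

The abandoned work stays reachable through the branch, so it survives even after the reflog entry expires.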
But I'm also a firm believer that none of my colleagues should be able to tell how disorganized all of this is, because by the time I make a PR, I squash it all down and write a very long, detailed commit message of what exactly I did, why I'm doing it, and how it works (sometimes, albeit rarely, spread across multiple logical commits in one PR.) They never get to see my private commit history.
Why does it matter to anyone how much I have to prune my commit history before a PR? That's like complaining to another student in class that their short-form handwriting is hard to read, in their own personal notes they're taking during class, when said student is acing all the tests. It's simply not the metric you should be judging people on.
Right. Next time, think before you do.
"Added new styles to navigation"
No duh, that's what the diff shows, but that doesn't tell us why you made the change, and that's the part we're going to struggle to remember in 6 months.
As far as the proposal goes, this is a much better way to do it than trying to preserve every mistake you made along the way. I'd prefer to see either this or just the one squashed commit. I'd just argue that maybe if it's important to break it up into multiple commits that you might consider that it should also be separate PRs. If it is necessarily so interlinked that you can't produce separate passing PRs then I probably need to wrap my brain around the whole thing at once as well.
Usually my objection is that multiple different concerns need to be done totally separately. I can't think of the last time I really wanted just separate commits and not entirely different PRs.
For example,
Most of a single core change is often spread across multiple files, so committing by groups of files would work very badly.
Either way, in the end it's up to developer discipline to make clean commits, git doesn't really help all that much although in my (long) source control history before git I never thought of committing hunks.
Because committing is useful? It saves my work. I can get to a point where my code finally compiles, make a quick commit, then do some more dangerous refactoring. If something breaks, I can `git diff` to see what changes I've made since it last compiled.
I'll turn this around and ask you: Why do you feel like you can only make a commit if you have something that's reviewable? Is your code writing style so perfect that it all works the first time? Or do you just not bother committing anything until your work is done? (That personally sounds terrifying to me... I really need to be able to say "what have I changed since 10 minutes ago when this compiled?" on a very regular basis.)
> Be mindful of not leaving your codebase in a broken state during this step
You’re still relying on very fallible human intervention here. Even worse, often times when grouping commits post-facto you’ve forgotten some of the context of what depends on what.
Ostensibly the approach presented could be better than other strategies in this regard, but that’s not how the article presents itself and it loses credibility for that.
[tig "bind"]
main = = !git commit --fixup=%(commit)
This is what the project I was just hired to work on looks like. Result is a linear history, with multi-thousand-line changes in a commit with message "Implement fixes" and a list of "fix bug", "fix more", "format", "change implementation" in the long-form commit message.
People out there do not even know how to work properly with git, I think it is way too much for them when we start telling them how to "workflow" with git. It is sad, but that's what I see.
Example from the start ... revert.
Don't revert stuff; fix stuff forward (make new commits with new changes) and you won't introduce bugs in a silly way.
If you want to follow the pattern described here, create a new branch and do `git checkout the_feature_branch -- $filename` and pull the changes from the branch you did the work in into the new branch, making commits as described in the article. You can even do `git diff --numstat $feature_branch $new_branch` to see what has changed.
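A minimal sketch of that approach in a throwaway repo. The branch names `feature` and `feature-clean` and the two files are made up for illustration.

```shell
set -eu
# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "base" > README
git add README && git commit -q -m "Initial commit"
git branch -M main

# The messy working branch.
git checkout -q -b feature
echo "nav {}" > styles.css
echo "init()" > script.js
git add styles.css script.js && git commit -q -m "wip"
echo "nav { color: red }" > styles.css
git commit -q -am "more wip"

# A fresh branch from main, filled file by file from the messy one.
git checkout -q -b feature-clean main
git checkout feature -- styles.css   # copies the file and stages it
git commit -q -m "Add navigation styles"
git checkout feature -- script.js
git commit -q -m "Add navigation script"

# No output here means the two branches now have identical content.
git diff --numstat feature feature-clean
```

This only works at file granularity, so a file that mixes two logical changes still needs `git add -p` or similar.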
> When the feature is complete, make a pull request.
gotta love 100 file commits that take a day to review.
Git should introduce stages and allow you to have any number of stages. Git should also have some porcelain around stashing all but one stage and restoring all etc etc.
the problem with your approach is that sometimes, changes are logically grouped but cross file.