Thou Shalt Not Lie: git rebase, ammend, squash, and other lies

(paul.stadig.name)

40 points · mattrepl · 15y ago · 25 comments

19 comments · 11 top-level

rlpb · 15y ago · 5 in thread

Linus thinks that it's fine to rewrite history on a private branch: http://www.mail-archive.com/dri-devel@lists.sourceforge.net/...

> but instead of committing all of your work as you have come to it naturally, you decide to break your work up into several small, "logical" commits. This makes you look good, but it's a lie.

No. Breaking your work up into several small, "logical" commits is exactly the right thing to do.

I find this very useful when doing experimental coding: when I don't really know where I'm going or if my changes will work. I end up with a pile of commits that do break tests and thus bisect etc, before ending up with something that works. Reworking this private branch is exactly the right thing to do.

This is in fact the exact model that open source development has used for years. Try submitting a patch series for inclusion in the kernel that includes a bunch of mistakes that you've later corrected (as happens naturally during development) and see what happens. You'll be asked to rework it, since the mistakes make it harder to review.

Take a look at the Linux git tree and find me a single merge commit where the branch being merged contains mistakes and corrections. You won't find one. You will find plenty of regression fixes, but these are there because by this time the commits were public and couldn't be fixed retrospectively using a rebase without rewriting public history (which would obviously cause all sorts of problems).
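A minimal sketch of that kind of private-branch cleanup, assuming the corrections were recorded with `git commit --fixup=<commit>` and that nothing on the branch has been pushed yet. `fold_fixups` is a hypothetical helper name, not a git command; setting `GIT_SEQUENCE_EDITOR=true` simply accepts the todo list that `--autosquash` generates instead of opening an editor.

```shell
# fold_fixups <base>
# Replays the commits in <base>..HEAD and folds every "fixup!" commit
# into the commit it names. This rewrites history, so it is only meant
# for commits that have not been published.
fold_fixups() {
    GIT_SEQUENCE_EDITOR=true git rebase --interactive --autosquash "$1"
}

# Example (the base ref is an assumption about your setup):
#   git commit --fixup=<commit-with-the-mistake>
#   fold_fixups origin/master
```

The result is a series in which each mistake and its correction appear as one commit, which is roughly what a reviewer of a patch series expects to see.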

stevelosh · 15y ago

> No. Breaking your work up into several small, "logical" commits is exactly the right thing to do.

The author is not arguing against making small, logical commits. He's arguing against making a ton of changes in your working directory, then running `git add file1, file2... ; git commit` a bunch of times in a row to record a series of commits.

The problem with this is exactly the one he mentioned: almost no one ever goes back and makes sure each commit actually builds and passes tests.

After they finish making a series of, say, three commits they just go ahead and push. It's only weeks or months later when someone bisects that they find out the first commit was broken without the contents of the third.

Mercurial users use MQ for this process, and to me it seems like it's a better and safer method with an uglier UI.

With MQ, once you were ready to create your three commits you would say:

`hg qnew fix-foo-bug file1 file2 ... --message 'Fix the foo bug'`

`hg qnew add-feature-bar file2 file3 ... --message 'Add feature bar'`

`hg qnew add-feature-baz file4 file5 ... --message 'Add feature baz'`

Now you've got three patches that appear (mostly) as normal changesets in your repo.

You can `hg qpop` back to the beginning of the set, run your tests, fix anything that's broken and add it into the patch. Then you would `hg qpush` the next patch and do the same thing.

Once you know that all three patches represent a working state you can `hg qfinish --all` to turn them into vanilla commits.

Yes, it's more work, but you've got three "logical" commits that actually work, instead of three "logical" commits that might hopefully work.

MQ is also great for the open source workflow you mention, where you send your patches to a mailing list (for example), get feedback, and rework them.

If someone tells you changeset X has problem Y, you can just `hg qpop` back to the patch, fix Y, refresh the patch with that change, and retest/resend.

If you want even more crazy power, you can make the directory containing your patches a Mercurial repository, which lets you track the evolution of your patches over time. It's very weird and meta, but extremely powerful.

masklinn · 15y ago

> Mercurial users use MQ for this process, and to me it seems like it's a better and safer method with an uglier UI.

MQ lets you do the exact same thing as the interactive rebase. It's not like it forces you to go back and ensure every single patch is correct after you've qfolded some together or reordered them.

> You can `hg qpop` back to the beginning of the set, run your tests, fix anything that's broken and add it into the patch. Then you would `hg qpush` the next patch and do the same thing.

And you can do the exact same thing with your git commits before pushing them. Last time I checked, MQ had `qpush -a` and did not force you to run your tests between two qpushes.

> Yes, it's more work

Indeed. And if you have no problem with that more work, you can do it just as well with git. MQ doesn't magically make people care.

> If someone tells you changeset X has problem Y, you can just `hg qpop` back to the patch, fix Y, refresh the patch with that change, and retest/resend.

No you can't. Because if it's a changeset (rather than a patch in a series) then you've already qfinished it and pushed it to a public repository, and you're now rewriting public history.

I don't like git for a number of reasons, but this is a terrible strawman: git provides all the tools needed to ensure each and every commit is correct (whereas bazaar, for instance, doesn't. Not without untold amounts of pain anyway), and I've seen a number of blag posts and comments which recommended exactly that approach: tinker on your local branch, rewrite to your heart's content, and before you push anything to remote test each commit individually. There is nothing which prevents you from doing that, just as there is nothing that forces you to do that with mercurial.
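As a rough sketch of "test each commit individually" before pushing: the helper below checks out every commit in a range in turn and runs a command against it. `check_each_commit` is a hypothetical name, not a git command, and the sketch assumes a clean working tree and a test command that leaves no tracked files modified.

```shell
# check_each_commit <base> <command> [args...]
# Runs <command> at every commit in <base>..HEAD, oldest first, and
# stops at the first commit where it fails. The original branch (or
# detached commit) is checked out again afterwards.
check_each_commit() {
    base=$1; shift
    start=$(git symbolic-ref --quiet --short HEAD || git rev-parse HEAD)
    status=0
    for rev in $(git rev-list --reverse "$base"..HEAD); do
        git checkout --quiet --detach "$rev" || { status=1; break; }
        if "$@"; then
            echo "ok   $(git log -1 --format='%h %s' "$rev")"
        else
            echo "FAIL $(git log -1 --format='%h %s' "$rev")"
            status=1
            break
        fi
    done
    git checkout --quiet "$start"
    return $status
}

# Example (base ref and test command are assumptions):
#   check_each_commit origin/master make test
```

`git rebase --exec '<command>' <base>` does something similar and stops inside the rebase at the failing commit, which is more convenient when the goal is to fix the commit on the spot.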

1 more reply

jamie_ca · 15y ago

> The author is not arguing against making small, logical commits. He's arguing against making a ton of changes in your working directory, then running `git add file1, file2... ; git commit` a bunch of times in a row to record a series of commits.

> The problem with this is exactly the one he mentioned: almost no one ever goes back and makes sure each commit actually builds and passes tests.

Right. For git, IMO the right way to do this is:

    $ git add -p # stage the first bunch of changes you want to commit
    $ git stash push --keep-index # stash your changes, leaving only the staged ones in the working tree
    $ # build your code, run tests, whatever
    $ # repeat until it builds clean
    $ git commit # record the clean (possibly fixed) commit
    $ git stash pop # get your other original changes back

Repeat as needed until all your stuff is committed. If you can't make a subset of your changes build clean, then it's not independent and should not be committed standalone.

dlsspy · 15y ago

I like to polish my commits and make them be concise, informative and correct. I wrote a tool to help with some of that: http://dustin.github.com/2010/03/28/git-test-sequence.html

It takes time to write a good change, but it takes way, way more time to figure out what the purpose of some random change was later. I've had people contribute code to a project and try to tell me they didn't understand it well enough to separate it and document it. If you don't understand it, how can I possibly understand it?

The largest, least well-documented commits are the ones that seem to cause the most confusion when I go back to figure them out.

pjstadig · 15y ago

I never said that one should not make small logical commits. I just think you should actually work that way, instead of trying to fake it after the fact.

It sounds like you are, so cheers!

gte910h · 15y ago · 2 in thread

Grow up. It's source control, not marriage. Sometimes it makes sense to make bigger commits (there are lots of reasons to sometimes present the checkin in a manner other than 1000 little incremental fixes; it's a naive techie who thinks otherwise) or fix those you have made. Get over it and relax; don't try to add morals to source control.

andybak · 15y ago

He is light-heartedly using 'morals' as a part of his schtick. No-one is expected to take it seriously.

Underneath the style there are some interesting points being made about how certain uses of certain features break other features. I'd be interested to hear your rebuttal.

gte910h · 15y ago

This is supposed to be light hearted?

What am I supposed to rebut? He gives little reason why the features he says get broken are more important than the other things. He just states it.

3 more replies

santry · 15y ago · 1 in thread

> [If you use git merge --squash some-branch] [w]hen a QA person or your boss says, "Hey, is some-feature {merged into QA, deployed}" you have to resort to `git log` spelunking.

Of course you shouldn't use merge --squash in that context. That doesn't mean you should never use merge --squash. You just shouldn't use it to rewrite public history. Do you have a local branch in which you fixed some bugs and which branch has not been pushed out to the world and would you like to use merge --squash to merge it into master? Go for it! But if that branch _has_ been pushed out to the world, just use a regular merge.
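A small sketch of why the question is easy to answer after a regular merge but not after a squash: a regular merge makes the branch tip an ancestor of the target, while `merge --squash` records a new, unrelated commit. `is_merged` is a hypothetical helper name; the branch names in the example are assumptions.

```shell
# is_merged <branch> <target>
# Succeeds if every commit on <branch> is reachable from <target>,
# which is what a regular (non-squash) merge guarantees.
is_merged() {
    git merge-base --is-ancestor "$1" "$2"
}

# Example:
#   is_merged some-feature master && echo "some-feature is in master"
```

After `git merge --squash some-feature`, this check fails even though the changes are present, because git has no record linking the squashed commit to the branch.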

The fundamental practice the author is arguing against is rewriting public history. But instead of making that point, he makes these sensationalist, dogmatic assertions that rebase, merge --squash and commit --amend are evil and should never be used. Until the end of the article, where he finally admits that really what he (correctly) has a problem with is rewriting public history.

OK, fine. Just title the article "You shouldn't rewrite public history".

pjstadig · 15y ago

I would, but that would mean rewriting public history! So, I'm kinda stuck on that one. :-(

patio11 · 15y ago

I think this depends a lot on your team, your practices, and the way you use source control. At the old job, source control (mostly SVN) was essential for figuring out why a new change to the framework broke a customer-specific customization written six years ago by someone no longer with the company. The company culture was to keep commits and merges so clean they were almost releases in themselves. (It took me, ahem, quite some time to adapt to that style of working -- coming as I did from not using source control at all.)

If they had used git, they would totally have approved use of rebase for optimizing readability/understandability of commits for the beleaguered maintenance engineer in 2016 trying to figure out why what we did today (in 37 separate incremental commits) just interacted with new code and blew up $MAJOR_UNIVERSITY's payroll system.

I follow very, very different practices when developing by myself. I largely leave the history as it is, warts and all. If in the course of trying to fix a bug I make five exploratory commits and on the fifth one finally find the magic that works, then history just happens to include five increasingly frustrated commit notes.

cschneid · 15y ago

Core disconnect: what is source control's role? Is it a journal of all history, or is it a tool to find code?

I fall on the tool side, which means I rebase & amend and keep history semantically organized, and such. All that makes it easy later to find exactly where I made XYZ change. BUT, I admit that it does ruin the journal aspect of every change. rebase -i does in fact change history, and can make things appear out of order.

I think it's ok, the author doesn't.

protomyth · 15y ago

Look, the world doesn't need to know it took me 13 attempts to get the patch right and passing all its tests. The world really doesn't need to see the childish mistake I made on attempt 5 or the cuss words I was using in the #define on attempt 8. Just take the sausage and skip the tour of the factory.

yason · 15y ago

Bah, that's bullshit. These are Git features and meant to be used, and these are part of the reason Git rocks so great.

It's not lying, it's cleaning up. History rewriting is assembling all the intermediate crap into a comprehensible patch that, as an added bonus, also comes with hindsight, or keeping your work-in-progress sensible on top of some other branch by continuous rebasing.

It's not lying, it's making sense. It's the same thing why you don't save your Emacs editing history to have someone else replay it to produce the source file. You just save the source file because you've spent time doing work to eventually produce something that makes sense. You don't want to bother other people with your errors. It's of no value to them.

However, you do not want to do this with public commits. Anything that you have pushed or someone has pulled is public. Anything that flies out of your local nest shall become immutable.
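A hedged sketch of how one might check that rule before rewriting a commit. `is_unpublished` is a hypothetical helper name, and it only knows what your remote-tracking branches knew at the last fetch or push; it cannot tell whether someone has pulled directly from your repository.

```shell
# is_unpublished <commit>
# Succeeds if no remote-tracking branch contains <commit>, i.e. as far
# as this repository can tell the commit has not left your local nest.
is_unpublished() {
    [ -z "$(git branch --remotes --contains "$1")" ]
}

# Example:
#   is_unpublished HEAD && git commit --amend
```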

msy · 15y ago

However, if you do compile and/or run the tests, it makes a lot of sense to make logical commits instead of monolithic commits of changes that are not interdependent. And `--amend`ing a commit message to fix a typo before pushing it is completely harmless.

There are some valid bits in there, but the commandment-from-on-high writing style is pretty off-putting.

wnoise · 15y ago

Fictions are useful. Source is meant to tell a story to the other developers, and only incidentally to control a computer. The history of development is just as much a part of this story. There is a balance between making it clear and making it reflect what happened in a way that's not misleading, but if each release in the new history is actually reasonably tested, I have no problem rewriting private history.

perlgeek · 15y ago

If rewriting private history is a lie, is rewriting URLs also a lie? Should we all expose ugly /app.cgi?foo=bar?action=edit URLs because we shouldn't lie? Is fiction a lie?

If all that is a "lie", I'll happily lie to outsiders to present them a nicer image, as long as no harm is done that way.

foobarbazoo · 15y ago

Thou Shalt Not Lie: a stupid article title, idiotic advice, and other lies
