Extremely Linear Git History

(westling.dev)

498 points · zegl · 3y ago · 358 comments

198 comments · 56 top-level

larschdk · 3y ago · 42 in thread

I want the 'merge' function completely deprecated. I simply don't trust it anymore.

If there are no conflicts, you might as well rebase or cherry-pick. If there is any kind of conflict, you are making code changes in the merge commit itself to resolve it. Developers end up fixing additional issues in the merge commit instead of actual commits.

If you use merge to sync two branches continuously, you completely lose track of what changes were done on the branch and which were done on the mainline.

GuB-42 · 3y ago

> I want the 'merge' function completely deprecated. I simply don't trust it anymore.

Merge is perfectly fine and it is the only way to synchronize repositories without changing the history, which is very important for a decentralized system. It certainly has the potential to make a mess if used improperly, but so do rebase, cherry-pick, and basically every other command.

> If you use merge to sync two branches continuously, you completely lose track of what changes were done on the branch and which were done on the mainline.

If you do things correctly, that is by making sure that when you merge changes from a feature branch into the mainline, the mainline is always the first parent, you shouldn't have any problem. Git is designed this way, so normally, you have to go out of your way to mess things up. If you did it like that and you don't want to see the other branch commits, git-log has the --first-parent option.
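
To make the first-parent idea concrete, here is a minimal sketch in Python over a toy commit graph. The commit names and both helper functions are made up for illustration, and this is not how Git is implemented; it only mimics the effect of `git log --first-parent`, assuming every merge records the mainline as its first parent.

```python
# Toy commit graph: each commit maps to its list of parents.
# Assumption: parents[0] is the branch that was merged *into* (the mainline).
COMMITS = {
    "m1": [],
    "m2": ["m1"],
    "f1": ["m1"],           # feature branch forks from m1
    "f2": ["f1"],
    "merge": ["m2", "f2"],  # mainline first, feature second
    "m3": ["merge"],
}


def first_parent_log(commits, head):
    """Walk from head, following only the first parent of every commit."""
    history = []
    current = head
    while current is not None:
        history.append(current)
        parents = commits[current]
        current = parents[0] if parents else None
    return history


def full_log(commits, head):
    """Collect every commit reachable from head (as an unordered set)."""
    seen, stack = set(), [head]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(commits[commit])
    return seen


print(first_parent_log(COMMITS, "m3"))  # mainline commits only
print(sorted(full_log(COMMITS, "m3")))  # also contains f1 and f2
```

If the merge had been recorded the other way round (feature first, mainline second), the same walk would follow the feature commits instead, which is the failure mode described above.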

benatkin · 3y ago

Proper use of merge is table stakes. You get warned in your PR if your non-main branch is out of date with your main branch, and after you rebase and force push your non-main branch, you review the diff in the PR.

cerved · 3y ago

way easier to mess up a rebase

simiones · 3y ago

Unfortunately, git rebase has a very very annoying limitation that git merge doesn't. If you have a branch with, say, masterX + 10 commits, and commit 1 from your branch is in conflict with masterX+1, then when you rebase your branch onto masterX+1, you will have to resolve the conflict 10 times (assuming all 10 commits happen in the same area that had the original conflict). If instead you merge masterX+1 onto your branch, you will only have to resolve the conflict once.

Even though I much prefer a linear history, losing 1h or more to the tedious work of re-resolving the same conflict over and over is not worth it, in my opinion.

broeng · 3y ago

In your example, you pretty much have to change the same line, or a neighbouring line, those 10 times to end up in that scenario. If it's just somewhere else in the file, git auto-merging will handle it just fine.

It seems like a very contrived example to me. We have been running rebase/fast-forward only for close to 10 years now, and I have never experienced anything that unfortunate.

semiquaver · 3y ago

As sibling mentioned, this is totally solved by git-rerere.

adgjlsfhk1 · 3y ago

you can often solve this by squashing before rebasing.

TheRealPomax · 3y ago

To be fair, if you have 10 commits that all change the same file: squash with respect to your first commit, _then_ rebase. If you have lots of commits, always first squash-rebase to your own first commit, and only rebase to current main once that's done.

Rebase is being annoying here mostly because it's doing exactly what you want it to do: warn you about merge conflicts for every commit in the chain that might have any.

quadhome · 3y ago

man git-rerere

acchow · 3y ago

Hmmm I think in your scenario you could avoid resolving the conflict 10 times by using `git rebase --onto`

Suppose "masterX+1" is called latest

Suppose "masterX" is the SHA of your mergebase with master (on top of which you have 10 commits)

`git rebase --onto latest masterX`

jacobsenscott · 3y ago

Yes. Usually I just squash merge to main and then `git checkout my-branch; git reset --hard main`. Sure it squashes all the commits, but keeping them all is nearly never needed.

desbo · 3y ago

https://git-scm.com/docs/git-rerere

mamcx · 3y ago

I was converted to rebase by my current team, and this hits me every time.

I wish it worked like merge. Or is there a way to merge, resolve the conflict, and then rebase?

YetAnotherNick · 3y ago

You should do a reverse rebase (if that makes sense, lol) for this. Instead of rebasing the branch onto master, rebase master onto the branch. The only downside is that it requires many force pushes to the branch.

bsza · 3y ago

> you might as well rebase or cherry-pick

Both tools are pure vandalism compared to merge. Among the two, cherry-picking is preferable in this case because you're "only" destroying your own history, so in the end, it's your funeral.

> Developers end up fixing additional issues in the merge commit instead of actual commits.

A merge commit IS an actual commit, in every sense of the word. The notion it somehow isn't, is what you need to get rid of.

cerved · 3y ago

Rebasing / rerolling is completely fine if done right, no need to be overly zealous. But merges are often more elegant

masklinn · 3y ago

I think merge is great, having a “unit” for a feature branch being integrated is nice and not all things can be done in commits which are individually justifiable. The ability to bisect cleanly through the first ancestor is precious.

I do agree that resolving conflicts in merges is risky though. It can make sense when merging just one way between permanent branches (e.g. a 1.x branch into a 2.x), but as soon as cross merges become a possibility it’s probably a mistake.

losvedir · 3y ago

> I do agree that resolving conflicts in merges is risky though.

How do you do otherwise, though? Or is your workflow a combination of rebases and merges? Continual rebasing of the feature branch onto `main` and then a final merge commit when it's ready to go?

BiteCode_dev · 3y ago

This works if you have only experienced professional developers in the team. If you have juniors or non devs (mathematicians, geographers, quants...) that just happen to also code, rebase is a minefield. This is especially true in open source contributions.

kzrdude · 3y ago

Merging and conflict resolution are a minefield too if inexperienced developers do them. Fortunately it can (often) be arranged that those with some understanding of the issues involved can do the resolution.

voakbasda · 3y ago

If you can’t rebase, I don’t want you pushing to my main branches. I would rather teach everyone how to rebase before I cave and allow merge commits.

rjmunro · 3y ago

I usually rebase the branch onto the upstream branch (master or main or whatever) if there are merge conflicts. You can then resolve the conflicts commit by commit. This requires force pushes, but they are not normally a problem because only one dev tends to work on a particular branch before it's merged.

If you do have multiple devs working on the same branch, use `git pull --rebase` to stay in sync with each other, don't use merges and leave lots of merge commits. If you need to resolve conflicts with upstream, make sure other people have stopped working on the branch, rebase it, then merge.

sidlls · 3y ago

Rebasing is a tool of last resort, when something has so fouled up the code that merging a large-scale refactor is even more time consuming.

Rebasing takes longer and is actually more prone to error because of the clunky interface. There is absolutely nothing wrong with squashing commits in a feature branch and merging that into master/main. In fact, it's generally better for the health of the repo and the mental health of developers.

cerved · 3y ago

in my experience, rebase works great if the commits are structured and much more painful with lots of overlapping changes, say by continuously doing _wip_ commits every hour

pnt12 · 3y ago

I don't understand the hate for merges, or the love for rebases. Let's consider what may happen using a github flow strategy (main branch, feature branches based solely on main):

* If you screw up a merge, you undo the merge commit. Now your branch is exactly as it was. That may not be the case with a rebase.

* If you push some code to the remote, and later find out it was outdated, you can merge it with main and push again: no need to force, github can distinguish what's already been reviewed and what hasn't. With rebase, you may need to `push --force`, and if someone already reviewed the code they're going to be shit out of luck, as github will lose the capability to review "changes since last review", as the reference it has may have been lost.

I also merge these features using squash commits, which provides a very linear history. This also saves some effort (you don't need to rebase the commits in the feature branch, which can be a pain in the ass for unorganized people and git newbies, and you are pushed towards making smaller, granular PRs that make sense for the repo history).

danwee · 3y ago

I usually do `git merge --no-ff development` when working on my feature branch. We do not leave feature branches "open/live" for too much time, so merge conflicts are not usually a problem, but sometimes they do happen.

I like cherry-pick, but I barely use it (e.g., I need to cherry-pick one commit from branch X into my branch). I don't like rebase much because it requires force-push.

cerved · 3y ago

rebase only requires a force push if you're rebasing something already pushed

u801e · 3y ago

What I would like to see is a way to enforce fast-forward only merges along with the forced creation of a merge commit that references the same git tree as the HEAD commit of the branch that was just merged.

This way, you know which set of commits was in the branch by looking at the parent commits of the merge commit, but the merge commit itself did not involve any automated conflict resolution.

breatheoften · 3y ago

I've wanted this for a while as well. Squash only merges, which are enforceable in github, get you close but leave you without any automated way to determine if a given branch was ever merged to main or not ...

wnoise · 3y ago

Yes, it is a shame that you can't combine `git merge --ff-only --no-ff`.

eurasiantiger · 3y ago

It is also a security risk. Someone could add whatever unreviewed code and it would get glanced over as a merge commit. Put your payload in an innocuous file not likely to be touched and call a boilerplate-looking function as a side effect from somewhere.

gpvos · 3y ago

> If there is any kind of conflict, you are making code changes in the merge commit itself to resolve it.

I don't get it. If you rebase, you get 20 chances to do the same.

thargor90 · 3y ago

Reviewing merge commits is harder because they will sometimes have huge diffs to both branches.

Rebasing is the process of redeveloping your feature based on the current master. This produces smaller, easier steps to review later.

It is a pity that we can't have tooling to create "hidden" merge commits to connect rebased branches; this would retain the history better and allow pulling more easily.

nextlevelwizard · 3y ago

If you have a bunch of related commits in a feature, it is easier to revert merges (even if you do a pre-merge rebase from master and then merge with --no-ff)

Chris2048 · 3y ago

I'd like a "quick ff" that will ff if there are no conflicts, or ff as far as it can with no conflicts - and an easy way to apply to many branches.

Also, a way to "rebase" that works the same as cherry picking commits on top of the target. As far as I can see, the regular rebase works its way up the target branch, so that I end up resolving conflicts in code that eventually changed in the target.

foobarbecue · 3y ago

> Developers end up fixing additional issues in the merge commit instead of actual commits.

As long as the merge commit is being reviewed with the rest of the PR, that's fine, right? (We use rebase while working on feature branches, and then squash & merge for completed PRs, which seems to be the best of both worlds)

silon42 · 3y ago

Personally, I believe merging 'master' to your feature branch is the wrong model... what one should do is create a new branch from master and merge the old branch into it.

dahart · 3y ago

Why? Merging master into the feature branch is done so that you can test the conflict resolution in the branch before inflicting it on everyone. It’s also done on a regular basis in longer running feature branches to prevent large conflicts from accumulating: you can merge master into your branch multiple times to stay current with master before ever merging back into master. I’m not sure why parent says this causes them to lose track of which changes happened in which branch. The history does get a bit more complex at a glance, but for any given commit, it’s easy to pinpoint its origin if using only merge commits. It only gets harder if you accidentally rebase someone else’s commits along the way. For smaller feature branches and smaller projects, it’s okay to merge branches into master, but for large branches, large projects, large teams, and teams that care about testing, merging master into feature branches is a best practice. What makes you consider it ‘wrong’?

phailhaus · 3y ago

A merge commit is just a commit with two parents. You're not affecting the master branch at all when you "merge in master", you're just creating a new commit where the first parent is your branch, and the second parent is the master branch.

If you do things the way you're suggesting, you'll make it really hard to tell what commits were made on your branch. Git clients tend to assume the first parent is the branch you care about.

piskerpan · 3y ago

If you’re merging (and not rebasing) it’s the same exact thing. You’re just switching the “incoming” version, but conflicts will be identical.

e40 · 3y ago

I have never had issues with merge, unless rerere was enabled. I've had some extremely surprising results recently with it enabled and I finally disabled it for good.

palata · 3y ago

What about the commit signatures? Only merge keeps them, right?

wirrbel · 3y ago · 15 in thread

I think the sweet spot in Developer productivity was when we had SVN repos and used git-svn on the client. Commits were all rebased on git level prior to pushing. If you committed something that broke unit tests your colleagues would pass you a really ugly plush animal of shame that would sit on your desk until the next coworker broke the build.

We performed code review with a projector in our office jointly looking at diffs, or emacs.

Of course it’s neat to have GitHub actions now and pull-requests for asynchronous code review. But I learned so much from my colleagues directly in that nowadays obscure working mode which I am still grateful for.

sshine · 3y ago

> If you committed something that broke unit tests your colleagues would pass you a really ugly plush animal of shame that would sit on your desk until the next coworker broke the build.

We did have an ugly plush animal, but it served more obscure purposes. For blame of broken builds, we had an info screen that counted the number of times a build had passed, and displayed below the name of the person who last broke it.

Explaining to outsiders and non-developers that "Yes, when you make a mistake in this department, we put the person's name on the wall and don't take it down until someone else makes a mistake" sounds so toxic. But it strangely enough wasn't so harsh. Of course there was some stigma that you'd want to avoid, but not to a degree of feeling prolonged shame.

jonstewart · 3y ago

I once interviewed a junior-ish developer who told me that his then-current team had a dunce cap to be worn by whoever broke the build. I copied it immediately. There was no toxicity, it was a good laugh, and as manager I wore it more than once being a bit too liberal with my commits.

On another team I was on, in 2002 using CVS, we had an upside-down solo cup as a base for a small plastic pirate flag. If you were ready to commit, you grabbed the pirate flag as a mutex on the CVS server. Of course, this turned competitive… and piratical.

I despair about long-lived git feature branches and pull requests. The pull request model is fine for open source development, but it’s been a move backwards for internal development, from having a central trunk that a team commits to several times a day. The compensating factors are git’s overall improvements (in speed and in principled approach to being a content addressable filesystem) and all of the fantastic improvements in linters and static analysis tools, and in devops pipelines.

cfuendev · 3y ago

No joke, a few weeks ago a colleague from university shared a few anecdotes about his mentor-coworker-boss at work with me, and it's similar. Every time they broke the production branch and the boss had to change the code or pull out some AWS magic to restore a database, he would give the fixed commits names like "Cgada de [Employee Name]" which roughly translates to "[Employee Name] F*ed Up", since he knew they wouldn't forget it that way.

It's especially cool given that he would always see his employees' f*k-ups as learning opportunities. He would always teach them what went wrong and how to fix it before shaming them in the git history. He always told them he did it to ensure they wouldn't forget both the shameful f*k-up + the bit of learning that came along with it. They always laugh it off and understand the boss' intentions. It isn't harsh or anything.

com2kid · 3y ago

Thankfully modern development practices should ideally run tests before committing, so the build should never be broken.

With good infra, everything from unit tests to integration to acceptance tests gets run before code hits main.

The only excuse for builds breaking nowadays is insufficient automated safeguards.

rfrey · 3y ago

It's not toxic because every single developer knows that it could be them next time around.

exabrial · 3y ago

we did something similar, but everyone knew it was a joke and we all took turns with it. I guess we didn't take ourselves as seriously

couchand · 3y ago

I have my old team's rubber chicken and I'm never giving it up.

In-person code review is the only way to do it. Pull requests optimize for the wrong part of code review, so now everyone thinks it's supposed to be a quality gate.

wirrbel · 3y ago

Yep. It makes a lot of sense for open source where gate keeping makes sense (to reduce feature bloat, back doors and an inflated API surface that needs to be maintained almost indefinitely).

Most corporate code bases are written by a smallish team operating under tight time constraints so most contributions are actually improving on the current state of the code base. Then PRs delay the integration, and lead to all kinds of follow up activities in keeping PR associated problems at bay. For example the hours wasted by my team in using stacked PRs to separate Boy Scout rule changes to the code from the feature is just abnormal.

titzer · 3y ago

Commit queues are so far superior to shaming broken builds that I think it's only nostalgia that makes you miss it.

thewebcount · 3y ago

Absolutely. In my experience, it’s only “not toxic” to a few people, and for most others it is toxic, but the people who like it won’t ever be able to see that.

tomtom1337 · 3y ago

This (plush toy and projector) has “feel good” all over it :)

unnah · 3y ago

Next step: a svn-git proxy that allows one to use a subversion client with a remote git repository.

flir · 3y ago

Two years from peak Covid, and the plushies are the object of nostalgia.

newswasboring · 3y ago

I am literally in the middle of trying to convince my group to move away from all this. Would you recommend going back to this system?

wirrbel · 3y ago

In this case I reminisced about the toolset, but the workflow is what brought the value, so I of course advise against using Subversion.

Look up trunk-based development and read the continuous integration book published by Addison-Wesley (is it the Jez Humble book or the Duvall book? I always confuse the authors; both books are great, though).

The hard part will be to convince people to explore a different working mode AND to learn that what is proposed is not an anarchist style of development but a development model that optimizes for efficiency

infogulch · 3y ago · 11 in thread

Github-style rebase-only PRs have revealed the best compromise between 'preserve history' and 'linear history' strategies:

All PRs are rebased and merged in a linear history of merge commits that reference the PR#. If you intentionally crafted a logical series of commits, merge them as a series (ideally you've tested each commit independently), otherwise squash.

If you want more detail about the development of the PR than the merge commit, aka the 'real history', then open up the PR and browse through Updates, which include commits that were force-pushed to the branch and also fast-forward commits that were appended to the branch. You also get discussion context and intermediate build statuses etc. To represent this convention within native git, maybe tag each Update with pr/123/update-N.

The funny thing about this design is that it's actually more similar to the kernel development workflow (emailing crafted patches around until they are accepted) than BOTH of the typical hard-line stances taken by most people with a strong opinion about how to maintain git history (only merge/only rebase).

couchand · 3y ago

What's weird about most of these discussions is how they're always seen as technical considerations distinct from the individuals who actually use the system.

The kernel needs a highly-distributed workflow because it's a huge organization of loosely-coupled sub-organizations. Most commercial software is developed by a relatively small group of highly-cohesive individuals. The forces that make a solution work well in one environment don't necessarily apply elsewhere.

pnt12 · 3y ago

I wholeheartedly agree!

With this, you can also push people towards smaller PRs which are easier to review and integrate.

The downside is that if you need to work on feature 2 based on feature 1, either you wait for the PR to be merged in main (easiest approach) or you fork from your feature branch directly and will need to rebase later (this can get messier, especially if you need to fix errors in feature 1).

bpugh · 3y ago

Git recently added a --update-refs option to rebase that makes dealing with this scenario a lot easier. This post does a good job explaining how to use it: https://andrewlock.net/working-with-stacked-branches-in-git-...

przemo_li · 3y ago

git-branchless has a restack command that restacks whole trees/branches of commits.

alvarlagerlof · 3y ago

Second part explains exactly what I'm juggling with right now.

palata · 3y ago

What about commit signatures? If you rebase, you lose the original signature, don't you?

GauntletWizard · 3y ago

If you let Github do the rebase, yes, you do. But you can do so manually yourself, taking the commit down to a single squashed commit, that you then sign.

This is a tooling issue that needs to be solved client-side (i.e. where the signing key lives). It's an important one but actually really simple.

smcameron · 3y ago

Did you even read the article? This article is about perversely forcing the commit hashes to come out a certain way for lulz.

latexr · 3y ago

From the guidelines[1]:

> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that".

[1]: https://news.ycombinator.com/newsguidelines.html

juped · 3y ago

But why do you "squash" it! Why do people do this?

plonk · 3y ago

Ever seen a PR that implements something in a GitHub Actions workflow? The history usually looks like: clear cache, fix path, fix variable expansion, fix command, fix command again, fix syntax, […].

The best way IMO is to interactive-rebase the branch locally (or force-push a rebased version later), but sometimes 50 commits merge into a 30-line single-file change and nothing beats squash.

blux · 3y ago · 10 in thread

I fail to see the point of this, in fact, I think this is a fundamentally flawed approach to dealing with your revision history. The problem is that rebasing commits has the potential of screwing up the integrity of your commit history.

How are you going to deal with non-trivial feature branches that need to be integrated into master? Squash them and commit? Good luck when you need to git bisect an issue. Or rebase and potentially screw up the integrity of the unit test results in the rebased branch? Both sound unappealing to me.

The problem is not a history with a lot of branches in it, it is in not knowing how to use your tools to present a view on that history you are interested in and is easy for you to understand.

boxed · 3y ago

It's a joke. The swooshing sound you heard was it going past you.

thewebcount · 3y ago

> The problem is not a history with a lot of branches in it, it is in not knowing how to use your tools to present a view on that history you are interested in and is easy for you to understand.

To me this is like saying to a construction worker: “The problem is not that your hammer has sharp spikes coming out of the handle at every angle. The problem is that you don’t put on a chain mail glove when using it.” That’s certainly one way to look at it.

ranguna · 3y ago

Pretty analogy, but I don't see how a specific functionality of git (commit history) that has no use case other than looking tidy compares to a handle of a hammer.

michaelmior · 3y ago

This somewhat depends on how big your features are. Arguably, large long-lived feature branches are the problem themselves. If larger features are broken down and developed/merged piecemeal, then you still have smaller commits you can fall back on.

IIRC, GitHub uses a development model where partially implemented features are actually deployed to production, but hidden behind feature flags.

tasuki · 3y ago

> I fail to see the point of this

I'm pretty sure the point is that this is a one-person project and the author can play around. He's not suggesting that your team of 100 people adopt this for the development of your commercial product.

lanza · 3y ago

Quite the opposite. The largest companies just about all use linear commit histories.

lanza · 3y ago

I think the fundamental misunderstanding people with your point of view have regarding linear commit histories is that it's not just about different VCS usage, the entire development process is changed.

When you are using linear histories and rebasing you don't do monolithic feature branches. You land smaller chunks and gate their functionality via some configuration variable. `if (useNewPath) { newPath(); } else { oldPath(); }` and all your new incremental features land in `newPath`. All tests pass on both code paths and nothing breaks. When the feature is fully done then you change the default configuration to move to the `newPath`.
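
A minimal sketch of that gating pattern in Python. The `Config` class, the `use_new_path` flag and both code paths are hypothetical stand-ins, not taken from any real codebase; the point is only that the new path can land in small pieces while the default stays on the old one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    # Hypothetical flag: stays False until the new path is complete.
    use_new_path: bool = False


def old_path(items):
    """Existing behaviour: total computed with an explicit loop."""
    total = 0
    for item in items:
        total += item
    return total


def new_path(items):
    """Incrementally landed replacement; must agree with old_path."""
    return sum(items)


def compute_total(items, config=Config()):
    """Dispatch on the flag so both implementations live on main."""
    if config.use_new_path:
        return new_path(items)
    return old_path(items)


# Both code paths are exercised, so flipping the default breaks nothing.
for flag in (False, True):
    assert compute_total([1, 2, 3], Config(use_new_path=flag)) == 6
```

Once the feature is finished, changing the default of `use_new_path` is a one-line commit, and the old path can be deleted in a later one.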

> How are you going to deal with non-trivial feature branches that need to be integrated into master?

That's the point -- this isn't a thing in rebase workflows. That's a feature. You don't have to deal with megapatches for massive features. It's incrementally verified along the way and bisect works flawlessly.

diekhans · 3y ago

It is amazing how much time projects seem to spend on rewriting history for the goal of displaying it in a pretty way. Leaving history intact and having better ways to display it seems far saner. Even after a merge, history in the branch may be useful for bisect, etc.

ranguna · 3y ago

Yes, a thousand times yes.

tcoff91 · 3y ago

If people knew about --first-parent everyone could stop complaining about merge commits in the history.

kinduff · 3y ago · 6 in thread

See also Lucky Commit [0], which uses various types of whitespace characters instead of a hash inside the commit, which makes it look more magical.

I wonder about performance, though. Why is the author's method slower than the package I linked?

[0]: https://github.com/not-an-aardvark/lucky-commit

zegl (OP) · 3y ago

Thanks for sharing, this is really cool! Using whitespace is a really clever trick, and running on the GPU makes it even more impressive.

I've been using githashcrash [1], but it's only running on the CPU, which is why it's a bit slower. :-)

[1]: https://github.com/Mattias-/githashcrash
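
For readers wondering what these tools are searching for, here is a rough Python sketch of the idea. It is not the algorithm used by lucky_commit or githashcrash; it only relies on the fact that a Git object id is the SHA-1 of a `<type> <size>` header, a NUL byte, and the object body, and it loosely imitates the whitespace trick by appending spaces and tabs to a made-up commit body until the id starts with a chosen prefix. The author line, timestamp and function names are assumptions for illustration.

```python
import hashlib


def git_object_id(kind, body):
    """SHA-1 Git assigns to an object: '<kind> <size>', a NUL byte, then body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def find_padding(commit_body, prefix, max_attempts=1_000_000):
    """Append spaces/tabs to the message until the id starts with prefix."""
    for attempt in range(max_attempts):
        # Encode the attempt number as a run of spaces (0) and tabs (1).
        bits = format(attempt, "b")
        padding = bits.replace("0", " ").replace("1", "\t").encode()
        candidate = commit_body + padding + b"\n"
        if git_object_id("commit", candidate).startswith(prefix):
            return candidate
    raise RuntimeError("no match within max_attempts")


# Made-up commit body for illustration; a real one may also list parents.
BODY = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1669000000 +0000\n"
    b"committer A U Thor <author@example.com> 1669000000 +0000\n"
    b"\n"
    b"Initial commit"
)

found = find_padding(BODY, "00")
print(git_object_id("commit", found))
```

Each extra hex digit in the prefix multiplies the expected number of attempts by 16, which is why a seven or eight digit prefix is far more comfortable on a GPU than in a single-threaded loop like this one.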

oneeyedpigeon · 3y ago

Using whitespace is cool, but you know what would be really cool? Using a thesaurus to reword the commit message until it matches the hash :)

zegl (OP) · 3y ago

Update: git-linearize now uses lucky_commit as its backend!

masklinn · 3y ago

Git also supports extra headers in commits. Interesting that neither went with that.

jwilk · 3y ago

What do you mean by "extra headers"?

kevincox · 3y ago

I figured that a good option would be to slightly change the date. I don't know what the date resolution is but shuffling it around by a bit shouldn't be an issue.

Of course if the date only has seconds resolution it may be too big of a shift to be reasonable.

enriquto · 3y ago · 6 in thread

I love linear git! Branches are very confusing for a nonempty set of people. For us, it is always clearer to work with explicit files in the main branch. You are implementing a new feature? Nice: just create a new file on the main branch and keep updating it until you add it to the tests, and later you call it from the main program. This system may break down on large teams, but when you are just a handful of grug-brained developers, it's perfectly appropriate.

bigDinosaur · 3y ago

This doesn't handle the reasonably common case very well where someone is working on changes which are constantly breaking the branch for everyone else. They should have their own branch and be frequently rebasing/merging so as to not disrupt others.

Also exploratory branches where any nonsense may go on (that may end up being merged, at least partially!). Also test/development vs. production branches! One may be broken, the production branch should ideally never be in a state that cannot be deployed.

That said, keep the branches limited and try to keep them 'linear' in the sense that you don't want to be merging between 100 different non-main branches in some byzantine nightmare. Perhaps encourage merges only to the development branch and then rebranching.

Filligree · 3y ago

> Also exploratory branches where any nonsense may go on (that may end up being merged, at least partially!). Also test/development vs. production branches! One may be broken, the production branch should ideally never be in a state that cannot be deployed.

Well, why don't you simply copy the code into a new directory and commit that? Then you can do whatever you want in the scratch directory.

enriquto · 3y ago

> the reasonably common case very well where someone is working on changes which are constantly breaking the branch

But isn't this bad practice? My grug brain refuses to commit anything that does not pass tests. Check tests, then commit. Check tests, then commit.

You can hide your as yet incomplete feature inside an undocumented option, and work from there, without breaking anything.

mjburgess · 3y ago

That requires your programming language to identify files with modules, and with your system architecture to be extended by modules alone.

This is an ideal case, of course.

enriquto · 3y ago

I don't understand your comment. The method that I describe only requires that the programming language ignores unused files. As far as I know, all modern programming languages have this feature.

aqme283y ago

This feels like a lot of extra work to throw away the benefits you actually get out of version control. I would very much not like to work on this team.

zeglOP3y ago· 4 in thread

I don't know how stupid this is on a scale from 1 to 10. I've created a wrapper [1] for git (called "shit", for "short git") that converts non-padded revisions to their padded counterpart.

Examples:

"shit show 14" gets converted to "git show 00000140"

"shit log 10..14" translates to "git log 00000100..00000140"

[1]: https://github.com/zegl/extremely-linear/blob/main/shit
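A rough sketch of the conversion those two examples describe, written in Python for illustration rather than copied from the wrapper itself. It assumes the padded form is a 7-digit zero-padded decimal counter followed by a fixed `0`, which is what the examples suggest; the function names `pad_revision` and `pad_arg` are made up here.

```python
def pad_revision(rev: str) -> str:
    """Turn a short decimal revision such as "14" into the prefix "00000140"."""
    # Only plain decimal numbers of up to 7 digits are treated as revisions;
    # branch names, flags and real hashes pass through untouched.
    if rev.isascii() and rev.isdigit() and len(rev) <= 7:
        return f"{int(rev):07d}0"
    return rev


def pad_arg(arg: str) -> str:
    """Pad a single revision or both ends of a two-dot range like "10..14"."""
    return "..".join(pad_revision(part) for part in arg.split(".."))


print(pad_arg("14"))      # 00000140
print(pad_arg("10..14"))  # 00000100..00000140
```

Anything the sketch does not recognise as a short number (for example `main` or `HEAD~2`) is handed to git unchanged.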

informalo3y ago

Other customers also brew-installed: fuck [1]

[1]: https://github.com/nvbn/thefuck

anderskaseorg3y ago

You may want to take a look at the monotonic commit numbering scheme that Git already has, before trying to hack one into the hashes:

https://git-scm.com/docs/git-describe

thih93y ago

Why the trailing zero? The article quotes hashes starting with "0000001", or "0000014".

Shouldn't "shit show 14" get converted to "git show 0000014"?

couchand3y ago

Thank you for addressing my one and only concern with this scheme! No notes.

chrismorgan3y ago· 4 in thread

It has been my habit for a while to make the root commit 0000000 because it’s fun, but for some reason it had not occurred to me to generalise this to subsequent commits. Tempting, very tempting. I have a couple of solo-developed-and-publicly-shared projects in mind that I will probably do this for.

jrmg3y ago

How do you make the first commit 0000000? (Without using this project, obviously).

boxed3y ago

You only need to do it once if it's the first commit and you make it empty...

robertlagrant3y ago

Might be by using that hashcrash tool.

chrismorgan3y ago

lucky_commit

gyulai3y ago· 4 in thread

Sane revision numbers are among the many reasons I prefer SVN to GIT.

jstimpfle3y ago

You could automatically tag each uploaded commit with a number drawn from a sequence - using a git post-update hook. The only problem is that this centralizes the process. It's not possible to have fully "blessed" commits without pushing them first. And that's how SVN works, too.

loeg3y ago

For local repositories, you can do it as a post-commit hook.

In the hook:

  prefix=whatever
  old=$(git rev-parse HEAD)
  new=$(brute force $prefix)
  git update-ref -m "chose prefix $prefix" --create-reflog HEAD "$new"

Of course, it's pretty silly and slow.
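The `brute force $prefix` line in the hook is a placeholder. A minimal Python sketch of what such a search could look like follows, assuming that a commit id is the SHA-1 of `commit <size>\0<body>` and that the junk data is appended to the commit message. The names `commit_id` and `brute_force` and the `nonce:` trailer are invented for this example; dedicated tools are far more optimized than this loop.

```python
import hashlib
from itertools import count


def commit_id(body: bytes) -> str:
    """Hash a raw commit body the way git names commit objects (SHA-1 repositories)."""
    header = b"commit %d\x00" % len(body)
    return hashlib.sha1(header + body).hexdigest()


def brute_force(body: bytes, prefix: str) -> tuple[bytes, str]:
    """Append an increasing nonce to the message until the id starts with prefix."""
    for nonce in count():
        candidate = body + b"\nnonce: %d\n" % nonce
        digest = commit_id(candidate)
        if digest.startswith(prefix):
            return candidate, digest


# A made-up commit on the empty tree; identity and timestamp are placeholders.
body = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1700000000 +0000\n"
    b"committer A U Thor <author@example.com> 1700000000 +0000\n"
    b"\n"
    b"one\n"
)
new_body, digest = brute_force(body, "000")
print(digest)
```

A three-digit prefix needs about 16^3 = 4096 attempts on average; every extra hex digit multiplies that by 16, which is why the hook is slow for long prefixes.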

MAGZine3y ago

... What are the other reasons?

gyulai3y ago

Basically, not to put too fine a point on it, I believe that distributed version control is a problem no one ever truly had, and no one intends to ever have in the future.

I mean: Imagine going back in time 20 years to when git, hg, and bzr were created and telling the creators of those tools: "Hey, while designing your technology, you should be aware that it'll end up being used as a worldwide centralized monorepo run by Microsoft, and no one will ever use any of that distributed stuff."

They'll either laugh you out of the room or you'll be in trouble with the Department of Temporal Investigations for polluting the time line, because what we currently understand as git sure as hell won't be the design they'll come up with.

So for me: I prefer centralized. And SVN is just a reasonable one to use.

4 more replies

unnouinceput3y ago· 4 in thread

Extremely Linear Git History...also known as SVN. Guess reinventing the wheel does get you to top HN.

mgsk3y ago

Are you allergic to fun? :(

adql3y ago

You need to do it badly and wastefully.

zeglOP3y ago

Using SVN doesn't cause your computer to heat up. ;-)

1 more reply

oneeyedpigeon3y ago

This is Hacker News. You could add this as a canonical example in the FAQ as far as I'm concerned.

oneeyedpigeon3y ago· 3 in thread

I bet I wasn't the first person who thought this would have to be done by modifying actual file content — e.g. a dummy comment or something. That would clearly have been horrible, but the fact that git bases the checksum off the commit message is... surprising and fortunate, in this case!

entropy_3y ago

It's a hash of everything that goes into a commit, including the commit message. The idea is that nothing that makes up a commit can change without changing the hash.

mjochim3y ago

> It's a hash of everything that goes into a commit, including the commit message

... and, very notably, the hash of the parent commit. That is also part of the commit, which means that changing a parent commit would also imply changing the hashes of all later commits. This is sort of the whole point of git/version control.
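To make the chaining concrete, here is a small Python sketch that builds commit bodies by hand and hashes them. It assumes the usual SHA-1 object format (`commit <size>\0` followed by the body); the identity, timestamp and the helper name `make_commit` are placeholders chosen for the example.

```python
import hashlib
from typing import Optional

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"   # id of git's empty tree
IDENT = "A U Thor <author@example.com> 1700000000 +0000"  # made-up identity and time


def make_commit(message: str, parent: Optional[str] = None) -> str:
    """Return the id a commit with this message (and optional parent) would get."""
    lines = [f"tree {EMPTY_TREE}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    lines += [f"author {IDENT}", f"committer {IDENT}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    return hashlib.sha1(b"commit %d\x00" % len(body) + body).hexdigest()


root = make_commit("one")
child = make_commit("two", parent=root)

# Rewording the root gives it a new id, so a child that is otherwise
# identical also gets a new id, because the parent id is part of its body.
reworded_root = make_commit("one, reworded")
rebuilt_child = make_commit("two", parent=reworded_root)

print(child != rebuilt_child)  # True
```

This is also why the tool has to work through history from oldest to newest: fixing up one commit invalidates the ids of everything after it.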

2 more replies

belinder3y ago

I feel like it would be better to have some dummy file in your repo that the tool modifies than mucking up your commit messages

chrismorgan3y ago· 3 in thread

The article talks about eight-character prefixes later on, but Git short refs actually use seven-character prefixes when there is no collision on that (and that’s what’s shown earlier in the article). So you can divide the search time by 16.

For me on a Ryzen 5800HS laptop, lucky_commit generally takes 11–12 seconds. I’m fine with spending that much per commit when publishing. The three minutes eight-character prefixes would require, not quite so much.
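The factor of 16 follows from each hex digit having 16 possible values: hitting a given k-digit prefix by trial takes 16^k attempts on average. A small worked check in Python, using the 12-second figure above as the only input (the function name is made up):

```python
def expected_attempts(hex_digits: int) -> int:
    """Average number of random tries needed to hit one specific hex prefix."""
    return 16 ** hex_digits


seconds_for_7 = 12  # upper end of the 11-12 s reported for seven characters
seconds_for_8 = seconds_for_7 * expected_attempts(8) // expected_attempts(7)

print(expected_attempts(7))  # 268435456
print(expected_attempts(8))  # 4294967296
print(seconds_for_8)         # 192, a little over three minutes
```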

zeglOP3y ago

I left some details out of the post to make it shorter.

What I’m actually doing is generating a 7-digit incremental number followed by a fixed 0. Some UIs show 7 characters and some show 8, so this felt like a nice compromise. Plus it’s easier to distinguish between the prefix and the suffix when looking at the full SHA when they are always separated by a 0.

avar3y ago

Git hasn't used "seven-character prefixes when there are no collisions" in a long time.

It's a combination of the "repo size" (as in, estimated number of objects) and a hard floor of seven characters.

You can see this by running "git log --oneline --abbrev=7" on any non-trivially sized repository (e.g. linux.git). There are plenty of hashes that uniquely abbreviate to 7 characters, but they're currently all shown with 12 by default.
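A sketch of how the "repo size plus hard floor" rule could be computed, in Python. This reflects my reading of the logic rather than a quote from git's source, so treat the exact rounding as an assumption; the function name and the object count used in the example are made up.

```python
def default_abbrev_length(approx_object_count: int, floor: int = 7) -> int:
    """Estimate the default abbreviation length for a repository of this size."""
    # Roughly log2 of the object count, rounded up to whole bits.
    bits = approx_object_count.bit_length()
    # With about 2^bits objects you want about 2*bits bits of hash to keep
    # accidental collisions unlikely; each hex digit carries 4 bits, so that
    # is bits/2 hex digits, rounded up.
    hex_digits = (bits + 1) // 2
    return max(hex_digits, floor)


print(default_abbrev_length(1_000))       # 7 (the floor applies)
print(default_abbrev_length(10_000_000))  # 12
```

With an object count on the order of ten million, the estimate lands on the 12 characters mentioned above, while small repositories stay at the floor of 7.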

chrismorgan3y ago

There may be some extra trigger that causes it to go beyond seven for everything, I don’t know (never worked on a repository anywhere near that large), but there’s certainly still at least some form of collision logic in there (and this is why I said what I said, because I’ve used lucky_commit enough to experience it):

  $ git init x
  Initialized empty Git repository in /tmp/x/.git/

  $ cd x

  $ git commit --allow-empty -m one
  [master (root-commit) 4144321] one

  $ git log --oneline
  4144321 (HEAD -> master) one

  $ lucky_commit

  $ git log --oneline
  0000000 (HEAD -> master) one

  $ git commit --amend --no-edit --reset-author --allow-empty
  [master 3430e13] one

  $ git log --oneline
  3430e13 (HEAD -> master) one

  $ lucky_commit

  $ git log --oneline
  0000000f (HEAD -> master) one

  $ git reflog --oneline
  0000000f (HEAD -> master) HEAD@{0}: amend with lucky_commit
  3430e13 HEAD@{1}: commit (amend): one
  00000005 HEAD@{2}: amend with lucky_commit
  4144321 HEAD@{3}: commit (initial): one

  $ git reflog expire --expire=now --all

  $ git reflog --oneline

  $ git log --oneline
  0000000f (HEAD -> master) one

  $ git gc --aggressive --prune=now
  Enumerating objects: 2, done.
  Counting objects: 100% (2/2), done.
  Writing objects: 100% (2/2), done.
  Total 2 (delta 0), reused 0 (delta 0), pack-reused 0

  $ git log --oneline
  0000000 (HEAD -> master) one

1 more reply

shawabawa33y ago· 3 in thread

Sadly if you use commit signing it's unfeasibly slow to do this :(

mkj3y ago

It shouldn't be, won't the signature be made after the hash brute force is finished?

chipsa3y ago

The signature is part of the commit hash.

imiric3y ago

You think signing is what makes this unfeasibly slow? :)

rock_artist3y ago· 3 in thread

To make sure I've got it right.

In order to get these 'beautiful' hashes, they're crunching numbers leveraging cpu power?

kzrdude3y ago

Yep. If you squint it's similar to bitcoin mining in that particular aspect

Bellyache53y ago

And vanity Tor .onion addresses

contradictioned3y ago

yep

ChrisMarshallNY3y ago· 2 in thread

I find tags to be a fairly useful way of providing a linear progression, but I guess that's no fun.

> but it can also mean to only allow merges in one direction, from feature branches into main, never the other way around. It kind of depends on the project.

That sounds like the Mainline Model, championed by Perforce[0]. It's actually fairly sensible.

[0] https://www.perforce.com/video-tutorials/vcs/mainline-model-...

actuallyalys3y ago

Yeah, I think tags are a more practical way of accomplishing this. If you’re really interested in having a linear history, it might also make sense to switch to an alternative. Mercurial has linear version numbers and can even push to Git repositories.

At risk of coming across as a humorless Hacker News commenter, I will add that I enjoyed this post. It’s a neat hack!

ChrisMarshallNY3y ago

Yes, it is a cool hack. I enjoy these, even if I can't find a practical application.

davide_v3y ago· 2 in thread

I thought I was a very tidy person, then I saw this.

Thev00d003y ago

I'm not sure it is tidy to inject random junk into your commit message to get a hash prefix

mgsk3y ago

Or maybe it is _extremely_ tidy.

1 more reply

malkia3y ago· 2 in thread

Hail p4, g4, svn and blessed be their monotonically increasing revision number!

cerved3y ago

All hail centralized version control, make life slow again!

charcircuit3y ago

Decentralized version control is slower than centralized version control since it requires downloading and working with the entire repository.

1 more reply

jasmer3y ago· 2 in thread

Wouldn't it have been better if we could use something other than SHA1 as the actual name of something?

Where in the worst dystopian parts of software do we do this?

The SHA1 is kind of a security feature if anything, a side-show thing that should be nestled 1-layer deep into the UI and probably most people are unaware of.

Whereas commits and branches should be designed specifically for the user - not 'externalized artifacts' of some acyclic graph implementation.

Git triggers a product designer's OCD so hard, it's hard for some of us not to disdain it out of spite.

jonstewart3y ago

I don’t want to make up a good name for every commit. Good comments are hard enough.

A SHA-1 might not look friendly to a dev who doesn’t understand it, but as someone who works with hash values all the time, having my repo be a Merkle tree gives me a warm fuzzy.

jasmer3y ago

You wouldn't 'make one up' there would be an automatic variation of Semantic Versioning, or something actually useful.

Your 'warm and fuzzy' comes at the cost of confusion (even to yourself), not having any clue what the information really means.

It's not even clear that it's a commit, it could be anything.

This posture is exactly what I'm complaining about: it's objectively bad design engineering, embraced as though somehow it's 'smart'.

Git has a few problems like this.

1 more reply

hoseja3y ago· 2 in thread

This is very silly.

chii3y ago

but fun. Also might as well just mine bitcoins tbh...

zeglOP3y ago

Do not try this at home.

pcthrowaway3y ago· 2 in thread

I mean.. this kind of breaks down if you have more than one person on the team

globular-toast3y ago

It just means you have to coordinate more. Or just have one person in charge of the master branch. I don't think the post is supposed to be taken so seriously, though.

Zoadian3y ago

which just means: let's waste many hours coordinating, for the benefit of having a 'nice looking' history.

2 more replies

jordigh3y ago· 1 in thread

Mercurial always has had sequential revision numbers in addition to hashes for every commit.

They aren't perfect, of course. All they indicate is in which order the current clone of the repo saw the commits. So two clones could pull the commits in different order and each clone could have different revision numbers for the same commits.

But they're still so fantastically useful. Even with their imperfections, you know that commit 500 cannot be a parent of commit 499. When looking at blame logs (annotate logs), you can be pretty sure that commit 200 happened some years before commit 40520. Plus, if your repo isn't big (and most repos on Github are not that big by number of commits), your revision numbers are smaller than even short git hashes, so they're easier to type in the CLI.

silvestrov3y ago

Seems like a design fault in git that commits only have a single id (sha1 hash) and that hashes are written without any prefix indicating which type of id it is.

If all hashes were prefixed with "h", it would have been so simple to add another (secure) hash and a serial number.

E.g. h123456 for the sha1, k6543 for sha256 and n100 for the commit number.

bloppe3y ago· 1 in thread

This is a fun idea, but it will mess with your GC heuristics.

https://git-scm.com/docs/git-gc#_configuration

Git does something called "packing" when it detects "approximately more than <X (configurable)> loose objects" in your .git/objects/ folder. The key word here is "approximately". It will guess how many total objects you have by looking in a few folders and assuming that the objects are uniformly distributed among them (these folders consist of the first 2 characters of the SHA-1 digest). If you have a bunch of commits in the .git/objects/00/ folder, as would happen here, git will drastically over- or under-approximate the total number of objects depending on whether that 00/ folder is included in the heuristic.

This isn't the end of the world, but something to consider.
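A toy Python model of that estimate, to show how skewed hashes throw it off. It assumes the heuristic counts the loose objects in one sampled directory and multiplies by 256; my understanding is that git samples `objects/17`, but take the directory and the multiplier as assumptions. The ids below are synthetic.

```python
def estimate_loose_objects(object_ids: list[str], sampled_dir: str = "17") -> int:
    """Extrapolate the total from a single two-character directory."""
    in_sample = sum(1 for oid in object_ids if oid.startswith(sampled_dir))
    return in_sample * 256


# 300 synthetic commit ids that all start with "00", as this scheme produces.
ids = [f"00{i:038x}" for i in range(300)]

print(estimate_loose_objects(ids, "17"))  # 0: the sampled directory is empty
print(estimate_loose_objects(ids, "00"))  # 76800: a wild overestimate
```

Either way the estimate is far from the real count of 300, although in a real repository the trees and blobs still spread uniformly, which softens the effect.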

chrismorgan3y ago

Could use little-endian numbers to avoid this: 0000, 1000, 2000, 3000, …, e000, f000, 0100, …
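One way to read that sequence is "write the counter in hex, then reverse the digits". A short Python sketch under that assumption (the function name and the default width of 4 are mine):

```python
def little_endian_prefix(n: int, width: int = 4) -> str:
    """Hex counter with the least significant digit first."""
    if not 0 <= n < 16 ** width:
        raise ValueError(f"{n} does not fit in {width} hex digits")
    return format(n, f"0{width}x")[::-1]


print([little_endian_prefix(n) for n in (0, 1, 2, 3, 14, 15, 16)])
# ['0000', '1000', '2000', '3000', 'e000', 'f000', '0100']

# The first 256 commits land in 256 different objects/XX directories.
print(len({little_endian_prefix(n)[:2] for n in range(256)}))  # 256
```

Because the fast-changing digits come first, consecutive commits spread across the two-character object directories instead of piling up in `00/`.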

Ayesh3y ago· 1 in thread

I wonder if Git provides a pluggable hashing mechanism as part of SHA2 migration.

I imagine stuff like this and SVN to Git mirroring to work nicely with identical hashes.

masklinn3y ago

Not currently, it’s a repo-level flag and you get one or the other.

It’ll undoubtedly be easier to further expand, but it’s nowhere near pluggable.

HextenAndy3y ago· 1 in thread

Wait until you see subversion :)

breck3y ago

Two steps forward one step back. So it goes.

nsajko3y ago· 1 in thread

Gitlab supports an option called "Fast-forward merge":

> No merge commits are created.

> Fast-forward merges only.

> When there is a merge conflict, the user is given the option to rebase.

The maintainer can enable this for a project.

spyremeown3y ago

So does almost every PR-based workflow tool (Bitbucket, GitHub, etc.). It's very common.

JoachimS3y ago· 1 in thread

A neat trick with this tool is to generate a commit message that corresponds to a given issue number. It could almost be useful.

Kudos to @zegl for this cool project.

mjochim3y ago

> It could almost be useful.

I'm still pondering the “almost” ;-).

titzer3y ago· 1 in thread

I prefer a linear version number on the main branch and I have a really tiny version file that gets incremented on every change to the src/ directory. That's not entirely automated, but a commit queue could do that.

Brute-forcing hash collisions seems like an April Fool's joke. You can't really be serious that people are going to do this regularly?

javier1234543213y ago

I don't think people actually take this project seriously

bcoughlan3y ago· 1 in thread

I wish Git had more support for "linear" revisions in the main branches. It's great for continuous delivery where you can get a unique identifier that's also human-friendly.

I emulate this by counting the number of merges on main:

git rev-list --count --first-parent HEAD

But it's not that traceable (hard to go from a rev back to a commit).
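Going from a number back to a commit is possible if you list the first-parent chain oldest-first and index into it. A Python sketch under the assumption that revision 1 is the root commit and revision N is the value printed by the count command above; the function names are made up, and the git call is kept separate so the lookup can be tried without a repository.

```python
import subprocess


def first_parent_chain(repo: str = ".") -> list[str]:
    """Commit ids along the first-parent history of HEAD, oldest first."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--first-parent", "--reverse", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def commit_for_revision(chain: list[str], rev: int) -> str:
    """Map a 1-based revision number to its commit id."""
    if not 1 <= rev <= len(chain):
        raise IndexError(f"revision {rev} is outside 1..{len(chain)}")
    return chain[rev - 1]


def revision_for_commit(chain: list[str], commit: str) -> int:
    """Map a full commit id back to its 1-based revision number."""
    return chain.index(commit) + 1
```

In a real checkout, `commit_for_revision(first_parent_chain(), 1234)` would give the commit for revision 1234. The numbering only stays stable as long as the first-parent history of the branch is never rewritten.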

tazjin3y ago

We do this at TVL, and push the corresponding revision as a ref (refs/r/$n) back to git. See for example our cgit log view: https://code.tvl.fyi/log/

This way, a correctly configured git client (which pulls those refs) can use `git checkout r/1234` to get to that revision. It's also noteworthy that this is effectively stateless, so you can reproduce the exact revisions locally with a single shell command without fetching them from the remote.

The revisions themselves are populated in CI: https://cs.tvl.fyi/depot@c537cc6fcee5f5cde4b0e6f8c5d6dcd5d8e...

chrsig3y ago· 1 in thread

Cool hacker project, learned stuff about git reading the article. I don't want to put this into practice, and don't see the utility of it.

guipsp3y ago

It's not supposed to be put into practice, and it's not supposed to be useful.

zomglings3y ago· 1 in thread

Is it possible to change the checksum implementation that git uses, through configuration or a plugin?

I find all this hash inverting quite inelegant.

yjftsjthsd-h3y ago

There's an effort to add support for sha256, but it's... not recommended https://lwn.net/Articles/898522/

u801e3y ago· 1 in thread

Why does the version skip from 19 to 20? What about 1A, 1B, 1C, 1D, 1E, and 1F?

cjbprime3y ago

It's using a neat numbering system called "decimal", you should check it out.

ccbccccbbcccbb3y ago· 1 in thread

<sarcasm> But what's the carbon footprint and contributed sea level rise of this frivolity? </sarcasm>

housel3y ago

This is a serious criticism... even if it's not likely to catch on enough to have a real effect on the sea level, it is a complete waste of energy to accomplish something that could be done much more efficiently some other way, if it is indeed worth doing at all.

Semaphor3y ago

> Full collision (entire hash is zeros, then 000...1, etc.) — `git linearize --format "%040d"` (takes ~10³³ years to run per commit)

Hah :D

otikik3y ago

This is horrible and I like it.

maxbond3y ago

Has anyone tried using git alternatives like fossil in production? Did it work out? Did you build CI/CD around it?

sagebird3y ago

“So we only have one option: testing many combinations of junk data until we can find one that passes our criteria.”

I have a somewhat related interest of trying to find sentences that have low Sha256 sums.

I made a go client that searches for low hash sentences and uploads winners to a scoreboard I put up at https://lowhash.com

I am not knowledgeable about gpu methods or crypto mining in general, I just tried to optimize a cpu based method. Someone who knows what they are doing could quickly beat out all the sentences there.

jzer0cool3y ago

In the past I had some teammates merging master into a very old branch. This gets pushed back into master with all of the past history already committed. Could someone suggest a series of commands so that only their latest updates are separately moved to the latest version of master, on the current or a new branch?

kzrdude3y ago

Clever and tempting. I would maybe like to use a smaller prefix but ensure a 0 suffix to the number too, to make it easy to read anyway. Like 00010bad, 00020fed, 00030be1, etc..

Wraparound doesn't really matter, as long as it's spaced long apart.

shadytrees3y ago

There's a memorable Stripe CTF from 2014 that had something similar (Gitcoins). This brought back fond memories of that.

conaclos3y ago

Another approach could be to use prefixes. A 0 could separate the prefix (fixed hash part) from the suffix (random part).

  0<suffix>
  10<suffix>
  20<suffix>
  ...

Combined with auto-completion, you preserve the main advantage (ordering) and you are able to quickly compute the hash.

alvis3y ago

It looks appealing from a perfectionist point of view.

But! How can I collaborate with my team when PR merges are inevitable? O:

cranium3y ago

Now, merge only the next-in-line hashes and the contributions to your repo can reach Cloud Scale™. Harness the ultimate power of distributed intelligent agents to create the future, backed by strong mathematical foundations and an ecosystem of innovative technologies. Just at your fingertips

forgotmypw173y ago

I am writing a solo project. I only use main (aka master) and never use branching. Otherwise, I inevitably screw something up. It is good enough to keep me from losing stuff most of the time, and I almost never have to struggle with understanding what the heck Git is doing.

tambourine_man3y ago

The fact that we use a hash as the main way to interact with commits shows how bad git's interface is. Sure, you should be able to easily check the sha anytime, but exposing the plumbing to end users on almost any interaction is mad. We just got used to it.

hiergiltdiestfu3y ago

wtf, back to SVN :D

I honestly expected this to be from another "really cool date" - April 1st :D

joosters3y ago

Just like perforce and its "p4 changes" command. I like the simplicity.

jbergstroem3y ago

The Webkit project would love this. Can't help but feel that half the reason they spent all the extra effort with subversion was user-friendly commit revisions.

low_tech_punk3y ago

Extremely effective way to waste electricity and emit CO2.

lloydatkinson3y ago

I am confused. Why not use git tags for versioning?

Pirate-of-SV3y ago

Very good! I use this hack every day in winter to heat my apartment (charging laptop at work, run git brute force at home).

breck3y ago

This is absolutely genius. Would be nice to upstream it and make it fast. I would start using it.

mihaaly3y ago

Sounds great for a single person project but perhaps a simpler VCS was better then?

codeulike3y ago

So this is for if you want to use Git as if it's Subversion?

_alex_3y ago

Proof of work git hashes. This is nuts and I love it.

pelasaco3y ago

if it was SHA256 we could find a use for all the bitcoin miners that we have...

johmue3y ago

this is a joke right?

larschdk3y ago· 42 in thread

I want the 'merge' function completely deprecated. I simply don't trust it anymore.

If you use merge to sync two branches continuously, you completely lose track of which changes were done on the branch and which were done on the mainline.

GuB-423y ago

> I want the 'merge' function completely deprecated. I simply don't trust it anymore.

> If you use merge to sync two branches continuously, you completely lose track of which changes were done on the branch and which were done on the mainline.

benatkin3y ago

1 more reply

cerved3y ago

way easier to mess up a rebase

1 more reply

simiones3y ago

Even though I much prefer a linear history, losing 1h or more to the tedious work of re-resolving the same conflict over and over is not worth it, in my opinion.

broeng3y ago

It seems like a very contrived example to me. We have been running rebase/fast-forward only for close to 10 years now, and I have never experienced anything that unfortunate.

5 more replies

semiquaver3y ago

As sibling mentioned, this is totally solved by git-rerere.

2 more replies

adgjlsfhk13y ago

you can often solve this by squashing before rebasing.

3 more replies

TheRealPomax3y ago

Rebase is being annoying here mostly because it's doing exactly what you want it to do: warn you about merge conflicts for every commit in the chain that might have any.

1 more reply

quadhome3y ago

man git-rerere

1 more reply

acchow3y ago

Hmmm I think in your scenario you could avoid resolving the conflict 10 times by using `git rebase --onto`

Suppose "masterX+1" is called latest

Suppose "masterX" is the SHA of your mergebase with master (on top of which you have 10 commits)

`git rebase --onto latest masterX`

jacobsenscott3y ago

Yes. Usually I just squash merge to main and then `git checkout my-branch; git reset --hard main`. Sure it squashes all the commits, but keeping them all is nearly never needed.

desbo3y ago

https://git-scm.com/docs/git-rerere

mamcx3y ago

I was converted to rebase by my current team, and this hit every time.

I wish it worked like merge, or that there was a way to merge, resolve the conflict, and then rebase?

2 more replies

YetAnotherNick3y ago

You should do a reverse rebase (if it makes sense lol) for this. Instead of rebasing the branch onto master, rebase master onto the branch. The only downside is that it requires many force pushes to the branch.

3 more replies

bsza3y ago

> you might as well rebase or cherry-pick

Both tools are pure vandalism compared to merge. Among the two, cherry-picking is preferable in this case because you're "only" destroying your own history, so in the end, it's your funeral.

> Developer end up fixing additional issues in the merge commit instead of actual commits.

A merge commit IS an actual commit, in every sense of the word. The notion it somehow isn't, is what you need to get rid of.

cerved3y ago

Rebasing / rerolling is completely fine if done right, no need to be overly zealous. But merges are often more elegant

masklinn3y ago

losvedir3y ago

> I do agree that resolving conflicts in merges is risky though.

How do you do otherwise, though? Or is your workflow a combination of rebases and merges? Continual rebasing of the feature branch onto `main` and then a final merge commit when it's ready to go?

2 more replies

BiteCode_dev3y ago

kzrdude3y ago

2 more replies

voakbasda3y ago

If you can’t rebase, I don’t want you pushing to my main branches. I would rather teach everyone how to rebase before I cave and allow merge commits.

1 more reply

rjmunro3y ago

sidlls3y ago

Rebasing is a tool of last resort, when something has so fouled up the code that merging a large-scale refactor is even more time consuming.

cerved3y ago

in my experience, rebase works great if the commits are structured and is much more painful with lots of overlapping changes, say from continuously doing _wip_ commits every hour

1 more reply

pnt123y ago

I don't understand the hate for merges, or the love for rebases. Let's consider what may happen using a github flow strategy (main branch, feature branches based solely on main):

* If you screw up a merge, you undo the merge commit. Now your branch is exactly as it was. That may not be the case with a rebase.

danwee3y ago

I like cherry-pick, but I barely use it (e.g., I need to cherry-pick one commit from branch X into my branch). I don't like rebase much because it requires force-push.

cerved3y ago

rebase only requires a force push if you're rebasing something already pushed

u801e3y ago

This way, you know which set of commits was in the branch by looking at the parent commits of the merge commit, but the merge commit itself did not involve any automated conflict resolution.

breatheoften3y ago

1 more reply

wnoise3y ago

Yes, it is a shame that you can't combine git merge --ff-only --no-ff .

1 more reply

eurasiantiger3y ago

gpvos3y ago

> If there is any kind of conflict, you are making code changes in the merge commit itself to resolve it.

I don't get it. If you rebase, you get 20 chances to do the same.

thargor903y ago

Reviewing merge commits is harder because they will sometimes have huge diffs against both branches.

Rebasing is the process of redeveloping your feature based on the current master. This gives smaller steps that are easier to review later.

It is a pity that we can't have tooling to create "hidden" merge commits that connect rebased branches; this would retain the history better and make pulling easier.

nextlevelwizard3y ago

If you have a bunch of related commits in a feature, it is easier to revert merges (even if you do a pre-merge rebase from master and then merge with --no-ff)

Chris20483y ago

I'd like a "quick ff" that will ff if there are no conflicts, or ff as far as it can with no conflicts - and an easy way to apply to many branches.

foobarbecue3y ago

> Developer end up fixing additional issues in the merge commit instead of actual commits.

silon423y ago

Personally, I believe merging 'master' to your feature branch is the wrong model... what one should do is create a new branch from master and merge the old branch into it.

dahart3y ago

phailhaus3y ago

If you do things the way you're suggesting, you'll make it really hard to tell what commits were made on your branch. Git clients tend to assume the first parent is the branch you care about.

piskerpan3y ago

If you’re merging (and not rebasing) it’s the same exact thing. You’re just switching the “incoming” version, but conflicts will be identical.

e403y ago

I have never had issues with merge, unless rerere was enabled. I've had some extremely surprising results recently with it enabled and I finally disabled it for good.

palata3y ago

What about the commit signatures? Only merge keeps them, right?

wirrbel3y ago· 15 in thread

We performed code review with a projector in our office jointly looking at diffs, or emacs.

sshine3y ago

> If you committed something that broke unit tests your colleagues would pass you a really ugly plush animal of shame that would sit on your desk until the next coworker broke the build.

jonstewart3y ago

2 more replies

cfuendev3y ago

1 more reply

com2kid3y ago

Thankfully modern development practices should ideally run tests before committing, so the build should never be broken.

With good infra, everything from unit tests to integration to acceptance tests gets run before code hits main.

The only excuse for builds breaking nowadays is insufficient automated safeguards.

1 more reply

rfrey3y ago

It's not toxic because every single developer knows that it could be them next time around.

1 more reply

exabrial3y ago

we did something similar, but everyone knew it was a joke and we all took turns with it. I guess we didn't take ourselves as seriously

couchand3y ago

I have my old team's rubber chicken and I'm never giving it up.

In-person code review is the only way to do it. Pull requests optimize for the wrong part of code review, so now everyone thinks it's supposed to be a quality gate.

wirrbel3y ago

Yep. It makes a lot of sense for open source where gate keeping makes sense (to reduce feature bloat, back doors and an inflated API surface that needs to be maintained almost indefinitely).

titzer3y ago

Commit queues are so far superior to shaming broken builds that I think it's only nostalgia that makes you miss it.

thewebcount3y ago

Absolutely. In my experience, it’s only “not toxic” to a few people, and for most others it is toxic, but the people who like it won’t ever be able to see that.

2 more replies

tomtom13373y ago

This (plush toy and projector) has “feel good” all over it :)

unnah3y ago

Next step: a svn-git proxy that allows one to use a subversion client with a remote git repository.

flir3y ago

Two years from peak Covid, and the plushies are the object of nostalgia.

newswasboring3y ago

I am literally in the middle of trying to convince my group to move away from all this. Would you recommend going back to this system?

wirrbel3y ago

In this case I reminisced about the toolset but the work flow is what brought the value so I advise of course against using subversion.

1 more reply

infogulch3y ago· 11 in thread

Github-style rebase-only PRs have revealed the best compromise between 'preserve history' and 'linear history' strategies:

couchand3y ago

What's weird about most of these discussions is how they're always seen as technical considerations distinct from the individuals who actually use the system.

pnt123y ago

I wholeheartedly agree!

With this, you can also push people towards smaller PRs which are easier to review and integrate.

bpugh3y ago

1 more reply

przemo_li3y ago

git-branchless has a restack command that restacks whole trees/branches of commits.

alvarlagerlof3y ago

Second part explains exactly what I'm juggling with right now.

palata3y ago

What about commit signatures? If you rebase, you lose the original signature, don't you?

GauntletWizard3y ago

If you let Github do the rebase, yes, you do. But you can do so manually yourself, taking the commit down to a single squashed commit, that you then sign.

This is a tooling issue that needs to be solved client-side (i.e. where the signing key lives). It's an important one but actually really simple.

2 more replies

smcameron3y ago

Did you even read the article? This article is about perversely forcing the commit hashes to come out a certain way for lulz.

latexr3y ago

From the guidelines[1]:

> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that".

[1]: https://news.ycombinator.com/newsguidelines.html

juped3y ago

But why do you "squash" it! Why do people do this?

plonk3y ago

The best way IMO is to interactive-rebase the branch locally (or force-push a rebased version later), but sometimes 50 commits merge into a 30-line single-file change and nothing beats squash.

blux3y ago· 10 in thread

The problem is not a history with a lot of branches in it, it is in not knowing how to use your tools to present a view on that history you are interested in and is easy for you to understand.

boxed3y ago

It's a joke. The swooshing sound you heard was it going past you.

thewebcount3y ago

> The problem is not a history with a lot of branches in it, it is in not knowing how to use your tools to present a view on that history you are interested in and is easy for you to understand.

ranguna3y ago

Pretty analogy, but I don't see how a specific functionality of git (commit history) that has no use case other than looking tidy compares to a handle of a hammer.

michaelmior3y ago

IIRC, GitHub uses a development model where partially implemented features are actually deployed to production, but hidden behind feature flags.

tasuki3y ago

> I fail to see the point of this

lanza3y ago

Quite the opposite. The largest companies just about all use linear commit histories.

1 more reply

lanza3y ago

> How are you going to deal with non-trivial feature branches that need to be integrated into master?

diekhans3y ago

ranguna3y ago

Yes, a thousand times yes.

tcoff913y ago

If people knew about --first-parent everyone could stop complaining about merge commits in the history.

kinduff3y ago· 6 in thread

See also Lucky Commit [0], which uses various types of whitespace characters instead of a hash inside the commit, which makes it look more magical.

I wonder about performance, though. Why is the author's method slower than the package I linked?

[0]: https://github.com/not-an-aardvark/lucky-commit

zeglOP3y ago

Thanks for sharing, this is really cool! Using whitespace is a really clever trick, and running on the GPU makes it even more impressive.

I've been using githashcrash [1], but it's only running on the CPU, which is why it's a bit slower. :-)

[1]: https://github.com/Mattias-/githashcrash

oneeyedpigeon3y ago

Using whitespace is cool, but you know what would be really cool? Using a thesaurus to reword the commit message until it matches the hash :)

3 more replies

zeglOP3y ago

Update: git-linearize now uses lucky_commit as its backend!

1 more reply

masklinn3y ago

Git also supports extra headers in commits. Interesting that neither went with that.

jwilk3y ago

What do you mean by "extra headers"?

1 more reply

kevincox3y ago

I figured that a good option would be to slightly change the date. I don't know what the date resolution is but shuffling it around by a bit shouldn't be an issue.

Of course if the date only has seconds resolution it may be too big of a shift to be reasonable.
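A rough back-of-the-envelope sketch of how big that shift would be. Git does store author and committer timestamps with one-second resolution; the seven-digit prefix is the one from the article, and the function name and the assumption that each candidate timestamp matches with probability 16^-k are illustrative only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def expected_attempts(prefix_hex_digits: int) -> int:
    """Average number of candidates to try before a hash starts with one
    specific prefix, assuming each candidate matches with probability 16**-k."""
    return 16 ** prefix_hex_digits


attempts = expected_attempts(7)  # seven-digit prefix, as in the article

# Varying only one timestamp: one candidate per second of shift.
shift_one_date_years = attempts / SECONDS_PER_YEAR

# Varying author and committer dates independently: the candidates form a
# grid, so each date only has to cover about sqrt(attempts) seconds.
shift_two_dates_hours = attempts ** 0.5 / 3600

print(f"{attempts} candidates on average")
print(f"one date: a window of about {shift_one_date_years:.1f} years")
print(f"two dates: a window of about {shift_two_dates_hours:.1f} hours each")
```

Under these assumptions a single timestamp is indeed too coarse on its own, while moving both dates keeps the window down to a few hours.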

enriquto3y ago· 6 in thread

bigDinosaur3y ago

Filligree3y ago

Well, why don't you simply copy the code into a new directory and commit that? Then you can do whatever you want in the scratch directory.

1 more reply

enriquto3y ago

> the reasonably common case very well where someone is working on changes which are constantly breaking the branch

But isn't this bad practice? My grug brain refuses to commit anything that does not pass tests. Check tests, then commit. Check tests, then commit.

You can hide your as yet incomplete feature inside an undocumented option, and work from there, without breaking anything.

3 more replies

mjburgess3y ago

That requires your programming language to identify files with modules, and your system architecture to be extended by modules alone.

This is an ideal case, of course.

enriquto3y ago

I don't understand your comment. The method that I describe only requires that the programming language ignores unused files. As far as I know, all modern programming languages have this feature.

1 more reply

aqme283y ago

This feels like a lot of extra work to throw away the benefits you actually get out of version control. I would very much not like to work on this team.

zeglOP3y ago· 4 in thread

I don't know how stupid this is on a scale from 1 to 10. I've created a wrapper [1] for git (called "shit", for "short git") that converts non-padded revisions to their padded counterpart.

Examples:

"shit show 14" gets converted to "git show 00000140"

"shit log 10..14" translates to "git log 00000100..00000140"

[1]: https://github.com/zegl/extremely-linear/blob/main/shit
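For illustration, the translation in those two examples could look roughly like this. It is a sketch of the idea only, not the linked script: `pad_revision` and `rewrite_args` are hypothetical names, and the rule (zero-pad to seven digits, then append a trailing 0) is inferred from the two examples above.

```python
import re


def pad_revision(token: str, width: int = 7) -> str:
    """Turn "14" into "00000140": zero-pad to `width` digits, then append
    the trailing 0 shown in the examples above."""
    return f"{int(token):0{width}d}0"


def rewrite_args(args: list[str]) -> list[str]:
    """Pad every purely numeric argument, including both ends of "a..b"."""
    out = []
    for arg in args:
        if re.fullmatch(r"\d+(\.\.\d+)?", arg):
            arg = "..".join(pad_revision(part) for part in arg.split(".."))
        out.append(arg)
    return out


print(["git"] + rewrite_args(["show", "14"]))
print(["git"] + rewrite_args(["log", "10..14"]))
```

A real wrapper would need more care than this: a numeric option value such as the 5 in `log -n 5` would be padded as well.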

informalo3y ago

Other customers also brew-installed: fuck [1]

[1]: https://github.com/nvbn/thefuck

anderskaseorg3y ago

You may want to take a look at the monotonic commit numbering scheme that Git already has, before trying to hack one into the hashes:

https://git-scm.com/docs/git-describe

thih93y ago

Why the trailing zero? The article quotes hashes starting with "0000001", or "0000014".

Shouldn't "shit show 14" get converted to "git show 0000014"?

couchand3y ago

Thank you for addressing my one and only concern with this scheme! No notes.

chrismorgan3y ago· 4 in thread

jrmg3y ago

How do you make the first commit 0000000? (Without using this project, obviously).

boxed3y ago

You only need to do it once if it's the first commit and you make it empty...

robertlagrant3y ago

Might be by using that hashcrash tool.

chrismorgan3y ago

lucky_commit

gyulai3y ago· 4 in thread

Sane revision numbers are among the many reasons I prefer SVN to GIT.

jstimpfle3y ago

loeg3y ago

For local repositories, you can do it as a post-commit hook.

In the hook:

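  prefix=whatever
  old=$(git rev-parse HEAD)
  # "brute force" is pseudocode: substitute any tool that prints the hash of a rewritten commit starting with $prefix
  new=$(brute force $prefix)
  git update-ref -m "chose prefix $prefix" --create-reflog HEAD "$new"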

Of course, it's pretty silly and slow.

MAGZine3y ago

... What are the other reasons?

gyulai3y ago

Basically, not to put too fine a point on it, I believe that distributed version control is a problem no one ever truly had, and no one intends to ever have in the future.

So for me: I prefer centralized. And SVN is just a reasonable one to use.

4 more replies

unnouinceput3y ago· 4 in thread

Extremely Linear Git History...also known as SVN. Guess reinventing the wheel does get you to top HN.

mgsk3y ago

Are you allergic to fun? :(

adql3y ago

You need to do it badly and wastefully.

zeglOP3y ago

Using SVN doesn't cause your computer to heat up. ;-)

1 more reply

oneeyedpigeon3y ago

This is Hacker News. You could add this as a canonical example in the FAQ as far as I'm concerned.

oneeyedpigeon3y ago· 3 in thread

entropy_3y ago

It's a hash of everything that goes into a commit, including the commit message. The idea is that nothing that makes up a commit can change without changing the hash.
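To make that concrete, here is a small sketch of how an object id is derived: SHA-1 over a short header plus the raw object body. The commit body below is made up for illustration (the author, timestamps and message are hypothetical); only the tree id is real, being Git's well-known empty tree.

```python
import hashlib


def git_object_hash(kind: str, body: bytes) -> str:
    """SHA-1 over "<kind> <size>" plus a NUL byte, followed by the raw body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical commit body: tree, author, committer, blank line, message.
commit = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1668000000 +0000\n"
    b"committer A U Thor <author@example.com> 1668000000 +0000\n"
    b"\n"
    b"one\n"
)

print(git_object_hash("commit", commit))
# Changing a single word of the message gives an unrelated hash.
print(git_object_hash("commit", commit.replace(b"one\n", b"two\n")))
```

Parent ids, dates and any signature header sit in the same body, so changing any of them changes the hash too.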

mjochim3y ago

> It's a hash of everything that goes into a commit, including the commit message

2 more replies

belinder3y ago

I feel like it would be better to have some dummy file in your repo that the tool modifies than mucking up your commit messages

chrismorgan3y ago· 3 in thread

zeglOP3y ago

I left some details out of the post to make it shorter.

avar3y ago

Git hasn't used "seven-character prefixes when there are no collisions" in a long time.

It's a combination of the "repo size" (as in, estimated number of objects) and a hard floor of seven characters.

chrismorgan3y ago

  $ git init x
  Initialized empty Git repository in /tmp/x/.git/

  $ cd x

  $ git commit --allow-empty -m one
  [master (root-commit) 4144321] one

  $ git log --oneline
  4144321 (HEAD -> master) one

  $ lucky_commit

  $ git log --oneline
  0000000 (HEAD -> master) one

  $ git commit --amend --no-edit --reset-author --allow-empty
  [master 3430e13] one

  $ git log --oneline
  3430e13 (HEAD -> master) one

  $ lucky_commit

  $ git log --oneline
  0000000f (HEAD -> master) one

  $ git reflog --oneline
  0000000f (HEAD -> master) HEAD@{0}: amend with lucky_commit
  3430e13 HEAD@{1}: commit (amend): one
  00000005 HEAD@{2}: amend with lucky_commit
  4144321 HEAD@{3}: commit (initial): one

  $ git reflog expire --expire=now --all

  $ git reflog --oneline

  $ git log --oneline
  0000000f (HEAD -> master) one

  $ git gc --aggressive --prune=now
  Enumerating objects: 2, done.
  Counting objects: 100% (2/2), done.
  Writing objects: 100% (2/2), done.
  Total 2 (delta 0), reused 0 (delta 0), pack-reused 0

  $ git log --oneline
  0000000 (HEAD -> master) one

1 more reply

shawabawa33y ago· 3 in thread

Sadly if you use commit signing it's unfeasibly slow to do this :(

mkj3y ago

It shouldn't be, won't the signature be made after the hash brute force is finished?

chipsa3y ago

The signature is part of the commit hash.

imiric3y ago

You think signing is what makes this unfeasibly slow? :)

rock_artist3y ago· 3 in thread

To make sure I've got it right.

In order to get these 'beautiful' hashes, they're crunching numbers leveraging CPU power?

kzrdude3y ago

Yep. If you squint it's similar to bitcoin mining in that particular aspect
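A minimal sketch of that number crunching, assuming a visible `nonce:` line in the commit message as the junk data. That is a simplification chosen for readability: the tools mentioned in this thread hide the junk differently (lucky_commit uses whitespace), the commit body is hypothetical, and the prefix is kept at three digits so the loop finishes quickly.

```python
import hashlib
from itertools import count


def git_commit_hash(body: bytes) -> str:
    """Commit id: SHA-1 over "commit <size>", a NUL byte, then the body."""
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def mine(base: bytes, prefix: str) -> tuple[int, str]:
    """Append "nonce: N" lines to the message until the hash has `prefix`."""
    for nonce in count():
        digest = git_commit_hash(base + f"nonce: {nonce}\n".encode())
        if digest.startswith(prefix):
            return nonce, digest


# Hypothetical commit body on top of Git's empty tree.
base = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1668000000 +0000\n"
    b"committer A U Thor <author@example.com> 1668000000 +0000\n"
    b"\n"
    b"one\n"
    b"\n"
)

nonce, digest = mine(base, "000")
print(f"nonce {nonce} gives {digest}")
```

Each extra hex digit in the prefix multiplies the average number of attempts by 16, which is where the resemblance to proof-of-work comes from.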

Bellyache53y ago

And vanity Tor .onion addresses

contradictioned3y ago

yep

ChrisMarshallNY3y ago· 2 in thread

I find tags to be a fairly useful way of providing a linear progression, but I guess that's no fun.

> but it can also mean to only allow merges in one direction, from feature branches into main, never the other way around. It kind of depends on the project.

That sounds like the Mainline Model, championed by Perforce[0]. It's actually fairly sensible.

[0] https://www.perforce.com/video-tutorials/vcs/mainline-model-...

actuallyalys3y ago

At risk of coming across as a humorless Hacker News commenter, I will add that I enjoyed this post. It’s a neat hack!

ChrisMarshallNY3y ago

Yes, it is a cool hack. I enjoy these, even if I can't find a practical application.

davide_v3y ago· 2 in thread

I thought I was a very tidy person, then I saw this.

Thev00d003y ago

I'm not sure it is tidy to inject random junk into your commit message to get a hash prefix

mgsk3y ago

Or maybe it is _extremely_ tidy.

1 more reply

malkia3y ago· 2 in thread

Hail p4, g4, svn and blessed be their monotonically increasing revision number!

cerved3y ago

All hail centralized version control, make life slow again!

charcircuit3y ago

Decentralized version control is slower than centralized version control since it requires downloading and working with the entire repository.

1 more reply

jasmer3y ago· 2 in thread

Wouldn't it have been better if we could use something other than SHA1 as the actual name of something?

Where in the worst dystopian parts of software do we do this?

The SHA1 is kind of a security feature if anything, a side-show thing that should be nestled 1-layer deep into the UI and probably most people are unaware of.

Whereas commits and branches should be designed specifically for the user - not 'externalized artifacts' of some acyclic graph implementation.

Git triggers a product designer's OCD so hard, it's hard for some of us to not disdain it for spite.

jonstewart3y ago

I don’t want to make up a good name for every commit. Good comments are hard enough.

A SHA-1 might not look friendly to a dev who doesn’t understand it, but as someone who works with hash values all the time, having my repo be a Merkle tree gives me a warm fuzzy.

jasmer3y ago

You wouldn't 'make one up' there would be an automatic variation of Semantic Versioning, or something actually useful.

Your 'warm and fuzzy' comes at the cost of confusion (even to yourself), not having any clue what the information really means.

It's not even clear that it's a commit, it could be anything.

This posture is exactly what I'm complaining about: it's objectively bad design engineering, embraced as though somehow it's 'smart'.

Git has a few problems like this.

1 more reply

hoseja3y ago· 2 in thread

This is very silly.

chii3y ago

but fun. Also might as well just mine bitcoins tbh...

zeglOP3y ago

Do not try this at home.

pcthrowaway3y ago· 2 in thread

I mean.. this kind of breaks down if you have more than one person on the team

globular-toast3y ago

It just means you have to coordinate more. Or just have one person in charge of the master branch. I don't think the post is supposed to be taken so seriously, though.

Zoadian3y ago

which just means: let's waste many hours coordinating, for the benefit of having a 'nice looking' history.

2 more replies

jordigh3y ago· 1 in thread

Mercurial always has had sequential revision numbers in addition to hashes for every commit.

silvestrov3y ago

Seems like a design fault in git that commits only have a single id (sha1 hash) and that hashes are written without any prefix indicating which type of id it is.

If all hashes were prefixed with "h", it would have been so simple to add another (secure) hash and a serial number.

E.g. h123456 for the sha1, k6543 for sha256 and n100 for the commit number.

bloppe3y ago· 1 in thread

This is a fun idea, but it will mess with your GC heuristics.

https://git-scm.com/docs/git-gc#_configuration

This isn't the end of the world, but something to consider.

chrismorgan3y ago

Could use little-endian numbers to avoid this: 0000, 1000, 2000, 3000, …, e000, f000, 0100, …
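That sequence can be generated by writing the counter in hex and reversing the digits, as in the sketch below (`little_endian_prefix` is a hypothetical name). Presumably the point is that the first two hex digits, which Git uses as the directory name for loose objects, then vary from commit to commit instead of staying at 00.

```python
def little_endian_prefix(n: int, width: int = 4) -> str:
    """Hex digits of n, least significant digit first, padded to `width`."""
    if not 0 <= n < 16 ** width:
        raise ValueError("counter does not fit in the prefix")
    return format(n, f"0{width}x")[::-1]


print([little_endian_prefix(n) for n in (0, 1, 2, 14, 15, 16)])
# → ['0000', '1000', '2000', 'e000', 'f000', '0100']
```

The price is that the prefixes no longer sort in commit order.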

Ayesh3y ago· 1 in thread

I wonder if Git provides a pluggable hashing mechanism as part of SHA2 migration.

I imagine stuff like this and SVN to Git mirroring to work nicely with identical hashes.

masklinn3y ago

Not currently, it’s a repo-level flag and you get one or the other.

It’ll undoubtedly be easier to further expand, but it’s nowhere near pluggable.

HextenAndy3y ago· 1 in thread

Wait until you see subversion :)

breck3y ago

Two steps forward one step back. So it goes.

nsajko3y ago· 1 in thread

Gitlab supports an option called "Fast-forward merge":

> No merge commits are created.

> Fast-forward merges only.

> When there is a merge conflict, the user is given the option to rebase.

The maintainer can enable this for a project.

spyremeown3y ago

So does almost every PR-based workflow tool (Bitbucket, GitHub, etc.). It's very common.

JoachimS3y ago· 1 in thread

A neat trick with this tool is to generate a commit message that corresponds to a given issue number. It could almost be useful.

Kudos to @zegl for this cool project.

mjochim3y ago

> It could almost be useful.

I'm still pondering the “almost” ;-).

titzer3y ago· 1 in thread

Brute-forcing hash collisions seems like an April Fool's joke. You can't really be serious that people are going to do this regularly?

javier1234543213y ago

I don't think people actually take this project seriously

bcoughlan3y ago· 1 in thread

I wish Git had more support for "linear" revisions in the main branches. It's great for continuous delivery where you can get a unique identifier that's also human-friendly.

I emulate this by counting the number of merges on main:

git rev-list --count --first-parent HEAD

But it's not that traceable (hard to go from a rev back to a commit).
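Going the other way is possible in principle, as long as the first-parent history of main is never rewritten. A sketch, assuming the input is the line-by-line output of `git rev-list --first-parent HEAD` (newest commit first); `commit_for_revision` and the abbreviated hashes are hypothetical.

```python
def commit_for_revision(first_parent_hashes: list[str], revision: int) -> str:
    """Map a 1-based revision number back to a commit.

    `first_parent_hashes` is newest first, so the number printed by
    `git rev-list --count --first-parent HEAD` equals its length.
    """
    total = len(first_parent_hashes)
    if not 1 <= revision <= total:
        raise ValueError(f"revision must be between 1 and {total}")
    return first_parent_hashes[total - revision]


# Hypothetical abbreviated hashes, newest first.
history = ["c3c3c3c", "b2b2b2b", "a1a1a1a"]
print(commit_for_revision(history, 1))  # oldest commit
print(commit_for_revision(history, 3))  # current HEAD
```

Walking the whole history for every lookup is the untraceable part; storing the number somewhere (a tag or a ref per revision) avoids it.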

tazjin3y ago

We do this at TVL, and push the corresponding revision as a ref (refs/r/$n) back to git. See for example our cgit log view: https://code.tvl.fyi/log/

The revisions themselves are populated in CI: https://cs.tvl.fyi/depot@c537cc6fcee5f5cde4b0e6f8c5d6dcd5d8e...

chrsig3y ago· 1 in thread

Cool hacker project, learned stuff about git reading the article. I don't want to put this into practice, and don't see the utility of it.

guipsp3y ago

It's not supposed to be put into practice, and it's not supposed to be useful.

zomglings3y ago· 1 in thread

Is it possible to change the checksum implementation that git uses, through configuration or a plugin?

I find all this hash inverting quite inelegant.

yjftsjthsd-h3y ago

There's an effort to add support for sha256, but it's... not recommended https://lwn.net/Articles/898522/

u801e3y ago· 1 in thread

Why does the version skip from 19 to 20? What about 1A, 1B, 1C, 1D, 1E, and 1F?

cjbprime3y ago

It's using a neat numbering system called "decimal", you should check it out.

ccbccccbbcccbb3y ago· 1 in thread

<sarcasm> But what's the carbon footprint and contributed sea level rise of this frivolity? </sarcasm>

housel3y ago

Semaphor3y ago

> Full collision (entire hash is zeros, then 000...1, etc.) — `git linearize --format "%040d"` (takes ~10³³ years to run per commit)

Hah :D
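For a sense of scale, a quick estimate of where a number like that comes from. The hash rates are assumptions, not measurements of any tool, and `years_to_match` is a hypothetical helper that takes the average number of attempts for one specific value to be 16^k.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_match(hex_digits: int, hashes_per_second: float) -> float:
    """Average time to hit one specific value of `hex_digits` hex digits."""
    return 16 ** hex_digits / hashes_per_second / SECONDS_PER_YEAR


for rate in (1e7, 1e9):  # assumed hash rates
    years = years_to_match(40, rate)
    print(f"{rate:.0e} hashes/s: about {years:.1e} years for all 40 digits")
```

Under this estimate the quoted ~10³³ years corresponds to a rate in the tens of millions of hashes per second.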

otikik3y ago

This is horrible and I like it.

maxbond3y ago

Has anyone tried using git alternatives like fossil in production? Did it work out? Did you build CI/CD around it?

sagebird3y ago

“So we only have one option: testing many combinations of junk data until we can find one that passes our criteria.”

I have a somewhat related interest of trying to find sentences that have low Sha256 sums.

I made a go client that searches for low hash sentences and uploads winners to a scoreboard I put up at https://lowhash.com

jzer0cool3y ago

kzrdude3y ago

Clever and tempting. I would maybe like to use a smaller prefix but ensure a 0 suffix to the number too, to make it easy to read anyway. Like 00010bad, 00020fed, 00030be1, etc..

Wraparound doesn't really matter, as long as it's spaced long apart.

shadytrees3y ago

There's a memorable Stripe CTF from 2014 that had something similar (Gitcoins). This brought back fond memories of that.

conaclos3y ago

Another approach could be to use prefixes. A 0 could separate the prefix (fixed hash part) from the suffix (random part).

  0<suffix>
  10<suffix>
  20<suffix>
  ...

Combined with auto-completion, you preserve the main advantage (ordering) and you are able to quickly compute the hash.

alvis3y ago

It looks appealing from a perfectionist point of view.

But! How can I collaborate with my team when PR merges are inevitable? O:

cranium3y ago

forgotmypw173y ago

tambourine_man3y ago

hiergiltdiestfu3y ago

wtf, back to SVN :D

I honestly expected this to be from another "really cool date" - April 1st :D

joosters3y ago

Just like perforce and its "p4 changes" command. I like the simplicity.

jbergstroem3y ago

The Webkit project would love this. Can't help but feel that half the reason they spent all the extra effort with subversion was user-friendly commit revisions.

low_tech_punk3y ago

Extremely effective way to waste electricity and emit CO2.

lloydatkinson3y ago

I am confused. Why not use git tags for versioning?

Pirate-of-SV3y ago

Very good! I use this hack every day in winter to heat my apartment (charging laptop at work, run git brute force at home).

breck3y ago

This is absolutely genius. Would be nice to upstream it and make it fast. I would start using it.

mihaaly3y ago

Sounds great for a single-person project, but perhaps a simpler VCS would be better then?

codeulike3y ago

So this is for if you want to use Git as if it's Subversion?