It's unsurprising that people's mental model of git is incorrect. Git is not something people study at a conceptual level, it's something they learn recipes for in order to work on some project. Recipes like "how do I save all this work I just did" and "oh shit, everything is hosed, please give me a magic spell I can paste into my terminal to fix it".
I don't really blame people, since git itself does nothing to teach you how it works. Git is the definition of something you have to deal with in order to do something more important to you. Some people want to dig deep and understand how the system works: it's nice to sit near that person and ask them for help sometimes.
Saying "you should really understand more about git" is like saying "you should really study the tax code, it's important and it affects you whether you like it or not." True, but deeply irrelevant!
This is not essential complexity, it's just bad design that stuck.
Take a look at https://gitless.com/
If you just look at a summary of the commands, you will have an accurate mental model of what's going on:
gl init - create an empty repo or create one from an existing remote repo
gl status - show status of the repo
gl track - start tracking changes to files
gl untrack - stop tracking changes to files
gl diff - show changes to files
gl commit - record changes in the local repo
gl checkout - checkout committed versions of files
gl history - show commit history
gl branch - list, create, edit or delete branches
gl switch - switch branches
gl tag - list, create, or delete tags
gl merge - merge the divergent changes of one branch onto another
gl fuse - fuse the divergent changes of one branch onto another
gl resolve - mark files with conflicts as resolved
gl publish - publish commits upstream
gl remote - list, create, edit or delete remotes
To me this clearly demonstrates that the problem isn't that people aren't learning git, it's that git is bad to learn. Stash + Index + Working Tree isn't the right abstraction to present to people. Just say there is a working tree, tracked and untracked files, and snapshots. Done. Branches aren't particular commits but particular working trees on top of particular commits. Working on a feature and want to look at the main branch, but not ready to commit the changes yet? Well, just switch to the main branch, then switch back and pick up where you left off. No need to know about an additional data structure called the stash.
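To make the "branches are working trees on top of commits" idea concrete, here is a toy sketch. It is not how gitless is implemented; `ToyRepo` and all of its methods are made up for illustration. It only shows the behaviour described above: uncommitted edits stay with the branch they were made on.

```python
class ToyRepo:
    """Hypothetical model: each branch keeps its own uncommitted edits."""

    def __init__(self):
        self.current = "main"
        self.committed = {"main": {}}   # branch -> files in its latest snapshot
        self.pending = {"main": {}}     # branch -> uncommitted edits

    def create_branch(self, name):
        # The new branch starts from the current branch's latest snapshot.
        self.committed[name] = dict(self.committed[self.current])
        self.pending[name] = {}

    def edit(self, path, content):
        self.pending[self.current][path] = content

    def commit(self):
        self.committed[self.current].update(self.pending[self.current])
        self.pending[self.current] = {}

    def switch(self, name):
        # Nothing is stashed or lost; pending edits simply stay with their branch.
        self.current = name

    def working_tree(self):
        return {**self.committed[self.current], **self.pending[self.current]}


repo = ToyRepo()
repo.edit("README", "hello")
repo.commit()
repo.create_branch("feature")
repo.switch("feature")
repo.edit("app.py", "work in progress")   # not committed

repo.switch("main")
on_main = repo.working_tree()             # app.py is not visible here
repo.switch("feature")
back_on_feature = repo.working_tree()     # the uncommitted edit is still there
```

In this model there is no separate stash to learn about, because "where my unfinished work lives" is part of what a branch is.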
Unfortunately this did not pick up enough steam. And because a lot of tools expose concepts from git's broken interface you have to learn the git interface anyway...
gl merge - merge the divergent changes of one branch onto another
gl fuse - fuse the divergent changes of one branch onto another
Good while it lasted though.
The list you provided sounded great until it came to gl switch. Why is there one specific operation for a branch that is NOT done via gl branch?
I don't understand what fuse is supposed to do from this at all. No idea whatsoever. Merge I get and anyone who has worked with any other versioning tool does conceptually.
Rebase most people seem to have a problem with but the abstract concept really isn't that hard. Just like cherry pick isn't really hard but somehow people have trouble with it. Though conceptually it really isn't hard either.
What really helped me the most with git was the realization that it's just a tree of commits with a bunch of labels. Labels have different types so to speak, like branch or tag, remote branches being special in a way etc. And obviously various commands can interact with these labels. Like a fetch updates the remote labels and moves them around on my local copy.
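A rough sketch of that "commits plus labels" picture, in case it helps. `Commit`, `Repo` and `fetch` here are illustrative names, not git's real internals, and the `fetch` is only a cartoon of the real thing: it copies commits over and moves the remote-tracking labels, nothing else.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """An immutable node; `parents` point backwards in history."""
    id: str
    parents: tuple = ()


@dataclass
class Repo:
    commits: dict = field(default_factory=dict)   # commit id -> Commit
    labels: dict = field(default_factory=dict)    # label name -> commit id

    def add(self, commit_id, *parents):
        self.commits[commit_id] = Commit(commit_id, tuple(parents))
        return commit_id

    def fetch(self, remote_name, remote):
        """Copy the remote's commits and move only the remote-tracking labels."""
        self.commits.update(remote.commits)
        for name, commit_id in remote.labels.items():
            if name.startswith("refs/heads/"):
                branch = name[len("refs/heads/"):]
                self.labels[f"refs/remotes/{remote_name}/{branch}"] = commit_id


# The server has one more commit than we do.
server = Repo()
server.add("a")
server.add("b", "a")
server.labels["refs/heads/main"] = "b"

local = Repo()
local.add("a")
local.labels["refs/heads/main"] = "a"
local.labels["refs/remotes/origin/main"] = "a"

local.fetch("origin", server)
print(local.labels)
```

After the fetch, the `origin/main` label points at the new commit while the local `main` label has not moved, which is the distinction that trips people up.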
This is HN criticism #94238 on the terrible git CLI.
Okay, sure.
Would you kindly post your superior git CLI? Or at least the outline of it?
---
Snark aside, Git's popularity is not an accident. Bitbucket supported Mercurial too.
The official git handbook, freely available on the official git-scm site, is not terribly long, and explains the internals on a conceptual level quite well.
I think the problem is that most people learning git land on some WordPress site where someone is flogging a condensed, uninsightful shortcut to getting started with git for ad clicks: a series of commands with no explanation of their effects. This, combined with people's expectation that an SCM should take no thought whatsoever, means most people who use git day to day don't really understand it at all.
Git needs to be introduced as a powerful data structure, kind of like how SQL is not a DB. Imagine someone explaining SQL without ever referring to the DB's tables, rows and fields... Only talking about git commits is like only talking about the result of a single query. You must understand the data structure to easily use the interface, otherwise the interface will be very confusing or you will be limited to "recipes"... After that you are just learning new variations on how to manipulate and navigate that structure (yes, the graph), and from this perspective people's complaints about the historical inconsistencies we have to put up with in git porcelain are moot.
So, I started reading through <https://git-scm.com/book/en/v2/Git-Internals-Git-Objects> again to make sure I didn't have anything wrong.
But now there's no point in writing a blog post. Maybe I'll write one that just links to <https://git-scm.com/book/en/v2/Git-Internals-Git-Objects>.
It even has nice diagrams, which I think are essential for this kind of thing.
I used to despise git because it was so hard to learn. Then as an exercise I started writing my own code to read and write its underlying files and it finally dawned on me how simple the whole thing was.
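For anyone curious what that exercise looks like, here is a minimal sketch of the loose-object format as documented in the Git Internals chapter of Pro Git: a header of the form `<type> <size>\0`, followed by the content, hashed with SHA-1 and stored zlib-compressed. The function names are my own, and this assumes a repository using the default SHA-1 object format.

```python
import hashlib
import zlib


def encode_object(kind: str, body: bytes) -> bytes:
    """Prefix the body with the header git hashes: b'<kind> <size>\\0'."""
    return f"{kind} {len(body)}".encode() + b"\0" + body


def object_id(kind: str, body: bytes) -> str:
    """SHA-1 object id (assumes the default SHA-1 object format)."""
    return hashlib.sha1(encode_object(kind, body)).hexdigest()


def loose_object_bytes(kind: str, body: bytes) -> bytes:
    """Bytes as stored under .git/objects/<first 2 hex chars>/<remaining 38>."""
    return zlib.compress(encode_object(kind, body))


def parse_loose_object(data: bytes):
    """Inverse of loose_object_bytes: returns (kind, body)."""
    raw = zlib.decompress(data)
    header, _, body = raw.partition(b"\0")
    kind, size = header.decode().split(" ")
    assert int(size) == len(body), "size in header must match the body"
    return kind, body


empty_blob_id = object_id("blob", b"")
print(empty_blob_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the empty blob
```

Trees and commits use the same envelope with a different type word and body, which is a large part of why the whole thing feels simple once you have written this much.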
Git's a very unusual piece of software; it's mind-bogglingly useful, the basic data structures and algorithms are perfectly matched to its job, and it has a UI that's a train wreck.
it's like coming into a forum for accountants where people bitch about having to learn tax code. please...
“All operations on a repository involve adding commits and/or manipulating the name resolution table.”
It may be simplified, but that statement alone, taken in context, is worth its weight in gold.
I'd say that's definitely the case but also a problem.
Sophisticated users mixed with people who just want to do a few simple things is a bad combination. I seem to remember that ClearCase had the same issues.
If you would like to know more about how to manipulate the git graph, take this excellent (and free) training:
https://learngitbranching.js.org/
To slowly level up, you can watch video demonstrations from Dan's git school. Dan provides 48 training videos of 30 minutes each:
https://www.youtube.com/watch?v=OZEGnam2M9s&list=PLu-nSsOS6F...
Being able to commit locally, examine your changes, work with them and only then push is something you might not think you need if your idea of a version control system is SVN.
But if you have learned Git or Mercurial or some other distributed system you would never go back to svn.
Once an SVN user discovers the magic of a staging area, stashes, or "git add -p" I don't know how they could claim SVN does anything better. All I remember from those days was how slow everything in SVN was. It felt like every command was backed by some horrible O(n^2) operation or really slow network connection.
git isn't hard. FFS, we shouldn't keep seeing these posts hitting HN every week. iptables? That's tough. DNS? No thanks. Managing package.json and keeping an app up-to-date? Git is nothing in comparison to the real challenges I face every day.
If SVN is wonderful for you: Great! But that's not really relevant to the issue of using git effectively.
I feel like this is mostly accurate, to my knowledge, but reading this:
> I do not claim that this way of looking at Git represents absolute “facts” in any hard and fast or literal sense. But I contend that if you conceive of Git in the way that I’m going to suggest, if you substitute these conceptions of Git for any misconceptions you might have now, you’ll be a much happier and more fluid Git user.
…vexes me.
“Think of git like a bowl of peanuts and marshmallows” and other pointless, wrong metaphors about how git works are a dime a dozen.
Yet, here is someone who is clearly quite familiar with git, and they go to pains to point out they are simplifying and may not be correct in their explanations.
It's good to be humble, but ffs, git is too frigging complicated if the best you can get is a “probably wrong simplified mental model of how it works so you can be a bit more productive with it”.
I don't care:
- a simple meaningless metaphor that lets you be more productive? OK.
- an accurate description of how things actually work? OK.
…but pick one.
What I do not want is a possibly wrong complicated explanation of how git maybe works.
I would argue most things in technology are complex, and mental models are intentional ways to take something complex and turn it into something more simple. This article does not create meaningless metaphors.
Personally speaking, I find knowing and distinguishing among the 4 indexes to be essential to understanding git. Not including and really exploring that detail gives people an incorrect mental model of what's happening.
Marvelous, if the metaphors of the article helped you, but I empathize with the upstream poster's frustration. I believe that the content of the article is not medicine for the malaise it describes.
The command line interface to git is insanely complicated, confusing, and unnecessarily difficult to use, but this isn't a result of the git data model. It's definitely possible to give a complete and accurate description of the data model, even using examples from `git cat-file` to walk through the commit history by hand.
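As a sketch of what walking history by hand amounts to: the text printed by `git cat-file -p <commit>` is a few header lines (`tree`, zero or more `parent` lines, `author`, `committer`), a blank line, then the message. The snippet below parses that shape and follows first parents. The ids and the `OBJECTS` dictionary are made-up stand-ins (real ids are 40 hex characters), the `committer` line is left out for brevity, and multi-line headers such as signatures are not handled.

```python
def parse_commit(text: str) -> dict:
    """Parse the header/message layout that `git cat-file -p <commit>` prints."""
    headers, _, message = text.partition("\n\n")
    commit = {"parents": [], "message": message.strip()}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)   # merges have more than one
        else:
            commit[key] = value
    return commit


def first_parent_history(objects: dict, start: str) -> list:
    """Follow first parents from `start` back to a root commit."""
    history, current = [], start
    while current is not None:
        commit = parse_commit(objects[current])
        history.append((current, commit["message"]))
        current = commit["parents"][0] if commit["parents"] else None
    return history


# Hypothetical object store: id -> text as cat-file would print it.
OBJECTS = {
    "c1": "tree t1\nauthor A <a@example.com> 1 +0000\n\ninitial commit\n",
    "f1": "tree t2\nparent c1\nauthor A <a@example.com> 2 +0000\n\nside work\n",
    "c2": "tree t3\nparent c1\nauthor A <a@example.com> 3 +0000\n\nmainline work\n",
    "c3": "tree t4\nparent c2\nparent f1\nauthor A <a@example.com> 4 +0000\n\nmerge side work\n",
}

history = first_parent_history(OBJECTS, "c3")
for commit_id, message in history:
    print(commit_id, message)
```

The merge commit `c3` is just a commit with two `parent` lines; nothing else about it is special.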
I've also got a simple demo that generates a complete repo with a commit. You can manipulate the resulting repo from git. There are 65 non-comment lines of code.
Here it is: https://orib.dev/ugit.py
Based on the title, I was expecting a more in-depth study of user misconceptions about git, similar to the famous CogSci paper "Two Theories of Home Heat Control." Except with like, diagrams.
And now I want someone to make that happen.
1. Commits are immutable blobs that have one or more parents. Graphs, not trees. Anyone who uses trees for git commits misses the whole point and makes their (and their collaborators') lives complicated.
2. Tags are (mostly, best practice) immutable pointers to commits. Tags are "this is this thing FOREVER*."
3. Branches are named, mutable (by design) pointers to commits. Branches are "this is this thing FOR NOW. Later it'll be something else."
4. HEAD is a special "branch" that moves around automatically.
5. Origin is the local snapshot of the remote. Origin is "what did it look like when I last looked."
6. (fundamental but not critical) Remote is the current remote state (queried by RPC).
7. Index (aka stage) is where you put changes you want to make into commits. (this is somewhat simplified). Index is "My current and immediate plan. Scrub as needed."
That's (mostly, for non-advanced use cases) it. Everything else is commands to query or manipulate that state. Every action (until it becomes instinctual knowledge) should follow the same recipe: 1. Figure out the current state (current commit graph, relevant branches). 2. Figure out the target state (desired commit graph, new branch positions). 3. Mutate using ANY command you want.
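A small sketch of the "current state, target state, any command" recipe. `Model` is a toy with invented method names; the comments name the git commands each method loosely corresponds to, but this is not a faithful simulation of them. The point is only that two different routes land on the same state.

```python
class Model:
    """Toy state: immutable commits, mutable branch labels, and HEAD."""

    def __init__(self):
        self.parents = {}      # commit id -> tuple of parent ids (never edited)
        self.branches = {}     # branch name -> commit id
        self.head = None       # name of the current branch

    def commit(self, commit_id):
        tip = self.branches.get(self.head)
        self.parents[commit_id] = (tip,) if tip else ()
        self.branches[self.head] = commit_id          # the current branch advances

    def switch(self, branch):                         # roughly `git switch <branch>`
        self.head = branch

    def reset(self, commit_id):                       # roughly `git reset --hard <commit>`
        self.branches[self.head] = commit_id

    def force_branch(self, branch, commit_id):        # roughly `git branch -f <branch> <commit>`
        self.branches[branch] = commit_id


def starting_point():
    """Current state: main -> c2, feature -> c3, HEAD on main."""
    m = Model()
    m.head = "main"
    for commit_id in ("c1", "c2"):
        m.commit(commit_id)
    m.branches["feature"] = "c2"
    m.switch("feature")
    m.commit("c3")
    m.switch("main")
    return m


# Target state: feature should point at c2 again.
route_a = starting_point()
route_a.switch("feature")
route_a.reset("c2")
route_a.switch("main")

route_b = starting_point()
route_b.force_branch("feature", "c2")

print(route_a.branches == route_b.branches)
```

Note that `c3` is still in `parents` after either route: moving a label does not delete the commit it used to point at.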
I think that's the issue really. Inexperienced devs / people who don't understand git look at commands as "this is how to do a thing". No. In Git there isn't "how to do the thing". It's exactly like writing code - so many ways to achieve the goal, just choose your own. It might be efficient and elegant, or bumbling and ugly, but it'll get there.
Heck, "a monoid in the category of endofunctors" is simpler.
[1] Off the top of my head: The working tree, the index, the stash, the repo DAG, the local remote repo DAG, the remote repo DAG. Of course the branch labels are further state, and working with the commits directly is discouraged. Oh and files can be either tracked or not, and they can either be ignored or not. And one isn't a subset of the other. And that also interacts with the various state transitions.
I can't (yet) reason about monoids easily. But I can reason about Git, even if I can't figure out the single command to change the state the way I want it and have to resort to multiple commands. I guess it's easier for me to think in graphs.
I could never understand what kind of twilight zone stashes go into or remember which stash is which when I had too many of them. So I never use stashes any more, I just make a branch instead.
I largely use git add -A, so I can pretend that the index does not exist.
3. Branches are named, mutable pointers to commits, that you can "ride". While you "ride" a branch it keeps moving to always point to your latest commit.
4. HEAD is an implicit branch that you "ride" at all times.
re "ride" - that's exactly what I'm trying to avoid. It's an additional concept that isn't needed to understand Git. You need to understand the model. The "ride" is an emergent property of the model and commands that you eventually understand, but not a core part.
Bad programmers worry about the code. Good programmers worry about data structures and their relationships.
-Linus Torvalds
Anyway, it's not for everyone to get to understand git this way, I guess. Some people will just react "just tell me how to do X in git!"
Seriously, I too find the basic concepts of git quite simple. But whenever I want to do anything slightly out of the ordinary, I find myself wasting a lot of time searching the docs. In fact, I find the naming of commands and their options almost the opposite of intuitive, given my understanding of the basic model.
Then I read "git inside out" [1] (not to be confused with "git from the bottom up", which I think is not as good), had an "aha!" moment, my view changed and everything became clear and easy. Transformation from graph to graph is something I do every day, so why not in Git?
[1] https://www.slideshare.net/MichaelNadel/git-inside-out-57904...
No, the problem is not with "how people use Git". The problem is with git. We've known for years how to make clear, concise interfaces that help people understand what's going to happen. Git does not have a clear, concise interface. That is its biggest problem and will continue to be until it is changed to have a clear, concise interface.