
Pyxl101 · 10y ago

> Everything else should be recorded in the history. Forensics are important to the long term health of your project and you impoverish yourselves by scrubbing the crime scene.

OK. So would you support an IDE that generated one individual commit per keypress? When I type "hello", that's five individual commits each changing a single letter.

That is the true history of what happened, and it's typically recorded in your IDE's undo log in memory, such that you can step back and forward through it. Are you OK with pushing this commit history to your project? That's the true history of what happened, after all, and it could be forensically useful.

When we commit code to share it with other people, we recognize that only a certain level of detail is relevant to them. The physical sequence of letters that I pressed isn't typically relevant to someone else. Rather, the logical set of changes is what affects them.

If you agree that a single commit per letter of keypress is undesirable then you agree with me in principle. There is a finite level of detail that makes sense to share, practically, with current source control systems, and what we're arguing about is how much to share.

I feel like this is important to point out because people frequently make this argument about the "true history" of source, and all that, while neglecting the fact that the commits they choose to ship are already an arbitrary simplification of the "true history".


8 comments · 4 top-level

MaulingMonkey · 10y ago · 3 in thread

> OK. So would you support an IDE that generated one individual commit per keypress? When I type "hello", that's five individual commits each changing a single letter.

Not OP but - conceptually? Yes. I've written file formats which preserve undo/redo history, for example. The caveats:

1) I don't want to inadvertently leak my password. This is an issue with things such as local bash command history buffers as well, and unreviewed autocommits in general. You could say it's too useful - to the wrong people!

2) My tooling needs to be built around a different level of granularity as the default. I'm not OK with single letter commits cluttering up my git log, for example. Having them around to drill down into if I need them? Sure.

Per-letter detail is so granular that even undo/redo systems will often squash history states together. Observing your exact typing cadence / the extra evidence of initial authorship is niche enough that I'm quite willing to sacrifice that level of detail for the sake of performance, maintainability, or basically any and every other excuse you can think of.
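To make the "squash history states together" point concrete, here is a minimal sketch of how an editor might coalesce keystrokes into undo steps. The `Keystroke` record, the `coalesce` function, and the one-second `max_gap` threshold are all assumptions made up for illustration, not how any particular editor does it.

```python
from dataclasses import dataclass


@dataclass
class Keystroke:
    """One inserted character and when it was typed (hypothetical record)."""
    char: str
    timestamp: float  # seconds since the editing session started


def coalesce(keystrokes, max_gap=1.0):
    """Group consecutive keystrokes into undo steps.

    A new step starts whenever the pause since the previous keystroke
    exceeds max_gap seconds. The exact typing cadence is discarded.
    """
    steps = []
    previous = None
    for key in keystrokes:
        if previous is None or key.timestamp - previous.timestamp > max_gap:
            steps.append("")  # start a new undo step
        steps[-1] += key.char
        previous = key
    return steps


# "hello" typed quickly, a pause, then " world" typed quickly.
typed = [Keystroke(c, t) for c, t in zip("hello", [0.0, 0.1, 0.2, 0.3, 0.4])]
typed += [Keystroke(c, t) for c, t in zip(" world", [3.0, 3.1, 3.2, 3.3, 3.4, 3.5])]
print(coalesce(typed))  # ['hello', ' world']
```

Eleven keystrokes become two undo steps, which is the trade being described: the per-letter record is gone, but the history is small enough to browse.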

> When we commit code to share it with other people, we recognize that only a certain level of detail is relevant to them. The physical sequence of letters that I pressed isn't typically relevant to someone else. Rather, the logical set of changes is what affects them.

I have never seen a codebase with perfect commits. The ones that always give me trouble wrapping my head around are the squashed "logical set of changes", where the set size is way too damn big. Reverse engineering a saner overview from a series of tiny commits is way easier.

And the physical way something was done does matter at times. I'm going to pay way more attention to a "whitespace cleanup" commit done by a human than a "whitespace cleanup" commit done by vetted tools, for example. In whitespace significant languages, the former may trigger a full code review. Similar concerns with a lot of refactorings, actually.

> If you agree that a single commit per letter of keypress is undesirable then you agree with me in principle.

Per the above, I can only agree with you in practice :)

> There is a finite level of detail that makes sense to share, practically, with current source control systems, and what we're arguing about is how much to share.

Agreed. But I haven't found a single, solitary codebase, where I'd ever argue "less". I can't even recall a single, solitary commit where I'd have ever argued "less". Commit directly to mainline to fix a single character typo? I'll be annoyed if your changelist description was too terse! I want to see:

  Fix the build:  Fix typo 'baz' -> 'bar'
  Fix documentation typo: 'baz' -> 'bar'

I've seen a codebase where a majority (read literally: >50%) of the commits were "good enough". It was beautiful.

A coworker of mine shared he'd collected stats on who made the most commits/day to test a hypothesis. Due to some outliers, he'd decided against it, but thought I'd find the stats amusing. Out of ~50 people (~15 programmers), I (the most recently hired programmer) topped the chart. Second place? The build server account (thanks to nightly build scripts.) I was indeed amused.

chris_wot · 10y ago

I'm curious, what do you think of the commit history of the last few years of LibreOffice?

https://cgit.freedesktop.org/libreoffice/core/log/

(Don't look at the OpenOffice.org years, they literally took a whole bunch of development work from SVN branches and then merged them in as a single commit and put in single line descriptions with internal tracking numbers and odd project management codes... utter disaster! And of course the branches are now all lost...)

MaulingMonkey · 10y ago

How about I just look at the latest ~33h directly on that page ;)

I like the scope of a lot of those commits, although a few are still chunkier looking than I'd like - take that with a grain of salt, though, as I don't have a good enough feel for the codebase to reasonably estimate how much more they could be chunked up. Pretty much everything has a review link, which is nice. I'd expect more back and forth in the comments, but perhaps that's handled out-of-band.

I have lots of nitpicks with the actual changelist descriptions where I'd want things to improve. "Clean up" could mean just about anything - I must go to diff (and expand the context) to understand e.g. https://cgit.freedesktop.org/libreoffice/core/commit/?id=945... properly. I'd be inclined to instead write:

officeipcthread: Cleanup RequestHandler::Enable: Early bail, remove aDummy (just use aUserInstallPath directly), move declarations.

Now I know scope, and the types of changes (refactoring worth reviewing if looking for breakage, not just ignorable whitespace / comment changes.)

No gripes with the overall style on this one, although interacting with a security component, I'd want multiple reviewers: https://cgit.freedesktop.org/libreoffice/core/commit/?id=2a9... .

I'd lean towards linking a screenshot of at least the new version of UI when the file being modified "isn't human readable" (read: is modified with something other than a text editor, even if I can totally read it) which would apply to e.g. https://cgit.freedesktop.org/libreoffice/core/commit/?id=72c... .

EDIT: Formatting, + rationale RE: changelist description.


lerpa · 10y ago

> My tooling needs to be built around a different level of granularity as the default. I'm not OK with single letter commits cluttering up my git log, for example. Having them around to drill down into if I need them? Sure.

And that would be nice to have, a vcs that allows you to fold and unfold commits to different levels of granularity.

Wanna have only a straight line history with all feature branches squashed? There you have it. Just need some of them like that? Sure enough. What if you want every time the file was saved? Not a problem, just configure it to commit automatically.
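As a rough sketch of what "fold and unfold" could mean, assume every recorded change is tagged with the coarsest boundary it closes (keystroke, save, or commit). The `Level` enum, the `Change` record, and `fold` below are hypothetical names invented for this example; no existing VCS is being described.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Granularity boundaries, finest to coarsest (hypothetical)."""
    KEYSTROKE = 0
    SAVE = 1
    COMMIT = 2


@dataclass
class Change:
    text: str     # content added by this change
    level: Level  # coarsest boundary this change closes


def fold(changes, level):
    """Merge changes so an entry ends only at boundaries >= level."""
    folded, pending = [], ""
    for change in changes:
        pending += change.text
        if change.level >= level:
            folded.append(pending)
            pending = ""
    if pending:
        folded.append(pending)  # trailing work not yet at a boundary
    return folded


history = [
    Change("a", Level.KEYSTROKE),
    Change("b", Level.SAVE),
    Change("c", Level.KEYSTROKE),
    Change("d", Level.COMMIT),
]
print(fold(history, Level.KEYSTROKE))  # ['a', 'b', 'c', 'd']
print(fold(history, Level.SAVE))       # ['ab', 'cd']
print(fold(history, Level.COMMIT))     # ['abcd']
```

The same stored history yields all three views, so the choice of granularity becomes a display setting rather than something decided once at commit time.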

phire · 10y ago · 1 in thread

I actually implemented an eclipse plugin which recorded every keypress (among other things) for my 3rd year project at uni.

chris_wot · 10y ago

That's pretty interesting... did you ever publish this code?

jasonkester · 10y ago

> So would you support an IDE that generated one individual commit per keypress?

That would actually be awesome, if the tools supported it. Imagine how easy it would be to find bugs with a bisect if it can drill down to the actual keystroke that introduced a bug.

Realistically, you'd want to back out a few notches though. Say, every time the dev hit Shift+Ctrl+B (or tabbed over to the browser or whatever signifies "build the project" in the environment in question), so that you get an indication that the current state of things was meant to at least compile and run.
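For illustration, a bisect over fine-grained snapshots is just a binary search. The sketch below assumes the first snapshot is good, the last is bad, and the bug never disappears once introduced; the `bisect_first_bad` function, the snapshot strings, and the "value above 100" test are all made up for the example. In practice the list would presumably hold only the build checkpoints mentioned above, so every state tested can at least compile.

```python
def bisect_first_bad(snapshots, is_bad):
    """Return the index of the first bad snapshot.

    Assumes snapshots[0] is good, snapshots[-1] is bad, and that the bug
    persists once it appears, so good and bad states do not interleave.
    """
    good, bad = 0, len(snapshots) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(snapshots[mid]):
            bad = mid
        else:
            good = mid
    return bad


# Hypothetical per-keystroke snapshots of one line being edited.
snapshots = ["x = 1", "x = 10", "x = 100", "x = 1000", "x = 10000"]

# Hypothetical test: the bug is any value above 100.
first_bad = bisect_first_bad(snapshots, lambda s: int(s.split("=")[1]) > 100)
print(first_bad, snapshots[first_bad])  # 3 x = 1000
```

With n snapshots this needs about log2(n) test runs, which is why even a very long keystroke-level history would stay cheap to search.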

But yeah, that's the value of source control. Being able to dig back in history to the exact spot where a bug was born. I can't understand at all why so many people here would want to scrub that away.

hobs · 10y ago

I am told that Google's filesystem for their dev workspace is somewhat similar to capturing every single keypress as an immutable layer forever.

Apparently this introduces interesting problems when people create a log file one byte at a time, up to several hundred megs. (just 100000000 files for one little project, oops)
