The comparison between jiff, chrono, time and hifitime is just as good of a read in my opinion: https://github.com/BurntSushi/jiff/blob/HEAD/COMPARE.md
(And they have also written interesting things on regex, non-regex string matching, etc.)
There was this post from Cursor today, https://cursor.com/blog/fast-regex-search, about building an index for agents because they were hitting a limit with ripgrep, but I’m not sure what codebase they are hitting that warrants it. Especially since they would have to be at 100-200 GB to get to 15s of runtime. Unless it’s all matches, that is.
On a mid-size codebase, I fzf- and rg-ed through the code almost instantly, while watching my coworker's computer slow to a crawl when PyCharm started reindexing the project.
And I was dead wrong. Overnight everyone uses rg (me included).
It’s fast even on a 300 MHz Octane.
There's also RGA (ripgrep-all) which searches binary files like PDFs, ebooks, doc files: https://github.com/phiresky/ripgrep-all
Eventually I was considering rebuilding the machine completely but for some reason after a very long time digging deep into the rabbit hole I tried plain old grep and there was the data exactly where it should have been.
So it's such a vague story but it was a while back - I don't remember the specifics but I sure recall the panic.
If it actually matched grep's contract with opt-in differences that'd be a gamechanger and actually let it become the default for people, but that ship seems to have sailed.
rg : Searches git tracked files
rg -u : Includes .gitignored files
rg -uu : Includes .gitignored + hidden files
rg -uuu : Includes .gitignored + hidden + binary files
Sometimes I forget that some of the config files I have for CI in a project are under a dot directory, and therefore ignored by rg by default, so I have to repeat the search giving the path to that config file's subdirectory if I want to see the results that are under that one (or use some extra flags for rg to not ignore dot directories other than .git).
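To make the flag levels above concrete, here is a toy model in Python of which files get searched at each -u level. It is only a sketch of the behaviour as described in this comment, not ripgrep's actual implementation; the `FileInfo` fields, the `is_searched` helper and the example paths are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    gitignored: bool = False  # matched by a .gitignore rule
    hidden: bool = False      # file or directory name starts with a dot
    binary: bool = False      # content looks binary


def is_searched(f: FileInfo, unrestricted: int = 0) -> bool:
    """Toy model: each extra -u lifts one more filter."""
    if f.gitignored and unrestricted < 1:
        return False
    if f.hidden and unrestricted < 2:
        return False
    if f.binary and unrestricted < 3:
        return False
    return True


# Hypothetical project layout, one file per filter category.
files = [
    FileInfo("src/main.py"),
    FileInfo("build/out.log", gitignored=True),
    FileInfo(".ci/config.yml", hidden=True),
    FileInfo("assets/logo.png", binary=True),
]

for level in range(4):
    flag = "rg" if level == 0 else "rg -" + "u" * level
    print(flag, [f.path for f in files if is_searched(f, level)])
```

In this model the CI config under `.ci/` only shows up from -uu onward, which is the surprise described above. If hidden files are the only thing you are missing, rg's `--hidden` flag is the narrower option.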
I think ripgrep will not search UTF-16 files by default. I had some such issue once at least.
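For what it's worth, here is a small Python sketch of why a plain byte-level search can miss text in a UTF-16 file, and how a byte-order mark lets a tool pick the right decoder. It does not settle what ripgrep itself does by default; `sniff_encoding` and `contains` are hypothetical helpers, and the sample content is invented.

```python
import codecs


def sniff_encoding(data: bytes) -> str:
    """Guess an encoding from a leading byte-order mark, else assume UTF-8."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"  # Python's utf-16 codec reads and strips the BOM itself
    return "utf-8"


def contains(data: bytes, needle: str) -> bool:
    """Decode according to the sniffed encoding, then search as text."""
    text = data.decode(sniff_encoding(data), errors="replace")
    return needle in text


# Invented sample: a UTF-16-LE "file" with a BOM.
utf16_file = codecs.BOM_UTF16_LE + "error at 22:02".encode("utf-16-le")

# Naive byte search: in UTF-16-LE every ASCII character is followed by a
# NUL byte, so the ASCII bytes of the needle never appear contiguously.
print(b"22:02" in utf16_file)         # False
print(contains(utf16_file, "22:02"))  # True
```

A file without a BOM gives this sketch nothing to go on, which is one way a search tool can end up silently finding nothing.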
I ran into that with pt, and it definitely made me think I was going mad[0]. I can't fully remember if rg suffered from the same issue or not.
[0] https://github.com/monochromegane/the_platinum_searcher/issu...
I suspect, in general, age has a fair amount to do with it (I certainly notice it in myself) but either way I think it's worth evaluating new things every so often.
Something like rg in specific can be really tricky to evaluate because it does basically the same thing as the builtin grep, but sometimes just being faster crosses a threshold where you can use it in ways you couldn't previously.
E.g. some kind of find-as-you-type system: if it took 1s per letter it would be genuinely unusable, but 50ms might take it over the edge so now it's an option. Stuff like that.
https://hwisnu.bearblog.dev/building-cgrep-using-safe_ch-cus...
It seems this was possible because ripgrep is inefficient in CPU usage when it runs multithreaded, using about 2x more CPU time than GNU grep.
https://hwisnu.bearblog.dev/levelized-cost-of-resources-in-b...
The ".ignore" name was actually suggested by the author of ag (whereas the author of rg thought it was too generic): https://news.ycombinator.com/item?id=12568245
It's nice and everything, but I remember being happy with the tools before (I think I moved from grep to ack, then jumped to ag for perf, and for reasons I no longer remember to pt).
It took me a while, but I remembered I ran into an issue with pt incorrectly guessing the encoding of some files[0].
I can't remember whether rg suffered from the same issue or not, but I do know after switching to rg everything was plain sailing and I've been happy with it since.
[0] https://github.com/monochromegane/the_platinum_searcher/issu...
TIL: rg uses Rust's regex library (incompatible with PCRE, incompatible with RE2)
With 240 log files in various subfolders.
grep -q -r "22:02" --include="*.log" 4.15s user 0.09s system 99% cpu 4.269 total
grep -q -r "22:02" --include="*.log" 4.18s user 0.09s system 99% cpu 4.265 total
grep -q -r "22:02" --include="*.log" 4.31s user 0.09s system 99% cpu 4.401 total
rg -q "22:02" -t log 0.01s user 0.01s system 83% cpu 0.018 total
rg -q "22:02" -t log 0.01s user 0.01s system 93% cpu 0.017 total
rg -q "22:02" -t log 0.01s user 0.01s system 95% cpu 0.018 total
I really did not expect it to be that fast.
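To put a number on it, here is the arithmetic on the wall-clock totals from the runs above, as a short Python sketch. Three runs each is a small sample, and with -q both tools stop at the first match, so treat the ratio as a rough indication for this one setup rather than a general benchmark.

```python
from statistics import mean

# Wall-clock "total" times in seconds, copied from the runs above.
grep_totals = [4.269, 4.265, 4.401]
rg_totals = [0.018, 0.017, 0.018]

grep_mean = mean(grep_totals)
rg_mean = mean(rg_totals)
speedup = grep_mean / rg_mean

print(f"grep mean: {grep_mean:.3f}s")
print(f"rg mean:   {rg_mean:.3f}s")
print(f"speedup:   {speedup:.0f}x")
```

That works out to roughly 240x on these numbers.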
Someone please make an awesome new sed and awk.
I don’t understand when people typeset some name in verbatim, lowercase, but then have another name for the actual command. That’s confusing to me.
Programmers are too enamored with lower-case names. Why not Ripgrep? Then I can surmise that there might not be some program ripgrep(1) (there might be a shorter version), since using capital letters is not traditional for CLI programs.
Look at Stacked Git:
https://stacked-git.github.io/
> Stacked Git, StGit for short, is an application for managing Git commits as a stack of patches.
> ... The `stg` command line tool ...
Now, I’ve been puzzled in the past when inputting `stgit` doesn’t work. But here they call it StGit for short and the actual command is typeset in verbatim (stg(1) would have also worked).
// really hoping openai wouldn't now force him to work on some crappy codex stuff if he stays there / in astral.
The TUI is great, and approximate matches are insanely useful.
https://reddit.com/r/rust/comments/1fvzfnb/gg_a_fast_more_li...
Also something-something about dependencies (a Rust staple): https://www.reddit.com/r/rust/comments/1fvzfnb/gg_a_fast_mor...
I wonder how much the above reflects a dated and/or stereotyped view? Who here works with "sysadmins"? I mean... devops-all-the-places now, right? :P Share your experiences?: I'm curious.
If I were a sysadmin, I'd have some kind of "sanity check" script if I had to manage a fleet of disparate systems. I'd run it automatically when I log on to a system. Checking things like: Linux vs BSD, what tools are installed, load, any weirdnesses, etc. Heck, maybe even create aliases that abstract over all of it, as much as possible. Maybe copy over a helix binary for editing too, if that was kosher.
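Something along these lines, as a minimal Python sketch of that kind of sanity check. It assumes Python is already on the box (which is itself something you might want to check for), and the `TOOLS` list and `sanity_check` name are just placeholders.

```python
import os
import platform
import shutil

# Hypothetical tool list; adjust to whatever you rely on.
TOOLS = ("grep", "rg", "sed", "awk", "git")


def sanity_check(tools=TOOLS) -> dict:
    """Collect a few basic facts about the current machine."""
    report = {
        "system": platform.system(),    # e.g. "Linux", "FreeBSD", "Darwin"
        "release": platform.release(),
        # shutil.which returns the full path, or None if the tool is missing.
        "tools": {name: shutil.which(name) for name in tools},
    }
    # os.getloadavg() is not available on every platform (e.g. Windows).
    try:
        report["load"] = os.getloadavg()
    except (AttributeError, OSError):
        report["load"] = None
    return report


if __name__ == "__main__":
    report = sanity_check()
    print(f"{report['system']} {report['release']}")
    print("load:", report["load"])
    for name, path in report["tools"].items():
        print(f"{name:6} {path or 'MISSING'}")
```

From the report you could then decide which aliases to set up, e.g. only pointing a search alias at rg when it is actually installed.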
To be honest I hate all the new Rust replacement tools. They introduce new behavior just for the sake of it, and it's annoying.