It's not unique in that regard. 'sed' is Turing complete[1][2], but few people get farther than learning how to do a basic regex substitution.
[1] https://catonmat.net/proof-that-sed-is-turing-complete
[2] And arguably a Turing tarpit.
Closest I've come, if you're willing to overlook its verbosity and (lack of) speed, is actually PowerShell, if only because it's a bit nicer than Python or JavaScript for interactive use.
I think it might be more cognitive load than it's worth to expect everyone, en masse, to learn another single-line, punctuation-driven language to perform everyday tasks with.
I suspect my use cases are less complex than yours. Or maybe jq just fits the way I think for some reason.
I dream of a world in which all CLI tools produce and consume JSON and we use jq to glue them together. Sounds like that would be a nightmare for you.
Here's an example of my white whale, converting JSON arrays to TSV.
cat input.json | jq -S '(first|keys | map({key: ., value: .}) | from_entries), (.[])' | jq -r '[.[]] | @tsv' > out.tsv
<input.json jq -S -r '(first | keys) , (.[]| [.[]]) | @tsv'
<input.json # redir
jq
-S # sort
-r # raw string out
'
(first | keys) # header
, # comma is generator
(.[] | # loop input array and bind to .
[ # construct array
.[] # with items being the array of values of the bound object
])
| @tsv' # generator binds the above array to . and renders to tsv
cat input.json | jq -r '(first | keys) as $cols | $cols, (.[] | [.[$cols[]]]) | @tsv'
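For illustration (a hypothetical sample of my own): if input.json is [{"legs":6,"name":"ant"},{"legs":4,"name":"cat"}], the $cols variant prints a sorted, tab-separated header row followed by the matching value rows:
legs	name
6	ant
4	cat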
That whole map and from_entries throws it off. It's not a good fit for what you're doing: @tsv expects a bunch of arrays, whereas you're producing a bunch of objects (with the header also being one) and then converting them to arrays. That is an unnecessary step and makes it a little harder to understand.
That world exists and is mature already (PowerShell).
jq is the CLI I like the most, but sometimes even I struggled to understand the queries I wrote in the past. celq uses a more familiar language (CEL)
# Common Expression Language
The Common Expression Language (CEL) implements common
semantics for expression evaluation, enabling different
applications to more easily interoperate.
## Key Applications
- Security policy: organizations have complex infrastructure
and need common tooling to reason about the system as a whole
- Protocols: expressions are a useful data type and require
interoperability across programming languages and platforms.
I think my personal preference for syntax would be Python’s. One day I want to try writing a query tool with https://github.com/pydantic/monty
$ cat package.json | dq 'Object.keys(data).slice(0, 5)'
[ "name", "type", "version", "scripts", "dependencies" ]
https://crespo.business/posts/dq-its-just-js/
No more fiddling around trying to figure out the damn selector by trying to track the indentation level across a huge file. Also easy to pipe into fzf, then split on "=", trim, then pass to jq
I was working a lot with Rego (the DSL for Open Policy Agent) and realized it was actually a pretty nice syntax for jq-type use cases.
Of course, this doesn't matter now, I just ask an LLM to make the query for me if it's so complex that I can't do it by hand within seconds.
This and other reasons are why I built: https://github.com/dhuan/dop
You don't have to use my implementation, you could easily write your own.
Sure there are 0.000001% edge cases where that MIGHT be the next big bottleneck.
I see the same thing repeated in various front end tooling too. They all claim to be _much_ faster than their counterpart.
9/10 whatever tooling you are using now will be perfectly fine. Example: I use grep a lot in an ad hoc manner; on really large files I switch to rg. But that is only in a handful of cases.
The difference between 2ms and 0.2ms might sound unneeded, or even silly, to you. But somebody, somewhere, is doing stream processing of TB-sized JSON objects, and they will care. This news is for them.
People would say, "Why use this when it's harder to read and only saves N ms?" He'd reply that you'd care about those ms when you had to read a database from 500 remote servers (I'm paraphrasing. He probably had a much better example.)
Turns out, he wrote a book that I later purchased. It appears to have been taken over by a different author, but the first release was all him and I bought it immediately when I recognized the name / unix.com handle. Though it was over my head when I first bought it, I later learned enough to love it. I hope he's on HN and knows that someone loved his posts / book.
https://www.amazon.com/Pro-Bash-Programming-Scripting-Expert...
Also, performance improvements on heavily used systems unlock:
Cost savings
Stability
Higher reliability
Higher throughput
Fewer incidents
Lower scale-out requirements.
For example, doing the dangerous thing might be faster (no bounds checks, weaker consistency guarantees, etc.), but it clearly tends to be a reliability regression.
So why not compare that case directly? We'd also want to see the performance of the assumed overheads, i.e. how it scales.
That's crazy to think about. My JSON files can be measured in bytes. :-D
Either way, I really doubt there will ever be a significant number of people who'd choose jq for that.
It’s the same sentiment as “Individuals don’t matter, look at how tiny my contribution is.”. Society is made up of individuals, so everybody has to do their part.
> 9/10 whatever tooling you are using now will be perfectly fine.
It is not though. Software is getting slower faster than hardware is getting quicker. We have computers that are easily 3–4+ orders of magnitude faster than what we had 40 years ago, yet everything has somehow gotten slower.
Out of curiosity, have you read the jq manpage? The first 500 words explain more or less the entire language and how it works. Not the syntax or the functions, but what the language itself is/does. The rest follows fairly easily from that.
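To give a flavour of that (my own minimal example, not taken from the manpage): every filter takes an input and produces a stream of outputs, and | feeds one filter's outputs into the next:
echo '[{"a":1},{"a":2}]' | jq '.[] | .a'
1
2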
Say a number; make a real argument. Don't just wave your hand and say "just imagine how right I could be about this vague notion if we only knew the facts"
If I/you were working with JSON of that size where this was important, I'd say you probably need to stop using JSON and move to some other binary or structured format... so long as it has some kind of tooling support.
And further, if you are doing important stuff in the CLI that needs a big chain of commands, you should probably be writing a program to do it anyway...
That's even before we get to the whole "JSON isn't really a good data format" point... and there are many better ways, the old ways or the new ways. One day I will get to use my XSLT skills again :D
"Fast enough" will always bug me. "Still ahead of network latency" will always sound like the dog ate your homework. I understand the perils of premature optimization, but not a refusal to optimize.
And I doubt I'm alone.
> 9/10 whatever tooling you are using now will be perfectly fine
Are you working in frontend? On non-trivial webapps? Because this is entirely wrong in my experience. Performance issues are the #1 complaint of everyone on the frontend team. Be that in compiling, testing or (to a lesser extent) the actual app.
Either the team I worked at was horrible, or you are from Google/Meta/Walmart where either everyone is smart or frontend performance is directly related to $$.
I completely agree with your statement on that - however, you're not addressing the point he makes, which kinda makes your statement unrelated to his point.
99.99% of all performance issues in the frontend are caused by devs doing dumb shit at this point
The frameworks' performance benefits are not going to meaningfully impact this issue anymore, hence no matter how performant yours is, that's still going to be their primary complaint across almost all complex rwcs
And the other issue is that we've decided that complex transpiling is the way to go in the frontend (TypeScript) - without that, all build-time issues would magically go away too. But I guess that's another story.
It was a different story back when eg meteorjs was the default, but nowadays they're all fast enough to not be the source of the performance issues
I don't think I remember one case where jq wasn't fast enough
Now what I'd really want is a jq that's more intuitive and easier to understand
Unfortunately I don’t recall the name, but there was something submitted to HN not too long ago (I think it was still 2026) which was like jq but used JavaScript syntax.
Opencode, ClaudeCode, etc, feel slow. Whatever make them faster is a win :)
The vast majority of Linux kernel performance improvement patches probably have way less of a real world impact than this.
I'm sure there are reasons against switching to something more efficient–we've all been there–I'm just surprised.
You could probably do something similar for a faster jq.
For about a month now I've been working on a suite of tools for dealing with JSON, written specifically for the imagined audience of "people who like CLIs or TUIs and have to deal with PILES AND PILES of JSON and care deeply about performance".
For me, I've been writing them just because it's an "itch". I like writing high performance/efficient software, and there's a few gaps that it bugged me they existed, that I knew I could fill.
I'm having fun and will be happy when I finish, regardless, but it would be so cool if it happened to solve a problem for someone else.
> The query language is deliberately less expressive than jq's. jsongrep is a search tool, not a transformation tool-- it finds values but doesn't compute new ones. There are no filters, no arithmetic, no string interpolation.
Mind me asking what sorts of TB json files you work with? Seems excessively immense.
If you work at a hyperscaler, service log volume borders on the insane, and while there is a whole pile of tooling around logs, often there's no real substitute for pulling a couple of terabytes locally and going to town on them.
Fully agree. I already know the locations of the logs on-disk, and ripgrep - or at worst, grep with LC_ALL=C - is much, much faster than any aggregation tool.
If I need to compare different machines, or do complex projections, then sure, external tooling is probably easier. But for the case of “I know roughly when a problem occurred / a text pattern to match,” reading the local file is faster.
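As a rough sketch of that workflow (the log path and search string here are hypothetical), a fixed-string search in the C locale keeps grep on its fastest path:
LC_ALL=C grep -F 'connection reset' /var/log/myservice/app.log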
Sometimes those will actually need to process through a bunch of data unexpectedly.
Sometimes those will be run on a loop - once per second, N per minute (etc), and the results will be used to monitor a situation until a bug is fixed or a spike in load is resolved or a proper monitoring program/metric can be deployed.
Sometimes those are to investigate a pegged CPU, and the amortized lower runtime across all the tasks on the CPU is noticeable.
We run our machines hot and part of the reason we can do that is being in the habit of choosing lower-cost (in cycles) tooling whenever we can. If I can spend a little time and effort learning a tool that saves a bunch of CPU in aggregate, it's a win. When the whole company does it, we can spend a lot less on hardware than it costs in engineer time to make these decisions.
Another way of putting it is: it's a type of frugality (not cheapness, just spending wisely). If you save a dollar once, it's nothing. If you have a habit of saving a dollar every time the opportunity arises, it adds up quickly. By having a habit of choosing more performant tools, you're less likely to hit a case where you wish you had used more performant tools, and you're practiced at it when the need for pure parsimony arises, so it's less painful.
- Someone likes tool X
- Figures that they can vibe code an alternative
- They take Rust for performance or FAVORITE_LANG for credentials
- Claude implements a small subset of features
- Benchmark the subset
- Claim win, profit on showcase
Note: this particular project doesn't have many visible tells, but there's a pattern of overdocumentation (17% comment-to-code ratio, >1000 words in README, Claude-like comment patterns), so it might be a guided process.
I still think that the project follows the "subset is faster than set" trend.
Usually, a perceptive user/technical mind is able to tweak their usage of the tools around their limitations, but if you can find a tool that doesn't have those limitations, it feels far superior.
The only place where ripgrep hasn't seeped into my workflow, for example, is after the pipe, and that's just out of (bad?) habit. So much so that sometimes I'll foolishly do rg "<term>" | grep <second filter>, then proceed to do a metaphorical facepalm in my mind. Let's see if jg can make me go jg <term> | jq <transformation> :)
(Honestly, who even still writes shell scripts? Have a coding agent write the thing in a real scripting language at least; they aren't fazed by the boilerplate of constructing pipelines with Python or whatever. I haven't written a shell script in over a year now.)
Prioritizing SEO-ing speed over supporting the same features/syntax (especially without an immediately prominent disclosure of these deficiencies) = marketing bullshit
A faster jq except it can't do what jq does... maybe I can use this as a pre-filter when necessary.
But every now and then a well-optimised tool/page comes along with instant feedback and is a real pleasure to use.
I think some people are more affected by that than others.
Obligatory https://m.xkcd.com/1205
However, as someone who always loved faster software and being an optimisation nerd, hat's off!
If you don't mind me asking, which yq? There's a Go variant and a Python pass-through variant, the latter also including xq and tomlq.
A bit of a fun fact: there's a quote by Farah where he said that the language and semantics of the tool he was writing, didn't really "click in" until he was well into writing it :-) I myself have been on occasion pulling my hair out trying to wield `yq`'s language, there's some inconsistencies here and there which I think are related to the novel nature of the language (not novel to everyone but it's uncommon even for those well versed with e.g. SQL). `jq` suffers from similar woes, but to a lesser degree.
Had to spend some effort setting up completions, and there are also some small rough edges around command discoverability, but anyway, much better than the previous oh-my-zsh setup
Ideally, I wish it also had a flag to force users to write type annotations, plus compiling scripts to static binaries and a TUI library; then I'd seriously consider it for writing small apps. But I like and appreciate it in its current state already
Also, there are lots of charts without comparison so the numbers mean nothing...
It looks like jaq has already progressed much further in the right direction than jsongrep has just started in the not-quite-as-right direction.
Second, some comments on the presentation: the horizontal violin graphs are nice, but all tools have the same colours, and so it's just hard to even spot where jsongrep is. I'd recommend grouping by tool and colour coding it. Besides, jq itself isn't in the graphs at all (but the title of the post made me think it would be!).
Last, xLarge is a 190MiB file. I was surprised by that. It seems too low for xLarge. I daily check 400MiB json documents, and sometimes GiB ones.
[0]: https://catalog.data.gov/dataset/?res_format=JSON
[1]: https://catalog.data.gov/dataset/crimes-2001-to-present
The whole tool would be like a few dozen lines of C++ and most likely be faster than this.
> Jq is a powerful tool, but its imperative filter syntax can be verbose for common path-matching tasks. jsongrep is declarative: you describe the shape of the paths you want, and the engine finds them.
IMO, this isn't a common use case. The comparison here is essentially like Java vs Python. Jq is perfectly fine for quick peeking. If you actually need better performance, there are always faster ways to parse JSON than using a CLI.
It's just sparkling memory safe high performance software
It does some kind of stack forking which is what allows its funky syntax
https://github.com/jqlang/jq/issues/1826
So any replacement candidate should also be benchmarked with something like hyperfine "jq .a <<< '{"a": 10 }'" . This one-liner does not work as written but should illustrate the idea.
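One way to make it work (a sketch of my own; tiny.json is a made-up file name) is to keep the JSON in a small file and sidestep the nested quoting entirely:
echo '{"a": 10}' > tiny.json
hyperfine 'jq .a tiny.json'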
Also please just use jshon if you need to just extract specific value from some small JSON. jshon uses way less resources by any conceivable metric.
Everything that can be rewritten in Rust will be rewritten in Rust.
jq is supposed to fit into other bash scripts as a one-liner. That's its superpower. I know very few people who write regex on the fly either (unless they were using it every day); they check the documentation and flesh it out when they need it.
Just use Claude to generate the jq expression you need and test it.
Basically, the double jump to find values in the heap is what slows down these tools the most.
Nice write up. I will try out your tool.
Also "jg" reads very similar to "jq", and initially I thought he was talking about "jq" all along, and I was like: where can I see the "jasongrep" examples? Threw me off for a minute.
https://news.ycombinator.com/item?id=47542182
The reason I was interested, was adding the new tool to arkade (similar to Brew, but more developer/devops focused - downloads binaries)
The agent found no Arm binaries.. and it seemed like an odd miss for a core tool
If the arm64 version was on homebrew (didn’t check if it is but assume not because it’s not mentioned on the page), I’d install it from there rather than from cargo.
I don’t really manually install binaries from GitHub, but it’s nice that the author provides binaries for several platforms for people that do like to install it that way.
To address the concern anyway: I'm sure it will soon be available in brew as an arm binary.
[1]: https://github.com/micahkepe/jsongrep/releases/tag/v0.8.0
$ cat sample.json | jg -F name
I would humbly suggest that a better syntax would be:
$ cat sample.json | jg .name
for a leaf node named "name"; or
$ cat sample.json | jg -F .name.
for any node named "name".
But I will admit, the new syntax makes a lot more sense.
Some bits of the site are hard to read: in "takes a query and a JSON input", the query is in white and the background of the site is very light, which makes it hard to read.
For example, web pages sometimes contain inline "JSON". But as this is not a proper JSON file, jq-style utilities cannot process it
The solution I have used for years is a simple utility written in C using flex^1 (a "filter") that reformats "JSON" on stdin, regardless of whether the input is a proper JSON file or not, into stdout that is line-delimited, human-readable and therefore easy to process with common UNIX utilities
The size of the JSON input does not affect the filter's memory usage. Generally, a large JSON file is processed at the same speed with the same resource usage as a small one
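Purely to illustrate the general idea (this is a gron-style sketch of my own; json2lines is a made-up name, and the author's actual output format may well differ), line-delimited output looks something like:
printf '{"user":{"name":"ann","ids":[1,2]}}' | json2lines
user.name = "ann"
user.ids[0] = 1
user.ids[1] = 2
Output in that shape is then trivial to process with grep, cut, or awk.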
The author here has provided musl static-pie binaries instead of glibc. HN commenters seeking to discredit musl often claim glibc is faster
Personally I choose musl for control not speed
1. jq also uses flex
Just added this new tool to arkade, along with the existing jq/yq.
No Arm64 for Darwin.. seriously? (Only x86_64 darwin.. it's a "choice")
No Arm64 for Linux?
For Rust tools it's trivial to add these. Do you think you can do that for the next release?
netstrings has no such issues
Yes, "cmd <file" is more efficient for the computer but not for the reader in many cases. I read from left to the right and the pipeline might be long or "cmd" might have plenty of arguments (or both). Having "cat file | cmd" immediately gives me the context for what I am working with and corresponds well with "take this file, do this, then that, etc" with it) and makes it easier for me to grok what is happening (the first operation will have some kind of input from stdin). Without that, the context starts with the (first) operation like in the sentence "do this operation, on this file (,then this, etc)". I might not be familiar with it or knowing the arguments it expects.
At least for me, the first variant comes more naturally and is quicker to follow (in most cases), so unless it is performance sensitive that is what I end up with (and cat is insanely fast for most cases).
<file command
which is equivalent to command <file
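A concrete instance (input.json and the filter are hypothetical):
<input.json jq '.name'
which behaves the same as jq '.name' <input.json or jq '.name' input.json.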