Q: A faster re-implementation of jq written in Reason Native/OCaml

(github.com)

250 points · davesnx · 5y ago · 192 comments

127 comments · 27 top-level

aasasd5y ago· 14 in thread

For everyone pining for a Jq with a different syntax: I have a bunch of links to alternatives collected, you might want to try some of them (some may be for different things than JSON):

https://github.com/fiatjaf/awesome-jq

https://github.com/TomConlin/json2xpath

https://github.com/antonmedv/fx

https://github.com/fiatjaf/jiq

https://github.com/simeji/jid

https://github.com/jmespath/jp

https://github.com/cube2222/jql

https://jsonnet.org

https://github.com/borkdude/jet

https://github.com/jzelinskie/faq

https://github.com/dflemstr/rq

Personally I think that next time I might just fire up Hy and use its functional capabilities.

topher2005y ago

My personal favorite solves the same problem but attacks it differently.

> Make JSON greppable!

> gron[1] transforms JSON into discrete assignments to make it easier to grep for what you want and see the absolute 'path' to it. It eases the exploration of APIs that return large blobs of JSON but have terrible documentation.

  ▶ gron "https://api.github.com/repos/tomnomnom/gron/commits?per_page=1" | fgrep "commit.author"
  json[0].commit.author = {};
  json[0].commit.author.date = "2016-07-02T10:51:21Z";
  json[0].commit.author.email = "mail@tomnomnom.com";
  json[0].commit.author.name = "Tom Hudson";

[1] https://github.com/tomnomnom/gron
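
To make the "discrete assignments" idea concrete, here is a minimal Python sketch of the flattening step. It is not gron's actual implementation; the `flatten` helper and the sample document are made up for illustration, and key quoting is simplified.

```python
import json
import re

# Keys that look like plain identifiers can use dot notation.
IDENT = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$]*$")

def flatten(value, path="json"):
    """Yield gron-style assignment lines for a parsed JSON value (hypothetical helper)."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key in sorted(value):
            # Simple keys use dot notation; anything else is bracket-quoted.
            child = f"{path}.{key}" if IDENT.match(key) else f"{path}[{json.dumps(key)}]"
            yield from flatten(value[key], child)
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, item in enumerate(value):
            yield from flatten(item, f"{path}[{index}]")
    else:
        # Scalars are printed as JSON literals.
        yield f"{path} = {json.dumps(value)};"

doc = json.loads('[{"commit": {"author": {"name": "Tom Hudson", "email": "mail@tomnomnom.com"}}}]')
lines = list(flatten(doc))
# Equivalent of piping the output through fgrep "commit.author"
print("\n".join(line for line in lines if "commit.author" in line))
```

Because every line carries its full path, a plain substring search is enough to find a value and see where it lives.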

sitkack5y ago

Gron and Jq are like peanut butter and jelly. Also gron has `gron -u` (ungron) to turn the pivot back into json.

vips7L5y ago

Don't forget PowerShell's ConvertFrom-Json :)

https://docs.microsoft.com/en-us/powershell/module/microsoft...

melbourne_mat5y ago

That is so not jq! I've really been pining for jq on my current Windows project :-(

1 more reply

mkesper5y ago

Does it still convert only two levels by default?

retzkek5y ago

Babashka is another Hy-like alternative, but based on Clojure, and recently discussed on HN:

https://github.com/borkdude/babashka

https://news.ycombinator.com/item?id=24353476

Aside: another nice tool I recently discovered for working with JSON and YML, doing conversion and diffs (especially helpful for generated files):

https://github.com/homeport/dyff

sansblob5y ago

Late to the party but Benthos has its own language aimed at larger mappings: https://www.benthos.dev/docs/guides/bloblang/about

cristoperb5y ago

My go-to for simple queries is https://github.com/tidwall/jj

It is not nearly as expressive as jq, but it is faster for my use cases (written in golang).

jberryman5y ago

Sorry, what is "Hy"?

madmax1085y ago

https://github.com/hylang/hy

aasasd5y ago

A lightweight Lisp on top of Python.

Personally I'd prefer Fennel, which is on Lua and thus a whole lot faster, especially in regard to the startup time—but as I noted in a thread on Fennel, Lua's omission of a proper ‘null’ makes it awkward to handle exchange and transformations of data from third parties. And, since I'm likely to fiddle with the queries for some time, startup delay is less important here.

jdc5y ago

also the interactive https://github.com/jmespath/jmespath.terminal

xonix5y ago

My humble attempt https://github.com/jsqry/jsqry-cli2

davesnxOP5y ago

Thanks, next time if you paste the list somewhere else you can add query-json :P

jeffbee5y ago· 13 in thread

1) refuses to operate on stdin; requires a filename argument, which is so irritating.

2) doesn't accept values that jq accepts

  % time jq -r '[expression]' < parcels | wc       
      365    1454    7978
  jq -r  < parcels  1.39s user 0.00s system 99% cpu 1.390 total
  wc  0.00s user 0.00s system 0% cpu 1.390 total

  % time ~/.yarn/bin/q  '[expression]' parcels | wc
  q: internal error, uncaught exception:
     Yojson.Json_error("Line 56, bytes -1-32:\nJunk after end 
  of JSON value: '{\n  \"OBJECTID\": 155303,\n  \"BOOK\"'")

diggan5y ago

1 is easy to work around (handy tip incoming for any tools that _seem_ to not support stdin but actually do, as stdin is also available as a file in unix):

    echo '{"foo": "bar"}' | query-json ".foo" /dev/stdin

saagarjha5y ago

Tools that accept filenames often expect you give them a real file, as they’ll do things on it that may not be supported by the various “it’s a file descriptor pretending to be something on disk” solutions.

1 more reply

chriswarbo5y ago

I try to avoid the dummy files /dev/stdin, /dev/stdout and /dev/stderr, since I've been bitten when they're not available, or when I hit permission denied errors.

Two examples I can remember off the top of my head:

- Nix build scripts

- OpenMoko

gpderetta5y ago

alternatively in bash:

   query-json ".foo" <(echo '{"foo": "bar"}')

davesnxOP5y ago

but indeed, it's a nice workaround!

davesnxOP5y ago

query-json --kind=inline would support reading from stdin, I didn't spend enough time on Cmdliner!

f311a5y ago

That's super weird, I think most people use jq for bash pipelines.

jeffbee5y ago

Yes, I don't understand how people end up with assertions that the filename is a required argument. At least we've got /dev/stdin or /proc/self/fd/0 as workarounds.
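
For what it's worth, the usual convention (filename optional, fall back to stdin when it is missing or "-") is only a few lines. A minimal Python sketch, not how query-json or jq handle it; `query_key` and its parameters are hypothetical names:

```python
import io
import json
import sys

def query_key(key, path=None, stdin=None):
    """Load one JSON document from `path`, or from stdin when path is None or "-"."""
    source = sys.stdin if stdin is None else stdin
    stream = source if path in (None, "-") else open(path, encoding="utf-8")
    try:
        return json.load(stream)[key]
    finally:
        # Only close files we opened ourselves, never the caller's stdin.
        if stream is not source:
            stream.close()

# Simulate `echo '{"foo": "bar"}' | tool foo` with an in-memory stream.
print(query_key("foo", stdin=io.StringIO('{"foo": "bar"}')))
```

The `stdin` parameter exists only so the sketch can be exercised without a real pipe.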

2 more replies

ashtonkem5y ago

My most common jq usage is to copy and paste some json into quotes to make it easier to read. My second most common action is to chain curl and jq together. A replacement for jq that doesn’t use stdin is literally useless to me.

1 more reply

kilroy1235y ago

Probably true but I use it regularly in the automation app huginn.

https://github.com/huginn/huginn

davesnxOP5y ago

Nothing that can be fixed later?

freedomben5y ago

Thanks for this. I've been planning a similar work for years and haven't gotten off my ass (too many other projects lol).

I definitely agree that reading from stdin is critical if I'll be able to use it. Don't take the criticism too hard though (especially the "author doesn't appreciate unix" stuff. Sometimes we can be such assholes to each other).

Nice work!

1 more reply

jeffbee5y ago

Probably. You tell us :=)

The incompatibility is apparently due to the fact that jq is happy with a concatenation of JSON objects and q is not. For example {'foo':1}{'foo':2} as opposed to [{'foo':1},{'foo':2}]
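
Accepting a concatenation of values is not hard for a parser that reports where each value ends. A minimal Python sketch using the standard library's `json.JSONDecoder.raw_decode` (this says nothing about how jq or Yojson do it; `iter_json_values` is a made-up helper and the input uses valid double-quoted JSON):

```python
import json

def iter_json_values(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # Returns the parsed value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value

stream = '{"foo": 1}{"foo": 2}\n{"foo": 3}'
print(list(iter_json_values(stream)))
```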

2 more replies

toastal5y ago· 11 in thread

Are we sure it should get a single-letter 'q' binary name though? Docs seem to point that it's short for 'query-json'? Why not call it 'query-json' and let the user decide that as a shell alias or whatever. Even the ubiquitous 'ls' and 'cd' are two characters.

nondave5y ago

Also clashes with this existing q: https://en.m.wikipedia.org/wiki/Q_(programming_language_from...

stingraycharles5y ago

Which is relatively established and widely used (although mainly in finance). It was the first thing I thought about.

vincnetas5y ago

And there's also this 'q':

http://harelba.github.io/q/

Query CSV with SQL

V6HBGNQHU5y ago

Seems in the past hour he has renamed it to query-json and suggests that you set your own alias if you want something shorter.

Hnrobert425y ago

Yeah. Can you imagine trying to do a web search for ‘q’?

pinopinopino5y ago

I can and it is lightly disturbing. https://imgur.com/a/yAOF31g ^_^

3 more replies

mritchie7125y ago

or at least qj

davesnxOP5y ago

I don't want to fight over the name, q was just a shortening.

Happy to rename it to qj instead, but the option of renaming the binary is a good workaround.

1 more reply

rurban5y ago

I made the same argument to him on reddit. q even exists already.

He replied and thought about qj

davesnxOP5y ago

Renamed :P

1 more reply

davesnxOP5y ago

Renamed already :D

mkesper5y ago· 9 in thread

I long for such a tool with a more comprehensible query language.

cube22225y ago

If so, and for anybody else having this wish, check out jql[0], I've created it exactly for this reason, to have the most common jq operations available in a more uniform and easier to use interface.

[0]: https://github.com/cube2222/jql

davesnxOP5y ago

Nice!

I will try to bring it into the benchmark, thanks for sharing

diggan5y ago

Give jet a try! It uses a lightweight query language over EDN. If you're familiar with Clojure, it'll be very natural to use, and if you're not, the query language is very easy to pick up :) https://github.com/borkdude/jet/blob/master/doc/query.md

benibela5y ago

There is XPath 3.1: https://www.w3.org/TR/xpath-31/#id-lookup

It is more verbose, like you get the size of something with array:size or map:size functions, so it is more readable

I am implementing it in Xidel 0.9.9+: http://www.videlibri.de/xidel.html

jeffbee5y ago

XPath is really useful and `xmlstarlet` is/was jq before jq existed. These days it's relatively rare to get data in XML instead of JSON, though.

1 more reply

peterohler5y ago

The oj command in https://github.com/ohler55/ojg uses JSONPath as the query and filter language. Maybe it is more in line with what you are looking for.

aasasd5y ago

I've collected a bunch of alternatives to look at, see here: https://news.ycombinator.com/item?id=24470715

JamesSwift5y ago

For anything moderately complex I iterate using https://jqplay.org/. Life is much better since I started doing that.

cube22225y ago

Hint: you can do live jq query preview for any jq-like command using fzf. It looks like this for jql, an alternative I've created (you can find it in a neighboring comment):

echo '' | fzf --print-query --preview-window wrap --preview 'cat test.json | jql {q}'

dkdk82835y ago· 7 in thread

Is jq slow? I have only worked with datasets up to 1mb but I’ve never had a performance issue that wasn’t attributed to my error.

minaguib5y ago

Yes

I often have to pluck out attributes from streams of json records (1 json object per line) - often millions/billions.

jq is almost always the bottleneck in the pipeline at 100% CPU - so much so that we often add an fgrep to the left side of the pipeline to minimize the input to jq as much as possible.
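
The same trick (a cheap substring test before the expensive parse) carries over to a script. A minimal Python sketch, assuming one JSON object per line; the `pluck` helper, field names and sample records are all made up:

```python
import io
import json

def pluck(lines, field, must_contain=None):
    """Yield `field` from each JSON record (one per line), skipping lines early when possible."""
    for line in lines:
        if not line.strip():
            continue
        # Cheap substring test first, like putting fgrep before jq in a pipeline.
        if must_contain is not None and must_contain not in line:
            continue
        record = json.loads(line)
        if field in record:
            yield record[field]

records = io.StringIO(
    '{"user": "ann", "status": "error"}\n'
    '{"user": "bob", "status": "ok"}\n'
    '{"user": "cy", "status": "error"}\n'
)
print(list(pluck(records, "user", must_contain='"error"')))
```

The prefilter can let through false positives (the substring may appear in another field), so a real pipeline would still check the parsed record.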

de_watcher5y ago

It's very slow. It immediately stood out in our automated tests when we added a JSON protocol to our system and used jq to test some assertions.

nicoburns5y ago

jq is pretty fast in my experience. But there have been cases where I've wanted it to be faster (dealing with a 90GB JSON file).

The main weakness seems to be streaming use cases (not having the whole file in memory at once). These are supported, but the syntax is quite awkward.

davesnxOP5y ago

Right, q doesn't support streaming, so it won't manage a 90GB JSON file.

I should specify that in the performance section, thanks!

arethuza5y ago

Out of interest, what created a 90GB JSON file?

3 more replies

jonstewart5y ago

Yes. I was looking to embed it in a tool, but decided against it after looking at its implementation. It parses the expression with a stack and executes it directly, and its JSON parsing is much the same. I doubt the parsing would be close to competitive with RapidJSON, let alone simd-json. The conditionals and pointer chasing of such an implementation are stumbling blocks to performance.

The C code is clean enough as C code goes, but fairly monolithic. And it’s C, so it’s not noticeably slow until you start processing GB. But it would probably take a rewrite to improve its performance significantly.

davesnxOP5y ago

Nobody said that jq is slow.

vasergen5y ago· 6 in thread

Speed is not a concern for me. I am wondering if there is something better than `jq` in terms of syntax. Whenever I want to do something more than just prettify JSON output in the console or get a value by a specific field name, I have a problem: I find it difficult to remember jq syntax without looking into my shell history. I also keep links in my notes to examples like this one

https://mosermichael.github.io/jq-illustrated/dir/content.ht...

mumblemumble5y ago

For my part, I've always wanted a tool that just replicates PostgreSQL's JSON syntax. That way I can have only one syntax to remember.

davesnxOP5y ago

One of the main ideas of query-json is to provide excellent errors, so the tool teaches you as you use it.

There are also a few techniques to "discover" the schema of a JSON file: I tend to start with '.' or 'keys' and keep going from there.

I'm planning to implement a flag where each operation prints the internal state of the JSON, so you can see what flows through the "pipes".

I will pick a few items from your cheatsheet to implement next in q, thanks!

bradly5y ago

Check out jql [https://github.com/cube2222/jql] and oj [https://github.com/ohler55/ojg]

vasergen5y ago

I will definitely take a look, I've never heard of `jql` before, thanks

aasasd5y ago

Might want to take a look at some of these alternatives: https://news.ycombinator.com/item?id=24470715

xonix5y ago

My humble attempt https://github.com/jsqry/jsqry-cli2

jakuboboza5y ago· 5 in thread

Do we need to make jq faster? Does anyone have issues with its current speed? Is there any specific reason other than "because we can"?

phonebucket5y ago

I can't answer for the OP, but "because we can" is a valid enough reason (pun unintended) for me.

IMO, an individual dev making a fast useful tool should always be welcomed as a feat of worthy hacking.

mumblemumble5y ago

According to the "Purpose" section of the readme, it doesn't look like beating jq's speed was ever a goal. It was meant to be a learning exercise.

But if I had done something like that, and then serendipitously discovered that I was exceeding the original's performance, I certainly wouldn't be shy about it.

Also, this comes across as armchair criticism purely for the sake of armchair criticism. My own experience has been that, when I'm doing ETL that involves wrangling JSON, the "wrangling JSON" bit of it is almost always the bottleneck. So any improvement is more than welcome and deserves to be cheered. Even if it's an improvement on something that's already the current fastest way to do it.

davesnxOP5y ago

I'm not sure which "we" you are referring to.

Reimplementing a piece of software that is 12 years old, mimicking its UX and improving performance and error messages, is more than welcome in my opinion. My purpose was to learn the OCaml stack for writing compilers, so I personally found that I "needed" a language that already existed.

Thanks for raising those concerns

ludamad5y ago

If we were at a board meeting deciding how to spend my time, no. If it is done? Why not

specialist5y ago

Batting practice.

We'd all be better off if plebes grew their skills by reimplementing common tools.

gkfasdfasdf5y ago· 4 in thread

Curious, any description as to why it's faster? Something intrinsic to Reason Native/OCaml? Architectural changes? Reduced feature set?

tyingq5y ago

Jq appears to have its own hand written json parser and requires flex/bison. I suspect something about the hand written parser is slow for large data sets.

I was somewhat surprised it didn't use an existing json parser library.

brundolf5y ago

I'm doubly surprised that such a popular utility uses bison; generated parsers tend to be slower than handwritten parsers, and JSON isn't exactly the world's hardest language to parse

dubcanada5y ago

Well it is missing a ton of jq functionality, it's possible that in that list is something causing the performance degradation.

masklinn5y ago

jq’s in C, so probably not anything intrinsic. Not to mention I don’t think the OCaml compiler is an optimisation beast.

YesThatTom25y ago· 4 in thread

Great! Now improve the syntax!

dividedbyzero5y ago

How, though? I agree that jq's syntax isn't exactly the most straightforward, and it gets raised as a point of criticism anytime jq is mentioned, but its scripting language seems like a pretty good compromise between compactness and rich features.

Replacing that with, say, traditional command line flags would make it a lot less useful for me, I'd probably have to build much longer pipe-chains to do things that are relatively simple and readable jq snippets (if one knows the syntax.)

Using an established scripting language in its place would make it pretty much just python -c/ruby -e or whatever with some pre-loaded functions, but what's the point? You can always just write a quick python/ruby/whatever script, jq to me is an alternative for cases where a script feels unnecessary. It would also mean everything gets more verbose, so less of my jq transformations can be inlined without loss of readability.

Aligning it to more established languages would probably cause confusion as well in those cases where it doesn't match the reference language 1:1. Looks like javascript, writes like javascript, but only for a tiny subset of the language, etc.

Doing this only for a few function names or syntax constructs still results in a pretty unique and unusual language that will require people to reference the docs a lot, just now lots of existing scripts break.

davesnxOP5y ago

Just because jq is very well established doesn't mean its APIs are well designed, or that we shouldn't improve them because that would break existing scripts.

There are a lot of quirks in its usage, and people struggle to learn such a great tool, so query-json will try to offer a better interface for users.

aasasd5y ago

Perhaps one of these might work for you: https://news.ycombinator.com/item?id=24470715

JamesSwift5y ago

I think the issue is not the syntax, it's the barebones docs with absolutely trivial examples.

ksmg5y ago· 3 in thread

Hm, I thought q is synonym for querying CSV files https://harelba.github.io/q/

redsaz5y ago

Same. When I saw the name "q" I thought of this same tool.

davesnxOP5y ago

Right, I found q cute... but I'm thinking to release new version with the name query-json or just change the name all-together. Any suggestion? ^^

smabie5y ago

No q is an array language

jamil75y ago· 3 in thread

As an outsider I get very confused by the Reason / Reason Native / OCaml / Bucklescript / Rescript?! ecosystem. What does it mean for it to be written in Reason Native/OCaml?

rashkov5y ago

That means it produces a native binary (for example, a .exe file on windows platforms), so ultimately you're aiming to run the program in a terminal. This is the normal way for OCaml to operate.

In this case the author is using Reason as an alternative syntax to OCaml. Reason resembles javascript a little more, and some people find that nicer to work with. So the idea is that you write Reason code, then translate it into OCaml code using the Reason tools, and then ultimately you compile it down to a native binary.

If instead you want to write a web-app which runs in a web browser or node.js, then you'd need to compile it to Javascript, which is what bucklescript helps you do.

Where does Rescript come in? As explained above, Reason can be used for writing either native apps or javascript apps. However, it's hard to evolve the syntax of Reason in a way which satisfies both aims. So they've now split the work -- going forward, Reason will specialize on native, and Rescript will specialize on javascript apps. Their syntax is expected to diverge from each other, in order to support those aims as best as they can.

jamil75y ago

Thank you for the detailed answer! I check in on the status of the related projects from time to time and was often confused by the relationship between the components.

davesnxOP5y ago

Right, the explanation of Reason - BuckleScript - OCaml is always nebulous.

I used Reason to compile to Native, so using OCaml's stdlib and OCaml's dependencies and compiling it with OCaml, but my source code is written in Reason syntax.

StavrosK5y ago· 3 in thread

This looks nice, but I was a bit dismayed at "friends don't let friends curl | bash, to install this run curl | bash".

konjin5y ago

I remember one of the first times I tried installing Linux software in the wild. The bash script asked for your password, sent it to their server using curl, then returned the script with the password hard-coded into it and ran itself with sudo, all over unencrypted HTTP. I was 17, but even then I stopped to think about whether this was a good idea.

It wasn't.

faitswulff5y ago

That is pretty amusing. I’ve seen some bootstrap scripts that pipe the curled output to the terminal for approval before executing it. That seems like an ergonomic alternative to curl | bash. It would be at least as useful as the terms of service warnings before you install something, anyway.

davesnxOP5y ago

There are 4 alternative ways to install query-json.

Before doing any curl | bash, check what's in the install script; that's the entire point of it.

andylynch5y ago· 2 in thread

This looks interesting, but could be confusing given the programming language of the same name (https://code.kx.com/q/)

_v7gu5y ago

Ah yes, the old ".j.k raze read0`" as a separate app

andylynch5y ago

I should definitely check how that compares on some big files here.

jonemi5y ago· 2 in thread

I used to be a regular user of jq, but I was never parsing very large JSON. I now do what I used to do with jq in my browser's developer tools console. Map and filter are far more familiar than jq's syntax where I found myself referring to the documentation most of the time.

I'm sure other people have use cases where the browser wouldn't meet their needs, but for me, I find jq unnecessary.

choward5y ago

Writing a script? I'm not going to have my script open a web browser so I can attempt to interact with a web console.

jonemi5y ago

When it got to the point when I needed a script, I just preferred Python. I can understand how some might prefer jq and a shell script, I just realized it wasn't worth it for my particular needs.
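
As an illustration of that trade-off, a typical select-and-project query is short in plain Python too. A minimal sketch with made-up sample data; the jq filter in the comment is the rough equivalent:

```python
import json

raw = '[{"name": "a", "size": 10}, {"name": "b", "size": 250}, {"name": "c", "size": 90}]'
items = json.loads(raw)

# Roughly what the jq filter `.[] | select(.size > 50) | .name` does
big_names = [item["name"] for item in items if item["size"] > 50]
print(big_names)
```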

muktabh5y ago· 2 in thread

Slightly out of context here, I find the entire stack of bsb, bsb-native, ocaml and esy pretty cool. However, I just don't find enough resources, good tutorials etc. on Google search. Is there a good set of beginner tutorials anyone can point to? Thanks in advance.

davesnxOP5y ago

Documentation is a problem in the OCaml world, and a problem with Reason Native as well. I found myself pretty lost at times; esy.sh should be the initial point of contact for most Reason-related stuff.

Menhir, sedlex and others present a pretty high barrier to entry for newcomers.

One of the nice things about all of it is the Discord: it's friendly and always helpful.

Hope it helps, just let me know if there's anything specific!

smabie5y ago

Just ditch Reason and use OCaml. There's a lot more documentation and the syntax is better.

nikolay5y ago· 2 in thread

JMESPath is the only viable alternative, which probably has a wider footprint than even jq as it's part of AWS CLI.

acdha5y ago

It's definitely popular but “only viable alternative” is a bit strong: that's only if you need compatibility with particular tools which support only one of the two formats. There's no reason why anyone who doesn't like those tools couldn't create a different syntax to scratch whatever particular itch they have.

nikolay5y ago

It's embeddable and available as a library for all languages [0]. Everything else is pretty much nothing but a CLI tool, which further limits its adoption.

[0]: https://github.com/jmespath

1 more reply

as-j5y ago· 1 in thread

> Aside from that, q isn't feature parity with jq which is ok at this point, but jq contains a ton of functionality that query-json misses and some of the jq operations aren't native, are builtin with the runtime. In order to do a proper comparision all of this above would need to take into consideration.

> The report shows that q is between 2x and 5x faster than jq in all operations tested and same speed (~1.1x) with huge files (> 100M).

While faster for some things... that's a pretty large set of caveats!

davesnxOP5y ago

Adding most of the jq operations shouldn't affect performance at all; in fact, if I end up implementing streaming it could be even faster.

I have an issue for improving performance where I can push this forward: https://github.com/davesnx/query-json/issues/7

But sure, there are caveats!

riston5y ago· 1 in thread

Would be good if someone adds an explanation why this new approach is better, is it that the OCaml is faster, more efficient algorithms were used, etc?

davesnxOP5y ago

I tried to explain it on the Performance section and on the report

https://github.com/davesnx/query-json#performance https://github.com/davesnx/query-json/blob/master/benchmarks...

But none of these explanations are backed by evidence, they are just assumptions.

Ericson23145y ago· 1 in thread

This is funny because Stephen Dolan, the original jq author, works on OCaml itself.

davesnxOP5y ago

Exactly! I wanted to contact him

layoutIfNeeded5y ago· 1 in thread

Umm... There's already a language called Q for array processing.

davesnxOP5y ago

Will rename it to query-json. Thaaanks!

Borkdude5y ago· 1 in thread

If you're into Clojure, check out https://github.com/borkdude/jet

iLemming5y ago

I use jet all the time when I need to quickly examine a json snippet in Emacs. I would use <C-u M-|> (shell-command-on-region with a prefix) and execute jet to convert selected json part to EDN. That cuts out all the visual noise. EDN is much more concise, cleaner and easier to read. I'd use it even if I don't write Clojure.

RMPR5y ago· 1 in thread

Upcoming q-rs a rewrite of q in Rust :p

davesnxOP5y ago

I hope so!

skywhopper5y ago· 1 in thread

This is cool, but I’m not sure it’s fair to claim it’s “faster” yet when it doesn’t do 95% of what jq does, particularly the command line options. If it’s still faster when you can match 80% of the functionality, then it might be a claim worth making.

davesnxOP5y ago

Exactly, I didn't claim it is faster in all cases, since there's no feature parity, and I won't present it that way.

For the set of operations that I implemented, it is faster; that much is true.

brundolf5y ago· 1 in thread

I'd love to hear some speculation - from the author or otherwise - as to why a fresh OCaml implementation would so dramatically outperform a mature C implementation

davesnxOP5y ago

There are a few good assumptions about why it is faster, but they are just speculation since I haven't profiled jq or query-json.

The feature that I think penalizes jq a lot is "def functions", the ability to define any function and have it available at run-time.

This creates a few layers; one of the differences is the interpreter and the linker, which are responsible for gathering all the builtin functions, compiling them and having them ready to use at runtime.

The other pain point is the architecture of operations in jq, since it is stack-based. In query-json it is a set of piped recursive operations.

Aside from the code, in the OCaml stack, menhir has proven to be really fast for building this kind of compiler.

I will dig more into performance and try to profile both tools in order to improve mine.

Thanks
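
To illustrate what "piped recursive operations" can look like, here is a minimal Python sketch of a tree-walking evaluator for a tiny query language. It is not query-json's actual code (which is written in Reason); the node types `Key`, `Index`, `Map` and `Pipe` are hypothetical, and unlike jq each expression produces exactly one output.

```python
from dataclasses import dataclass

@dataclass
class Key:
    name: str          # like `.name`

@dataclass
class Index:
    position: int      # like `.[0]`

@dataclass
class Map:
    body: object       # like `map(body)`

@dataclass
class Pipe:
    left: object       # like `left | right`
    right: object

def evaluate(expr, value):
    """Recursively apply a query expression to an already-parsed JSON value."""
    if isinstance(expr, Key):
        return value[expr.name]
    if isinstance(expr, Index):
        return value[expr.position]
    if isinstance(expr, Map):
        return [evaluate(expr.body, item) for item in value]
    if isinstance(expr, Pipe):
        # The output of the left side becomes the input of the right side.
        return evaluate(expr.right, evaluate(expr.left, value))
    raise TypeError(f"unknown expression: {expr!r}")

doc = {"items": [{"id": 1}, {"id": 2}]}
query = Pipe(Key("items"), Map(Key("id")))  # like `.items | map(.id)`
print(evaluate(query, doc))
```

There is no separate bytecode or explicit stack here: the host language's call stack does the work, which is one plausible reading of the design difference described above.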

tus885y ago· 1 in thread

Isn't JQ written in C? I doubt LISP is going to be faster.

davesnxOP5y ago

Yes, jq is written in C. Where does LISP come from?

yahyaheee5y ago· 1 in thread

I’m with Q!

davesnxOP5y ago

heycosmo5y ago

In case anyone is interested in yet another alternative, I have this old, unpolished project: https://github.com/bauerca/jv

It is a JSON parser in C without heap allocations. The query language is piddly, but the tool can be useful for grabbing a single value from a very large JSON file. I don't have time for it, but someone could fork and make it a real deal.

j / k navigate · click thread line to collapse

192 comments

127 comments · 27 top-level

aasasd5y ago· 14 in thread

For everyone pining for a Jq with a different syntax: I have a bunch of links to alternatives collected, you might want to try some of them (some may be for different things than JSON):

https://github.com/fiatjaf/awesome-jq

https://github.com/TomConlin/json2xpath

https://github.com/antonmedv/fx

https://github.com/fiatjaf/jiq

https://github.com/simeji/jid

https://github.com/jmespath/jp

https://github.com/cube2222/jql

https://jsonnet.org

https://github.com/borkdude/jet

https://github.com/jzelinskie/faq

https://github.com/dflemstr/rq

Personally I think that next time I might just fire up Hy and use its functional capabilities.

topher2005y ago

My personal favorite solves the same problem but attacks it differently.

> Make JSON greppable!

  ▶ gron "https://api.github.com/repos/tomnomnom/gron/commits?per_page=1" | fgrep "commit.author"
  json[0].commit.author = {};
  json[0].commit.author.date = "2016-07-02T10:51:21Z";
  json[0].commit.author.email = "mail@tomnomnom.com";
  json[0].commit.author.name = "Tom Hudson";

[1] https://github.com/tomnomnom/gron

sitkack5y ago

Gron and Jq are like peanut butter and jelly. Also gron has `gron -u` (ungron) to turn the pivot back into json.

vips7L5y ago

Don't forget powershell's Convert-FromJson :)

https://docs.microsoft.com/en-us/powershell/module/microsoft...

melbourne_mat5y ago

That is so not jq! I've really been pining for jq on my current Windows project :-(

1 more reply

mkesper5y ago

Does it still convert only two levels by default?

retzkek5y ago

Babashka is another Hy-like alternative, but based on Clojure, and recently discussed on HN:

https://github.com/borkdude/babashka

https://news.ycombinator.com/item?id=24353476

Aside: another nice tool I recently discovered for working with JSON and YML, doing conversion and diffs (especially helpful for generated files):

https://github.com/homeport/dyff

sansblob5y ago

Late to the party but Benthos has its own language aimed at larger mappings: https://www.benthos.dev/docs/guides/bloblang/about

cristoperb5y ago

My go-to for simple queries is https://github.com/tidwall/jj

It is not nearly as expressive as jq, but it is faster for my use cases (written in golang).

jberryman5y ago

Sorry, what is "Hy"?

madmax1085y ago

https://github.com/hylang/hy

aasasd5y ago

A lightweight Lisp on top of Python.

jdc5y ago

also the interactive https://github.com/jmespath/jmespath.terminal

xonix5y ago

My humble attempt https://github.com/jsqry/jsqry-cli2

davesnxOP5y ago

Thanks, next time if you paste the list somewhere else you can add query-json :P

jeffbee5y ago· 13 in thread

1) refuses to operate on stdin; requires a filename argument, which is so irritating.

2) doesn't accept values that jq accepts

  % time jq -r '[expression]' < parcels | wc       
      365    1454    7978
  jq -r  < parcels  1.39s user 0.00s system 99% cpu 1.390 total
  wc  0.00s user 0.00s system 0% cpu 1.390 total

  % time ~/.yarn/bin/q  '[expression]' parcels | wc
  q: internal error, uncaught exception:
     Yojson.Json_error("Line 56, bytes -1-32:\nJunk after end 
  of JSON value: '{\n  \"OBJECTID\": 155303,\n  \"BOOK\"'")

diggan5y ago

1 is easy to work around (handy tip incoming for any tools that _seem_ to not support stdin but actually do, as stdin is also available as a file in unix):

    echo '{"foo": "bar"}' | query-json ".foo" /dev/stdin

saagarjha5y ago

1 more reply

chriswarbo5y ago

I try to avoid the dummy files /dev/stdin, /dev/stdout and /dev/stderr, since I've been bitten when they're not available, or when I hit permission denied errors.

Two examples I can remember off the top of my head:

- Nix build scripts

- OpenMoko

gpderetta5y ago

alternatively in bash:

   query-json ".foo" <(echo '{"foo": "bar"}')

davesnxOP5y ago

but indeed, it's a nice workaround!

davesnxOP5y ago

query-json --kind=inline would support reading from stdin, I didn't spend time on the Cmdliner enough!

f311a5y ago

That's super weird, I think most people use jq for bash pipelines.

jeffbee5y ago

Yes, I don't understand how people end up with assertions that the filename is a require argument. At least we've got /dev/stdin or /proc/self/fd/0 as workarounds.

2 more replies

ashtonkem5y ago

1 more reply

kilroy1235y ago

Probably true but I use it regularly in the automation app huginn.

https://github.com/huginn/huginn

davesnxOP5y ago

Nothing that can't be fixed later?

freedomben5y ago

Thanks for this. I've been planning a similar work for years and haven't gotten off my ass (too many other projects lol).

Nice work!

jeffbee5y ago

Probably. You tell us :=)

The incompatibility is apparently due to the fact that jq is happy with a concatenation of JSON objects and q is not. For example {"foo":1}{"foo":2} as opposed to [{"foo":1},{"foo":2}]
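To make the difference concrete, here is a small Python sketch that reads a concatenation of JSON values one at a time using the standard library's `json.JSONDecoder.raw_decode`. The function name `iter_json_values` is made up for this example; it only illustrates the input shape jq accepts and says nothing about how jq or query-json parse internally.

```python
import json
from typing import Any, Iterator


def iter_json_values(text: str) -> Iterator[Any]:
    """Yield each top-level JSON value from a concatenated stream.

    Hypothetical helper: accepts input such as {"foo":1}{"foo":2},
    which a strict single-document parser rejects as trailing junk.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while True:
        # Skip whitespace between (and after) values.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            return
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        yield value


if __name__ == "__main__":
    for value in iter_json_values('{"foo":1}{"foo":2}'):
        print(value)
```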

toastal5y ago· 11 in thread

nondave5y ago

Also clashes with this existing q: https://en.m.wikipedia.org/wiki/Q_(programming_language_from...

stingraycharles5y ago

Which is relatively established and widely used (although mainly in finance). It was the first thing I thought about.

vincnetas5y ago

And there's also this 'q':

http://harelba.github.io/q/

Query CSV with SQL

V6HBGNQHU5y ago

Seems in the past hour he has renamed it to query-json and suggests that you set your own alias if you want something shorter.

Hnrobert425y ago

Yeah. Can you imagine trying to do a web search for ‘q’?

pinopinopino5y ago

I can and it is lightly disturbing. https://imgur.com/a/yAOF31g ^_^

mritchie7125y ago

or at least qj

davesnxOP5y ago

I don't want to fight over the name; q was just a shorthand.

Happy to rename it to qj instead, but renaming the binary yourself is a good workaround.

rurban5y ago

I made the same argument to him on reddit. q even exists already.

He replied and thought about qj

davesnxOP5y ago

Renamed :P

davesnxOP5y ago

Renamed already :D

mkesper5y ago· 9 in thread

I long for a tool like this with a more comprehensible query language.

cube22225y ago

If so, and for anybody else with this wish, check out jql[0]. I created it for exactly this reason: to make the most common jq operations available in a more uniform and easier-to-use interface.

[0]: https://github.com/cube2222/jql

davesnxOP5y ago

Nice!

I will try to add it to the benchmark, thanks for sharing

diggan5y ago

benibela5y ago

There is XPath 3.1: https://www.w3.org/TR/xpath-31/#id-lookup

It is more verbose, like you get the size of something with array:size or map:size functions, so it is more readable

I am implementing it in Xidel 0.9.9+: http://www.videlibri.de/xidel.html

jeffbee5y ago

XPath is really useful and `xmlstarlet` is/was jq before jq existed. These days it's relatively rare to get data in XML instead of JSON, though.

peterohler5y ago

The oj command in https://github.com/ohler55/ojg uses JSONPath as the query and filter language. Maybe it is more in line with what you are looking for.

aasasd5y ago

I've collected a bunch of alternatives to look at, see here: https://news.ycombinator.com/item?id=24470715

JamesSwift5y ago

For anything moderately complex I iterate using https://jqplay.org/. Life is much better since I started doing that.

cube22225y ago

Hint: you can do live jq query preview for any jq-like command using fzf. It looks like this for jql, an alternative I've created (you can find it in a neighboring comment):

echo '' | fzf --print-query --preview-window wrap --preview 'cat test.json | jql {q}'

dkdk82835y ago· 7 in thread

Is jq slow? I have only worked with datasets up to 1mb but I’ve never had a performance issue that wasn’t attributed to my error.

minaguib5y ago

Yes

I often have to pluck out attributes from streams of json records (1 json object per line) - often millions/billions.

jq is almost always the bottleneck in the pipeline at 100% CPU - so much so that we often add an fgrep to the left side of the pipeline to minimize the input to jq as much as possible.
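The fgrep trick can be sketched in Python as follows: run a cheap substring check on each line and only parse the lines that pass. The function name `pluck` and its parameters are assumptions for illustration, and the sketch assumes one JSON object per line as described above.

```python
import json
from typing import Any, Iterable, Iterator


def pluck(lines: Iterable[str], needle: str, key: str) -> Iterator[Any]:
    """Yield record[key] for each line that contains `needle`.

    Hypothetical helper: the substring test plays the role of fgrep,
    so lines that cannot match are never handed to the JSON parser.
    """
    for line in lines:
        if needle not in line:
            continue  # cheap rejection, no parsing cost
        record = json.loads(line)
        if isinstance(record, dict) and key in record:
            yield record[key]


if __name__ == "__main__":
    sample = ['{"user": "a", "n": 1}', '{"other": 2}', '{"user": "b"}']
    print(list(pluck(sample, '"user"', "user")))
```

The substring check can produce false positives (the needle may appear inside a value), which is why the key is still checked after parsing.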

de_watcher5y ago

It's very slow. It immediately stood out in our automated tests when we added a JSON protocol to our system and used jq to test some assertions.

nicoburns5y ago

jq is pretty fast in my experience. But there have been cases where I've wanted it to be faster (dealing with a 90GB JSON file).

The main weakness seems to be streaming use cases (not having the whole file in memory at once). These are supported, but the syntax is quite awkward.

davesnxOP5y ago

Right, q doesn't support streaming, so it won't manage a 90GB JSON file.

I should specify that in the performance section. Thanks!

arethuza5y ago

Out of interest, what created a 90GB JSON file?

jonstewart5y ago

davesnxOP5y ago

Nobody said that jq is slow.

vasergen5y ago· 6 in thread

https://mosermichael.github.io/jq-illustrated/dir/content.ht...

mumblemumble5y ago

For my part, I've always wanted a tool that just replicates PostgreSQL's JSON syntax. That way I can have only one syntax to remember.

davesnxOP5y ago

One of the main ideas of query-json is to provide excellent errors, so the tool teaches you as you use it.

There are also a few techniques to "discover" the schema of a JSON file: I tend to start with '.' or 'keys' and keep going from there.

I'm planning to implement a flag where each operation prints the internal state of the json, so you would see what are the "pipes".

I will pick a few items from your cheatsheet to implement next in q. Thanks!
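As a rough illustration of that "discover the schema" workflow, the following Python sketch walks a parsed JSON value and lists the key paths it finds. The function name `key_paths` and the `[]` notation for arrays are choices made for this example, not features of query-json or jq.

```python
from typing import Any, Iterator


def key_paths(value: Any, prefix: str = "") -> Iterator[str]:
    """Yield dotted key paths found in a parsed JSON value.

    Hypothetical helper: for arrays, only the first element is
    inspected, on the assumption that the elements share a shape.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}"
            yield path
            yield from key_paths(child, path)
    elif isinstance(value, list) and value:
        yield from key_paths(value[0], prefix + "[]")


if __name__ == "__main__":
    doc = {"a": {"b": 1}, "c": [{"d": 2}]}
    for path in key_paths(doc):
        print(path)
```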

bradly5y ago

Check out jql [https://github.com/cube2222/jql] and oj [https://github.com/ohler55/ojg]

vasergen5y ago

Will definitely take a look. I'd never heard of `jql` before, thanks

aasasd5y ago

Might want to take a look at some of these alternatives: https://news.ycombinator.com/item?id=24470715

xonix5y ago

My humble attempt https://github.com/jsqry/jsqry-cli2

jakuboboza5y ago· 5 in thread

Do we need to make jq faster? Does anyone have issues with the current speed? Is there any specific reason other than "because we can"?

phonebucket5y ago

I can't answer for the OP, but "because we can" is a valid enough reason (pun unintended) for me.

IMO, an individual dev making a fast useful tool should always be welcomed as a feat of worthy hacking.

mumblemumble5y ago

According to the "Purpose" section of the readme, it doesn't look like beating jq's speed was ever a goal. It was meant to be a learning exercise.

But if I had done something like that, and then serendipitously discovered that I was exceeding the original's performance, I certainly wouldn't be shy about it.

davesnxOP5y ago

I'm not sure which "we" you're referring to.

Thanks for raising those concerns

ludamad5y ago

If we were at a board meeting deciding how to spend my time, no. If it is done? Why not

specialist5y ago

Batting practice.

We'd all be better off if plebes grew their skills by reimplementing common tools.

gkfasdfasdf5y ago· 4 in thread

Curious, any description as to why it's faster? Something intrinsic to Reason Native/OCaml? Architectural changes? Reduced feature set?

tyingq5y ago

Jq appears to have its own hand written json parser and requires flex/bison. I suspect something about the hand written parser is slow for large data sets.

I was somewhat surprised it didn't use an existing json parser library.

brundolf5y ago

I'm doubly surprised that such a popular utility uses bison; generated parsers tend to be slower than handwritten parsers, and JSON isn't exactly the world's hardest language to parse

dubcanada5y ago

Well it is missing a ton of jq functionality, it's possible that in that list is something causing the performance degradation.

masklinn5y ago

jq’s in C, so probably not anything intrinsic. Not to mention I don’t think the OCaml compiler is an optimisation beast.

YesThatTom25y ago· 4 in thread

Great! Now improve the syntax!

dividedbyzero5y ago

davesnxOP5y ago

Just because jq is very well established doesn't mean its APIs are well designed, or that we shouldn't improve on them because doing so would break existing scripts.

It has a lot of quirks in day-to-day use, and people struggle to learn such a great tool, so query-json will try to offer a better interface for users.

aasasd5y ago

Perhaps one of these might work for you: https://news.ycombinator.com/item?id=24470715

JamesSwift5y ago

I think the issue is not the syntax, it's the barebones docs with absolutely trivial examples.

ksmg5y ago· 3 in thread

Hm, I thought q is synonym for querying CSV files https://harelba.github.io/q/

redsaz5y ago

Same. When I saw the name "q" I thought of this same tool.

davesnxOP5y ago

Right, I found q cute... but I'm thinking of releasing a new version with the name query-json, or just changing the name altogether. Any suggestions? ^^

smabie5y ago

No, q is an array language

jamil75y ago· 3 in thread

As an outsider I get very confused by the Reason / Reason Native / OCaml / Bucklescript / Rescript?! ecosystem. What does it mean for it to be written in Reason Native/OCaml?

rashkov5y ago

That means it produces a native binary (for example, a .exe file on windows platforms), so ultimately you're aiming to run the program in a terminal. This is the normal way for OCaml to operate.

If instead you want to write a web-app which runs in a web browser or node.js, then you'd need to compile it to Javascript, which is what bucklescript helps you do.

jamil75y ago

Thank you for the detailed answer! I check in on the status of the related projects from time to time and was often confused by the relationship between the components.

davesnxOP5y ago

Right, the explanation of Reason - BuckleScript - OCaml is always nebulous.

I used Reason to compile to Native, so using OCaml's stdlib and OCaml's dependencies and compiling it with OCaml, but my source code is written in Reason syntax.

StavrosK5y ago· 3 in thread

This looks nice, but I was a bit dismayed at "friends don't let friends curl | bash, to install this run curl | bash".

konjin5y ago

It wasn't.

faitswulff5y ago

davesnxOP5y ago

There are 4 ways to install query-json.

Before doing any curl | bash, check what the install command does; that's the entire point of it.

andylynch5y ago· 2 in thread

This looks interesting, but could be confusing given the programming language of the same name (https://code.kx.com/q/)

_v7gu5y ago

Ah yes, the old ".j.k raze read0`" as a separate app

andylynch5y ago

I should definitely check how that compares on some big files here.

jonemi5y ago· 2 in thread

I'm sure other people have use cases where the browser wouldn't meet their needs, but for me, I find jq unnecessary.

choward5y ago

Writing a script? I'm not going to have my script open a web browser so I can attempt to interact with a web console.

jonemi5y ago

When it got to the point when I needed a script, I just preferred Python. I can understand how some might prefer jq and a shell script, I just realized it wasn't worth it for my particular needs.

muktabh5y ago· 2 in thread

davesnxOP5y ago

Menhir/sedlex and others are a pretty high barrier to entry for newcomers.

One of the nice things about all of it is the Discord: it's friendly and always helpful.

Hope it helps, just let me know if there's anything specific!

smabie5y ago

Just ditch Reason and use OCaml. There's a lot more documentation and the syntax is better.

nikolay5y ago· 2 in thread

JMESPath is the only viable alternative, which probably has a wider footprint than even jq as it's part of AWS CLI.

acdha5y ago

nikolay5y ago

It's embeddable and available as a library for all languages [0]. Everything else is pretty much nothing but a CLI tool, which further limits its adoption.

[0]: https://github.com/jmespath

as-j5y ago· 1 in thread

> The report shows that q is between 2x and 5x faster than jq in all operations tested and same speed (~1.1x) with huge files (> 100M).

While faster for some things... that's a pretty large set of caveats!

davesnxOP5y ago

Adding most of the jq operations shouldn't affect performance at all; in fact, if I end up implementing streaming, it could be even faster.

I have an issue for improving performance where I can push this forward: https://github.com/davesnx/query-json/issues/7

But sure, there are caveats!

riston5y ago· 1 in thread

Would be good if someone adds an explanation why this new approach is better, is it that the OCaml is faster, more efficient algorithms were used, etc?

davesnxOP5y ago

I tried to explain it on the Performance section and on the report

https://github.com/davesnx/query-json#performance https://github.com/davesnx/query-json/blob/master/benchmarks...

But none of the explanations are backed by evidence; they're just assumptions.

Ericson23145y ago· 1 in thread

This is funny because Stephen Dolan, the original jq author, works on OCaml itself.

davesnxOP5y ago

Exactly! I wanted to contact him

layoutIfNeeded5y ago· 1 in thread

Umm... There's already a language called Q for array processing.

davesnxOP5y ago

Will rename it to query-json. Thaaanks!

Borkdude5y ago· 1 in thread

If you're into Clojure, check out https://github.com/borkdude/jet

iLemming5y ago

RMPR5y ago· 1 in thread

Upcoming q-rs a rewrite of q in Rust :p

davesnxOP5y ago

I hope so!

skywhopper5y ago· 1 in thread

davesnxOP5y ago

Exactly, I didn't claim to be faster in all cases, since there's no feature parity and I don't intend to reach it.

For the set of operations that I implemented, it's faster, that's true.

brundolf5y ago· 1 in thread

I'd love to hear some speculation - from the author or otherwise - as to why a fresh OCaml implementation would so dramatically outperform a mature C implementation

davesnxOP5y ago

There are a few good assumptions about why it is faster, but they are just speculation, since I didn't profile jq or query-json.

The feature that I think penalizes jq a lot is "def functions", the ability to define any function and have it available at run-time.

This creates a few layers. One of the differences is the interpreter and the linker, which are responsible for gathering all the builtin functions and compiling them so they are ready to use at runtime.

The other pain point is the architecture of the operations in jq, since it's stack based. In query-json, operations are piped and recursive.

Aside from the code, in the OCaml stack, menhir has proven to be really fast for building this kind of compiler.

I will dig more into performance and try to profile both tools in order to improve mine.

Thanks
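To show what "piped recursive operations" could look like in general, here is a toy Python evaluator for a tiny jq-like expression tree. Every name in it (`Identity`, `Key`, `Iterate`, `Pipe`, `evaluate`) is invented for this sketch; it is not query-json's actual code and makes no claim about jq's internals.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Union


@dataclass
class Identity:
    """The `.` operation: pass the input through unchanged."""


@dataclass
class Key:
    """The `.name` operation: look up a key in an object."""
    name: str


@dataclass
class Iterate:
    """The `.[]` operation: emit each element of an array."""


@dataclass
class Pipe:
    """The `left | right` operation: feed each left output into right."""
    left: "Expr"
    right: "Expr"


Expr = Union[Identity, Key, Iterate, Pipe]


def evaluate(expr: Expr, value: Any) -> Iterator[Any]:
    """Recursively evaluate `expr` against `value`, yielding outputs."""
    if isinstance(expr, Identity):
        yield value
    elif isinstance(expr, Key):
        # Missing keys and non-objects yield None in this toy version.
        yield value.get(expr.name) if isinstance(value, dict) else None
    elif isinstance(expr, Iterate):
        yield from (value if isinstance(value, list) else [])
    elif isinstance(expr, Pipe):
        # Each intermediate result of the left side flows into the right.
        for intermediate in evaluate(expr.left, value):
            yield from evaluate(expr.right, intermediate)
    else:
        raise TypeError(f"unknown expression: {expr!r}")


if __name__ == "__main__":
    # Roughly `.items | .[] | .id` in jq notation.
    query = Pipe(Key("items"), Pipe(Iterate(), Key("id")))
    print(list(evaluate(query, {"items": [{"id": 1}, {"id": 2}]})))
```

The point of the sketch is only the shape: there is no explicit stack or bytecode, and a pipe is just nested recursion over the expression tree.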

tus885y ago· 1 in thread

Isn't JQ written in C? I doubt LISP is going to be faster.

davesnxOP5y ago

Yes, jq is written in C. Where does LISP come from?

yahyaheee5y ago· 1 in thread

I’m with Q!

davesnxOP5y ago

heycosmo5y ago

In case anyone is interested in yet another alternative, I have this old, unpolished project: https://github.com/bauerca/jv
