JC – JSONifies the output of many CLI tools (opens in new tab)

(kellyjonbrazil.github.io)

411 pointspgl3y ago132 comments

132 comments

See also:

* "Bringing the Unix philosophy to the 21st century (2019)" (https://blog.kellybrazil.com/2019/11/26/bringing-the-unix-ph...) - https://news.ycombinator.com/item?id=28266193 238 points | Aug 22, 2021 | 146 comments

* "Tips on adding JSON output to your CLI app" (https://blog.kellybrazil.com/2021/12/03/tips-on-adding-json-...) - https://news.ycombinator.com/item?id=29435786 183 points | 11 months ago | 110 comments

dotancohen3y ago

If you're bringing that up, then this is the place to spread the word about my dream of a stdmeta file descriptor.

https://unix.stackexchange.com/questions/197809/propose-addi...

Just like we have stdout and stderr, header lines such as those produced by `ps` should be printed to stdmeta. Curl is the worse offender here, outputing meta lines to stderr instead of stdout. A stdmeta file descriptor would make it clear what is data, what is an error, and what is _describing_ the data.

runlevel13y ago

That reminds me of something I've wanted for quite a while:

A ringbuffer filetype. Similar to a named pipe file (see: fifo(7)[^1]), but without consuming the contents on read and automatically rotating out the oldest lines.

Of course, there would be some complexities around handling read position as lines are being rotated out from under you.

[1]: https://linux.die.net/man/7/fifo

naikrovek3y ago

> Of course, there would be some complexities around handling read position as lines are being rotated out from under you.

that seems solvable to me, but multiple simultaneous readers... that seems like it might be a bit more challenging...

1 more reply

anonymoushn3y ago

If we're adding standard output fds, maybe it would be a useful time to define any mechanism for consumers of those fds to discern the total ordering of bytes written to the three of them.

dotancohen3y ago

WC has a "character" flag that really just counts bytes:

  $ echo דותן | wc -c
  9

Note that each letter is two bytes, and the newline is an additional byte. So you could pipe (or tree) to wc to count bytes. For a hypothetical stdmeta on fd 3, that might look like this (piping stdout to devnull):

  $ foo 3>&1 > /dev/null | wc -c

2 more replies

rjh293y ago

Any metadata mechanism should be extensible IMO. For example sort and join care about ordering but they also care about column names.

ducktective3y ago

Then pipe it into jq [1] to query parameters or build up a formatted string.

Or pipe it into rq [2] to convert the format to yaml, toml etc.

[1]: https://stedolan.github.io/jq/tutorial/ [2]: https://github.com/dflemstr/rq#format-support-status

asicsp3y ago

There's also https://github.com/TomWright/dasel (supports JSON, TOML, YAML, XML and CSV)

samuell3y ago

I would wish for jq to be a really generic tool for working with structured data on the commandline, but I have a really hard time figuring out how to do e.g. conditional-based editing etc. Can't get my head around that, and don't find any info about it on the net. Seems even something that just supports SQL (upon JSON) would be better in this regard.

kbrazil3y ago

Hi there - I'm the author of `jc`. I also created `jello`[0], which works just like `jq` but uses python syntax. I find `jq` is great for many things but sometimes more complex operations are easier for me to grok in python.

[0] https://github.com/kellyjonbrazil/jello

ddulaney3y ago

What do you mean by conditional-based editing? I've found its language to be pretty concise and readable, especially with the // operator.

samuell3y ago

Doing editing only on selected "rows" or items in a structure, based on the values of other attributes in them. Similar to what you specify in the WHERE clause in SQL-statements.

spinningslate3y ago

Didn't know about this, the HN dividend pays out again!

When wrestling with sed/awk in trying to parse results of a shell command, I've often thought that a shell-standard, structured outpout would be very handy. Powershell[0] has this, but it's a binary format - so not human-readable. I want something in the middle: human- and machine-readable. Without either having to do parsing gymnastics.

jc isn't quite that shell standard, but looks like it goes a long way towards it. And, of course, when JSON falls out of fashion and is replaced by <whatever>, `*c` can emerge to fill the gap. Nice.

[0]: https://learn.microsoft.com/en-us/powershell/

jerjerjer3y ago

> Powershell has this, but it's a binary format

Well, yes - powershell passes binary objects but as you can always:

1) access their properties 2) pass them downstream 3) serialize to json/csv 4) instantiate from json/csv

I think this is both human- and machine-readable enough (even through internal format is binary, but working with Powershell you are never really exposed to it).

How do you think it can be improved?

In my opinion object io IS the best part of powershell - it allows us to ditch results wrangling with sed/awk/grep entirely. I'm super interested if there's an even better way forward.

vbezhenar3y ago

Shell would benefit from Content-Type/Accept headers. Like you can specify that cat accepts text and jq accepts Json. Then `ip a` would output corresponding type automatically.

thesuperbigfrog3y ago

>> Shell would benefit from Content-Type/Accept headers. Like you can specify that cat accepts text and jq accepts Json. Then `ip a` would output corresponding type automatically.

That seems unnecessary. Traditionally, shells have always used text streams. JSON is just text that follows a given convention. Couldn't what you are describing be implemented by setting environment variables or using command line flags?

For example:

PREFERRED_OUTPUT_FORMAT="JSON"

--output-format="JSON"

--input-format="JSON"

Tools that can generate and consume structured text formats are a good idea, but they should be flexible enough that they can even work with other tools that have not been written yet.

"This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface." --Doug McIlroy

vbezhenar3y ago

I don't follow. JSON is not really readable. I don't want to see JSON output ever except for script debugging. I want to see well formatted output. But at the same time I want to be able to write something like

  ip a | filter "[].address like 192.*"

So when I'm typing `ip a` I expect to get output for human and when I'm piping it to `filter` program, I expect for those programs to exchange with JSON (and ideally `filter` should use some tabular formatting as its human-readable output).

You suggesting that I should write `PREFERRED_OUTPUT_FORMAT=JSON ip a | filter "[].address like 192.*"` but that's really verbose and error-prone. It might work for scripts, but for ad-hoc shell I don't like this approach. Ideally programs should be able to communicate between pipes for their preferred formats.

2 more replies

teddyh3y ago

“ip” has the -json option; i.e. “ip -json a” gives you straight JSON; no need for JC.

mprovost3y ago

Someone on this site suggested that programs open another filehandle along with stdout and stderr (stdjson) for their json output which struck me as a way to make this work in a backwards-compatible fashion.

dotancohen3y ago

Even just a generic stdmeta would go a long way to defining _what_ is being output. Curl is the worst about this.

https://unix.stackexchange.com/questions/197809/propose-addi...

brodo3y ago

Nushell has this too. I‘ve tried is as daily driver for a while. It‘s not there yet, but almost. After it hits 1.0 I‘m going to switch for good and leave the duck tape solutions behind.

hnlmorg3y ago

Shameless plug but maybe give murex a look. That’s been stable for a while now and does the same thing too.

HTTPS://GitHub.com/lmorg/murex

vesinisa3y ago

Seems to be part of the Arch repository: https://archlinux.org/packages/community/any/jc/

Why does this site recommend using "paru", "aura" or "yay" to install it on Arch? I have been using Arch for a decade or so but have never even heard of such tools. They don't even have pages in the Arch wiki, and only yay ("Pacman wrapper and AUR helper written in go") is available via the standard repository.

Begs the question: what is so wrong with plain pacman?

EDIT: Okay so seems they were previously on AUR and once accepted to community repository they just forgot to stop recommending an AUR wrapper for installing: https://github.com/kellyjonbrazil/jc/commit/f2dd7b8815edc92e...

EDIT2: Created a PR with the GitHub.dev editor .. Absolutely blown away by how easy it was! Feels like the future of development.. https://github.com/kellyjonbrazil/jc/pull/310

bennyp1013y ago

Looking at github, it was only moved from AUR in May this year, but the commit only changed the description, not the commands.

But this is 100% going in my toolbox - I can think of a couple of scripts that I can update to use this right of the bat!

tazjin3y ago

It's a program to wrap stable CLI programs, makes sense that they recommend a wrapper around a stable program like pacman

kangalioo3y ago

Is this a snarky comment implying that the CLI programs and pacman are better left unwrapped?

dmoura3y ago

This is great!

I am the author of SPyQL [1]. Combining JC with SPyQL you can easily query the json output and run python commands on top of it from the command-line :-) You can do aggregations and so forth in a much simpler and intuitive way than with jq.

I just wrote a blogpost [2] that illustrates it. It is more focused on CSV, but the commands would be the same if you were working with JSON.

[1] https://github.com/dcmoura/spyql [2] https://danielcmoura.com/blog/2022/spyql-cell-towers/

sbt5673y ago

Wow, this looks super useful. Will definitely check this out. Thanks

garfieldnate3y ago

One of the features of Julia that I really like and wish would be adopted elsewhere is that the stringification methods take a context object that can indicate the string is to be presented in: plaintext, ansii-colored terminal, json, HTML, LaTeX, GraphViz, etc. It would have been amazing to have this kind of flexibility considered in the Unix tools.

That said, I would argue that JSONLines is a better universal output format when you're dealing with pipelines. If the output is one giant JSON array, then you have to wait for a long-running program to finish completely before the output can be passed on to the next long-running program. If you output one JSON line at a time as you process your input, then the next program in the pipeline can get started on processing already without waiting for the first to finish completely.

kbrazil3y ago

Good news - `jc` supports JSONLines output for many commands and file-types, too!

enriquto3y ago

In the opposite direction, if you want to recover a human-editable text stream from json data, you run gron.

asicsp3y ago

Link: https://github.com/tomnomnom/gron

Discussion: https://news.ycombinator.com/item?id=25006277 366 points | Nov 6, 2020 | 91 comments

zikduruqe3y ago

Came looking for gron recommendations.

pglOP3y ago

Command line output as JSON. Very handy and has parsers for tons of common utilities, written in Python.

Blog post with examples here: https://blog.kellybrazil.com/2020/08/30/parsing-command-outp...

anecdotal13y ago

Can you trust it? Cli tool output is not exactly stable. I thought that's why libxo exists?

https://github.com/Juniper/libxo

gorkish3y ago

Another vote for something like libXo as the better solution. This thing is just passing the parsing problem of interacting with an unstable API to someone else and hoping they maintain theirs better than you would maintain yours.

rob743y ago

Great idea, but sounds like a maintenance nightmare to me. Not only that many users will complain that their favorite CLI tool isn't supported, but also a new release of any of the supported CLI tools might break the support without any kind of warning, as I don't think changes to the (human-readable) output are considered major changes.

slightwinder3y ago

It doesn't even need a new release. jc can already fail because of details like the system-language. With my local language, on a simple output of ls -l, it's parsing

    {"filename":"drwxr-xr-x  16 root root    4096 Oct  4 11:21 ."}

instead of

    {"filename":".","flags":"drwxr-xr-x","links":16,"owner":"root","group":"root","size":4096,"date":"Oct 4 11:21"}

with LANG=US. This makes it really hard to trust such a tool.

kbrazil3y ago

Hi there! `jc` author here. Yes, it is a documented caveat[0] that the `C` or `en_US.UTF-8` locales should be used for best results.

It's not unheard of for tools to require `C` locale for proper parsing:

    $ LC_ALL=C ls -l | jc --ls

This is one of many inherent issues with using unstructured text as an API. That's why I believe there should be a JSON (or at least some other widely used format[1]) option for tools that have output that would be useful in scripts.

[0] https://github.com/kellyjonbrazil/jc#locale

[1] formats should have good library support across many languages and nice filter/query capabilities from the command-line

simonkagedal3y ago

This is true. However, the issue is with the tool outputting unstructured data to begin with. As the `jc` author says, the best would be if tools supported structured output formats to begin with.

Any custom parser of ls output would potentially have the same problem. Of course, it can be improved though – for example by looking at LANG – and it would be nice for such improvements to get into `jc`, so that other tools can rely on it at least more than doing the parsing directly themselves.

vidarh3y ago

Changes to the human-readable output of most standard Unix tools are a big deal given the amount of scripts which depends on them. They commonly are seen as fairly stable to the point that even some non-Posix tools like e.g. apt, which doesn't have a stable CLI warns about it to stderr if you try to redirect it's output to a pipe[1].

If anything, though, that's a good reason for a tool like this to exist rather than have every script that depends on these tools use their own, often hacky, parsing of the output.

[1] "WARNING: apt does not have a stable CLI interface. Use with caution in scripts."

HyperSane3y ago

The greatest thing about powershell is that all commands returns structured output like this.

WorldMaker3y ago

Also, this tool is useful for using old unix commands in powershell: `jc … | ConvertFrom-Json` easily gives you powershell objects to work with.

HyperSane3y ago

That is a great idea!

otikik3y ago

This is a good step in the right direction, which IMHO should be that all those cli tools should have a —-json flag.

nixcraft3y ago

Last year I looked for JSON output with the dig command on Linux but found the yaml option while reading the man page. It was handy. I always wondered why yaml/JSON output is not standard with Linux and Unix utilities for scripting needs. Anyway:

     dig +yaml google.com

vidarh3y ago

Because neither yaml or JSON existed when most of these tools were written, and then people got used to the current outputs, and so it's very varying whether or not anyone has felt the pain enough to decide it's worth adding the option for another output format.

khiqxj3y ago

There are at least 3 ways this can create bugs:

  - It has to parse output of commands which may or may not be intended to be parsed and may or may not have a predictable format. The only way to overcome this is if this program becomes one of the Big Four "UN*X command output -> data" converters

  - It casts things to "float/int"

  - Depending on who made this library, the output itself may not be strict / predictable. Perhaps it will output JSON with two different key names in two different scenarios.

And don't forget that any of these issues will still come up even if they are accommodated for, due to versions of programs changing without the author of this tool knowing.

But yeah basic things having intrinsic shortcomings is a given, when you're using UN*X.

From the nested article:

> Had JSON been around when I was born in the 1970’s Ken Thompson and Dennis Ritchie may very well have embraced it as a recommended output format to help programs “do one thing well” in a pipeline.

They had S-expressions and plenty more options. They also could have just made a format as you can tell with thousands of ad-hoc trendy new formats like YAML and TOML being spewed out every recent year now that programmers discovered data structures.

kbrazil3y ago

Hi - jc author here. Just a couple points:

The casting to int/float is not done unless the underlying values are predictably documented to be numbers. There are rare cases where auto int/float conversions are done, but:

1) This is always documented, and

2) You can turn this functionality off via the —raw flag

Also, predictable schemas are documented with each parser. (e.g `jc -h —-arp`)

jwilk3y ago

> Big Four "UN*X command output -> data" converters

What do you mean?

khiqxj3y ago

I mean that there would be a big 4 such converters if people actually invested time into these "front ends" to CLIs.

fabianthomas3y ago

I'm really concerned about performance/latency with the json approach. I couldn't find anyone discussing that yet completely.

Even when using jq (written in C) my quick tests show that parsing json is really slow compared to parsing with simple unix tools like awk. I suspect that to come from the fact that the parser has to check the full json output first in order to print a result, while awk does not care about syntax of the output.

I compared two shell scripts both printing the ifindex of some network device (that is the integer in the first column) 10 times.

Using awk and head combined gives me 0,068s total time measured with the time command.

Using ip with the -j flag together with jq gives 0,411s.

Therefore the awk approach is 6 times faster. And here I used a binary (ip) that already supports json output and doesn't even need the mentioned jc.

While this whole test setup is somewhat arbitrary I experienced similar results in the past when writing shell scripts for, e.g., my panel. Reach out to me if you are interested in my test setup.

uvesten3y ago

As someone who uses `jq` almost daily, this looks like a great tool I didn’t know I needed. Thanks for the tip!

mg3y ago

    dig example.com | jc --dig

Seems a bit redundant. Maybe it should be the other way round?

    jc dig example.com

Similar to how you do

    time dig example.com

DougBTX3y ago

Yes, that works, there are examples on the page like:

    $ jc dig example.com | jq -r '.[].answer[].data'
    93.184.216.34

It uses the first argument to infer the command output type.

chii3y ago

`time` needs to be the one to exec the command because they need to know when the command they are timing starts. Therefore, `time` cannot use pipes, as by then, the command being timed would've already started!

`jc` doesn't need to know anything about the command producing the output - just the format of the output. So using a pipe and stdin makes a lot of sense.

berkes3y ago

Is "redundancy" that you have to provide --dig?

I can imagine `jc` having some detection built in, from which it determines the command/content it's being parsed. Doesn't seem to have it, yet, and I'm generally no big fan of "magic" like this, but it would remove the redundancy.

Having it as a pipe, allows for much more, though.

   some_expensive_command > out.log
   jc --expensive-cmd < out.log

   hourly_dig.sh > example_com_records_$(date +%F+%s)
   cat example_com_records_* | jc --dig

kbrazil3y ago

Hi there - author of `jc` here. I originally intended to have auto-detection but put that on the backburner to focus on creating parsers, especially after introducing the magic syntax.

I did implement auto-detection for `/proc` file parsers so you can just do:

    $ cat /proc/foo | jc --proc

    $ jc /proc/foo

But you can specify each procfile parser directly if you want to as well.

pastage3y ago

I Love using this in streams of data, but there is a lot to be said about the pit falls. Some of them; first you might want to filter with grep first and that leads to missing metadata, second error handling is a good thing but people tend to ignore errors and then not handle them. This is basically what filebeat/logstash from Elastic does, which is a beast at parsing (and impossible to use from command line).

The power of plain text pipes is that you do not interpret them and that makes them fast, that is usefull because you handle both 100 bytes, 1MB and 1TB as input. You choose what you parse keeping it simple, fast and usually error free. This tool miss the, fast, simple and human readable part of debugging pipes. Which is fine!

naikrovek3y ago

This seems very fragile, to me, without support from the application whose output is converted to JSON.

minor updates to command-line tools can and do subtly alter the textual output of the tool, and the outputs of these tools are not standardized.

This is a step towards "objects passing messages" as originally conceived by Alan Kay, if my incomplete understanding of what he's said is correct, and that's a good thing, I think. Objects passing messages around is a very solid model for computing, to me. Note that I am stupid and don't understand much, if I'm honest.

synergy203y ago

to make this great tool truly universal, it has to be written in c instead of python these days, then provide python|javascript|etc bindings if possible.

I'd like to use it on embedded systems, where python is too large to fit. this tool can be widely deployed just like awk|sed|etc but it has to be in C for that.

makapuf3y ago

Agreed but I would change it with "any compiled language that has no external runtime" and common shared libs dependencies. I don't care that a utility is written in Go, Zig, C++ or Pascal ;)

kbrazil3y ago

`jc` is available today as a compiled binary with all dependencies self-contained. The binaries and OS package installers are available under the GitHub Releases:

https://github.com/kellyjonbrazil/jc/releases

This is still python under the hood and not as small of a binary as I would like, but it does work.

ElectricalUnion3y ago

But python is also universal? You can just bundle jc inside https://github.com/jart/cosmopolitan/tree/master/third_party... and it's as universal as it gets.

maxbond3y ago

Seems to me like anywhere Python is too bulky to work, a layer serializing and deserializing JSON at every pipeline step is likely to exhaust your memory too.

montroser3y ago

I would welcome POSIX shell or at least bash as an alternative to a compiled C/Rust/Go binary.

But yeah, for these types of utilities, relying on an external language runtime like Python/Node is pretty rough.

bitwize3y ago

Replace C with Rust.

synergy203y ago

on embedded rust is still much larger than c because rust links its stdlib statically.

bitwize3y ago

There is such a thing as no-stdlib Rust deployments.

1 more reply

codedokode3y ago

I think that JSON is a bad choice here.

It is obvious that CLI commands should produce machine-readable output because they are often used in scripts, and accept machine-readable input as well. Using arbitrary text output was a mistake because it is difficult to parse, especially when spaces and non-ASCII characters are present.

A good choice would be a format that is easily parsed by programs but still readable by the user. JSON is a bad choice here because it is hard to read.

In my opinion, something formatted with pipes, quotes and spaces would be better:

    eth0:
      ip: 127.15.34.23
      flags: BROADCAST|UNICAST
      mtu: 1500
      name: """"Gigabit" by Network Interfaces Inc."""

Note that the format I have proposed here is machine-readable, somewhat human-readable and somewhat parseable by line-oriented tools like grep. Therefore there might be no need for switches to choose output format. It is also relatively easy to produce without any libraries.

Regarding idea to output data in /proc or /sys in JSON format, I think this is wrong as well. This would mean that reading data about multiple processes would require lot of formatting and parsing JSON. Instead or parsing /proc and /sys directly, applications should use libraries distributed with kernel, and reading the data directly should be discouraged. Because currently /proc and /sys are just a kind of undocumented API.

Also, I wanted to note that I dislike jq utility. Instead of using JSONPath it uses some proprietary query format that I constantly fail to remember.

feanaro3y ago

Some alternative ideas for making JSON more readable:

- Pipe into gron (https://github.com/tomnomnom/gron) to get a `foo.bar.baz = val` kind of syntax.

- Pipe into visidata (https://www.visidata.org/) to get a spreadsheet-like editable view.

kbrazil3y ago

Hi there - `jc` author here. `jc` can also output in YAML format with the `-y` flag. It is fairly trivial to add other options in the future since `jc` just turns the text into objects which can be serialized to many different formats.

For example:

    % jc -y date
    ---
    year: 2022
    month: Nov
    month_num: 11
    day: 3
    weekday: Thu
    weekday_num: 4
    hour: 9
    hour_24: 9
    minute: 0
    second: 22
    period: AM
    timezone: PDT
    utc_offset:
    day_of_year: 307
    week_of_year: 44
    iso: '2022-11-03T09:00:22'
    epoch: 1667491222
    epoch_utc:
    timezone_aware: false

codedokode3y ago

Great, by I would prefer to avoid YAML because it is very complicated and difficult to parse.

bonzini3y ago

> In my opinion, something formatted with pipes, quotes and spaces would be better:

Just pipe it into a JSON-to-YAML script like this:

    #! /usr/bin/python3
    from ruamel import yaml
    import json, sys, io
    print(yaml.dump(json.load(sys.stdin)))

psychoslave3y ago

Following the project documentation, you easily come to:

  jc dig example.com | jq       
  [
    {
      "id": 30081,
      "opcode": "QUERY",
      "status": "NOERROR",
      "flags": [
        "qr",
        "rd",
        "ra"
      ],
      "query_num": 1,
      "answer_num": 1,
      "authority_num": 0,
      "additional_num": 1,
      "opt_pseudosection": {
        "edns": {
          "version": 0,
          "flags": [],
          "udp": 4096
        }
      },
      "question": {
        "name": "example.com.",
        "class": "IN",
        "type": "A"
      },
      "answer": [
        {
          "name": "example.com.",
          "class": "IN",
          "type": "A",
          "ttl": 56151,
          "data": "93.184.216.34"
        }
      ],
      "query_time": 0,
      "server": "192.168.1.254#53(192.168.1.254)",
      "when": "Thu Nov 03 14:06:40 CET 2022",
      "rcvd": 56,
      "when_epoch": 1667480800,
      "when_epoch_utc": null
    }
  ]

Rather readable to my mind. And you can rather easily transform it to your preferred human readable output format I guess.

codedokode3y ago

For me there are too many quotes and brackets. My proposed format can also be converted to JSON if necessary.

kbrazil3y ago

Also note that you are looking at plaintext output here. By default `jc` and other JSON filtering tools do syntax highlighting when outputting to the terminal so it's actually quite easy to read JSON these days.

WorldMaker3y ago

> A good choice would be a format that is easily parsed by programs but still readable by the user.

I think the powershell approach is a good one here too: powershell commands output binary streams of objects rather than text and it is powershell itself that has several standard ways of human readable outputs, most of which are automatic (but easily tweaked with an extra pipe or two). Standard human readable forms are nice, and even standardized there's no need to rely on parsing them back out into objects because they are already passed as objects so they can focus a bit more on "pretty" over "parse-able" (such as including human useful things like ellisions `…` on long columns).

DonHopkins3y ago

Not binary streams of serialized objects, but arrays of pointers to live objects in one process's memory. No serialization / deserialization, binary or text, or piping between processes, just passing pointers to live objects between cmdlets in the same address space. That's quite different and vastly more efficient than serializing and deserializing text or binary data between every step in different processes connected by pipes.

https://en.wikipedia.org/wiki/PowerShell#Pipeline

>As with Unix pipelines, PowerShell pipelines can construct complex commands, using the | operator to connect stages. However, the PowerShell pipeline differs from Unix pipelines in that stages execute within the PowerShell runtime rather than as a set of processes coordinated by the operating system. Additionally, structured .NET objects, rather than byte streams, are passed from one stage to the next. Using objects and executing stages within the PowerShell runtime eliminates the need to serialize data structures, or to extract them by explicitly parsing text output. An object can also encapsulate certain functions that work on the contained data, which become available to the recipient command for use. For the last cmdlet in a pipeline, PowerShell automatically pipes its output object to the Out-Default cmdlet, which transforms the objects into a stream of format objects and then renders those to the screen.

zokier3y ago

The thing to understand with PowerShell is that the way it pipelines objects is enabled by the fact that it is all happening in-process within one .net runtime. It is significantly more difficult to achieve anything similar with several independent processes being piped together

WorldMaker3y ago

Well, yes, powershell takes some shortcuts and has the advantage that .NET has a strong object system.

If you were to build it from scratch with the idea of "shared nothing" applications similar to the unix model with text files, it's not that much more difficult with just about any sort of object or message broker. You could easily imagine a world with a dbus based "REPL"/shell, for instance. Or a different approach easily imaginable if you still want to focus on unix-style streams/files between processes would be something like BSON streams (thought it would still have some serialization/deserialization overhead).

rjzzleep3y ago

I also dislike jq, but this is a bit of a non issue IMHO. You could in theory add any kind of output transformer in theory. The codebase doesn't seem to be optimized for that yet, but it should be trivial to add.

onion2k3y ago

If you use something other than JSON you'd have to wait until every app you want to use chooses to update to support your preferred format. That might take a while. Wouldn't it be better to use JSON for the output as that's an acceptable input to lots and lots of applications already, and if you want to read the output just pass it to an app that converts from JSON to "something formatted with pipes, quotes and spaces".

codedokode3y ago

On the other hand, if machine-readable format is adopted, it will be used for many years or even decades. So instead of making a quick hack that everybody will regret later it might be better to spend some time comparing different options.

ape43y ago

JSON is the lazy choice. I particularly dislike quoting keys (variable names). Relaxed JSON, for one, allows unquoted keys http://www.relaxedjson.org But that's just a small step - I am sure we (the community) could do better.

thesuperbigfrog3y ago

>> something formatted with pipes, quotes and spaces would be better

How well would this format handle deeply nested structures? It seems like it would require a lot of space characters compared to nesting open and close characters: {} or () or []

How would escaping pipes, quotes, and spaces work to represent those character literals?

There are already numerous structured text formats: JSON, XML, S-expressions, YAML, TOML, EDN, and many more. Wouldn't this be yet another format? (https://xkcd.com/927/)

djedr3y ago

Dare I suggest Jevko[0] as yet another alternative?

  eth0 [
    ip [127.15.34.23]
    flags [[BROADCAST][UNICAST]]
    mtu [1500]
    name ["Gigabit" by Network Interfaces Inc.]
  ]

This is one of the things it was designed with in mind.

It's even simpler and more flexible than S-expressions.

Handles deeply nested structures perfectly well. Has only 3 characters to escape (brackets and the escape character).

(I am the author)

[0] https://jevko.org/

artemisart3y ago

How do you differentiate types with jevko (numbers, strings, boolean)? Your examples on jevko.org appear lossy as they encode in the same way things that are different in JSON and I don't know how you would then differentiate between true and "true", 27 and "27", etc.

1 more reply

Too3y ago

While I can appreciate the immediate value this tool brings for some applications.

At the same time it’s an epitome of everything that is wrong with current software landscape. Instead of fixing the deficiencies in upstream, once and for all, we just keep piling more and more layers on top.

Programs having structural data and APIs inside, that get translated into human representation - only to be re-parsed again into structural form. What could possibly go wrong?

If you are already in any programming environment, many of the tools already have better built in APIs. I mean who needs an “ls” or timestamp parser. Just use os.listdir or equivalent. As someone previously pointed out in this thread, the ls parser is in fact already broken, unsurprisingly. Mixing tools made for interactive use in automation is never a good idea.

The Unix philosophy sounds romantic in theory, but need structural data, throughout, to work reliably in practice. Kids, go with the underlying apis unless your tool has structured output.

pessimizer3y ago

I don't know why I never considered this sort of option to powershell out bash. My only problem with it is that it's in python (also why I don't really use jq), and that it's not something that just sets aliases behind the scenes.

If this were written in a performant language, if it simply aliased (i.e. invisibly) all common cli commands to a wrapper which would obviate the need for all of the text processing between steps in command pipelines, if it were versioned and I could include the version number in scripts, and finally if I could run versioned scripts through it to compile them into standard bash scripts (a big ask), I'd give it a 3 month test starting today. There'd be nothing to lose.

Just putting that out there for people who like to rewrite things in Rust. A slightly different version of this concept could allow for nearly friction-free adoption.

cstrahan3y ago

> also why I don't really use jq)

You also don’t like software written in C (the language jq is written in)?

freedomben3y ago

Why is the python an issue for you? I also dislike and avoid tools written in python, unless they have an rpm available. If there's a package then I don't so much care what language it's in.

esskay3y ago

Been using this a while to pull out a bunch of server stats for a monitoring dashboard and it's been great. I just wish there were more supported services. The total mess of different outputs and config types for Linux packages is extremely annoying to deal with.

kbrazil3y ago

Feel free to open issues on GitHub if you would like to recommend parsers! Also, the latest version of jc supports parsing of /proc files, which may also help.

esskay3y ago

Thanks, certainly will do :) Wasn't aware that proc was now supported!

prima-facie3y ago

Unrelated tool but also very useful in combination with JQ: gron

https://github.com/tomnomnom/gron

hinkley3y ago

I discovered process.send() in Node a couple years ago and it made the decision to fork a child process a lot easier. No need to sanitize command line output when you can do direct IPC over a connection that uses JSON under the hood.

The itch I can’t seem to scratch is how to run tasks in parallel and have logs that are legible to coworkers. We do JSON formatted logs in production and I’m wondering if something like this would help solve that set of problems.

wodenokoto3y ago

Something like this is what I've been thinking is the path forward to non-posix shells and a way to get away from cumbersome foot-gunny bash.

Nushell that hit the front page earlier this week seemed to me to be limited by "compatible" apps, but wrapping all the big ones in a json converter superficially seems like a great solution to me.

menjaprunes3y ago

I stumbled upon this few weeks ago to parse the output of nmcli and now it is in my toolset for scripting

I love it

fimdomeio3y ago

There's a clash of names between this jc and autojump (https://github.com/wting/autojump) jc (jump to child)

friendzis3y ago

Sounds a lot like Powershell, which returns rich objects instead of a text blob

WorldMaker3y ago

Yup, powershell is mentioned in the references. Also, this tool can be used to bridge older unix commands into powershell objects (ConvertFrom-Json).

exabrial3y ago

Have a parseable output is great. What would be more incredible is to have a parseable output with a schema definition and/or formal grammar of some sort.

UncleEntity3y ago

> What would be more incredible is to have a parseable output with a schema definition and/or formal grammar of some sort.

I’ve been trying to build something like this but simply don’t have the free time currently.

The plan: adapt the parser VM from lpeg (or similar, there’s a paper I’ve been reading on an Earley parser VM) into a command line app that takes a grammar + text input (or stdin) and spits out json to a file (or stdout). Probably not as general purpose as this one but also wouldn’t need a pull request to add a new format.

All the pieces are there but without the free time…

I was actually curious if there was any demand for such a thing, I just want it to parse my payroll statements because this billion dollar company can only manage crappy pdfs and, well, it’s an interesting problem.

—edit—

Oh, output schema. Totally different than what I’m going on about.

vincnetas3y ago

Schemas are provided on linked page near definitions of parsers.

kbrazil3y ago

You can also run `jc -h --dig` to get the parser details that include the schema.

Having true JSON Schema[0] is being considered, but on the back-burner due to the sheer number of parsers to build schemas for. Also, it is more difficult to accurately define the schema for a small subset of parsers since their command output are so variable.

[0] https://json-schema.org/

psadri3y ago

Is anyone aware of something similar but for the args? A database that maps a command to a schema for all it’s possible cli arguments?

sindoc3y ago

OMG, I was looking for exactly something like this. I’d like to create immersion between Logseq and CLI. This can very useful.

visarga3y ago

love it, json could be the best way to pipe data between processes because it is text, but structured

drunkpotato3y ago

So is xml. Json is relatively unstructured compared to other formats.

I think json has several advantages though. It’s a relatively lightweight and widely known serialization standard, rich enough for most cases and extensible in others, and it has easy to use parsers in all major programming languages.

Also jsonlines is a simple addition that make it easy for json to play well with non-json aware older Unix tools.

It has a few shortcomings but I think its advantages outweigh them, and it’s become a pretty widely used standard in a short time.

tgv3y ago

I just learned about gron (elsewhere in the comment section: https://news.ycombinator.com/item?id=33448471). That overcomes some of jsonlines problems.

deafpolygon3y ago

Bringing a powershell feature to the Unix CLI.

Kipters3y ago

Since it's now cross platform, technically powershell brought powershell features to Unix first

berkes3y ago

A slightly related pet-peeve: I don't like it when "random" commands "squat" the two-letter domain, or worse, the one-letter domain: t, jq, jc etc.

In my perfect world (which, obviously doesn't exist), commands from tools "in the wild" are at least three letters long. With historical exceptions for gnutools: preferably they'd take the three-letter space, but two-letters (cd, ls, rm etc) is fine.

Two letter space outside of gnutools, is then reserved for my aliases. If jsonquery is too long to type, AND I lack autocomplete, then an alias is easy and fast to make. alias jq=jsonquery.

In the case of this tool, it will conflict with a specialised alias I have: `alias jc=javac -Werror`. Easy to solve by me with another alias, but a practical example of why I dislike tools "squatting" the two letter namespace.

artemisart3y ago

I don't believe it's really an issue in practice, your `jc` alias will just take the priority and you can easily add one for jsonquery -> `path/to/bin/jc`. I think good short command names can help adoption (like for ripgrep, fd) but it's true that we should have a race to squat all the 2 letter names.

Sharparam3y ago

Instead of typing the whole path, you can also use `command` or a backslash to bypass aliases

E.g.:

    alias jsonquery='command jq'

ycombobreaker3y ago

I perceive an element of hubris when a tool claims a two-character name. That is a very small namespace, so the tool is effectively staking a claim on mental or emotional real-estate.

Too3y ago

Given this tool is likely to be used in a bunch of bash scripts, collisions with aliases can be a big problem.

WorldMaker3y ago

This seems a general symptom of unix/POSIX command naming where commands have "always" been short names, and commands are expected to be short names.

It's something I appreciate about the powershell naming conventions. A lot of people mock the verbosity of the names of powershell commands and commandlets which require the "proper" name to be Verb-Noun qebab case monstrosities, but this was chosen for exactly the reasons of your pet peeve: short command names should be user aliases for work in a shell, and longer command names are great for avoiding namespace clashes in scripts and between users. The verbs and nouns create large (discoverable) namespaces.

For instance, this tool might be powershell named ConvertTo-ParsedJson. (ConvertTo-Json is an out of the box command that converts any powershell object to JSON.) It might suggest the user alias it by adding `Set-Alias -Name jc -Value ConvertTo-ParsedJson` but generally commands in powershell only offer such aliases as suggestions rather than defaults. (Though there are a lot of out of the box aliases for common powershell commands.)

It makes sense to me that powershell encourages long names first and allows and encourages users to have far more control over short aliases.

j / k navigate · click thread line to collapse

132 comments

asicsp3y ago

See also:

dotancohen3y ago

If you're bringing that up, then this is the place to spread the word about my dream of a stdmeta file descriptor.

https://unix.stackexchange.com/questions/197809/propose-addi...

runlevel13y ago

That reminds me of something I've wanted for quite a while:

A ringbuffer filetype. Similar to a named pipe file (see: fifo(7)[^1]), but without consuming the contents on read and automatically rotating out the oldest lines.

Of course, there would be some complexities around handling read position as lines are being rotated out from under you.

[1]: https://linux.die.net/man/7/fifo

naikrovek3y ago

> Of course, there would be some complexities around handling read position as lines are being rotated out from under you.

that seems solvable to me, but multiple simultaneous readers... that seems like it might be a bit more challenging...

1 more reply

anonymoushn3y ago

If we're adding standard output fds, maybe it would be a useful time to define any mechanism for consumers of those fds to discern the total ordering of bytes written to the three of them.

dotancohen3y ago

WC has a "character" flag that really just counts bytes:

  $ echo דותן | wc -c
  9

  $ foo 3>&1 > /dev/null | wc -c

2 more replies

rjh293y ago

Any metadata mechanism should be extensible IMO. For example sort and join care about ordering but they also care about column names.

ducktective3y ago

Then pipe it into jq [1] to query parameters or build up a formatted string.

Or pipe it into rq [2] to convert the format to yaml, toml etc.

[1]: https://stedolan.github.io/jq/tutorial/ [2]: https://github.com/dflemstr/rq#format-support-status

asicsp3y ago

There's also https://github.com/TomWright/dasel (supports JSON, TOML, YAML, XML and CSV)

samuell3y ago

kbrazil3y ago

[0] https://github.com/kellyjonbrazil/jello

ddulaney3y ago

What do you mean by conditional-based editing? I've found its language to be pretty concise and readable, especially with the // operator.

samuell3y ago

Doing editing only on selected "rows" or items in a structure, based on the values of other attributes in them. Similar to what you specify in the WHERE clause in SQL-statements.

spinningslate3y ago

Didn't know about this, the HN dividend pays out again!

jc isn't quite that shell standard, but looks like it goes a long way towards it. And, of course, when JSON falls out of fashion and is replaced by <whatever>, `*c` can emerge to fill the gap. Nice.

[0]: https://learn.microsoft.com/en-us/powershell/

jerjerjer3y ago

> Powershell has this, but it's a binary format

Well, yes - powershell passes binary objects but as you can always:

1) access their properties 2) pass them downstream 3) serialize to json/csv 4) instantiate from json/csv

I think this is both human- and machine-readable enough (even through internal format is binary, but working with Powershell you are never really exposed to it).

How do you think it can be improved?

In my opinion object io IS the best part of powershell - it allows us to ditch results wrangling with sed/awk/grep entirely. I'm super interested if there's an even better way forward.

vbezhenar3y ago

Shell would benefit from Content-Type/Accept headers. Like you can specify that cat accepts text and jq accepts Json. Then `ip a` would output corresponding type automatically.

thesuperbigfrog3y ago

>> Shell would benefit from Content-Type/Accept headers. Like you can specify that cat accepts text and jq accepts Json. Then `ip a` would output corresponding type automatically.

For example:

PREFERRED_OUTPUT_FORMAT="JSON"

--output-format="JSON"

--input-format="JSON"

Tools that can generate and consume structured text formats are a good idea, but they should be flexible enough that they can even work with other tools that have not been written yet.

vbezhenar3y ago

  ip a | filter "[].address like 192.*"

2 more replies

teddyh3y ago

“ip” has the -json option; i.e. “ip -json a” gives you straight JSON; no need for JC.

mprovost3y ago

dotancohen3y ago

Even just a generic stdmeta would go a long way to defining _what_ is being output. Curl is the worst about this.

https://unix.stackexchange.com/questions/197809/propose-addi...

brodo3y ago

Nushell has this too. I‘ve tried is as daily driver for a while. It‘s not there yet, but almost. After it hits 1.0 I‘m going to switch for good and leave the duck tape solutions behind.

hnlmorg3y ago

Shameless plug but maybe give murex a look. That’s been stable for a while now and does the same thing too.

HTTPS://GitHub.com/lmorg/murex

vesinisa3y ago

Seems to be part of the Arch repository: https://archlinux.org/packages/community/any/jc/

Begs the question: what is so wrong with plain pacman?

EDIT2: Created a PR with the GitHub.dev editor .. Absolutely blown away by how easy it was! Feels like the future of development.. https://github.com/kellyjonbrazil/jc/pull/310

bennyp1013y ago

Looking at github, it was only moved from AUR in May this year, but the commit only changed the description, not the commands.

But this is 100% going in my toolbox - I can think of a couple of scripts that I can update to use this right of the bat!

tazjin3y ago

It's a program to wrap stable CLI programs, makes sense that they recommend a wrapper around a stable program like pacman

kangalioo3y ago

Is this a snarky comment implying that the CLI programs and pacman are better left unwrapped?

dmoura3y ago

This is great!

I just wrote a blogpost [2] that illustrates it. It is more focused on CSV, but the commands would be the same if you were working with JSON.

[1] https://github.com/dcmoura/spyql [2] https://danielcmoura.com/blog/2022/spyql-cell-towers/

sbt5673y ago

Wow, this looks super useful. Will definitely check this out. Thanks

garfieldnate3y ago

kbrazil3y ago

Good news - `jc` supports JSONLines output for many commands and file-types, too!

enriquto3y ago

In the opposite direction, if you want to recover a human-editable text stream from json data, you run gron.

asicsp3y ago

Link: https://github.com/tomnomnom/gron

Discussion: https://news.ycombinator.com/item?id=25006277 366 points | Nov 6, 2020 | 91 comments

zikduruqe3y ago

Came looking for gron recommendations.

pglOP3y ago

Command line output as JSON. Very handy and has parsers for tons of common utilities, written in Python.

Blog post with examples here: https://blog.kellybrazil.com/2020/08/30/parsing-command-outp...

anecdotal13y ago

Can you trust it? Cli tool output is not exactly stable. I thought that's why libxo exists?

https://github.com/Juniper/libxo

gorkish3y ago

rob743y ago

slightwinder3y ago

It doesn't even need a new release. jc can already fail because of details like the system-language. With my local language, on a simple output of ls -l, it's parsing

    {"filename":"drwxr-xr-x  16 root root    4096 Oct  4 11:21 ."}

instead of

    {"filename":".","flags":"drwxr-xr-x","links":16,"owner":"root","group":"root","size":4096,"date":"Oct 4 11:21"}

with LANG=US. This makes it really hard to trust such a tool.

kbrazil3y ago

Hi there! `jc` author here. Yes, it is a documented caveat[0] that the `C` or `en_US.UTF-8` locales should be used for best results.

It's not unheard of for tools to require `C` locale for proper parsing:

    $ LC_ALL=C ls -l | jc --ls

[0] https://github.com/kellyjonbrazil/jc#locale

[1] formats should have good library support across many languages and nice filter/query capabilities from the command-line

simonkagedal3y ago

This is true. However, the issue is with the tool outputting unstructured data to begin with. As the `jc` author says, the best would be if tools supported structured output formats to begin with.

vidarh3y ago

If anything, though, that's a good reason for a tool like this to exist rather than have every script that depends on these tools use their own, often hacky, parsing of the output.

[1] "WARNING: apt does not have a stable CLI interface. Use with caution in scripts."

HyperSane3y ago

The greatest thing about powershell is that all commands returns structured output like this.

WorldMaker3y ago

Also, this tool is useful for using old unix commands in powershell: `jc … | ConvertFrom-Json` easily gives you powershell objects to work with.

HyperSane3y ago

That is a great idea!

otikik3y ago

This is a good step in the right direction, which IMHO should be that all those cli tools should have a —-json flag.

nixcraft3y ago

     dig +yaml google.com

vidarh3y ago

khiqxj3y ago

There are at least 3 ways this can create bugs:

  - It has to parse output of commands which may or may not be intended to be parsed and may or may not have a predictable format. The only way to overcome this is if this program becomes one of the Big Four "UN*X command output -> data" converters

  - It casts things to "float/int"

  - Depending on who made this library, the output itself may not be strict / predictable. Perhaps it will output JSON with two different key names in two different scenarios.

And don't forget that any of these issues will still come up even if they are accommodated for, due to versions of programs changing without the author of this tool knowing.

But yeah basic things having intrinsic shortcomings is a given, when you're using UN*X.

From the nested article:

kbrazil3y ago

Hi - jc author here. Just a couple points:

The casting to int/float is not done unless the underlying values are predictably documented to be numbers. There are rare cases where auto int/float conversions are done, but:

1) This is always documented, and

2) You can turn this functionality off via the —raw flag

Also, predictable schemas are documented with each parser. (e.g `jc -h —-arp`)

jwilk3y ago

> Big Four "UN*X command output -> data" converters

What do you mean?

khiqxj3y ago

I mean that there would be a big 4 such converters if people actually invested time into these "front ends" to CLIs.

fabianthomas3y ago

I'm really concerned about performance/latency with the json approach. I couldn't find anyone discussing that yet completely.

I compared two shell scripts both printing the ifindex of some network device (that is the integer in the first column) 10 times.

Using awk and head combined gives me 0,068s total time measured with the time command.

Using ip with the -j flag together with jq gives 0,411s.

Therefore the awk approach is 6 times faster. And here I used a binary (ip) that already supports json output and doesn't even need the mentioned jc.

While this whole test setup is somewhat arbitrary I experienced similar results in the past when writing shell scripts for, e.g., my panel. Reach out to me if you are interested in my test setup.

uvesten3y ago

As someone who uses `jq` almost daily, this looks like a great tool I didn’t know I needed. Thanks for the tip!

mg3y ago

    dig example.com | jc --dig

Seems a bit redundant. Maybe it should be the other way round?

    jc dig example.com

Similar to how you do

    time dig example.com

DougBTX3y ago

Yes, that works, there are examples on the page like:

    $ jc dig example.com | jq -r '.[].answer[].data'
    93.184.216.34

It uses the first argument to infer the command output type.

chii3y ago

`jc` doesn't need to know anything about the command producing the output - just the format of the output. So using a pipe and stdin makes a lot of sense.

berkes3y ago

Is "redundancy" that you have to provide --dig?

Having it as a pipe, allows for much more, though.

   some_expensive_command > out.log
   jc --expensive-cmd < out.log

   hourly_dig.sh > example_com_records_$(date +%F+%s)
   cat example_com_records_* | jc --dig

kbrazil3y ago

Hi there - author of `jc` here. I originally intended to have auto-detection but put that on the backburner to focus on creating parsers, especially after introducing the magic syntax.

I did implement auto-detection for `/proc` file parsers so you can just do:

    $ cat /proc/foo | jc --proc

    $ jc /proc/foo

But you can specify each procfile parser directly if you want to as well.

pastage3y ago

naikrovek3y ago

This seems very fragile, to me, without support from the application whose output is converted to JSON.

minor updates to command-line tools can and do subtly alter the textual output of the tool, and the outputs of these tools are not standardized.

synergy203y ago

to make this great tool truly universal, it has to be written in c instead of python these days, then provide python|javascript|etc bindings if possible.

I'd like to use it on embedded systems, where python is too large to fit. this tool can be widely deployed just like awk|sed|etc but it has to be in C for that.

makapuf3y ago

Agreed but I would change it with "any compiled language that has no external runtime" and common shared libs dependencies. I don't care that a utility is written in Go, Zig, C++ or Pascal ;)

kbrazil3y ago

`jc` is available today as a compiled binary with all dependencies self-contained. The binaries and OS package installers are available under the GitHub Releases:

https://github.com/kellyjonbrazil/jc/releases

This is still python under the hood and not as small of a binary as I would like, but it does work.

ElectricalUnion3y ago

But python is also universal? You can just bundle jc inside https://github.com/jart/cosmopolitan/tree/master/third_party... and it's as universal as it gets.

maxbond3y ago

Seems to me like anywhere Python is too bulky to work, a layer serializing and deserializing JSON at every pipeline step is likely to exhaust your memory too.

montroser3y ago

I would welcome POSIX shell or at least bash as an alternative to a compiled C/Rust/Go binary.

But yeah, for these types of utilities, relying on an external language runtime like Python/Node is pretty rough.

bitwize3y ago

Replace C with Rust.

synergy203y ago

on embedded rust is still much larger than c because rust links its stdlib statically.

bitwize3y ago

There is such a thing as no-stdlib Rust deployments.

1 more reply

codedokode3y ago

I think that JSON is a bad choice here.

A good choice would be a format that is easily parsed by programs but still readable by the user. JSON is a bad choice here because it is hard to read.

In my opinion, something formatted with pipes, quotes and spaces would be better:

    eth0:
      ip: 127.15.34.23
      flags: BROADCAST|UNICAST
      mtu: 1500
      name: """"Gigabit" by Network Interfaces Inc."""

Also, I wanted to note that I dislike jq utility. Instead of using JSONPath it uses some proprietary query format that I constantly fail to remember.

feanaro3y ago

Some alternative ideas for making JSON more readable:

- Pipe into gron (https://github.com/tomnomnom/gron) to get a `foo.bar.baz = val` kind of syntax.

- Pipe into visidata (https://www.visidata.org/) to get a spreadsheet-like editable view.

kbrazil3y ago

For example:

    % jc -y date
    ---
    year: 2022
    month: Nov
    month_num: 11
    day: 3
    weekday: Thu
    weekday_num: 4
    hour: 9
    hour_24: 9
    minute: 0
    second: 22
    period: AM
    timezone: PDT
    utc_offset:
    day_of_year: 307
    week_of_year: 44
    iso: '2022-11-03T09:00:22'
    epoch: 1667491222
    epoch_utc:
    timezone_aware: false

codedokode3y ago

Great, by I would prefer to avoid YAML because it is very complicated and difficult to parse.

bonzini3y ago

> In my opinion, something formatted with pipes, quotes and spaces would be better:

Just pipe it into a JSON-to-YAML script like this:

    #! /usr/bin/python3
    from ruamel import yaml
    import json, sys, io
    print(yaml.dump(json.load(sys.stdin)))

psychoslave3y ago

Following the project documentation, you easily come to:

  jc dig example.com | jq       
  [
    {
      "id": 30081,
      "opcode": "QUERY",
      "status": "NOERROR",
      "flags": [
        "qr",
        "rd",
        "ra"
      ],
      "query_num": 1,
      "answer_num": 1,
      "authority_num": 0,
      "additional_num": 1,
      "opt_pseudosection": {
        "edns": {
          "version": 0,
          "flags": [],
          "udp": 4096
        }
      },
      "question": {
        "name": "example.com.",
        "class": "IN",
        "type": "A"
      },
      "answer": [
        {
          "name": "example.com.",
          "class": "IN",
          "type": "A",
          "ttl": 56151,
          "data": "93.184.216.34"
        }
      ],
      "query_time": 0,
      "server": "192.168.1.254#53(192.168.1.254)",
      "when": "Thu Nov 03 14:06:40 CET 2022",
      "rcvd": 56,
      "when_epoch": 1667480800,
      "when_epoch_utc": null
    }
  ]

Rather readable to my mind. And you can rather easily transform it to your preferred human readable output format I guess.

codedokode3y ago

For me there are too many quotes and brackets. My proposed format can also be converted to JSON if necessary.

kbrazil3y ago

WorldMaker3y ago

> A good choice would be a format that is easily parsed by programs but still readable by the user.

DonHopkins3y ago

https://en.wikipedia.org/wiki/PowerShell#Pipeline

zokier3y ago

WorldMaker3y ago

Well, yes, powershell takes some shortcuts and has the advantage that .NET has a strong object system.

rjzzleep3y ago

onion2k3y ago

codedokode3y ago

ape43y ago

thesuperbigfrog3y ago

>> something formatted with pipes, quotes and spaces would be better

How well would this format handle deeply nested structures? It seems like it would require a lot of space characters compared to nesting open and close characters: {} or () or []

How would escaping pipes, quotes, and spaces work to represent those character literals?

There are already numerous structured text formats: JSON, XML, S-expressions, YAML, TOML, EDN, and many more. Wouldn't this be yet another format? (https://xkcd.com/927/)

djedr3y ago

Dare I suggest Jevko[0] as yet another alternative?

  eth0 [
    ip [127.15.34.23]
    flags [[BROADCAST][UNICAST]]
    mtu [1500]
    name ["Gigabit" by Network Interfaces Inc.]
  ]

This is one of the things it was designed with in mind.

It's even simpler and more flexible than S-expressions.

Handles deeply nested structures perfectly well. Has only 3 characters to escape (brackets and the escape character).

(I am the author)

[0] https://jevko.org/

artemisart3y ago

1 more reply

Too3y ago

While I can appreciate the immediate value this tool brings for some applications.

Programs having structural data and APIs inside, that get translated into human representation - only to be re-parsed again into structural form. What could possibly go wrong?

The Unix philosophy sounds romantic in theory, but need structural data, throughout, to work reliably in practice. Kids, go with the underlying apis unless your tool has structured output.

pessimizer3y ago

Just putting that out there for people who like to rewrite things in Rust. A slightly different version of this concept could allow for nearly friction-free adoption.

cstrahan3y ago

> also why I don't really use jq)

You also don’t like software written in C (the language jq is written in)?

freedomben3y ago

Why is the python an issue for you? I also dislike and avoid tools written in python, unless they have an rpm available. If there's a package then I don't so much care what language it's in.

esskay3y ago

kbrazil3y ago

Feel free to open issues on GitHub if you would like to recommend parsers! Also, the latest version of jc supports parsing of /proc files, which may also help.

esskay3y ago

Thanks, certainly will do :) Wasn't aware that proc was now supported!

prima-facie3y ago

Unrelated tool but also very useful in combination with JQ: gron

https://github.com/tomnomnom/gron

hinkley3y ago

wodenokoto3y ago

Something like this is what I've been thinking is the path forward to non-posix shells and a way to get away from cumbersome foot-gunny bash.

Nushell that hit the front page earlier this week seemed to me to be limited by "compatible" apps, but wrapping all the big ones in a json converter superficially seems like a great solution to me.

menjaprunes3y ago

I stumbled upon this few weeks ago to parse the output of nmcli and now it is in my toolset for scripting

I love it

fimdomeio3y ago

There's a clash of names between this jc and autojump (https://github.com/wting/autojump) jc (jump to child)

friendzis3y ago

Sounds a lot like Powershell, which returns rich objects instead of a text blob

WorldMaker3y ago

Yup, powershell is mentioned in the references. Also, this tool can be used to bridge older unix commands into powershell objects (ConvertFrom-Json).

exabrial3y ago

Have a parseable output is great. What would be more incredible is to have a parseable output with a schema definition and/or formal grammar of some sort.

UncleEntity3y ago

> What would be more incredible is to have a parseable output with a schema definition and/or formal grammar of some sort.

I’ve been trying to build something like this but simply don’t have the free time currently.

All the pieces are there but without the free time…

—edit—

Oh, output schema. Totally different than what I’m going on about.

vincnetas3y ago

Schemas are provided on linked page near definitions of parsers.

kbrazil3y ago

You can also run `jc -h --dig` to get the parser details that include the schema.

[0] https://json-schema.org/

psadri3y ago

Is anyone aware of something similar but for the args? A database that maps a command to a schema for all it’s possible cli arguments?

sindoc3y ago

OMG, I was looking for exactly something like this. I’d like to create immersion between Logseq and CLI. This can very useful.

visarga3y ago

love it, json could be the best way to pipe data between processes because it is text, but structured

drunkpotato3y ago

So is xml. Json is relatively unstructured compared to other formats.

Also jsonlines is a simple addition that make it easy for json to play well with non-json aware older Unix tools.

It has a few shortcomings but I think its advantages outweigh them, and it’s become a pretty widely used standard in a short time.

tgv3y ago

I just learned about gron (elsewhere in the comment section: https://news.ycombinator.com/item?id=33448471). That overcomes some of jsonlines problems.

deafpolygon3y ago

Bringing a powershell feature to the Unix CLI.

Kipters3y ago

Since it's now cross platform, technically powershell brought powershell features to Unix first

berkes3y ago

A slightly related pet-peeve: I don't like it when "random" commands "squat" the two-letter domain, or worse, the one-letter domain: t, jq, jc etc.

Two letter space outside of gnutools, is then reserved for my aliases. If jsonquery is too long to type, AND I lack autocomplete, then an alias is easy and fast to make. alias jq=jsonquery.

artemisart3y ago

Sharparam3y ago

Instead of typing the whole path, you can also use `command` or a backslash to bypass aliases

E.g.:

    alias jsonquery='command jq'

ycombobreaker3y ago

I perceive an element of hubris when a tool claims a two-character name. That is a very small namespace, so the tool is effectively staking a claim on mental or emotional real-estate.

Too3y ago

Given this tool is likely to be used in a bunch of bash scripts, collisions with aliases can be a big problem.

WorldMaker3y ago

This seems a general symptom of unix/POSIX command naming where commands have "always" been short names, and commands are expected to be short names.

It makes sense to me that powershell encourages long names first and allows and encourages users to have far more control over short aliases.

j / k navigate · click thread line to collapse