This flexibility really seems to break down the transaction costs that the original article talks about, when compared to traditional Unix pipes.
I'm still trying to come up with a Linux-y way to do this. Microsoft can mandate that all PowerShell commands run under the CLR (and therefore can talk to arbitrary implementations of some CLR class), but any FOSS equivalent is going to have to deal with people wanting to interface programs written for a dozen different runtimes (and probably even some C code running on the bare metal).
Not to state the obvious, but this step was taken by major scripting languages decades ago with the minor restriction that your program had to run within one executable. Furthermore the idea of encapsulating objects in a way that could be passed between programs is not much younger - in the early 90s you had (still used) protocols for that like CORBA and PDO. And Microsoft was not far behind with its ever evolving OLE/COM/DCOM protocols.
I have no idea how good Microsoft's current implementation is. (Given how many times they have tried to tackle it, it wouldn't surprise me if it was pretty good by now.) But it isn't as original as it may seem.
If you need fields, then separate them with a text delimiter. If you need something more complicated, then maybe you should be using a programming language, or piping between scripts in a language with good text serialization.
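A rough sketch of the delimiter approach, for what it's worth. The `name:age` record layout and the function name `names_by_age` are made up for this example; only `sort` and `cut` are real.

```shell
# names_by_age: read "name:age" records on stdin and print the names,
# youngest first. The record layout is an assumption for illustration.
names_by_age() {
    # -t: sets the field delimiter; -k2,2n sorts numerically on field 2 only.
    sort -t: -k2,2n | cut -d: -f1
}

printf 'alice:30\nbob:25\ncarol:41\n' | names_by_age
```

This works as long as the delimiter never appears inside a field, which is exactly the point where a real serialization format starts to look attractive.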
Given the power of any computer (even those 30 years old), it's enormously powerful to be able to drop one segment of a pipe-chain and look at the output. Funny that I'm facing a similar issue now that some of the server world is going JavaScript: "console.log(X);" doesn't necessarily tell you the whole story about X. Exchanging objects is 14% too complicated, and anything more than 0% too complicated is too complicated.
If we define simplicity as: a plain text interchange format, then clearly by definition a text based shell is simpler.
Piping objects instead of text allows a single command to do more, as it can operate on any of the object's properties that it understands. From a user's perspective, I believe, that this is simpler - however I will acknowledge that this is somewhat subjective. For example: The command ls | sort LastWriteTime would print a list of files sorted by their LastWriteTime. The command ls | sort Length would print a list of files sorted by their size. Doing this in bash is more difficult as it requires additional commands to parse the output of ls.
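For comparison, here is one way the text-based version of `ls | sort Length` might look in a POSIX-style shell. This is only a sketch: `list_by_size` is a made-up name, and it sidesteps parsing `ls` output altogether by asking `wc` for each file's size.

```shell
# list_by_size DIR: print "SIZE NAME" for each regular file in DIR,
# smallest first. Hypothetical helper, not a standard command.
list_by_size() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue              # skip directories and unmatched globs
        size=$(wc -c < "$f" | tr -d ' ')     # some wc versions pad with spaces
        printf '%s %s\n' "$size" "${f##*/}"  # ${f##*/} strips the directory part
    done | sort -n -k1,1
}
```

Calling `list_by_size /some/dir` prints one line per file. Sorting by modification time instead would need a different tool again, which is the extra work the object pipeline avoids.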
It's not simple, it's primitive rounded in the corner.
If all you have is raw blobs of data in pipes, you have no room to grow when you suddenly realize you need something more complex.
See the story with adding UTF-8 BOM support before #! in Linux kernel.
To quote from a random mailing list post (http://www.mail-archive.com/pharo-project@lists.gforge.inria...):
The syntax is a mashup of Smalltalk and unix shell conventions. The pipes are OS pipes where necessary to interact with external programs, or an object that mimics OS pipe behavior if the "commands" being connected are Smalltalk expressions or command objects.
How about the interactive python shell?
True that it locks out all other runtimes, but given the massive set of libraries, that may be a small price to pay? Same applies to <your favorite runtime with lots of libraries>.
cat -n
overlaps with: awk '{ print NR "\t" $0 }'
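To make the overlap concrete, a small sketch that wraps both forms so they can be compared side by side. The function names are made up. `cat -n` right-aligns the number in a padded column, so the padding is stripped here to make the two outputs identical.

```shell
# Number lines with awk: "N<TAB>line".
number_with_awk() { awk '{ print NR "\t" $0 }'; }

# Number lines with cat -n, then drop the leading padding spaces
# so the result has the same "N<TAB>line" shape as the awk version.
number_with_cat() { cat -n | sed 's/^ *//'; }

printf 'first\nsecond\n' | number_with_awk
printf 'first\nsecond\n' | number_with_cat
```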
http://harmful.cat-v.org/cat-v/unix_prog_design.pdf
Your word processor features a WYSIWYG text editor (if you're on Linux this is probably AbiWord, OO.o Writer, or KWord). Meanwhile, if you have a modern web browser, it probably has a WYSIWYG formatted text editor as well. (Anything based on a recent version of Webkit or Gecko certainly will; on your Linux box that might be Firefox, Chrome, Epiphany, or Konqueror.)
If truly none of your software features overlapping features, I would bet that either you don't have X11 installed, or you have Solaris or HP-UX installed with a bunch of ancient Motif applications. :-)
Unix programs, even GUI ones, share a ton of code, as you can easily see if you try to install an interesting GUI application like Shotwell or Digikam.
Heck, I've never written an Instant Messenger application, but I know that "libpurple" is the name for the IM component that most Linux chat programs use.
And hey, that's exactly how it is advertised:
http://developer.pidgin.im/wiki/WhatIsLibpurple
What is libpurple?
libpurple is intended to be the core of an IM program. When using libpurple, you'll basically be writing a UI for this core chunk of code. Pidgin is a GTK+ frontend to libpurple, Finch is an ncurses frontend, and Adium is a Cocoa frontend.
Not really true at all. find is especially heinous. "find ... -delete" is equivalent to "find ... -print0 | xargs -0 rm --"
* "find ... -delete" probably just calls unlink, just as rm does, which is a system call exposed through libc. No extra process is spawned, so it is lean and fast.
* "find ... -exec rm {} \;" lets you run a command without xargs, but does not allow you to filter and transform the stream. It will probably spawn a new process for each file.
* "find ... -print0 | sed 'bar' | grep -v foo | frobznicate qux | xargs -0 rm --" is where the power lies. A few more processes are created.
This, to me, shows that 'find' actually does what it should do, and allows you to choose the best way you can do it WRT the task at hand.
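A minimal sketch of the filtered variant from the list above. The function name `delete_tmp_except` and the `*.tmp` pattern are made up, and the `-z` flag assumes GNU grep, which needs it to read the NUL-separated names that `-print0` emits.

```shell
# delete_tmp_except DIR KEEP: delete *.tmp files under DIR unless their
# path matches the regular expression KEEP. Hypothetical helper.
delete_tmp_except() {
    find "$1" -type f -name '*.tmp' -print0 |
        grep -z -v -e "$2" |     # -z (GNU grep): records end in NUL, not newline
        xargs -0 rm -f --        # -f keeps rm quiet if nothing is left to delete
}
```

For example, `delete_tmp_except /var/cache 'keep-me'` would remove every `.tmp` file whose path does not contain `keep-me`. With no filter step at all, plain `find ... -delete` does the same job in a single process.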
I am curious to know if it's fair to compare typical Unix software (CLI-based, text-based --- usually system software is what comes to mind) with application software (Word processing, enterprise software, etc).
You get better output quicker -- a neatly formatted PDF -- with CLI tools like LaTeX or troff (they work as a pipe of filters), than with a WYSIWYG like MS Word.
You get more relevant data quicker with SQL querying a Data Warehouse than via some GUI front-end that only lets you use some opaque, pre-defined queries.
A GUI is meant to flatten the learning curve; a CLI is meant to let you unleash the power of tools you know well.
GUI may shine when you do a task once in a blue moon, but CLI is the choice for frequent use.
On the other hand, I've never seen any good CLI approach to browsing the web ;-)
For example, it would totally suck to have to write the entire GUI for file selection, and nobody does it if they're smart - they just call a function, and get back a file handle or path.
The problem is programs that try to reinvent the wheel. How many file selectors have you seen that are "custom" in some way and thus break convention?
Other applications see themselves as a walled garden, and may allow extension within the program, but make it difficult for communication to occur between programs, often by not exposing enough state, not documenting the interface, or not providing any scripting at all.
And of course, which I think you were saying with your comment on extensibility, vim and emacs are well-designed for sending content out to a shell for processing (spell-check is a common example) and retrieving the results. ("!!" in vim)
Traditional Unix programs make your shell environment work better.
Adblock Plus, Perspectives, Tree Style Tab, etc, make my Iceweasel work better.
mod_rewrite, mod_php, etc, make Apache work better.
The "text streams" part is specific to Unix, because files and pipes are the defining feature of that environment. The other parts are probably universal across any extensible system.
Scripting languages are the next extension of unix - you take all those little programs that do something well, put them together with some control structure and you can do big things, often in an automated manner, which is difficult to do on a GUI.
That said, showing a GUI to someone and having them make selections is much better suited to other programming methods (MVC, etc.), which can often just run scripts on the back end.
The point of unix isn't having a bunch of programs that do different things - it's combining them like legos to do complex tasks via scripting.
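The usual illustration of this is a word-frequency counter built from nothing but small standard tools. A sketch, with `top_words` as a made-up name:

```shell
# top_words [N]: read text on stdin and print the N most frequent words
# (default 10) as "word count", most frequent first.
top_words() {
    tr -cs 'A-Za-z' '\n' |       # turn every run of non-letters into one newline
        tr 'A-Z' 'a-z' |         # fold to lower case
        sort |                   # group identical words together
        uniq -c |                # count each group
        sort -k1,1nr -k2 |       # highest count first, ties alphabetical
        head -n "${1:-10}" |
        awk '{ print $2, $1 }'   # drop uniq's padding, put the word first
}

printf 'the cat and the hat and the bat\n' | top_words 2
```

None of the seven programs knows anything about word counting; the task only exists in how they are connected.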
No, but Apple had a project called OpenDoc that did something similar. The idea was that you'd have a UI centered around the document, and you could pull in spell check and editing and all kinds of other components instead of having a program like Word. It didn't work out, for various reasons, many of which were unrelated to the merit of the idea itself.
I think the reason programs creep in functionality is NOT these technical sounding "transaction costs", but in order to sustain vendor lock in. In a world where nobody worries about lock in, there is no reason to bundle an editor with a calculator with an email program.
In the Adobe world, the boundaries of the programs are pretty tight, with a little bit of bleed over. Counter example? Maybe.
That was not a great article, if you ask me -- I think the analysis was trite, and MS Office software probably isn't a good example of anything.
A standard Emacs installation contains, among other things: a Lisp interpreter, an adventure game, a mail program, a Usenet newsreader, a calendar, a calculator with symbolic-algebra features, and an implementation of Conway's Life.
Put another way, Emacs makes it easy to process text files, combining the built-in features and modes of Emacs with just about any Unix command-line tool (via M-x shell-command-on-region, or simply by processing the file from a terminal and passing the result back to Emacs). Word, on the other hand, makes it easy to do fancy word processing and associated formatting tasks inside Word, but makes it very difficult to share the resulting file with other programs (even different versions of Word).
Beneath the covers, don't the Office applications share a lot of common code?
More info on shared components: http://blogs.msdn.com/b/heaths/archive/2009/12/21/about-shar...
The Unix philosophy breaks down where it should: where you aren't trying to get something done in a few keystrokes at the command line. Even when you aren't, the Unix philosophy works pretty well servicing low-level portions of a program. For example, any spell checker is Unix-y and many programs have a spell checker.
- None of the modern operating systems is really orthogonal: neither Linux nor the BSDs embrace the real "Unix philosophy"
- CLIs are monodimensional, and suffer from the same problems that dataflow programming has: it's easy to split the flow of data into two pipes, but is really hard (semantically) to rejoin them.
- GUIs are imperative in nature, and that hurts composability.
- Shells and REPLs have too many overlapping features (like OSs and ProgrammingLanguages+Libraries): the de facto standard for shells, bash, has a syntax that, for a programmer, feels really unclear. Every non-trivial bash script that I have written has quickly turned into escaping hell.
That's not hard if the text streams are line-oriented
* append one to another, if order is not important
* append one to another and then sort again, if order is important,
* JOIN the streams using the `join' command (it's a standalone text utility, does pretty much what SQL JOIN does)
To compute the sum (in the set-theory sense), use ` | sort | uniq'; to take the intersection, use ` | sort | uniq -d'. Or use the `comm' utility.
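The options above can be sketched as small functions over two files. The function names are made up; the intersection de-duplicates each input first, since `uniq -d` alone would also report a line that is merely repeated inside one file.

```shell
# union A B: lines that appear in file A or file B, each printed once.
union() { sort -u "$1" "$2"; }

# intersection A B: lines that appear in both files.
# Each file is de-duplicated first so that a line repeated inside one
# file is not mistaken for a line shared by both.
intersection() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -d
}

# join_by_key A B: pair up records that share the same first field.
# join(1) expects both inputs to be sorted on that field already.
join_by_key() { join "$1" "$2"; }
```

Given one file with `1 alice` and `2 bob` and another with `1 admin` and `2 user`, `join_by_key` should print `1 alice admin` and `2 bob user`.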
If using a Bourne shell derivative, you may end up using temporary files. With the rc shell, you may use the <{COMMAND} syntax, like:
cat <{foo | bar | baz} <{frob | knob | bob} <{some | more} | subsequent | processing
to catenate the output of three pipes and do subsequent processing.
The transaction costs here are things like all the negotiation that has to go into getting anything done with contractors every time you want something done.
Does it need stating how useful and powerful text based output can be?
Even the MS Office example goes that way: people who want a nice graph and a computing sheet in a Word document will embed an Excel component, which is virtually Excel running and piping its output into the Word document. The same goes for reaching Access datasets and queries to produce graphs in Excel: Access pipes its data to Excel, which in turn produces a graph.
"programs gain[ing] overlapping features over time" is a result of feature creep and certainly a lack of architecture foresight. The author refers to spreadsheet features in Word and database features in Excel. I can only take a guess at what he refers to but, assuming he's not talking about object embedding (which I took care of above), being able to draw tables graphically in Word is not equal to having a sheet able to compute formulas and render graphs, and being able to sort and filter columns in Excel is worlds apart to queries on a structured, relational table.
The core of the article seems to boil down to the fact that software inevitably brings duplicate functionality, and that this is good because it alleviates the effort of bridging applications. Yet duplicating functionality puts a burden on the developers, whose effort might be better spent elsewhere (including providing easy access to the "real deal"). What's more, it suddenly takes real effort for the user to achieve any task that falls outside the subset of duplicated functionality. In short, it might save you a little work, but it doesn't scale. That's why properly designed technology allows for trivial connection of software (Unix pipes) and description of contracts (sane text output) or abstracts the contract part entirely (e.g. including an Excel spreadsheet in a Word document).
I generally enjoy sharing context as a data structure with a file-system-like interface (e.g. REST) between processes, and only more so when the command/control interface is a simple text stream. Of course, this reminds me of Plan 9, but there's also 1060 Research, and other examples I'm sure.
There is quite a lot of head room left in the Plan 9 paradigm, and you do yourself no disservice by taking some time to figure it out. It really is more Unix than Unix.
I see no logic here. Can anybody explain that?
We can handle XML objects with commands xml_grep, xmlpath, xmlpatterns, perl, python, js (JavaScript has builtin support for XML/XPath), etc.
The JSON format is new, but we can handle it with js (native format), perl, python. CouchDB already uses JSON over pipe to communicate with tools.