1. Closing stderr in Python is not a good idea because that’ll swallow any other errors that occur at exit. Redirecting stdout to devnull is really just a way to prevent the flushed output from going to the now-closed stdout and triggering another SIGPIPE. That’s preferable to closing stderr and losing error output at exit.
2. Ignoring SIGPIPE is a terrible idea for a process that should do stream processing. Try making a yes clone and ignoring SIGPIPE - your process will likely run forever trying to shove “y” into a closed pipe. There’s a reason SIGPIPE was invented! Very few programs bother to check the return value from write/printf/etc.
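To make point 1 concrete, here is a minimal sketch of the devnull approach. The helper name `emit` and its return convention are made up for illustration; the only real calls are `os.open`, `os.dup2` and the stream methods. It assumes the stream is backed by a real file descriptor.

```python
import os
import sys


def emit(lines, stream=None):
    """Write lines to stream; return 0 on success, 1 if the reader went away."""
    if stream is None:
        stream = sys.stdout
    try:
        for line in lines:
            stream.write(line + "\n")
        stream.flush()
    except BrokenPipeError:
        # Unflushed data may still sit in the stream's buffer. Point the
        # descriptor at /dev/null so the flush at close/exit succeeds quietly
        # instead of raising a second BrokenPipeError. stderr is left alone.
        devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(devnull, stream.fileno())
        os.close(devnull)
        return 1
    return 0
```

A command line tool would typically end with something like `sys.exit(emit(some_lines))`, so a lost reader still produces a non-zero exit status but no traceback.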
I don't follow. The standard for EPIPE/SIGPIPE handling is to silently exit with an error status. It's fine to close stderr to prevent spurious warning messages about flushing stdout.
> 2. Ignoring SIGPIPE is a terrible idea for a process that should do stream processing. Try making a yes clone and ignoring SIGPIPE - your process will likely run forever trying to shove “y” into a closed pipe. There’s a reason SIGPIPE was invented! Very few programs bother to check the return value from write/printf/etc.
Programs can correctly handle lost pipes while masking SIGPIPE entirely, with error checking alone. Python's BrokenPipeError is raised on the basis of EPIPE, not SIGPIPE.
Re: programs not checking error returns of write() and close(): that is not really true in a language like Python with exceptions raised on IO errors. It always does the check, and the unwinder aborts the program if nothing handles the error. SIGPIPE is completely unnecessary for Python programs. (It's also not necessary for C programs, but I guess AT&T didn't want to fix their programs to check for errors.)
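A small sketch of the EPIPE point, for anyone who wants to see it. It relies on the documented fact that CPython sets SIGPIPE to ignored at startup; the function name is just for illustration.

```python
import errno
import os


def write_to_closed_pipe():
    """Return the errno seen when writing to a pipe that has no reader."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # nobody can ever read from this pipe now
    try:
        # CPython ignores SIGPIPE by default, so write(2) fails with EPIPE
        # and Python turns that into BrokenPipeError instead of dying.
        os.write(write_fd, b"y\n")
    except BrokenPipeError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return None
```

The value returned should compare equal to `errno.EPIPE`, with no signal handler involved anywhere.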
Does the POSIX standard mandate that programs receiving EPIPE/SIGPIPE die silently? I don’t know of such a rule, and there are plenty of programs that violate this. Python is a bit too verbose with the errors (with a full traceback and two copies of the error) so suppressing those errors somehow seems like a good idea for a general-purpose command line tool.
Half of this article is not "How do Unix pipes work", but "how to fix broken SIGPIPE handling in Python".
The exception is raised from a -1/EPIPE return from libc write().
I fully agree that Python is often a bad citizen in terms of signal handling: it wants to only process signals on 'the main thread', but also wants end-users to fully control signal-handling. The two ideas are sort of at odds and in general I find handling signals in Python frustrating.
$ perl -E 'say "y" while 1' | head -1
y
$
If we cat this file, it will be printed to the terminal.
> cat brothers_karamazov.txt
... many lines of text!
***FINIS***
It takes a noticeable amount of time to finish.
The amount of time it takes for cat(1) to read and output the file is almost certainly insignificant. The time the author is noticing is probably related to how long it takes for his console to process the text. This can be easily verified by putting `time` in front of the cat to measure the time taken. Even for huge text files, the wall clock time might be significant but the "user" time is likely still zero.
>how does cat know to stop when head is finished
I'm no expert on Unix, so correct me if I'm wrong, but surely this line of reasoning is misleading because pipes create a unidirectional data flow, so `cat` cannot know anything about `head`. It does not "stop" - it passes the whole text along just as it did without the pipe. As you said, the delay comes in printing to the console, not in the `cat` command.
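For what it's worth, the data does only flow one way, but as far as I understand the kernel still tells the writer once the last reader has closed its end: the next write gets SIGPIPE (or EPIPE if the signal is ignored). A rough sketch to check this, where the child script and the function name are made up for the experiment, and the child restores the default SIGPIPE action because Python otherwise ignores it:

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for `yes`: writes "y" forever with default SIGPIPE handling.
PRODUCER = (
    "import signal, sys\n"
    "signal.signal(signal.SIGPIPE, signal.SIG_DFL)\n"
    "while True:\n"
    "    sys.stdout.write('y\\n')\n"
)


def producer_exit_status():
    """Read one line from an endless producer, close the pipe, report how it died."""
    proc = subprocess.Popen([sys.executable, "-c", PRODUCER], stdout=subprocess.PIPE)
    first = proc.stdout.readline()  # play the role of `head -1`
    proc.stdout.close()             # the only reader goes away
    status = proc.wait(timeout=30)  # negative value means "killed by that signal"
    return first, status
```

If the writer really were oblivious, `proc.wait` would time out. Instead the status should be `-signal.SIGPIPE`, which is how a producer learns to stop when the consumer is finished.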
I've actually had "developers" go "but, readability". Yea ok.
(echo red; echo green 1>&2) | echo blue
is indeterministic: http://www.gibney.de/the_output_of_linux_pipes_can_be_indete...
As it turns out, this short line and its behavior nicely demonstrate a bunch of aspects that happen under the hood when you use a pipe.
This means that reading from the read end of the pipe across a fork and exec will not work. Therefore you should explicitly clear the close-on-exec flag. Note that it is a descriptor flag (F_GETFD/F_SETFD with FD_CLOEXEC), not a file status flag (F_GETFL/F_SETFL), so it has to be changed like this:
fcntl.fcntl(readfd, fcntl.F_SETFD, fcntl.fcntl(readfd, fcntl.F_GETFD) & ~fcntl.FD_CLOEXEC)
* If you only deal with file descriptors provided to you (stdin, stdout, stderr) as well as some files that you open (including special files like FIFOs), do not ignore SIGPIPE.
* If you deal with sophisticated file descriptors (socket(2) and pipe(2) count as sophisticated), you'd better ignore SIGPIPE, but also make sure to check for EPIPE in every single write.
In my view, SIGPIPE is a kludge so that programs that are too lazy to check for errors from write(2) (and fwrite(3) and related friends) will not waste resources. But if you are dealing with sophisticated file descriptors, there is a lot more happening than just open/read/write and a lot more error cases you must handle, and at that point the incremental cost of handling EPIPE isn't a significant addition.
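To illustrate the second bullet, a minimal sketch of what "check for EPIPE on every write" could look like for a socket. `send_checked` is a hypothetical helper name, and the sketch assumes SIGPIPE is ignored, which CPython already does at startup:

```python
import socket


def send_checked(sock, data):
    """Hypothetical helper: send data, returning False if the peer has gone away."""
    try:
        sock.sendall(data)
    except (BrokenPipeError, ConnectionResetError):
        # EPIPE or ECONNRESET: the other end is gone, so stop writing to it.
        return False
    return True
```

The caller then decides what a vanished peer means (drop the connection, reconnect, exit), which is the kind of per-descriptor decision a blanket SIGPIPE cannot make for you.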