(I am only joking of course. As a recovering academic, I understand that researchers need recognition, and I have no right to throw stones -- glass houses and all. Also, this one is really like regreSSHion's little sibling. Still, easily finding the information I needed made me happy.)
> cleanup_exit() was not meant to be called from a signal handler [...] Fedora 38+ has moved to newer upstream OpenSSH that doesn't have the problematic cleanup_exit() call.
> This extra problematic logic only existed in upstream OpenSSH(-portable) for ~9 months
The fix also doesn't touch the Red Hat-specific code:
diff -urp openssh-8.7p1-38.el9_4.1-tree.orig/sshd.c openssh-8.7p1-38.el9_4.1-tree/sshd.c
--- openssh-8.7p1-38.el9_4.1-tree.orig/sshd.c 2024-07-08 03:42:51.431994307 +0200
+++ openssh-8.7p1-38.el9_4.1-tree/sshd.c 2024-07-08 03:48:13.860316451 +0200
@@ -384,7 +384,7 @@ grace_alarm_handler(int sig)
/* Log error and exit. */
if (use_privsep && pmonitor != NULL && pmonitor->m_pid <= 0)
- cleanup_exit(255); /* don't log in privsep child */
+ _exit(1); /* don't log in privsep child */
else {
sigdie("Timeout before authentication for %s port %d",
ssh_remote_ipaddr(the_active_state),
They suggest applying it even on non Red Hat distros.Cert's SIG30 rule page has a list: https://wiki.sei.cmu.edu/confluence/display/c/SIG30-C.+Call+...
Also there's https://clang.llvm.org/extra/clang-tidy/checks/bugprone/sign...
Ultimately it's an example of an invariant where it's clear that programmers can't be trusted to uphold it. In this case, the consequences can be very significant.
Despite the fact that there is explicit runtime support for signal handlers in the language runtime (i.e. libc).
There is no portable way to annotate all functions ever written or ever will be written as being async signal safe.
Which functions are async signal safe varies with the operating system and runtime (eg. an unsafe function in linux-gnu might be safe in linux-musl or linux-bionic).
Other than those insurmountable problems, yeah, good idea.
The annotation does not need to be portable; if it’s present on one system then other systems still benefit because the code is written to pass the check.
The list of async-signal-safe functions is well documented and quite short, so it would not be much work to add the annotations to the header files. It’s OK if some safe functions are omitted, because signal handlers should be written to do the absolute bare minimum.
You could check it at runtime.
Just like with array bounds checking, in many cases the compiler could sometimes prove the runtime check isn't necessary and eliminate it.
> Which functions are async signal safe varies with the operating system and runtime
Annotations could enumerate specific platforms where it is safe or unsafe. Or you could annotate based on specific attributes of platforms that make it safe or unsafe.
That's why GP suggested annotating them. Typically this would be done via a function attribute.
> There is no portable way to annotate all functions ever written or ever will be written as being async signal safe.
This is not a requirement for such an annotation to exist and to be used by projects that care about security or even just correctness.
> Which functions are async signal safe varies with the operating system and runtime (eg. an unsafe function in linux-gnu might be safe in linux-musl or linux-bionic).
And libc implementations already annotate many of their functions to tell the compiler how they work. Compilers are also more than happy to assume behavior of standard function matches the C/C++ standards in non-freestanding environmnets.
> Other than those insurmountable problems, yeah, good idea.
All fairly trivial problems that have already been solved many times for similar issues.
I'd like a more general attribute though to declare that a particular funcion is in some abstract domain and then annotations that certain functions may or may not be called in certain domains. This could come useful in cases where you want some functions to only be called from special threads.
1. I hate the fact they have the hubris to think they can be smarter than the upstream developers and patch old versions
2. I hate the fact they don't ship vanilla packages, but instead insist on patching things for features that nobody relies on anyway, __because they're not upstream__.
Maintainers should stick to downloading tarballs, building them and updating them promptly when a new version is out. If there's no LTS available, pay upstream and get an LTS, don't take a random version and patch it forever just to keep the same version numbers, it's nonsensical and it was only a matter of time before people tried to exploit it. Just look at the XZ backdoor for instance, which relied on RedHat and Debian deploying a patched libsystemd.
They go for it because it gives a very stable, solid foundation. They don't want a fragile base layer prone to breaking every day of the week.
This involves backporting a lot of stuff (primarily security fixes) because you can't just upgrade any package to its latest version, it will have entirely new dependencies, potentially breaking changes etc.
What should RedHat do, which does not:
1) make them lose their enterprise customers wanting a stable base
2) have unpatched security holes all over their distros
3) not cause them to backport stuff (we are here at the moment) ?
Upstreams generally don't do this until prompted, and sometimes resist until the path is proven and becomes best practice because distributions pushed it.
This process is mostly invisible to you because distributions have been successful at getting the changes needed for sane and consistent behaviour embedded into tools and expectations by default. All you see are the patches that are in flight or didn't make it.
If you want a distribution that does only minimal patching then there are distributions for that. The fact that the major distributions do patch speaks volumes about which approach results in a better experience for users.
>These lines were removed because they caused the Valgrind and Purify tools to produce warnings about the use of uninitialized data in any code that was linked to OpenSSL. Removing this code has the side effect of crippling the seeding process for the OpenSSL PRNG. Instead of mixing in random data for the initial seed, the only "random" value that was used was the current process ID. On the Linux platform, the default maximum process ID is 32,768, resulting in a very small number of seed values being used for all PRNG operations.
Also affected: Fedora 37, 36 and possibly 35, which are all end-of-life (since December 2023 in the case of Fedora 37).
Not affected: Fedora 38 (also EOL), 39 (maintained) and 40 (current).
https://news.ycombinator.com/item?id=40843778
Edit: Ok it seems very closely related; I was just surprised no one had linked the previous discussion.